Nov 30, 2005
 

FreshPorts not only keeps track of changes to the FreeBSD ports tree, it also keeps track of when ports move around and any special notes regarding upgrading. This information is obtained from MOVED and UPDATING respectively.

Tonight I was chatting with Edwin Groothuis about MOVED. We got to talking about how FreshPorts parses this file. In short, it does this:

EmptyMoved($dbh);

parsefile($dbh);

Yes, it deletes everything from the table and adds everything in from the file. That’s simple to code, but not very efficient when it comes to database IO. The table in question is the ports_moved table:

freshports.org=# \d ports_moved
                              Table "public.ports_moved"
    Column    |  Type   |                          Modifiers
--------------+---------+-------------------------------------------------------------
 id           | integer | not null default nextval('public.ports_moved_id_seq'::text)
 from_port_id | integer | not null
 to_port_id   | integer |
 date         | date    | not null
 reason       | text    | not null
Indexes:
    "ports_moved_pkey" primary key, btree (id)
Foreign-key constraints:
    "$2" FOREIGN KEY (to_port_id) REFERENCES ports(id) ON UPDATE CASCADE ON DELETE CASCADE
    "$1" FOREIGN KEY (from_port_id) REFERENCES ports(id) ON UPDATE CASCADE ON DELETE CASCADE

freshports.org=#

As you can see, apart from the id, there are only four columns in there. It’s pretty simple to read, and there were only about 2000 rows at the time of writing.

It should be fairly easy to modify the script to become more efficient. Here’s what I typed in the IRC channel with Edwin:

Should be easy enough. Read all the data first. Then, when parsing the file, look in the cache to see what’s there. If it’s already there, update… and mark it in the cache as being processed. If not there, insert. Anything not marked in the cache after processing should be deleted from the table.
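Here is a minimal sketch of that idea, written in Python rather than the language of the real script, and not the FreshPorts code itself. Everything named here (MovedEntry, parse_moved, plan_sync) is hypothetical. It assumes each MOVED line has the form from|to|date|reason, that a row is identified by the combination of from, to and date, and that the cache is a dict built from one SELECT over ports_moved. For simplicity it keys on port names instead of the from_port_id and to_port_id integers the table actually stores. It also goes one small step further than the IRC description: a row is updated only when its reason has changed.

```python
from typing import NamedTuple


class MovedEntry(NamedTuple):
    from_port: str
    to_port: str   # empty string when the port was removed rather than moved
    date: str
    reason: str


def parse_moved(lines):
    """Yield one MovedEntry per 'from|to|date|reason' line, skipping blanks and comments."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first three pipes only, so a reason may itself contain '|'.
        from_port, to_port, date, reason = line.split("|", 3)
        yield MovedEntry(from_port, to_port, date, reason)


def plan_sync(existing, entries):
    """Compare the cached table rows against the parsed file.

    existing maps (from_port, to_port, date) -> (row_id, reason).
    Returns (inserts, updates, deletes):
      inserts: entries that are not in the table yet
      updates: (row_id, new_reason) pairs for rows whose reason changed
      deletes: row ids that no longer appear in the file
    """
    seen = set()
    inserts, updates = [], []
    for entry in entries:
        key = (entry.from_port, entry.to_port, entry.date)
        if key in existing:
            seen.add(key)  # mark the cached row as processed
            row_id, reason = existing[key]
            if reason != entry.reason:
                updates.append((row_id, entry.reason))
        else:
            inserts.append(entry)
    # Anything never marked as processed has gone from the file.
    deletes = [row_id for key, (row_id, _) in existing.items() if key not in seen]
    return inserts, updates, deletes
```

The three lists would then be turned into INSERT, UPDATE and DELETE statements. On a typical commit to MOVED, where only a line or two is added, that means a handful of writes instead of rewriting all 2000 or so rows.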

The same thing could be applied to the ports_updating table:

freshports.org=# \d ports_updating
                           Table "public.ports_updating"
 Column  |  Type   |                           Modifiers
---------+---------+----------------------------------------------------------------
 id      | integer | not null default nextval('public.ports_updating_id_seq'::text)
 date    | date    | not null
 affects | text    | not null
 author  | text    |
 reason  | text    | not null
Indexes:
    "ports_updating_pkey" primary key, btree (id)

freshports.org=#

Hmmm, I’m glad I wrote this. One day I’ll use this to speed things up a bit.
