May 252012
 

Yesterday, a commit came along which made FreshPorts quite upset. To be realistic, it wasn’t upset at all. It processed everything just nicely. Just not the right way.

The commit in question (201205250144.q4P1i4Hc009503@repoman.freebsd.org) contained some pathnames such as this:

./databases/qt4-sql  pkg-plist 
./devel/dbus-qt4     Makefile pkg-plist 
./devel/qmake4       Makefile pkg-plist 
./devel/qmake4/files Makefile.bsd 
./devel/qt4          distinfo 
./devel/qt4/files    patch-configure 
./devel/qt4-assistant Makefile 
./devel/qt4-corelib  pkg-message pkg-plist 

FreshPorts interpreted the . as the category name. There is no code in FreshPorts to normalize pathnames found in the commit message.

The short term solution was to manually edit the incoming email commit message to change ./ to / and resubmit the email into the queue. This worked.

However, the database contains some lingering artifacts which I discovered when other statistical gathering functions started kicking in. Namely: a category without commits. Hmmm.

freshports.org=# SELECT categories.id AS category_id, count(ports_active.id) AS
freshports.org-# count, max(commit_log.commit_date) AS updated FROM categories, ports_active left
freshports.org-# outer join commit_log on ( ports_active.last_commit_id = commit_log.id ) WHERE
freshports.org-# categories.id = ports_active.category_id AND categories.is_primary GROUP BY
freshports.org-# categories.id, categories.name, categories.description, is_primary,
freshports.org-# categories.element_id ;
 category_id | count |        updated
-------------+-------+------------------------
          32 |   857 | 2012-05-25 13:37:26+00
          36 |    94 | 2012-05-20 12:04:26+00
          10 |  4060 | 2012-05-25 14:00:06+00
           8 |  1234 | 2012-05-25 14:03:08+00
           7 |   578 | 2012-05-25 02:15:21+00
          44 |    46 | 2012-05-25 02:15:21+00
          27 |   154 | 2012-05-22 19:16:10+00
          41 |   172 | 2012-05-25 02:15:21+00
           5 |   963 | 2012-05-25 02:15:21+00
           3 |  1115 | 2012-05-25 10:50:07+00
          30 |    42 | 2012-05-09 20:13:43+00
          28 |    60 | 2012-05-25 02:15:21+00
          35 |   103 | 2012-05-25 01:50:36+00
          14 |   182 | 2012-05-22 19:16:10+00
          82 |    25 | 2012-05-25 02:15:21+00
          15 |   381 | 2012-05-25 02:15:21+00
          98 |   120 | 2012-05-25 01:50:36+00
          34 |   149 | 2012-05-23 11:32:46+00
          23 |   209 | 2012-05-25 02:15:21+00
          52 |   387 | 2012-05-25 13:05:50+00
          20 |  1047 | 2012-05-25 10:44:36+00
          92 |   322 | 2012-05-25 12:14:15+00
          37 |    43 | 2012-05-25 02:15:21+00
         102 |    81 | 2012-05-25 11:09:45+00
         120 |     9 |
          39 |   146 | 2012-05-25 07:45:08+00

Let’s see more on that category:

freshports.org=# select * from categories where id = 120;
 id  | is_primary | element_id | name | description
-----+------------+------------+------+-------------
 120 | t          |     248112 | .    |
(1 row)

Oh. The category name is .

Look at the element:

freshports.org=# select * from element where id = 248112;
   id   | name | parent_id | directory_file_flag | status
--------+------+-----------+---------------------+--------
 248112 | .    |         1 | D                   | A
(1 row)

OK, what children exist under this new category?

freshports.org=# select * from element where parent_id = 248112;
   id   |     name     | parent_id | directory_file_flag | status
--------+--------------+-----------+---------------------+--------
 428448 | x11          |    248112 | D                   | A
 428440 | x11-toolkits |    248112 | D                   | A
 428426 | www          |    248112 | D                   | A
 428423 | net          |    248112 | D                   | A
 428420 | multimedia   |    248112 | D                   | A
 428413 | misc         |    248112 | D                   | A
 428410 | graphics     |    248112 | D                   | A
 428375 | devel        |    248112 | D                   | A
 428372 | databases    |    248112 | D                   | A
 248113 | Mk           |    248112 | D                   | A
(10 rows)

Oh oh. We have to clean this up.

Are these elements referred to by any commits?

freshports.org=# select * from commit_log_elements where element_id in (select id from element where parent_id = 248112);
 id | commit_log_id | element_id | revision_name | change_type
----+---------------+------------+---------------+-------------
(0 rows)

No. Good start.

What is in those directories?

freshports.org=# select * from element where parent_id in (select id from element where parent_id = 248112);
   id   |       name       | parent_id | directory_file_flag | status
--------+------------------+-----------+---------------------+--------
 428421 | qt4-phonon       |    428420 | D                   | A
 248114 | bsd.sites.mk     |    248113 | F                   | A
 428411 | qt4-imageformats |    428410 | D                   | A
 428424 | qt4-network      |    428423 | D                   | A
 428406 | qt4              |    428375 | D                   | A
 428403 | qt4-script       |    428375 | D                   | A
 428400 | qt4-qvfb         |    428375 | D                   | A
 428397 | qt4-qdoc3        |    428375 | D                   | A
 428395 | qt4-linguist     |    428375 | D                   | A
 428393 | qt4-designer     |    428375 | D                   | A
 428391 | qt4-declarative  |    428375 | D                   | A
 428386 | qt4-corelib      |    428375 | D                   | A
 428384 | qt4-assistant    |    428375 | D                   | A
 428379 | qmake4           |    428375 | D                   | A
 428376 | dbus-qt4         |    428375 | D                   | A
 428427 | qt4-webkit       |    428426 | D                   | A
 428418 | qt4-qtdemo       |    428413 | D                   | A
 428416 | qt4-l10n         |    428413 | D                   | A
 428414 | qt4-doc          |    428413 | D                   | A
 428441 | qt4-gui          |    428440 | D                   | A
 428373 | qt4-sql          |    428372 | D                   | A
 428449 | qt4-opengl       |    428448 | D                   | A
(22 rows)

Oh. Let’s check those and see if they are in any commits.

freshports.org=# select * from commit_log_elements where element_id in (select id from element where parent_id in (select id from element where parent_id = 248112));
   id   | commit_log_id | element_id | revision_name | change_type
--------+---------------+------------+---------------+-------------
 802882 |        226132 |     248114 | 1.373         | M
(1 row)

Oh oh. We do have one commit…

freshports.org=# select * from commit_log where id = 226132;
   id   |                   message_id                    |      message_date      |           message_subject           |          date_added           |      comit_date       | committer |                             description                              | system_id | encoding_losses
--------+-------------------------------------------------+------------------------+-------------------------------------+-------------------------------+-----------------------+-----------+----------------------------------------------------------------------+-----------+-----------------
 226132 | 200607021527.k62FRncw031061@repoman.freebsd.org | 2006-07-02 15:27:49+00 | cvs commit: ports/./Mk bsd.sites.mk | 2006-07-02 16:06:07.165396+00 | 2006-0702 15:27:49+00 | arved     | Move ftp.gnupg.org to the bottom of MASTER_SITE_GNUPG, as it is very |         1 | f
                                                                                                                                                                                             : slow
(1 row)

Eh? 2006? You can view that commit here.

OK, what we need to do is alter that commit to refer to /Mk/bsd.sites.mk instead of ./Mk/bsd.sites.mk

Yes, in a filesystem, those two are identical, relative to the ports directory. However, the FreshPorts scripts do not normalize the incoming pathnames found in commit messages. I see that I have underestimated things. :)

Taking the element_id from two queries ago, we can see the pathname:

freshports.org=# select *, element_pathname(id) from element where id = 248114;
   id   |     name     | parent_id | directory_file_flag | status |     element_pathname
--------+--------------+-----------+---------------------+--------+--------------------------
 248114 | bsd.sites.mk |    248113 | F                   | A      | /ports/./Mk/bsd.sites.mk
(1 row)

OK, what element_id should we be using instead of 248114?

freshports.org=# select *, element_pathname(id) from element where name = 'bsd.sites.mk';
   id   |     name     | parent_id | directory_file_flag | status |                element_pathname
--------+--------------+-----------+---------------------+--------+-------------------------------------------------
  16449 | bsd.sites.mk |      4105 | F                   | A      | /ports/x11-toolkits/gtkstep-pastel/bsd.sites.mk
  77711 | bsd.sites.mk |     77070 | F                   | A      | /ports/Mk/bsd.sites.mk
 248114 | bsd.sites.mk |    248113 | F                   | A      | /ports/./Mk/bsd.sites.mk
(3 rows)

There, we should change all instances of 248114 to 77711.

freshports.org=# update commit_log_elements set element_id = 77711 where element_id = 248114;
ERROR:  insert or update on table "commit_log_elements" violates foreign key constraint "$2"
DETAIL:  Key (element_id,revision_name)=(77711,1.373) is not present in table "element_revision".

Oh. I’d completely forgotten that the element_revision table exists. This table records all revision that occurs for a given element. Entries in the commit_log_element table must have a corresponding entry in that table.

The solution follows:

freshports.org=# begin;
BEGIN
freshports.org=# update element_revision set element_id = 77711 where element_id = 248114;
UPDATE 1
freshports.org=# select * from commit_log_elements where element_id = 248114;
 id | commit_log_id | element_id | revision_name | change_type
----+---------------+------------+---------------+-------------
(0 rows)

freshports.org=# select * from commit_log_elements where element_id = 77711 and revision_name = '1.373';
   id   | commit_log_id | element_id | revision_name | change_type
--------+---------------+------------+---------------+-------------
 802882 |        226132 |      77711 | 1.373         | M
(1 row)

freshports.org=# commit;
COMMIT
freshports.org=#

Yes, updating the element_revision triggered a change on the commit_log_elements table. You can see that trigger here:

freshports.org=# \d element_revision
   Table "public.element_revision"
    Column     |  Type   | Modifiers
---------------+---------+-----------
 element_id    | integer | not null
 revision_name | text    | not null
Indexes:
    "element_revision_pkey" PRIMARY KEY, btree (element_id, revision_name)
Foreign-key constraints:
    "$1" FOREIGN KEY (element_id) REFERENCES element(id) ON UPDATE CASCADE ON DELETE CASCADE
Referenced by:
    TABLE "system_branch_element_revision" CONSTRAINT "$1" FOREIGN KEY (element_id, 
       revision_name) REFERENCES element_revision(element_id, revision_name) ON UPDATE CASCADE ON 
       DELETE CASCADE
    TABLE "commit_log_elements" CONSTRAINT "$2" FOREIGN KEY (element_id, revision_name) 
       REFERENCES element_revision(element_id, revision_name) ON UPDATE CASCADE ON DELETE 
       CASCADE
freshports.org=#

The above changes were run on my dev system. They will be repeated on beta and production.

NOTE: given that element is a recursive table, we need to check for other references to files under . before we delete it. The following query goes one layer down:

freshports.org=# select * from element where parent_id in (select id from element where parent_id in (select id from element where parent_id = 248112));
   id   |    name     | parent_id | directory_file_flag | status
--------+-------------+-----------+---------------------+--------
 428419 | pkg-plist   |    428418 | F                   | A
 428394 | Makefile    |    428393 | F                   | A
 428417 | pkg-plist   |    428416 | F                   | A
 428422 | Makefile    |    428421 | F                   | A
 428396 | Makefile    |    428395 | F                   | A
 428383 | pkg-plist   |    428379 | F                   | A
 428381 | files       |    428379 | D                   | A
 428380 | Makefile    |    428379 | F                   | A
 428447 | pkg-plist   |    428441 | F                   | A
 428446 | pkg-message |    428441 | F                   | A
 428443 | files       |    428441 | D                   | A
 428442 | Makefile    |    428441 | F                   | A
 428385 | Makefile    |    428384 | F                   | A
 428399 | pkg-plist   |    428397 | F                   | A
 428398 | Makefile    |    428397 | F                   | A
 428451 | pkg-plist   |    428449 | F                   | A
 428450 | Makefile    |    428449 | F                   | A
 428415 | pkg-plist   |    428414 | F                   | A
 428425 | pkg-plist   |    428424 | F                   | A
 428392 | pkg-plist   |    428391 | F                   | A
 428378 | pkg-plist   |    428376 | F                   | A
 428377 | Makefile    |    428376 | F                   | A
 428401 | files       |    428400 | D                   | A
 428439 | pkg-plist   |    428427 | F                   | A
 428429 | files       |    428427 | D                   | A
 428428 | Makefile    |    428427 | F                   | A
 428412 | pkg-plist   |    428411 | F                   | A
 428374 | pkg-plist   |    428373 | F                   | A
 428390 | pkg-plist   |    428386 | F                   | A
 428389 | pkg-message |    428386 | F                   | A
 428387 | files       |    428386 | D                   | A
 428404 | files       |    428403 | D                   | A
 428408 | files       |    428406 | D                   | A
 428407 | distinfo    |    428406 | F                   | A
(34 rows)

Most of these are files (‘F’) but there are a few directories (‘D’). But let’s look at this level for commits:

freshports.org=# select * from commit_log_elements where element_id in (select id from element where parent_id in (select id from element where parent_id in (select id from element where parent_id = 248112)));
 id | commit_log_id | element_id | revision_name | change_type
----+---------------+------------+---------------+-------------
(0 rows)

Nothing. Good. Now look one level lower:

freshports.org=# select * from element where parent_id in (select id from element where parent_id in (select id from element where parent_id in (select id from element where parent_id = 248112)));
   id   |                                         name                                         | parent_id | directory_file_flag | status
--------+--------------------------------------------------------------------------------------+-----------+---------------------+--------
 428382 | Makefile.bsd                                                                         |    428381 | F                   | A
 428444 | patch-src__gui__image__qnativeimage.cpp                                              |    428443 | F                   | D
 428445 | patch-src__gui__text__qtextlayout.cpp                                                |    428443 | F                   | A
 428402 | patch-tools__qvfb__qvfb.pro                                                          |    428401 | F                   | A
 428388 | patch-configure                                                                      |    428387 | F                   | A
 428405 | patch-src__3rdparty__javascriptcore__JavaScriptCore__interpreter__Interpreter.cpp    |    428404 | F                   | A
 428430 | patch-src-3rdparty-javascriptcore-JavaScriptCore-jit-JITStubs.cpp                    |    428429 | F                   | D
 428431 | patch-src-3rdparty-webkit-JavaScriptCore-jit-JITStubs.cpp                            |    428429 | F                   | D
 428432 | patch-src-3rdparty-webkit-JavaScriptCore-runtime_JSValue.h                           |    428429 | F                   | D
 428433 | patch-src__3rdparty__javascriptcore__JavaScriptCore__jit__JITStubs.cpp               |    428429 | F                   | A
 428434 | patch-src__3rdparty__webkit__JavaScriptCore__wtf__Platform.h                         |    428429 | F                   | D
 428435 | patch-src__3rdparty__webkit__Source__JavaScriptCore__runtime__JSValueInlineMethods.h |    428429 | F                   | A
 428436 | patch-src__3rdparty__webkit__Source__WebKit__qt__QtWebKit.pro                        |    428429 | F                   | A
 428437 | patch-src__3rdparty__webkit__WebCore__WebCore.pro                                    |    428429 | F                   | D
 428438 | patch-src__3rdparty__webkit__WebCore__bindings__js__JSDOMBinding.h                   |    428429 | F                   | D
 428409 | patch-configure                                                                      |    428408 | F                   | A
(16 rows)

Good. All files No need to go lower. But are there any commits referring to those elements?

freshports.org=# select * from commit_log_elements where element_id in (select id from element where parent_id in (select id from element where parent_id in (select id from element where parent_id in (select id from element where parent_id = 248112))));
 id | commit_log_id | element_id | revision_name | change_type
----+---------------+------------+---------------+-------------
(0 rows)

Ahh, good. Now I can start deleting. You will recall that we want to delete the . category, which is element = 248112.

freshports.org=# select * from element where name = '.'; id | name | parent_id | directory_file_flag | status --------+------+-----------+---------------------+-------- 3119 | . | 3118 | F | A 48031 | . | 48025 | F | A 48595 | . | 7191 | F | A 48723 | . | 1887 | F | D 52061 | . | 52060 | F | A 64818 | . | 429 | F | A 70260 | . | 68759 | F | A 71078 | . | 71077 | F | A 74973 | . | 7621 | F | A 76168 | . | 7404 | F | A 248112 | . | 1 | D | A (11 rows)

Oh no. There are 11 elements in the tree which name = .

I see I have more work to do. Do those other elements have children?

freshports.org=# select * from element where parent_id in (select id from element where name = '.' and id != 248112);
 id | name | parent_id | directory_file_flag | status
----+------+-----------+---------------------+--------
(0 rows)

Phew! No, they do not. I think I can just delete away.

freshports.org=# begin;
BEGIN
freshports.org=# select count(*) from element;
 count
--------
 424617
(1 row)

freshports.org=# select count(*) from ports;
 count
-------
 31182
(1 row)

freshports.org=# select count(*) from categories;
 count
-------
   108
(1 row)

freshports.org=# delete from element where name = '.';
DELETE 11
freshports.org=# select count(*) from element;
 count
--------
 424524
(1 row)

freshports.org=# select count(*) from ports;
 count
-------
 31173
(1 row)

freshports.org=# select count(*) from categories;
 count
-------
   107
(1 row)

freshports.org=# commit;
COMMIT
freshports.org=#

I did some math before I hit commit. We have deleted 9 ports. That matches up with the 9 directories found in the fourth query listed in this post. We have also deleted one port. That was the goal.

Done.

Website Pin Facebook Twitter Myspace Friendfeed Technorati del.icio.us Digg Google StumbleUpon Premium Responsive