Dan Langille

Oct 062018
 

From IRC:

[Oct 05 22:41] <@dvl> zi: we have conflict matches: https://dev.freshports.org/security/openssl/
[Oct 06 09:08] <@zi> dvl: cute. possible to add in a clickable link to take you to that matching port? :)
[Oct 06 10:50] <@dvl> zi: That matching port, it’s just a package name, sometimes there is a matching port, sometimes there is not… But yeah, let’s see.

So let’s see what happens.

I started off with this:

freshports.dev=# select id, packagename(id), element_pathname(element_id) from ports where package_name = 'libressl';
  id   | packagename |             element_pathname             
-------+-------------+------------------------------------------
 48916 | libressl    | /ports/branches/2018Q2/security/libressl
 40767 | libressl    | /ports/branches/2017Q1/security/libressl
 43361 | libressl    | /ports/branches/2017Q2/security/libressl
 34863 | libressl    | /ports/head/security/libressl
(4 rows)

What about just head?

Let’s join to the element_pathname table and use LIKE to get what we need.

freshports.dev=# SELECT P.id, packagename(P.id), element_pathname(P.element_id)
freshports.dev-#   FROM ports P JOIN element_pathname EP ON P.element_id = EP.element_id
freshports.dev-#  WHERE package_name = 'libressl'
freshports.dev-#    AND EP.pathname LIKE '/ports/head/%';
  id   | packagename |       element_pathname        
-------+-------------+-------------------------------
 34863 | libressl    | /ports/head/security/libressl
(1 row)

That’s better. But we need category and port name too.

freshports.dev=# SELECT P.id AS port_id, C.name AS category, packagename(P.id) as port, element_pathname(P.element_id)
  FROM ports P JOIN element_pathname EP ON P.element_id  = EP.element_id
               JOIN categories       C  ON P.category_id = C.id
 WHERE package_name = 'libressl'
   AND EP.pathname LIKE '/ports/head/%';
 port_id | category |   port   |       element_pathname        
---------+----------+----------+-------------------------------
   34863 | security | libressl | /ports/head/security/libressl
(1 row)

That’s it. Let’s create a datatype for that:

  DROP TYPE IF EXISTS GetPortFromPackageName_record CASCADE;
CREATE TYPE GetPortFromPackageName_record AS (
        port_id                 integer,
        category                text,
        port                    text,
        element_pathname        text
);

And next, the function to get it all:

CREATE OR REPLACE FUNCTION GetPortFromPackageName(text) RETURNS SETOF GetPortFromPackageName_RECORD AS $$
DECLARE
        a_PackageName   ALIAS for $1;

        r               GetPortFromPackageName_RECORD%rowtype;

BEGIN
        FOR r IN
                SELECT P.id AS port_id, C.name AS category, packagename(P.id) as port, element_pathname(P.element_id)
                  FROM ports P JOIN element_pathname EP ON P.element_id  = EP.element_id
                               JOIN categories       C  ON P.category_id = C.id
                 WHERE package_name = a_PackageName
                   AND EP.pathname LIKE '/ports/head/%'

        LOOP
                RETURN NEXT r;
        END LOOP;
END $$ LANGUAGE plpgsql;

Now we can get that information via:

freshports.dev=# select * from GetPortFromPackageName('libressl');
 port_id | category |   port   |       element_pathname        
---------+----------+----------+-------------------------------
   34863 | security | libressl | /ports/head/security/libressl
(1 row)

freshports.dev=#

Next step: alter the website to use this data to create the hyperlink.

The existing website code was:

SELECT DISTINCT PackageName(PCM.port_id) as package_name
  FROM ports_conflicts_matches PCM JOIN ports_conflicts PC ON PCM.ports_conflicts_id = PC.id
 WHERE PC.port_id = ' . $this ->{'id'};

The first attempt to use the new function was:

freshports.dev=# SELECT DISTINCT PackageName(PCM.port_id) as package_name, GetPortFromPackageName(PackageName(PCM.port_id))
freshports.dev-#   FROM ports_conflicts_matches PCM JOIN ports_conflicts PC ON PCM.ports_conflicts_id = PC.id
freshports.dev-#  WHERE PC.port_id = 34863;
  package_name  |                       getportfrompackagename                        
----------------+---------------------------------------------------------------------
 libressl-devel | (37855,security,libressl-devel,/ports/head/security/libressl-devel)
 openssl        | (954,security,openssl,/ports/head/security/openssl)
 openssl-devel  | (38174,security,openssl-devel,/ports/head/security/openssl-devel)
 openssl111     | (49920,security,openssl111,/ports/head/security/openssl111)
(4 rows)

freshports.dev=# 

That’s not quite what I needed. I got on IRC and asked. xocolatl provided.

SELECT DISTINCT PackageName(PCM.port_id) as package_name, f.*
  FROM ports_conflicts_matches PCM 
  JOIN ports_conflicts PC ON PCM.ports_conflicts_id = PC.id
  JOIN GetPortFromPackageName(PackageName(PCM.port_id)) AS f ON true
 WHERE PC.port_id = 34863;

That query is now on the dev website.

This is the corresponding change to the display part of the website:

@@ -663,6 +663,12 @@ class port_display {
                        }
 
                        $HTML .= "\n";
+
+                       $HTML .= "Conflicts Matches:\n
    "; + foreach($port->conflicts_matches as $match) { + $HTML .= "
  • conflicts with " . freshports_link_to_port($match['category'], $match['port']) . '
  • '; + } + $HTML .= '
'; }

You can see the results at https://dev.freshports.org/security/openssl/

Soon, this will go into production. Please review any other ports which you know have CONFLICTS.

Oct 042018
 

Today I updated the fp-listen and freshports daemons. These are the scripts which invalidate the cache and process incoming commits, respectively.

No code changes were made, just the installations directories.

Previously, this code when into /var/service but now they go into /usr/local/libexec/freshports-service and I create a symlink from /var/service as as required.

This confirms to the standard method described at https://cr.yp.to/daemontools/faq/create.html for adding/removing services.

Sep 262018
 

This post serves two goals:

  1. document the CONFLICTS implementation within FreshPorts
  2. show how you can use triggers to update a secondary table

The FreeBSD ports tree / package collection has a method for resolving conflicts. This feature allows a port maintainer to specify what, if any, packages conflict with a given port. The reasons why two ports might conflict is outside the scope of this article. For background reading:

Obtaining the values from the Makefile

To get started, let’s look at what I already have in the development instance of the FreshPorts database.

freshports.dev=# SELECT id, packagename(id), conflicts, conflicts_build, conflicts_install FROM ports WHERE coalesce(conflicts, conflicts_build, conflicts_install) IS NOT NULL and packagename(id) like 'bacula%' order by last_commit_id desc limit 10
;
  id   |      packagename      |            conflicts             | conflicts_build | conflicts_install 
-------+-----------------------+----------------------------------+-----------------+-------------------
 45084 | bacula9-docs          | bacula-* bacula5-*               |                 | 
 45086 | bacula9-server        | bacula5-server-* bacula-server-* |                 | 
 45087 | bacula9-bat           | bacula5-server-* bacula-server-* |                 | 
 45083 | bacula9-client        | bacula5-server-* bacula-server-* |                 | 
 45085 | bacula9-client-static | bacula5-server-* bacula-server-* |                 | 
(5 rows)

freshports.dev=# 

The packagename corresponds to the following FreeBSD ports:

  1. bacula9-docs
  2. bacula9-server
  3. bacula9-bat
  4. bacula9-client
  5. bacula9-client-static

In this case, bacula9 ports conflict with older versions of Bacula. How can we get those values from the Makefile? Let’s get the values for bacula9-server. Assume these commands are done in the /var/db/freshports/ports-jail/var/db/repos/PORTS-head/sysutils/bacula9-server directory, which is a non-standard location for ports files, but it suites the needs of the website.

$ grep CONFLICTS *
Makefile:CONFLICTS?=	bacula5-server-* bacula-server-*

NOTE: grep is not an appropriate method for obtaining the values because the values may be contingent upon other configuration settings.

This is the proper method:

$ make -V CONFLICTS -V CONFLICTS_BUILD -V CONFLICTS_INSTALL
bacula5-server-* bacula-server-*

NOTE: those two blank lines correspond to the empty values contained in those files.

globs not regex

CONFLICTS can involve wildcards, such as the * shown above. It is critical to realize that these are globs, meant for matching disk filenames, as opposed to regular expressions.

From the documentation

# CONFLICTS             - A list of package name patterns that the port conflicts
#                                 with, separated by blanks.  The names may include shell
#                                 pattern meta-characters "*", "?", "[", "]", and "!".
#                                 Example: apache*-1.2* apache*-1.3.[012345] apache-*+ssl_*

PostgreSQL has many regex functions. Let’s see about converting globs to regex.

glob to regex

Perl to the rescue. I received some help, as many people do, from RhodiumToad, who supplied this function:

CREATE TRUSTED PROCEDURAL LANGUAGE plperl
    HANDLER plperl_call_handler
    INLINE plperl_inline_handler
    VALIDATOR plperl_validator;
    
COMMENT ON LANGUAGE plperl
    IS 'PL/Perl procedural language';

CREATE FUNCTION public.glob_to_regex(text) RETURNS text
    LANGUAGE plperl
    AS $_$
    my $w = quotemeta shift;
    $w =~ s{\\([*?]|\\(\\?.)|\[(?:\\\!)?(?:\\\])?[^]]*\\\])}
                   {($1 eq '*' ? '.*' :
                         $1 eq '?' ? '.'  :
                         defined($2) ? $2 :
                         do { my $p = $1; $p =~ s/\\([][^-])/$1/g; $p =~ s/^\[\\!/[^/; $p; })}eg;
    return $w;
$_$;

On FreeBSD, I needed to add the databases/postgresql10-server package to provide support for Perl language extensions.

This will take a glob and translate it into a regular expression.

Here are a few examples:

freshports.dev=# select glob_to_regex('bacula-*');
 glob_to_regex 
---------------
 bacula\-.*
(1 row)

freshports.dev=# 


freshports.dev=# select glob_to_regex('postgresql9[^4]*');
   glob_to_regex   
-------------------
 postgresql9[^4].*
(1 row)

freshports.dev=# select glob_to_regex('clang-devel-3.[1234567]*');
        glob_to_regex         
------------------------------
 clang\-devel\-3\.[1234567].*
(1 row)

freshports.dev=# 

The secondary table

As part of earlier work, the ports table was upgraded to add three new fields:

ALTER TABLE ports ADD COLUMN conflicts text;
ALTER TABLE ports ADD COLUMN conflicts_build text;
ALTER TABLE ports ADD COLUMN conflicts_install text;

These columns will store the raw values as obtained directly from the Makefile. The corresponding regex values will be store, one per row, in another table. That table looks like this:

CREATE TABLE ports_conflicts
(
    id BIGSERIAL,
    port_id integer          NOT NULL,
    conflicts_type conflicts NOT NULL,
    item_num integer         NOT NULL,
    item_value text          NOT NULL,
    CONSTRAINT ports_conflicts_pkey PRIMARY KEY (id),
    CONSTRAINT ports_conflicts_port_id FOREIGN KEY (port_id)
        REFERENCES public.ports (id) MATCH SIMPLE
        ON UPDATE CASCADE
        ON DELETE CASCADE
);

What’s that column type on line 5 you might ask.

CREATE TYPE public.conflicts AS ENUM
    ('conflicts', 'conflicts_build', 'conflicts_install');

The item_num column will help to order the values so they correspond to what we find in the the source value. Feedback regarding the accuracy of that claim is most welcome. The ordering is not vital, but it does help to make sure it is consistent when we pull back the values later.

The Trigger

This is the trigger which will populate the ports_conflicts table.

CREATE OR REPLACE FUNCTION ports_conflicts_set() RETURNS TRIGGER AS $$
  BEGIN
    DELETE FROM ports_conflicts WHERE port_id = NEW.id;

    INSERT INTO ports_conflicts(port_id, conflicts_type, item_num, item_value)
      SELECT NEW.id                         AS port_id, 
             'conflicts'::conflicts         AS conflict_type,
             row_number() OVER (ORDER BY item_value),
             glob_to_regex(R.item_value)
        FROM regexp_split_to_table(NEW.conflicts, E'\\s+') AS R(item_value)

      UNION

      SELECT NEW.id                         AS port_id, 
             'conflicts_build'::conflicts   AS conflict_type,
             row_number() OVER (ORDER BY item_value),
             glob_to_regex(R.item_value)
        FROM regexp_split_to_table(NEW.conflicts_build, E'\\s+') AS R(item_value)

      UNION

      SELECT NEW.id                         AS port_id, 
             'conflicts_install'::conflicts AS conflict_type,
             row_number() OVER (ORDER BY item_value),
             glob_to_regex(R.item_value)
        FROM regexp_split_to_table(NEW.conflicts_install, E'\\s+') AS R(item_value);

    RETURN NEW;
  END
$$ LANGUAGE 'plpgsql';


  DROP TRIGGER IF EXISTS ports_conflicts_insert ON ports;
CREATE TRIGGER ports_conflicts_insert
    AFTER INSERT on ports
    FOR EACH ROW
    EXECUTE PROCEDURE ports_conflicts_set();

  DROP TRIGGER IF EXISTS ports_conflicts_update ON ports;
CREATE TRIGGER ports_conflicts_update
    AFTER UPDATE on ports
    FOR EACH ROW
    EXECUTE PROCEDURE ports_conflicts_set();

Line 3 : deletes existing entries for this port

Lines 6-10 : compiles a list of conflicts values

Lines 14-18 : does the same thing for conflicts_build values

Lines 22-26 : same again, for conflicts_install

The UNIONs tie them all together and line 5 puts it into the ports_conflicts table.

Details on the trigger

This section explains the trigger in a bit more detail.

The regexp_split_to_table function is a wonderful solution to break each CONFLICTS value into separate rows, like this:

freshports.dev=# select regexp_split_to_table('bacula5-server-* bacula-server-*',  E'\\s+');
 regexp_split_to_table 
-----------------------
 bacula5-server-*
 bacula-server-*
(2 rows)

freshports.dev=# 

The glob_to_regex(R.item_value) call changes each of the above items from a blog to a regex.

row_number() OVER (ORDER BY item_value) will sort individual value within a conflict and give us a value for the ports_conflicts.item_num field.

Matching against the regex

The list of conflicts for a given port will not change until the next commit for a port.

However, what that conflict matches against might change when another port is updated.

Thus, the ports_conflicts table only updates based upon a trigger on the ports table.

We will store the actual matches in the ports_conflicts_matches table.

CREATE TABLE ports_conflicts_matches
(
    ports_conflicts_id bigint NOT NULL,
    port_id integer NOT NULL,
    CONSTRAINT ports_conflicts_matches_pkey PRIMARY KEY (ports_conflicts_id, port_id),
    CONSTRAINT ports_conflicts_matches_conflicts_id FOREIGN KEY (ports_conflicts_id)
        REFERENCES public.ports_conflicts (id) MATCH SIMPLE
        ON UPDATE CASCADE
        ON DELETE CASCADE,
    CONSTRAINT ports_conflicts_matches_port_id FOREIGN KEY (port_id)
        REFERENCES public.ports (id) MATCH SIMPLE
        ON UPDATE CASCADE
        ON DELETE CASCADE
);

How do we populate that table? How can it be quickly done?

That is the part which took the longest time to figure out.

Simple population

Here are some sample CONFLICTS values from the early testing (as recorded in an older gist):

freshports.dev=# select * from ports_conflicts;
 id | port_id |  conflicts_type   | item_num |          item_value          
----+---------+-------------------+----------+------------------------------
 21 |   43574 | conflicts_install |        4 | mysqlwsrep.*
 22 |   43574 | conflicts_install |        3 | mysql[0-9].*\-server\-.*
 23 |   43574 | conflicts_install |        1 | mariadb10[013-9]\-server\-.*
 24 |   43574 | conflicts_install |        5 | percona[0-9].*\-server\-.*
 25 |   43574 | conflicts_install |        2 | mariadb5.*\-server\-.*
(5 rows)

freshports.dev=# 

Given the regex in ports_conflicts.item_value, let’s find the matching values from the commit_log_ports table. I can concatenate all the regex values into one search term. This approach works rather well:

freshports.dev=# explain analyse
freshports.dev-# SELECT distinct CLP.port_id, CLP.port_name_revision 
  FROM commit_log_ports CLP 
 WHERE CLP.port_name_revision ~ '^(mysqlwsrep.*|mysql[0-9].*\-server\-.*|mariadb10[013-9]\-server\-.*|percona[0-9].*\-server\-.*|mariadb5.*\-server\-.*)';
                                                                                     QUERY PLAN                                                                                     
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=19482.04..19482.68 rows=85 width=22) (actual time=497.597..497.856 rows=364 loops=1)
   ->  Sort  (cost=19482.04..19482.25 rows=85 width=22) (actual time=497.596..497.658 rows=852 loops=1)
         Sort Key: port_id, port_name_revision
         Sort Method: quicksort  Memory: 91kB
         ->  Gather  (cost=1000.00..19479.32 rows=85 width=22) (actual time=107.422..497.400 rows=852 loops=1)
               Workers Planned: 2
               Workers Launched: 2
               ->  Parallel Seq Scan on commit_log_ports clp  (cost=0.00..18470.82 rows=35 width=22) (actual time=108.076..493.249 rows=284 loops=3)
                     Filter: (port_name_revision ~ '^(mysqlwsrep.*|mysql[0-9].*\-server\-.*|mariadb10[013-9]\-server\-.*|percona[0-9].*\-server\-.*|mariadb5.*\-server\-.*)'::text)
                     Rows Removed by Filter: 307469
 Planning time: 0.417 ms
 Execution time: 498.383 ms
(12 rows)

freshports.dev=# 

The results are something like this:

SELECT distinct CLP.port_id, CLP.port_name_revision 
  FROM commit_log_ports CLP 
 WHERE CLP.port_name_revision ~ '^(mysqlwsrep.*|mysql[0-9].*\-server\-.*|mariadb10[013-9]\-server\-.*|percona[0-9].*\-server\-.*|mariadb5.*\-server\-.*)';
 
  port_id |      port_name_revision       
---------+-------------------------------
   16965 | mysql51-server-5.1.11
   16965 | mysql51-server-5.1.12
   16965 | mysql51-server-5.1.14
   16965 | mysql51-server-5.1.15
   16965 | mysql51-server-5.1.16
   16965 | mysql51-server-5.1.17
   16965 | mysql51-server-5.1.18
   16965 | mysql51-server-5.1.19
   16965 | mysql51-server-5.1.20
   16965 | mysql51-server-5.1.21
   16965 | mysql51-server-5.1.22
...

Only a few rows of the results are shown above. The full list of matches is in that gist.

Yes, this works, however, that solution takes nearly half a second and caters for just set of one match. There must be a better way. If you keep reading in that gist, you will see the other methods I tried. I want to thank those on IRC who helped me find different ways to do this.

Next, I tried an SRF (Set Returning Function) and documented that in this gist. This one is fast; the test I ran taking 7.137 ms, but it did not scale, and exploded to 4825.074 ms when I tried multiple rows.

SRF within a function

My next attempt involved a plpgsql function which took only 13.180 ms, a huge speed increase. This turned out to be the solution I implemented.

Let’s review slightly using the final gist. My original function was:

CREATE OR REPLACE FUNCTION port_conflicts4(bigint, text) RETURNS SETOF ports_conflicts_matches AS $$
  SELECT DISTINCT PC.id, CLP.port_id
    FROM commit_log_ports CLP JOIN ports_conflicts PC 
                              ON (PC.id = $1 AND CLP.port_name_revision ~ $2);
$$ LANGUAGE SQL STRICT;

Then I created a SRF using plpgsql, which is incredibly fast:

CREATE OR REPLACE FUNCTION port_conflicts(bigint, text) RETURNS SETOF ports_conflicts_matches AS $$
  DECLARE
    a_ports_conflicts_id ALIAS FOR $1;
    a_regex              ALIAS FOR $2;

    l_regex text;
    r       ports_conflicts_matches%rowtype;
  BEGIN
    l_regex := '^(' || a_regex || ')';
    
    FOR r in SELECT distinct a_ports_conflicts_id, port_id FROM commit_log_ports WHERE port_name_revision ~ l_regex
    LOOP
      RETURN NEXT r;
    END LOOP;
  END;

$$ LANGUAGE 'plpgsql' STABLE;

Now let’s try that function out:

freshports.dev=# explain analyse SELECT PC4.*                                                                                
  FROM ports_conflicts PC, LATERAL (SELECT * FROM port_conflicts5(PC.id, '^' || PC.item_value || '$')) PC4;
                                                              QUERY PLAN                                                               
---------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.26..35052.76 rows=1751000 width=12) (actual time=0.594..2208.947 rows=10700 loops=1)
   ->  Seq Scan on ports_conflicts pc  (cost=0.00..32.51 rows=1751 width=26) (actual time=0.015..0.838 rows=1751 loops=1)
   ->  Function Scan on port_conflicts5  (cost=0.26..10.26 rows=1000 width=12) (actual time=1.258..1.259 rows=6 loops=1751)
 Planning time: 0.112 ms
 Execution time: 2209.598 ms
(5 rows)

freshports.dev=# 

2.2 seconds, to calculate conflicts for 1751 rows, resulting in 10700 rows. That’s fast enough. We could do that on a periodic basis if required.

From my personal development ethics, I don’t like the idea of having a function and a table with the same / very similar name (port_conflicts function vs ports_conflicts table). I might rename the function to find_matches_for_conflict. I think that better describes the purpose of the function.

Populate the destination table

freshports.dev=# insert into ports_conflicts_matches SELECT PC4.*                                                                                
  FROM ports_conflicts PC, LATERAL (SELECT * FROM port_conflicts(PC.id, '^' || PC.item_value || '$')) PC4;
INSERT 0 10700

Now let’s have a look at some of the results:

freshports.dev=# SELECT PackageName(PC.port_id), 'conflicts with', PackageName(PCM.port_id)
freshports.dev-#   FROM ports_conflicts_matches PCM JOIN ports_conflicts PC ON PCM.ports_conflicts_id = PC.id
freshports.dev-#  WHERE PC.port_id = 45086;
  packagename   |    ?column?    |     packagename     
----------------+----------------+---------------------
 bacula9-server | conflicts with | bacula5-server
 bacula9-server | conflicts with | bacula-server
 bacula9-server | conflicts with | bacula-server-devel
(3 rows)

freshports.dev=# 

You can find that magical 45086 value near the top of this post.

Let’s try openssl:

freshports.dev=# SELECT PackageName(PC.port_id), 'conflicts with', PackageName(PCM.port_id)
  FROM ports_conflicts_matches PCM JOIN ports_conflicts PC ON PCM.ports_conflicts_id = PC.id
 WHERE PC.port_id = 954;
 packagename |    ?column?    |  packagename   
-------------+----------------+----------------
 openssl     | conflicts with | openssl111
 openssl     | conflicts with | openssl-devel
 openssl     | conflicts with | openssl-devel
 openssl     | conflicts with | openssl-devel
 openssl     | conflicts with | openssl-devel
 openssl     | conflicts with | openssl-devel
 openssl     | conflicts with | libressl-devel
 openssl     | conflicts with | libressl-devel
 openssl     | conflicts with | libressl-devel
 openssl     | conflicts with | libressl-devel
 openssl     | conflicts with | libressl
 openssl     | conflicts with | libressl
 openssl     | conflicts with | libressl
 openssl     | conflicts with | libressl
(14 rows)

Or heimdal:

freshports.dev=# SELECT PackageName(PC.port_id), 'conflicts with', PackageName(PCM.port_id)
  FROM ports_conflicts_matches PCM JOIN ports_conflicts PC ON PCM.ports_conflicts_id = PC.id
 WHERE PC.port_id = 934;
 packagename |    ?column?    | packagename  
-------------+----------------+--------------
 heimdal     | conflicts with | wu-ftpd
 heimdal     | conflicts with | wu-ftpd+ipv6
 heimdal     | conflicts with | srp
 heimdal     | conflicts with | krb5-maint
 heimdal     | conflicts with | krb5
 heimdal     | conflicts with | krb5
 heimdal     | conflicts with | krb5-114
 heimdal     | conflicts with | krb5-111
 heimdal     | conflicts with | krb5-113
 heimdal     | conflicts with | krb5
 heimdal     | conflicts with | krb5-112
 heimdal     | conflicts with | krb5-116
 heimdal     | conflicts with | krb5-115
 heimdal     | conflicts with | krb4
(14 rows)

freshports.dev=# 

Now partially in production

The work is partially implemented in production. The idea came from this issue suggested by zi0r.

If you have a look at the Bacula port, you will find the following under Conflicts:

CONFLICTS:

  • bacula5-server-*
  • bacula-server-*

Future work

The next job is to automatically populate the ports_conflicts_matches table. I think that needs to be done on a periodic basis, perhaps every 10 minutes if there has been a recent commit.

Just thinking about that now:

  1. If port A conflicts with port B, then clearly port B conflicts with port A. I am not sure if there is a requirement to specify CONFLICTS in both ports, but that is something easily confirmed by the ports_conlicts table.
  2. I tried to think of commits which would not change ports_conflicts_matches. Since the matches can include PORTVERSION, any given commit which bumps that value can affect matches.
  3. Perhaps I could exclude commits to UPDATING, MOVED, or Mk, but they are relatively few compared to the rest.
  4. I think it’s just safer to recompute all matches periodically. There does not seem to be a safe and straight forward way to do it based upon a smaller unit of work.
Aug 122018
 

Today, FreshPorts broke. No commits were being processed. nullfs mounts were disappearing. No idea why.

Let’s follow the clues.

All times are local to me (Philadelphia).

12:46 pm

Rene contacted me on IRC, presumably about a failed sanity test I had seen earlier in the day:

To: rene@FreeBSD.org
From: FreshPorts Sanity Daemon <FreshPorts@FreshPorts.org>
Cc: dan@langille.org
Subject: FreshPorts sanity checking
Date: Fri, 10 Aug 2018 16:39:24 +0000
Message-ID: <20180810_163924_037160.FreshPorts@FreshPorts.org>

This message was generated by the FreshPorts Daemon.

You recently made this commit, which FreshPorts FreshPorts had trouble 
processing:

MessageID: 201808101638.w7AGcsUL078691@repo.freebsd.org
Subject  : svn commit: r476841 - head/www/feedjack

The following is a list of the ports which had errors:

http://www.freshports.org/www/feedjack/


The exact errors appear below.

www/feedjack:

This command (FreshPorts code 1):

/usr/local/bin/sudo /usr/sbin/chroot -u freshports
/var/db/freshports/ports-jail /make-port.sh /var/db/repos/PORTS-head
www/feedjack
2>/tmp/FreshPorts.www.feedjack.make-error.2018.8.10.16.39.24.59977

produced this error:

Error message is: chroot: /make-port.sh: No such file or directory



You are receiving this message as you are a FreeBSD
committer who has opted into this service.  Please
see https://www.FreshPorts.org/committer-opt-in.php
for more information.

-- 
FreshPorts Daemon

You can look at Rene’s commit.

Looking back, Rene’s was not the first commit to encounter this issue, but I didn’t notice the other failures. It wasn’t until I got a “FAILED: MASTER_PORT” notice that I knew something was wrong. MASTER_PORT is a routine test to exercise the code which determines the MASTER_PORT value. If it gets the wrong answer for sysutils/bacula-server, I know something is broken.

Now let’s examine the command mentioned in the sanity test failure email:

/usr/local/bin/sudo /usr/sbin/chroot -u freshports /var/db/freshports/ports-jail \
    /make-port.sh /var/db/repos/PORTS-head www/feedjack

A chroot is invoked on the /var/db/freshports/ports-jail directory as the freshports user. The command /make-port.sh /var/db/repos/PORTS-head www/feedjack is run.

If you want to read more about ports-jail and it’s script, try this search.

I tried running the command manually:

$ /usr/local/bin/sudo /usr/sbin/chroot -u freshports /var/db/freshports/ports-jail 
chroot: /bin/csh: No such file or directory

I thought: Why does it care about /bin/csh? I’m not running that, nor do I need it.

What’s in the jail?

$ ls -l /var/db/freshports/ports-jail
total 50
drwxr-xr-x  2 root        freshports     2 Oct 30  2017 bin
-r-xr-xr-x  1 root        wheel        617 Aug  3 04:31 cat-descr.sh
drwxr-xr-x  2 root        freshports     2 Oct 30  2017 dev
drwxr-xr-x  2 root        wheel          3 Aug  7 13:57 etc
drwxr-xr-x  2 root        freshports     2 Oct 30  2017 lib
drwxr-xr-x  2 root        freshports     2 Oct 30  2017 libexec
-r-xr-xr-x  1 root        wheel        593 Aug  3 04:31 make-category-comment.sh
-r-xr-xr-x  1 root        wheel        707 Aug  3 04:31 make-generate-plist.sh
-r-xr-xr-x  1 root        wheel        640 Aug  3 04:31 make-master-port-test.sh
-r-xr-xr-x  1 root        wheel        627 Aug  3 04:31 make-master-sites-all.sh
-r-xr-xr-x  1 root        wheel       1536 Aug  3 04:31 make-port.sh
-r-xr-xr-x  1 root        wheel        606 Aug  3 04:31 make-showconfig.sh
-r-xr-xr-x  1 root        wheel        601 Aug  3 04:31 realpath.sh
drwxr-xr-x  2 root        freshports     2 Oct 30  2017 sbin
drwxr-xr-x  7 root        wheel          7 Oct 30  2017 usr
drwxr-xr-x  3 freshports  freshports     3 Nov  1  2017 var
drwxr-xr-x  3 root        wheel          3 Oct  6  2017 var.original
-r-xr-xr-x  1 root        wheel        196 Aug  3 04:31 vars.sh
-r-xr-xr-x  1 root        wheel        196 Aug  3 04:31 vars.sh.sample

OK, make-port.sh is there. What about the bin directory?

$ ls -l /var/db/freshports/ports-jail/bin
total 0

Oh. That should not be empty. This is the problem.

iocage

The FreshPorts production server uses iocage for jail management. The extra nullfs mounts are lists below and are explained in this blog post.

[dan@x8dtu:~] $ sudo iocage fstab -l x8dtu-ingress01
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| INDEX |                                                                                  FSTAB ENTRY                                                                                  |
+=======+===============================================================================================================================================================================+
| 0     |                                                                                                                                                                               |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 1     | /iocage/jails/x8dtu-ingress01/root/bin                          /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/bin           nullfs  ro,nosuid         0   0 |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 2     | none                                                            /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/dev           devfs   rw                0   0 |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 3     | /iocage/jails/x8dtu-ingress01/root/lib                          /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/lib           nullfs  ro,nosuid         0   0 |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 4     | /iocage/jails/x8dtu-ingress01/root/libexec                      /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/libexec       nullfs  ro,nosuid         0   0 |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 5     | /iocage/jails/x8dtu-ingress01/root/sbin                         /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/sbin          nullfs  ro,nosuid         0   0 |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 6     | /iocage/jails/x8dtu-ingress01/root/usr/share/mk                 /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/usr/share/mk  nullfs  ro,nosuid,noexec  0   0 |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 7     | /iocage/jails/x8dtu-ingress01/root/usr/sbin                     /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/usr/sbin      nullfs  ro,nosuid         0   0 |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 8     | /iocage/jails/x8dtu-ingress01/root/usr/bin                      /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/usr/bin       nullfs  ro,nosuid         0   0 |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 9     | /iocage/jails/x8dtu-ingress01/root/usr/lib                      /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/usr/lib       nullfs  ro,nosuid         0   0 |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 10    | /iocage/jails/x8dtu-ingress01/root/usr/libexec                  /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/usr/libexec   nullfs  ro,nosuid         0   0 |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
[dan@x8dtu:~] $

The above is what should be mounted. Compare to below:

[dan@x8dtu:~] $ mount | grep x8dtu-ingress01
main_tank/iocage/jails/x8dtu-ingress01 on /iocage/jails/x8dtu-ingress01 (zfs, local, noatime, nfsv4acls)
main_tank/iocage/jails/x8dtu-ingress01/root on /iocage/jails/x8dtu-ingress01/root (zfs, local, noatime, nfsv4acls)
zroot/data/ingress01-testing on /iocage/jails/x8dtu-ingress01/root/usr/home/dan/tmp-fast (zfs, local, noatime, nfsv4acls)
main_tank/data/freshports/backend/cache on /iocage/jails/x8dtu-ingress01/root/var/db/freshports/cache (zfs, local, noatime, noexec, nosuid, nfsv4acls)
main_tank/data/freshports/backend/cache/html on /iocage/jails/x8dtu-ingress01/root/var/db/freshports/cache/html (zfs, local, noatime, noexec, nosuid, nfsv4acls)
main_tank/data/freshports/backend/cache/spooling on /iocage/jails/x8dtu-ingress01/root/var/db/freshports/cache/spooling (zfs, local, noatime, noexec, nosuid, nfsv4acls)
main_tank/data/freshports/backend/queues/archive on /iocage/jails/x8dtu-ingress01/root/var/db/freshports/message-queues/archive (zfs, local, noatime, noexec, nosuid, nfsv4acls)
main_tank/data/freshports/backend/queues/recent on /iocage/jails/x8dtu-ingress01/root/var/db/freshports/message-queues/recent (zfs, local, noatime, noexec, nosuid, nfsv4acls)
main_tank/data/freshports/backend/queues/retry on /iocage/jails/x8dtu-ingress01/root/var/db/freshports/message-queues/retry (zfs, local, noatime, noexec, nosuid, nfsv4acls)
main_tank/data/freshports/ports-jail on /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail (zfs, local, noatime, nosuid, nfsv4acls)
main_tank/data/freshports/ports-jail/var on /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/var (zfs, local, noatime, noexec, nosuid, nfsv4acls)
main_tank/data/freshports/ports-jail/var/db on /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/var/db (zfs, local, noatime, noexec, nosuid, nfsv4acls)
main_tank/data/freshports/ports-jail/var/db/repos on /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/var/db/repos (zfs, local, noatime, noexec, nosuid, nfsv4acls)
zroot/data/freshports/repo/PORTS-head on /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/var/db/repos/PORTS-head (zfs, local, noatime, nfsv4acls)
main_tank/data/freshports/backend/queues/incoming on /iocage/jails/x8dtu-ingress01/root/var/db/ingress/message-queues/incoming (zfs, local, noatime, noexec, nosuid, nfsv4acls)
devfs on /iocage/jails/x8dtu-ingress01/root/dev (devfs, local, multilabel)
fdescfs on /iocage/jails/x8dtu-ingress01/root/dev/fd (fdescfs)

Yeah, that’s hard to read.

How about this instead:

[dan@x8dtu:~] $ mount | grep /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail
main_tank/data/freshports/ports-jail on /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail (zfs, local, noatime, nosuid, nfsv4acls)
main_tank/data/freshports/ports-jail/var on /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/var (zfs, local, noatime, noexec, nosuid, nfsv4acls)
main_tank/data/freshports/ports-jail/var/db on /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/var/db (zfs, local, noatime, noexec, nosuid, nfsv4acls)
main_tank/data/freshports/ports-jail/var/db/repos on /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/var/db/repos (zfs, local, noatime, noexec, nosuid, nfsv4acls)
zroot/data/freshports/repo/PORTS-head on /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/var/db/repos/PORTS-head (zfs, local, noatime, nfsv4acls)
[dan@x8dtu:~] $

Yeah, stuff is missing. Let’s just restart the jail

[dan@x8dtu:~] $ sudo iocage restart x8dtu-ingress01
* Stopping x8dtu-ingress01
  + Running prestop OK
  + Stopping services OK
  + Removing jail process OK
  + Running poststop OK
* Starting x8dtu-ingress01
  + Started OK
  + Starting services OK

Now we see them all:

[dan@x8dtu:~] $ mount | grep /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail
main_tank/data/freshports/ports-jail on /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail (zfs, local, noatime, nosuid, nfsv4acls)
main_tank/data/freshports/ports-jail/var on /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/var (zfs, local, noatime, noexec, nosuid, nfsv4acls)
main_tank/data/freshports/ports-jail/var/db on /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/var/db (zfs, local, noatime, noexec, nosuid, nfsv4acls)
main_tank/data/freshports/ports-jail/var/db/repos on /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/var/db/repos (zfs, local, noatime, noexec, nosuid, nfsv4acls)
zroot/data/freshports/repo/PORTS-head on /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/var/db/repos/PORTS-head (zfs, local, noatime, nfsv4acls)
/iocage/jails/x8dtu-ingress01/root/bin on /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/bin (nullfs, local, nosuid, read-only)
devfs on /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/dev (devfs, local, multilabel)
/iocage/jails/x8dtu-ingress01/root/lib on /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/lib (nullfs, local, nosuid, read-only)
/iocage/jails/x8dtu-ingress01/root/libexec on /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/libexec (nullfs, local, nosuid, read-only)
/iocage/jails/x8dtu-ingress01/root/sbin on /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/sbin (nullfs, local, nosuid, read-only)
/iocage/jails/x8dtu-ingress01/root/usr/share/mk on /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/usr/share/mk (nullfs, local, noexec, nosuid, read-only)
/iocage/jails/x8dtu-ingress01/root/usr/sbin on /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/usr/sbin (nullfs, local, nosuid, read-only)
/iocage/jails/x8dtu-ingress01/root/usr/bin on /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/usr/bin (nullfs, local, nosuid, read-only)
/iocage/jails/x8dtu-ingress01/root/usr/lib on /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/usr/lib (nullfs, local, nosuid, read-only)
/iocage/jails/x8dtu-ingress01/root/usr/libexec on /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/usr/libexec (nullfs, local, nosuid, read-only)
[dan@x8dtu:~] $

I went to check on the jail. It was empty

[dan@x8dtu:~] $ ls /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/bin
[dan@x8dtu:~] $

What? It’s mounted.

No, it’s not mounted:

[dan@x8dtu:/var/log] $ mount | grep /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail
main_tank/data/freshports/ports-jail on /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail (zfs, local, noatime, nosuid, nfsv4acls)
main_tank/data/freshports/ports-jail/var on /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/var (zfs, local, noatime, noexec, nosuid, nfsv4acls)
main_tank/data/freshports/ports-jail/var/db on /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/var/db (zfs, local, noatime, noexec, nosuid, nfsv4acls)
main_tank/data/freshports/ports-jail/var/db/repos on /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/var/db/repos (zfs, local, noatime, noexec, nosuid, nfsv4acls)
zroot/data/freshports/repo/PORTS-head on /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/var/db/repos/PORTS-head (zfs, local, noatime, nfsv4acls)
[dan@x8dtu:/var/log] $

In the short time between restarting the jail and checking the directory, the nullfs mounts were gone.

What is doing this? Nothing in logs, nothing to be found anywhere.

The story so far

  • The FreshPorts server consists of several jails, one of which uses a chroot, which I call a ports-jail, to differentiate it from a FreeBSD jail.
  • The ports-jail uses 10 nullfs mounts.
  • The nullfs mounts are present when the jail starts.
  • The nullfs mounts disappear sometime later.
  • What is removing these mounts? There was nothing in any logs.

This whole situation had me very confused.

2:20 pm

I amended /etc/fstab to contain all the mounts required by the x8dtu-ingress01 jail. This allowed me to easily test the issue by getting everything remounted easily via mount -a.

I kept monitoring the number of mountpoints via mount | wc -l. 82 was wrong. 92 was right.

I was looking at my script and trying to figure out what step was causing the issue. It didn’t happen every time. I was playing around with sleep 5 and other variations, trying to figure out if it was a timing issue.

I didn’t get anywhere with that.

2:45 pm

By this time, I was convinced something odd was going on with the server, so I rebooted it. Nothing was making sense.

2:59 pm

I open a ticket with the ISP because I can’t get to the FreshPorts server and my IPMI link is not working.

I was thinking I had message up the bootcode when I recently upgraded the zpool. I thought I had missed doing:

gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot …

3:31 pm

After a few short discussions, my ISP showed me this screen shot of the FreshPorts server.

/etc/fstab issues

/etc/fstab issues

OK, so that’s my fault. My edits at 2:20 pm to /etc/fstab have come back to haunt me.

3:41 pm

IPMI access has been restored. I remove my modification from /etc/fstab and reboot the server.

3:45 pm

The server is back online, but FreshPorts is still not healthy. The nullfs mounts are still disappearing and I have no idea why.

4:03 pm

A suggestion on IRC arrives:

[Aug 10 16:03] <@markj> dvl: if you have root access, you can try running 
"dtrace -n 'fbt::nullfs_unmount:entry {printf("%s (ppid %d)", curpsinfo->pr_psargs, curpsinfo->pr_ppid);}'"
to get some idea of where the unmount is coming from

I run the command. A short time later:

[dan@x8dtu:~] $ sudo dtrace -n 'fbt::nullfs_unmount:entry {printf("%s (ppid %d)", curpsinfo->pr_psargs, curpsinfo->pr_ppid);}'
dtrace: description 'fbt::nullfs_unmount:entry ' matched 1 probe
CPU     ID                    FUNCTION:NAME
  6  66394             nullfs_unmount:entry umount -a -F /iocage/jails/x8dtu-ingress01/fstab (ppid 23475)
  6  66394             nullfs_unmount:entry umount -a -F /iocage/jails/x8dtu-ingress01/fstab (ppid 23475)
  6  66394             nullfs_unmount:entry umount -a -F /iocage/jails/x8dtu-ingress01/fstab (ppid 23475)
  6  66394             nullfs_unmount:entry umount -a -F /iocage/jails/x8dtu-ingress01/fstab (ppid 23475)
  6  66394             nullfs_unmount:entry umount -a -F /iocage/jails/x8dtu-ingress01/fstab (ppid 23475)
  6  66394             nullfs_unmount:entry umount -a -F /iocage/jails/x8dtu-ingress01/fstab (ppid 23475)
  6  66394             nullfs_unmount:entry umount -a -F /iocage/jails/x8dtu-ingress01/fstab (ppid 23475)
  6  66394             nullfs_unmount:entry umount -a -F /iocage/jails/x8dtu-ingress01/fstab (ppid 23475)
  6  66394             nullfs_unmount:entry umount -a -F /iocage/jails/x8dtu-ingress01/fstab (ppid 23475)
 13  66394             nullfs_unmount:entry umount -a -F /iocage/jails/x8dtu-nginx01/fstab (ppid 23492)
 13  66394             nullfs_unmount:entry umount -a -F /iocage/jails/x8dtu-nginx01/fstab (ppid 23492)

While markj and I are discussing this, more entries appear:

  6  66394             nullfs_unmount:entry umount -a -F /iocage/jails/x8dtu-ingress01/fstab (ppid 24858)
  6  66394             nullfs_unmount:entry umount -a -F /iocage/jails/x8dtu-ingress01/fstab (ppid 24858)
  6  66394             nullfs_unmount:entry umount -a -F /iocage/jails/x8dtu-ingress01/fstab (ppid 24858)
  6  66394             nullfs_unmount:entry umount -a -F /iocage/jails/x8dtu-ingress01/fstab (ppid 24858)
  6  66394             nullfs_unmount:entry umount -a -F /iocage/jails/x8dtu-ingress01/fstab (ppid 24858)
  6  66394             nullfs_unmount:entry umount -a -F /iocage/jails/x8dtu-ingress01/fstab (ppid 24858)
  6  66394             nullfs_unmount:entry umount -a -F /iocage/jails/x8dtu-ingress01/fstab (ppid 24858)
  6  66394             nullfs_unmount:entry umount -a -F /iocage/jails/x8dtu-ingress01/fstab (ppid 24858)
  6  66394             nullfs_unmount:entry umount -a -F /iocage/jails/x8dtu-ingress01/fstab (ppid 24858)
 15  66394             nullfs_unmount:entry umount -a -F /iocage/jails/x8dtu-nginx01/fstab (ppid 24882)
 15  66394             nullfs_unmount:entry umount -a -F /iocage/jails/x8dtu-nginx01/fstab (ppid 24882)

I am really suspecting iocage now. The fstab files mentioned above are present on disk:

[dan@x8dtu:/var/log] $ ls -l  /iocage/jails/*/fstab
-rw-r--r--  1 root  wheel     0 Oct  3  2017 /iocage/jails/mx-ingress01/fstab
-rw-r--r--  1 root  wheel  1768 Aug 10 18:20 /iocage/jails/x8dtu-ingress01/fstab
-rw-r--r--  1 root  wheel   368 Oct 29  2017 /iocage/jails/x8dtu-nginx01/fstab
-rw-r--r--  1 root  wheel     0 Sep  9  2017 /iocage/jails/x8dtu-pg01/fstab
-rw-r--r--  1 root  wheel     0 Aug  7 21:03 /iocage/jails/x8dtu-pg02.vpn.unixathome.org/fstab

I can see they are used by both x8dtu-ingress01 and x8dtu-nginx01. The file contents is what you saw for iocage fstab -l earlier in this post.

NOTE: pid 23475 was no longer around, but knowing what that was would have helped us.

4:14pm

markj has another idea:

[Aug 10 16:11] <@markj> oh, dwatch can get the process tree. try "dwatch -R fbt::nullfs_mount:entry'" instead
[Aug 10 16:14] <@markj> dvl: the dwatch invocation should show more info about the ancestors of the umount process. hopefully 
enough to figure out what's going on

A short while later:

[dan@x8dtu:~] $ sudo dwatch -R fbt::nullfs_mount:entry
INFO Watching 'fbt::nullfs_mount:entry' ...
2018 Aug 10 20:16:48 0.0 mount_nullfs[26186]: mount_nullfs -o ro -o nosuid /iocage/jails/x8dtu-ingress01/root/bin /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/bin
 -+= 26131 0.0 /bin/sh /usr/local/libexec/nagios-custom/check_py_iocage_host_vs_jails.sh
  \-+= 26175 0.0 /usr/local/bin/python3.6 /usr/local/bin/iocage chroot x8dtu-ingress01 file /bin/sh
    \-+= 26185 0.0 mount -a -F /iocage/jails/x8dtu-ingress01/fstab
      \-+= 26186 0.0 mount_nullfs -o ro -o nosuid /iocage/jails/x8dtu-ingress01/root/bin /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/bin
2018 Aug 10 20:16:48 0.0 mount_nullfs[26187]: mount_nullfs -o ro -o nosuid /iocage/jails/x8dtu-ingress01/root/lib /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/lib
 -+= 26131 0.0 /bin/sh /usr/local/libexec/nagios-custom/check_py_iocage_host_vs_jails.sh
  \-+= 26175 0.0 /usr/local/bin/python3.6 /usr/local/bin/iocage chroot x8dtu-ingress01 file /bin/sh
    \-+= 26185 0.0 mount -a -F /iocage/jails/x8dtu-ingress01/fstab
      \-+= 26187 0.0 mount_nullfs -o ro -o nosuid /iocage/jails/x8dtu-ingress01/root/lib /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/lib
2018 Aug 10 20:16:48 0.0 mount_nullfs[26188]: mount_nullfs -o ro -o nosuid /iocage/jails/x8dtu-ingress01/root/libexec /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/libexec
 -+= 26131 0.0 /bin/sh /usr/local/libexec/nagios-custom/check_py_iocage_host_vs_jails.sh
  \-+= 26175 0.0 /usr/local/bin/python3.6 /usr/local/bin/iocage chroot x8dtu-ingress01 file /bin/sh
    \-+= 26185 0.0 mount -a -F /iocage/jails/x8dtu-ingress01/fstab
      \-+= 26188 0.0 mount_nullfs -o ro -o nosuid /iocage/jails/x8dtu-ingress01/root/libexec /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/libexec
2018 Aug 10 20:16:48 0.0 mount_nullfs[26189]: mount_nullfs -o ro -o nosuid /iocage/jails/x8dtu-ingress01/root/sbin /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/sbin
 -+= 26131 0.0 /bin/sh /usr/local/libexec/nagios-custom/check_py_iocage_host_vs_jails.sh
  \-+= 26175 0.0 /usr/local/bin/python3.6 /usr/local/bin/iocage chroot x8dtu-ingress01 file /bin/sh
    \-+= 26185 0.0 mount -a -F /iocage/jails/x8dtu-ingress01/fstab
      \-+= 26189 0.0 mount_nullfs -o ro -o nosuid /iocage/jails/x8dtu-ingress01/root/sbin /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/sbin
2018 Aug 10 20:16:48 0.0 mount_nullfs[26190]: mount_nullfs -o ro -o nosuid -o noexec /iocage/jails/x8dtu-ingress01/root/usr/share/mk /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/usr/share/mk
 -+= 26131 0.0 /bin/sh /usr/local/libexec/nagios-custom/check_py_iocage_host_vs_jails.sh
  \-+= 26175 0.0 /usr/local/bin/python3.6 /usr/local/bin/iocage chroot x8dtu-ingress01 file /bin/sh
    \-+= 26185 0.0 mount -a -F /iocage/jails/x8dtu-ingress01/fstab
      \-+= 26190 0.0 mount_nullfs -o ro -o nosuid -o noexec /iocage/jails/x8dtu-ingress01/root/usr/share/mk /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/usr/share/mk
2018 Aug 10 20:16:48 0.0 mount_nullfs[26191]: mount_nullfs -o ro -o nosuid /iocage/jails/x8dtu-ingress01/root/usr/sbin /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/usr/sbin
 -+= 26131 0.0 /bin/sh /usr/local/libexec/nagios-custom/check_py_iocage_host_vs_jails.sh
  \-+= 26175 0.0 /usr/local/bin/python3.6 /usr/local/bin/iocage chroot x8dtu-ingress01 file /bin/sh
    \-+= 26185 0.0 mount -a -F /iocage/jails/x8dtu-ingress01/fstab
      \-+= 26191 0.0 mount_nullfs -o ro -o nosuid /iocage/jails/x8dtu-ingress01/root/usr/sbin /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/usr/sbin
2018 Aug 10 20:16:48 0.0 mount_nullfs[26192]: mount_nullfs -o ro -o nosuid /iocage/jails/x8dtu-ingress01/root/usr/bin /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/usr/bin
 -+= 26131 0.0 /bin/sh /usr/local/libexec/nagios-custom/check_py_iocage_host_vs_jails.sh
  \-+= 26175 0.0 /usr/local/bin/python3.6 /usr/local/bin/iocage chroot x8dtu-ingress01 file /bin/sh
    \-+= 26185 0.0 mount -a -F /iocage/jails/x8dtu-ingress01/fstab
      \-+= 26192 0.0 mount_nullfs -o ro -o nosuid /iocage/jails/x8dtu-ingress01/root/usr/bin /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/usr/bin
2018 Aug 10 20:16:48 0.0 mount_nullfs[26193]: mount_nullfs -o ro -o nosuid /iocage/jails/x8dtu-ingress01/root/usr/lib /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/usr/lib
 -+= 26131 0.0 /bin/sh /usr/local/libexec/nagios-custom/check_py_iocage_host_vs_jails.sh
  \-+= 26175 0.0 /usr/local/bin/python3.6 /usr/local/bin/iocage chroot x8dtu-ingress01 file /bin/sh
    \-+= 26185 0.0 mount -a -F /iocage/jails/x8dtu-ingress01/fstab
      \-+= 26193 0.0 mount_nullfs -o ro -o nosuid /iocage/jails/x8dtu-ingress01/root/usr/lib /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/usr/lib
2018 Aug 10 20:16:48 0.0 mount_nullfs[26194]: mount_nullfs -o ro -o nosuid /iocage/jails/x8dtu-ingress01/root/usr/libexec /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/usr/libexec
 -+= 26131 0.0 /bin/sh /usr/local/libexec/nagios-custom/check_py_iocage_host_vs_jails.sh
  \-+= 26175 0.0 /usr/local/bin/python3.6 /usr/local/bin/iocage chroot x8dtu-ingress01 file /bin/sh
    \-+= 26185 0.0 mount -a -F /iocage/jails/x8dtu-ingress01/fstab
      \-+= 26194 0.0 mount_nullfs -o ro -o nosuid /iocage/jails/x8dtu-ingress01/root/usr/libexec /iocage/jails/x8dtu-ingress01/root/var/db/freshports/ports-jail/usr/libexec
2018 Aug 10 20:16:49 0.0 mount_nullfs[26209]: mount_nullfs -o ro -o nosuid -o noexec /iocage/jails/x8dtu-ingress01/root/var/db/freshports/cache/html /iocage/jails/x8dtu-nginx01/root/var/db/freshports/cache/html
 -+= 26131 0.0 /bin/sh /usr/local/libexec/nagios-custom/check_py_iocage_host_vs_jails.sh
  \-+= 26198 0.0 /usr/local/bin/python3.6 /usr/local/bin/iocage chroot x8dtu-nginx01 file /bin/sh
    \-+= 26208 0.0 mount -a -F /iocage/jails/x8dtu-nginx01/fstab
      \-+= 26209 0.0 mount_nullfs -o ro -o nosuid -o noexec /iocage/jails/x8dtu-ingress01/root/var/db/freshports/cache/html /iocage/jails/x8dtu-nginx01/root/var/db/freshports/cache/html
2018 Aug 10 20:16:49 0.0 mount_nullfs[26210]: mount_nullfs -o ro -o nosuid -o noexec /iocage/jails/x8dtu-ingress01/root/var/db/freshports/signals /iocage/jails/x8dtu-nginx01/root/var/db/freshports/signals
 -+= 26131 0.0 /bin/sh /usr/local/libexec/nagios-custom/check_py_iocage_host_vs_jails.sh
  \-+= 26198 0.0 /usr/local/bin/python3.6 /usr/local/bin/iocage chroot x8dtu-nginx01 file /bin/sh
    \-+= 26208 0.0 mount -a -F /iocage/jails/x8dtu-nginx01/fstab
      \-+= 26210 0.0 mount_nullfs -o ro -o nosuid -o noexec /iocage/jails/x8dtu-ingress01/root/var/db/freshports/signals /iocage/jails/x8dtu-nginx01/root/var/db/freshports/signals

Ahah! It’s one of my Nagios check scripts, /usr/local/libexec/nagios-custom/check_py_iocage_host_vs_jails.sh

The script

This was the script in use. Do not use it. You’ll find out why later.

[dan@x8dtu:/usr/local/libexec/nagios-custom] $ cat check_py_iocage_host_vs_jails.sh
#!/bin/sh

HOSTVERSION=`/usr/bin/file /bin/sh`

JAILS=`/usr/local/bin/iocage list -H | /usr/bin/cut -f 2 -w`
ERRORS=''
for jail in ${JAILS}
do
  JAILVERSION=`/usr/local/bin/iocage chroot ${jail} 'file /bin/sh'`
  if [ "${JAILVERSION}" != "${HOSTVERSION}" ]
  then
    ERRORS="jail '${jail}' is ${JAILVERSION}"
  fi
done

if [ "${ERRORS}" == "" ]
then
  echo 'All jails match the host'
  exit 0
else
  echo "WARNING: HOST and jails are NOT in sync: host = ${HOSTVERSION} but ${ERRORS}"
  exit 2
fi
[dan@x8dtu:/usr/local/libexec/nagios-cuFalstom] $

This script is used to verify that the jail and the host are running the same version of world.

It was the iocage chroot command which was running umount.

I tried to track down why this was suddenly happening.

What changed?

I know Ansible was run earlier in the day. It was installing new scripts on each host for nrpe.

What happened? check_py_iocage_host_vs_jails.sh got updated.

My conclusion: local changes to the file on the x8dtu-ingress01 never pushed upstream.

Find the original

I checked ZFS snapshots. None.

I checked Bacula. I didn’t backup /usr/local/libexec/nagios-custom, presumably because it is in Ansible, but I do now.

Fail. No original script found.

4:30 pm Stop the bleeding

Now that I knew the cause, I commented out that entry in /usr/local/etc/nrpe.cfg and restarted nrpe3.

I restarted the two jails to fix their mount points.

The messed up commits, three or four of them, were rerun.

Success. The website is caught up and online.

Tweet sent.

Status page updated.

Original script found

I found the original script. Ansible backed it up, as instructed.

It was at /usr/local/libexec/nagios-custom/check_py_iocage_host_vs_jails.sh.39690.2018-08-10@15:59:42~

What was the difference?

# svn di
Index: check_py_iocage_host_vs_jails.sh
===================================================================
--- check_py_iocage_host_vs_jails.sh	(revision 1197)
+++ check_py_iocage_host_vs_jails.sh	(working copy)
@@ -2,11 +2,11 @@

 HOSTVERSION=`/usr/bin/file /bin/sh`

-JAILS=`/usr/local/bin/iocage list -H | /usr/bin/cut -f 2 -w`
+JAILS=`/usr/local/bin/iocage list | /usr/bin/egrep -v '^\| JID' | grep '^|' | /usr/bin/cut -f 4 -w`
 ERRORS=''
 for jail in ${JAILS}
 do
-  JAILVERSION=`/usr/local/bin/iocage chroot ${jail} 'file /bin/sh'`
+  JAILVERSION=`/usr/local/bin/iocage exec ${jail} 'file /bin/sh'`
   if [ "${JAILVERSION}" != "${HOSTVERSION}" ]
   then
     ERRORS="jail '${jail}' is ${JAILVERSION}"

I’ll keep the line 10 and ignore line 11. Line 10 is better.

Line 16 is the faulty one. exec is better than chroot.

What does the man page say?

     chroot        Chroot into a jail without actually starting the jail
                   itself.  Useful for initial setup like setting a root
                   password or configuring networking.  A command can be
                   specified as with the normal system, see chroot(8).
     exec          Execute a command inside the specified jail.  This is an
                   iocage UUID/NAME wrapper for jexec(8).  After invoking
                   exec, specify the jail, any commands to run inside that
                   jail, and any arguments for those commands.

To me, they should both work, but they do not. Clearly, chroot was mounting and umounting the fstab contents, by design.

I prefer chroot because it means the script can test non-running jails, but obviously I had to use exec instead.

Notes

  • I committed that ‘new’ script into the repo used by Ansible
  • Don’t work local, work on Ansible.
  • If you do work local, commit it back.
  • Backup your important files, even if they are supposed to be in Ansible
  • When things went wrong, I first suspected something Ansible did. I commented out the two new commands added to nrpe.cfg.
  • When I suspected Ansible, I should have looked around the /usr/local/libexec/nagios-custom directory and I’d have noticed the updated files.
  • I should learn more about dtrace and dwatch, especially on how to form a simple query.
  • Perhaps I should be monitoring mount points. At least then I’d have had more clues.

6:49 pm

My thanks to markj who provided the dtrace and dwatch queries which helped track down my mistake.

It’s pub time. Good night.

Mar 112018
 

Databases use relational integrity to enforce expected situations. A common scenario is duplicates. Case in point, I present the port_dependencies table:

freshports.org=# \d port_dependencies
                    Table "public.port_dependencies"
         Column         |     Type     | Collation | Nullable | Default 
------------------------+--------------+-----------+----------+---------
 port_id                | integer      |           | not null | 
 port_id_dependent_upon | integer      |           | not null | 
 dependency_type        | character(1) |           | not null | 
Indexes:
    "port_dependencies_pkey" PRIMARY KEY, btree (port_id, port_id_dependent_upon, dependency_type)
Foreign-key constraints:
    "port_dependencies_port_id_dependent_upon_fkey" FOREIGN KEY (port_id_dependent_upon) REFERENCES ports(id) ON UPDATE CASCADE ON DELETE CASCADE
    "port_dependencies_port_id_fkey" FOREIGN KEY (port_id) REFERENCES ports(id) ON UPDATE CASCADE ON DELETE CASCADE
Triggers:
    port_dependencies_delete_clear_cache AFTER DELETE ON port_dependencies FOR EACH ROW EXECUTE PROCEDURE port_dependencies_delete_clear_cache()
    port_dependencies_insert_clear_cache AFTER INSERT ON port_dependencies FOR EACH ROW EXECUTE PROCEDURE port_dependencies_insert_clear_cache()

freshports.org=# 

For those not familiar with FreeBSD ports, each port (you could also refer to them as a package or application) can have zero or more dependencies. The FreshPorts database extracts and lists these dependencies for convenient viewing.

As you can see in the above table definition, for a specific (port_id), a given dependency type (port_id_dependent_upon, dependency_type) can only occur once. There is no reason for a port to be dependent upon something twice.

Or so you, or I, might think.

Then this happened:

Could not execute SQL SELECT PortsDependenciesAdd( 'net/ceph', 'devel/boost-python-libs', 'L' ) as result ... 
maybe invalid? ERROR:  duplicate key value violates unique constraint "port_dependencies_pkey"

It was part of this commit on net/ceph. Let’s find out where FreshPorts went wrong.

Clearly, something tried to record a duplicate dependency. The code doesn’t look for it, and did not detect the violation.

The log file

FreshPorts logs a lot of stuff, because it can. It has always been useful for situations such as this.

Here is what I found:

depends with this: 'libboost_python.so:devel/boost-python-libs@py27 libboost_python.so:devel/boost-python-libs
libboost_thread.so:devel/boost-libs libleveldb.so:databases/leveldb libnss3.so:security/nss
libcryptopp.so:security/cryptopp libsnappy.so:archivers/snappy libcurl.so:ftp/curl libxml2.so:textproc/libxml2
libexpat.so:textproc/expat2 liblz4.so:archivers/liblz4 libplds4.so:devel/nspr libtcmalloc.so:devel/google-perftools
libfuse.so:sysutils/fusefs-libs libintl.so:devel/gettext-runtime libldap-2.4.so.2:net/openldap24-client'

The next entry referred to:

The 'L' depends are: devel/boost-python-libs@py27 - devel/boost-python-libs - devel/boost-libs - 
databases/leveldb - security/nss - security/cryptopp - archivers/snappy - ftp/curl - textproc/libxml2 - 
textproc/expat2 - archivers/liblz4 - devel/nspr - devel/google-perftools - sysutils/fusefs-libs -
devel/gettext-runtime - net/openldap24-client

Then the code starts processing them, one by one:

adding in devel/boost-python-libs@py27 which converts to 'devel/boost-python-libs'
sql is SELECT PortsDependenciesAdd( 'net/ceph', 'devel/boost-python-libs', 'L' ) as result
result is 1

As you can see, it strips the flavor (@py27) from the dependency. I think I know where this is going.

And then the next one:

adding in devel/boost-python-libs which converts to 'devel/boost-python-libs'
sql is SELECT PortsDependenciesAdd( 'net/ceph', 'devel/boost-python-libs', 'L' ) as result
Could not execute SQL SELECT PortsDependenciesAdd( 'net/ceph', 'devel/boost-python-libs', 'L' ) as result ...
maybe invalid? ERROR:  duplicate key value violates unique constraint "port_dependencies_pkey"
DETAIL:  Key (port_id, port_id_dependent_upon, dependency_type)=(45100, 25346, L) already exists.
CONTEXT:  SQL statement "INSERT INTO port_dependencies (port_id, port_id_dependent_upon, dependency_type)
            VALUES (l_PortID, l_PortIDDependency, p_DependencyType)"
PL/pgSQL function portsdependenciesadd(text,text,text) line 19 at SQL statement
that failed

Ahh, OK, that’s why. I see now.

Let’s go to the source.

The source

This isn’t the source, it’s the Makefile from that commit. Looking at lines 25-28 we see:

25	LIB_DEPENDS=    \
26	        ${PY_BOOST} \
27	        libboost_python.so:devel/boost-python-libs \
28	        libboost_thread.so:devel/boost-libs \

I bet it’s coming from here:

$ make -V PY_BOOST
libboost_python.so:devel/boost-python-libs@py27

Yes, that’s it. Now we know where and why the duplicates are arising.

How to fix this

I could take one of two approaches:

  1. Detect and eliminate duplicates within the code.
  2. Within the database, ignore attempts to add duplicates.

The database is wonderful at doing this. I could write code, and it might be buggy. I trust the database for this, much more than I trust my code. Plus, I’ll have less code to maintain.

I’ll take option #2.

The code

Here is the statement which inserts the tuple into the database:

INSERT INTO port_dependencies (port_id, port_id_dependent_upon, dependency_type)
      VALUES (l_PortID, l_PortIDDependency, p_DependencyType);

There is no detection of duplicate rows.

In different situations, I have used this approach to detect duplicates during insertion:

INSERT INTO port_dependencies (port_id, port_id_dependent_upon, dependency_type)
  SELECT l_PortID, l_PortIDDependency, p_DependencyType
   WHERE NOT EXISTS (SELECT port_id, port_id_dependent_upon, dependency_type
                       FROM port_dependencies
                      WHERE port_id                = l_PortID
                        AND port_id_dependent_upon = l_PortIDDependency
                        AND dependency_type        = p_DependencyType);

The above may be easier to follow if I do the variable substitution:

 INSERT INTO port_dependencies (port_id, port_id_dependent_upon, dependency_type)
  SELECT 45100, 25346, 'L'
   WHERE NOT EXISTS (SELECT port_id, port_id_dependent_upon, dependency_type
                       FROM port_dependencies
                      WHERE port_id                = 45100
                        AND port_id_dependent_upon = 25346
                        AND dependency_type        = 'L');

That works well, exactly as expected. It does what I need.

But there is a better way

Then I posted on Twitter about this, with the above example. My cousin, Scott Walsh, mentioned ON CONFLICT. This feature arrived with PostgreSQL 9.5 (see docs), which was released in early 2016. I haven’t kept up. FreshPorts added dependency support in 2011.

I immediately tried this approach:

INSERT INTO port_dependencies (port_id, port_id_dependent_upon, dependency_type)
freshports.org-#   SELECT 45100, 25346, 'L'
freshports.org-# ON CONFLICT DO NOTHING;
INSERT 0 0

That’s great.

Then I got to thinking… there are other conflicts which could occur, errors which should be brought to my attention. Foreign key violations come to mind.

Reading more of the documentation came up with this solution:

freshports.org=# INSERT INTO port_dependencies (port_id, port_id_dependent_upon, dependency_type)
freshports.org-#   SELECT 45100, 25346, 'L'
freshports.org-# ON CONFLICT ON CONSTRAINT port_dependencies_pkey DO NOTHING;
INSERT 0 0

EDIT 2017.03.17

I see I’ve left my SELECT in that INSERT. The following is what I will use in production:

INSERT INTO port_dependencies (port_id, port_id_dependent_upon, dependency_type)
    VALUES (l_PortID, l_PortIDDependency, p_DependencyType)
    ON CONFLICT ON CONSTRAINT port_dependencies_pkey DO NOTHING;

When running my first test of that code, this is what was found in the logs (two duplicate inserts, no failures):

sql is SELECT PortsDependenciesAdd( 'net/ceph', 'devel/boost-python-libs', 'L' ) as result
NOTICE:   branch is head
NOTICE:   GetPort pathname is=/ports/head/net/ceph
NOTICE:   branch is head
NOTICE:   GetPort pathname is=/ports/head/devel/boost-python-libs
result is 1
adding in devel/boost-python-libs which converts to 'devel/boost-python-libs'
sql is SELECT PortsDependenciesAdd( 'net/ceph', 'devel/boost-python-libs', 'L' ) as result
NOTICE:   branch is head
NOTICE:   GetPort pathname is=/ports/head/net/ceph
NOTICE:   branch is head
NOTICE:   GetPort pathname is=/ports/head/devel/boost-python-libs
result is 1

That’s exactly what I need. Phew.


This code deals specifically with one constraint, port_dependencies_pkey. If that constraint is violated, we have a duplicate key: do nothing, ignore it, as if nothing happened. Please move along, nothing to see here, thank you.

This, for my situation, is exactly what I want. Instead of removing duplicates in the front end code, I can let the database silently handle it. Score.

Thanks to those who helped me find this solution. I hope it helps you too. This is not a one-size-fits-all solution, but it might be applicable to your needs.

Jan 222018
 

Jochen Neumeister noticed that FreshPorts no longer lists phpunit6.

Sure enough, I went searching and failed to find it. It did not turn up.

Then I checked Include deleted ports, then clicked on Search again. Ahh, there it it.

Sure enough, it is marked as deleted. Look for the small icon at the top, just to the right of the category name. It looks like a tombstone with RIP written on it.

Rene Laden mentioned that FreshPorts is confused after r437302 on 30 Mar 2017 15:03:16. That commit message was Recreate port devel/phpunit6 as repo-copy of devel/phpunit. The previous commit was Remove devel/phpunit6 for direct recreation as repo-copy
at 30 Mar 2017 15:01:26.

Those commits are:

  1. remove437301201703301501.v2UF1QU6020080@repo.freebsd.org
  2. recreate437302201703301503.v2UF3GuI022547@repo.freebsd.org

Note that the recreate occurred two minutes after the remove.

What does the log say?

I know FreshPorts can handle a port being deleted. I know it can handle a port being committed to after it is deleted. Let’s find out what happened.

$ cd /usr/FreshPorts/freshports.org/msgs/FreeBSD/archive/2017_03/2017_03_30
$
$ # this is the remove message
$
$ grep -l 'svn commit: r437301' *
2017.03.30.15.10.13.14518.txt.loading
2017.03.30.15.10.13.14518.txt.raw
2017.03.30.15.10.13.14518.txt.xml

$
$ # this is the add message
$
$ grep -l 'svn commit: r437302' *
2017.03.30.15.03.21.25140.txt.loading
2017.03.30.15.03.21.25140.txt.raw
2017.03.30.15.03.21.25140.txt.xml

The file name has the following format: YYYY.MM.DD.HH.MM.SS.process_id.

The timestamp in the file name represents the time at which the incoming email was saved to the queue. Incoming commits are processed in the order in which they arrive at the FreshPorts server.

In this case, the create was processed at 15:03, a full seven minutes before the remove was processed at 15:10.

Clearly, the add was processed before the delete. This doesn’t bother FreshPorts. It can’t tell.

From the logs of the remove, I found:

# # # # Resurrecting deleted ports # # # #

found a deleted port: port='phpunit6', port_id='41762', element_id='789692'
now looking for files which were modified...
  inspecting: 'Remove', '/ports/head/devel/phpunit6/', '437301', '', 'ports', 'devel/phpunit6/'
  finished looking through the files
# # # # Finished resurrecting deleted ports # # # #

I had to think about this slowly.

We are in a port remove, and we are resurrecting a deleted port. How did that port get deleted? Or more precisely, why did FreshPorts think it was deleted and needed resurrecting? This is the only commit which was doing a delete. This commit *is* the remove.

However, even if the commits were processed out of order, this is what I think should have happened:

  1. the Create gets processed: all good
  2. the Remove is processed: port is marked as deleted
  3. another commit gets processed: port is resurrected and everything is fine

My question is: why is devel/phpunit6 still marked as deleted?

Let’s look farther into this.

The subsequent commits

The next commit against devel/phpunit6 was 447659 (201708092330.v79NU35e060658@repo.freebsd.org).

Let’s look into that log file and … see what I found:

$ cd ../../2017_08/2017_08_09
$ grep -l 201708092330.v79NU35e060658@repo.freebsd.org *
2017.08.09.23.30.09.9876.txt.loading
2017.08.09.23.30.09.9876.txt.raw
2017.08.09.23.30.09.9876.txt.xml

From in 2017.08.09.23.30.09.9876.txt.loading I found:

# # # # Resurrecting deleted ports # # # #

found a deleted port: port='phpunit6', port_id='41762', element_id='789692'
now looking for files which were modified...
  inspecting: 'Modify', '/ports/head/devel/phpunit6/Makefile', '447659', '', 'ports', 'devel/phpunit6/Makefile'
  inspecting: 'Modify', '/ports/head/devel/phpunit6/distinfo', '447659', '', 'ports', 'devel/phpunit6/distinfo'
  finished looking through the files
# # # # Finished resurrecting deleted ports # # # #

Huh. It’s trying to resurrect again.

I looked at the most recent commit (456937) and found:

# # # # Resurrecting deleted ports # # # #

found a deleted port: port='phpunit6', port_id='41762', element_id='789692'
now looking for files which were modified...
  inspecting: 'Modify', '/ports/head/devel/phpunit6/Makefile', '456937', '', 'ports', 'devel/phpunit6/Makefile'
  inspecting: 'Modify', '/ports/head/devel/phpunit6/distinfo', '456937', '', 'ports', 'devel/phpunit6/distinfo'
  finished looking through the files
# # # # Finished resurrecting deleted ports # # # #

I will bet every single commit in between is trying to resurrect and is failing.

Let’s find out why.

Resurrection

When looking at the backend code, I concluded that this condition was never true:

if ($category_name eq $port->{category} && $port->{name} eq $port_name) {

I added more debugging statements to the code and reran those commits in the same order:

$ perl load_xml_into_db.pl /usr/FreshPorts/freshports.org/msgs/FreeBSD/archive/2017_03/2017_03_30/2017.03.30.15.03.21.25140.txt.xml -O > ~/tmp/creating.1
$ perl load_xml_into_db.pl /usr/FreshPorts/freshports.org/msgs/FreeBSD/archive/2017_03/2017_03_30/2017.03.30.15.10.13.14518.txt.xml -O > ~/tmp/removing.1

After running the first command, the port was no longer in a deleted state. That makes sense.

The second command, which removed the port, left the port in a deleted state. That makes sense too.

However, examination of the the removal log revealed the cause of the problem:

# # # # Resurrecting deleted ports # # # #

at the top of loop with $portname='devel/phpunit6'
found a deleted port: category='devel', port='phpunit6', port_id='41762', element_id='789692'
now looking for files which were modified...
  inspecting: 'Remove', '/ports/head/devel/phpunit6/', '437301', '', 'ports', 'devel/phpunit6/'
  which splits to: $subtree='', $category_name='ports', $port_name='head', and $extra='devel/phpunit6/'
  finished looking through the files
# # # # Finished resurrecting deleted ports # # # #

The problem is the split. It is not so much the split as the fact that pathnames changed and much work was done on branches.

Instead of having separate calls to split, I should have a function which does this instead.

As you can see on line 7, the values for $category_name and $port_name are wrong. The split caused this.

You might be thinking: yeah, but the code should not resurrect this port when the commit is deleting. Yes, that is correct. However, the code first wants to detect that the file in questions belongs to a deleted port. This is the criteria for resurrection: an action occurs on a deleted port. Once this situation is detected, the code looks at the action being performed on that file. It the action is an Add or a Modify, only then will the port will be resurrected. If the action is a Delete, the resurrection will not occur.

What code has this issue

I starting grepping for split. I have included the output which relates directly to pathnames.

branches.pm:  my ($emptyLeadingSlash, $subtree, $branch, $category, $port) = split/\//,$pathname, 5;
observer_commits.pm:			my ($subtree, $category_name, $port_name, $extra) = split/\//,$filename, 4;
verifyport.pm:		  ($emptyLeadingSlash, $subtree,            $branch, $category_name, $port_name, $extra) = split/\//,$filename, 6;
verifyport.pm:		  ($emptyLeadingSlash, $subtree, $branches, $branch, $category_name, $port_name, $extra) = split/\//,$filename, 7;
verifyport.pm:		($subtree, $category_name, $port_name, $extra) = split/\//,$filename, 4;
verifyport.pm:				my ($subtree, $category_name, $port_name, $extra) = split/\//,$filename, 4;

Yes, this is not idea. I should create a new dedicated function.

The code in branches.pm appears to be correct, but that split is within the GetBranchFromPathName() function. The goal there is not to determine port or category, but branch. It is doing the right thing, and is outside scope for a pathname function.

My immediate thought is the observer_commits.pm split may be related to issue #56 and I will leave that for another day.

That leaves only the verifyport.pm code.

Code changes

Initially, I thought that creating a perl function which returns multiple parameters was beyond my skill set, but after sleeping upon it, I created a function which returned a hash. In the interim, I fixed the splits in verifyport.pm to be correct, and later converted them to a function call.

This seems to be the correct approach and includes the first two splits shown above:

                # define them first
                my ($emptyLeadingSlash, $subtree, $branches, $branch, $category_name, $port_name, $extra);

                # depending on which branch we are one, we need to split this path differently
                if ($CommitBranch eq $FreshPorts::Constants::HEAD)
                {
                  ($emptyLeadingSlash, $subtree,            $branch, $category_name, $port_name, $extra) = split/\//,$filename, 6;
                }
                else
                {
                  ($emptyLeadingSlash, $subtree, $branches, $branch, $category_name, $port_name, $extra) = split/\//,$filename, 7;
                }

The above them became the basis for the new function.

The second bit of code is definitely wrong.

        print "\n\nThat message is all done under Commit ID = '$commit_log_id'\n";

        print "the size of \@Files is ", scalar(@{$Files}), "\n";

        #
        # in this loop assign a value to needs_refresh for each port
        #
        foreach $value (@{$Files}) {
                ($action, $filename, $revision, $commit_log_element_id, $element_id) = @$value;

                ($subtree, $category_name, $port_name, $extra) = split/\//,$filename, 4;
                print "FILE ==STARTING _CompileListOfPorts: $action, $filename, $revision, $subtree, $category_name, ";
                if (defined($port_name)) {
                        print "$port_name, ";
                }

                if (defined($extra)) {
                        print "$extra, ";
                }

                print "$commit_log_element_id\n";

Sample output from this code (for revision = 458079 and message_id = 201801042012.w04KCc56056293@repo.freebsd.org) is:

That message is all done under Commit ID = '687179'
the size of @Files is 3
FILE ==: Add, /ports/head/databases/rubygem-mysql/files/, 458079, , ports, head, databases/rubygem-mysql/files/, 3407977
FILE ==: Add, /ports/head/databases/rubygem-mysql/files/patch-ext-mysql_api-mysql.c, 458079, , ports, head, databases/rubygem-mysql/files/patch-ext-mysql_api-mysql.c, 3407978
FILE ==: Modify, /ports/head/databases/rubygem-mysql/Makefile, 458079, , ports, head, databases/rubygem-mysql/Makefile, 3407979

On every line, you can see an empty value for $subtree, the $category_name is always ‘ports’, and the $port_name is always ‘head’. Yeah, that needs to be fixed too.

Testing the changes

The changes were tested by running the Create commit, then the Add commit. The incorrect situation was now restored with the port in a Deleted state.

Then I ran a later commit (23 Apr 2017 09:04:40, revision 439210 – 201704230904.v3N94eZs084807@repo.freebsd.org) in the series (

# # # # Resurrecting deleted ports # # # #

at the top of loop with $portname='devel/phpunit6'
found a deleted port: category='devel', port='phpunit6', port_id='41762', element_id='789692'
now looking for files which were modified...
  inspecting: 'Modify', '/ports/head/devel/phpunit6/Makefile', '439210', 'ports', 'devel', 'Makefile'
  which splits to: $subtree='ports', $category_name='devel', $port_name='phpunit6', and $extra='Makefile'
  ...found a file from that port
  ...hmmm, we are modifying a file for a port which is deleted...  ........ I will resurrect that port for you
Port devel/phpunit6 needs to be Resurrected
  ........ done!
  inspecting: 'Modify', '/ports/head/devel/phpunit6/distinfo', '439210', 'ports', 'devel', 'distinfo'
  which splits to: $subtree='ports', $category_name='devel', $port_name='phpunit6', and $extra='distinfo'
  ...found a file from that port
  ...hmmm, we are modifying a file for a port which is deleted...  ........ I will resurrect that port for you
Port devel/phpunit6 needs to be Resurrected
  ........ done!
  finished looking through the files
# # # # Finished resurrecting deleted ports # # # #

As you can see, it has correctly decided to resurrect that deleted port.

Should another Remove/Create pair of commits be processed in reverse order, the next commit will automatically correct the situation.

PORTREVISION issues

Why are all the commits listed as PORTREVISION = 6.0.11?

I reviewed the log for the However, deep in the log for the 439210 commit. I found:

This port is deleted: not refreshing.
This port is deleted: not saving.

I have to fix that. None of the changes are being saved. The ports tree has been svn up’d correctly, but the values are not being extracted and placed into the database.

That problem would explain why all the subsequent commits have the same revision.

The fix involves altering the local copy of the port as the database version is updated:

if ($category_name eq $port->{category} && $port->{name} eq $port_name) {
        print "  ...found a file from that port\n";
        if ($action eq $FreshPorts::Constants::ADD || $action eq $FreshPorts::Constants::MODIFY) {
                print "  ...hmmm, we are modifying a file for a port which is deleted...";
                print "  ........ I will resurrect that port for you\n";
                FreshPorts::Utilities::ReportError('notice', "Port $category_name/$port_name needs to be Resurrected", 0);
                $element->{id}     = $port->{element_id};
                $element->{status} = $FreshPorts::Element::Active;
                $element->update_status();

                # this will ensure that the port is updated  at the end of the commit processing.
                # deleted ports are not update.
                $port->{status} = $FreshPorts::Element::Active;
                print "  ........ done!\n";
        }
}

Lines 11-14 fix that issue. Each port has a corresponding entry in the element table. When the element status is updated, the port status is automatically updated. These lines reflect that action.

Why not process in order?

To process commits in order is easy.

Watch svn for any increments in revision number, pull that revision, process it.

It sounds trivial, but it’s a matter of coding.

The changes include:

  1. Create a process to monitor the repo
  2. Alter the XML file creation to source data only from the commit repo. This information is currently pulled from the commit email, which contains considerably more information.

Got patches?

Additional work to be completed

Following from just fixing this bug, there is additional work arising out of this.

Other ports

I have no doubt that this bug has affected other ports which will be incorrectly marked as being deleted. It may be relatively easy to compare the list of ports against INDEX to verify which ports have the incorrect status.

Other commits

Every commit after such a remove such as this one must be rerun. This is the fallout from the PORTREVISION issues mentioned above.

Dec 152017
 

For several weeks, FreshPorts had a vuxml processing problem. In this blog post, I will share what I found.

Introduction

Incoming events for FreshPorts are handled by a small Python script which checks for flags and incoming commits. This ensures that work is carried out serially because parallel work can have unintended consequences. After each task is completed, the flags are checked, and if work is to be done, that task is performed before commit processing continues.

If an incoming commit touches the security/vuxml/vuln.xml file, a flag is set to indicate that vuxml processing is ready to proceed.

That processing is invoked via this call from the process_vuxml.sh script:

/usr/local/bin/perl ./process_vuxml.pl < ${VULNFILE} >> ${LOGFILE}

What triggered the issue?

On Tue Oct 24 21:59:23 2017 UTC, with Revision 4906, I committed a change to process_vuxml.pl with this commit message:

Stop using HANDLE and start using File::IO

This will hopefully avoid: Name "main::HANDLE" used only once: possible typo at ./process_vuxml.pl line 89.

This should have been included in revision 4905

That patch was:

--- scripts/trunk/process_vuxml.pl	2017/10/09 15:10:01	4836
+++ scripts/trunk/process_vuxml.pl	2017/10/24 21:59:23	4906
@@ -19,7 +19,9 @@
 use warnings;
 use Digest::SHA qw(sha256_hex);
 use autodie qw(:default);
+use IO::File;
 
+use committer_opt_in;
 use database;
 use vuxml;
 use vuxml_parsing;
@@ -38,6 +40,9 @@
     my %vulns;
     my @vulns;
 
+    my $fh;
+    my $p;
+
     # slurp vuln.xml whole.
     local $/;
 
@@ -85,25 +90,27 @@
 
                 if ($updateRequired)
                 {
-                    open(HANDLE, '<', \$vulns{$v});
-            		my $p = FreshPorts::vuxml_parsing->new(Stream        => *HANDLE,
-                                                           DBHandle      => $dbh,
-                                                           UpdateInPlace => 1);
-                                                           
-            		$p->parse_xml($csum);
+                    $fh = IO::File->new();
+                    if ($fh->open('< ' . $vulns{$v})) {
+                		$p = FreshPorts::vuxml_parsing->new(Stream        => $fh,
+                                                               DBHandle      => $dbh,
+                                                               UpdateInPlace => 1);
+
+                		$p->parse_xml($csum);
+
+                        if ($p->database_updated())
+                        {
+                            print "yes, the database was updated\n";
+                        }
+                        else
+                        {
+                            print "no, the database was NOT updated\n";
+                            next;
+                        }
 
-                    close HANDLE;
-                    # process $vulns{$v} via vuxml_processing
-                    
-                    if ($p->database_updated())
-                    {
-                        print "yes, the database was updated\n";
-                    }
-                    else
-                    {
-                        print "no, the database was NOT updated\n";
-                        next;
+                        $fh->close;
                     }
+                    # process $vulns{$v} via vuxml_processing
 
                     print 'invoking vuxml_mark_commits with ' . $v . "\n";
         			my $CommitMarker = FreshPorts::vuxml_mark_commits->new(DBHandle => $dbh,

Please note (line numbers refer to the left side of the page, not actual line numbers in the source file):

  1. Line 17 declares $fh, a file handle
  2. Line 18 declares $p, perviously declared on line 28

How are vulns processed

This is an overview of how FreshPorts processing a new vuln.xml file.

  1. Split the file up into individual entries
  2. If the vid for this vuln is not found in the database, add it.
  3. If the vid is found, compare the current checksum with the value in the database
  4. If the checksums are the same, ignore this vuln entry; it has not changed.
  5. If the checksum differs, delete the vuln from the database and process this vuln as if it were new

Symptoms & debugging

No new vuxml entries were being processed. While debugging the code I thought it was failing on this this:

if ($fh->open('<' . $vulns{$v})) {

You can see that version of the code in earlier revisions of that gist.

Eventually I discovered that the parameters to IO::File open and open are not the same. I changed the above code to:

if ($fh->open(\$vulns{$v}, '<')) {

The next hurdle was this error:

process_vuxml.pl:no element found at line 1, column 0, byte 0 at /usr/local/lib/perl5/site_perl/mach/5.24/XML/Parser.pm line 187.

It was another week or more before I got back to debugging.

I soon discovered that the first vuln would be processed properly, without error. The second would error.

That would lead me to think this was a scope issue; a stale variable or similar.

After several solutions, I settled up on this change, which seemed to be the simplest:

--- scripts/trunk/process_vuxml.pl	2017/12/08 23:53:09	5009
+++ scripts/trunk/process_vuxml.pl	2017/12/10 18:06:26	5014
@@ -59,6 +59,7 @@
 	my $dbh;
 	$dbh = FreshPorts::Database::GetDBHandle();
 	if ($dbh->{Active}) {
+        $fh = IO::File->new();
         my $vuxml = FreshPorts::vuxml->new( $dbh );
           
         eval {
@@ -90,7 +91,6 @@
 
                 if ($updateRequired)
                 {
-                    $fh = IO::File->new();
                     if ($fh->open(\$vulns{$v}, '<')) {
                         $p = FreshPorts::vuxml_parsing->new(Stream        => $fh,
                                                             DBHandle      => $dbh,

As shown, the creation of $fh was moved outside a loop. $p is a Class variable. I discovered this in the vuxml_parsing.pm code:

# Class variables

our ($VuXML);

# Call like this:
#
# $v = new FreshPorts::vuxml_parsing(DBHandle      => $dbh,
#                                    Stream        => IO::File,
#                                    UpdateInPlace => $UpdateInPlace);
#
# DBHandle is a database handle from the DBI module. Required (well,
# if you want to write to a database it's necessary)
#
# Stream is optional, and defaults to reading STDIN.  Pass either an
# IO::Handle or a glob: *GLOB or a glob ref: \*GLOB
#
# UpdateInPlace is optional, defaults to TRUE.  Indicates whether or not
# the code should attempt to update existing VuXML entries.
#

sub new
{
    my $caller = shift;
    my $class  = ref($caller) || $caller;
    my %args   = ref( $_[0] ) eq 'HASH' ? %{ shift() } : @_;
    my $self;

    # There can be only one! Since we keep $self hanging around as a
    # Class variable, more than one instance would be a disaster.
    # Therefore, all calls to new() (except the first or when
    # DESTROY() has been called) just return a reference to the same
    # object.

    if ( $VuXML && $VuXML->isa(__PACKAGE__) ) {
        $self = $VuXML;
        return bless $self, $class;
    }

    $VuXML = $self = {};
    bless $self, $class;

This means that the value for $fh initially passed into $p (on line 35 of the first patch on this page) is the same $fh used on every subsequent call. That is what led me to moving the $fh = IO::File->new(); line.

With that, everything worked.

The later code changes

While looking at this, I reverted some of the original patch. In particular:

  1. The $my declarations on lines 17 & 18 were reverted.
  2. The committer_opt_in from line 9 was reverted. It seems to be an accidental add; it was never used.

My thanks to those that helped along the way. This was not an easy thing for me to solve.

Oct 102017
 

The previous post on this blog was nearly 10 months ago. However, things have not stood still. There have been a few issues fixed and some are still in progress.

The biggest thing underway is a major rework of the backend, the part that processes the incoming commits. There is nothing major; no huge code rewrites.

The basics remain:

  • FreshPorts gets an email from one of the commit mailing list.
  • That email is converted to XML and saved to disk, in a queue.
  • The queue is processed, one commit at a time.
  • An svn up is run.
  • A bunch of make -V occur
  • The database is updated.
  • Profit.

In current production, all of this occurs on one host.

In the new system, I will be taking advantage / exploiting many features of FreeBSD which were not available when FreshPorts started.

The hardware changes

The main change is ZFS. The existing system is on a hardware RAID card with 8 drives. This system was deployed in mid-2006 and it has been a great worker.

The new system boots off a pair of SSDs on ZFS zroot. I plan to make use of beadm for upgrades.

Also present are two 5TB drives, also in a ZFS mirror.

The OS has been installed, but so has most of the FreshPorts code. Over the past few weeks I have been redesigning the back-end to cater for the use of multiple jails. Why jails? I want to separate the main components of the system. This makes it easier to upgrade individual components without affecting the others. It also increase security by isolating various components to their respective worlds.

Jails

There are four jails on this server:

  1. nginx01 – runs the website, and php-fpm, because the website is based on PHP
  2. pg01 – Run PostgreSQL, the database which stores all the good stuff. No public IP address. Accessed over lo1 via a 127.1.0.0/24 address.
  3. mx-ingress01 – accepts incoming email and passes it on to the next stage.
  4. ingress01 – accepts email from mx-ingress01 and performs the basic steps mentioned in the first section of this blog post. No public IP address.

Each of these jails have been created by hand, but the software and daemons have been configured via Ansible.

Porting the code makes some stuff easier

When it came time to deploy the backend FreshPorts code into ingress01, I created a port. This is a departure from current production where code is deployed via svn up.

With the use of a port, many other things came with it:

  • All the backend code goes into /usr/local/libexec/freshports<./li>
  • Main configuration files are now in /usr/local/etc/freshports with symlinks to there from /usr/local/libexec/freshports.
  • /var/db/freshports is the main location for any data external to the database on pg01.
  • /var/db/freshports/cache is the cache for the webpages, with multiple directories for different classes of pages.
  • /var/db/ingress/message-queues/incoming is where the incoming XML data arrives. The ingress user writes here, and the freshports user reads and then moves the file to another queue. All of this occurs in the ingress01 jail.
  • The freshports user has /var/db/freshports/message-queues/ for commit message storage (i.e. the XML). Subdirectories include archive, recent, and retry.
  • The ports-jail is at /var/db/freshports/ports-jail and is where the freshports user does a chroot before running any make -V commands.
  • Within the above named directory, we have /var/db/freshports/ports-jail/var/db/repos which is where the svn up commands are run.
  • Logging is consolidated. It used to all go to /var/log/messages. Now it goes to /var/log/freshports/ and into a number of files contained therein. The port also installed entries to /usr/local/etc/syslog.d/ and to /usr/local/etc/newsyslog.d/ to ensure logs are both captured and rotated.

All of these directories are created by the port. That creating reduces the configuration time. I had scripts, but they were each run manually. Now, it’s fully automated, complete with permission, and user creation.

All of this is handled by the freshports-scripts port.

The freshports daemon

There has long been a freshports deamon, a small python script which looks for incoming XML files and processes them. This is powered by daemontools.

I suppose it could be replaced by a rc.d script, but I might do that another day.

This service is handled by the newly-created freshports-services port.

Current status

As of now, commits are coming in and being processed automatically. The website is being updated and all seems well. There are no known problems.

Remaining work

Before this website goes live, a number of items remain to be completed.

  1. I am not sure that UPDATING and MOVED commits are being processed properly. EDIT 2017.10.14 – done
  2. The fp-listen daemon, which listens for completed commits and clears the appropriate cache bits, is not running. EDIT 2017.10.11 – done
  3. There are a number of cronjob tasks which I should verify are running properly. They always ran as my user id, and they need to run as the freshports user.
  4. I need to get the website running with https under nginx. EDIT 2017.10.14 –
    done

None of these are huge tasks, just time consuming.

Summary

All of these changes are for good. They were not done in the past because there was no reason for them. Now that I’m trying to split tasks up into different jails, the changes make sense and the work is being done.

These changes will also make the inevitable transition under the control of the FreeBSD project. That will not happen soon, but it will happen. I can’t run FreshPorts for ever.

Dec 162016
 

Tonight, I was contacted by jrm on IRC, who told me that FreshPorts was missing something on a search. Yep. It was not in there.

I looked on dev and I saw the same thing.

Finding the cause

I started by enabling debug on dev. I found this SQL:

  SELECT count(*)
  FROM ports P LEFT OUTER JOIN ports_vulnerable    PV  ON PV.port_id       = P.id
               LEFT OUTER JOIN commit_log          CL  ON P.last_commit_id = CL.id
               LEFT OUTER JOIN repo                R   ON CL.repo_id       = R.id
               LEFT OUTER JOIN commit_log_branches CLB ON CL.id            = CLB.commit_log_id
                          JOIN system_branch       SB  ON SB.branch_name   = 'head'
                                                      AND SB.id            = CLB.branch_id,
       categories C, element E

    WHERE P.category_id  = C.id
      AND P.element_id   = E.id  AND 
     E.name ILIKE '%en-aspell%' and E.status = 'A' 

I started by removing one LEFT OUTER JOIN at a time until I found the cause. It was SB.id = CLB.branch_id.

OK, but why?

Let’s look at some of the SQL first.

One line 3, we start with P.last_commit_id, which is the last commit for this port.

From there, we join to the commit_log table to find the repo_id for that commit on line 4.

Line 5 gives us the commit_log_branch table entry for that commit, and we know now which branch this commit touched.

Lines 6 and 7 will restrict the search to head.

I started examining the values from these JOINs. I discovered that some commits did not have entries in the commit_log_branches table. These were commits which occurred before that table was created.

The problem arose when a port whose last commit occurred before the creation of that table. These ports would never be included because, according to the database, there was no such commit on head for them.

The solution

The solution is to add entries to the commit_log_branches table. The following SQL accomplished that.

insert into commit_log_branches
SELECT distinct P.last_commit_id, (select SB.id  FROM system_branch SB where branch_name = 'head')
  FROM ports P LEFT OUTER JOIN commit_log CL ON P.last_commit_id = CL.id
  WHERE NOT EXISTS (SELECT * FROM commit_log_branches CLB WHERE CL.id            = CLB.commit_log_id)
 and P.last_commit_id is not null;

While there, I did this for good measure:

create unique index commit_log_branches_pk on commit_log_branches(commit_log_id, branch_id);

The table now looks like this:

freshports.org=# \d commit_log_branches
 Table "public.commit_log_branches"
    Column     |  Type   | Modifiers 
---------------+---------+-----------
 commit_log_id | integer | not null
 branch_id     | integer | not null
Indexes:
    "commit_log_branches_pk" UNIQUE, btree (commit_log_id, branch_id)
Foreign-key constraints:
    "commit_log_branch_branch_id_fkey" FOREIGN KEY (branch_id) REFERENCES system_branch(id) ON UPDATE CASCADE ON DELETE CASCADE
    "commit_log_branch_commit_log_id_fkey" FOREIGN KEY (commit_log_id) REFERENCES commit_log(id) ON UPDATE CASCADE ON DELETE CASCADE

freshports.org=#