Dan Langille

Jan 22, 2018
 

Jochen Neumeister noticed that FreshPorts no longer lists phpunit6.

Sure enough, I went searching and failed to find it.

Then I checked Include deleted ports and clicked Search again. Ahh, there it is.

Sure enough, it is marked as deleted. Look for the small icon at the top, just to the right of the category name. It looks like a tombstone with RIP written on it.

Rene Ladan mentioned that FreshPorts is confused after r437302 on 30 Mar 2017 15:03:16. That commit message was "Recreate port devel/phpunit6 as repo-copy of devel/phpunit". The previous commit, at 30 Mar 2017 15:01:26, was "Remove devel/phpunit6 for direct recreation as repo-copy".

Those commits are:

  1. remove: r437301 – 201703301501.v2UF1QU6020080@repo.freebsd.org
  2. recreate: r437302 – 201703301503.v2UF3GuI022547@repo.freebsd.org

Note that the recreate occurred two minutes after the remove.

What does the log say?

I know FreshPorts can handle a port being deleted. I know it can handle a port being committed to after it is deleted. Let’s find out what happened.

$ cd /usr/FreshPorts/freshports.org/msgs/FreeBSD/archive/2017_03/2017_03_30
$
$ # this is the remove message
$
$ grep -l 'svn commit: r437301' *
2017.03.30.15.10.13.14518.txt.loading
2017.03.30.15.10.13.14518.txt.raw
2017.03.30.15.10.13.14518.txt.xml

$
$ # this is the add message
$
$ grep -l 'svn commit: r437302' *
2017.03.30.15.03.21.25140.txt.loading
2017.03.30.15.03.21.25140.txt.raw
2017.03.30.15.03.21.25140.txt.xml

The file name has the following format: YYYY.MM.DD.HH.MM.SS.process_id.txt.suffix.

The timestamp in the file name represents the time at which the incoming email was saved to the queue. Incoming commits are processed in the order in which they arrive at the FreshPorts server.
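For illustration, here is how such a queue file name could be decomposed; this helper is my own Python sketch, not FreshPorts code:

```python
from datetime import datetime

def parse_queue_filename(name):
    """Extract the queue timestamp and process id from a message file
    name of the form YYYY.MM.DD.HH.MM.SS.process_id.txt.suffix."""
    parts = name.split(".")
    stamp = datetime.strptime(".".join(parts[:6]), "%Y.%m.%d.%H.%M.%S")
    pid = int(parts[6])
    return stamp, pid

# Files are processed in name order, which is arrival order:
create = parse_queue_filename("2017.03.30.15.03.21.25140.txt.xml")
remove = parse_queue_filename("2017.03.30.15.10.13.14518.txt.xml")
assert create[0] < remove[0]  # the create arrived before the remove
```

Sorting the directory listing therefore sorts the queue by arrival time, which is why the create was handled first here.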

In this case, the create was processed at 15:03, a full seven minutes before the remove was processed at 15:10.

Clearly, the add was processed before the delete. This doesn’t bother FreshPorts. It can’t tell.

From the logs of the remove, I found:

# # # # Resurrecting deleted ports # # # #

found a deleted port: port='phpunit6', port_id='41762', element_id='789692'
now looking for files which were modified...
  inspecting: 'Remove', '/ports/head/devel/phpunit6/', '437301', '', 'ports', 'devel/phpunit6/'
  finished looking through the files
# # # # Finished resurrecting deleted ports # # # #

I had to think about this slowly.

We are in a port remove, and we are resurrecting a deleted port. How did that port get deleted? Or more precisely, why did FreshPorts think it was deleted and needed resurrecting? This is the only commit which was doing a delete. This commit *is* the remove.

However, even if the commits were processed out of order, this is what I think should have happened:

  1. the Create gets processed: all good
  2. the Remove is processed: port is marked as deleted
  3. another commit gets processed: port is resurrected and everything is fine

My question is: why is devel/phpunit6 still marked as deleted?

Let’s look further into this.

The subsequent commits

The next commit against devel/phpunit6 was 447659 (201708092330.v79NU35e060658@repo.freebsd.org).

Let’s look into that log file and see what I found:

$ cd ../../2017_08/2017_08_09
$ grep -l 201708092330.v79NU35e060658@repo.freebsd.org *
2017.08.09.23.30.09.9876.txt.loading
2017.08.09.23.30.09.9876.txt.raw
2017.08.09.23.30.09.9876.txt.xml

In 2017.08.09.23.30.09.9876.txt.loading I found:

# # # # Resurrecting deleted ports # # # #

found a deleted port: port='phpunit6', port_id='41762', element_id='789692'
now looking for files which were modified...
  inspecting: 'Modify', '/ports/head/devel/phpunit6/Makefile', '447659', '', 'ports', 'devel/phpunit6/Makefile'
  inspecting: 'Modify', '/ports/head/devel/phpunit6/distinfo', '447659', '', 'ports', 'devel/phpunit6/distinfo'
  finished looking through the files
# # # # Finished resurrecting deleted ports # # # #

Huh. It’s trying to resurrect again.

I looked at the most recent commit (456937) and found:

# # # # Resurrecting deleted ports # # # #

found a deleted port: port='phpunit6', port_id='41762', element_id='789692'
now looking for files which were modified...
  inspecting: 'Modify', '/ports/head/devel/phpunit6/Makefile', '456937', '', 'ports', 'devel/phpunit6/Makefile'
  inspecting: 'Modify', '/ports/head/devel/phpunit6/distinfo', '456937', '', 'ports', 'devel/phpunit6/distinfo'
  finished looking through the files
# # # # Finished resurrecting deleted ports # # # #

I will bet every single commit in between is trying to resurrect and is failing.

Let’s find out why.

Resurrection

When looking at the backend code, I concluded that this condition was never true:

if ($category_name eq $port->{category} && $port->{name} eq $port_name) {

I added more debugging statements to the code and reran those commits in the same order:

$ perl load_xml_into_db.pl /usr/FreshPorts/freshports.org/msgs/FreeBSD/archive/2017_03/2017_03_30/2017.03.30.15.03.21.25140.txt.xml -O > ~/tmp/creating.1
$ perl load_xml_into_db.pl /usr/FreshPorts/freshports.org/msgs/FreeBSD/archive/2017_03/2017_03_30/2017.03.30.15.10.13.14518.txt.xml -O > ~/tmp/removing.1

After running the first command, the port was no longer in a deleted state. That makes sense.

The second command, which removed the port, left the port in a deleted state. That makes sense too.

However, examination of the removal log revealed the cause of the problem:

# # # # Resurrecting deleted ports # # # #

at the top of loop with $portname='devel/phpunit6'
found a deleted port: category='devel', port='phpunit6', port_id='41762', element_id='789692'
now looking for files which were modified...
  inspecting: 'Remove', '/ports/head/devel/phpunit6/', '437301', '', 'ports', 'devel/phpunit6/'
  which splits to: $subtree='', $category_name='ports', $port_name='head', and $extra='devel/phpunit6/'
  finished looking through the files
# # # # Finished resurrecting deleted ports # # # #

The problem is the split. Or rather, not so much the split itself as the fact that pathnames changed over time and much work was done on branches.

Instead of having separate calls to split, I should have a function which does this.

As you can see in the "which splits to" line above, the values for $category_name and $port_name are wrong. The split caused this.
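The faulty split is easy to reproduce. Here it is translated from the Perl into Python (Python's str.split with a maxsplit behaves like Perl's split with a limit here):

```python
# The remove commit's pathname, split the way the buggy code splits it.
path = "/ports/head/devel/phpunit6/"

# Perl: ($subtree, $category_name, $port_name, $extra) = split /\//, $filename, 4;
# The leading slash yields an empty first field, shifting every value right.
subtree, category_name, port_name, extra = path.split("/", 3)

# subtree='', category_name='ports', port_name='head', extra='devel/phpunit6/'
```

With the category showing as 'ports' and the port as 'head', the comparison against the deleted port devel/phpunit6 can never match.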

You might be thinking: yeah, but the code should not resurrect this port when the commit is deleting it. Yes, that is correct. However, the code first wants to detect that the file in question belongs to a deleted port. This is the criterion for resurrection: an action occurs on a deleted port. Once this situation is detected, the code looks at the action being performed on that file. If the action is an Add or a Modify, only then will the port be resurrected. If the action is a Delete, the resurrection will not occur.
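As a sketch, the resurrection rule boils down to this (Python; the constant names are assumed for illustration, not the Perl code):

```python
# Action names as they appear in the commit XML; constant names assumed.
ADD, MODIFY, REMOVE = "Add", "Modify", "Remove"

def should_resurrect(port_is_deleted, action):
    """A port is resurrected only when a commit touches a file of a
    deleted port AND the action on that file is an Add or a Modify."""
    return port_is_deleted and action in (ADD, MODIFY)

assert should_resurrect(True, MODIFY)
assert not should_resurrect(True, REMOVE)   # a delete never resurrects
assert not should_resurrect(False, MODIFY)  # live ports need no resurrection
```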

What code has this issue

I started grepping for split. I have included the output which relates directly to pathnames.

branches.pm:  my ($emptyLeadingSlash, $subtree, $branch, $category, $port) = split/\//,$pathname, 5;
observer_commits.pm:			my ($subtree, $category_name, $port_name, $extra) = split/\//,$filename, 4;
verifyport.pm:		  ($emptyLeadingSlash, $subtree,            $branch, $category_name, $port_name, $extra) = split/\//,$filename, 6;
verifyport.pm:		  ($emptyLeadingSlash, $subtree, $branches, $branch, $category_name, $port_name, $extra) = split/\//,$filename, 7;
verifyport.pm:		($subtree, $category_name, $port_name, $extra) = split/\//,$filename, 4;
verifyport.pm:				my ($subtree, $category_name, $port_name, $extra) = split/\//,$filename, 4;

Yes, this is not ideal. I should create a new dedicated function.

The code in branches.pm appears to be correct, but that split is within the GetBranchFromPathName() function. The goal there is not to determine port or category, but branch. It is doing the right thing, and is outside the scope of a pathname function.

My immediate thought is the observer_commits.pm split may be related to issue #56 and I will leave that for another day.

That leaves only the verifyport.pm code.

Code changes

Initially, I thought that creating a perl function which returns multiple parameters was beyond my skill set, but after sleeping on it, I created a function which returned a hash. In the interim, I fixed the splits in verifyport.pm to be correct, and later converted them to a function call.

This seems to be the correct approach and includes the first two splits shown above:

                # define them first
                my ($emptyLeadingSlash, $subtree, $branches, $branch, $category_name, $port_name, $extra);

                # depending on which branch we are on, we need to split this path differently
                if ($CommitBranch eq $FreshPorts::Constants::HEAD)
                {
                  ($emptyLeadingSlash, $subtree,            $branch, $category_name, $port_name, $extra) = split/\//,$filename, 6;
                }
                else
                {
                  ($emptyLeadingSlash, $subtree, $branches, $branch, $category_name, $port_name, $extra) = split/\//,$filename, 7;
                }

The above then became the basis for the new function.
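A sketch of such a function in Python, returning a dict much as the Perl version returns a hash; the key names and the branch path layout are my assumptions:

```python
def split_commit_path(filename, on_head=True):
    """Split a repo pathname into named components, handling the head
    and branch layouts differently, as the code above does."""
    if on_head:
        # '/ports/head/devel/phpunit6/Makefile'
        keys = ("empty", "subtree", "branch", "category", "port")
    else:
        # '/ports/branches/2017Q1/devel/phpunit6/Makefile'
        keys = ("empty", "subtree", "branches", "branch", "category", "port")
    n = len(keys)
    fields = filename.split("/", n)
    result = dict(zip(keys, fields))
    result["extra"] = fields[n] if len(fields) > n else ""
    return result

p = split_commit_path("/ports/head/devel/phpunit6/Makefile")
assert p["category"] == "devel" and p["port"] == "phpunit6"
```

One function, one place to fix, instead of six slightly different splits.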

The second bit of code is definitely wrong.

        print "\n\nThat message is all done under Commit ID = '$commit_log_id'\n";

        print "the size of \@Files is ", scalar(@{$Files}), "\n";

        #
        # in this loop assign a value to needs_refresh for each port
        #
        foreach $value (@{$Files}) {
                ($action, $filename, $revision, $commit_log_element_id, $element_id) = @$value;

                ($subtree, $category_name, $port_name, $extra) = split/\//,$filename, 4;
                print "FILE ==STARTING _CompileListOfPorts: $action, $filename, $revision, $subtree, $category_name, ";
                if (defined($port_name)) {
                        print "$port_name, ";
                }

                if (defined($extra)) {
                        print "$extra, ";
                }

                print "$commit_log_element_id\n";

Sample output from this code (for revision = 458079 and message_id = 201801042012.w04KCc56056293@repo.freebsd.org) is:

That message is all done under Commit ID = '687179'
the size of @Files is 3
FILE ==: Add, /ports/head/databases/rubygem-mysql/files/, 458079, , ports, head, databases/rubygem-mysql/files/, 3407977
FILE ==: Add, /ports/head/databases/rubygem-mysql/files/patch-ext-mysql_api-mysql.c, 458079, , ports, head, databases/rubygem-mysql/files/patch-ext-mysql_api-mysql.c, 3407978
FILE ==: Modify, /ports/head/databases/rubygem-mysql/Makefile, 458079, , ports, head, databases/rubygem-mysql/Makefile, 3407979

On every line, you can see an empty value for $subtree, the $category_name is always ‘ports’, and the $port_name is always ‘head’. Yeah, that needs to be fixed too.

Testing the changes

The changes were tested by running the Create commit, then the Remove commit. The incorrect situation was now restored, with the port in a Deleted state.

Then I ran a later commit in the series (23 Apr 2017 09:04:40, revision 439210 – 201704230904.v3N94eZs084807@repo.freebsd.org):

# # # # Resurrecting deleted ports # # # #

at the top of loop with $portname='devel/phpunit6'
found a deleted port: category='devel', port='phpunit6', port_id='41762', element_id='789692'
now looking for files which were modified...
  inspecting: 'Modify', '/ports/head/devel/phpunit6/Makefile', '439210', 'ports', 'devel', 'Makefile'
  which splits to: $subtree='ports', $category_name='devel', $port_name='phpunit6', and $extra='Makefile'
  ...found a file from that port
  ...hmmm, we are modifying a file for a port which is deleted...  ........ I will resurrect that port for you
Port devel/phpunit6 needs to be Resurrected
  ........ done!
  inspecting: 'Modify', '/ports/head/devel/phpunit6/distinfo', '439210', 'ports', 'devel', 'distinfo'
  which splits to: $subtree='ports', $category_name='devel', $port_name='phpunit6', and $extra='distinfo'
  ...found a file from that port
  ...hmmm, we are modifying a file for a port which is deleted...  ........ I will resurrect that port for you
Port devel/phpunit6 needs to be Resurrected
  ........ done!
  finished looking through the files
# # # # Finished resurrecting deleted ports # # # #

As you can see, it has correctly decided to resurrect that deleted port.

Should another Remove/Create pair of commits be processed in reverse order, the next commit will automatically correct the situation.

PORTREVISION issues

Why are all the commits listed as PORTREVISION = 6.0.11?

I reviewed the logs again. Deep in the log for the 439210 commit, I found:

This port is deleted: not refreshing.
This port is deleted: not saving.

I have to fix that. None of the changes are being saved. The ports tree has been svn up’d correctly, but the values are not being extracted and placed into the database.

That problem would explain why all the subsequent commits have the same revision.

The fix involves altering the local copy of the port as the database version is updated:

if ($category_name eq $port->{category} && $port->{name} eq $port_name) {
        print "  ...found a file from that port\n";
        if ($action eq $FreshPorts::Constants::ADD || $action eq $FreshPorts::Constants::MODIFY) {
                print "  ...hmmm, we are modifying a file for a port which is deleted...";
                print "  ........ I will resurrect that port for you\n";
                FreshPorts::Utilities::ReportError('notice', "Port $category_name/$port_name needs to be Resurrected", 0);
                $element->{id}     = $port->{element_id};
                $element->{status} = $FreshPorts::Element::Active;
                $element->update_status();

                # this will ensure that the port is updated at the end of the commit processing.
                # deleted ports are not updated.
                $port->{status} = $FreshPorts::Element::Active;
                print "  ........ done!\n";
        }
}

The final lines of that block, which set $port->{status} to Active, fix that issue. Each port has a corresponding entry in the element table. When the element status is updated, the port status is automatically updated. These lines reflect that action.

Why not process in order?

To process commits in order is easy.

Watch svn for any increments in revision number, pull that revision, process it.

It sounds trivial, but it’s a matter of coding.

The changes include:

  1. Create a process to monitor the repo
  2. Alter the XML file creation to source data only from the commit repo. This information is currently pulled from the commit email, which contains considerably more information.
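A hypothetical monitor could start out as simple as this Python sketch; the svn invocation (svn 1.9+) and the function names are my own, not existing FreshPorts code:

```python
import subprocess

def youngest_revision(repo_url):
    """Ask svn for the newest revision of the repo (illustrative only)."""
    out = subprocess.check_output(
        ["svn", "info", "--show-item", "revision", repo_url])
    return int(out.strip())

def revisions_to_process(last_seen, youngest):
    """Every revision after the last one processed, in commit order."""
    return list(range(last_seen + 1, youngest + 1))

# Processing strictly in revision order makes out-of-order arrival impossible:
assert revisions_to_process(437300, 437302) == [437301, 437302]
```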

Got patches?

Additional work to be completed

Beyond just fixing this bug, there is additional work arising out of it.

Other ports

I have no doubt that this bug has affected other ports which will be incorrectly marked as being deleted. It may be relatively easy to compare the list of ports against INDEX to verify which ports have the incorrect status.

Other commits

Every commit after a remove such as this one must be rerun. This is the fallout from the PORTREVISION issues mentioned above.

Dec 15, 2017
 

For several weeks, FreshPorts had a vuxml processing problem. In this blog post, I will share what I found.

Introduction

Incoming events for FreshPorts are handled by a small Python script which checks for flags and incoming commits. This ensures that work is carried out serially because parallel work can have unintended consequences. After each task is completed, the flags are checked, and if work is to be done, that task is performed before commit processing continues.

If an incoming commit touches the security/vuxml/vuln.xml file, a flag is set to indicate that vuxml processing is ready to proceed.
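In Python, the flag check could look something like this sketch; the function name and flag-file layout are assumptions, not the actual script:

```python
import os
import tempfile

def pending_flags(flag_dir):
    """Names of any flag files currently set. Flags are rechecked between
    tasks so that work (such as vuxml processing) runs serially, never
    in parallel with commit processing."""
    try:
        return sorted(os.listdir(flag_dir))
    except FileNotFoundError:
        return []

# When a commit touches security/vuxml/vuln.xml, a flag file is dropped:
with tempfile.TemporaryDirectory() as d:
    open(os.path.join(d, "vuxml"), "w").close()
    assert pending_flags(d) == ["vuxml"]
```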

That processing is invoked via this call from the process_vuxml.sh script:

/usr/local/bin/perl ./process_vuxml.pl < ${VULNFILE} >> ${LOGFILE}

What triggered the issue?

On Tue Oct 24 21:59:23 2017 UTC, with Revision 4906, I committed a change to process_vuxml.pl with this commit message:

Stop using HANDLE and start using File::IO

This will hopefully avoid: Name "main::HANDLE" used only once: possible typo at ./process_vuxml.pl line 89.

This should have been included in revision 4905

That patch was:

--- scripts/trunk/process_vuxml.pl	2017/10/09 15:10:01	4836
+++ scripts/trunk/process_vuxml.pl	2017/10/24 21:59:23	4906
@@ -19,7 +19,9 @@
 use warnings;
 use Digest::SHA qw(sha256_hex);
 use autodie qw(:default);
+use IO::File;
 
+use committer_opt_in;
 use database;
 use vuxml;
 use vuxml_parsing;
@@ -38,6 +40,9 @@
     my %vulns;
     my @vulns;
 
+    my $fh;
+    my $p;
+
     # slurp vuln.xml whole.
     local $/;
 
@@ -85,25 +90,27 @@
 
                 if ($updateRequired)
                 {
-                    open(HANDLE, '<', \$vulns{$v});
-            		my $p = FreshPorts::vuxml_parsing->new(Stream        => *HANDLE,
-                                                           DBHandle      => $dbh,
-                                                           UpdateInPlace => 1);
-                                                           
-            		$p->parse_xml($csum);
+                    $fh = IO::File->new();
+                    if ($fh->open('< ' . $vulns{$v})) {
+                		$p = FreshPorts::vuxml_parsing->new(Stream        => $fh,
+                                                               DBHandle      => $dbh,
+                                                               UpdateInPlace => 1);
+
+                		$p->parse_xml($csum);
+
+                        if ($p->database_updated())
+                        {
+                            print "yes, the database was updated\n";
+                        }
+                        else
+                        {
+                            print "no, the database was NOT updated\n";
+                            next;
+                        }
 
-                    close HANDLE;
-                    # process $vulns{$v} via vuxml_processing
-                    
-                    if ($p->database_updated())
-                    {
-                        print "yes, the database was updated\n";
-                    }
-                    else
-                    {
-                        print "no, the database was NOT updated\n";
-                        next;
+                        $fh->close;
                     }
+                    # process $vulns{$v} via vuxml_processing
 
                     print 'invoking vuxml_mark_commits with ' . $v . "\n";
         			my $CommitMarker = FreshPorts::vuxml_mark_commits->new(DBHandle => $dbh,

Please note (line numbers refer to the left side of the page, not actual line numbers in the source file):

  1. Line 17 declares $fh, a file handle
  2. Line 18 declares $p, previously declared on line 28

How are vulns processed

This is an overview of how FreshPorts processes a new vuln.xml file.

  1. Split the file up into individual entries
  2. If the vid for this vuln is not found in the database, add it.
  3. If the vid is found, compare the current checksum with the value in the database
  4. If the checksums are the same, ignore this vuln entry; it has not changed.
  5. If the checksum differs, delete the vuln from the database and process this vuln as if it were new
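The checksum comparison in the steps above can be sketched in Python (the real code is Perl, using Digest::SHA's sha256_hex; the names here are illustrative):

```python
import hashlib

def classify_vuln(vid, entry_xml, known_checksums):
    """Decide what to do with one vuln entry, following the steps above.
    known_checksums maps vid -> stored sha256 hex digest."""
    csum = hashlib.sha256(entry_xml.encode()).hexdigest()
    if vid not in known_checksums:
        return "add"                 # new vid: add it
    if known_checksums[vid] == csum:
        return "ignore"              # unchanged entry
    return "replace"                 # delete old entry, process as new

entry = "<vuln vid='abc'>...</vuln>"
stored = {"abc": hashlib.sha256(entry.encode()).hexdigest()}
assert classify_vuln("abc", entry, stored) == "ignore"
assert classify_vuln("xyz", entry, stored) == "add"
```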

Symptoms & debugging

No new vuxml entries were being processed. While debugging the code I thought it was failing on this:

if ($fh->open('<' . $vulns{$v})) {

You can see that version of the code in earlier revisions of that gist.

Eventually I discovered that the parameters to IO::File’s open method and Perl’s built-in open are not the same. I changed the above code to:

if ($fh->open(\$vulns{$v}, '<')) {

The next hurdle was this error:

process_vuxml.pl:no element found at line 1, column 0, byte 0 at /usr/local/lib/perl5/site_perl/mach/5.24/XML/Parser.pm line 187.

It was another week or more before I got back to debugging.

I soon discovered that the first vuln would be processed properly, without error. The second would error.

That led me to think this was a scope issue; a stale variable or similar.
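That symptom fits an exhausted stream: the first parse consumes the whole in-memory document, and the second parse starts with nothing left to read, which is exactly what "no element found at line 1, column 0, byte 0" describes. A Python analogy of the stale-handle behaviour:

```python
import io

doc = "<vuln>first</vuln>"
stream = io.StringIO(doc)

# First pass: the full document is available.
assert stream.read() == doc

# Second pass on the same handle: nothing left, the parser sees an
# empty input -> "no element found at line 1, column 0, byte 0".
assert stream.read() == ""
```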

After trying several solutions, I settled on this change, which seemed to be the simplest:

--- scripts/trunk/process_vuxml.pl	2017/12/08 23:53:09	5009
+++ scripts/trunk/process_vuxml.pl	2017/12/10 18:06:26	5014
@@ -59,6 +59,7 @@
 	my $dbh;
 	$dbh = FreshPorts::Database::GetDBHandle();
 	if ($dbh->{Active}) {
+        $fh = IO::File->new();
         my $vuxml = FreshPorts::vuxml->new( $dbh );
           
         eval {
@@ -90,7 +91,6 @@
 
                 if ($updateRequired)
                 {
-                    $fh = IO::File->new();
                     if ($fh->open(\$vulns{$v}, '<')) {
                         $p = FreshPorts::vuxml_parsing->new(Stream        => $fh,
                                                             DBHandle      => $dbh,

As shown, the creation of $fh was moved outside the loop. $p is a Class variable. I discovered this in the vuxml_parsing.pm code:

# Class variables

our ($VuXML);

# Call like this:
#
# $v = new FreshPorts::vuxml_parsing(DBHandle      => $dbh,
#                                    Stream        => IO::File,
#                                    UpdateInPlace => $UpdateInPlace);
#
# DBHandle is a database handle from the DBI module. Required (well,
# if you want to write to a database it's necessary)
#
# Stream is optional, and defaults to reading STDIN.  Pass either an
# IO::Handle or a glob: *GLOB or a glob ref: \*GLOB
#
# UpdateInPlace is optional, defaults to TRUE.  Indicates whether or not
# the code should attempt to update existing VuXML entries.
#

sub new
{
    my $caller = shift;
    my $class  = ref($caller) || $caller;
    my %args   = ref( $_[0] ) eq 'HASH' ? %{ shift() } : @_;
    my $self;

    # There can be only one! Since we keep $self hanging around as a
    # Class variable, more than one instance would be a disaster.
    # Therefore, all calls to new() (except the first or when
    # DESTROY() has been called) just return a reference to the same
    # object.

    if ( $VuXML && $VuXML->isa(__PACKAGE__) ) {
        $self = $VuXML;
        return bless $self, $class;
    }

    $VuXML = $self = {};
    bless $self, $class;

This means that the value for $fh initially passed into $p (on line 35 of the first patch on this page) is the same $fh used on every subsequent call. That is what led me to move the $fh = IO::File->new(); line.
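The effect of that "there can be only one" pattern can be mimicked in a few lines of Python; this is an analogy of the behaviour, not the Perl code:

```python
import io

class Parser:
    """Mimics the singleton in vuxml_parsing.pm: every call to new()
    returns the same object, so a Stream passed on the first call
    sticks around for every later call."""
    _instance = None

    def __new__(cls, stream=None):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.stream = stream
        return cls._instance

first = Parser(io.StringIO("one"))
second = Parser(io.StringIO("two"))   # the new stream is silently ignored
assert first is second
assert second.stream.read() == "one"  # still the original, soon-stale stream
```

Hence the fix: keep one $fh for the life of the program and reopen it for each vuln, so the handle the singleton holds is always the live one.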

With that, everything worked.

The later code changes

While looking at this, I reverted some of the original patch. In particular:

  1. The my declarations on lines 17 & 18 were reverted.
  2. The committer_opt_in from line 9 was reverted. It seems to be an accidental add; it was never used.

My thanks to those that helped along the way. This was not an easy thing for me to solve.

Oct 10, 2017
 

The previous post on this blog was nearly 10 months ago. However, things have not stood still. There have been a few issues fixed and some are still in progress.

The biggest thing underway is a major rework of the backend, the part that processes the incoming commits. There are no huge code rewrites, though.

The basics remain:

  • FreshPorts gets an email from one of the commit mailing lists.
  • That email is converted to XML and saved to disk, in a queue.
  • The queue is processed, one commit at a time.
  • An svn up is run.
  • A bunch of make -V commands occur.
  • The database is updated.
  • Profit.

In current production, all of this occurs on one host.

In the new system, I will be taking advantage of many features of FreeBSD which were not available when FreshPorts started.

The hardware changes

The main change is ZFS. The existing system is on a hardware RAID card with 8 drives. This system was deployed in mid-2006 and it has been a great worker.

The new system boots off a pair of SSDs on ZFS zroot. I plan to make use of beadm for upgrades.

Also present are two 5TB drives, also in a ZFS mirror.

The OS has been installed, as has most of the FreshPorts code. Over the past few weeks I have been redesigning the back-end to cater for the use of multiple jails. Why jails? I want to separate the main components of the system. This makes it easier to upgrade individual components without affecting the others. It also increases security by isolating the various components in their respective worlds.

Jails

There are four jails on this server:

  1. nginx01 – runs the website, and php-fpm, because the website is based on PHP
  2. pg01 – runs PostgreSQL, the database which stores all the good stuff. No public IP address. Accessed over lo1 via a 127.1.0.0/24 address.
  3. mx-ingress01 – accepts incoming email and passes it on to the next stage.
  4. ingress01 – accepts email from mx-ingress01 and performs the basic steps mentioned in the first section of this blog post. No public IP address.

Each of these jails has been created by hand, but the software and daemons have been configured via Ansible.

Porting the code makes some stuff easier

When it came time to deploy the backend FreshPorts code into ingress01, I created a port. This is a departure from current production where code is deployed via svn up.

With the use of a port, many other things came with it:

  • All the backend code goes into /usr/local/libexec/freshports.
  • Main configuration files are now in /usr/local/etc/freshports with symlinks to there from /usr/local/libexec/freshports.
  • /var/db/freshports is the main location for any data external to the database on pg01.
  • /var/db/freshports/cache is the cache for the webpages, with multiple directories for different classes of pages.
  • /var/db/ingress/message-queues/incoming is where the incoming XML data arrives. The ingress user writes here, and the freshports user reads and then moves the file to another queue. All of this occurs in the ingress01 jail.
  • The freshports user has /var/db/freshports/message-queues/ for commit message storage (i.e. the XML). Subdirectories include archive, recent, and retry.
  • The ports-jail is at /var/db/freshports/ports-jail and is where the freshports user does a chroot before running any make -V commands.
  • Within the above named directory, we have /var/db/freshports/ports-jail/var/db/repos which is where the svn up commands are run.
  • Logging is consolidated. It used to all go to /var/log/messages. Now it goes to /var/log/freshports/ and into a number of files contained therein. The port also installed entries to /usr/local/etc/syslog.d/ and to /usr/local/etc/newsyslog.d/ to ensure logs are both captured and rotated.

All of these directories are created by the port. That creation reduces the configuration time. I had scripts, but they were each run manually. Now, it’s fully automated, complete with permissions and user creation.

All of this is handled by the freshports-scripts port.

The freshports daemon

There has long been a freshports daemon, a small python script which looks for incoming XML files and processes them. This is powered by daemontools.

I suppose it could be replaced by a rc.d script, but I might do that another day.

This service is handled by the newly-created freshports-services port.

Current status

As of now, commits are coming in and being processed automatically. The website is being updated and all seems well. There are no known problems.

Remaining work

Before this website goes live, a number of items remain to be completed.

  1. I am not sure that UPDATING and MOVED commits are being processed properly. EDIT 2017.10.14 – done
  2. The fp-listen daemon, which listens for completed commits and clears the appropriate cache bits, is not running. EDIT 2017.10.11 – done
  3. There are a number of cronjob tasks which I should verify are running properly. They always ran as my user id, and they need to run as the freshports user.
  4. I need to get the website running with https under nginx. EDIT 2017.10.14 – done

None of these are huge tasks, just time consuming.

Summary

All of these changes are for good. They were not done in the past because there was no reason for them. Now that I’m trying to split tasks up into different jails, the changes make sense and the work is being done.

These changes will also ease the inevitable transition to the control of the FreeBSD project. That will not happen soon, but it will happen. I can’t run FreshPorts for ever.

Dec 16, 2016
 

Tonight, I was contacted by jrm on IRC, who told me that FreshPorts was missing something on a search. Yep. It was not in there.

I looked on dev and I saw the same thing.

Finding the cause

I started by enabling debug on dev. I found this SQL:

  SELECT count(*)
  FROM ports P LEFT OUTER JOIN ports_vulnerable    PV  ON PV.port_id       = P.id
               LEFT OUTER JOIN commit_log          CL  ON P.last_commit_id = CL.id
               LEFT OUTER JOIN repo                R   ON CL.repo_id       = R.id
               LEFT OUTER JOIN commit_log_branches CLB ON CL.id            = CLB.commit_log_id
                          JOIN system_branch       SB  ON SB.branch_name   = 'head'
                                                      AND SB.id            = CLB.branch_id,
       categories C, element E

    WHERE P.category_id  = C.id
      AND P.element_id   = E.id  AND 
     E.name ILIKE '%en-aspell%' and E.status = 'A' 

I started by removing one LEFT OUTER JOIN at a time until I found the cause. It was SB.id = CLB.branch_id.

OK, but why?

Let’s look at some of the SQL first.

On line 3, we start with P.last_commit_id, which is the last commit for this port.

From there, we join to the commit_log table to find the repo_id for that commit on line 4.

Line 5 gives us the commit_log_branch table entry for that commit, and we know now which branch this commit touched.

Lines 6 and 7 will restrict the search to head.

I started examining the values from these JOINs. I discovered that some commits did not have entries in the commit_log_branches table. These were commits which occurred before that table was created.

The problem arose with ports whose last commit occurred before the creation of that table. These ports would never be included because, according to the database, there was no such commit on head for them.

The solution

The solution is to add entries to the commit_log_branches table. The following SQL accomplished that.

insert into commit_log_branches
SELECT distinct P.last_commit_id, (select SB.id  FROM system_branch SB where branch_name = 'head')
  FROM ports P LEFT OUTER JOIN commit_log CL ON P.last_commit_id = CL.id
  WHERE NOT EXISTS (SELECT * FROM commit_log_branches CLB WHERE CL.id            = CLB.commit_log_id)
 and P.last_commit_id is not null;

While there, I did this for good measure:

create unique index commit_log_branches_pk on commit_log_branches(commit_log_id, branch_id);

The table now looks like this:

freshports.org=# \d commit_log_branches
 Table "public.commit_log_branches"
    Column     |  Type   | Modifiers 
---------------+---------+-----------
 commit_log_id | integer | not null
 branch_id     | integer | not null
Indexes:
    "commit_log_branches_pk" UNIQUE, btree (commit_log_id, branch_id)
Foreign-key constraints:
    "commit_log_branch_branch_id_fkey" FOREIGN KEY (branch_id) REFERENCES system_branch(id) ON UPDATE CASCADE ON DELETE CASCADE
    "commit_log_branch_commit_log_id_fkey" FOREIGN KEY (commit_log_id) REFERENCES commit_log(id) ON UPDATE CASCADE ON DELETE CASCADE

freshports.org=# 
Oct 30, 2016
 

Committing to a branch (e.g. 2016Q4) when the port in question has slave ports is getting tricky, especially if the port was added after the branch was created.

Case in point, www/linux-c7-flashplugin11 was added to the tree on 19 Oct 2016. The 2016Q4 branch was created on 3 Oct, and therefore that port does not exist on the branch at all.

www/linux-c7-flashplugin11 is a slave port of www/linux-c6-flashplugin11, which was created on 22 Sep 2014 and is there on the 2016Q4 branch.

Clearly, the slave port will not be on the branch.

As of the time of writing, there have been two MFH commits on the master port. Should the slave port be MFH'd as well? No: the slave port does not exist on the branch, and anyone following the branch will not be using that newly added slave port.

The issue for FreshPorts is knowing whether or not a given port is on the branch.

Background on branches

When FreshPorts sees a commit on a branch, the first thing it does is create that port on the branch if it does not already exist there. The copy is based on head and (glossing over all the gory details) the values are copied over verbatim. This is not entirely accurate, but it is close enough for this discussion.

Technically, a copy occurs in the database which ensures the directory structure for the port exists on the branch. Because this is based on head, if any new files have been added to the port since the branch was taken, those same files will now exist on the branch. The values for the port (name, description, etc.) are extracted via make -V performed on the ports tree after an svn up. Thus, the values displayed on the port page reflect what is on the branch, not what is in head.

The hypothetical solution

I think the only clean way for FreshPorts to know whether the slave port exists on the branch is to inspect the local working copy of the repo after the svn up has completed. If the port exists there, the database entries for the slave port should be refreshed based on what is found in the repo. If it does not exist, no further action is required for that slave port.

This may be easy, or it may involve some rejigging of the code.

Aug 23 2016

I was asked if this was a known bug. It’s not. I don’t think anyone knows about it.

dvl: ping. FreshPorts truncates COMMENT. Known issue? https://www.freshports.org/devel/include-what-you-use/

Here’s what I found there:

Tool for use with clang to analyze

Here is what the Makefile contains:

$ grep COMMENT Makefile 
COMMENT=	Tool for use with clang to analyze #includes in C and C++ source files

Here is what the command returns:

$ make -V COMMENT
Tool for use with clang to analyze

It appears that everything from the # onward is treated as a make comment, truncating the COMMENT value.
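In make, an unescaped # starts a comment, even inside a variable assignment. A sketch of the usual workaround (I have not checked what fix, if any, was committed for this port) is to escape it:

```make
COMMENT=	Tool for use with clang to analyze \#includes in C and C++ source files
```

With the backslash, make -V COMMENT returns the full string.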

Jul 25 2016

FreshPorts uses a cache for every port page (e.g. https://www.freshports.org/sysutils/bacula-server/). When an update to a port occurs, we must clear the cache for that port. This sounds simple, and mostly it is, but the various ways in which updates occur complicate the situation.

This post is all about fixing one edge case: the addition or removal of a port dependency (e.g. RUN_DEPENDS, LIB_DEPENDS).

When port A depends on port B, this fact is listed on both pages, but adding/removing a dependency was not properly handled (see this bug). Tonight, I coded a fix. I’d like it to be less complex, but it works.

The table

This is the table in question:

freshports.org=# \d port_dependencies
         Table "public.port_dependencies"
         Column         |     Type     | Modifiers 
------------------------+--------------+-----------
 port_id                | integer      | not null
 port_id_dependent_upon | integer      | not null
 dependency_type        | character(1) | not null
Indexes:
    "port_dependencies_pkey" PRIMARY KEY, btree (port_id, port_id_dependent_upon, dependency_type)
Foreign-key constraints:
    "port_dependencies_port_id_dependent_upon_fkey" FOREIGN KEY (port_id_dependent_upon) REFERENCES ports(id) ON UPDATE CASCADE ON DELETE CASCADE
    "port_dependencies_port_id_fkey" FOREIGN KEY (port_id) REFERENCES ports(id) ON UPDATE CASCADE ON DELETE CASCADE

freshports.org=# 

Using the terminology from the first sentence, port A and port B are represented by port_id and port_id_dependent_upon respectively. Any new row or any deleted row must invalidate the cache for both port A and port B.

Keeping track of cache to clear

This is the table we use to clear the ports cache:

freshports.org=# \d cache_clearing_ports
                                     Table "public.cache_clearing_ports"
   Column   |            Type             |                             Modifiers                             
------------+-----------------------------+-------------------------------------------------------------------
 id         | integer                     | not null default nextval('cache_clearing_ports_id_seq'::regclass)
 port_id    | integer                     | not null
 category   | text                        | not null
 port       | text                        | not null
 date_added | timestamp without time zone | not null default now()
Indexes:
    "cache_clearing_ports_pkey" PRIMARY KEY, btree (id)
Foreign-key constraints:
    "cache_clearing_ports_port_id_fkey" FOREIGN KEY (port_id) REFERENCES ports(id) ON UPDATE CASCADE ON DELETE CASCADE

freshports.org=# 

We add rows to this table to indicate that a given port must be cleared from the cache.

On INSERT

On an insert to the port_dependencies table, let’s do this:

CREATE OR REPLACE FUNCTION port_dependencies_insert_clear_cache() RETURNS TRIGGER AS $$
   DECLARE
      l_cache_clearing_ports_id   int8;
      l_port      text;
      l_category  text;
   BEGIN
        --
        -- This function handles the addition of a new dependency.
        -- yes, we need to clear the cache for both port_id and port_id_dependent_upon
        -- from the cache_clearing_ports.  I figure there is a shorter way to do this but
        -- I cannot think it through right now.
        --

        IF TG_OP = 'INSERT' THEN
         -- handle port A (port_id)
         SELECT port_id
           INTO l_cache_clearing_ports_id
           FROM cache_clearing_ports
          WHERE port_id = NEW.port_id;

          IF NOT FOUND THEN
            SELECT category, name
              INTO l_category, l_port
              FROM ports_all
             WHERE id = NEW.port_id;
            INSERT INTO cache_clearing_ports (port_id, category, port)
            VALUES (NEW.port_id, l_category, l_port);
          END IF;

         -- handle port B (port_id_dependent_upon)
         SELECT port_id
           INTO l_cache_clearing_ports_id
           FROM cache_clearing_ports
          WHERE port_id = NEW.port_id_dependent_upon;

          IF NOT FOUND THEN
            SELECT category, name
              INTO l_category, l_port
              FROM ports_all
             WHERE id = NEW.port_id_dependent_upon;
            INSERT INTO cache_clearing_ports (port_id, category, port)
            VALUES (NEW.port_id_dependent_upon, l_category, l_port);
          END IF;

          NOTIFY port_updated;
      END IF;

      -- when a port changes, add an entry to the cache clearing table
      RETURN NEW;
   END
$$ LANGUAGE 'plpgsql';

  DROP TRIGGER IF EXISTS port_dependencies_insert_clear_cache ON port_dependencies;
CREATE TRIGGER port_dependencies_insert_clear_cache
    AFTER INSERT on port_dependencies
    FOR EACH ROW
    EXECUTE PROCEDURE port_dependencies_insert_clear_cache();

That function is rather long, and I know it can be simplified, but that’s not on tonight’s agenda. If you are so inclined, I am happy to hear your suggestions. Specifically, I think I can do an INSERT INTO … WHERE NOT EXISTS and do the SELECT & INSERT in one step, without the need for IF FOUND.
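For the record, here is a sketch of that simplification (untested, written inside the trigger body where NEW is available, and assuming ports_all exposes category and name as used above):

```sql
-- Queue port A for cache clearing only if it is not already queued;
-- the same statement, with NEW.port_id_dependent_upon, handles port B.
INSERT INTO cache_clearing_ports (port_id, category, port)
SELECT PA.id, PA.category, PA.name
  FROM ports_all PA
 WHERE PA.id = NEW.port_id
   AND NOT EXISTS (SELECT *
                     FROM cache_clearing_ports CCP
                    WHERE CCP.port_id = NEW.port_id);
```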

On DELETE

Here is the code for the DELETE.

CREATE OR REPLACE FUNCTION port_dependencies_delete_clear_cache() RETURNS TRIGGER AS $$
   DECLARE
      l_cache_clearing_ports_id   int8;
      l_port      text;
      l_category  text;
   BEGIN
        --
        -- This function handles the deletion of an existing dependency.
        -- yes, we need to clear the cache for both port_id and port_id_dependent_upon
        -- from the cache_clearing_ports.  I figure there is a shorter way to do this but
        -- I cannot think it through right now.
        --

        IF TG_OP = 'DELETE' THEN
         SELECT port_id
           INTO l_cache_clearing_ports_id
           FROM cache_clearing_ports
          WHERE port_id = OLD.port_id;

          IF NOT FOUND THEN
            SELECT category, name
              INTO l_category, l_port
              FROM ports_all
             WHERE id = OLD.port_id;
            INSERT INTO cache_clearing_ports (port_id, category, port)
            VALUES (OLD.port_id, l_category, l_port);
          END IF;

         SELECT port_id
           INTO l_cache_clearing_ports_id
           FROM cache_clearing_ports
          WHERE port_id = OLD.port_id_dependent_upon;

          IF NOT FOUND THEN
            SELECT category, name
              INTO l_category, l_port
              FROM ports_all
             WHERE id = OLD.port_id_dependent_upon;
            INSERT INTO cache_clearing_ports (port_id, category, port)
            VALUES (OLD.port_id_dependent_upon, l_category, l_port);
          END IF;

          NOTIFY port_updated;
      END IF;

      -- when a port changes, add an entry to the cache clearing table
      RETURN OLD;
   END
$$ LANGUAGE 'plpgsql';

  DROP TRIGGER IF EXISTS port_dependencies_delete_clear_cache ON port_dependencies;
CREATE TRIGGER port_dependencies_delete_clear_cache
    AFTER DELETE on port_dependencies
    FOR EACH ROW
    EXECUTE PROCEDURE port_dependencies_delete_clear_cache();

It is nearly identical to the first function. Again, those of you who want to suggest improvements there, feel free.
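Another way to cut the duplication (my sketch, not deployed code) is a helper function that both triggers call once per port id:

```sql
-- Queue a single port for cache clearing, if not already queued.
CREATE OR REPLACE FUNCTION cache_clear_port(p_port_id integer) RETURNS void AS $$
   DECLARE
      l_port      text;
      l_category  text;
   BEGIN
      PERFORM port_id FROM cache_clearing_ports WHERE port_id = p_port_id;
      IF NOT FOUND THEN
         SELECT category, name
           INTO l_category, l_port
           FROM ports_all
          WHERE id = p_port_id;
         INSERT INTO cache_clearing_ports (port_id, category, port)
         VALUES (p_port_id, l_category, l_port);
      END IF;
   END
$$ LANGUAGE 'plpgsql';
```

Each trigger body would then reduce to two PERFORM cache_clear_port(...) calls followed by the NOTIFY.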

I will run this in dev for a while and see how it goes. Initial testing has been promising.

NOTE: there are no updates to this table, by design. Instead, when we see a new commit to a port, we do a DELETE FROM port_dependencies WHERE port_id = :a.

Jul 18 2016

Recently, I’ve been adding more and more file content into the database, mostly because it’s useful. For example, we add the contents of distinfo, because that information is useful. Next it was pkg-plist, and before I was done with that, someone suggested adding the Makefile.

Neither pkg-plist nor Makefile will be perfect:

  1. Not all ports have a pkg-plist
  2. Some ports use Makefile.inc, and we are not capturing that

Why store these files?

Because we can, and then we can search on it.

Ever wondered how many ports do not have a pkg-plist file?

freshports.org=# select count(*) from ports where pkg_plist is not null and status = 'A';
 count 
-------
 17632
(1 row)

freshports.org=# select count(*) from ports where pkg_plist is null and status = 'A';
 count 
-------
 11733
(1 row)

freshports.org=#

How many ports use OSVERSION?

freshports.org=# select count(*) from ports where makefile ilike '%OSVERSION%';
 count 
-------
   658
(1 row)

Yes, you can do this with find and grep, but these queries are much faster than that.

We can create all kinds of queries with this data, and allow you to save the output in plain text form.

Loading this up on dev

I loaded this up on dev first. Then I looked at how many Makefiles I had.

freshports.org=# select count(*) from ports where Makefile is not null and status = 'A';
 count 
-------
 29349
(1 row)

freshports.org=# select count(*) from ports where Makefile is null and status = 'A';
 count 
-------
    10
(1 row)

Ten ports without Makefiles? Who are they?

freshports.org=# select id, element_pathname(element_id) from ports where Makefile is null and status = 'A';
  id   |                    element_pathname                     
-------+---------------------------------------------------------
 39093 | /ports/branches/2014Q1/print/typetools
 39450 | /ports/branches/2014Q2/lang/twelf
 39268 | /ports/branches/2014Q1/graphics/ipe
 38946 | /ports/branches/2014Q1/textproc/hs-lhs2tex
 39958 | /ports/branches/2014Q4/multimedia/gmp-api
 40059 | /ports/branches/2014Q4/multimedia/openh264
 40223 | /ports/branches/2016Q1/sysutils/p5-Schedule-Cron-Events
 39208 | /ports/branches/2014Q1/textproc/dblatex
 40070 | /ports/branches/2015Q1/textproc/p5-HTML-FormatExternal
 39110 | /ports/branches/2014Q1/www/chromium
(10 rows)

Ahh, that explains these errors I saw while processing some ports. I won’t worry about them because they are all on old branches.

Jul 18 02:48:20 jester FreshPorts[69764]: error executing make command for multimedia/gmp-api: This command (FreshPorts code 1):  /usr/local/bin/sudo /usr/sbin/chroot -u dan /usr/local/FreshPorts/ports-jail /make-port.sh /usr/local/repos/PORTS-2014Q4 multimedia/gmp-api 2>/tmp/FreshPorts.multimedia.gmp-api.make-error.2016.7.18.2.48.20.69764  produced this error:  Error message is: cd: /usr/local/repos/PORTS-2014Q4/multimedia/gmp-api: No such file or directory make: cannot open /usr/local/repos/PORTS-2014Q4/multimedia/gmp-api/Makefile. Make results are :  make: stopped in /  (/usr/local/FreshPorts/scripts) 
Jul 18 02:50:33 jester FreshPorts[69764]: error executing make command for multimedia/openh264: This command (FreshPorts code 1):  /usr/local/bin/sudo /usr/sbin/chroot -u dan /usr/local/FreshPorts/ports-jail /make-port.sh /usr/local/repos/PORTS-2014Q4 multimedia/openh264 2>/tmp/FreshPorts.multimedia.openh264.make-error.2016.7.18.2.50.33.69764  produced this error:  Error message is: cd: /usr/local/repos/PORTS-2014Q4/multimedia/openh264: No such file or directory make: cannot open /usr/local/repos/PORTS-2014Q4/multimedia/openh264/Makefile. Make results are :  make: stopped in /  (/usr/local/FreshPorts/scripts) 
Jul 18 03:12:40 jester FreshPorts[69764]: error executing make command for print/typetools: This command (FreshPorts code 1):  /usr/local/bin/sudo /usr/sbin/chroot -u dan /usr/local/FreshPorts/ports-jail /make-port.sh /usr/local/repos/PORTS-2014Q1 print/typetools 2>/tmp/FreshPorts.print.typetools.make-error.2016.7.18.3.12.40.69764  produced this error:  Error message is: make: "/usr/local/repos/PORTS-2014Q1/Mk/bsd.tex.mk" line 92: malformed TEX_DEFAULT: tetex  (/usr/local/FreshPorts/scripts) 
Jul 18 03:35:37 jester FreshPorts[69764]: error executing make command for sysutils/p5-Schedule-Cron-Events: This command (FreshPorts code 1):  /usr/local/bin/sudo /usr/sbin/chroot -u dan /usr/local/FreshPorts/ports-jail /make-port.sh /usr/local/repos/PORTS-2016Q1 sysutils/p5-Schedule-Cron-Events 2>/tmp/FreshPorts.sysutils.p5-Schedule-Cron-Events.make-error.2016.7.18.3.35.37.69764  produced this error:  Error message is: cd: /usr/local/repos/PORTS-2016Q1/sysutils/p5-Schedule-Cron-Events: No such file or directory make: cannot open /usr/local/repos/PORTS-2016Q1/sysutils/p5-Schedule-Cron-Events/Makefile. Make results are :  make: stopped in /  (/usr/local/FreshPorts/scripts) 
Jul 18 03:38:48 jester FreshPorts[69764]: error executing make command for textproc/dblatex: This command (FreshPorts code 1):  /usr/local/bin/sudo /usr/sbin/chroot -u dan /usr/local/FreshPorts/ports-jail /make-port.sh /usr/local/repos/PORTS-2014Q1 textproc/dblatex 2>/tmp/FreshPorts.textproc.dblatex.make-error.2016.7.18.3.38.48.69764  produced this error:  Error message is: make: "/usr/local/repos/PORTS-2014Q1/Mk/bsd.tex.mk" line 92: malformed TEX_DEFAULT: tetex  (/usr/local/FreshPorts/scripts) 
Jul 18 03:39:50 jester FreshPorts[69764]: error executing make command for textproc/hs-lhs2tex: This command (FreshPorts code 1):  /usr/local/bin/sudo /usr/sbin/chroot -u dan /usr/local/FreshPorts/ports-jail /make-port.sh /usr/local/repos/PORTS-2014Q1 textproc/hs-lhs2tex 2>/tmp/FreshPorts.textproc.hs-lhs2tex.make-error.2016.7.18.3.39.50.69764  produced this error:  Error message is: make: "/usr/local/repos/PORTS-2014Q1/Mk/bsd.tex.mk" line 92: malformed TEX_DEFAULT: tetex  (/usr/local/FreshPorts/scripts) 
Jul 18 03:41:25 jester FreshPorts[69764]: error executing make command for textproc/p5-HTML-FormatExternal: This command (FreshPorts code 1):  /usr/local/bin/sudo /usr/sbin/chroot -u dan /usr/local/FreshPorts/ports-jail /make-port.sh /usr/local/repos/PORTS-2015Q1 textproc/p5-HTML-FormatExternal 2>/tmp/FreshPorts.textproc.p5-HTML-FormatExternal.make-error.2016.7.18.3.41.25.69764  produced this error:  Error message is: cd: /usr/local/repos/PORTS-2015Q1/textproc/p5-HTML-FormatExternal: No such file or directory make: cannot open /usr/local/repos/PORTS-2015Q1/textproc/p5-HTML-FormatExternal/Makefile. Make results are :  make: stopped in /  (/usr/local/FreshPorts/scripts) 
Jul 18 03:48:07 jester FreshPorts[69764]: error executing make command for www/chromium: This command (FreshPorts code 1):  /usr/local/bin/sudo /usr/sbin/chroot -u dan /usr/local/FreshPorts/ports-jail /make-port.sh /usr/local/repos/PORTS-2014Q1 www/chromium 2>/tmp/FreshPorts.www.chromium.make-error.2016.7.18.3.48.7.69764  produced this error:  Error message is: make: "/usr/local/repos/PORTS-2014Q1/Mk/bsd.port.mk" line 1516: Cannot open /usr/local/repos/PORTS-2014Q1/Mk/Uses/tar.mk make: "/usr/local/repos/PORTS-2014Q1/Mk/bsd.port.mk" line 5281: warning: duplicate script for target "-depends" ignored make: "/usr/local/repos/PORTS-2014Q1/Mk/bsd.port.mk" line 5278: warning: using previous script for "-depends" defined here make: "/usr/local/repos/PORTS-2014Q1/Mk/bsd.port.mk" line 5281: warning: duplicate script for target "-depends" ignored make: "/usr/local/repos/PORTS-2014Q1/Mk/bsd.port.mk" line 5278: warning: using previous script for "-depends" defined here make: "/usr/local/repos/PORTS-2014Q1/Mk/bsd.port.mk" lin

What next?

Right now, the data is loading up… Makefile, pkg-plist, distinfo, etc. The script should be done at about 0800 UTC.

Tomorrow morning, I will restart commit processing, and clear the cache so you can see all the pretty new stuff.

In the coming weeks, the search page will be amended to allow searching on the new fields.

Jul 15 2016

About two weeks ago, I wrote a function for creating a port. This will take the port from head and copy it to a branch. This is vital because of the reasons discussed here: creating a new port and copying a port from head to a branch.

This function worked and was useful, but it missed one vital part: population of the ports_categories table. This table tells us what categories the port is in. Now that I think of it, I may have to populate other tables, such as:

  1. ports_vulnerable
  2. ports_moved
  3. ports_updating_xref
  4. commit_log_ports_vuxml

Those can wait for another day.

Existing bugs fixed

While debugging this issue, I noticed that CopyPortFromHeadToBranch() was documented as returning the port id, but was returning the element id. The difference is that all ports are elements, but not all elements are ports. The code which invokes CopyPortFromHeadToBranch() actually creates the port, while the function just copies stuff around in the element table.

The diff is as follows:

-CREATE OR REPLACE FUNCTION CopyPortFromHeadToBranch(text, text, text) returns int4 AS $$
+CREATE OR REPLACE FUNCTION CopyPortFilesFromHeadToBranch(text, text, text) returns int4 AS $$
 DECLARE
   a_category_name ALIAS for $1;
   a_port_name     ALIAS for $2;
   a_branch_name   ALIAS for $3;
  
-  l_r             element_id_pathname%rowtype;
-  l_port_id       int4;
- 
+  l_r                      element_id_pathname%rowtype;
+  l_element_id_of_new_port int4;
+
   BEGIN
- 
+
+  -- this query will add the new port... the key is the element_add() function call
+
   FOR l_r IN
   WITH RECURSIVE all_descendents AS ( 
   SELECT id, name, parent_id, directory_file_flag, status
@@ -3243,20 +3248,22 @@
     FROM element E
     JOIN all_descendents AD
       ON (E.parent_id = AD.id)
-)
-    SELECT element_add(replace(element_pathname(id), '/ports/head/', '/ports/branches/' || a_branch_name || '/'), directory_file_flag),
-                       replace(element_pathname(id), '/ports/head/', '/ports/branches/2016Q2/')
+  )
+    SELECT element_add(replace(element_pathname(id), '/ports/head/', '/ports/branches/' || a_branch_name || '/'), directory_file_flag)
       FROM all_descendents
      WHERE status = 'A'
   LOOP
   END LOOP;
- 
+
   IF FOUND THEN
-    SELECT pathname_id('/ports/branches/' || a_branch_name || '/' || a_category_name || '/' || a_port_name || '/')
-      INTO l_port_id;
+    -- based on the element_id of the pathname for newly created port, grab the port id
+     SELECT pathname_id('/ports/branches/' || a_branch_name || '/' || a_category_name || '/' || a_port_name || '/')
+      INTO l_element_id_of_new_port;
+
   END IF;
- 
-  return l_port_id;
+
+  return l_element_id_of_new_port;
+
   END;
 $$  LANGUAGE 'plpgsql';
 
@@ -3278,12 +3285,27 @@
  
   BEGIN
  
-  SELECT CopyPortFromHeadToBranch(a_category_name, a_port_name, a_branch_name)
+  SELECT CopyPortFilesFromHeadToBranch(a_category_name, a_port_name, a_branch_name)
     INTO l_element_id_of_new_port;
  
   IF FOUND THEN
     SELECT createport(l_element_id_of_new_port, getcategory(a_category_name))
       INTO l_port_id;
+
+    SELECT PA.id
+      FROM ports_active PA
+     WHERE PA.element_id = l_element_id_of_new_port
+      INTO l_port_id;
+
+    -- we need to set up the port_categories table first...
+
+    RAISE WARNING 'about to call GetPort and I only want stuff from head, so branch agnostic is OK.';
+
+    INSERT INTO ports_categories(port_id, category_id)
+     SELECT l_port_id, PC.category_id
+       FROM ports_categories PC
+      WHERE PC.port_id = GetPort(a_category_name, a_port_name);
+
   END IF;
   
   RETURN l_port_id;

I will outline the problems fixed above:

  • line 2 – Rename the function to reflect what it does
  • line 11 – we are returning the element id of the new port
  • line 28 – remove this extraneous parameter from the SELECT
  • line 65 – grab the port id based on the element id and use that to copy ports_categories values from the old port to the new port
  • line 74 – I had tried to do this INSERT within the CopyPortFilesFromHeadToBranch() function, but that can’t be done because the port is not created there, it is created on line 62. Once I realized that issue, which was perpetuated by the incorrectly named variable on line 39, it all became clear.

The working code

Here is the code which http://dev.freshports.org/ is using now.

--
-- this function copies a port's files from /ports/head/CATEGORY/PORT to /ports/branches/BRANCH
-- It returns the id of the new element for that port.
-- This function does NOT create the port itself (i.e. no entries added to the ports table).
--
CREATE OR REPLACE FUNCTION CopyPortFilesFromHeadToBranch(text, text, text) returns int4 AS $$
DECLARE
  a_category_name ALIAS for $1;
  a_port_name     ALIAS for $2;
  a_branch_name   ALIAS for $3;

  l_r                      element_id_pathname%rowtype;
  l_element_id_of_new_port int4;

  BEGIN

  -- this query will add the new port... the key is the element_add() function call

  FOR l_r IN
  WITH RECURSIVE all_descendents AS ( 
  SELECT id, name, parent_id, directory_file_flag, status
    FROM element
    WHERE id = (select pathname_id('/ports/head/' || a_category_name || '/' || a_port_name || '/'))
  UNION
  SELECT E.id, E.name, E.parent_id, E.directory_file_flag, E.status
    FROM element E
    JOIN all_descendents AD
      ON (E.parent_id = AD.id)
  )
    SELECT element_add(replace(element_pathname(id), '/ports/head/', '/ports/branches/' || a_branch_name || '/'), directory_file_flag)
      FROM all_descendents
     WHERE status = 'A'
  LOOP
  END LOOP;

  IF FOUND THEN
    -- based on the element_id of the pathname for newly created port, grab the port id
     SELECT pathname_id('/ports/branches/' || a_branch_name || '/' || a_category_name || '/' || a_port_name || '/')
      INTO l_element_id_of_new_port;

  END IF;

  return l_element_id_of_new_port;

  END;
$$  LANGUAGE 'plpgsql';


--
-- this function creates a port on a branch, using head as a starting point.
-- i.e. /ports/branches/BRANCH/CATEGORY/PORT is created
-- from /ports/head/CATEGORY/PORT.
-- It returns the id of the new port.
--
CREATE OR REPLACE FUNCTION CreatePort(text, text, text) returns int4 AS $$
DECLARE
  a_category_name ALIAS for $1;
  a_port_name     ALIAS for $2;
  a_branch_name   ALIAS for $3;

  l_element_id_of_new_port int4;
  l_port_id                int4;

  BEGIN

  SELECT CopyPortFilesFromHeadToBranch(a_category_name, a_port_name, a_branch_name)
    INTO l_element_id_of_new_port;

  IF FOUND THEN
    SELECT createport(l_element_id_of_new_port, getcategory(a_category_name))
      INTO l_port_id;

    SELECT PA.id
      FROM ports_active PA
     WHERE PA.element_id = l_element_id_of_new_port
      INTO l_port_id;

    -- we need to set up the port_categories table first...

    RAISE WARNING 'about to call GetPort and I only want stuff from head, so branch agnostic is OK.';

    INSERT INTO ports_categories(port_id, category_id)
     SELECT l_port_id, PC.category_id
       FROM ports_categories PC
      WHERE PC.port_id = GetPort(a_category_name, a_port_name);

  END IF;

  RETURN l_port_id;

  END;
$$  LANGUAGE 'plpgsql';