Dan Langille

I've been playing with Open Source software, starting with FreeBSD, since New Zealand Post installed DSL on my street in 1998. From there, I started writing at The FreeBSD Diary, moving my work here after I discovered WordPress. Along the way, I started the BSDCan and PGCon conferences. I slowly moved from software development into full time systems administration and now work for very-well known company who has been a big force in the security industry.

Aug 162020
 

One of the lesser known FreshPorts features is searching pkg-plist. You want nslookup and dig but you can’t find the package which installs that. The FreshPorts search can find it for you.

When a new commit occurs for a given port, FreshPorts extracts and store the pkg-plist information. This allows you to search.

How does FreshPorts get that data?

make generate-plist

This is the script which runs make generate-plist:

[dan@dev-ingress01:/var/db/freshports/ports-jail] $ cat make-generate-plist.sh
#!/bin/sh
#
# This extracts the config options for a given port.
#
# sudo /usr/sbin/chroot -u USER JAIL /make-generate-plist.sh REPO_PATH PORT
#
# where USER      - user as which to execute the commands.  e.g. dan
#       JAIL      - path to the jail created with the create-jail-directories.sh command. e.g. /usr/jail/FreshPorts
#       REPO_PATH - path to the SVN repository e.g. /usr/local/PORTS-RELENG_9_1_0
#       PORT      - sysutils/bacula-server
#

. ./vars.sh

REPO_PATH=$1
PORT=$2

cd ${REPO_PATH}/${PORT}

${MAKE} generate-plist PORTSDIR=${REPO_PATH} OPTIONSFILE=${LOCALBASE} -f ${REPO_PATH}/${PORT}/Makefile > /dev/null
if [ $? == 0 ]
then
  cat work/.PLIST.mktmp
  ${MAKE} clean > /dev/null
fi
[dan@dev-ingress01:/var/db/freshports/ports-jail] $ 

Let’s try running that from the command line and see what is generated:

$ cd :/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools
$ sudo make generate-plist -f ./Makefile PORTSDIR=../../ 
===>   Generating temporary packing list

Next, let’s look at the file we just created:

$ make -V WRKDIR
/var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work
$ cat /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/.PLIST.mktmp
/usr/local/share/licenses/p5-Mail-Tools-2.21/catalog.mk
/usr/local/share/licenses/p5-Mail-Tools-2.21/LICENSE
/usr/local/share/licenses/p5-Mail-Tools-2.21/ART10
/usr/local/share/licenses/p5-Mail-Tools-2.21/GPLv1+
lib/perl5/site_perl/Mail/Address.pm
lib/perl5/site_perl/Mail/Address.pod
lib/perl5/site_perl/Mail/Cap.pm
lib/perl5/site_perl/Mail/Cap.pod
lib/perl5/site_perl/Mail/Field.pm
lib/perl5/site_perl/Mail/Field.pod
lib/perl5/site_perl/Mail/Field/AddrList.pm
lib/perl5/site_perl/Mail/Field/AddrList.pod
lib/perl5/site_perl/Mail/Field/Date.pm
lib/perl5/site_perl/Mail/Field/Date.pod
lib/perl5/site_perl/Mail/Field/Generic.pm
lib/perl5/site_perl/Mail/Field/Generic.pod
lib/perl5/site_perl/Mail/Filter.pm
lib/perl5/site_perl/Mail/Filter.pod
lib/perl5/site_perl/Mail/Header.pm
lib/perl5/site_perl/Mail/Header.pod
lib/perl5/site_perl/Mail/Internet.pm
lib/perl5/site_perl/Mail/Internet.pod
lib/perl5/site_perl/Mail/Mailer.pm
lib/perl5/site_perl/Mail/Mailer.pod
lib/perl5/site_perl/Mail/Mailer/qmail.pm
lib/perl5/site_perl/Mail/Mailer/rfc822.pm
lib/perl5/site_perl/Mail/Mailer/sendmail.pm
lib/perl5/site_perl/Mail/Mailer/smtp.pm
lib/perl5/site_perl/Mail/Mailer/smtps.pm
lib/perl5/site_perl/Mail/Mailer/testfile.pm
lib/perl5/site_perl/Mail/Send.pm
lib/perl5/site_perl/Mail/Send.pod
lib/perl5/site_perl/Mail/Util.pm
lib/perl5/site_perl/Mail/Util.pod
lib/perl5/site_perl/MailTools.pm
lib/perl5/site_perl/MailTools.pod
lib/perl5/site_perl/man/man3/Mail::Address.3.gz
lib/perl5/site_perl/man/man3/Mail::Cap.3.gz
lib/perl5/site_perl/man/man3/Mail::Field.3.gz
lib/perl5/site_perl/man/man3/Mail::Field::AddrList.3.gz
lib/perl5/site_perl/man/man3/Mail::Field::Date.3.gz
lib/perl5/site_perl/man/man3/Mail::Field::Generic.3.gz
lib/perl5/site_perl/man/man3/Mail::Filter.3.gz
lib/perl5/site_perl/man/man3/Mail::Header.3.gz
lib/perl5/site_perl/man/man3/Mail::Internet.3.gz
lib/perl5/site_perl/man/man3/Mail::Mailer.3.gz
lib/perl5/site_perl/man/man3/Mail::Send.3.gz
lib/perl5/site_perl/man/man3/Mail::Util.3.gz
lib/perl5/site_perl/man/man3/MailTools.3.gz
@owner
@group
@mode

That matches what I see (today) on the webpage for that port.

Clean up:

$ make clean
===>  Cleaning for p5-Mail-Tools-2.21

make makeplist

Let’s try make makeplist:

[freshports@dev-ingress01 ~/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools]$ make makeplist -f ./Makefile PORTSDIR=../../
===>  License ART10 GPLv1+ accepted by the user
===>   p5-Mail-Tools-2.21 depends on file: /usr/local/sbin/pkg - found
=> MailTools-2.21.tar.gz doesn't seem to exist in /var/ports/distfiles/.
=> /var/ports/distfiles/ is not writable by you; cannot fetch.
*** Error code 1

Stop.
make: stopped in /var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools

This required fetching the DISTFILES.

From another ssh session, as another user, I ran this:

$ ls -ld /var/ports/distfiles/
drwxr-xr-x  2 root  wheel  4 Jun 10  2019 /var/ports/distfiles/
$ sudo chown -R freshports:freshports /var/ports/distfiles/

Let’s try that again:

$ make makeplist -f ./Makefile PORTSDIR=../../
===>  License ART10 GPLv1+ accepted by the user
===>   p5-Mail-Tools-2.21 depends on file: /usr/local/sbin/pkg - found
=> MailTools-2.21.tar.gz doesn't seem to exist in /var/ports/distfiles/.
=> Attempting to fetch https://cpan.metacpan.org/modules/by-module/Mail/MailTools-2.21.tar.gz
MailTools-2.21.tar.gz                                   56 kB 2943 kBps    00s
===> Fetching all distfiles required by p5-Mail-Tools-2.21 for building
===>  Extracting for p5-Mail-Tools-2.21
=> SHA256 Checksum OK for MailTools-2.21.tar.gz.
===>  Patching for p5-Mail-Tools-2.21
===>   p5-Mail-Tools-2.21 depends on package: p5-TimeDate>=0 - found
===>   p5-Mail-Tools-2.21 depends on package: perl5>=5.32.r0<5.33 - found
===>   p5-Mail-Tools-2.21 depends on package: perl5>=5.32.r0<5.33 - found
===>  Configuring for p5-Mail-Tools-2.21
Checking if your kit is complete...
Looks good
Generating a Unix-style Makefile
Writing Makefile for MailTools
Writing MYMETA.yml and MYMETA.json
===>  Building for p5-Mail-Tools-2.21
--- blib/lib/.exists ---
--- blib/arch/.exists ---
--- blib/lib/auto/MailTools/.exists ---
--- blib/arch/auto/MailTools/.exists ---
--- blib/bin/.exists ---
--- blib/script/.exists ---
--- blib/man1/.exists ---
--- blib/man3/.exists ---
--- config ---
--- subdirs ---
--- dynamic ---
--- ppd ---
--- pm_to_blib ---
--- blibdirs ---
--- config ---
--- dynamic ---
--- linkext ---
--- pm_to_blib ---
cp lib/Mail/Cap.pm blib/lib/Mail/Cap.pm
cp lib/Mail/Send.pm blib/lib/Mail/Send.pm
cp lib/Mail/Filter.pm blib/lib/Mail/Filter.pm
cp lib/Mail/Field/AddrList.pod blib/lib/Mail/Field/AddrList.pod
cp lib/Mail/Cap.pod blib/lib/Mail/Cap.pod
cp lib/Mail/Field/Generic.pod blib/lib/Mail/Field/Generic.pod
cp lib/MailTools.pm blib/lib/MailTools.pm
cp lib/Mail/Mailer/qmail.pm blib/lib/Mail/Mailer/qmail.pm
cp lib/Mail/Address.pod blib/lib/Mail/Address.pod
cp lib/Mail/Util.pod blib/lib/Mail/Util.pod
cp lib/Mail/Mailer/smtp.pm blib/lib/Mail/Mailer/smtp.pm
cp lib/Mail/Header.pm blib/lib/Mail/Header.pm
cp lib/Mail/Field.pm blib/lib/Mail/Field.pm
cp lib/Mail/Filter.pod blib/lib/Mail/Filter.pod
cp lib/Mail/Field.pod blib/lib/Mail/Field.pod
cp lib/Mail/Mailer/sendmail.pm blib/lib/Mail/Mailer/sendmail.pm
cp lib/Mail/Mailer/rfc822.pm blib/lib/Mail/Mailer/rfc822.pm
cp lib/Mail/Field/Date.pm blib/lib/Mail/Field/Date.pm
cp lib/Mail/Header.pod blib/lib/Mail/Header.pod
cp lib/Mail/Field/Date.pod blib/lib/Mail/Field/Date.pod
cp lib/Mail/Internet.pod blib/lib/Mail/Internet.pod
cp lib/Mail/Mailer.pod blib/lib/Mail/Mailer.pod
cp lib/Mail/Field/AddrList.pm blib/lib/Mail/Field/AddrList.pm
cp lib/Mail/Mailer/smtps.pm blib/lib/Mail/Mailer/smtps.pm
cp lib/Mail/Field/Generic.pm blib/lib/Mail/Field/Generic.pm
cp lib/Mail/Internet.pm blib/lib/Mail/Internet.pm
cp lib/Mail/Address.pm blib/lib/Mail/Address.pm
cp lib/Mail/Mailer/testfile.pm blib/lib/Mail/Mailer/testfile.pm
cp lib/Mail/Util.pm blib/lib/Mail/Util.pm
cp lib/Mail/Mailer.pm blib/lib/Mail/Mailer.pm
cp lib/Mail/Send.pod blib/lib/Mail/Send.pod
cp lib/MailTools.pod blib/lib/MailTools.pod
--- pure_all ---
--- manifypods ---
Manifying 13 pod documents
--- all ---
===>  Staging for p5-Mail-Tools-2.21
===>   p5-Mail-Tools-2.21 depends on package: p5-TimeDate>=0 - found
===>   p5-Mail-Tools-2.21 depends on package: perl5>=5.32.r0<5.33 - found
===>   Generating temporary packing list
Manifying 13 pod documents
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/MailTools.pm
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/MailTools.pod
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/Mail/Field.pod
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/Mail/Mailer.pm
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/Mail/Mailer.pod
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/Mail/Field.pm
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/Mail/Address.pm
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/Mail/Filter.pod
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/Mail/Cap.pm
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/Mail/Internet.pod
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/Mail/Util.pm
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/Mail/Internet.pm
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/Mail/Send.pm
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/Mail/Header.pm
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/Mail/Header.pod
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/Mail/Cap.pod
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/Mail/Send.pod
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/Mail/Address.pod
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/Mail/Filter.pm
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/Mail/Util.pod
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/Mail/Field/Date.pod
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/Mail/Field/Generic.pod
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/Mail/Field/Generic.pm
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/Mail/Field/AddrList.pm
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/Mail/Field/Date.pm
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/Mail/Field/AddrList.pod
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/Mail/Mailer/qmail.pm
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/Mail/Mailer/smtps.pm
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/Mail/Mailer/sendmail.pm
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/Mail/Mailer/testfile.pm
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/Mail/Mailer/rfc822.pm
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/Mail/Mailer/smtp.pm
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/man/man3/Mail::Send.3
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/man/man3/Mail::Field::Generic.3
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/man/man3/Mail::Util.3
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/man/man3/Mail::Field.3
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/man/man3/Mail::Address.3
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/man/man3/Mail::Field::Date.3
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/man/man3/Mail::Mailer.3
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/man/man3/Mail::Cap.3
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/man/man3/Mail::Internet.3
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/man/man3/Mail::Header.3
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/man/man3/MailTools.3
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/man/man3/Mail::Filter.3
Installing /var/ports/var/db/freshports/ports-jail/var/db/repos/PORTS-head/mail/p5-Mail-Tools/work/stage/usr/local/lib/perl5/site_perl/man/man3/Mail::Field::AddrList.3
====> Compressing man pages (compress-man)
/you/have/to/check/what/makeplist/gives/you
%%SITE_PERL%%/Mail/Address.pm
%%SITE_PERL%%/Mail/Address.pod
%%SITE_PERL%%/Mail/Cap.pm
%%SITE_PERL%%/Mail/Cap.pod
%%SITE_PERL%%/Mail/Field.pm
%%SITE_PERL%%/Mail/Field.pod
%%SITE_PERL%%/Mail/Field/AddrList.pm
%%SITE_PERL%%/Mail/Field/AddrList.pod
%%SITE_PERL%%/Mail/Field/Date.pm
%%SITE_PERL%%/Mail/Field/Date.pod
%%SITE_PERL%%/Mail/Field/Generic.pm
%%SITE_PERL%%/Mail/Field/Generic.pod
%%SITE_PERL%%/Mail/Filter.pm
%%SITE_PERL%%/Mail/Filter.pod
%%SITE_PERL%%/Mail/Header.pm
%%SITE_PERL%%/Mail/Header.pod
%%SITE_PERL%%/Mail/Internet.pm
%%SITE_PERL%%/Mail/Internet.pod
%%SITE_PERL%%/Mail/Mailer.pm
%%SITE_PERL%%/Mail/Mailer.pod
%%SITE_PERL%%/Mail/Mailer/qmail.pm
%%SITE_PERL%%/Mail/Mailer/rfc822.pm
%%SITE_PERL%%/Mail/Mailer/sendmail.pm
%%SITE_PERL%%/Mail/Mailer/smtp.pm
%%SITE_PERL%%/Mail/Mailer/smtps.pm
%%SITE_PERL%%/Mail/Mailer/testfile.pm
%%SITE_PERL%%/Mail/Send.pm
%%SITE_PERL%%/Mail/Send.pod
%%SITE_PERL%%/Mail/Util.pm
%%SITE_PERL%%/Mail/Util.pod
%%SITE_PERL%%/MailTools.pm
%%SITE_PERL%%/MailTools.pod
%%SITE_ARCH%%/auto/MailTools/.packlist
%%PERL5_MAN3%%/Mail::Address.3.gz
%%PERL5_MAN3%%/Mail::Cap.3.gz
%%PERL5_MAN3%%/Mail::Field.3.gz
%%PERL5_MAN3%%/Mail::Field::AddrList.3.gz
%%PERL5_MAN3%%/Mail::Field::Date.3.gz
%%PERL5_MAN3%%/Mail::Field::Generic.3.gz
%%PERL5_MAN3%%/Mail::Filter.3.gz
%%PERL5_MAN3%%/Mail::Header.3.gz
%%PERL5_MAN3%%/Mail::Internet.3.gz
%%PERL5_MAN3%%/Mail::Mailer.3.gz
%%PERL5_MAN3%%/Mail::Send.3.gz
%%PERL5_MAN3%%/Mail::Util.3.gz
%%PERL5_MAN3%%/MailTools.3.gz
$ 

That list is arguably better than the first list. Here are the issues I see:

  1. Variables (e.g. %%SITE_PERL%%) are better than actual pathnames because that shows you where things might go, as opposed to the current default configuration.
  2. Full pathnames are better than variables because you’re looking for an actual file – non developers will want this.
  3. It is easier to get the output from make generate-plist than make makeplist.
  4. The output of make generate-plist is more comprehensive (e.g. it includes catalog.mk
  5. make makeplist requires the DISTFILES

Hmm.

Aug 152020
 

A few days ago, I did the database work to ignore case when browsing to a cat/port. Tonight, I did the website side.

database access

First, I altered the database function (in classes/element_record.php) to use the new stored procedure:

-       function FetchByName($Name) {
+       function FetchByName($Name, $caseSensitive = true) {
+               $Debug = 0;
+
+               if ($Debug) echo "looking for '$Name' and caseSensitive is '$caseSensitive'<br>";
                if (IsSet($Name)) {
                        $this->element_pathname = $Name;
                        $this->id = '';
                }
-               $sql = "select * from elementGet('" . pg_escape_string($Name) . "')";
 
-               $result = pg_exec($this->dbh, $sql);
+               if ($caseSensitive) {
+                       $sql = "select * from elementGet('" . pg_escape_string($Name) . "')";
+                       if ($Debug) echo "invoking $sql<br>";
+                       $result = pg_exec($this->dbh, $sql);
+               } else {
+                       $result = pg_query_params($this->dbh, 'select * from elementGetCaseInsensitive($1)', array($Name));
+               }
+
                if ($result) {
+                       if ($Debug) echo "we got a result<br>";
+
                        $numrows = pg_numrows($result);
+                       if ($Debug) echo "we have '$numrows' rows<br>";
                        if ($numrows == 1) {
                                $myrow = pg_fetch_array ($result, 0);
                                $this->PopulateValues($myrow);

On line one, I modify the function to pass in an optional parameter which preserves the original behavior.

Line 10 is the original code.

Lines 13-20 invokes the elementGetCaseInsensitive() stored procedure to work on a case insensitive solution.

Redirecting when finding a different match

When getting a result, we check to see if the path differs. If we get a result, we know it matches, but we need to know if it’s a different case. Here is that code (from rewrite/functions.php):

-       if ($ElementRecord->FetchByName('/ports/head/' . $pathname)) {
+       if ($ElementRecord->FetchByName(FRESHPORTS_PORTS_TREE_PREFIX . PATH_NAME, 0)) {
                $IsElement = true;
                if ($Debug) echo 'we found an element for that<br>';
+               if ($Debug) echo "we have: '$ElementRecord->element_pathname'<br>";
+               if ($Debug) echo " we had: '" . FRESHPORTS_PORTS_TREE_PREFIX . PATH_NAME . "'<br>";
+               if (PathnameDiffers($ElementRecord->element_pathname . '/', FRESHPORTS_PORTS_TREE_PREFIX . PATH_NAME)) {
+                       # in a case insensitive search, we want to redirect if the case was wrong
+                       if ($Debug) echo "we are redirecting to '" . $ElementRecord->element_pathname . "/'<br>";
+                       if ($Debug) echo 'which normalizes to ' . str_replace(FRESHPORTS_PORTS_TREE_PREFIX, '/', $ElementRecord->element_pathname . '/<br>');
+                       $https = ((!empty($_SERVER['HTTPS'])) && ($_SERVER['HTTPS'] != 'off'));
+                       if ($https) {
+                               $protocol = "https";
+                       } else {
+                               $protocol = "http";
+                       }
+
+                       header("HTTP/1.1 301 Moved Permanently");
+                       header('Location: ' . $protocol . '://' . $_SERVER['HTTP_HOST'] . str_replace(FRESHPORTS_PORTS_TREE_PREFIX, '/', $ElementRecord->element_pathname . '/'));
+                       exit;
+               }
+       }
+

Line 7 invokes PathDiffers to do a comparison between the two strings. This is not a simple A != B operation. sysutils/bacula9-server is not the same as sysutils/bacula9-server/ despite them showing the same content on the website. The paths, or URIs, must first be ‘normalized’ (that’s my chosen terminology, not anything official).

If the user has included a trailing slash (/) on their URL, the code will strip that before looking for a matching entry in the element table.

The debugging information shows some of this:

the URI is
'/sysutils/bacula9-SERVER/'

the url parts are
array(1) {
  ["path"]=>
  string(25) "/sysutils/bacula9-SERVER/"
}

The pathname is '/sysutils/bacula9-SERVER/'
we found an element for that
we have: '/ports/head/sysutils/bacula9-server'
we had: '/ports/head/sysutils/bacula9-SERVER'
we are redirecting to '/ports/head/sysutils/bacula9-server/'
which normalizes to /sysutils/bacula9-server/

The function in question looks like this:

function PathnameDiffers($Path1, $Path2) {
    # if the two paths are different, we might want to redirect
    # if one path ends in a / and the other does not, adjust them
    if (substr($Path1, -1) == '/') {
        if (substr($Path2, -1) != '/') {
            $Path2 .= '/';
        }
    } else {
        if (substr($Path2, -1) == '/') {
            $Path1 .= '/';
        }
    }

    return $Path1 != $Path2;
}

That redirect happens early enough in the code to handle all elements, not just ports (e.g. cat/ports/Makefile or distinfo or pkg-descr).

Aug 112020
 

This idea came to me today from swills:

stupid question, would it be possible to make the urls on freshports case insensitive?

so that like https://www.freshports.org/x11/libx11 would go to https://www.freshports.org/x11/libX11

After confirming this was for manually typed URLs (as opposed to generated links), I started looking into it. This is what I found.

In this post, we’re going to look at some old PHP database code and the PostgreSQL stored procedures behind them. I’ll show you the multi-line stored procedure now in use and how a single query will now do the same thing. This change isn’t some clever coding or another solution. The database has evolved and the procedures did not have to. Until now.

Going from sysutils/py-mqttwarn to a database entry

I will base all my links to code on the 1.50.5 release.

If the Nginx webserver cannot follow the URL to a file on disk, it invokes www/–/index.php via this bit of declartion:

  location / {
    try_files $uri $uri/ @missing;
  }

  location @missing {
    rewrite ^ /--/index.php break;

    fastcgi_split_path_info ^(.+\.php)(/.+)$;
    fastcgi_pass unix:/var/run/php-fpm.sock;
    fastcgi_index index.php;
    fastcgi_param SCRIPT_FILENAME $request_filename;
    include fastcgi_params;
  }

Line 2 says: try finding the URL using files on disk and if you file, go do what @missing points to.

Line 6 say: invoke /–/index.php for things you don’t find in a file.

Why /–/index.php? I wanted to put this file somewhere that it would be highly unlikely to collide with anything in the FreeBSD repo (with regards to directories and file names).

The code checks for other special stuff ((badges, scripts, API, etc) before invoking freshports_Parse404URI() on line 148. I will discuss that soon, but first: if nothing else is picked up by /–/index.php, then it’s time to invoke rewrite/missing.php

freshports_Parse404URI() is declared in rewrite/functions.php and the code relevant to our journey is on line 70:

if ($ElementRecord->FetchByName('/ports/head/' . $pathname)) {

Ignoring that hardcoded pathname for now, this function gets passed something like this:

/ports/head/sysutils/py-mqttwarn

There might be more on the path: branch name, commit number, etc.

FetchByName() is located on line 38 of classes/element_record

Now we are getting into the oldest code of FreshPorts. I’m sure this stuff hasn’t changed in over 15 years.

The database function elementGet() is invoked which looks like this:

CREATE OR REPLACE FUNCTION elementGet (text) RETURNS SETOF element_type AS $$
   SELECT id,
          name::text,
          directory_file_flag::text,
          status::text,
          case when IsCategory(Pathname_ID($1)) IS NULL THEN FALSE ELSE TRUE END,
          case when IsPort(    Pathname_ID($1)) IS NULL THEN FALSE ELSE TRUE END,
          element_pathname(id)
     FROM element
    WHERE id = PathName_ID($1);
$$ LANGUAGE SQL STABLE;

The key in this function is PathName_ID() which is used three times and uniquely identifies a single row in the element table.

Element table

The element table is an implementation of a filesystem using a self-referential solution (i.e. it is recursive).

The table (without constraints, references, and triggers) looks like this:

freshports.dev=# \d element
                                     Table "public.element"
       Column        |     Type     | Collation | Nullable |               Default               
---------------------+--------------+-----------+----------+-------------------------------------
 id                  | integer      |           | not null | nextval('element_id_seq'::regclass)
 name                | text         |           | not null | 
 parent_id           | integer      |           |          | 
 directory_file_flag | character(1) |           | not null | 
 status              | character(1) |           | not null | 
Indexes:
    "element_pkey" PRIMARY KEY, btree (id)
    "element_delete" btree (status) WHERE status = 'D'::bpchar
    "element_name" btree (name)
    "element_parent_id" btree (parent_id)

An entry in the top level directory will have parent_id = NULL.

Pathname_ID function

Pathname_ID() is a function which progresses along a supplied pathname from left to right following the each entry from the element table.

CREATE OR REPLACE FUNCTION Pathname_ID(text) returns int4 AS $$

   DECLARE
      in_element_pathname ALIAS FOR $1;

      debug_str         text;
      element_id        int4;
      element_parent_id int4;
      element_name      text;
      element_pathname  text;
      position          int4;
      slash             text;

begin
  slash            := '/';
  element_pathname := in_element_pathname;
  element_id       := -1;

  IF substr(element_pathname,1,1) = slash THEN
    element_pathname := substr(element_pathname, 2);
  END IF;

  position := strpos(element_pathname, slash);
  IF position = 0 THEN
    element_name     := element_pathname;
    element_pathname := '';
  ELSE
    element_name     := substr(element_pathname, 1, position-1);
    element_pathname := substr(element_pathname, position+1);
  END IF;

  select id 
    into element_parent_id
    from element 
   where element.name = element_name
     and parent_id is null;

  debug_str := debug_str || '1 = ' || element_name;

  WHILE(char_length(element_pathname) > 0) LOOP
    position := strpos(element_pathname, slash);
    if position = 0 then
       element_name     := element_pathname;
       element_pathname := '';
    else
       element_name     := substr(element_pathname, 1, position-1);
       element_pathname := substr(element_pathname, position+1);
    end if;

    debug_str := debug_str || '1 = ' || element_name;
    select id 
      into element_parent_id
      from element 
     where element.name = element_name 
       and parent_id    = element_parent_id;

    debug_str := debug_str || ' 2 = ' || element_name;
    debug_str := debug_str || ' 3 = ' || element_parent_id;
  end loop;

  return element_parent_id;
END;
$$  LANGUAGE 'plpgsql' STABLE;

The key lines are 35 and 54. To meet the goal of a case insensitive match, those operators would need to be changed from = to a case insensitive comparison, such as:

where lower(element.name) = lower(element_name)

I do not want to alter this code which is so far down the tree. Instead, I will create a new function which will take a different approach. It will also take advantage of new database features which did not exist 8 years ago, let along 15+ years ago.

Testing new queries

I did some playing around with queries today, looking to find the element_id for a given port, in our example sysutils/py-mqttwarn.

I started off with this, show me things named x11:

freshports.stage=# select *, element_pathname(id) from element where lower(name) = lower('X11');
   id    | name | parent_id | directory_file_flag | status |                    element_pathname                    
---------+------+-----------+---------------------+--------+--------------------------------------------------------
   43426 | x11  |     33542 | F                   | A      | /ports/head/misc/kdeutils11/x11
   52955 | x11  |      4760 | F                   | A      | /ports/head/misc/kdeutils2/x11
   58887 | x11  |     50378 | F                   | A      | /ports/head/textproc/xml-i18n-tools/x11
   58962 | x11  |      3582 | F                   | A      | /ports/head/www/gtkhtml/x11
   83338 | x11  |     77344 | D                   | A      | /doc/head/en_US.ISO8859-1/books/handbook/x11
  104943 | x11  |    104814 | D                   | A      | /doc/head/fr_FR.ISO8859-1/books/handbook/x11
   99879 | x11  |     86040 | D                   | A      | /doc/head/ja_JP.eucJP/books/handbook/x11
  200645 | x11  |    200550 | D                   | A      | /doc/head/nl_NL.ISO8859-1/books/handbook/x11
  110618 | x11  |     77691 | D                   | A      | /doc/head/de_DE.ISO8859-1/books/handbook/x11
  112293 | x11  |    112148 | D                   | A      | /doc/head/it_IT.ISO8859-15/books/handbook/x11
  113146 | X11  |     77283 | D                   | A      | /base/head/release/scripts/X11
  200746 | x11  |    200655 | D                   | A      | /doc/head/zh_CN.GB2312/books/handbook/x11
  171234 | x11  |    110131 | D                   | A      | /doc/head/ru_RU.KOI8-R/books/handbook/x11
  271292 | x11  |    271177 | D                   | A      | /doc/head/mn_MN.UTF-8/books/handbook/x11
  200939 | x11  |    110120 | D                   | A      | /doc/head/es_ES.ISO8859-1/books/handbook/x11
  234867 | x11  |    234757 | D                   | A      | /doc/head/zh_TW.Big5/books/handbook/x11
  255132 | x11  |    255022 | D                   | A      | /doc/head/pl_PL.ISO8859-2/books/handbook/x11
  288692 | x11  |    288608 | D                   | A      | /doc/head/el_GR.ISO8859-7/books/handbook/x11
  300947 | x11  |    300813 | D                   | A      | /doc/head/hu_HU.ISO8859-2/books/handbook/x11
  433144 | x11  |    433041 | D                   | A      | /ru_RU.KOI8-R/books/handbook/x11
  433549 | x11  |    433544 | D                   | A      | /mn_MN.UTF-8/books/handbook/x11
  434309 | x11  |    433663 | D                   | A      | /ja_JP.eucJP/books/handbook/x11
  439412 | x11  |    432543 | D                   | A      | /en_US.ISO8859-1/books/handbook/x11
  439459 | x11  |    433298 | D                   | A      | /nl_NL.ISO8859-1/books/handbook/x11
  457198 | x11  |    111733 | D                   | A      | /doc/head/pt_BR.ISO8859-1/books/handbook/x11
     231 | x11  |    464087 | D                   | A      | /ports/head/x11
  465674 | x11  |    465603 | D                   | A      | /doc/translations/nl_NL.ISO8859-1/books/handbook/x11
  501886 | x11  |    465481 | D                   | A      | /doc/translations/en_US.ISO8859-1/books/handbook/x11
  537516 | x11  |    537439 | D                   | A      | /doc/head/zh_CN.UTF-8/books/handbook/x11
  663947 | x11  |    663946 | D                   | A      | /doc/release/10.2.0/en_US.ISO8859-1/books/handbook/x11
  587542 | x11  |    587471 | D                   | D      | /doc/head/zh_TW.UTF-8/books/handbook/x11
  745067 | x11  |    743628 | D                   | A      | /ports/branches/2016Q4/x11
  770464 | x11  |    763349 | D                   | A      | /ports/branches/2017Q1/x11
  792524 | x11  |    790218 | D                   | A      | /ports/branches/2017Q2/x11
  812565 | x11  |    812271 | D                   | A      | /ports/branches/2017Q3/x11
  829995 | x11  |    829562 | D                   | A      | /ports/branches/2017Q4/x11
  870434 | x11  |    867105 | D                   | A      | /ports/branches/2018Q2/x11
  882372 | x11  |    881009 | D                   | A      | /ports/branches/2018Q3/x11
  901665 | x11  |    901631 | D                   | A      | /ports/branches/2018Q4/x11
  914724 | x11  |    914354 | D                   | A      | /ports/branches/2019Q1/x11
  936002 | x11  |    935538 | D                   | A      | /ports/branches/2019Q2/x11
  954616 | x11  |    949697 | D                   | A      | /ports/branches/2019Q3/x11
  971674 | x11  |    961357 | D                   | A      | /ports/branches/2019Q4/x11
  978479 | x11  |    978429 | D                   | A      | /ports/branches/2020Q1/x11
 1014222 | x11  |   1012899 | D                   | A      | /ports/branches/2020Q2/x11
 1055981 | x11  |   1053714 | D                   | A      | /ports/branches/2020Q3/x11
(46 rows)

Based on line 29, show me a child of that element named libX11:

freshports.stage=# select *, element_pathname(id) from element where lower(name) = lower('libX11') and parent_id = 231;
   id   |  name  | parent_id | directory_file_flag | status |    element_pathname    
--------+--------+-----------+---------------------+--------+------------------------
 163623 | libX11 |       231 | D                   | A      | /ports/head/x11/libX11
(1 row)

freshports.stage=# 

That’s the one.

Using this approach, I came up with few queries, which I later abandoned for another solution.

The abandoned queries

I tried this join of four tables:

freshports.stage=# explain analyse
freshports.stage-# SELECT E4.id, element_pathname(E4.id)
freshports.stage-#   FROM element E1, element E2, element E3, element E4
freshports.stage-#  WHERE E1.parent_id IS NULL
freshports.stage-#    AND E2.parent_id = E1.id
freshports.stage-#    AND E3.parent_id = E2.id
freshports.stage-#    AND E4.parent_id = E3.id
freshports.stage-#    AND E2.name = 'head'
freshports.stage-#    AND E1.name = 'ports'
freshports.stage-#    AND lower(E3.name) = lower('sysutils')
freshports.stage-#    AND lower(E4.name) = lower('py-mqttwarn');
                                                                     QUERY PLAN                                                                      
-----------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=20.05..74.95 rows=1 width=36) (actual time=1.833..2.299 rows=1 loops=1)
   ->  Nested Loop  (cost=19.62..67.62 rows=1 width=4) (actual time=0.126..0.138 rows=1 loops=1)
         ->  Nested Loop  (cost=19.19..27.23 rows=1 width=4) (actual time=0.061..0.063 rows=1 loops=1)
               ->  Bitmap Heap Scan on element e1  (cost=9.76..13.77 rows=1 width=4) (actual time=0.034..0.035 rows=1 loops=1)
                     Recheck Cond: ((name = 'ports'::text) AND (parent_id IS NULL))
                     Heap Blocks: exact=1
                     ->  BitmapAnd  (cost=9.76..9.76 rows=1 width=0) (actual time=0.030..0.030 rows=0 loops=1)
                           ->  Bitmap Index Scan on element_name  (cost=0.00..4.56 rows=17 width=0) (actual time=0.016..0.016 rows=38 loops=1)
                                 Index Cond: (name = 'ports'::text)
                           ->  Bitmap Index Scan on element_parent_id  (cost=0.00..4.95 rows=70 width=0) (actual time=0.010..0.010 rows=100 loops=1)
                                 Index Cond: (parent_id IS NULL)
               ->  Bitmap Heap Scan on element e2  (cost=9.43..13.45 rows=1 width=8) (actual time=0.023..0.024 rows=1 loops=1)
                     Recheck Cond: ((name = 'head'::text) AND (parent_id = e1.id))
                     Heap Blocks: exact=1
                     ->  BitmapAnd  (cost=9.43..9.43 rows=1 width=0) (actual time=0.020..0.020 rows=0 loops=1)
                           ->  Bitmap Index Scan on element_name  (cost=0.00..4.56 rows=17 width=0) (actual time=0.009..0.009 rows=15 loops=1)
                                 Index Cond: (name = 'head'::text)
                           ->  Bitmap Index Scan on element_parent_id  (cost=0.00..4.62 rows=26 width=0) (actual time=0.008..0.008 rows=5 loops=1)
                                 Index Cond: (parent_id = e1.id)
         ->  Index Scan using element_parent_id on element e3  (cost=0.43..40.38 rows=1 width=8) (actual time=0.064..0.074 rows=1 loops=1)
               Index Cond: (parent_id = e2.id)
               Filter: (lower(name) = 'sysutils'::text)
               Rows Removed by Filter: 88
   ->  Index Scan using element_parent_id on element e4  (cost=0.43..7.07 rows=1 width=8) (actual time=1.540..1.994 rows=1 loops=1)
         Index Cond: (parent_id = e3.id)
         Filter: (lower(name) = 'py-mqttwarn'::text)
         Rows Removed by Filter: 2380
 Planning Time: 0.689 ms
 Execution Time: 2.352 ms
(29 rows)

freshports.stage=# 

I tried rewriting with a JOIN JOIN approach, but that’s really just the previous query:

freshports.stage=# -- find a port which matches this
freshports.stage=# -- case insensitive
freshports.stage=# 
freshports.stage=# 
freshports.stage=# explain analyse
freshports.stage-# SELECT E4.id, element_pathname(E4.id)
freshports.stage-#   FROM element E1
freshports.stage-#   JOIN element E2 ON E1.parent_id IS NULL AND E1.name = 'ports'
freshports.stage-#                  AND E2.parent_id = E1.id AND E2.name = 'head'
freshports.stage-#   JOIN element E3 ON E3.parent_id = E2.id AND lower(E3.name) = lower('sysutils')
freshports.stage-#   JOIN element E4 ON E4.parent_id = E3.id AND lower(E4.name) = lower('bacula9-server');
                                                                     QUERY PLAN                                                                      
-----------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=20.05..74.95 rows=1 width=36) (actual time=3.729..4.788 rows=1 loops=1)
   ->  Nested Loop  (cost=19.62..67.62 rows=1 width=4) (actual time=0.278..0.309 rows=1 loops=1)
         ->  Nested Loop  (cost=19.19..27.23 rows=1 width=4) (actual time=0.129..0.132 rows=1 loops=1)
               ->  Bitmap Heap Scan on element e1  (cost=9.76..13.77 rows=1 width=4) (actual time=0.072..0.073 rows=1 loops=1)
                     Recheck Cond: ((name = 'ports'::text) AND (parent_id IS NULL))
                     Heap Blocks: exact=1
                     ->  BitmapAnd  (cost=9.76..9.76 rows=1 width=0) (actual time=0.063..0.063 rows=0 loops=1)
                           ->  Bitmap Index Scan on element_name  (cost=0.00..4.56 rows=17 width=0) (actual time=0.032..0.033 rows=38 loops=1)
                                 Index Cond: (name = 'ports'::text)
                           ->  Bitmap Index Scan on element_parent_id  (cost=0.00..4.95 rows=70 width=0) (actual time=0.023..0.023 rows=100 loops=1)
                                 Index Cond: (parent_id IS NULL)
               ->  Bitmap Heap Scan on element e2  (cost=9.43..13.45 rows=1 width=8) (actual time=0.047..0.049 rows=1 loops=1)
                     Recheck Cond: ((name = 'head'::text) AND (parent_id = e1.id))
                     Heap Blocks: exact=1
                     ->  BitmapAnd  (cost=9.43..9.43 rows=1 width=0) (actual time=0.041..0.041 rows=0 loops=1)
                           ->  Bitmap Index Scan on element_name  (cost=0.00..4.56 rows=17 width=0) (actual time=0.019..0.019 rows=15 loops=1)
                                 Index Cond: (name = 'head'::text)
                           ->  Bitmap Index Scan on element_parent_id  (cost=0.00..4.62 rows=26 width=0) (actual time=0.016..0.016 rows=5 loops=1)
                                 Index Cond: (parent_id = e1.id)
         ->  Index Scan using element_parent_id on element e3  (cost=0.43..40.38 rows=1 width=8) (actual time=0.147..0.173 rows=1 loops=1)
               Index Cond: (parent_id = e2.id)
               Filter: (lower(name) = 'sysutils'::text)
               Rows Removed by Filter: 88
   ->  Index Scan using element_parent_id on element e4  (cost=0.43..7.07 rows=1 width=8) (actual time=3.097..4.125 rows=1 loops=1)
         Index Cond: (parent_id = e3.id)
         Filter: (lower(name) = 'bacula9-server'::text)
         Rows Removed by Filter: 2380
 Planning Time: 1.608 ms
 Execution Time: 4.898 ms
(29 rows)

The next query involves passing the parent id down to the next level:

freshports.stage=# -- find a port which matches this
freshports.stage=# -- case insensitive
freshports.stage=# 
freshports.stage=# explain analyse
freshports.stage-# SELECT E4.ID, element_pathname(E4.id)
freshports.stage-#   FROM element E4
freshports.stage-#  WHERE lower(E4.name) = lower('py37-mqttwarn') AND E4.parent_id = (
freshports.stage(# SELECT E3.ID
freshports.stage(#   FROM element E3
freshports.stage(#  WHERE lower(E3.name) = lower('sysutils') AND E3.parent_id = (
freshports.stage(# SELECT E2.ID
freshports.stage(#   FROM element E2
freshports.stage(#  WHERE E2.name = 'head' AND E2.parent_id = (
freshports.stage(# SELECT E1.id
freshports.stage(#   FROM element E1 
freshports.stage(#  WHERE E1.parent_id IS NULL AND E1.name = 'ports')));
                                                                        QUERY PLAN                                                                         
-----------------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using element_parent_id on element e4  (cost=68.93..110.05 rows=1 width=36) (actual time=1.902..1.902 rows=0 loops=1)
   Index Cond: (parent_id = $2)
   Filter: (lower(name) = 'py37-mqttwarn'::text)
   Rows Removed by Filter: 2381
   InitPlan 3 (returns $2)
     ->  Index Scan using element_parent_id on element e3  (cost=27.64..68.51 rows=1 width=4) (actual time=0.121..0.134 rows=1 loops=1)
           Index Cond: (parent_id = $1)
           Filter: (lower(name) = 'sysutils'::text)
           Rows Removed by Filter: 88
           InitPlan 2 (returns $1)
             ->  Bitmap Heap Scan on element e2  (cost=23.20..27.21 rows=1 width=4) (actual time=0.061..0.061 rows=1 loops=1)
                   Recheck Cond: ((name = 'head'::text) AND (parent_id = $0))
                   Heap Blocks: exact=1
                   InitPlan 1 (returns $0)
                     ->  Bitmap Heap Scan on element e1  (cost=9.76..13.77 rows=1 width=4) (actual time=0.032..0.032 rows=1 loops=1)
                           Recheck Cond: ((name = 'ports'::text) AND (parent_id IS NULL))
                           Heap Blocks: exact=1
                           ->  BitmapAnd  (cost=9.76..9.76 rows=1 width=0) (actual time=0.027..0.027 rows=0 loops=1)
                                 ->  Bitmap Index Scan on element_name  (cost=0.00..4.56 rows=17 width=0) (actual time=0.011..0.011 rows=38 loops=1)
                                       Index Cond: (name = 'ports'::text)
                                 ->  Bitmap Index Scan on element_parent_id  (cost=0.00..4.95 rows=70 width=0) (actual time=0.013..0.013 rows=100 loops=1)
                                       Index Cond: (parent_id IS NULL)
                   ->  BitmapAnd  (cost=9.43..9.43 rows=1 width=0) (actual time=0.058..0.058 rows=0 loops=1)
                         ->  Bitmap Index Scan on element_name  (cost=0.00..4.56 rows=17 width=0) (actual time=0.016..0.016 rows=15 loops=1)
                               Index Cond: (name = 'head'::text)
                         ->  Bitmap Index Scan on element_parent_id  (cost=0.00..4.62 rows=26 width=0) (actual time=0.040..0.040 rows=5 loops=1)
                               Index Cond: (parent_id = $0)
 Planning Time: 0.235 ms
 Execution Time: 1.945 ms

The chosen query

The chosen query uses a relatively new table: element_pathname

freshports.stage=# \d element_pathname
            Table "public.element_pathname"
   Column   |  Type   | Collation | Nullable | Default 
------------+---------+-----------+----------+---------
 element_id | integer |           | not null | 
 pathname   | text    |           | not null | 
Indexes:
    "element_pathname_pathname" UNIQUE, btree (pathname)
    "element_pathname_element_id" btree (element_id)
Foreign-key constraints:
    "element_pathname_element_id_fkey" FOREIGN KEY (element_id) REFERENCES element(id) ON DELETE CASCADE

freshports.stage=# 

This table is maintained via triggers on the element table.

The same query here is:

freshports.stage=# explain analyse select *
FROM element_pathname where lower(pathname) = '/ports/head/sysutils/py-mqttwarn';
                                                             QUERY PLAN                                                              
-------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.00..19555.74 rows=5361 width=54) (actual time=158.979..160.785 rows=1 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Seq Scan on element_pathname  (cost=0.00..18019.64 rows=2234 width=54) (actual time=145.503..156.289 rows=0 loops=3)
         Filter: (lower(pathname) = '/ports/head/sysutils/py-mqttwarn'::text)
         Rows Removed by Filter: 351214
 Planning Time: 0.483 ms
 Execution Time: 160.830 ms
(8 rows)

freshports.stage=#

Our query will always be on lower case, so let’s create an index on lower case. I am curious about the size of this index, and it added only 77MB to the database.

freshports.stage=# select pg_size_pretty(pg_indexes_size('element_pathname'));
 pg_size_pretty 
----------------
 107 MB
(1 row)

freshports.stage=# create index element_pathname_pathname_lc on element_pathname ((lower(pathname)));
CREATE INDEX
freshports.stage=# select pg_size_pretty(pg_indexes_size('element_pathname'));
 pg_size_pretty 
----------------
 184 MB
(1 row)

freshports.stage=# select count(*) from element_pathname;
  count  
---------
 1053643
(1 row)

freshports.stage=# 

Now let’s try that query again:

freshports.stage=# explain analyse select *                                                                          
FROM element_pathname where lower(pathname) = '/ports/head/sysutils/py-mqttwarn';
                                                                QUERY PLAN                                                                
------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on element_pathname  (cost=241.38..9537.09 rows=5268 width=54) (actual time=0.133..0.133 rows=1 loops=1)
   Recheck Cond: (lower(pathname) = '/ports/head/sysutils/py-mqttwarn'::text)
   Heap Blocks: exact=1
   ->  Bitmap Index Scan on element_pathname_pathname_lc  (cost=0.00..240.06 rows=5268 width=0) (actual time=0.108..0.108 rows=1 loops=1)
         Index Cond: (lower(pathname) = '/ports/head/sysutils/py-mqttwarn'::text)
 Planning Time: 0.087 ms
 Execution Time: 0.155 ms
(7 rows)

freshports.stage=# 

Now we’re using an index scan on that newly created lower case index.

Let’s do a few more queries to see how they go:

freshports.stage=# explain analyse select *
FROM element_pathname where lower(pathname) = '/ports/head/sysutils/bacula-server';
                                                                QUERY PLAN                                                                
------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on element_pathname  (cost=241.38..9537.09 rows=5268 width=54) (actual time=0.050..0.051 rows=1 loops=1)
   Recheck Cond: (lower(pathname) = '/ports/head/sysutils/bacula-server'::text)
   Heap Blocks: exact=1
   ->  Bitmap Index Scan on element_pathname_pathname_lc  (cost=0.00..240.06 rows=5268 width=0) (actual time=0.037..0.037 rows=1 loops=1)
         Index Cond: (lower(pathname) = '/ports/head/sysutils/bacula-server'::text)
 Planning Time: 0.073 ms
 Execution Time: 0.067 ms
(7 rows)
freshports.stage=# explain analyse select *
FROM element_pathname where lower(pathname) = '/ports/head/sysutils/bacula9-server';
                                                                QUERY PLAN                                                                
------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on element_pathname  (cost=241.38..9537.09 rows=5268 width=54) (actual time=0.065..0.066 rows=1 loops=1)
   Recheck Cond: (lower(pathname) = '/ports/head/sysutils/bacula9-server'::text)
   Heap Blocks: exact=1
   ->  Bitmap Index Scan on element_pathname_pathname_lc  (cost=0.00..240.06 rows=5268 width=0) (actual time=0.049..0.050 rows=1 loops=1)
         Index Cond: (lower(pathname) = '/ports/head/sysutils/bacula9-server'::text)
 Planning Time: 0.086 ms
 Execution Time: 0.088 ms
(7 rows)

freshports.stage=# 
freshports.stage=# explain analyse select *
FROM element_pathname where lower(pathname) = '/ports/head/lang/gcc10-devel';
                                                                QUERY PLAN                                                                
------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on element_pathname  (cost=241.38..9537.09 rows=5268 width=54) (actual time=0.132..0.132 rows=1 loops=1)
   Recheck Cond: (lower(pathname) = '/ports/head/lang/gcc10-devel'::text)
   Heap Blocks: exact=1
   ->  Bitmap Index Scan on element_pathname_pathname_lc  (cost=0.00..240.06 rows=5268 width=0) (actual time=0.116..0.117 rows=1 loops=1)
         Index Cond: (lower(pathname) = '/ports/head/lang/gcc10-devel'::text)
 Planning Time: 0.090 ms
 Execution Time: 0.155 ms
(7 rows)

freshports.stage=# 
freshports.stage=# explain analyse select *
FROM element_pathname where lower(pathname) = '/ports/head/lang/perl5.32';
                                                                QUERY PLAN                                                                
------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on element_pathname  (cost=241.38..9537.09 rows=5268 width=54) (actual time=0.059..0.060 rows=1 loops=1)
   Recheck Cond: (lower(pathname) = '/ports/head/lang/perl5.32'::text)
   Heap Blocks: exact=1
   ->  Bitmap Index Scan on element_pathname_pathname_lc  (cost=0.00..240.06 rows=5268 width=0) (actual time=0.044..0.044 rows=1 loops=1)
         Index Cond: (lower(pathname) = '/ports/head/lang/perl5.32'::text)
 Planning Time: 0.085 ms
 Execution Time: 0.081 ms
(7 rows)

Yes, I think this will do nicely.

Things change

There is code and tables in here which date back to late 1999 or early 2000. Since then, the database has grown from 15 tables in Nov 2001 to 80 tables today.

I’m not sure when this svg was created or how many tables it represents, but I know its a few years old now.

Database and code evolve over time and it is sometimes interesteing to see the paths they take.

Jul 302020
 

In my recent post, my conclusion preferred Rename over Delete & Add. Now I’ve changed my mind.

I looked at the code, and the database. Now I prefer Delete & Add.

The database

In the database, there is no way to represent the Rename action without modifying the structure of the commit_log_elements table:

freshports.devgit=# \d commit_log_elements
                                  Table "public.commit_log_elements"
    Column     |     Type     | Collation | Nullable |                     Default                     
---------------+--------------+-----------+----------+-------------------------------------------------
 id            | integer      |           | not null | nextval('commit_log_elements_id_seq'::regclass)
 commit_log_id | integer      |           | not null | 
 element_id    | integer      |           | not null | 
 revision_name | text         |           |          | 
 change_type   | character(1) |           | not null | 

The change_type field is controlled by this constraint:

Check constraints:
    "commit_log_elements_change_type" CHECK (change_type = 'A'::bpchar OR change_type = 'M'::bpchar OR change_type = 'R'::bpchar OR change_type = 'r'::bpchar)

Let’s look at the comments I set up for that constraint:

freshports.devgit=# \dd commit_log_elements_change_type
                                          Object descriptions
 Schema |              Name               |      Object      |               Description                
--------+---------------------------------+------------------+------------------------------------------
 public | commit_log_elements_change_type | table constraint |                                         +
        |                                 |                  | A - add                                 +
        |                                 |                  | M - modify                              +
        |                                 |                  | R - remove for subersion, delete for git+
        |                                 |                  | r - rename (added for git)
(1 row)

Each entry in this table documents an action (in the change_type column) and the file (element_id) upon which that action occurs. All for a given commit (commit_log_id). The revision_name column refers to the subversion revisions of a file in a given commit. For git commits, that field contains the commit hash.

Except for r (Rename). For that, we get only the origin, not the destination, of the rename.

An Add and Remove includes all the data.

A Rename does not. It omits the destination. I don’t see an easy solution for
Rename.

How much data?

We have 51 entries for r:

freshports.devgit=# select count(*) from commit_log_elements where change_type = 'r';
 count 
-------
    51
(1 row)

We can redo all those commits easily enough.

That’s why

That explains is why I’m going with Add and Remove.

But first, let’s run a test.

The test with new XML

This is the new XML I’m going to test with:

[dan@devgit-ingress01:/var/db/ingress/message-queues/testing-new] $ cat 2020.07.27.01.58.33.000078.fd2bb768b5a39d29758b8f74bc15c321d6b4f980.xml
<?xml version='1.0' encoding='UTF-8'?>
<UPDATES Version="1.4.0.0">
  <UPDATE>
    <DATE Year="2020" Month="7" Day="27"/>
    <TIME Timezone="UTC" Hour="1" Minute="58" Second="33"/>
    <OS Repo="ports" Id="FreeBSD" Branch="master"/>
    <LOG>editors/nvi2-port: Rename to editors/nvi2

Grrr... my addport notes failed me.</LOG>
    <PEOPLE>
      <UPDATER Handle="leres &lt;leres@FreeBSD.org&gt;"/>
    </PEOPLE>
    <COMMIT Hash="fd2bb768b5a39d29758b8f74bc15c321d6b4f980" HashShort="fd2bb76" Subject="editors/nvi2-port: Rename to editors/nvi2" EncodingLoses="false" Repository="ports"/>
    <FILES>
      <FILE Action="Modify" Path="editors/Makefile"/>
      <FILE Action="Delete" Path="editors/nvi2-port/Makefile"/>
      <FILE Action="Delete" Path="editors/nvi2-port/distinfo"/>
      <FILE Action="Delete" Path="editors/nvi2-port/pkg-descr"/>
      <FILE Action="Add" Path="editors/nvi2/Makefile"/>
      <FILE Action="Add" Path="editors/nvi2/distinfo"/>
      <FILE Action="Add" Path="editors/nvi2/pkg-descr"/>
    </FILES>
  </UPDATE>
</UPDATES>

I ran it through the processing:

$ sudo mv -i 2020.07.27.01.58.33.000078.fd2bb768b5a39d29758b8f74bc15c321d6b4f980.xml ~ingress/message-queues/incoming/

Looking in the logs I found this:

the size of @Files is 7
FILE ==: Modify, /ports/head/editors/Makefile, fd2bb768b5a39d29758b8f74bc15c321d6b4f980, , ports, head, editors/Makefile, 4106650
FILE ==: Delete, /ports/head/editors/nvi2-port/Makefile, fd2bb768b5a39d29758b8f74bc15c321d6b4f980, , ports, head, editors/nvi2-port/Makefile, 4106651
FILE ==: Delete, /ports/head/editors/nvi2-port/distinfo, fd2bb768b5a39d29758b8f74bc15c321d6b4f980, , ports, head, editors/nvi2-port/distinfo, 4106652
FILE ==: Delete, /ports/head/editors/nvi2-port/pkg-descr, fd2bb768b5a39d29758b8f74bc15c321d6b4f980, , ports, head, editors/nvi2-port/pkg-descr, 4106653
FILE ==: Add, /ports/head/editors/nvi2/Makefile, fd2bb768b5a39d29758b8f74bc15c321d6b4f980, , ports, head, editors/nvi2/Makefile, 4106654
FILE ==: Add, /ports/head/editors/nvi2/distinfo, fd2bb768b5a39d29758b8f74bc15c321d6b4f980, , ports, head, editors/nvi2/distinfo, 4106655
FILE ==: Add, /ports/head/editors/nvi2/pkg-descr, fd2bb768b5a39d29758b8f74bc15c321d6b4f980, , ports, head, editors/nvi2/pkg-descr, 4106656
# # # # Deleting deleted ports # # # #

deleting : port = editors/nvi2-port, port_id = '59458', ' element_id = 1057410'
# # # # Finished deleting deleted ports # # # #

Line 11 is the key. That’s the port being deleted this this little bit of code from verifyport.pm:

#
# $port now contains the port for this file.
# let's adjust the needs_refresh value.
#
#
# if we just deleted the Makefile for this port, there's no sense in refreshing the port.
# because it's been deleted.
#
if ($extra eq $FreshPorts::Constants::FILE_MAKEFILE && ($action eq $FreshPorts::Constants::REMOVE || $action eq $FreshPorts::Constants::DELETE)) {
    #
    # we are deleted (local value, never actually saved to db)
    #
    $port->{deleted} = 1;
    print "THIS PORT HAS BEEN DELETED\n";
}

In the end, the port was properly deleted without any code changes.

Another test

Let’s also try the databases/sqliteodbc-sqlite2 commit from the previous post.

The XML is:

[dan@devgit-ingress01:/var/db/ingress/message-queues/testing-new] $ cat 2020.07.27.18.50.49.000000.c4f1589e8f8c02859c3a8893bd7a0da15b6df8b9.xml
<?xml version='1.0' encoding='UTF-8'?>
<UPDATES Version="1.4.0.0">
  <UPDATE>
    <DATE Year="2020" Month="7" Day="27"/>
    <TIME Timezone="UTC" Hour="18" Minute="50" Second="49"/>
    <OS Repo="ports" Id="FreeBSD" Branch="master"/>
    <LOG>Remove expired port:

2020-07-27 databases/sqliteodbc-sqlite2: SQLite2 has stopped development in 2005, please move to SQLite3</LOG>
    <PEOPLE>
      <UPDATER Handle="rene &lt;rene@FreeBSD.org&gt;"/>
    </PEOPLE>
    <COMMIT Hash="c4f1589e8f8c02859c3a8893bd7a0da15b6df8b9" HashShort="c4f1589" Subject="Remove expired port:" EncodingLoses="false" Repository="ports"/>
    <FILES>
      <FILE Action="Modify" Path="MOVED"/>
      <FILE Action="Modify" Path="databases/Makefile"/>
      <FILE Action="Delete" Path="databases/sqliteodbc-sqlite2/Makefile"/>
    </FILES>
  </UPDATE>
</UPDATES>

The log shows:

That message is all done under Commit ID = '810547'
the size of @Files is 3
FILE ==: Modify, /ports/head/MOVED, c4f1589e8f8c02859c3a8893bd7a0da15b6df8b9, , ports, head, MOVED, 4107056
FILE ==: Modify, /ports/head/databases/Makefile, c4f1589e8f8c02859c3a8893bd7a0da15b6df8b9, , ports, head, databases/Makefile, 4107057
FILE ==: Delete, /ports/head/databases/sqliteodbc-sqlite2/Makefile, c4f1589e8f8c02859c3a8893bd7a0da15b6df8b9, , ports, head, databases/sqliteodbc-sqlite2/Makefile, 4107058
# # # # Deleting deleted ports # # # #

deleting : port = databases/sqliteodbc-sqlite2, port_id = '38622', ' element_id = 723652'
# # # # Finished deleting deleted ports # # # #

Good!

I think we’re good to go here.

Running the old commits

Let’s get the list of old commits to run:

freshports.devgit=# select distinct revision_name from commit_log_elements where change_type = 'r';
              revision_name               
------------------------------------------
 481991bdc9aeb780e44dae0e235b47034335375a
 59a1c626e040d1eb3ef9fe05b67d0d448296f01f
 75f3786fe43cd66ac10614047b0373f4d1b64696
 7b86ce7d8dfa46481bb92ce3dd5cf4dc7464d9f0
 8240d44f53adaf5d66bfcd4b44879cc0894d49a2
 8b92b78775cfd4174e4e765d6023b195fee0bbfa
 a2810a57b46b7e367c0b1aabb18a03f9d5971be6
 d016c9140a6e4b7627d26485926a659f2686185d
 d38f3fa0fc45e73bdb17bbc139d846f784f3d1e8
 d5c503dd5c31a156c7cc5b78afbf98cb2643d716
 dbbc6bd62c9b904b7237df4344f55cdc2bbcce1f
 e792298a3b48d74277e293cc864a113958a39bd1
 e9f48a1168c84f84424d95816af645895e338daf
 f2bfe60090b840b6d99a3288c0b745843cefcfe1
(14 rows)

That’s an easy list to run.

Here’s my script to delete the existing commits:

freshports.devgit=# begin;
BEGIN
freshports.devgit=# 
freshports.devgit=# delete from commit_log where svn_revision in
freshports.devgit-# (
freshports.devgit(#  '481991bdc9aeb780e44dae0e235b47034335375a',
freshports.devgit(#  '59a1c626e040d1eb3ef9fe05b67d0d448296f01f',
freshports.devgit(#  '75f3786fe43cd66ac10614047b0373f4d1b64696',
freshports.devgit(#  '7b86ce7d8dfa46481bb92ce3dd5cf4dc7464d9f0',
freshports.devgit(#  '8240d44f53adaf5d66bfcd4b44879cc0894d49a2',
freshports.devgit(#  '8b92b78775cfd4174e4e765d6023b195fee0bbfa',
freshports.devgit(#  'a2810a57b46b7e367c0b1aabb18a03f9d5971be6',
freshports.devgit(#  'd016c9140a6e4b7627d26485926a659f2686185d',
freshports.devgit(#  'd38f3fa0fc45e73bdb17bbc139d846f784f3d1e8',
freshports.devgit(#  'd5c503dd5c31a156c7cc5b78afbf98cb2643d716',
freshports.devgit(#  'dbbc6bd62c9b904b7237df4344f55cdc2bbcce1f',
freshports.devgit(#  'e792298a3b48d74277e293cc864a113958a39bd1',
freshports.devgit(#  'e9f48a1168c84f84424d95816af645895e338daf',
freshports.devgit(#  'f2bfe60090b840b6d99a3288c0b745843cefcfe1');
DELETE 14
freshports.devgit=# commit;
COMMIT
freshports.devgit=# 
#!/bin/sh

commits="481991bdc9aeb780e44dae0e235b47034335375a
 59a1c626e040d1eb3ef9fe05b67d0d448296f01f
 75f3786fe43cd66ac10614047b0373f4d1b64696
 7b86ce7d8dfa46481bb92ce3dd5cf4dc7464d9f0
 8240d44f53adaf5d66bfcd4b44879cc0894d49a2
 8b92b78775cfd4174e4e765d6023b195fee0bbfa
 a2810a57b46b7e367c0b1aabb18a03f9d5971be6
 d016c9140a6e4b7627d26485926a659f2686185d
 d38f3fa0fc45e73bdb17bbc139d846f784f3d1e8
 d5c503dd5c31a156c7cc5b78afbf98cb2643d716
 dbbc6bd62c9b904b7237df4344f55cdc2bbcce1f
 e792298a3b48d74277e293cc864a113958a39bd1
 e9f48a1168c84f84424d95816af645895e338daf
 f2bfe60090b840b6d99a3288c0b745843cefcfe1"
 
 for commit in $commits
 do
   echo /usr/local/libexec/freshports/test-new-xml-code-single.sh ports $commit | sudo su -fm ingress
 done

That will create the new and old versions of the XML in two different directories. I take the new stuff and toss it into the incoming queue.

[dan@devgit-ingress01:/var/db/ingress/message-queues/testing-new] $ ls -l
total 195
-rw-rw-r--  1 ingress  ingress    887 Jul 30 22:03 2019.09.18.17.37.59.000000.f2bfe60090b840b6d99a3288c0b745843cefcfe1.xml
-rw-rw-r--  1 ingress  ingress   1291 Jul 30 22:03 2020.07.11.18.08.50.000000.e792298a3b48d74277e293cc864a113958a39bd1.xml
-rw-rw-r--  1 ingress  ingress   1246 Jul 30 22:02 2020.07.11.20.45.27.000000.59a1c626e040d1eb3ef9fe05b67d0d448296f01f.xml
-rw-rw-r--  1 ingress  ingress   1316 Jul 30 22:02 2020.07.15.15.47.41.000000.7b86ce7d8dfa46481bb92ce3dd5cf4dc7464d9f0.xml
-rw-rw-r--  1 ingress  ingress   2492 Jul 30 22:03 2020.07.17.08.21.24.000000.d38f3fa0fc45e73bdb17bbc139d846f784f3d1e8.xml
-rw-rw-r--  1 ingress  ingress   2492 Jul 30 22:02 2020.07.17.08.21.54.000000.481991bdc9aeb780e44dae0e235b47034335375a.xml
-rw-rw-r--  1 ingress  ingress   7161 Jul 30 22:03 2020.07.21.17.15.38.000000.d016c9140a6e4b7627d26485926a659f2686185d.xml
-rw-rw-r--  1 ingress  ingress   1316 Jul 30 22:02 2020.07.22.10.41.53.000000.8b92b78775cfd4174e4e765d6023b195fee0bbfa.xml
-rw-rw-r--  1 ingress  ingress    982 Jul 30 22:03 2020.07.22.14.00.59.000000.e9f48a1168c84f84424d95816af645895e338daf.xml
-rw-rw-r--  1 ingress  ingress  82325 Jul 30 22:02 2020.07.23.16.31.18.000000.75f3786fe43cd66ac10614047b0373f4d1b64696.xml
-rw-rw-r--  1 ingress  ingress    908 Jul 30 22:03 2020.07.26.08.01.07.000000.d5c503dd5c31a156c7cc5b78afbf98cb2643d716.xml
-rw-rw-r--  1 ingress  ingress   1168 Jul 30 22:03 2020.07.28.18.52.11.000000.a2810a57b46b7e367c0b1aabb18a03f9d5971be6.xml
-rw-rw-r--  1 ingress  ingress   1490 Jul 30 22:03 2020.07.30.12.20.24.000000.dbbc6bd62c9b904b7237df4344f55cdc2bbcce1f.xml
-rw-rw-r--  1 ingress  ingress  37100 Jul 30 22:02 2020.07.30.20.39.39.000000.8240d44f53adaf5d66bfcd4b44879cc0894d49a2.xml
[dan@devgit-ingress01:/var/db/ingress/message-queues/testing-new] $ sudo mv * ~ingress/message-queues/incoming/
[dan@devgit-ingress01:/var/db/ingress/message-queues/testing-new] $ 

After waiting a bit, I ran this query again:

freshports.devgit=# select distinct revision_name from commit_log_elements where change_type = 'r';
              revision_name               
------------------------------------------
 f2bfe60090b840b6d99a3288c0b745843cefcfe1
(1 row)

freshports.devgit=# 

So why didn’t that one run. Let’s see the XML:

<?xml version='1.0' encoding='UTF-8'?>
<UPDATES Version="1.4.0.0">
  <UPDATE>
    <DATE Year="2019" Month="9" Day="18"/>
    <TIME Timezone="UTC" Hour="17" Minute="37" Second="59"/>
    <OS Repo="ports" Id="FreeBSD" Branch="master"/>
    <LOG>net-mgmt/unifi5: Update to 5.11.46

Also pull in port improvements from glewis@ to enable customizing which Java
that Unifi runs with

PR:             240016</LOG>
    <PEOPLE>
      <UPDATER Handle="feld &lt;feld@FreeBSD.org&gt;"/>
    </PEOPLE>
    <COMMIT Hash="f2bfe60090b840b6d99a3288c0b745843cefcfe1" HashShort="f2bfe60" Subject="net-mgmt/unifi5: Update to 5.11.46" EncodingLoses="false" Repository="ports"/>
    <FILES>
      <FILE Action="Modify" Path="net-mgmt/unifi5/Makefile"/>
      <FILE Action="Modify" Path="net-mgmt/unifi5/distinfo"/>
      <FILE Action="Modify" Path="net-mgmt/unifi5/files/unifi.in"/>
    </FILES>
  </UPDATE>
</UPDATES>

No renames. What’s up with that? Only modifies. Let’s check the database.

I found a lot of them…

freshports.devgit=# select count(*) from commit_log_elements where revision_name = 'f2bfe60090b840b6d99a3288c0b745843cefcfe1';
 count 
-------
  3189
(1 row)

Over a lot of commits:

freshports.devgit=# select count(distinct commit_log_id) from commit_log_elements where revision_name = 'f2bfe60090b840b6d99a3288c0b745843cefcfe1';
 count 
-------
   813
(1 row)

This brings to mind a hardcode commit_id I recall in the code….

Searching my blog, I found subversion vs git – in the database.

Searching the code, I found:

$ grep -r f2bfe60090b840b6d99a3288c0b745843cefcfe1 ~/scripts/ ~/modules/
modules/xml_munge_git.pm:#	$Updates{commit_hash}  = 'f2bfe60090b840b6d99a3288c0b745843cefcfe1';

So I did have it hardcoded for a while. That explains it. I can ignore that commit.

I think this merge can be accepted with some slight changes.

Jul 292020
 

I spent an evening working through the delete issue with git repos.

A Twitter thread was the result.

Sergey Kozlov came up with some code changes which changed the XML output. Now I think we have a plan.

The commit I’m using for a working example was to editors/nvi2 on 27 Jul 2020 01:58:34 – the details of the git and svn commits are outlined below.

The svn commit

  1. The email
  2. The files part of the XML:
    <FILE Action="Add" Revision="543502" Path="head/editors/nvi2/"></FILE>
    <FILE Path="head/editors/nvi2-port/" Action="Remove" Revision="543502"></FILE>
    <FILE Path="head/editors/Makefile" Action="Modify" Revision="543502"></FILE>
    

The git commit

  1. There is no email
  2. The files part of the git XML:
    <FILE Action="Modify" Path="editors/Makefile"/>
    <FILE Action="Rename" Path="editors/nvi2-port/Makefile" Destination="editors/nvi2/Makefile"/>
    <FILE Action="Rename" Path="editors/nvi2-port/distinfo" Destination="editors/nvi2/distinfo"/>
    <FILE Action="Rename" Path="editors/nvi2-port/pkg-descr" Destination="editors/nvi2/pkg-descr"/>
    

The crucial difference

The svn XML shows head/editors/nvi2-port being Removed. As demonstrated via Twitter, this Removed action causes this SQL:

insert into commit_log_elements(id, commit_log_id, element_id, revision_name, change_type) values (4113357, 809907, 1063362, '543502', 'R')

That element_id value represents:

freshports.dev=# select element_pathname(1063362);
       element_pathname        
-------------------------------
/ports/head/editors/nvi2-port

That ‘R‘ on the insert to the commit_log_element table invoked this trigger action:

   if (NEW.change_type = 'R') then
      UPDATE element
         SET status = 'D'
       WHERE id = NEW.element_id;
   end if;

That marks the file (actually, a directory /ports/head/editors/nvi2-port) in the database. That’s the element table. What about the ports table? How does that happen.

How?

The above trigger deletes the element. The element table implements a directory and file using a recursive structure. In part, it looks like this:

freshports.dev=# \d element
                                     Table "public.element"
       Column        |     Type     | Collation | Nullable |               Default               
---------------------+--------------+-----------+----------+-------------------------------------
 id                  | integer      |           | not null | nextval('element_id_seq'::regclass)
 name                | text         |           | not null | 
 parent_id           | integer      |           |          | 
 directory_file_flag | character(1) |           | not null | 
 status              | character(1) |           | not null | 

As the commit is processed, the element (/ports/head/editors/nvi2-port) is deleted because of line 1 of the svn XML (head/editors/nvi2). In general terms, a port is an abstraction of the file system. As that part of the file system is deleted, the port must also be deleted, but that’s a separate table.

So how does the port get deleted? I’m now writing this in hindsight. I didn’t know that when I was posting to Twitter, but tonight I re-discovered it.

It is a trigger on the element table:

element_ports_status AFTER UPDATE ON element FOR EACH ROW EXECUTE FUNCTION element_ports_status()

It’s been many years since I wrote this. Checking the repo, I find it was committed on Mon Sep 11 02:27:26 2006 – I recall this was how it was done, but I’d completely forgotten about it.

-- keep ports.status in sync with element.status
-- this is a performance boosting exercise

CREATE OR REPLACE FUNCTION element_ports_status() RETURNS TRIGGER AS $$
   BEGIN
      IF (NEW.status <> OLD.status) then
         UPDATE ports
            SET status = NEW.status
          WHERE ports.element_id = NEW.id;
      END IF;
      RETURN NEW;
   END
$$ LANGUAGE 'plpgsql';

There it is. So long as I delete the directory which pertains to this port, the system will automagically, via triggers, delete the port.

What needs to change

With the git XML, we don’t get to see the directory being deleted (or removed, as the case may be).

I think we will have to make due with the port‘s Makefile being Remove‘d (in the svn XML) (or Rename‘d as shown in the git XML).

Rename or remove?

We have two options with the git library:

  1. <FILE Action="Rename" Path="editors/nvi2-port/Makefile" Destination="editors/nvi2/Makefile"/>
    <FILE Action="Rename" Path="editors/nvi2-port/distinfo" Destination="editors/nvi2/distinfo"/>
    <FILE Action="Rename" Path="editors/nvi2-port/pkg-descr" Destination="editors/nvi2/pkg-descr"/>/li>
    
  2. <FILE Action="Delete" Path="editors/nvi2-port/Makefile"/>
    <FILE Action="Delete" Path="editors/nvi2-port/distinfo"/>
    <FILE Action="Delete" Path="editors/nvi2-port/pkg-descr"/>
    <FILE Action="Add" Path="editors/nvi2/Makefile"/>
    <FILE Action="Add" Path="editors/nvi2/distinfo"/>
    <FILE Action="Add" Path="editors/nvi2/pkg-descr"/>
    

Both sets of XML amount to the same results. I am not yet sure which one I like better.

Logistically, the first option shows what is happening: you can see where A is becoming B.

In option 2, that is not so clear.

I think option 1 is the way I’m going to go. This blog post serves to document the above and to help me when the time comes to alter the code.

Things to code

The code has to cater for the Makefile Rename and for the Makefile Delete. The former happens here in this example. The latter would happen when a port is deleted.

Let’s look at databases/sqliteodbc-sqlite2 which was removed on 27 Jul 2020 18:50:49 with this commit:

  • the email
  • The XML:
                <FILE Action="Remove" Revision="543544" Path="head/databases/sqliteodbc-sqlite2/"></FILE>
                <FILE Path="head/MOVED" Action="Modify" Revision="543544"></FILE>
                <FILE Path="head/databases/Makefile" Revision="543544" Action="Modify"></FILE>
    

Here, we have the directory head/databases/sqliteodbc-sqlite2 being deleted. This SVN works well.

Let’s look at this in terms of the git XML:

<FILE Action="Modify" Path="MOVED"/>
<FILE Action="Modify" Path="databases/Makefile"/>
<FILE Action="Delete" Path="databases/sqliteodbc-sqlite2/Makefile"/>

It seems the Delete under git is the same as Remove under svn.

The git code needs to be modified to recognize that a Makefile is being deleted and when that happens, also delete the element and the port.

Jul 092020
 

I want to document some of the not-visible differences between a git commit and a subversion commit when in the database.

subversion git
id 808224 808303
message_id 202007091812.069IC6QK044004@repo.freebsd.org a1d2d5b40fba4186daf29b2b2e9ff2dd8b7d8039
message_date 2020-07-09 18:12:06+00 2020-07-09 18:12:06+00
message_subject svn commit: r541803 – in head/devel: . py-absl-py [null]
date_added 2020-07-09 18:39:53.510006+00 2020-07-09 18:59:49.863092+00
commit_date 2020-07-09 18:12:06+00 2020-07-09 18:12:06+00
committer sunpoet sunpoet
description Add py-absl-py 0.9.0

Abseil Python Common Libraries is a collection of Python library code for
building Python applications. The code is collected from Google’s own Python
code base, and has been extensively tested and used in production.

Features:
– Simple application startup
– Distributed commandline flags system
– Custom logging module with additional features
– Testing utilities

WWW: https://github.com/abseil/abseil-py

Add py-absl-py 0.9.0

Abseil Python Common Libraries is a collection of Python library code for
building Python applications. The code is collected from Google’s own Python
code base, and has been extensively tested and used in production.

Features:
– Simple application startup
– Distributed commandline flags system
– Custom logging module with additional features
– Testing utilities

WWW: https://github.com/abseil/abseil-py

system_id 1 1
encoding_losses f f
svn_revision 541803 f2bfe60090b840b6d99a3288c0b745843cefcfe1
repo_id 1 [null]

Of note:

  1. message_id – In subversion, I’ve used the email message_id of the incoming commit. In git, it is the commit hash
  2. message_subject – The message subject is taken from the subversion email commit message. There is no such field for git because we are extracting directly from git.
  3. svn_revision – These are the subversion revision number and the git commit hash, respectively.
  4. repo_id – This is null for git, and I am not sure why. This may need to be fixed.
Jul 042020
 

This is another post in the FreshPorts-git series.

Yesterday was good progress. This morning (at Sat Jul 4 13:05:10 UTC 2020) I manually initiated another import of commits from git.

It looked like this:

[dan@devgit-ingress01:~/scripts] $ echo /usr/local/libexec/freshports/git-delta.sh | sudo su -fm freshports
/var/db/freshports/ports-jail/var/db/repos/PORTS-head-git
remote: Enumerating objects: 261, done.
remote: Counting objects: 100% (261/261), done.
remote: Compressing objects: 100% (131/131)s done.
remote: Total 265 (delta 149), reused 240 (delta 128), pack-reused 4
Receiving objects: 100% (265/265), 446.50 KiB | 7.09 MiB/s, done.
Resolving deltas: 100% (149/149), completed with 73 local objects.
From https://github.com/freebsd/freebsd-ports
   a35902288517..ecb98d7cd285  branches/2020Q3 -> origin/branches/2020Q3
   25564d38f64f..322037573d9e  master          -> origin/master
   f86664ab0be0..bbe14661ebcb  svn_head        -> origin/svn_head
HEAD is now at f2bfe60090b8 net-mgmt/unifi5: Update to 5.11.46
Updating files: 100% (48503/48503), done.
Previous HEAD position was f2bfe60090b8 net-mgmt/unifi5: Update to 5.11.46
Switched to branch 'master'
Your branch is behind 'origin/master' by 37 commits, and can be fast-forwarded.
  (use "git pull" to update your local branch)
Successfully rebased and updated refs/heads/master.
starting point is d8c85e202575
starting
{'path': PosixPath('/var/db/freshports/ports-jail/var/db/repos/PORTS-head-git'), 'commit': 'd8c85e202575', 'output': PosixPath('/var/db/freshports/message-queues/incoming'), 'repo': 'ports', 'os': 'FreeBSD', 'force': False, 'log_level': 'INFO'}
using repo at  /var/db/freshports/ports-jail/var/db/repos/PORTS-head-git
322037573d9ecbf0f165556df2e349caeaef0534 ba205b385c9b0e91f1f1080655f1ba0c2e1fe7a8 68435f7018ede781c6ba35a7141f1544534e05c4 0d7872f4f69ec9e7b5f7e3298350295301bce70c f5872271c8b9d14e29a6932defcc018d1723194e 918b387b2ed21dd299c4bc966390f3487c7e34bb 6c6db5424c903041d26ea2d2e229af2b1f1d70d6 3061da4ba638f5f9a66818f1b5dec41877f8582b 56a0228b872c9481e823352ac055a0279c362798 55cc857149775dd6ba0b7b08c40b1e6819b9d202 3946cedbad37f7be74dad0648c8935bac1b5465e 882d8de6e0199c6665f3cbe80713fceab235ac92 15f27776ba4f1265590daac8233ff5b9e3aa2197 8475003fd4dbffee1321a5d6e20a0aa750ea5159 b85fb997401b3dd3b71f4b77613adb53a163091d 753eac4badbe2c0a20c6e4c76a228278e7aa30f7 7669079c8ba321c34459b0e1d77c5fccdb4b94e0 43a8a92f64260db919d59824a83873c2fc6be57a e8cdd0bec45d1401594c2a401745e966cde83dcc e250ee3aa03e3bff627d437d68d542adda7ac686 7af772923aea0850ca4fd1fd0851f33586f2b751 0616869d4329ed675dfa0389c7cbb6ee20bcbf31 d3b846b034254602ac214d969c45b62c7dc0fae5 b4a577fe3ccf1188accc53ad9e6a265f80d21b83 926903c161c402218390429f587f9175ec12b31d ecb0127a683f6952ad6ae3b77baf9a310e8499f8 cec47cde28d46071d527ab4d761fc6262cf90cad 0d83d083c00de50dcfb56d61b1095a4aa1d008dc 2f2e89bcc90d64866861a871e546939ab9a8b4d8 c3f75bbcea90b5aca473c5db897f734dc91e8ffb 8c2aef4fd7eab6577fd744749daa88b24edf45d4 8b0acc2a0e75f1e774b940b2966bcab3bce877d9 16de7316ca6bfc7eb2f5f2515408dcf6ad1145f6 b483cab42afe32c814d51cd898fdfd54f8ccf10d cb4e25ed898c489b7c37c929438354f25523d673 da4a6913b642bb4e490d2f91d9068ed4aac97e8f
starting for loop using  d8c85e202575
finished for loop
Would have ran git-to-freshport.py starting at: d8c85e202575
[dan@devgit-ingress01:~/scripts] $ 

This created one XML file for each git commit. The files are dropped into the ~freshports/messages-queues/incoming directory. The freshports-git daemon (well, it’s a long running shell script, managed via daemontools) notices the new files and processed them into the FreshPorts database.

Compare the commits under these two version of the same port:

devgit, for now, is processing both svn and git commits. This is for development and comparison purposes only.

Being able to process both/either streams is good. I don’t know yet how to avoid duplicate commits once we change over.

Jun 302020
 

I’m documenting this as a go along because I’m sure I’ll be doing something that I later wish I had written down.

Create the repo

This creates a FreeBSD git-based port repo.

echo git clone https://github.com/freebsd/freebsd-ports.git /var/db/freshports/ports-jail/var/db/repos/PORTS-head-git | sudo su -l freshports

The latest local commit

This tells us the hash of the latest commit contained within this local copy of the repo.

Let’s call this the base commit.

$ cd PORTS-head-git
[dan@devgit-ingress01:/var/db/freshports/ports-jail/var/db/repos/PORTS-head-git] $ git log -n 1
commit 240cab957e5c13c50ed539ed7bc7d511bb1c3368 (HEAD -> master, origin/master, origin/HEAD)
Author: yuri 
Date:   Tue Jun 30 16:10:10 2020 +0000

    math/fplll: Update 5.3.2 -> 5.3.3
    
    Reported by:    portscout

Now, we do this as the freshports user.

Pull the latest details:

[freshports@devgit-ingress01 ~/ports-jail/var/db/repos/PORTS-head-git]$ git pull
warning: Pulling without specifying how to reconcile divergent branches is
discouraged. You can squelch this message by running one of the following
commands sometime before your next pull:

  git config pull.rebase false  # merge (the default strategy)
  git config pull.rebase true   # rebase
  git config pull.ff only       # fast-forward only

You can replace "git config" with "git config --global" to set a default
preference for all repositories. You can also pass --rebase, --no-rebase,
or --ff-only on the command line to override the configured default per
invocation.

Already up to date.
[freshports@devgit-ingress01 ~/ports-jail/var/db/repos/PORTS-head-git]$ 

Let’s try that recommended command:

[freshports@devgit-ingress01 ~/ports-jail/var/db/repos/PORTS-head-git]$ git config pull.rebase false
[freshports@devgit-ingress01 ~/ports-jail/var/db/repos/PORTS-head-git]$ git pull
Already up to date.

Wait a bit longer for more commits

Some time later:

[freshports@devgit-ingress01 ~/ports-jail/var/db/repos/PORTS-head-git]$ git pull
remote: Enumerating objects: 345, done.
remote: Counting objects: 100% (308/308), done.
remote: Compressing objects: 100% (66/66), done.
remote: Total 150 (delta 94), reused 137 (delta 81), pack-reused 0
Receiving objects: 100% (150/150), 30.17 KiB | 9.00 KiB/s, done.
Resolving deltas: 100% (94/94), completed with 70 local objects.
From https://github.com/freebsd/freebsd-ports
   240cab957e5c..008e30ab98fb  master          -> origin/master
   6c5a4e63b897..f412caa04aa1  branches/2020Q2 -> origin/branches/2020Q2
   3d3db8513a0a..a70ca3962b88  svn_head        -> origin/svn_head
Updating 240cab957e5c..008e30ab98fb
Fast-forward
 audio/jamulus/Makefile                                        |    2 +-
 audio/jamulus/distinfo                                        |    6 +-
 biology/bamtools/Makefile                                     |    2 +-
 databases/py-redis/Makefile                                   |    2 +-
 databases/py-redis/distinfo                                   |    6 +-
 devel/cargo-c/Makefile                                        |   68 +--
 devel/cargo-c/distinfo                                        |  138 +++---
 devel/cmake-gui/Makefile                                      |    1 +
 devel/cmake/Makefile                                          |    1 +
 devel/opendht/Makefile                                        |    2 +-
 graphics/open3d/Makefile                                      |    2 +-
 graphics/ossim/Makefile                                       |    2 +-
 lang/solidity/Makefile                                        |    1 +
 lang/swi-pl/Makefile                                          |   76 +--
 lang/swi-pl/distinfo                                          |    6 +-
 lang/swi-pl/files/extra-patch-skip-gnu-qsort                  |   55 ---
 lang/swi-pl/files/patch-packages_clib_cmake_FindLibUUID.cmake |   14 +
 lang/swi-pl/pkg-plist                                         | 3677 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--------------------------------------------------------------------
 math/cadabra2/Makefile                                        |    1 +
 math/libqalculate/Makefile                                    |    6 +-
 math/libqalculate/distinfo                                    |    8 +-
 math/libqalculate/pkg-plist                                   |    2 +-
 math/py-pynndescent/Makefile                                  |    2 +-
 math/py-pynndescent/distinfo                                  |    6 +-
 math/py-uncertainties/Makefile                                |    2 +-
 math/py-uncertainties/distinfo                                |    6 +-
 math/qalculate/Makefile                                       |    2 +-
 math/qalculate/distinfo                                       |    6 +-
 math/qalculate/pkg-plist                                      |   15 +-
 misc/ezc3d/Makefile                                           |    2 +-
 misc/ezc3d/distinfo                                           |    6 +-
 misc/ignition-fuel-tools/Makefile                             |    2 +-
 multimedia/libopenshot/Makefile                               |    1 +
 net-p2p/eiskaltdcpp-daemon/Makefile                           |    2 +-
 net/coturn/Makefile                                           |    2 +-
 net/coturn/distinfo                                           |    6 +-
 net/flower/Makefile                                           |    2 +-
 net/libjson-rpc-cpp/Makefile                                  |    2 +-
 science/py-tensorflow/Makefile                                |    2 +-
 sysutils/fusefs-securefs/Makefile                             |    1 +
 www/chromium/Makefile                                         |    1 +
 www/domoticz/Makefile                                         |    1 +
 www/iridium/Makefile                                          |    2 +-
 www/qt5-webengine/Makefile                                    |    1 +
 x11-themes/qtcurve-kf5/Makefile                               |    2 +-
 x11-themes/qtcurve-qt5/Makefile                               |    2 +-
 x11-themes/qtcurve/Makefile                                   |   23 +-
 x11-themes/qtcurve/distinfo                                   |    8 +-
 x11-themes/qtcurve/pkg-plist.kde4                             |   21 -
 x11-themes/qtcurve/pkg-plist.utils                            |   27 ++
 x11/polybar/Makefile                                          |    1 +
 x11/waybar/Makefile                                           |    2 +-
 52 files changed, 2148 insertions(+), 2088 deletions(-)
 delete mode 100644 lang/swi-pl/files/extra-patch-skip-gnu-qsort
 create mode 100644 lang/swi-pl/files/patch-packages_clib_cmake_FindLibUUID.cmake
 delete mode 100644 x11-themes/qtcurve/pkg-plist.kde4
[freshports@devgit-ingress01 ~/ports-jail/var/db/repos/PORTS-head-git]$ 

OK, now we have more commits!

List all the commits since the base commit

After a git clone https://github.com/FreshPorts/git_proc_commit I modified git_proc_commit/git-show-commit.sh:

[dan@devgit-ingress01:~/src/git_proc_commit] $ git diff
diff --git a/git-show-commit.sh b/git-show-commit.sh
index fb20d30..4385933 100755
--- a/git-show-commit.sh
+++ b/git-show-commit.sh
@@ -1,14 +1,14 @@
 #!/bin/sh
 
 # the latest commit we have processed
-latest="3bd153c2494182bb89915e6fc9222288c154285f" # [NEW PORT]: devel/py-oci - Wed Sep 18 17:14:34 2019 +0000
+latest="240cab957e5c13c50ed539ed7bc7d511bb1c3368" # [NEW PORT]: devel/py-oci - Wed Sep 18 17:14:34 2019 +0000
 
 # before running this script:
 # mkdir ~/src
 # cd ~/src
 # git clone git@github.com:freebsd/freebsd-ports.git
 
-repo_dir=~/src/freebsd-ports
+repo_dir=~freshports/ports-jail/var/db/repos/PORTS-head-git
 
 GIT="/usr/local/bin/git"
 
[dan@devgit-ingress01:~/src/git_proc_commit] $ 

Basically, the path and the commit hash were updated. That hash I added in was obtained via the git log -n 1 command.

When running it:

[freshports@devgit-ingress01 ~/ports-jail/var/db/repos/PORTS-head-git]$ ~dan/src/git_proc_commit/git-show-commit.sh
008e30ab98fba636554ada737157baa84debedb3
46560d7e66714e5d574c69e42ab408f3181c1614
1c7120178a876f6daa217abd64cf8b715de46bd8
12ad02023e15ffeaeca60c78c64c11f9736bbac0
f6a1f3e3c8e17a09dbffefd8debcf3c44861e713
99d2fba63577a91850c9b311d0028c461a4a0b29
42855a0b23b363889207c7d0de5f122982bce37d
21e3f0999b283d1a440c7d0d078b325b138874f2
4046ad10a608fdcb2d6bd71a8e5d4361ad0dc069
4983ea53d2c3b0dfc377c87d6529ca18c76ce229
6f21f15631a6b851cc2f1fc8b2e59c64d808e94f
164aa7374784c77ea68d79875ab3c79950802f4b

Those are the commits which occurred since the previous git pull.

With this information, we can iterate over the commits, processing them one at a time.

First script

This was used for cvs commits, and then svn commits. Now it’s for git commits. This is untested. I’m out of time.

[dan@devgit-ingress01:~/scripts] $ svn di
Index: freebsd-git.sh
===================================================================
--- freebsd-git.sh	(revision 5375)
+++ freebsd-git.sh	(working copy)
@@ -1,25 +1,25 @@
 #!/bin/sh
 #
-# $Id: freebsd-cvs.sh,v 1.9 2011-08-15 16:31:56 dan Exp $
+# $Id: freebsd-git.sh,v 1.9 2011-08-15 16:31:56 dan Exp $
 #
 # Copyright (c) 1999-2003 DVL Software
 #
-# Process a raw mail message by converting it to XML, then importing it into
+# Process a raw git log  message by converting it to XML, then importing it into
 # the database.
 #
-# Takes a file name as a parameter
+# Takes a git commit hash a parameter
 #
-LOGGERTAG="freebsd-cvs.sh"
+LOGGERTAG="freebsd-git.sh"
 
 if [ $# -ne 1 ]
 then
-   echo $0 : usage $0 FILE
+   echo $0 : usage $0 COMMIT_HASH
    exit 1
 fi
 
 if [ ! -f config.sh ]
 then
-	echo "config.sh not found by freebsd-cvs.sh..."
+	echo "config.sh not found by freebsd-git.sh..."
 	exit 1
 fi
 
@@ -37,18 +37,23 @@
 
 ${LOGGER} -t ${LOGGERTAG} "$0 invoked, using XML='${XML}' and OUTPUT='${OUTPUT}'"
 
-PATHNAME=$1
+COMMIT_HASH=$1
 
-FILE=`basename ${PATHNAME}` 
+PROCESS_ID=${$}
+FILE=`date +%Y.%m.%d.%H.%M.%S`.$PROCESS_ID.${COMMIT_HASH}.txt
 
 #
 # convert the raw file to XML
 #
-${LOGGER} -t ${LOGGERTAG} "$0 converting to XML via process_mail.pl"
-${LOGGER} -t ${LOGGERTAG} /usr/local/bin/perl ${SCRIPTDIR}/process_mail.pl from ${PATHNAME} into ${XML}/${FILE}.xml errors to ${XML}/${FILE}.errors
+${LOGGER} -t ${LOGGERTAG} "$0 converting to XML via git-to-freshports-xml.py"
 
-/usr/local/bin/perl ${SCRIPTDIR}/process_mail.pl < ${PATHNAME} >    \
-       ${XML}/${FILE}.xml 2>${XML}/${FILE}.errors
+echo 
+# output file 1to commit_hash.process_id so that if we process the same hash again, it does not conflict
+${LOGGER} -t ${LOGGERTAG} ${SCRIPTDIR}/git-to-freshports-xml.py --path ${FRESHPORTS_JAIL_BASE_DIR}${PORTSDIR} --commit ${COMMIT_HASH} --output ${XML}/${FILE}.xml
+
+exit
+
+${SCRIPTDIR}/git-to-freshports-xml.py --path ${FRESHPORTS_JAIL_BASE_DIR}${PORTSDIR} --commit ${COMMIT_HASH} --output ${XML}/${FILE}.xml 2>${XML}/${FILE}.errors
 RESULT=$?
 
 if [ -f ${XML}/${FILE}.errors ]
@@ -68,8 +73,7 @@
 
 ${LOGGER} -t ${LOGGERTAG} "$0 loading that XML into the database via load_xml_into_db.pl"
 
-/usr/local/bin/perl ${SCRIPTDIR}/load_xml_into_db.pl ${XML}/${FILE}.xml > \
-               ${OUTPUT}/${FILE}.loading 2>${OUTPUT}/$FILE.errors
+/usr/local/bin/perl ${SCRIPTDIR}/load_xml_into_db.pl ${XML}/${FILE}.xml ${OUTPUT}/${FILE}.loading 2>${OUTPUT}/$FILE.errors
 RESULT=$?
 
 if [ -f ${OUTPUT}/$FILE.errors ]

What’s next

I need to alter the script which invokes freebsd-git.sh so that it passes in the git commit hash.

Something like this, with $i set to something obtained from git-show-commit.sh as shown above.

[dan@devgit-ingress01:~/src/fp-daemon] $ svn di
Index: fp-daemon.sh
===================================================================
--- fp-daemon.sh	(revision 5375)
+++ fp-daemon.sh	(working copy)
@@ -92,7 +92,7 @@
 				else
 					${LOGGER} -t ${LOGGERTAG} "processing ${i}"
 
-					/bin/sh ./freebsd-cvs.sh ${i}
+					/bin/sh ./freebsd-git.sh ${i}
 
 					RESULT=$?
 					${LOGGER} -t ${LOGGERTAG} "result=$RESULT"
[dan@devgit-ingress01:~/src/fp-daemon] $ 

Small steps.

Jun 282020
 

Recently, a new use of ZFS snapshots was introduced for a cache on FreshPorts. This approach involves creating a snapshot on the empty cache, then rolling back to that snapshot when the cache needs to be cleared.

This idea came from this Tweet after much discussion on how to properly and safely delete directory contents without hitting a race condition. The zfs rollback is much neater.

This approach is now in place on all FreshPorts servers.

Except one

The original FreshPorts server was in a small tower chassis. The first deployed-to-a-datacenter server was a dual-Opteron with 8 x 74GB Raptor drives. This host was deployed in 2006 and is still in use. It remained the primary FreshPorts server until November 2017 when it was replaced by a new system from iXsystems.

More about the Opteron server.

This server is not running ZFS and hardware RAID. Converting to ZFS remotely is not going to happen.

I have been putting off upgrading this server to the latest FreshPorts code because of this.

Last night I realized I could deploy that code by creating a ZFS file system within a regular UFS file.

The ZFS file

From man zpool, don’t do this at home:

     file    A regular file. The use of	files as a backing store is strongly
             discouraged. It is	designed primarily for experimental purposes,
             as	the fault tolerance of a file is only as good the file system
             of	which it is a part. A file must	be specified by	a full path.

For this specific purpose, I am accepting the risk.

First step, enable and start ZFS.

[dan@supernews:~] $ sudo sysrc zfs_enable="YES"
[dan@supernews:~] $ sudo service zfs start
[dan@supernews:~] $ zfs list
no datasets available

When I went looking for how to create the file, I found Ubuntu 16.04 – Using Files To Test ZFS and the truncate command was used.

How big should this file be? Let’s look at production.

[dan@x8dtu-nginx01:~] $ zfs list main_tank/data/freshports/backend/cache/packages
NAME                                               USED  AVAIL  REFER  MOUNTPOINT
main_tank/data/freshports/backend/cache/packages  7.90M  3.47T  7.84M  /var/db/freshports/cache/packages

[dan@x8dtu-nginx01:~] $ sudo du -ch -d0  /var/db/freshports/cache/packages
8.3M	/var/db/freshports/cache/packages
8.3M	total

Let’s say that is 9MB.

How many files are in there?

[dan@x8dtu-nginx01:~] $ sudo find /var/db/freshports/cache/packages | wc -l
    1356

There are packages for 1356 ports cached. Let’s say there are 40,000 packages. That would be 9MB / 1356 * 40,000 = 265 MB

Let’s call it 1GB for good measure. This is only for caching.

After consulting man truncate, I settled on this command:

[dan@supernews:~/tmp] $ truncate -s 1G packages
[dan@supernews:~/tmp] $ ls -l packages
-rw-r--r--  1 dan  dan  1073741824 Jun 28 12:59 packages
[dan@supernews:~/tmp] $ 

Before I start messing with the website cache, let’s turn off the services which use it:

[dan@supernews:/var/db/freshports/cache] $ sudo service nginx stop
Stopping nginx.
Waiting for PIDS: 78270.

[dan@supernews:/var/db/freshports/cache] $ sudo svc -d /var/service/freshports
[dan@supernews:/var/db/freshports/cache] $ sudo svc -d /var/service/fp-listen
[dan@supernews:/var/db/freshports/cache] $ 

The two svc items are the commit processing daemon and the cache clearing daemon, respectively.

This moves the file into place:

[dan@supernews:/var/db/freshports] $ sudo mv ~/tmp/packages freshports-zfs
[dan@supernews:/var/db/freshports] $ sudo chown root:wheel freshports-zfs

Creating the zpool

This creates the zpool:

$ sudo zpool create freshports /var/db/freshports/freshports-zfs
$ zpool list
NAME         SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
freshports   960M  87.5K   960M        -         -     0%     0%  1.00x  ONLINE  -

It works, just like any other zpool:

[dan@supernews:/var/db/freshports/cache] $ zpool status
  pool: freshports
 state: ONLINE
  scan: none requested
config:

	NAME                                 STATE     READ WRITE CKSUM
	freshports                           ONLINE       0     0     0
	  /var/db/freshports/freshports-zfs  ONLINE       0     0     0

errors: No known data errors
[dan@supernews:/var/db/freshports/cache] $ sudo zpool scrub freshports
[dan@supernews:/var/db/freshports/cache] $ zpool status
  pool: freshports
 state: ONLINE
  scan: scrub repaired 0 in 0 days 00:00:00 with 0 errors on Sun Jun 28 14:29:26 2020
config:

	NAME                                 STATE     READ WRITE CKSUM
	freshports                           ONLINE       0     0     0
	  /var/db/freshports/freshports-zfs  ONLINE       0     0     0

errors: No known data errors
[dan@supernews:/var/db/freshports/cache] $ 

Next up, this zpool needs a filesystem.

Creating the filesystem

I don’t need atime, we don’t need compression here, and let’s set the mountpoint.

Compression is usually a good thing. I’m deliberately choosing not to use it. I can’t justify. It just feels wrong to do this on top of UFS. This isn’t anything about UFS. It’s just me.

$ sudo zfs create -o atime=off -o compression=off -o mountpoint=/var/db/freshports/cache/packages freshports/packages

$ zfs list
NAME                  USED  AVAIL  REFER  MOUNTPOINT
freshports            175K   832M    23K  /freshports
freshports/packages    23K   832M    23K  /var/db/freshports/cache/packages

Oh wait, I don’t like having /freshports mounted.

$ sudo zfs set mountpoint=none canmount=off freshports

$ zfs list
NAME                  USED  AVAIL  REFER  MOUNTPOINT
freshports            188K   832M    23K  none
freshports/packages    23K   832M    23K  /var/db/freshports/cache/packages

What do the permissions look like?

$ ls -l /var/db/freshports/cache
total 27619
drwxrwxr--  27 www         freshports       512 Apr 20 08:19 categories
drwxrwxr-x   3 www         freshports       512 Jan 23 14:22 daily
drwxr-xr-x   2 www         freshports       512 Apr 20 08:21 general
drwxr-xr-x   2 freshports  freshports       512 Jun 28 14:33 html
drwxrwxr-x   2 www         freshports       512 Jun 28 06:10 news
drwxr-xr-x   2 root        wheel              2 Jun 28 14:34 packages
drwxrwxr--  14 www         freshports       512 Jun 23 12:00 packages.old
drwxr-xr-x   2 www         freshports       512 Nov  9  2018 pages
drwxrwxr-x  41 www         freshports      2048 May 18 22:42 ports
-rw-r--r--   1 www         www         28234059 Jun  6 05:42 searchlog.txt
drwxrwxr-x   2 www         freshports       512 Jun 28 14:33 spooling
[dan@supernews:/var/db/freshports/cache] $ 

That’s wrong, fixing it:

$ sudo chown www:freshports packages

Snapshot now!

I must snapshot now, before making any data changes which I do not want to rollback.

$ sudo zfs snapshot freshports/packages@empty

Edit 2020-08-25

Missing from this post, as I discovered last night, delegation.

Right now, it does not work:

$ echo zfs rollback freshports/packages@empty | sudo su -fm freshports
cannot rollback 'freshports/packages': permission denied

I must allow the freshports user to do the rollback.

sudo zfs allow freshports rollback freshports/packages

Now it does:

$ echo zfs rollback freshports/packages@empty | sudo su -fm freshports
$ 

It only took me two months and a failed FreshPorts server to find out.

Moving over the data

Now I move over the existing cache:

$ sudo mv -i packages.old/ packages/
$ cd packages/
$ ls
packages.old

Or… not. Damn.

$ sudo mv packages.old/* 
$ sudo rmdir packages.old/
$ ls
comms        devel        java         mail         multimedia   www
databases    emulators    lang         misc         net-p2p      x11-toolkits

Back into service

Let’s start up everything I stopped.

$ sudo svc -u /var/service/freshports
$ sudo svc -u /var/service/fp-listen
$ sudo service nginx start
Performing sanity check on nginx configuration:
nginx: the configuration file /usr/local/etc/nginx/nginx.conf syntax is ok
nginx: configuration file /usr/local/etc/nginx/nginx.conf test is successful
Starting nginx.

Next?

Next, this server needs to have the latest code installed. That’s not for today.

Jun 212020
 

git is coming to FreeBSD.

The initial FreshPorts work for git started in September, then was abandoned for unknown reasons. It just fell off the RADAR. This week, it came back to my attention.

References

Some of the initial research was over Twitter. I’m saving those references here for future reference.

Today, I created new new FreeBSD jails for this work:

  1. devgit-ingress01 – for processing of incoming FreeBSD commits
  2. devgit-nginx01 – for displaying those commits

As the names suggests, these jails are specific to git. The website is not yet publicly available, but will be soon.

Current status

At present, we have a script which will process “GIT commit entries into XML files digestible by the FreshPorts backend”.

This script has been used to process a git commit.

Current task list

This is the current task list:

  • pull in the local changes from the proof of concept jail – this includes both ingress and www code.
  • Branch the ingress code – the www code has already been branched and is in place on devgit-nginx01
  • get devgit-nginx01 online so the public can see it – see https://devgit.freshports.org
  • Incorporate the git-to-freshports.py scripts into the FreshPorts automation
  • package the code into git-specific instances
  • Create test, stage, and prod instances for the git-specific code – only dev exists so far
  • Create the database instances for those new environments
  • Adjust the backups to cover the 8 new jails and 4 new databases – freshports.devgit is being backed up
  • create certs for the four new environments

I am sure other tasks will come to mind over the next few days, but I always like to have a starting point.

Welcome to the next adventure.