Dan Langille

Oct 102017
 

The previous post on this blog was nearly 10 months ago. However, things have not stood still. There have been a few issues fixed and some are still in progress.

The biggest thing underway is a major rework of the backend, the part that processes the incoming commits. There is nothing major; no huge code rewrites.

The basics remain:

  • FreshPorts gets an email from one of the commit mailing list.
  • That email is converted to XML and saved to disk, in a queue.
  • The queue is processed, one commit at a time.
  • An svn up is run.
  • A bunch of make -V occur
  • The database is updated.
  • Profit.

In current production, all of this occurs on one host.

In the new system, I will be taking advantage / exploiting many features of FreeBSD which were not available when FreshPorts started.

The hardware changes

The main change is ZFS. The existing system is on a hardware RAID card with 8 drives. This system was deployed in mid-2006 and it has been a great worker.

The new system boots off a pair of SSDs on ZFS zroot. I plan to make use of beadm for upgrades.

Also present are two 5TB drives, also in a ZFS mirror.

The OS has been installed, but so has most of the FreshPorts code. Over the past few weeks I have been redesigning the back-end to cater for the use of multiple jails. Why jails? I want to separate the main components of the system. This makes it easier to upgrade individual components without affecting the others. It also increase security by isolating various components to their respective worlds.

Jails

There are four jails on this server:

  1. nginx01 – runs the website, and php-fpm, because the website is based on PHP
  2. pg01 – Run PostgreSQL, the database which stores all the good stuff. No public IP address. Accessed over lo1 via a 127.1.0.0/24 address.
  3. mx-ingress01 – accepts incoming email and passes it on to the next stage.
  4. ingress01 – accepts email from mx-ingress01 and performs the basic steps mentioned in the first section of this blog post. No public IP address.

Each of these jails have been created by hand, but the software and daemons have been configured via Ansible.

Porting the code makes some stuff easier

When it came time to deploy the backend FreshPorts code into ingress01, I created a port. This is a departure from current production where code is deployed via svn up.

With the use of a port, many other things came with it:

  • All the backend code goes into /usr/local/libexec/freshports<./li>
  • Main configuration files are now in /usr/local/etc/freshports with symlinks to there from /usr/local/libexec/freshports.
  • /var/db/freshports is the main location for any data external to the database on pg01.
  • /var/db/freshports/cache is the cache for the webpages, with multiple directories for different classes of pages.
  • /var/db/ingress/message-queues/incoming is where the incoming XML data arrives. The ingress user writes here, and the freshports user reads and then moves the file to another queue. All of this occurs in the ingress01 jail.
  • The freshports user has /var/db/freshports/message-queues/ for commit message storage (i.e. the XML). Subdirectories include archive, recent, and retry.
  • The ports-jail is at /var/db/freshports/ports-jail and is where the freshports user does a chroot before running any make -V commands.
  • Within the above named directory, we have /var/db/freshports/ports-jail/var/db/repos which is where the svn up commands are run.
  • Logging is consolidated. It used to all go to /var/log/messages. Now it goes to /var/log/freshports/ and into a number of files contained therein. The port also installed entries to /usr/local/etc/syslog.d/ and to /usr/local/etc/newsyslog.d/ to ensure logs are both captured and rotated.

All of these directories are created by the port. That creating reduces the configuration time. I had scripts, but they were each run manually. Now, it’s fully automated, complete with permission, and user creation.

All of this is handled by the freshports-scripts port.

The freshports daemon

There has long been a freshports deamon, a small python script which looks for incoming XML files and processes them. This is powered by daemontools.

I suppose it could be replaced by a rc.d script, but I might do that another day.

This service is handled by the newly-created freshports-services port.

Current status

As of now, commits are coming in and being processed automatically. The website is being updated and all seems well. There are no known problems.

Remaining work

Before this website goes live, a number of items remain to be completed.

  1. I am not sure that UPDATING and MOVED commits are being processed properly. EDIT 2017.10.14 – done
  2. The fp-listen daemon, which listens for completed commits and clears the appropriate cache bits, is not running. EDIT 2017.10.11 – done
  3. There are a number of cronjob tasks which I should verify are running properly. They always ran as my user id, and they need to run as the freshports user.
  4. I need to get the website running with https under nginx. EDIT 2017.10.14 –
    done

None of these are huge tasks, just time consuming.

Summary

All of these changes are for good. They were not done in the past because there was no reason for them. Now that I’m trying to split tasks up into different jails, the changes make sense and the work is being done.

These changes will also make the inevitable transition under the control of the FreeBSD project. That will not happen soon, but it will happen. I can’t run FreshPorts for ever.

Dec 162016
 

Tonight, I was contacted by jrm on IRC, who told me that FreshPorts was missing something on a search. Yep. It was not in there.

I looked on dev and I saw the same thing.

Finding the cause

I started by enabling debug on dev. I found this SQL:

  SELECT count(*)
  FROM ports P LEFT OUTER JOIN ports_vulnerable    PV  ON PV.port_id       = P.id
               LEFT OUTER JOIN commit_log          CL  ON P.last_commit_id = CL.id
               LEFT OUTER JOIN repo                R   ON CL.repo_id       = R.id
               LEFT OUTER JOIN commit_log_branches CLB ON CL.id            = CLB.commit_log_id
                          JOIN system_branch       SB  ON SB.branch_name   = 'head'
                                                      AND SB.id            = CLB.branch_id,
       categories C, element E

    WHERE P.category_id  = C.id
      AND P.element_id   = E.id  AND 
     E.name ILIKE '%en-aspell%' and E.status = 'A' 

I started by removing one LEFT OUTER JOIN at a time until I found the cause. It was SB.id = CLB.branch_id.

OK, but why?

Let’s look at some of the SQL first.

One line 3, we start with P.last_commit_id, which is the last commit for this port.

From there, we join to the commit_log table to find the repo_id for that commit on line 4.

Line 5 gives us the commit_log_branch table entry for that commit, and we know now which branch this commit touched.

Lines 6 and 7 will restrict the search to head.

I started examining the values from these JOINs. I discovered that some commits did not have entries in the commit_log_branches table. These were commits which occurred before that table was created.

The problem arose when a port whose last commit occurred before the creation of that table. These ports would never be included because, according to the database, there was no such commit on head for them.

The solution

The solution is to add entries to the commit_log_branches table. The following SQL accomplished that.

insert into commit_log_branches
SELECT distinct P.last_commit_id, (select SB.id  FROM system_branch SB where branch_name = 'head')
  FROM ports P LEFT OUTER JOIN commit_log CL ON P.last_commit_id = CL.id
  WHERE NOT EXISTS (SELECT * FROM commit_log_branches CLB WHERE CL.id            = CLB.commit_log_id)
 and P.last_commit_id is not null;

While there, I did this for good measure:

create unique index commit_log_branches_pk on commit_log_branches(commit_log_id, branch_id);

The table now looks like this:

freshports.org=# \d commit_log_branches
 Table "public.commit_log_branches"
    Column     |  Type   | Modifiers 
---------------+---------+-----------
 commit_log_id | integer | not null
 branch_id     | integer | not null
Indexes:
    "commit_log_branches_pk" UNIQUE, btree (commit_log_id, branch_id)
Foreign-key constraints:
    "commit_log_branch_branch_id_fkey" FOREIGN KEY (branch_id) REFERENCES system_branch(id) ON UPDATE CASCADE ON DELETE CASCADE
    "commit_log_branch_commit_log_id_fkey" FOREIGN KEY (commit_log_id) REFERENCES commit_log(id) ON UPDATE CASCADE ON DELETE CASCADE

freshports.org=# 
Oct 302016
 

Committing to a branch (e.g. 2016Q4) when the port in question has slave ports is getting tricky, especially if the port was added after the branch was created.

Case in point, www/linux-c7-flashplugin11 was added to the tree on 19 Oct 2016. The 2016Q4 branch was created on 3 Oct, and therefore that port does not exist on the branch at all.

www/linux-c7-flashplugin11 is a slave port of www/linux-c6-flashplugin11, which was created on 22 Sep 2014 and is there on the 2016Q4 branch.

Clearly, the slave port will not be on the branch.

As of the time of writing, there have been two MFH on the master port. Should the slave port be MFH as well? No.

No, because the slave port does not exist on the branch, and anyone following the branch will not be using that newly added slave port.

The issue for FreshPorts is knowing whether or not a given port is on the branch.

Background on branches

When FreshPorts sees a commit on a branch, the first thing it does is create that port on the branch if it does not already exist there. The copy is based on head and (glossing over all the gory details), the values are copied over verbatim. This is not entirely accurate, but it is close enough for this discussion. Technically, a copy occurs in the database which ensures the directory structure for the port exists on the branch. Because this is based on head, if any new files have been added to the port since the branch was taken, those same files will now exist on the branch. The values for the port (name, description, etc) will be based on extract via make -V performed on the ports tree after an svn up. Thus, the values displayed on the port page will reflect what is on the branch, and not what is in head.

The hypothetical solution

I think the only clean way for FreshPorts to know if the slave port exists on the branch is inspection of the local working copy of the repo after the svn up has completed. If it exists, the database entries for the slave port should be refreshed based on what is found in the repo. If it does not exist, no further action is required for that slave port.

This may be easy, or it may involve some rejigging of the code.

Aug 232016
 

I was asked if this was a known bug. It’s not. I don’t think anyone knows about it.

dvl: ping. FreshPorts truncates COMMENT. Known issue? https://www.freshports.org/devel/include-what-you-use/

Here’s what I found there:

Tool for use with clang to analyze

Here is what the Makefile contains:

$ grep COMMENT Makefile 
COMMENT=	Tool for use with clang to analyze #includes in C and C++ source files

Here is what the command returns:

$ make -V COMMENT
Tool for use with clang to analyze

It appears the line is treated as containing a comment.

Jul 252016
 

FreshPorts uses a cache for every port page (e.g. https://www.freshports.org/sysutils/bacula-server/). When an update to a port occurs, we must clear the cache for that port. This sounds simple, and it is. The various ways in which updates occur complicates the situation.

This post is all about fixing one edge case: the addition or removal of a port dependency (e.g. RUN_DEPENDS, LIB_DEPENDS).

When port A depends on port B, this fact is listed on both pages, but adding/removing a dependency is not properly handled (see this bug). Tonight, I have coded a fix. I’d like it to be less complex, but it has been fixed.

The table

This is the table in question:

freshports.org=# \d port_dependencies
         Table "public.port_dependencies"
         Column         |     Type     | Modifiers 
------------------------+--------------+-----------
 port_id                | integer      | not null
 port_id_dependent_upon | integer      | not null
 dependency_type        | character(1) | not null
Indexes:
    "port_dependencies_pkey" PRIMARY KEY, btree (port_id, port_id_dependent_upon, dependency_type)
Foreign-key constraints:
    "port_dependencies_port_id_dependent_upon_fkey" FOREIGN KEY (port_id_dependent_upon) REFERENCES ports(id) ON UPDATE CASCADE ON DELETE CASCADE
    "port_dependencies_port_id_fkey" FOREIGN KEY (port_id) REFERENCES ports(id) ON UPDATE CASCADE ON DELETE CASCADE

freshports.org=# 

Using the terminology from the first sentence, port A and port B are represented by port_id and port_id_dependent_upon respectively. Any new row or any deleted row must invalidate the cache for both port A and port B.

Keeping track of cache to clear

This is the table we use to clear the ports cache:

freshports.org=# \d cache_clearing_ports
                                     Table "public.cache_clearing_ports"
   Column   |            Type             |                             Modifiers                             
------------+-----------------------------+-------------------------------------------------------------------
 id         | integer                     | not null default nextval('cache_clearing_ports_id_seq'::regclass)
 port_id    | integer                     | not null
 category   | text                        | not null
 port       | text                        | not null
 date_added | timestamp without time zone | not null default now()
Indexes:
    "cache_clearing_ports_pkey" PRIMARY KEY, btree (id)
Foreign-key constraints:
    "cache_clearing_ports_port_id_fkey" FOREIGN KEY (port_id) REFERENCES ports(id) ON UPDATE CASCADE ON DELETE CASCADE

freshports.org=# 

We add rows to this table to indicate that a given port must be cleared from the cache.

On INSERT

On an insert to the port_dependencies table, let’s do this:

CREATE OR REPLACE FUNCTION port_dependencies_insert_clear_cache() RETURNS TRIGGER AS $$
   DECLARE
      l_cache_clearing_ports_id   int8;
      l_port      text;
      l_category  text;
   BEGIN
        --
        -- This function handles the addition of a new dependency.
        -- yes, we need to clear the cache for both port_id and port_id_dependent_upon
        -- from the cache_clearing_ports.  I figure there is a shorter way to this but
        -- I cannot think it through right now.
        --

        IF TG_OP = 'INSERT' THEN
         -- handle port A (port_id)
         SELECT port_id
           INTO l_cache_clearing_ports_id
           FROM cache_clearing_ports
          WHERE port_id = NEW.port_id;

          IF NOT FOUND THEN
            SELECT category, name
              INTO l_category, l_port
              FROM ports_all
             WHERE id = NEW.port_id;
            INSERT INTO cache_clearing_ports (port_id, category, port)
            VALUES (NEW.port_id, l_category, l_port);
          END IF;

         -- handle port B (port_id_dependent_upon)
         SELECT port_id
           INTO l_cache_clearing_ports_id
           FROM cache_clearing_ports
          WHERE port_id = NEW.port_id_dependent_upon;

          IF NOT FOUND THEN
            SELECT category, name
              INTO l_category, l_port
              FROM ports_all
             WHERE id = NEW.port_id_dependent_upon;
            INSERT INTO cache_clearing_ports (port_id, category, port)
            VALUES (NEW.port_id_dependent_upon, l_category, l_port);
          END IF;

          NOTIFY port_updated;
      END IF;

      -- when a port changes, add an entry to the cache clearing table
      RETURN NEW;
   END
$$ LANGUAGE 'plpgsql';

  DROP TRIGGER IF EXISTS port_dependencies_insert_clear_cache ON port_dependencies;
CREATE TRIGGER port_dependencies_insert_clear_cache
    AFTER INSERT on port_dependencies
    FOR EACH ROW
    EXECUTE PROCEDURE port_dependencies_insert_clear_cache();

That function is rather long, and I know it can be simplified, but that’s not on tonight’s agenda. If you are so inclined, I am happy to hear your suggestions. Specifically, I think I can do an INSERT INTO … WHERE NOT EXISTS and do the SELECT & INSERT in one step, without the need for IF FOUND.

On DELETE

Here is the code for the DELETE.

CREATE OR REPLACE FUNCTION port_dependencies_delete_clear_cache() RETURNS TRIGGER AS $$
   DECLARE
      l_cache_clearing_ports_id   int8;
      l_port      text;
      l_category  text;
   BEGIN
        --
        -- This function handles the deletion of a existing dependency.
        -- yes, we need to clear the cache for both port_id and port_id_dependent_upon
        -- from the cache_clearing_ports.  I figure there is a shorter way to this but
        -- I cannot think it through right now.
        --

        IF TG_OP = 'DELETE' THEN
         SELECT port_id
           INTO l_cache_clearing_ports_id
           FROM cache_clearing_ports
          WHERE port_id = OLD.port_id;

          IF NOT FOUND THEN
            SELECT category, name
              INTO l_category, l_port
              FROM ports_all
             WHERE id = OLD.port_id;
            INSERT INTO cache_clearing_ports (port_id, category, port)
            VALUES (OLD.port_id, l_category, l_port);
          END IF;

         SELECT port_id
           INTO l_cache_clearing_ports_id
           FROM cache_clearing_ports
          WHERE port_id = OLD.port_id_dependent_upon;

          IF NOT FOUND THEN
            SELECT category, name
              INTO l_category, l_port
              FROM ports_all
             WHERE id = OLD.port_id_dependent_upon;
            INSERT INTO cache_clearing_ports (port_id, category, port)
            VALUES (OLD.port_id_dependent_upon, l_category, l_port);
          END IF;

          NOTIFY port_updated;
      END IF;

      -- when a port changes, add an entry to the cache clearing table
      RETURN OLD;
   END
$$ LANGUAGE 'plpgsql';

  DROP TRIGGER IF EXISTS port_dependencies_delete_clear_cache ON port_dependencies;
CREATE TRIGGER port_dependencies_delete_clear_cache
    AFTER DELETE on port_dependencies
    FOR EACH ROW
    EXECUTE PROCEDURE port_dependencies_delete_clear_cache();

It is nearly identical to the first function. Again, those of you who want to suggest improvements there, feel free.

I will run this in dev for a while and see how it goes. Initial testing has been promising.

NOTE: there are no updates to this table, by design. Instead when we see a new commit to a port, we do a DELETE from port_dependencies WHERE port_id = :a.

Jul 182016
 

Recently, I’ve been adding more and more file content into the database, mostly because it’s useful. For example, we add the contents of the distinfo, because that information is useful. Next, it was pkg-plist, and before I was done that, someone suggested added in the Makefile.

Neither pkg-plist nor Makefile will be perfect:

  1. Not all ports have a pkg-plist
  2. Some ports use Makefile.inc, and we are not capturing that

Why store these files?

Because we can, and then we can search on it.

Ever wondered how many ports do not have a pkg-plist file?

freshports.org=# select count(*) from ports where pkg_plist is not null and status = 'A';;
 count 
-------
 17632
(1 row)

freshports.org=# select count(*) from ports where pkg_plist is null and status = 'A';;
 count 
-------
 11733
(1 row)

freshports.org=#

How many ports use OSVERSION?

freshports.org=# select count(*) from ports where makefile ilike '%OSVERSION%';
 count 
-------
   658
(1 row)

Yes, you can do this with find and grep, but these queries are much faster than that.

We can create all kinds of queries with this data, and allow you to save the output in plain text form.

Loading this up on dev

I loaded this up on dev first. Then I looked at how many Makefiles I had.

freshports.org=# select count(*) from ports where Makefile is not null and status = 'A';;
 count 
-------
 29349
(1 row)

freshports.org=# select count(*) from ports where Makefile is null and status = 'A';;
 count 
-------
    10
(1 row)

Ten ports without Makefiles? Who are they?

freshports.org=# select id, element_pathname(element_id) from ports where Makefile is null and status = 'A';;
  id   |                    element_pathname                     
-------+---------------------------------------------------------
 39093 | /ports/branches/2014Q1/print/typetools
 39450 | /ports/branches/2014Q2/lang/twelf
 39268 | /ports/branches/2014Q1/graphics/ipe
 38946 | /ports/branches/2014Q1/textproc/hs-lhs2tex
 39958 | /ports/branches/2014Q4/multimedia/gmp-api
 40059 | /ports/branches/2014Q4/multimedia/openh264
 40223 | /ports/branches/2016Q1/sysutils/p5-Schedule-Cron-Events
 39208 | /ports/branches/2014Q1/textproc/dblatex
 40070 | /ports/branches/2015Q1/textproc/p5-HTML-FormatExternal
 39110 | /ports/branches/2014Q1/www/chromium
(10 rows)

Ahh, that explains these errors I saw while processing some ports. I won’t worry about they because they are all old branches.

Jul 18 02:48:20 jester FreshPorts[69764]: error executing make command for multimedia/gmp-api: This command (FreshPorts code 1):  /usr/local/bin/sudo /usr/sbin/chroot -u dan /usr/local/FreshPorts/ports-jail /make-port.sh /usr/local/repos/PORTS-2014Q4 multimedia/gmp-api 2>/tmp/FreshPorts.multimedia.gmp-api.make-error.2016.7.18.2.48.20.69764  produced this error:  Error message is: cd: /usr/local/repos/PORTS-2014Q4/multimedia/gmp-api: No such file or directory make: cannot open /usr/local/repos/PORTS-2014Q4/multimedia/gmp-api/Makefile. Make results are :  make: stopped in /  (/usr/local/FreshPorts/scripts) 
Jul 18 02:50:33 jester FreshPorts[69764]: error executing make command for multimedia/openh264: This command (FreshPorts code 1):  /usr/local/bin/sudo /usr/sbin/chroot -u dan /usr/local/FreshPorts/ports-jail /make-port.sh /usr/local/repos/PORTS-2014Q4 multimedia/openh264 2>/tmp/FreshPorts.multimedia.openh264.make-error.2016.7.18.2.50.33.69764  produced this error:  Error message is: cd: /usr/local/repos/PORTS-2014Q4/multimedia/openh264: No such file or directory make: cannot open /usr/local/repos/PORTS-2014Q4/multimedia/openh264/Makefile. Make results are :  make: stopped in /  (/usr/local/FreshPorts/scripts) 
Jul 18 03:12:40 jester FreshPorts[69764]: error executing make command for print/typetools: This command (FreshPorts code 1):  /usr/local/bin/sudo /usr/sbin/chroot -u dan /usr/local/FreshPorts/ports-jail /make-port.sh /usr/local/repos/PORTS-2014Q1 print/typetools 2>/tmp/FreshPorts.print.typetools.make-error.2016.7.18.3.12.40.69764  produced this error:  Error message is: make: "/usr/local/repos/PORTS-2014Q1/Mk/bsd.tex.mk" line 92: malformed TEX_DEFAULT: tetex  (/usr/local/FreshPorts/scripts) 
Jul 18 03:35:37 jester FreshPorts[69764]: error executing make command for sysutils/p5-Schedule-Cron-Events: This command (FreshPorts code 1):  /usr/local/bin/sudo /usr/sbin/chroot -u dan /usr/local/FreshPorts/ports-jail /make-port.sh /usr/local/repos/PORTS-2016Q1 sysutils/p5-Schedule-Cron-Events 2>/tmp/FreshPorts.sysutils.p5-Schedule-Cron-Events.make-error.2016.7.18.3.35.37.69764  produced this error:  Error message is: cd: /usr/local/repos/PORTS-2016Q1/sysutils/p5-Schedule-Cron-Events: No such file or directory make: cannot open /usr/local/repos/PORTS-2016Q1/sysutils/p5-Schedule-Cron-Events/Makefile. Make results are :  make: stopped in /  (/usr/local/FreshPorts/scripts) 
Jul 18 03:38:48 jester FreshPorts[69764]: error executing make command for textproc/dblatex: This command (FreshPorts code 1):  /usr/local/bin/sudo /usr/sbin/chroot -u dan /usr/local/FreshPorts/ports-jail /make-port.sh /usr/local/repos/PORTS-2014Q1 textproc/dblatex 2>/tmp/FreshPorts.textproc.dblatex.make-error.2016.7.18.3.38.48.69764  produced this error:  Error message is: make: "/usr/local/repos/PORTS-2014Q1/Mk/bsd.tex.mk" line 92: malformed TEX_DEFAULT: tetex  (/usr/local/FreshPorts/scripts) 
Jul 18 03:39:50 jester FreshPorts[69764]: error executing make command for textproc/hs-lhs2tex: This command (FreshPorts code 1):  /usr/local/bin/sudo /usr/sbin/chroot -u dan /usr/local/FreshPorts/ports-jail /make-port.sh /usr/local/repos/PORTS-2014Q1 textproc/hs-lhs2tex 2>/tmp/FreshPorts.textproc.hs-lhs2tex.make-error.2016.7.18.3.39.50.69764  produced this error:  Error message is: make: "/usr/local/repos/PORTS-2014Q1/Mk/bsd.tex.mk" line 92: malformed TEX_DEFAULT: tetex  (/usr/local/FreshPorts/scripts) 
Jul 18 03:41:25 jester FreshPorts[69764]: error executing make command for textproc/p5-HTML-FormatExternal: This command (FreshPorts code 1):  /usr/local/bin/sudo /usr/sbin/chroot -u dan /usr/local/FreshPorts/ports-jail /make-port.sh /usr/local/repos/PORTS-2015Q1 textproc/p5-HTML-FormatExternal 2>/tmp/FreshPorts.textproc.p5-HTML-FormatExternal.make-error.2016.7.18.3.41.25.69764  produced this error:  Error message is: cd: /usr/local/repos/PORTS-2015Q1/textproc/p5-HTML-FormatExternal: No such file or directory make: cannot open /usr/local/repos/PORTS-2015Q1/textproc/p5-HTML-FormatExternal/Makefile. Make results are :  make: stopped in /  (/usr/local/FreshPorts/scripts) 
Jul 18 03:48:07 jester FreshPorts[69764]: error executing make command for www/chromium: This command (FreshPorts code 1):  /usr/local/bin/sudo /usr/sbin/chroot -u dan /usr/local/FreshPorts/ports-jail /make-port.sh /usr/local/repos/PORTS-2014Q1 www/chromium 2>/tmp/FreshPorts.www.chromium.make-error.2016.7.18.3.48.7.69764  produced this error:  Error message is: make: "/usr/local/repos/PORTS-2014Q1/Mk/bsd.port.mk" line 1516: Cannot open /usr/local/repos/PORTS-2014Q1/Mk/Uses/tar.mk make: "/usr/local/repos/PORTS-2014Q1/Mk/bsd.port.mk" line 5281: warning: duplicate script for target "-depends" ignored make: "/usr/local/repos/PORTS-2014Q1/Mk/bsd.port.mk" line 5278: warning: using previous script for "-depends" defined here make: "/usr/local/repos/PORTS-2014Q1/Mk/bsd.port.mk" line 5281: warning: duplicate script for target "-depends" ignored make: "/usr/local/repos/PORTS-2014Q1/Mk/bsd.port.mk" line 5278: warning: using previous script for "-depends" defined here make: "/usr/local/repos/PORTS-2014Q1/Mk/bsd.port.mk" lin

What next?

Right now, the data is loading up… Makefile, pkg-plist, distinfo, etc. The script should be done at about 0800 UTC.

Tomorrow morning, I will restart commit processing, and clear the cache so you can see all the pretty new stuff.

In the coming weeks, the search page will be amended to allow searching on the new fields.

Jul 152016
 

About two weeks ago, I wrote a function for creating a port. This will take the port from head and copy it to a branch. This is vital because of the reasons discussed here: creating a new port and copying a port from head to a branch.

This function worked and was useful, but it was missed one vital part: population of the ports_categories table. This table tells us what categories the port is in. Now that I think of it, I may have to populate other tables, such as:

  1. ports_vulnerable
  2. ports_moved
  3. ports_updating_xref
  4. commit_log_ports_vuxml

Those can wait for another day.

Existing bugs fixed

While debugging this issue, I noticed that CopyPortFromHeadToBranch() was documented as returning the port id, but was returning the element id. The difference is that all ports are elements, but not all elements are ports. The code which invokes CopyPortFromHeadToBranch() actually creates the port, while the function just copies stuff around in the element table.

The diff is as follow:

-CREATE OR REPLACE FUNCTION CopyPortFromHeadToBranch(text, text, text) returns int4 AS $$
+CREATE OR REPLACE FUNCTION CopyPortFilesFromHeadToBranch(text, text, text) returns int4 AS $$
 DECLARE
   a_category_name ALIAS for $1;
   a_port_name     ALIAS for $2;
   a_branch_name   ALIAS for $3;
  
-  l_r             element_id_pathname%rowtype;
-  l_port_id       int4;
- 
+  l_r                      element_id_pathname%rowtype;
+  l_element_id_of_new_port int4;
+
   BEGIN
- 
+
+  -- this query will add the new port... the key is the element_add() function call
+
   FOR l_r IN
   WITH RECURSIVE all_descendents AS ( 
   SELECT id, name, parent_id, directory_file_flag, status
@@ -3243,20 +3248,22 @@
     FROM element E
     JOIN all_descendents AD
       ON (E.parent_id = AD.id)
-)
-    SELECT element_add(replace(element_pathname(id), '/ports/head/', '/ports/branches/' || a_branch_name || '/'), directory_file_flag),
-                       replace(element_pathname(id), '/ports/head/', '/ports/branches/2016Q2/')
+  )
+    SELECT element_add(replace(element_pathname(id), '/ports/head/', '/ports/branches/' || a_branch_name || '/'), directory_file_flag)
       FROM all_descendents
      WHERE status = 'A'
   LOOP
   END LOOP;
- 
+
   IF FOUND THEN
-    SELECT pathname_id('/ports/branches/' || a_branch_name || '/' || a_category_name || '/' || a_port_name || '/')
-      INTO l_port_id;
+    -- based on the element_id of the pathname for newly created port, grab the port id
+     SELECT pathname_id('/ports/branches/' || a_branch_name || '/' || a_category_name || '/' || a_port_name || '/')
+      INTO l_element_id_of_new_port;
+
   END IF;
- 
-  return l_port_id;
+
+  return l_element_id_of_new_port;
+
   END;
 $$  LANGUAGE 'plpgsql';
 
@@ -3278,12 +3285,27 @@
  
   BEGIN
  
-  SELECT CopyPortFromHeadToBranch(a_category_name, a_port_name, a_branch_name)
+  SELECT CopyPortFilesFromHeadToBranch(a_category_name, a_port_name, a_branch_name)
     INTO l_element_id_of_new_port;
  
   IF FOUND THEN
     SELECT createport(l_element_id_of_new_port, getcategory(a_category_name))
       INTO l_port_id;
+
+    SELECT PA.id
+      FROM ports_active PA
+     WHERE PA.element_id = l_element_id_of_new_port
+      INTO l_port_id;
+
+    -- we need to set up the port_categories table first...
+
+    RAISE WARNING 'about to call GetPort and I only want stuff from head, so branch agnostic is OK.';
+
+    INSERT INTO ports_categories(port_id, category_id)
+     SELECT l_port_id, PC.category_id
+       FROM ports_categories PC
+      WHERE PC.port_id = GetPort(a_category_name, a_port_name);
+
   END IF;
   
   RETURN l_port_id;

I will outline the problems fixed above:

  • line 2 – Rename the function to reflect what it does
  • line 11 – we are returning the element id of the new port
  • line 28 – remove this extraneous parameter from the SELECT
  • line 65 – grab the port id based on the element id and use that to copy values from the ports_categories from the old code to the new port
  • line 74 – I had tried to do this INSERT within the CopyPortFilesFromHeadToBranch() function, but that can’t be done because the port is not created there, it is created on line 62. Once I realized that issue, which was perpetuated by the incorrectly named variable on line 39, it all became clear.

The working code

Here is the code which http://dev.freshports.org/ is using now.

--
-- this function copies a port files from /ports/head/CATEGORY/PORT to /ports/branches/BRANCH
-- It returns the id of the new element for that port.
-- This function DOES not create the port itself (i.e. no entries added to port table).
--
CREATE OR REPLACE FUNCTION CopyPortFilesFromHeadToBranch(text, text, text) returns int4 AS $$
DECLARE
  a_category_name ALIAS for $1;
  a_port_name     ALIAS for $2;
  a_branch_name   ALIAS for $3;

  l_r                      element_id_pathname%rowtype;
  l_element_id_of_new_port int4;

  BEGIN

  -- this query will add the new port... the key is the element_add() function call

  FOR l_r IN
  WITH RECURSIVE all_descendents AS ( 
  SELECT id, name, parent_id, directory_file_flag, status
    FROM element
    WHERE id = (select pathname_id('/ports/head/' || a_category_name || '/' || a_port_name || '/'))
  UNION
  SELECT E.id, E.name, E.parent_id, E.directory_file_flag, E.status
    FROM element E
    JOIN all_descendents AD
      ON (E.parent_id = AD.id)
  )
    SELECT element_add(replace(element_pathname(id), '/ports/head/', '/ports/branches/' || a_branch_name || '/'), directory_file_flag)
      FROM all_descendents
     WHERE status = 'A'
  LOOP
  END LOOP;

  IF FOUND THEN
    -- based on the element_id of the pathname for newly created port, grab the port id
     SELECT pathname_id('/ports/branches/' || a_branch_name || '/' || a_category_name || '/' || a_port_name || '/')
      INTO l_element_id_of_new_port;

  END IF;

  return l_element_id_of_new_port;

  END;
$$  LANGUAGE 'plpgsql';


--
-- this function create a port on a branch using head as a starting point.
-- i.e. from /ports/head/CATEGORY/PORT to /ports/branches/BRANCH/CATEGORY/PORT
-- is created.
-- It returns the id of the new port.
--
CREATE OR REPLACE FUNCTION CreatePort(text, text, text) returns int4 AS $$
DECLARE
  a_category_name ALIAS for $1;
  a_port_name     ALIAS for $2;
  a_branch_name   ALIAS for $3;

  l_element_id_of_new_port int4;
  l_port_id                int4;

  BEGIN

  SELECT CopyPortFilesFromHeadToBranch(a_category_name, a_port_name, a_branch_name)
    INTO l_element_id_of_new_port;

  IF FOUND THEN
    SELECT createport(l_element_id_of_new_port, getcategory(a_category_name))
      INTO l_port_id;

    SELECT PA.id
      FROM ports_active PA
     WHERE PA.element_id = l_element_id_of_new_port
      INTO l_port_id;

    -- we need to set up the port_categories table first...

    RAISE WARNING 'about to call GetPort and I only want stuff from head, so branch agnostic is OK.';

    INSERT INTO ports_categories(port_id, category_id)
     SELECT l_port_id, PC.category_id
       FROM ports_categories PC
      WHERE PC.port_id = GetPort(a_category_name, a_port_name);

  END IF;

  RETURN l_port_id;

  END;
$$  LANGUAGE 'plpgsql';
Jul 032016
 

As you know, I’ve been working on branches this long holiday weekend. Today, I managed to get the display of a port on a branch to correctly.

Along the way, I discovered a downside to storing raw data in the database. Take for example this output:

freshports.org=# select name, depends_run from ports_active where depends_run is not null limit 1;
     name      |                          depends_run                          
---------------+---------------------------------------------------------------
 p5-Net-NBName | /usr/local/bin/perl5.20.2:/usr/local/PORTS-head/lang/perl5.20
(1 row)

This is the output of make -V RUN_DEPENDS. You will notice that this repo is not in /usr/ports. In itself, that is OK, except when you change the directory, like I did earlier this month, when I moved everything into /usr/local/repos. After time, this field throughout the database will contain a mixture of the two root directories. This won’t do, because the website needs to strip off the root directory to get at the category/port combination at the end of the file.

Yes, I could just try taking the last of the string, starting at the second to last slash, but that’s not what I want to do. I want to store the raw data, because that’s what make -V gave me. Thus, I will update the data to make sure it is up to date with current directories.

The solution I used was:

update ports set depends_run   = replace(depends_run,   ':/usr/local/', ':/usr/local/repos/'), 
                 depends_lib   = replace(depends_lib,   ':/usr/local/', ':/usr/local/repos/'), 
                 depends_build = replace(depends_build, ':/usr/local/', ':/usr/local/repos/');

update ports set depends_run   = replace(depends_run,   ':/usr/local/repos/repos/', ':/usr/local/repos/'), 
                 depends_lib   = replace(depends_lib,   ':/usr/local/repos/repos/', ':/usr/local/repos/'), 
                 depends_build = replace(depends_build, ':/usr/local/repos/repos/', ':/usr/local/repos/');       

That second update was to reverse any updates to already-correct values.

This change was on my development server. I won’t need to do this update on the production server because that repo has not changed locations.

Jul 022016
 

In my two previous posts I talked about creating a new port and copying a port from head to a branch.

The goal of this post is the creation of a new function: CreatePortOnBranch($category_name, $port_name, $CommitBranch)

The failed start

I started out with this stored procedure:

CREATE OR REPLACE FUNCTION CreatePort(text, text, text) returns int4 AS $$
DECLARE
  a_category_name ALIAS for $1;
  a_port_name     ALIAS for $2;
  a_branch_name   ALIAS for $3;

  l_element_id_of_new_port int4;
  l_port_id                int4;

  BEGIN

WITH RECURSIVE all_descendents AS (
  SELECT id, name, parent_id, directory_file_flag, status
    FROM element
    WHERE id = (select pathname_id('/ports/head/sysutils/bacula-client-static/'))
  UNION
  SELECT E.id, E.name, E.parent_id, E.directory_file_flag, E.status
    FROM element E
    JOIN all_descendents AD
      ON (E.parent_id = AD.id)
)
   SELECT element_add(replace(element_pathname(id), '/ports/head/', '/ports/branches/2016Q2/'), directory_file_flag), 
                      replace(element_pathname(id), '/ports/head/', '/ports/branches/2016Q2/')
      FROM all_descendents
     WHERE status = 'A';

  SELECT createport(l_element_id_of_new_port, getcategory(a_category_name))
    INTO l_port_id;
 
  RETURN l_port_id;

  END;
$$  LANGUAGE 'plpgsql';

Running it gave this message:

# select CreatePort('sysutils', 'bacula-server', '2016Q2');
ERROR:  query has no destination for result data
HINT:  If you want to discard the results of a SELECT, use PERFORM instead.
CONTEXT:  PL/pgSQL function createport(text,text,text) line 12 at SQL statement

I tried PERFORM, but failed to find a suitable syntax. I found out why. I asked on IRC:

I am trying to use PERFORM in order to discard the result set in a plpgsql function.. I think I want PERFORM, but I’m failing. https://gist.github.com/dlangille/3f464ac2537028f0602551f1bf4db2ba

The replies I got (NOTE: the entire channel log is not shown here):

[Jul 02 12:33]  <RhodiumToad>	perform and with have ... problems
[Jul 02 12:33]  <RhodiumToad>	PERFORM works by replacing the word PERFORM with SELECT and then parsing the query
[Jul 02 12:34]  <RhodiumToad>	so PERFORM WITH ...  never works
[Jul 02 12:36]  <dvl>	RhodiumToad: Sounds like I should create a function with just that WITH RECURSION in it, and call that instead.
[Jul 02 12:36]  <RhodiumToad>	that would probably be simpler
[Jul 02 12:36]  <Myon>	enricostn: that's not a locales problem, that's a version mismatch problem, 9.3 vs 9.5
[Jul 02 12:36]  <johto>	you could also do  PERFORM 1 FROM (<query>) ss;  because there are no wCTEs
[Jul 02 12:37]  <RhodiumToad>	yeah, but that's really ugly
[Jul 02 12:37]  <enricostn>	Myon: oh... would you please so kind to post there and shade some light? Would be really appreciated.
[Jul 02 12:38]  <dvl>	I also think dividing this up into distinct functions will have code benefits later.

The two new functions and a new datatype

Based on the above discussion, I created these functions:

--
-- this function copies a port from /ports/head/CATEGORY/PORT to /ports/branches/BRANCH
-- It returns the id of the new port.
--
CREATE OR REPLACE FUNCTION CopyPortFromHeadToBranch(text, text, text) returns int4 AS $$
DECLARE
  a_category_name ALIAS for $1;
  a_port_name     ALIAS for $2;
  a_branch_name   ALIAS for $3;

  l_r             element_id_pathname%rowtype;
  l_port_id       int4;

  BEGIN

  FOR l_r IN
  WITH RECURSIVE all_descendents AS ( 
  SELECT id, name, parent_id, directory_file_flag, status
    FROM element
    WHERE id = (select pathname_id('/ports/head/' || a_category_name || '/' || a_port_name || '/'))
  UNION
  SELECT E.id, E.name, E.parent_id, E.directory_file_flag, E.status
    FROM element E
    JOIN all_descendents AD
      ON (E.parent_id = AD.id)
)
    SELECT element_add(replace(element_pathname(id), '/ports/head/', '/ports/branches/' || a_branch_name || '/'), directory_file_flag),
                       replace(element_pathname(id), '/ports/head/', '/ports/branches/2016Q2/')
      FROM all_descendents
     WHERE status = 'A'
  LOOP
  END LOOP;

  IF FOUND THEN
    SELECT pathname_id('/ports/branches/' || a_branch_name || '/' || a_category_name || '/' || a_port_name || '/')
      INTO l_port_id;
  END IF;

  return l_port_id;
  END;
$$  LANGUAGE 'plpgsql';

With this helper function created, the original function becomes:

--
-- this function create a port on a branch using head as a starting point.
-- i.e. from /ports/head/CATEGORY/PORT to /ports/branches/BRANCH/CATEGORY/PORT
-- is created.
-- It returns the id of the new port.
--
CREATE OR REPLACE FUNCTION CreatePort(text, text, text) returns int4 AS $$
DECLARE
  a_category_name ALIAS for $1;
  a_port_name     ALIAS for $2;
  a_branch_name   ALIAS for $3;

  l_element_id_of_new_port int4;
  l_port_id                int4;

  BEGIN

  SELECT CopyPortFromHeadToBranch(a_category_name, a_port_name, a_branch_name)
    INTO l_element_id_of_new_port;

  IF FOUND THEN
    SELECT createport(l_element_id_of_new_port, getcategory(a_category_name))
      INTO l_port_id;
  END IF;
 
  RETURN l_port_id;

  END;
$$  LANGUAGE 'plpgsql';

Here is the first test of that function:

select CreatePort('sysutils', 'bacula-docs', '2016Q2');

 createport 
------------
      41762
(1 row)

Let’s see the full pathname of that port:

# select element_pathname(element_id) from ports where id = 41762;
              element_pathname               
---------------------------------------------
 /ports/branches/2016Q2/sysutils/bacula-docs
(1 row)

Success.