Dan Langille

Jul 022016
 

In my previous post, I wrote about the need to copy a port from head to a branch in some circumstances, specifically when a master-port is updated and the slave ports do not already exist in the FreshPorts database on that branch. This is a side-effect of FreshPorts only taking action for commits, and not repo-copies.

Get new pathnames

Expanding upon the recursive query from that post, I think I have a starting point for a copy-port function. Let me illustrate with this query. Assume we need to duplicate sysutils/bacula-client from head to branch 2016Q2.

WITH RECURSIVE all_descendents AS (
  SELECT id, name, parent_id, directory_file_flag, status
    FROM element
    WHERE id = (select pathname_id('/ports/head/sysutils/bacula-client/'))
  UNION
  SELECT E.id, E.name, E.parent_id, E.directory_file_flag, E.status
    FROM element E
    JOIN all_descendents AD
      ON (E.parent_id = AD.id)
)
   SELECT directory_file_flag, element_pathname(id) AS pathname, replace(element_pathname(id), '/ports/head/', '/ports/branches/2016Q2/')
      FROM all_descendents
     WHERE status = 'A'
 ORDER BY pathname;

 directory_file_flag |                  pathname                   |                        replace                         
---------------------+---------------------------------------------+--------------------------------------------------------
 D                   | /ports/head/sysutils/bacula-client          | /ports/branches/2016Q2/sysutils/bacula-client
 F                   | /ports/head/sysutils/bacula-client/Makefile | /ports/branches/2016Q2/sysutils/bacula-client/Makefile
(2 rows)

The first column indicates file or directory. The second column is the name of the existing element in head. The third column is the name of the element to be created in the branch.

Add new pathnames to the database

Next, how do use that information to create the new elements?

We use the existing function: element_add($in_element_pathname, $element_directory_file_flag)

Let’s incorporate that into the query above and get:

WITH RECURSIVE all_descendents AS (
  SELECT id, name, parent_id, directory_file_flag, status
    FROM element
    WHERE id = (select pathname_id('/ports/head/sysutils/bacula-client/'))
  UNION
  SELECT E.id, E.name, E.parent_id, E.directory_file_flag, E.status
    FROM element E
    JOIN all_descendents AD
      ON (E.parent_id = AD.id)
)
   SELECT element_add(replace(element_pathname(id), '/ports/head/', '/ports/branches/2016Q2/'), directory_file_flag), replace(element_pathname(id), '/ports/head/', '/ports/branches/2016Q2/'), directory_file_flag
      FROM all_descendents
     WHERE status = 'A';

 element_add |                        replace                         | directory_file_flag 
-------------+--------------------------------------------------------+---------------------
      739280 | /ports/branches/2016Q2/sysutils/bacula-client          | D
      739281 | /ports/branches/2016Q2/sysutils/bacula-client/Makefile | F

-- Let's confirm the new elements:

WITH RECURSIVE all_descendents AS (
  SELECT id, name, parent_id, directory_file_flag, status
    FROM element
    WHERE id = (select pathname_id('/ports/branches/2016Q2/sysutils/bacula-client/'))
  UNION
  SELECT E.id, E.name, E.parent_id, E.directory_file_flag, E.status
    FROM element E
    JOIN all_descendents AD
      ON (E.parent_id = AD.id)
)
   SELECT id, name, parent_id, directory_file_flag, status, element_pathname(id) AS pathname
      FROM all_descendents
     WHERE status = 'A'
 ORDER BY pathname;

   id   |     name      | parent_id | directory_file_flag | status |                        pathname                        
--------+---------------+-----------+---------------------+--------+--------------------------------------------------------
 739280 | bacula-client |    732848 | D                   | A      | /ports/branches/2016Q2/sysutils/bacula-client
 739281 | Makefile      |    739280 | F                   | A      | /ports/branches/2016Q2/sysutils/bacula-client/Makefile
(2 rows)

There they are!

Create the port

This creates the directory structure (i.e. pathnames) but it does not create the port. We can do that next with createport($port_element_id, $port_category_id). Let’s try this:

select createport(pathname_id(‘/ports/branches/2016Q2/sysutils/bacula-client’), getcategory(‘sysutils’));

Before we run that, let’s break down the individual components, because I found an error and I want to demonstrate it.

select pathname_id('/ports/branches/2016Q2/sysutils/bacula-client');
 pathname_id 
-------------
      739280
(1 row)

You will see 739280 mentioned above in the output from creating the elements on the branch.

Next we grab the category details:

select getcategory('sysutils');
 getcategory 
-------------
            
(1 row)

Eh? Woah, that’s not good.

Fix the category code

Let’s look at that code:

CREATE OR REPLACE FUNCTION GetCategory (text ) RETURNS int4 AS $$
   DECLARE
      category_name ALIAS for $1;
      pathname             text;   
      category_element_id  int4;
      category_id          int4;
   BEGIN
      pathname            := 'ports/' || category_name;
      category_element_id := Pathname_ID(pathname);
      if NOT category_element_id IS NULL THEN
         select id
           into category_id
           from categories
          where element_id = category_element_id;
      END IF;
      return category_id;
   END;
$$ LANGUAGE 'plpgsql';

Ahh, look at line 8. That pathname needs to be prefixed with ‘ports/head/’. This change is related to Full pathnames, which was work carried out a few years ago. This function was used only by some one-off maintenance scripts.

After making that change, the results are as expected:

# select getcategory('sysutils');
 getcategory 
-------------
          20
(1 row)

Back to creating the port

Let’s create that port:

# select createport(pathname_id('/ports/branches/2016Q2/sysutils/bacula-client'), getcategory('sysutils'));
 createport 
------------
      41761
(1 row)

 select element_pathname(element_id) from ports where id = 41761;
               element_pathname                
-----------------------------------------------
 /ports/branches/2016Q2/sysutils/bacula-client
(1 row)

We have a port on the branch. It needs to be refreshed from the Makefile (via make -V), but that is for a later stage.

I think we are well on our way to copying a port upon demand.

Jun 292016
 

The tracking of branches has turned out to be more involved than I thought. Many functions are already branch-aware and return / do the right thing. Tonight, I tried my recent idea of using head to get the slave ports for a given master port on a branch. So far, that works. The hard part now is dealing with a slave port on a branch which had not yet had any commits on that branch.

First, FreshPorts takes action only upon recent of a commit. If a file has been created by a repo copy, then FreshPorts knows nothing about it.

That’s bad news for branches, which are similar to a repo-copy in concept.

Compare these two URLs:

  1. https://svnweb.freebsd.org/ports/head/sysutils/bacula-server/Makefile
  2. https://svnweb.freebsd.org/ports/branches/2016Q2/sysutils/bacula-server/Makefile

You will see revision 412349 on both pages.

Ideally, FreshPorts would do it’s own internal equivalent of a repocopy when it sees the 2016Q2 branch created, but in the current database design, that would duplicate the content of head for each branch, four times a year. That’s not a good approach.

The current system of adding new elements (each file in a commit is an element, e.g. /ports/branches/2016Q2/sysutils/bacula-server/Makefile) is based upon processing a commit. Each commit has an associated revision number. Thus, when adding a new element, we add a reference to that revision name. This won’t work for the situation we have under consideration.

I think the new approach needs a function such as: $port->CreatePortOnBranch($category_name, $port_name, $CommitBranch); which will take all the elements until a given point in head and duplicate them on a branch. This seems to be the easiest solution so far. I think I can easier create a store procedure which pulls back all children under a given point in the tree.

Using the recursion query I created a few days ago as a starting point, it only took me a few minute to create this query:

WITH RECURSIVE all_descendents AS (
  SELECT id, name, parent_id, directory_file_flag, status
    FROM element
    WHERE id = (select pathname_id('/ports/head/sysutils/bacula-server/'))
  UNION
  SELECT E.id, E.name, E.parent_id, E.directory_file_flag, E.status
    FROM element E
    JOIN all_descendents AD
      ON (E.parent_id = AD.id)
)
   SELECT id, name, parent_id, directory_file_flag, status, element_pathname(id) AS pathname
      FROM all_descendents
     WHERE status = 'A'
 ORDER BY pathname;

   id   |              name               | parent_id | directory_file_flag | status |                                 pathname                                 
--------+---------------------------------+-----------+---------------------+--------+--------------------------------------------------------------------------
 205194 | bacula-server                   |       218 | D                   | A      | /ports/head/sysutils/bacula-server
 205195 | Makefile                        |    205194 | F                   | A      | /ports/head/sysutils/bacula-server/Makefile
 475011 | Makefile.common                 |    205194 | F                   | A      | /ports/head/sysutils/bacula-server/Makefile.common
 212250 | distinfo                        |    205194 | F                   | A      | /ports/head/sysutils/bacula-server/distinfo
 212251 | files                           |    205194 | D                   | A      | /ports/head/sysutils/bacula-server/files
 240895 | bacula-dir.in                   |    212251 | F                   | A      | /ports/head/sysutils/bacula-server/files/bacula-dir.in
 240896 | bacula-fd.in                    |    212251 | F                   | A      | /ports/head/sysutils/bacula-server/files/bacula-fd.in
 240897 | bacula-sd.in                    |    212251 | F                   | A      | /ports/head/sysutils/bacula-server/files/bacula-sd.in
 212252 | chio-bacula                     |    212251 | F                   | A      | /ports/head/sysutils/bacula-server/files/chio-bacula
 682189 | patch-src_console_Makefile.in   |    212251 | F                   | A      | /ports/head/sysutils/bacula-server/files/patch-src_console_Makefile.in
 682190 | patch-src_dird_Makefile.in      |    212251 | F                   | A      | /ports/head/sysutils/bacula-server/files/patch-src_dird_Makefile.in
 682191 | patch-src_filed_Makefile.in     |    212251 | F                   | A      | /ports/head/sysutils/bacula-server/files/patch-src_filed_Makefile.in
 682192 | patch-src_qt-console_bat.pro.in |    212251 | F                   | A      | /ports/head/sysutils/bacula-server/files/patch-src_qt-console_bat.pro.in
 682193 | patch-src_stored_Makefile.in    |    212251 | F                   | A      | /ports/head/sysutils/bacula-server/files/patch-src_stored_Makefile.in
 682194 | patch-src_tools_Makefile.in     |    212251 | F                   | A      | /ports/head/sysutils/bacula-server/files/patch-src_tools_Makefile.in
 352748 | pkg-deinstall.client.in         |    212251 | F                   | A      | /ports/head/sysutils/bacula-server/files/pkg-deinstall.client.in
 352749 | pkg-deinstall.server.in         |    212251 | F                   | A      | /ports/head/sysutils/bacula-server/files/pkg-deinstall.server.in
 352750 | pkg-install.client.in           |    212251 | F                   | A      | /ports/head/sysutils/bacula-server/files/pkg-install.client.in
 352751 | pkg-install.server.in           |    212251 | F                   | A      | /ports/head/sysutils/bacula-server/files/pkg-install.server.in
 262399 | pkg-message.client.in           |    212251 | F                   | A      | /ports/head/sysutils/bacula-server/files/pkg-message.client.in
 262400 | pkg-message.server.in           |    212251 | F                   | A      | /ports/head/sysutils/bacula-server/files/pkg-message.server.in
 205198 | pkg-plist                       |    205194 | F                   | A      | /ports/head/sysutils/bacula-server/pkg-plist
 205199 | pkg-plist.client                |    205194 | F                   | A      | /ports/head/sysutils/bacula-server/pkg-plist.client
(23 rows)

Comparing this to a find in the ports tree:

$ find . | sort 
.
./Makefile
./Makefile.common
./distinfo
./files
./files/bacula-barcodes
./files/bacula-dir.in
./files/bacula-fd.in
./files/bacula-sd.in
./files/chio-bacula
./files/patch-src_console_Makefile.in
./files/patch-src_dird_Makefile.in
./files/patch-src_filed_Makefile.in
./files/patch-src_qt-console_bat.pro.in
./files/patch-src_stored_Makefile.in
./files/patch-src_tools_Makefile.in
./files/pkg-deinstall.client.in
./files/pkg-deinstall.server.in
./files/pkg-install.client.in
./files/pkg-install.server.in
./files/pkg-message.client.in
./files/pkg-message.server.in
./pkg-descr
./pkg-plist
./pkg-plist.client

That’s close enough.

I think the pseudo code for creating the port on the branch will be:

ListOfFiles = GetFilesFor('/ports/head/sysutils/bacula-server');
for file in ${Files}
do
  NewFile = replace('^/ports/head/', '/ports/branches/2016Q2/', ${file});
  AddToElementTable(${NewFile});
done

NewPort(''/ports/branches/2016Q2/sysutils/bacula-server');

Again, I’ll think on it. I’m too tired to even review what I wrote above…

Jun 282016
 

I used to think branches were simple when it came to FreshPorts, but I was wrong. Consider slave ports for example.

If we get a commit for databases/mariadb100-server, we have to update the slave ports, but only the slave ports on the branch, not the slave ports on head.

That sounds easy enough, but keep in mind that FreshPorts only knows about ports which have had a commit. If the first commit is to the master port, then there is no slave port on the branch for FreshPorts.

But wait, you’ll think, the branch is just a branch and it’s a copy of the repo from at the time the branch was … branched, the slave port is always there.

Well, yes, in the repo, but FreshPorts is not a copy of the repo, it is a reflection of commits and that data is extracted via make -V (for the most part).

When that commit for databases/mariadb100-server, there might not be any ports on the branch which have it as a master port. However, there might be commits on other branches, which means the existing SQL might return something like this:

SELECT PA.id          AS slave_port_id,
       PA.name        AS slave_port_name,
       PA.category_id AS slave_category_id,
       PA.category    AS slave_category_name,
       element_pathname(EP.element_id)
  FROM ports_active PA, element_pathname EP
 WHERE PA.master_port = 'databases/mariadb100-server'
   AND PA.element_id  = EP.element_id;

 slave_port_id |  slave_port_name  | slave_category_id | slave_category_name |                  element_pathname                  
---------------+-------------------+-------------------+---------------------+----------------------------------------------------
         35630 | mariadb100-client |                32 | databases           | /ports/head/databases/mariadb100-client
         40269 | mariadb100-client |                32 | databases           | /ports/branches/2015Q3/databases/mariadb100-client
         41023 | mariadb100-client |                32 | databases           | /ports/branches/2015Q4/databases/mariadb100-client
(3 rows)

As you can see, we have three different slave ports, none of which are on our branch (which is this particular case, is 2016Q2.

I think we need to have a query which takes the path into consideration, so we get only the slave ports on our branch. For example:

SELECT PA.id          AS slave_port_id,
       PA.name        AS slave_port_name,
       PA.category_id AS slave_category_id,
       PA.category    AS slave_category_name,
       element_pathname(EP.element_id)
  FROM ports_active PA, element_pathname EP
 WHERE PA.master_port = 'databases/mariadb100-server'
   AND EP.pathname    ilike '/ports/branches/2016Q2/%'
   AND PA.element_id  = EP.element_id;

 slave_port_id | slave_port_name | slave_category_id | slave_category_name | element_pathname 
---------------+-----------------+-------------------+---------------------+------------------
(0 rows)

This is correct. However, it is not ideal for our purposes. We want a list of ports, whether or not they are already on our branch.

Union to the rescue.

SELECT PA.id          AS slave_port_id,
       PA.name        AS slave_port_name,
       PA.category_id AS slave_category_id,
       PA.category    AS slave_category_name,
       element_pathname(EP.element_id)
  FROM ports_active PA, element_pathname EP
 WHERE PA.master_port = 'databases/mariadb100-server'
   AND EP.pathname    ilike '/ports/branches/2016Q2/%'
   AND PA.element_id  = EP.element_id

UNION

SELECT PA.id          AS slave_port_id,
       PA.name        AS slave_port_name,
       PA.category_id AS slave_category_id,
       PA.category    AS slave_category_name,
       element_pathname(EP.element_id)
  FROM ports_active PA, element_pathname EP
 WHERE PA.master_port = 'databases/mariadb100-server'
   AND EP.pathname    ilike '/ports/head/%'
   AND PA.element_id  = EP.element_id

ORDER BY slave_category_name, slave_port_name;

 slave_port_id |  slave_port_name  | slave_category_id | slave_category_name |            element_pathname             
---------------+-------------------+-------------------+---------------------+-----------------------------------------
         35630 | mariadb100-client |                32 | databases           | /ports/head/databases/mariadb100-client
(1 row)

Now we have the list we need, and it will be clear which ports do not exist on the branch and we must first create the port instead of fetching it. This point becomes clearer when you try the same query on databases/postgresql94-server

SELECT PA.id          AS slave_port_id,
       PA.name        AS slave_port_name,
       PA.category_id AS slave_category_id,
       PA.category    AS slave_category_name,
       element_pathname(EP.element_id)
  FROM ports_active PA, element_pathname EP
 WHERE PA.master_port = 'databases/postgresql94-server'
   AND EP.pathname    ilike '/ports/branches/%'
   AND PA.element_id  = EP.element_id
UNION

SELECT PA.id          AS slave_port_id,
       PA.name        AS slave_port_name,
       PA.category_id AS slave_category_id,
       PA.category    AS slave_category_name,
       element_pathname(EP.element_id)
  FROM ports_active PA, element_pathname EP
 WHERE PA.master_port = 'databases/postgresql94-server'
   AND EP.pathname    ilike '/ports/head/%'
   AND PA.element_id  = EP.element_id
ORDER BY slave_category_name, slave_port_name, element_pathname;

 slave_port_id |    slave_port_name    | slave_category_id | slave_category_name |                   element_pathname                   
---------------+-----------------------+-------------------+---------------------+------------------------------------------------------
         40432 | postgresql94-client   |                32 | databases           | /ports/branches/2015Q2/databases/postgresql94-client
         39921 | postgresql94-client   |                32 | databases           | /ports/branches/2016Q1/databases/postgresql94-client
         34655 | postgresql94-client   |                32 | databases           | /ports/head/databases/postgresql94-client
         34657 | postgresql94-contrib  |                32 | databases           | /ports/head/databases/postgresql94-contrib
         34678 | postgresql94-docs     |                32 | databases           | /ports/head/databases/postgresql94-docs
         34677 | postgresql94-plperl   |                32 | databases           | /ports/head/databases/postgresql94-plperl
         34675 | postgresql94-plpython |                32 | databases           | /ports/head/databases/postgresql94-plpython
         34676 | postgresql94-pltcl    |                32 | databases           | /ports/head/databases/postgresql94-pltcl
(8 rows)

If you look only at slave_port_name and slave_category_name, you can see we have duplicates. The ideal situation takes values from head first, then removes any values found in the branch we are interested in, and for this query, we are interested in just one branch.

I’m positive we can do this with just one query.

I thought about this issue overnight. I think the solution is as follows.

  1. Get the list of master ports from head
  2. Look for those same port names on the branch
  3. If the port exists, confirm it still has the same master port (i.e. the one from step #1)
  4. If not the same same master port, ignore this ‘slave’ port
  5. If port exists, refresh it
  6. If port does not exist on branch, create it, and refresh it

NOTE:

  • Unless otherwise mentioned, we are dealing with the branch upon with the commit occurred
  • When refreshing, we refer to the commit we are processing now

I’ll have to think about this some more.

Jun 272016
 

Marie Helene asked a very good question:

Does the “for run” dependency list only list direct dependencies, or depends of depends too?

The answer is yes, only direct dependencies, but all the information is there to provide a complete list of all dependencies.

The table looks like this:

freshports.org=# \d port_dependencies
         Table "public.port_dependencies"
         Column         |     Type     | Modifiers 
------------------------+--------------+-----------
 port_id                | integer      | not null
 port_id_dependent_upon | integer      | not null
 dependency_type        | character(1) | not null
Indexes:
    "port_dependencies_pkey" PRIMARY KEY, btree (port_id, port_id_dependent_upon, dependency_type)
Foreign-key constraints:
    "port_dependencies_port_id_dependent_upon_fkey" FOREIGN KEY (port_id_dependent_upon) REFERENCES ports(id) ON UPDATE CASCADE ON DELETE CASCADE
    "port_dependencies_port_id_fkey" FOREIGN KEY (port_id) REFERENCES ports(id) ON UPDATE CASCADE ON DELETE CASCADE

freshports.org=#

Here is the list of dependencies for sysutils/bacula-server:

freshports.org=# select * from port_dependencies where port_id = (select getport('sysutils/bacula-server'));
 port_id | port_id_dependent_upon | dependency_type 
---------+------------------------+-----------------
   13990 |                    954 | B
   13990 |                  35865 | B
   13990 |                    954 | R
   13990 |                  15203 | L
   13990 |                  13986 | L
   13990 |                  35866 | L
   13990 |                  32910 | L
(7 rows)

freshports.org=# 

You’ll see that one port is listed twice, both as a Run and as a Library dependency. The following output is more useful:

freshports.org=# select *, categoryport(port_id_dependent_upon) from port_dependencies where port_id = (select getport('sysutils/bacula-server')) order by categoryport;
 port_id | port_id_dependent_upon | dependency_type |         categoryport          
---------+------------------------+-----------------+-------------------------------
   13990 |                  15203 | L               | archivers/lzo2
   13990 |                  32910 | L               | databases/postgresql93-client
   13990 |                  35866 | L               | devel/gettext-runtime
   13990 |                  35865 | B               | devel/gettext-tools
   13990 |                    954 | R               | security/openssl
   13990 |                    954 | B               | security/openssl
   13990 |                  13986 | L               | sysutils/bacula-client
(7 rows)

freshports.org=# 

Dependencies of dependencies

What we want now is a list of dependencies for each of the above mentioned dependencies. For example:

freshports.org=# select *, categoryport(port_id_dependent_upon) from port_dependencies where port_id = (select getport('archivers/lzo2'));
 port_id | port_id_dependent_upon | dependency_type | categoryport 
---------+------------------------+-----------------+--------------
(0 rows)

freshports.org=# select *, categoryport(port_id_dependent_upon) from port_dependencies where port_id = (select getport('databases/postgresql93-client'));
 port_id | port_id_dependent_upon | dependency_type | categoryport 
---------+------------------------+-----------------+--------------
(0 rows)

freshports.org=# select *, categoryport(port_id_dependent_upon) from port_dependencies where port_id = (select getport('devel/gettext-runtime'));
 port_id | port_id_dependent_upon | dependency_type |    categoryport     
---------+------------------------+-----------------+---------------------
   35866 |                  34803 | R               | print/indexinfo
   35866 |                   1849 | L               | converters/libiconv
(2 rows)

freshports.org=# 

And so on, for each port. Clearly, this calls for recursion. My first thought was to create a stored procedure and call it from within itself. I have done this for other situations in FreshPorts, but when I searched, I found this RECURSIVE Query Is Recursive post by Harold Giménez. It took me little time to create this query:

WITH RECURSIVE all_depends AS (
  SELECT  port_id, port_id_dependent_upon, dependency_type
    FROM port_dependencies
    WHERE port_id = (select getport('sysutils/bacula-server'))
  UNION
  SELECT PD.port_id, PD.port_id_dependent_upon, PD.dependency_type
    FROM port_dependencies PD
    JOIN all_depends AD
      ON (PD.port_id = AD.port_id_dependent_upon)
)
   SELECT categoryport(port_id) AS ThisPort, categoryport(port_id_dependent_upon) AS IsDependentUpon, dependency_type AS DependencyType
     FROM all_depends
 ORDER BY ThisPort, IsDependentUpon;

        thisport        |        isdependentupon        | dependencytype 
------------------------+-------------------------------+----------------
 devel/gettext-runtime  | converters/libiconv           | L
 devel/gettext-runtime  | print/indexinfo               | R
 devel/gettext-tools    | converters/libiconv           | L
 devel/gettext-tools    | devel/gettext-runtime         | L
 devel/gettext-tools    | print/indexinfo               | R
 security/openssl       | lang/perl5.20                 | B
 sysutils/bacula-server | archivers/lzo2                | L
 sysutils/bacula-server | databases/postgresql93-client | L
 sysutils/bacula-server | devel/gettext-runtime         | L
 sysutils/bacula-server | devel/gettext-tools           | B
 sysutils/bacula-server | security/openssl              | B
 sysutils/bacula-server | security/openssl              | R
 sysutils/bacula-server | sysutils/bacula-client        | L
(13 rows)

I think we could build a small tree based upon that and display it on the web page.

Jun 182016
 

This afternoon, Bert JW Regeer wished that FreshPorts “would show information from the quarterly branches too…”.

I knew I had done some work on this and that there was a way to display branch information. I check the source code and found the ?branch= parameter, but there was nothing in the database for this branch.

I discovered several missing items. I apologize that this list will not refer to something you can check. It is mostly for my own notes when i come back to this later.

To try getting one commit into the try, I did this:

cd /usr/FreshPorts/ports-jail/usr/local</li>
sudo mkdir PORTS-2016Q2
sudo chown dan PORTS-2016Q2
cd PORTS-2016Q2
svn co svn://svn.freebsd.org/ports/branches/2016Q2 .

That took some time…

I grabbed a raw email from https://docs.freebsd.org/mail/current/svn-ports-branches.html and used it for testing. It seems that branches.pm needs to contain a hash of all branches we are using. Once I added 2016Q2 to that file, the commit was processed and loaded up.

That allows http://www.freshports.org/?branch=2016Q2 to display something.

Short term

In the short term, this needs to be done:

  1. svn co of each branch of the ports tree (now underway)
  2. Get copy of all emails sent to the svn-ports-branches email list (i.e. https://docs.freebsd.org/mail/archive/svn-ports-branches.html
  3. Remove requirement for each quarterly branch to be listed in branches.pm
  4. Automate the process of checking out each quarterly branch
Apr 032016
 

See the issue raised here.

On 1 April 2016, we had some large commits (you can see them all here). This ground the front page to a halt. But as you can see, that link above loads rather quickly.

Why?

At first, I thought it was because the date-based page was caching the results. I was wrong. It was merely faster overall.

I started looking at the code and discovered that it was not caching at all. I soon fixed that. Part of the code was already written, and the existing code just needed to start using it.

I also added cache clearing support, which involved adding a new table (cache_clearing_dates).

Why is date.php faster than index.php?

If you compare the output of today’s commits (e.g. https://www.freshports.org/date.php?date=2016/4/3) with the home page, the output is the same, or at least it looks the same to me. I don’t see any differences.

Let’s examine the code.

date.php uses the Commits class (see line 113).

index.php uses the LatestCommits class (see line 159).

I suspect the difference is in there.

The SQL for the Commits class (used by date.php) is relatively easy to follow for someone with intermediate SQL skills. It’s about 60 lines.

I know what the code for LatestCommits (i.e. index.php) looks like; it is a juggernaut. It is deceiving. It’s a mere 10 lines long but there is much below the surface, and done mostly via stored procedures.

I think I will amend the Commits class to handle a date and a limit.

Dec 292015
 

Sir l33tname asked:

That’s a good question. It took some more questions on IRC for me to figure it out. Here is what I know.

The port in question is net-p2p/btsync and when I looked at it, I could see no reason for not building a package. When I asked on IRC, Dmitry Marakasov pointed out LICENSE_PERMS was the issue. Specifically, it did not contain pkg-mirror . Further conversation revealed that the best action for FreshPorts involved the following:

make -V _LICENSE_RESTRICTED -V NO_PACKAGE -V MANUAL_PACKAGE_BUILD

If _LICENSE_RESTRICTED contains delete-package or either of NO_PACKAGE and MANUAL_PACKAGE_BUILD is non-blank, do not expect a package to be present.

This requires three new fields in the FreshPorts database, one for each of those values. It will also require a little bit of code on the website to display a Package not available message.

I’ll add this to the list, and I have raised an issue.

Sep 112015
 

Today, I sat down with lattera & Shirkdog and converted the website repository from subversion to git. That repo is sitting there, ready to go. That process took about an hour.

We ran this command:

git svn clone --ignore-paths="^(api.freshports.org|backend|convert|daemontools|database-schema|dataconversion|db-conversion|develop|ports|scripts|scripts-fp2|secure|walkports)" svn+ssh://dan@svn.int.unixathome.org/usr/local/svn/repos/freshports-1

Yeah, I didn’t want to include all that that code in the repo. Various reasons, but mostly because they are not related to the website.

What’s in there?

I found some passwords in there. They are in sample configuration files. They are neither production nor development passwords. At one they, they were used.

You’ll also find cookie encoding functions. If you know someone’s login name, you can figure create your own cookie and login as them.

What do I have to do first?

Here’s what I have to do before I’m ready to upload the code to GitHub.

  • check passwords
  • change cookie encoding

In the meantime, if you have any questions, we’re on IRC at #FreshPorts on FreeNode.

Yeah, it’s more vulnerable now

I admit it. FreshPorts is now easier to attack, given you have all the web source. Or will, soon.

Aug 092015
 

A great feature of FreshPorts is the ability to see all the ports which are dependent upon a given port. For example, it allows you to see what ports depend upon security/polarssl. This list is stored in a the port_dependencies table, which looks like this:

freshports.org=# \d port_dependencies
         Table "public.port_dependencies"
         Column         |     Type     | Modifiers 
------------------------+--------------+-----------
 port_id                | integer      | not null
 port_id_dependent_upon | integer      | not null
 dependency_type        | character(1) | not null
Indexes:
    "port_dependencies_pkey" PRIMARY KEY, btree (port_id, port_id_dependent_upon, dependency_type)
Foreign-key constraints:
    "port_dependencies_port_id_dependent_upon_fkey" FOREIGN KEY (port_id_dependent_upon) REFERENCES ports(id) ON UPDATE CASCADE ON DELETE CASCADE
    "port_dependencies_port_id_fkey" FOREIGN KEY (port_id) REFERENCES ports(id) ON UPDATE CASCADE ON DELETE CASCADE

freshports.org=# 

This information is gathered when a port is updated. The data is used by both ports (i.e. port_id and port_id_dependent_upon) and is shown on the page for both ports. One port lists the other either under This port is required by: or under Dependencies (e.g. Build dependencies).

There is a bug where the dependent-upon port does not correctly display all dependent ports. The root cause is a caching issue. The page for a given port is cached until that port is updated, thus a dependent-upon-port will not refresh its list if a new dependency arises.

For example, consider security/polarssl on 14 Jul 2015 where you would have seen:

This port is required by:

for Libraries
emulators/dolphin-emu
security/openvpn-polarssl

On 5 Jul 2015, a new port was created security/openvpn-polarssl which was dependent upon security/polarssl. In this case, security/openvpn-polarssl would correctly display its dependent ports, however security/polarssl did not.

It was Avengence and mandree who pointed this problem out and led to the issue in question.

It was clearly a caching issue as demonstrated by clearing the cache for security/polarssl which then correctly included the new port.

The solution

The solution centered around stored procedures (SP). I created a new SP based upon an existing one.


CREATE OR REPLACE FUNCTION port_dependencies_clear_cache() RETURNS TRIGGER AS $$
   DECLARE
      l_cache_clearing_ports_id   int8;
      l_port      text;
      l_category  text;
   BEGIN
      IF TG_OP = 'UPDATE' OR TG_OP = 'INSERT' THEN
         INSERT INTO cache_clearing_ports (port_id, category, port)
            SELECT distinct id, category, name
              FROM port_dependencies PD, ports_all PA
             WHERE PD.port_id                = NEW.id
               AND PD.port_id_dependent_upon = PA.id
               AND NOT EXISTS (SELECT port_id 
                                 FROM cache_clearing_ports
                                WHERE port_id = PD.port_id_dependent_upon);

          NOTIFY port_updated;
      END IF;

      IF TG_OP = 'DELETE' THEN
         INSERT INTO cache_clearing_ports (port_id, category, port)
            SELECT distinct id, category, name
              FROM port_dependencies PD, ports_all PA
             WHERE PD.port_id                = OLD.id
               AND PD.port_id_dependent_upon = PA.id
               AND NOT EXISTS (SELECT port_id 
                                 FROM cache_clearing_ports
                                WHERE port_id = PD.port_id_dependent_upon);

          NOTIFY port_updated;
      END IF;

      -- when a port changes, add an entry to the cache clearing table
      RETURN NEW;
   END
$$ LANGUAGE 'plpgsql';

  DROP TRIGGER port_dependencies_clear_cache ON ports;
CREATE TRIGGER port_dependencies_clear_cache
    AFTER UPDATE OR INSERT OR DELETE on ports
    FOR EACH ROW
    EXECUTE PROCEDURE port_dependencies_clear_cache();

This is a trigger which will be invoked each time there is an update, insert, or delete on the ports table (see line 40). The SP will queue up a cache clear for all ports which the given port is dependent upon. This queueing is accomplished by the highlighting code, one being slightly different from the other because of the source of the port id for a DELETE (i.e. you get it off the OLD record).

The test

Here is the test I ran to confirm the correct action:

freshports.org=# begin;
BEGIN
freshports.org=# select * from cache_clearing_ports;
 id | port_id | category | port | date_added 
----+---------+----------+------+------------
(0 rows)

freshports.org=# update ports set patch_depends = '***' where id = 36926;
UPDATE 1
freshports.org=# select * from cache_clearing_ports;
   id    | port_id | category  |       port       |         date_added         
---------+---------+-----------+------------------+----------------------------
 2151680 |   15203 | archivers | lzo2             | 2015-08-09 10:35:03.292856
 2151681 |   32213 | security  | easy-rsa         | 2015-08-09 10:35:03.292856
 2151682 |   31511 | devel     | pkgconf          | 2015-08-09 10:35:03.292856
 2151683 |   25927 | security  | polarssl         | 2015-08-09 10:35:03.292856
 2151684 |   36926 | security  | openvpn-polarssl | 2015-08-09 10:35:03.292856
(5 rows)

freshports.org=# rollback;
ROLLBACK
freshports.org=# 

In this test, I do a fake update to port.id = 36926 (security/openvpn-polarssl). After the update, you can see that the cache_clearing_ports now lists all the dependent ports and itself. These cache entries will be removed when the fp-list wakes up and processes the table.

I had planned to work on the above issue during my flight home from DefCon, but that is tomorrow.

Jul 022015
 

FreshPorts is a tender beast. It is both massive and sensitive. One example of that sensitivity came to light today.

Jeffrey Bouquet wrote in to tell me of a problem with paths on FreshSource (our sister website, which displays more, but less). The problem has been corrected, but I took screen shots to illustrate this post.

Looking

Looking at FreshSource and scrolling down, I saw this:

Screen Shot 2015-07-02 at 11.49.27 AM

The top commit does not list the files. The second commit does. Oh.

I started looking at those two commits in FreshPorts and I found:

Screen Shot 2015-07-02 at 11.52.40 AM

URL for above commit

The path name is listed, but that’s just text. Notice that the Links column is empty.

Compare that with the previous commit:

Screen Shot 2015-07-02 at 11.46.08 AM

URL to above commit

As you see, the Links column is populated.

Source

Digging around in the code for this output, I noticed a call to freshports_MessageIdToRepoName($message_id) and that is what clued me in.

message_id

If you examine the two URLs to commits above, you will notice a difference in the message_id paramter:

  • message_id=201507011852.t61Iqbv2094356@repo.freebsd.org
  • message_id=201507011757.t61Hv8u7065602@svn.freebsd.org

Ahh, they changed the hostname for outgoing commit emails.

The fix

A few minutes later and some quick tests, we had a fix:

 $ svn di freshports.php 
Index: freshports.php
===================================================================
--- freshports.php	(revision 4718)
+++ freshports.php	(working copy)
@@ -35,7 +35,7 @@
 
 DEFINE('SPONSORS', 'Servers and bandwidth provided by<br><a href="http://www.nyi.net/" TARGET="_new">New York Internet</a>, <a href="http://www.supernews.com/"  TARGET="_new">SuperNews</a>, and <a href="http://www.rootbsd.net/" TARGET="_new">RootBSD</a>');
 
-DEFINE('FRESHPORTS_ENCODING', 'UTF-8');
+DEFINE('FRESHPORTS_ENCODING', 'ISO-8859-15');
 
 if ($Debug) echo "'" . $_SERVER['DOCUMENT_ROOT'] . '/../classes/watchnotice.php<br>';
 
@@ -2093,7 +2093,9 @@
 {
   $repo = array(
             '/\@svn.freebsd.org$/i'     => FREEBSD_REPO_SVN,
-            '/\@repoman.freebsd.org$/i' => FREEBSD_REPO_CVS);
+            '/\@repoman.freebsd.org$/i' => FREEBSD_REPO_CVS,
+            '/\@repo.freebsd.org$/i'    => FREEBSD_REPO_SVN,
+);
 
   # given a message id, figure out what repo it came from
   $RepoName = '';

Why bother?

FreshPorts predates the change from cvs to svn. The data in FreshPorts does not care what type of revision management tool you use, it does link back to the repo for your browsing pleasure. If FreshPorts knows the repo related to a given commit, it can link. If not, it can’t.