Packages for latest and quarterly

Since the last past, I’ve added quarterly packages to the system.

I want to outline the changes while I remember the.

I will start by outline the tables we have.

abi

The abi table contains the usual suspects. Entries in this table are added manually, when a new release comes out.

See https://pkg.freebsd.org for examples of where these values are obtained from.

freshports.dev=# SELECT * FROM abi ORDER BY name;
 id |        name        | active |         last_checked          |         last_updated          
----+--------------------+--------+-------------------------------+-------------------------------
  7 | FreeBSD:11:aarch64 | t      | 2020-04-10 21:51:28.219155+00 | 2020-04-10 21:51:28.219155+00
  6 | FreeBSD:11:amd64   | t      | 2020-04-10 21:51:28.107421+00 | 2020-04-10 21:51:28.107421+00
  4 | FreeBSD:11:i386    | t      | 2020-04-10 21:05:01.431882+00 | 2020-04-10 21:05:01.431882+00
  9 | FreeBSD:12:aarch64 | t      | 2020-04-10 21:51:28.361931+00 | 2020-04-10 21:51:28.361931+00
  1 | FreeBSD:12:amd64   | t      | 2020-04-10 18:22:10.545416+00 | 2020-04-10 18:24:56.829453+00
  8 | FreeBSD:12:i386    | t      | 2020-04-10 21:51:28.278548+00 | 2020-04-10 21:51:28.278548+00
  2 | FreeBSD:13:aarch64 | t      | 2020-04-10 18:39:12.554194+00 | 2020-04-10 18:39:12.554194+00
 11 | FreeBSD:13:amd64   | t      | 2020-04-10 21:51:28.479757+00 | 2020-04-10 21:51:28.479757+00
 10 | FreeBSD:13:i386    | t      | 2020-04-10 21:51:28.421228+00 | 2020-04-10 21:51:28.421228+00
(9 rows)

freshports.dev=# \d abi
                                      Table "public.abi"
    Column    |           Type           | Collation | Nullable |           Default            
--------------+--------------------------+-----------+----------+------------------------------
 id           | integer                  |           | not null | generated always as identity
 name         | text                     |           | not null | 
 active       | boolean                  |           | not null | true
 last_checked | timestamp with time zone |           | not null | CURRENT_TIMESTAMP
 last_updated | timestamp with time zone |           | not null | CURRENT_TIMESTAMP
Indexes:
    "abi_pkey" PRIMARY KEY, btree (id)
    "abi_name_idx" UNIQUE, btree (name)
Referenced by:
    TABLE "packages" CONSTRAINT "packages_abi_id_fk" FOREIGN KEY (abi_id) REFERENCES abi(id) NOT VALID
    TABLE "packages_last_checked" CONSTRAINT "packages_last_checked_abi_id_fk" FOREIGN KEY (abi_id) REFERENCES abi(id)
    TABLE "packages_raw_original" CONSTRAINT "packages_raw_abi_id_fk" FOREIGN KEY (abi_id) REFERENCES abi(id) NOT VALID

freshports.dev=# 

The last_checked and last_updated columns are now deprecated. You will see their respective replacements in the packages_last_checked table.

packages

The packages contains the latest package version available for a given ABI on a given branch (e.g. latest, quarterly, 2020Q2).

freshports.dev=# SELECT * FROM packages ORDER BY package_name LIMIT 10;
   id   | abi_id | port_id | package_version | package_name | branch_id 
--------+--------+---------+-----------------+--------------+-----------
 186113 |      4 |   31186 | 0.0.23b_9       | 0ad          |         1
 204653 |      6 |   31186 | 0.0.23b_9       | 0ad          |         1
 278713 |      8 |   31186 | 0.0.23b_9       | 0ad          |         1
 371391 |     10 |   31186 | 0.0.23b_9       | 0ad          |         1
 403274 |     11 |   31186 | 0.0.23b_9       | 0ad          |         1
 495736 |      1 |   31186 | 0.0.23b_9       | 0ad          |         1
 590825 |      1 |   31186 | 0.0.23b_9       | 0ad          |       142
 622854 |      4 |   31186 | 0.0.23b_9       | 0ad          |       142
 653983 |      6 |   31186 | 0.0.23b_9       | 0ad          |       142
 712933 |      8 |   31186 | 0.0.23b_9       | 0ad          |       142
(10 rows)

freshports.dev=# \d packages
                             Table "public.packages"
     Column      |  Type   | Collation | Nullable |           Default            
-----------------+---------+-----------+----------+------------------------------
 id              | bigint  |           | not null | generated always as identity
 abi_id          | integer |           | not null | 
 port_id         | integer |           | not null | 
 package_version | text    |           | not null | 
 package_name    | text    |           | not null | 
 branch_id       | integer |           | not null | 
Indexes:
    "packages_pkey" PRIMARY KEY, btree (id)
    "fki_packages_abi_id_fk" btree (abi_id)
    "fki_packages_port_id_fk" btree (port_id)
    "packages_package_name_idx" btree (package_name)
Foreign-key constraints:
    "packages_abi_id_fk" FOREIGN KEY (abi_id) REFERENCES abi(id) NOT VALID
    "packages_branch_id_fk" FOREIGN KEY (branch_id) REFERENCES system_branch(id) NOT VALID
    "packages_port_id_fk" FOREIGN KEY (port_id) REFERENCES ports(id) NOT VALID

packages_raw

The packages_raw table is a plain text version of the packages table. The import script loads data into this table before it is loaded into the packages table.

The loading from the temporary table (packages_raw) ito the permanent table (packages) is easily done with a single database query. Doing the same work within an external program would be non-trivial, slower, and error-prone.

The conversion process

This query takes the raw data from the temporary table (packages_raw) into the permanent table (packages).

INSERT INTO packages(abi_id, branch_id, port_id, package_name, package_version)
SELECT GetABIId(abi),
       GetBranchId(branch),
       GetPort(package_origin),
       package_name,
       package_version                                                               
  FROM packages_raw
 WHERE GetPort(package_origin) IS NOT NULL
   AND GetBranchId(branch)     IS NOT NULL
   AND GetABIId(abi)           IS NOT NULL ;

At present, this query does all of the data and has no filtering (e.g. by ABI or branch).

packages_last_checked

The packages_last_checked table keeps track of work done or to be done.


freshports.dev=# SELECT * FROM packages_last_checked;
 abi_id | branch_id |         last_checked          |       repo_date        | import_date 
--------+-----------+-------------------------------+------------------------+-------------
      7 |         1 | 2020-04-12 16:56:25.824973+00 | 2018-09-29 04:24:00+00 | 
      7 |       142 | 2020-04-12 16:56:25.824973+00 | 2020-04-07 12:03:00+00 | 
      6 |         1 | 2020-04-12 16:56:25.824973+00 | 2020-04-11 00:16:00+00 | 
      6 |       142 | 2020-04-12 16:56:25.824973+00 | 2020-04-12 03:20:00+00 | 
      4 |         1 | 2020-04-12 16:56:25.824973+00 | 2020-04-10 18:43:00+00 | 
      4 |       142 | 2020-04-12 16:56:25.824973+00 | 2020-04-12 02:36:00+00 | 
      9 |         1 | 2020-04-12 16:56:25.824973+00 | 2018-10-11 05:47:00+00 | 
      9 |       142 | 2020-04-12 16:56:25.824973+00 | 2020-04-02 05:09:00+00 | 
      1 |         1 | 2020-04-12 16:56:25.824973+00 | 2020-04-11 01:18:00+00 | 
      1 |       142 | 2020-04-12 16:56:25.824973+00 | 2020-04-12 03:56:00+00 | 
      8 |         1 | 2020-04-12 16:56:25.824973+00 | 2020-04-10 19:12:00+00 | 
      8 |       142 | 2020-04-12 16:56:25.824973+00 | 2020-04-12 03:36:00+00 | 
      2 |         1 | 2020-04-12 16:56:25.824973+00 | 2020-01-04 12:39:00+00 | 
      2 |       142 | 2020-04-12 16:56:25.824973+00 |                        | 
     11 |         1 | 2020-04-12 16:56:25.824973+00 | 2020-04-11 02:22:00+00 | 
     11 |       142 | 2020-04-12 16:56:25.824973+00 |                        | 
     10 |         1 | 2020-04-12 16:56:25.824973+00 | 2020-04-10 23:14:00+00 | 
     10 |       142 | 2020-04-12 16:56:25.824973+00 |                        | 
(18 rows)

freshports.dev=# \d packages_last_checked
                   Table "public.packages_last_checked"
    Column    |           Type           | Collation | Nullable | Default 
--------------+--------------------------+-----------+----------+---------
 abi_id       | integer                  |           | not null | 
 branch_id    | integer                  |           | not null | 
 last_checked | timestamp with time zone |           |          | 
 repo_date    | timestamp with time zone |           |          | 
 import_date  | timestamp with time zone |           |          | 
Indexes:
    "packages_last_checked_pkey" PRIMARY KEY, btree (abi_id, branch_id)
    "packages_last_checked_branch_id_idx" btree (branch_id)
Foreign-key constraints:
    "packages_last_checked_abi_id_fk" FOREIGN KEY (abi_id) REFERENCES abi(id)

freshports.dev=# 

When the repo_date column is empty, no repo was found for this ABI & branch combination.

Note the import_date column is empty. The script which populates the packages table does not yet update that value.

The check_repos_for_new_stuff.py script updates the last_checked and repo_date columns.

To find repos which need to be updated, use this query:

SELECT *
  FROM packages_last_checked 
 WHERE repo_date > import_date 
   OR (repo_date IS NOT NULL AND import_date IS NULL);

What’s next?

The import-raw-abi.sh script is the basis of future work. The pseudo code is similar to this:

SELECT ABI, branch values which need updating
FOR EACH combination
DO
   fetch https://pkg.freebsd.org/$abi/$branch/packagesite.txz
    unxz packagesite.txz
    tar -xf packagesite.tar

    ls -l ./packagesite.yaml
    jq -rc --arg ABI "$abi" --arg BRANCH "$branch" '[$ABI, $BRANCH, .origin, .name, .version] | @tsv' < ./packagesite.yaml > packagesite.csv
  
    ~/bin/import-via-copy-packagesite-all-raw-fields.py -i packagesite.csv

   Update the packages_last_checked.import_date for this combination
DONE

NOTES on this:

  • The launching of this script can be trigger based. When an update occurs to the packages_last_checked, observe if new data is detected, and set work-to-be-done accordingly.
  • I now think that import and conversion from packages_raw to packages can be done separately.
  • Add another column to the packages_last_checked, call it import_processed.
  • When you find a column with import_date > import_processed, that indicates you need to run the conversion process (see a previous section).
Website Pin Facebook Twitter Myspace Friendfeed Technorati del.icio.us Digg Google StumbleUpon Premium Responsive

Leave a Comment

Scroll to Top