This is the latest in a series of posts dealing with displaying what packages are available for a given FreeBSD port.
The Packages – how is this data stored? post may help you follow along. It outlines the data and the tables used in this post.
Last night I managed to get the right data out for ports which have multiple flavors. I used a few procedures each of one calling another, eventually getting the results.
That effort has now been replaced with a single query.
In this post, we will be looking at the www/py-django-storages port, which has three flavors.
The query
This query came to me from RhodiumToad, who has a wide breadth of knowledge and is liberal with his help. What you see below has been adjusted since then, but it closely represents his delivery.
WITH pkg AS (SELECT * FROM packages WHERE port_id = 28303) SELECT abi.name AS abi, pn.package_name, pkg.package_set, pkg.package_version FROM abi CROSS JOIN (SELECT DISTINCT package_name FROM pkg) pn LEFT JOIN pkg on (pkg.abi_id = abi.id AND pkg.package_name = pn.package_name) ORDER BY package_name, abi, package_set;
This blog post is mainly to document the above and go through it step by step. I’ll need this later because I know I won’t remember it. I also want to learn more about this query and apply it to future development.
I’m going to step through the query, outlining what each part does.
WITH pkg
A WITH query “provides a way to write auxiliary statements for use in a larger query”.
WITH pkg AS (SELECT * FROM packages WHERE port_id = 28303) SELECT abi.name AS abi, pn.package_name, pkg.package_set, pkg.package_version FROM abi CROSS JOIN (SELECT DISTINCT package_name FROM pkg) pn LEFT JOIN pkg on (pkg.abi_id = abi.id AND pkg.package_name = pn.package_name) ORDER BY package_name, abi, package_set;
I think of it as a temporary table for this query. It is defining the data which will be used by the rest of the query.
I will refer to this later as the pkg table.
This is what we get:
freshports.dev=# SELECT * FROM packages WHERE port_id = 28303 ORDER BY package_name; id | abi_id | port_id | package_version | package_name | branch_id | branch_name | package_set ---------+--------+---------+-----------------+----------------------+-----------+-------------+------------- 4373305 | 1 | 28303 | 1.9.1 | py27-django-storages | | | quarterly 4369003 | 1 | 28303 | 1.9.1 | py27-django-storages | | | latest 4340907 | 7 | 28303 | 1.9.1 | py27-django-storages | | | quarterly 4392074 | 10 | 28303 | 1.9.1 | py27-django-storages | | | latest 4388153 | 11 | 28303 | 1.9.1 | py27-django-storages | | | latest 4346130 | 6 | 28303 | 1.9.1 | py27-django-storages | | | latest 4382725 | 2 | 28303 | 1.8 | py27-django-storages | | | latest 4349958 | 6 | 28303 | 1.9.1 | py27-django-storages | | | quarterly 4353741 | 4 | 28303 | 1.9.1 | py27-django-storages | | | latest 4380869 | 8 | 28303 | 1.9.1 | py27-django-storages | | | quarterly 4376671 | 8 | 28303 | 1.9.1 | py27-django-storages | | | latest 4357905 | 4 | 28303 | 1.9.1 | py27-django-storages | | | quarterly 4339390 | 7 | 28303 | 1.5.1 | py27-django-storages | | | latest 4361606 | 9 | 28303 | 1.5.1 | py27-django-storages | | | latest 4363708 | 9 | 28303 | 1.9.1 | py27-django-storages | | | quarterly 4339245 | 7 | 28303 | 1.5.1 | py36-django-storages | | | latest 4361337 | 9 | 28303 | 1.5.1 | py36-django-storages | | | latest 4369021 | 1 | 28303 | 1.9.1 | py37-django-storages | | | latest 4373197 | 1 | 28303 | 1.9.1 | py37-django-storages | | | quarterly 4365567 | 9 | 28303 | 1.9.1 | py37-django-storages | | | quarterly 4357806 | 4 | 28303 | 1.9.1 | py37-django-storages | | | quarterly 4376807 | 8 | 28303 | 1.9.1 | py37-django-storages | | | latest 4380779 | 8 | 28303 | 1.9.1 | py37-django-storages | | | quarterly 4353810 | 4 | 28303 | 1.9.1 | py37-django-storages | | | latest 4349931 | 6 | 28303 | 1.9.1 | py37-django-storages | | | quarterly 4383395 | 2 | 28303 | 1.8 | py37-django-storages | | | latest 4388136 | 11 | 28303 | 1.9.1 | py37-django-storages | | | latest 4346104 | 6 | 28303 | 1.9.1 | py37-django-storages | | | latest 4392033 | 10 | 28303 | 1.9.1 | py37-django-storages | | | latest 4342889 | 7 | 28303 | 1.9.1 | py37-django-storages | | | quarterly (30 rows) freshports.dev=#
I sorted the data just to make it easier to follow.
The above represents all the packages available for www/py-django-storages from the FreeBSD repos.
How many different builds do we have for each package_name?
SELECT package_name, count(package_name) FROM packages WHERE port_id = 28303 GROUP BY package_name ORDER BY package_name;
That query produces this result:
package_name | count ----------------------+------- py27-django-storages | 15 py36-django-storages | 2 py37-django-storages | 13 (3 rows)
Those numbers will become relevant later.
SELECT DISTINCT
SELECT DISTINCT eliminates duplicate rows from the result. Don’t do what I initially did and use brackets around your first item. Consider SELECT DISTINCT to be a single keyword. It is not a function.
WITH pkg AS (SELECT * FROM packages WHERE port_id = 28303) SELECT abi.name AS abi, pn.package_name, pkg.package_set, pkg.package_version FROM abi CROSS JOIN (SELECT DISTINCT package_name FROM pkg) pn LEFT JOIN pkg on (pkg.abi_id = abi.id AND pkg.package_name = pn.package_name) ORDER BY package_name, abi, package_set;
We need results for all packages on all ABI. To get there, we need a list of the packages first.
Combining that SELECT DISTINCT into the WITH clause, we get:
WITH pkg AS (SELECT * FROM packages WHERE port_id = 28303) SELECT DISTINCT package_name FROM pkg; package_name ---------------------- py27-django-storages py36-django-storages py37-django-storages (3 rows)
This is the pkg table.
All ABI * package combinations
We want to show people that a package is not available on a given ABI. We also want all tables to have the same number of rows so that scanning across the page compares like-to-like. For that, we want all combinations please.
abi CROSS JOIN (SELECT DISTINCT package_name FROM pkg) pn
Here is a list of all packages beside all ABI:
WITH pkg AS (SELECT * FROM packages WHERE port_id = 28303) SELECT * FROM abi CROSS JOIN (SELECT DISTINCT package_name FROM pkg) pn id | name | active | package_name ----+--------------------+--------+---------------------- 1 | FreeBSD:12:amd64 | t | py27-django-storages 2 | FreeBSD:13:aarch64 | t | py27-django-storages 4 | FreeBSD:11:i386 | t | py27-django-storages 6 | FreeBSD:11:amd64 | t | py27-django-storages 7 | FreeBSD:11:aarch64 | t | py27-django-storages 8 | FreeBSD:12:i386 | t | py27-django-storages 9 | FreeBSD:12:aarch64 | t | py27-django-storages 10 | FreeBSD:13:i386 | t | py27-django-storages 11 | FreeBSD:13:amd64 | t | py27-django-storages 1 | FreeBSD:12:amd64 | t | py36-django-storages 2 | FreeBSD:13:aarch64 | t | py36-django-storages 4 | FreeBSD:11:i386 | t | py36-django-storages 6 | FreeBSD:11:amd64 | t | py36-django-storages 7 | FreeBSD:11:aarch64 | t | py36-django-storages 8 | FreeBSD:12:i386 | t | py36-django-storages 9 | FreeBSD:12:aarch64 | t | py36-django-storages 10 | FreeBSD:13:i386 | t | py36-django-storages 11 | FreeBSD:13:amd64 | t | py36-django-storages 1 | FreeBSD:12:amd64 | t | py37-django-storages 2 | FreeBSD:13:aarch64 | t | py37-django-storages 4 | FreeBSD:11:i386 | t | py37-django-storages 6 | FreeBSD:11:amd64 | t | py37-django-storages 7 | FreeBSD:11:aarch64 | t | py37-django-storages 8 | FreeBSD:12:i386 | t | py37-django-storages 9 | FreeBSD:12:aarch64 | t | py37-django-storages 10 | FreeBSD:13:i386 | t | py37-django-storages 11 | FreeBSD:13:amd64 | t | py37-django-storages (27 rows)
With 9 rows in the ABI table and 3 distinct package names we get 27 rows.
I will refer to this as the abi_x_package_name data.
LEFT JOIN
We have a list of all the
Now we do a LEFT JOIN. That gets us all the abi_x_package_name combinations we need with, or without, the corresponding package information.
LEFT JOIN pkg on (pkg.abi_id = abi.id AND pkg.package_name = pn.package_name)
This takes the entries from pkg and joins it with everything from the abi table.
This takes us to the entire query now:
WITH pkg AS (SELECT * FROM packages WHERE port_id = 28303) SELECT abi.name AS abi, pn.package_name, pkg.package_set, pkg.package_version FROM abi CROSS JOIN (SELECT DISTINCT package_name FROM pkg) pn LEFT JOIN pkg on (pkg.abi_id = abi.id AND pkg.package_name = pn.package_name) ORDER BY package_name, abi, package_set; abi | abi_id | package_name | package_set | package_version --------------------+--------+----------------------+-------------+----------------- FreeBSD:11:aarch64 | 7 | py27-django-storages | latest | 1.5.1 FreeBSD:11:aarch64 | 7 | py27-django-storages | quarterly | 1.9.1 FreeBSD:11:amd64 | 6 | py27-django-storages | latest | 1.9.1 FreeBSD:11:amd64 | 6 | py27-django-storages | quarterly | 1.9.1 FreeBSD:11:i386 | 4 | py27-django-storages | latest | 1.9.1 FreeBSD:11:i386 | 4 | py27-django-storages | quarterly | 1.9.1 FreeBSD:12:aarch64 | 9 | py27-django-storages | latest | 1.5.1 FreeBSD:12:aarch64 | 9 | py27-django-storages | quarterly | 1.9.1 FreeBSD:12:amd64 | 1 | py27-django-storages | latest | 1.9.1 FreeBSD:12:amd64 | 1 | py27-django-storages | quarterly | 1.9.1 FreeBSD:12:i386 | 8 | py27-django-storages | latest | 1.9.1 FreeBSD:12:i386 | 8 | py27-django-storages | quarterly | 1.9.1 FreeBSD:13:aarch64 | 2 | py27-django-storages | latest | 1.8 FreeBSD:13:amd64 | 11 | py27-django-storages | latest | 1.9.1 FreeBSD:13:i386 | 10 | py27-django-storages | latest | 1.9.1 FreeBSD:11:aarch64 | 7 | py36-django-storages | latest | 1.5.1 FreeBSD:11:amd64 | 6 | py36-django-storages | | FreeBSD:11:i386 | 4 | py36-django-storages | | FreeBSD:12:aarch64 | 9 | py36-django-storages | latest | 1.5.1 FreeBSD:12:amd64 | 1 | py36-django-storages | | FreeBSD:12:i386 | 8 | py36-django-storages | | FreeBSD:13:aarch64 | 2 | py36-django-storages | | FreeBSD:13:amd64 | 11 | py36-django-storages | | FreeBSD:13:i386 | 10 | py36-django-storages | | FreeBSD:11:aarch64 | 7 | py37-django-storages | quarterly | 1.9.1 FreeBSD:11:amd64 | 6 | py37-django-storages | latest | 1.9.1 FreeBSD:11:amd64 | 6 | py37-django-storages | quarterly | 1.9.1 FreeBSD:11:i386 | 4 | py37-django-storages | latest | 1.9.1 FreeBSD:11:i386 | 4 | py37-django-storages | quarterly | 1.9.1 FreeBSD:12:aarch64 | 9 | py37-django-storages | quarterly | 1.9.1 FreeBSD:12:amd64 | 1 | py37-django-storages | latest | 1.9.1 FreeBSD:12:amd64 | 1 | py37-django-storages | quarterly | 1.9.1 FreeBSD:12:i386 | 8 | py37-django-storages | latest | 1.9.1 FreeBSD:12:i386 | 8 | py37-django-storages | quarterly | 1.9.1 FreeBSD:13:aarch64 | 2 | py37-django-storages | latest | 1.8 FreeBSD:13:amd64 | 11 | py37-django-storages | latest | 1.9.1 FreeBSD:13:i386 | 10 | py37-django-storages | latest | 1.9.1 (37 rows) freshports.dev=#
From here, we start pivoting the data so we have latest and quarterly on the same line.
Pivot
By now, we have gone through the entire query. Next is the pivot.
WITH pkg AS (SELECT * FROM packages WHERE port_id = 28303) SELECT abi.name AS abi, pn.package_name, max(pkg.package_version) FILTER (WHERE pkg.package_set = 'latest') AS package_version_latest, max(pkg.package_version) FILTER (WHERE pkg.package_set = 'quarterly') AS package_version_quarterly FROM abi CROSS JOIN (SELECT DISTINCT package_name FROM pkg) pn LEFT JOIN pkg ON (pkg.abi_id = abi.id AND pkg.package_name = pn.package_name) GROUP BY pn.package_name, abi.name ORDER BY pn.package_name, abi.name;
The filter controls which input rows are fed to the aggregate function max. I could have use min(), or any other similar function here because there is only one value which will match it. The GROUP BY clause is what controls this. For each package_name and abi.name combination, there will be 1 or two lines.
The output is the expected 27 lines of information:
abi | package_name | package_version_latest | package_version_quarterly --------------------+----------------------+------------------------+--------------------------- FreeBSD:11:aarch64 | py27-django-storages | 1.5.1 | 1.9.1 FreeBSD:11:amd64 | py27-django-storages | 1.9.1 | 1.9.1 FreeBSD:11:i386 | py27-django-storages | 1.9.1 | 1.9.1 FreeBSD:12:aarch64 | py27-django-storages | 1.5.1 | 1.9.1 FreeBSD:12:amd64 | py27-django-storages | 1.9.1 | 1.9.1 FreeBSD:12:i386 | py27-django-storages | 1.9.1 | 1.9.1 FreeBSD:13:aarch64 | py27-django-storages | 1.8 | FreeBSD:13:amd64 | py27-django-storages | 1.9.1 | FreeBSD:13:i386 | py27-django-storages | 1.9.1 | FreeBSD:11:aarch64 | py36-django-storages | 1.5.1 | FreeBSD:11:amd64 | py36-django-storages | | FreeBSD:11:i386 | py36-django-storages | | FreeBSD:12:aarch64 | py36-django-storages | 1.5.1 | FreeBSD:12:amd64 | py36-django-storages | | FreeBSD:12:i386 | py36-django-storages | | FreeBSD:13:aarch64 | py36-django-storages | | FreeBSD:13:amd64 | py36-django-storages | | FreeBSD:13:i386 | py36-django-storages | | FreeBSD:11:aarch64 | py37-django-storages | | 1.9.1 FreeBSD:11:amd64 | py37-django-storages | 1.9.1 | 1.9.1 FreeBSD:11:i386 | py37-django-storages | 1.9.1 | 1.9.1 FreeBSD:12:aarch64 | py37-django-storages | | 1.9.1 FreeBSD:12:amd64 | py37-django-storages | 1.9.1 | 1.9.1 FreeBSD:12:i386 | py37-django-storages | 1.9.1 | 1.9.1 FreeBSD:13:aarch64 | py37-django-storages | 1.8 | FreeBSD:13:amd64 | py37-django-storages | 1.9.1 | FreeBSD:13:i386 | py37-django-storages | 1.9.1 | (27 rows)
Why 27? 3 package names, 9 ABI.
Much better
This is a much better approach than the procedures I used before. I like it.