A recent commit to the FreeBSD ports tree brought in a new category: filesystems.
See also the followup to this blog post: Fixing the category creation code
FreshPorts has processed this commit, but the results are only partially good. For example, this screenshot shows the deleted ports (moved from their old categories to the new category):
Scrolling down, I found the references to the new category but they are all listed as individual files. This tells me the system does not recognize them as existing within a known category. Conclusion: the data is there, just not yet complete.
If you click on one of those links, it takes you to a non-port page:
Confirming my ideas
So far, it seems like filesystems was not added to the categories table.
Let’s look at that table:
freshports.dvl=# select * from categories order by name;
 id  | is_primary | element_id |       name       |                        description
-----+------------+------------+------------------+------------------------------------------------------------
  85 | t          |     171607 | accessibility    | Ports to help disabled users.
  64 | f          |            | afterstep        | Ports to support the AfterStep window manager.
  88 | t          |     159346 | arabic           | Ported software for the Arabic market.
  23 | t          |        350 | archivers        | Utilities for archiving and unarchiving data.
  26 | t          |        410 | astro            | Applications related to astronomy.
  25 | t          |        386 | audio            | Audio utilities - most require a supported sound card.
  42 | t          |       2710 | benchmarks       | Utilities for measuring system performance.
  36 | t          |        869 | biology          | Software related to biology.
 136 | f          |            | budgie           | no description supplied
  35 | t          |        830 | cad              | Computer Aided Design utilities.
  39 | t          |       1660 | chinese          | Ported software for the Chinese market.
  41 | t          |       2191 | comms            | Communications utilities.
  27 | t          |        423 | converters       | Format conversion utilities.
  32 | t          |        582 | databases        | Database software.
  33 | t          |        802 | deskutils        | Various Desktop utilities.
  10 | t          |         84 | devel            | Software development utilities and libraries.
  84 | t          |     148762 | dns              | DNS client and server utilities.
 118 | f          |            | docs             | no description supplied
   1 | t          |          2 | editors          | Common text editors.
 135 | f          |            | education        | no description supplied
  63 | f          |            | elisp            | Things related to Emacs Lisp.
  22 | t          |        245 | emulators        | Utilities for emulating other OS types.
 119 | f          |            | enlightenment    | no description supplied
 125 | f          |            | erlang           | no description supplied
  54 | t          |     118514 | finance          | Monetary, financial and related applications.
  47 | t          |      16545 | french           | Ported software for French countries.
  13 | t          |        140 | ftp              | FTP client and server utilities.
   3 | t          |         18 | games            | Various and sundry amusements.
 115 | f          |            | geography        | Geography related ports.
  44 | t          |       3747 | german           | Ported software for Germanic countries.
  58 | f          |            | gnome            | Components of the Gnome Desktop environment.
 101 | f          |            | gnustep          | Software for GNUstep desktop environment.
   4 | t          |         29 | graphics         | Graphics libraries and utilities.
  96 | f          |            | hamradio         | Software for amateur radio.
  77 | f          |            | haskell          | Software related to the Haskell language.
  46 | t          |      11329 | hebrew           | Ported software for Hebrew language.
  51 | t          |     118517 | hungarian        | Ported software for the Hungarian market.
  62 | f          |            | ipv6             | IPv6 related software.
   6 | t          |         39 | irc              | Internet Relay Chat utilities.
  12 | t          |        129 | japanese         | Ported software for the Japanese market.
  34 | t          |        815 | java             | Java language support.
  55 | f          |            | kde              | Software for the K Desktop Environment.
 131 | f          |            | kde-applications | no description supplied
 137 | f          |            | kde-devel        | no description supplied
 129 | f          |            | kde-frameworks   | no description supplied
 128 | f          |            | kde-kde4         | no description supplied
 133 | f          |            | kde-plasma       | no description supplied
 114 | f          |            | kld              | Kernel loadable modules.
  37 | t          |       1109 | korean           | Ported software for the Korean market.
  15 | t          |        171 | lang             | Computer languages.
  66 | f          |            | linux            | Linux programs that can be run under binary compatibility.
  90 | f          |            | lisp             | Things related to pure lisp.
  19 | t          |        201 | mail             | Electronic mail packages and utilities.
 124 | f          |            | mate             | no description supplied
  16 | t          |        176 | math             | Mathematical computation software.
  45 | t          |       6412 | mbone            | Applications and utilities for the MBONE.
   7 | t          |         42 | misc             | Miscellaneous utilities.
  52 | t          |     118520 | multimedia       | Multimedia software.
   8 | t          |         50 | net              | Networking utilities.
  95 | t          |     229588 | net-im           | Instant messaging software.
  92 | t          |     173566 | net-mgmt         | Network management utilities.
  98 | t          |     236506 | net-p2p          | Peer to peer networking software.
 134 | f          |            | net-vpn          | no description supplied
  17 | t          |        179 | news             | USENET News support software.
  76 | f          |            | offix            | This is a virtual category. No description is available.
  40 | t          |       2143 | palm             | Software support for the Palm(tm) series.
  93 | f          |            | paralell         | This is a virtual category. No description is available.
  68 | f          |            | parallel         | Applications dealing with parallelism in computing.
  89 | f          |            | pear             | Utilities/modules that fall into the PEAR system.
  94 | f          |            | perl             | This is a virtual category. No description is available.
Nope, it’s not there. At all. Note the entries with ‘no description supplied’ – that might come up later. They are all virtual categories.
Did the tree entry for filesystems get created?
freshports.dvl=# select *, element_pathname(id) from element where name = 'filesystems';
   id    |    name     | parent_id | directory_file_flag | status |                         element_pathname
---------+-------------+-----------+---------------------+--------+------------------------------------------------------------------
  134583 | filesystems |    134566 | F                   | A      | /base/head/contrib/file/Magdir/filesystems
  972964 | filesystems |    972846 | F                   | A      | /base/stable/12/contrib/file/magic/Magdir/filesystems
  312102 | filesystems |    200655 | D                   | A      | /doc/head/zh_CN.GB2312/books/handbook/filesystems
  309639 | filesystems |     77344 | D                   | A      | /doc/head/en_US.ISO8859-1/books/handbook/filesystems
  309644 | filesystems |     77691 | D                   | A      | /doc/head/de_DE.ISO8859-1/books/handbook/filesystems
  309723 | filesystems |    271177 | D                   | A      | /doc/head/mn_MN.UTF-8/books/handbook/filesystems
  309840 | filesystems |    300813 | D                   | A      | /doc/head/hu_HU.ISO8859-2/books/handbook/filesystems
  310134 | filesystems |    288608 | D                   | A      | /doc/head/el_GR.ISO8859-7/books/handbook/filesystems
...
 1225573 | filesystems |   1220819 | D                   | A      | /doc/head/documentation/content/zh-cn/books/handbook/filesystems
 1225694 | filesystems |   1218497 | D                   | A      | /doc/head/documentation/content/zh-tw/books/handbook/filesystems
 1439631 | filesystems |    464087 | D                   | A      | /ports/head/filesystems
(68 rows)
Yes, it’s 1439631. Does that element have any children?
freshports.dvl=# select *, element_pathname(id) from element where parent_id = 1439631;
   id    |         name         | parent_id | directory_file_flag | status |               element_pathname
---------+----------------------+-----------+---------------------+--------+----------------------------------------------
 1439632 | Makefile             |   1439631 | F                   | A      | /ports/head/filesystems/Makefile
 1439633 | R-cran-fs            |   1439631 | D                   | A      | /ports/head/filesystems/R-cran-fs
 1439637 | acfgfs               |   1439631 | D                   | A      | /ports/head/filesystems/acfgfs
...
 1440492 | zrepl-dsh2dsh        |   1439631 | D                   | A      | /ports/head/filesystems/zrepl-dsh2dsh
 1440499 | zrepl                |   1439631 | D                   | A      | /ports/head/filesystems/zrepl
 1440514 | zxfer                |   1439631 | D                   | A      | /ports/head/filesystems/zxfer
(141 rows)
Yes. Yes it does. So all the raw elements are there. Good.
When/where/how are new categories created?
FreshPorts is fairly big. I don’t keep it all in my head. Eventually I found it:
[17:10 pg03 dvl ~/src/freshports/database-schema] % grep -ri 'into categories' *
freshports.wam:Ports are divided into categories.\
sp.txt: INSERT INTO categories (id, is_primary, element_id, name, description)
sp.txt: INSERT INTO categories (id, is_primary, name, description)
Ahh, there it is, in the stored procedures.
CREATE OR REPLACE FUNCTION CreateCategory(text, text, bool) returns int4 AS $$
  DECLARE
    category_name        ALIAS for $1;
    category_description ALIAS for $2;
    category_is_primary  ALIAS for $3;

    pathname            text;
    category_element_id int4;
    category_id         int4;
  BEGIN
    category_id := nextval('categories_id_seq');

    IF category_is_primary THEN
      pathname := 'ports/' || category_name;
      category_element_id := Pathname_ID(pathname);
      IF category_element_id is NULL THEN
        category_element_id := Element_Add(pathname, 'D');
      END IF;

      INSERT INTO categories (id, is_primary, element_id, name, description)
        values (category_id, category_is_primary, category_element_id, category_name, category_description);
    ELSE
      INSERT INTO categories (id, is_primary, name, description)
        values (category_id, category_is_primary, category_name, category_description);
    END IF;

    return category_id;
  END;
$$ LANGUAGE 'plpgsql';
Where is that stored procedure invoked?
[17:11 pg03 dvl ~/src/freshports/database-schema] % grep -ri CreateCategory *
ri.txt: CategoryID := CreateCategory(Category, 'no description supplied', FALSE);
sp.txt:CREATE OR REPLACE FUNCTION CreateCategory(text, text, bool) returns int4 AS $$
sp.txt: CategoryID := CreateCategory(Category, 'no description supplied', FALSE);
From both triggers (ri.txt) and procedures (sp.txt).
Triggers
When a new entry is added to the ports table, the ports_categories table is updated (to indicate what categories that port is in). If the category does not exist, it is created.
Idea: perhaps this is where we have fallen down: we didn’t add anything to the ports table because these are non-ports….
This is the trigger:
CREATE OR REPLACE FUNCTION ports_categories_set() RETURNS TRIGGER AS $$
  DECLARE
    CategoryCount integer;
    CategoryNames text;
    Category      text;
    CategoryID    int8;
    UpdateNeeded  boolean;
  BEGIN
    DELETE FROM ports_categories
     WHERE port_id = new.id;

    IF new.categories IS NOT NULL THEN
      CategoryCount := 0;
      CategoryNames := new.categories;
      LOOP
        CategoryCount := CategoryCount + 1;
        Category := split_part(CategoryNames, ' ', CategoryCount);
        -- let us not get carried away....
        IF Category = '' OR CategoryCount > 100 THEN
          EXIT;
        END IF;

        SELECT id
          INTO CategoryID
          FROM categories
         WHERE name = Category;

        IF NOT FOUND THEN
          RAISE NOTICE ' we need to create category % ', Category;
          CategoryID := CreateCategory(Category, 'no description supplied', FALSE);
        END IF;

        INSERT INTO ports_categories (port_id, category_id)
          SELECT ports.id, CategoryID
            FROM ports
           WHERE ports.id = new.id
             AND NOT exists (SELECT *
                               FROM ports_categories
                              WHERE port_id = new.id
                                AND category_id = CategoryID);
      END LOOP;
    END IF;

    RETURN NEW;
  END
$$ LANGUAGE 'plpgsql';
The above trigger gets invoked by:
DROP TRIGGER IF EXISTS ports_ports_categories ON ports;
CREATE TRIGGER ports_ports_categories
  AFTER INSERT OR UPDATE ON ports
  FOR EACH ROW
  EXECUTE PROCEDURE ports_categories_set();
That seems reasonable.
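To make the trigger's behaviour concrete, here is a minimal sketch in Python (not PL/pgSQL, and not FreshPorts code) that models the same steps: delete the port's existing links, walk the space-separated categories value, and create any missing category as non-primary with the placeholder description. The in-memory dict and list stand in for the categories and ports_categories tables, and the ids and port number are illustrative assumptions.

```python
DEFAULT_DESCRIPTION = "no description supplied"


def create_category(categories, name, description, is_primary):
    """Stand-in for CreateCategory(): add a row and return its id."""
    # max()+1 is a stand-in for nextval('categories_id_seq')
    new_id = max((row["id"] for row in categories.values()), default=0) + 1
    categories[name] = {
        "id": new_id,
        "is_primary": is_primary,
        "description": description,
    }
    return new_id


def ports_categories_set(categories, links, port_id, port_categories):
    """Model of the trigger: rebuild the category links for one port."""
    # DELETE FROM ports_categories WHERE port_id = new.id
    links[:] = [link for link in links if link[0] != port_id]
    if port_categories is None:
        return
    for count, name in enumerate(port_categories.split(" "), start=1):
        # split_part() yields '' past the last field; 100 is the safety cap
        if name == "" or count > 100:
            break
        if name not in categories:
            # the trigger always passes FALSE, so it only creates virtual categories
            create_category(categories, name, DEFAULT_DESCRIPTION, False)
        link = (port_id, categories[name]["id"])
        if link not in links:
            links.append(link)


# Illustrative data only: one known category, one port naming two.
categories = {
    "math": {"id": 16, "is_primary": True,
             "description": "Mathematical computation software."},
}
links = []
ports_categories_set(categories, links, port_id=1, port_categories="math rubygems")
print(links)
print(categories["rubygems"])
```

The point of the model: a category created this way is never primary, which matches the 'no description supplied' rows seen earlier, and nothing happens at all unless a row is written to the ports table.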
Now for the stored procedures
This is where I found CreateCategory() in sp.txt.
Notice how the trigger and the procedure both have the same name: ports_categories_set – that gave me some concern.
Of more concern, the following procedure seems to be a trigger and references trigger-like things: e.g. IF TG_OP = 'UPDATE' THEN
What’s going on?
CREATE OR REPLACE FUNCTION public.ports_categories_set()
 RETURNS trigger
 LANGUAGE plpgsql
AS $$
  DECLARE
    CategoryCount integer;
    CategoryNames text;
    Category      text;
    CategoryID    int8;
    UpdateNeeded  boolean;
  BEGIN
    UpdateNeeded := TRUE;
    IF TG_OP = 'UPDATE' THEN
      IF old.categories = new.categories THEN
        UpdateNeeded := FALSE;
      END IF;
    END IF;

    -- RAISE NOTICE 'ports_categories_set() invoked for % ', element_pathname(new.element_id);
    -- RAISE NOTICE 'categories is % ', new.categories;

    IF UpdateNeeded THEN
      DELETE FROM ports_categories
       WHERE port_id = new.id;

      IF new.categories IS NOT NULL THEN
        CategoryCount := 0;
        CategoryNames := new.categories;
        LOOP
          CategoryCount := CategoryCount + 1;
          Category := split_part(CategoryNames, ' ', CategoryCount);
          -- let us not get carried away....
          IF Category = '' OR CategoryCount > 100 THEN
            EXIT;
          END IF;

          SELECT id
            INTO CategoryID
            FROM categories
           WHERE name = Category;

          IF NOT FOUND THEN
            RAISE NOTICE ' we need to create category % ', Category;
            CategoryID := CreateCategory(Category, 'no description supplied', FALSE);
          END IF;

          -- RAISE NOTICE 'adding entry for % ', Category;
          INSERT INTO ports_categories (port_id, category_id)
            SELECT ports.id, CategoryID
              FROM ports
             WHERE ports.id = new.id
               AND NOT exists (SELECT *
                                 FROM ports_categories
                                WHERE port_id = new.id
                                  AND category_id = CategoryID);
          -- RAISE NOTICE ' category % is % with id % for port id %', CategoryCount, Category, CategoryID, new.id;
        END LOOP;
      END IF;
    END IF;

    RETURN NEW;
  END
$$;
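Comparing the two procedures, they achieve the same goal. I think the sp.txt function should be deleted. Its only difference is the TG_OP test, which skips the work on an UPDATE that leaves categories unchanged; on INSERT, UpdateNeeded stays TRUE and the full body still runs. Both versions therefore behave the same for a newly inserted port, so this difference alone does not explain our situation.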
Background
When I refresh the database, I always run the scripts in this order:
\i ri.txt
\i sp.txt
I would expect the sp.txt version of the function to overwrite what is in ri.txt.
However, that is not consistent with the output of:
pg_dump --schema-only freshports.dvl > ~/tmp/freshports.dvl.sql
When I look at the output, I see the ri.txt version (i.e. no IF TG_OP = 'UPDATE' THEN).
As for why we’re in this situation, I can only guess that I ran the scripts the other way around.
Looking into the other files in this directory, yes, I did just that:
[19:24 pg03 dvl ~/src/freshports/database-schema] % cat updates-2024-06-22.README.txt
-- To upgrade the FreshPorts database to user_cookie table:

begin;
\i datatype.txt
\i updates-2024-06-22.ddl
\i updates-2024-06-22.permissions.for.user_cookies
\i updates-2024-06-22.sql
\i sp.txt
\i ri.txt
That explains why the database does not have the IF TG_OP = 'UPDATE' THEN version of the code, which I don’t think we want as a trigger anyway. I have not checked all the databases, but I’m confident that is what I will find.
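Both files define the function with CREATE OR REPLACE, so whichever script is loaded last wins. A small Python model of that rule, with the two definitions reduced to labels (the labels and the helper name are made up for illustration):

```python
def load_scripts(order, scripts):
    """Apply CREATE OR REPLACE semantics: a later definition replaces an earlier one."""
    functions = {}
    for script in order:
        for name, version in scripts[script].items():
            functions[name] = version
    return functions


# Each script is reduced to the one function both of them define.
scripts = {
    "ri.txt": {"ports_categories_set": "ri.txt version (no TG_OP test)"},
    "sp.txt": {"ports_categories_set": "sp.txt version (with TG_OP test)"},
}

usual = load_scripts(["ri.txt", "sp.txt"], scripts)    # my usual refresh order
readme = load_scripts(["sp.txt", "ri.txt"], scripts)   # order in the 2024-06-22 README

print(usual["ports_categories_set"])
print(readme["ports_categories_set"])
```

With the README order, the ri.txt definition is the one left in the database, which agrees with the pg_dump output.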
Did any ports get created?
Did anything get added to the ports table?
No, I think not.
freshports.dvl=# select name, categories from ports_active order by id desc limit 10;
             name             |     categories
------------------------------+---------------------
 py-stig                      | net-p2p
 py-powa-collector            | databases python
 libskiasharp                 | graphics
 lgogdownloader               | games
 gcr3                         | security gnome
 py-youtube-transcript-api    | www python
 libfobos                     | comms hamradio
 py-pangocffi                 | x11-toolkits python
 rubygem-doorkeeper57-rails70 | security rubygems
 rubygem-cmath                | math rubygems
(10 rows)

freshports.dvl=#
I also checked via:
freshports.dvl=# select * from ports where categories ilike '%filesystems%';
Nothing.
Time to review the log
The file in question is ~freshports/message-queues/recent/*6e2da9672f79f44048d597f0f61e4646cdeade9d*.log
…
I spent an hour or so scrolling through the 592162 lines logged for this commit. That’s a lot…
My conclusion: the trigger above recognizes when a new category is referenced by a port. However, that’s not when it needs to happen.
There is a section of the code which compiles a list of ports which have been touched by this commit. From modules/verifyport.pm:
print "STARTING _CompileListOfPorts ................................\n";
print "for a commit on 'branch': '$CommitBranch'\n";

foreach $value (@{$Files}) {
Let’s compare.
Comparing success and fail logs
I think that’s what I need to concentrate on. Consider these logs from the recent creation of net-p2p/py-stig:
this commit is on head
FILE ==: Add, /ports/head/net-p2p/py-stig/Makefile, 23c4ba88d11fb5c1a9322d4f67cb9b40f64698f6, ports, net-p2p, py-stig, Makefile, 5415041
YES, this file is in the ports tree
checking for category='net-p2p'
sql = "select * from categories where name = 'net-p2p'"
Category net-p2p has ID = 98
checking for port='net-p2p/py-stig'
* * * not found in existing cache. we'll have to load/create that port!
fetching by _FetchElementIDByPartialPathName: '/ports/head/net-p2p/py-stig'
sql = 'select Pathname_ID('/ports/head/net-p2p/py-stig')'
Element::FetchByName - here is what that SQL returned
Element::FetchByName found: 1438426
done.... I found this element id for that pathname: 1438426
sql = 'SET CLIENT_ENCODING TO 'SQL_ASCII'; select ports.*, categories.name as category, element.name as name, element_pathname(element.id, FALSE) as element_pathname from ports, categories, element where ports.element_id = 1438426 and ports.category_id = categories.id and ports.element_id = element.id'
port not retrieved with /ports/head/net-p2p/py-stig. This must be a new port.
SETTING CATEGORY = 98
I want to look at the code and find out why that loop contains nothing for /filesystems/.
In comparison, the filesystems commit contains this:
this commit is on head
FILE ==: Add, /ports/head/filesystems/Makefile, 6e2da9672f79f44048d597f0f61e4646cdeade9d, ports, filesystems, Makefile, 5416447
YES, this file is in the ports tree
... but this file is not part of a physical category on disk!

this commit is on head
FILE ==: Add, /ports/head/filesystems/R-cran-fs/Makefile, 6e2da9672f79f44048d597f0f61e4646cdeade9d, ports, filesystems, R-cran-fs, Makefile, 5416448
YES, this file is in the ports tree
... but this file is not part of a physical category on disk!

this commit is on head
FILE ==: Add, /ports/head/filesystems/R-cran-fs/distinfo, 6e2da9672f79f44048d597f0f61e4646cdeade9d, ports, filesystems, R-cran-fs, distinfo, 5416449
YES, this file is in the ports tree
... but this file is not part of a physical category on disk!
That’s the cause. The code needs to recognize that it should create a new category and proceed.
Here’s the code in question, based on that message, from modules/verifyport.pm.
if ($subtree eq $FreshPorts::Config::ports_prefix && defined($category_name) && defined($port_name)) {
  print "YES, this file is in the ports tree\n";

  # if this is a valid category
  if ( any {/$category_name/} @FreshPorts::Categories::categories ) {
    # find the port for this filename....
    if ($ListOfPorts{"$category_name/$port_name"}) {
      print "but we have already seen the port $category_name/$port_name\n\n";
      # we've already added this port to the list of ports for this commit
    } else {
      #
      # check that the category exists. and the port.
      # But we don't create any ports yet.
      # we do that, if necessary, later.
      #
      print "checking for category='$category_name'\n";
      $category = $CategoriesChecked{$category_name};
      if (!defined($category)) {
        $category = FreshPorts::Category->new($dbh);
        $category->{name} = $category_name;
        my $category_id = $category->FetchByName();
        if (defined($category_id)) {
          print "Category $category_name has ID = $category_id\n";
        } else {
          # we need to create this catgory.
          # remember to grab ports/<category>/pkg/COMMENT
          # actually, it's in ports/<category>/Makefile as a COMMENT
          print "creating new category $category_name\n";
          FreshPorts::Utilities::ReportError('warning', "creating new category $category_name", 0);
          $category->{is_primary} = 1;
          $category_id = $category->save();
          if (!defined($category_id)) {
            FreshPorts::Utilities::ReportError('warning', "failed to create new category $category_name", 1);
          }
          $CategoriesChecked{$category_name} = $category;
        }
      } else {
        print "found that category $category_name in the cache\n";
      }

      print "checking for port='$category_name/$port_name'\n";
      $port = $ListOfPorts{"$category_name/$port_name"};

      # we won't create a new port based on "cat/port", because that could be a file in the category's directory.
      # instead, we want to ensure that "cat/port" refers to a directory, versus a file.
      # such a situation exists if $extra has some value.
      # see 201205251025.q4PAPOvV092118@repoman.freebsd.org where 'deskutils/svn.log' was accidently added
      # in a previous commit, and then removed. The previous code would add svn.log as a port, and then
      # add it as an element and a port. See _RecordPortsAndElements() where svn.log would be listed in
      # both CommitLogPorts and Files.
      #
      if (defined($extra)) {
        if (!$port) {
          print "* * * not found in existing cache. we'll have to load/create that port!\n";
          $port = FreshPorts::Port->new($dbh, $RepoType);

          # this is all that's needed to retrieve a port which exists
          if ($CommitBranch eq $FreshPorts::Constants::HEAD) {
            $port->{partialpathname} = "/$subtree/$branch/$category_name/$port_name";
          } else {
            $port->{partialpathname} = "/$subtree/branches/$branch/$category_name/$port_name";
          }
          $port->FetchByPartialPathName();
          #
          # the above fetch may have failed.
          # in which case, $port->{id} will not be defined
          # we will take advantage of that later.
          # for now, all we want is a complete list of ports.
          #
          if (!defined($port->{id})) {
            print "port not retrieved with $port->{partialpathname}. This must be a new port.\n";
            #
            # these are the values needed to create a new port
            #
            $port->{category_id} = $category->{id};
            $port->{name} = $port_name;
            $port->{category} = $category_name;
            #
            # we are creating a new port (probably), so we make it active.
            # we need this set for later use.
            #
            $port->SetActive();
          }
          print "SETTING CATEGORY = $port->{category_id}\n";
          $ListOfPorts{"$category_name/$port_name"} = $port;
        } else {
          print "found that port $category_name/$port_name in the cache\n";
        }

        #
        # $port now contains the port for this file.
        # let's adjust the needs_refresh value.
        #

        #
        # if we just deleted the Makefile for this port, there's no sense in refreshing the port.
        # because it's been deleted.
        #
        if ($extra eq $FreshPorts::Constants::FILE_MAKEFILE && ($action eq $FreshPorts::Constants::REMOVE || $action eq $FreshPorts::Constants
          #
          # EDIT 2020-07-30 - for git processing, we want to delete the parent of $extra when we detect that the
          # port Makefile is being deleted. see https://news.freshports.org/2020/07/29/git-changing-libraries-gave-us-new-xml-options/
          # This should be straight forward.
          #
          # we are deleted (local value, never actually saved to db)
          #
          # instead of settting $port->{deleted} = 1;, try this: re https://github.com/FreshPorts/freshports/issues/528
          $port->SetDeleted();
          print "THIS PORT HAS BEEN DELETED\n";
        }
      } else {
        print "\$extra is not defined, therefore, this is not considered a port.\n";
      }
    }
  } else {
    print "... but this file is not part of a physical category on disk!\n\n";
  }
} else {
  print "that file isn't in the ports tree\n";
}
In short, the problem is:
- the code notices it’s not a valid category
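- it aborts and ignores the port entirely
- it looks like creating the category and proceeding should solve the problem entirely
- the test is the any {/$category_name/} check near the top of that code, and the error message is the "not part of a physical category on disk" print near the bottom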
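The effect of that gate can be modelled in a few lines of Python. This is a sketch under assumptions, not the FreshPorts code: the known list is a hand-picked sample, the function names are made up, and the real list is populated elsewhere (the error message suggests from category directories on disk).

```python
import re


def is_known_category(category_name, known_categories):
    """Model of: any {/$category_name/} @FreshPorts::Categories::categories"""
    # Perl interpolates the name into an unanchored regex; re.search does the same
    return any(re.search(category_name, known) for known in known_categories)


def split_by_gate(files, known_categories):
    """Return (accepted, skipped) lists of category/port names."""
    accepted, skipped = [], []
    for category_name, port_name in files:
        if is_known_category(category_name, known_categories):
            accepted.append(f"{category_name}/{port_name}")
        else:
            # nothing below the gate runs, so no category or port is created
            skipped.append(f"{category_name}/{port_name}")
    return accepted, skipped


known = ["databases", "graphics", "net-p2p"]
files = [
    ("net-p2p", "py-stig"),
    ("filesystems", "R-cran-fs"),
    ("filesystems", "zrepl"),
]
accepted, skipped = split_by_gate(files, known)
print(accepted)
print(skipped)
```

Every file under filesystems lands in the skipped list, so the category-creation code further down never gets a chance to run. As a side observation, because the pattern is unanchored, a name such as net would also match net-p2p; that is not the cause of this problem, but it is worth keeping in mind when touching this test.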
Pseudo code for the fix
Here’s my idea:
if ( any {/$category_name/} @FreshPorts::Categories::categories ) {
  # we have a valid category - nothing to do now
} else {
  # create category
  # string comparison needs eq; == would compare the two values numerically
  if ($port_name eq $FreshPorts::Constants::FILE_MAKEFILE) {
    # yes, this is the Makefile for the new category - we are good to create a new category.
    # see the existing category-creation code above
    $category = FreshPorts::Category->new($dbh);
    $category->{name} = $category_name;
    print "creating new category $category_name\n";
    FreshPorts::Utilities::ReportError('warning', "creating new category $category_name", 0);
    $category->{is_primary} = 1;
    $category_id = $category->save();
    if (!defined($category_id)) {
      FreshPorts::Utilities::ReportError('warning', "failed to create new category $category_name", 1);
    }
  }
}
This is the first part of the solution. Now that I’m reading the category code, I see more changes are needed.
That will need another blog post.
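As a rough check of the idea, here is a Python model of the proposed flow for one commit. It is only a sketch: it assumes the category's own Makefile appears in the file list before the ports under it (as in the log excerpt above), it uses an exact name comparison rather than the regex test, and compile_ports and the tuple layout are invented for the example.

```python
MAKEFILE = "Makefile"  # stand-in for $FreshPorts::Constants::FILE_MAKEFILE


def compile_ports(files, known_categories):
    """Model of the proposed flow for one commit.

    files: (category_name, port_name, extra) tuples; extra is None when the
    path has no component below the category directory entry.
    """
    created = []
    ports = []
    for category_name, port_name, extra in files:
        if category_name not in known_categories:
            if port_name != MAKEFILE:
                continue  # unknown category and not its Makefile: skip as before
            # the category's own Makefile: create the category and carry on
            known_categories.add(category_name)
            created.append(category_name)
        if extra is None:
            continue  # a file directly in the category directory is not a port
        key = f"{category_name}/{port_name}"
        if key not in ports:
            ports.append(key)
    return created, ports


files = [
    ("filesystems", "Makefile", None),
    ("filesystems", "R-cran-fs", "Makefile"),
    ("filesystems", "R-cran-fs", "distinfo"),
]
created, ports = compile_ports(files, {"net-p2p"})
print(created)
print(ports)
```

In this model the category is created once, when its Makefile is seen, and the ports that follow are then picked up instead of being skipped.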