This post is a work-in-progress. This paragraph will be removed when it is completed.
This post is mostly for my own benefit. It will track the redesign of process_vuxml.pl which follows on from How FreshPorts processes vuxml entries. In that previous post, defects were identified and a proposed change put forth. In this post, I’ll document how I intend to proceed.
Temp table
I’m going to create a temp table, consisting of:
- vid
- hash (I will also refer to that as a checksum)
The data from security/vuxml/vuln.xml will be imported into that table.
freshports.dvl=# \d vuxml_import Table "public.vuxml_import" Column | Type | Collation | Nullable | Default ----------+------+-----------+----------+--------- vid | text | | not null | checksum | text | | not null | Indexes: "vuxml_import_vid" PRIMARY KEY, btree (vid) freshports.dvl=#
It took me most of an hour to get to this point. First, I had to grant access to from this laptop to the database server via port 5432 (pf firewall rules). Second, I had to modify the PostgreSQL pg_hba.conf rules to all this laptop to connect. Both of those items involved Ansible configuration changes.
2025-08-02
I tried to use the PostgreSQL copy command from within Perl, but kept getting
Aug 3 01:35:29 pg01 postgres[98711]: [11-1] 2025-08-03 01:35:29.923 UTC [98711] LOG: statement: COPY vuxml_import FROM '/tmp/W_PTHFuxT7' WITH (FORMAT TEXT, DELIMITE R ' ', HEADER false) Aug 3 01:35:29 pg01 postgres[98711]: [12-1] 2025-08-03 01:35:29.924 UTC [98711] ERROR: could not open file "/tmp/W_PTHFuxT7" for reading: No such file or directory Aug 3 01:35:29 pg01 postgres[98711]: [12-2] 2025-08-03 01:35:29.924 UTC [98711] HINT: COPY FROM instructs the PostgreSQL server process to read a file. You may want a client-side facility such as psql's \copy. Aug 3 01:35:29 pg01 postgres[98711]: [12-3] 2025-08-03 01:35:29.924 UTC [98711] STATEMENT: COPY vuxml_import FROM '/tmp/W_PTHFuxT7' WITH (FORMAT TEXT, DELIMITER ' ', HEADER false)
I tried psql, and it just worked:
freshports.dvl=> \copy vuxml_import from '/tmp/W_PTHFuxT7' WITH (FORMAT TEXT, DELIMITER ' ', HEADER false) ; COPY 6020 freshports.dvl=> select * from vuxml_import; vid | checksum --------------------------------------+------------------------------------------------------------------ b945ce3f-6f9b-11f0-bd96-b42e991fc52e | e3772ac353d275f309f2a10e7688d48e894f771e83b29ad041e05e6b298b1160 95480188-6ebc-11f0-8a78-bf201f293bce | 67af8f935b1f35da8a8393df00ebfdf2d90871526f7adbe3a815916b99c73d65 f51077bd-6dd7-11f0-9d62-b42e991fc52e | 28fa973c0debcbc915d9b901e6f8a7e234ab026a23dc0e4e188e82c08db7a83c cd7f969e-6cb4-11f0-97c4-40b034429ecf | 226b7557d10dde55b5e75b49339bcb41d27d6a1c19d3941d9e9a204951c19a45 c37f29ba-6ae3-11f0-b4bf-ecf4bbefc954 | f6933743541a61306a034b57fedc8d8537b7029fd694bcc0d91beb621bb60aa0 ...
Let’s see if I can get the SQL going next time I tackle this.
Trying a copy from perl
I’m already doing a copy from within python to load data into the database. I’m going to do the same thing from within perl.
Here’s the code I grabbed from examples I found:
# UNLINK = false is for debugging purposes my $fh_vid = File::Temp->new(UNLINK => 0, PERMS => 0644); my $fname_vid = $fh_vid->filename; # stuff here where I write to $fh_vid... $fh_vid->close(); my $file_path = "$fname_vid"; my $table_name = "vuxml_import"; my $truncate_sql = "TRUNCATE vuxml_import"; my $copy_sql = "COPY $table_name FROM '$file_path' WITH (FORMAT TEXT, HEADER false)"; print "\$copy_sql='$copy_sql\n"; eval { $dbh->do($truncate_sql); $dbh->do($copy_sql); $dbh->do($); }; if ($@) { warn "Error during COPY: $@"; $dbh->rollback(); # Rollback on error } $dbh->commit(); # Commit the transaction if successful
I’m typing this after-the-fact, so there may be errors in the above, but that’s the general approach.
What is important is the error I encountered (and posted):
DBD::Pg::db do failed: ERROR: could not open file "/tmp/vnhvGB2yN0" for reading: No such file or directory HINT: COPY FROM instructs the PostgreSQL server process to read a file. You may want a client-side facility such as psql's \copy. at ./process_vuxml_new.pl line 134.
What? I’m sure it’s there, see:
[11:46 dvl-ingress01 dvl ~/scripts] % ls -l /tmp/vnhvGB2yN0 -rw-r--r-- 1 freshports wheel 614040 2025.08.03 11:46 /tmp/vnhvGB2yN0
As I typed that post, I realized. It’s a server side copy. The code expects to file in question to be on the PostgreSQL database server. Yes, it says so right there in the docs:
“Files named in a COPY command are read or written directly by the server, not by the client application.
I think I’ve been spoilt by my past use of the copy_from function in https://github.com/psycopg/psycopg2 – that’s rather distinct from using COPY direct from within an SQL command.
Now I have to decide:
- Do I switch this script to Python?
- Or do I invoke psql from within perl?
I think I’ll do the latter.
The latter
I try this from within psql:
freshports.dvl=> \copy vuxml_import from '/tmp/vnhvGB2yN0' WITH (FORMAT TEXT, HEADER false); COPY 6020
It works.
Next step, tell the freshports user to do the same thing. Let’s echo that in:
I try this from the command line: silently does nothing. No errors. No clues.
[12:15 dvl-ingress01 dvl ~/scripts] % echo "\copy vuxml_import from '/tmp/vnhvGB2yN0' WITH (FORMAT TEXT, HEADER false);" | psql "sslmode=require host=pg01.int.unixathome.org user=commits_dvl dbname=freshports.dvl"
Back to basics: simple testing.
[12:17 dvl-ingress01 dvl ~/scripts] % echo "copy" copy [12:17 dvl-ingress01 dvl ~/scripts] % echo "\copy" [12:17 dvl-ingress01 dvl ~/scripts] % echo "\\copy" [12:17 dvl-ingress01 dvl ~/scripts] % echo "\\\copy" \copy
Oh, I need multiple escapes there.
This works:
[12:17 dvl-ingress01 dvl ~/scripts] % echo "\\\copy vuxml_import from '/tmp/vnhvGB2yN0' WITH (FORMAT TEXT, HEADER false);" | psql "sslmode=require host=pg01.int.unixathome.org user=commits_dvl dbname=freshports.dvl" COPY 6020 [12:18 dvl-ingress01 dvl ~/scripts] %
And over here, we have the data:
freshports.dvl=# select count(*) from vuxml_import; count ------- 6020 (1 row) freshports.dvl=#
Let’s do that from within perl
Here’s what I tried:
my $truncate_sql = "TRUNCATE vuxml_import"; # our goal is this: # echo "\\\copy vuxml_import from '/tmp/vnhvGB2yN0' WITH (FORMAT TEXT, HEADER false);" | psql "sslmode=require host=pg01.int.unixathome.org user=commits_dvl dbname=freshports.dvl" # sslcertmode=disable avoids could not open certificate file “/root/.postgresql/postgresql.crt”: Permission denied my $psql_command = "$PSQL \"sslmode=$FreshPorts::Config::ssl_mode host=$FreshPorts::Config::host user=$FreshPorts::Config::user dbname=$FreshPorts::Config::dbname sslcertmode=disable\""; my $copy_command = "\\copy vuxml_import from '$file_path' WITH (FORMAT TEXT, HEADER false);"; print "\$copy_command='$copy_command\n"; print "\$psql_command='$psql_command\n"; eval { $dbh->do($truncate_sql); print "committing truncate\n"; $dbh->commit(); print "echo \"$copy_command\" | $psql_command"; print "\n"; system("echo \"$copy_command\" | $psql_command"); print "copy command has finished\n"; }; if ($@) { warn "Error during COPY: $@"; $dbh->rollback(); # Rollback on error }
Let’s run that, and remember, most things run as the freshports user.
[18:28 dvl-ingress01 dvl ~/scripts] % echo ./process_vuxml_new.pl --filename=/jails/freshports/usr/ports/security/vuxml/vuln.xml | sudo su -fm freshports process_vuxml.pl starts (there is usually a delay before further output) There, the parsefile has completed calculating the sha256 for each vuln all sha256 saved for each vuln finished with eval() $copy_command='\copy vuxml_import from '/tmp/3YT08khXh4' WITH (FORMAT TEXT, HEADER false); $psql_command='/usr/local/bin/psql "sslmode=require host=pg01.int.unixathome.org user=commits_dvl dbname=freshports.dvl sslcertmode=disable" committing truncate echo "\copy vuxml_import from '/tmp/3YT08khXh4' WITH (FORMAT TEXT, HEADER false);" | /usr/local/bin/psql "sslmode=require host=pg01.int.unixathome.org user=commits_dvl dbname=freshports.dvl sslcertmode=disable" zsh: done echo ./process_vuxml_new.pl | zsh: hangup sudo su -fm freshports
This took me a while to figure out. I think the double pipe got to it. So I tried running it as the freshports user instead of piping it to the user.
Let’s run it directly.
[18:29 dvl-ingress01 dvl ~/scripts] % sudo su -l freshports $ cd /usr/local/libexec/freshports $ ./process_vuxml_new.pl --filename=/jails/freshports/usr/ports/security/vuxml/vuln.xml process_vuxml.pl starts (there is usually a delay before further output) There, the parsefile has completed calculating the sha256 for each vuln all sha256 saved for each vuln finished with eval() $copy_command='\copy vuxml_import from '/tmp/mn8wKvOJB1' WITH (FORMAT TEXT, HEADER false); $psql_command='/usr/local/bin/psql "sslmode=require host=pg01.int.unixathome.org user=commits_dvl dbname=freshports.dvl sslcertmode=disable" committing truncate echo "\copy vuxml_import from '/tmp/mn8wKvOJB1' WITH (FORMAT TEXT, HEADER false);" | /usr/local/bin/psql "sslmode=require host=pg01.int.unixathome.org user=commits_dvl dbname=freshports.dvl sslcertmode=disable" COPY 6020 copy command has finished Total time: 21 seconds Number of updates: 0 process_vuxml.pl finishes $
Good. Still making progress.
So, while I can’t launch it as myself, it does run as the required user.
The SQL
Now that I’ve uploaded a copy of the current vulns (into the vuxml_import table), now I can determine:
- Which vuln are new (present in vuxml_import table but not in vuxml table)
- Which vuln have changed (checksums differs between the two tables)
- Which vuln have been deleted (found in vuxml table but not in the vuxml_import table)
This should be very straight forward.
New entries
This will pull back all the vuxml_import rows which do not exist in vuxml.
-- find new vids SELECT vi.vid, vi.checksum FROM vuxml_import vi LEFT OUTER JOIN vuxml ON vuxml.vid = vi.vid WHERE vuxml.vid IS NULL;
Hmm, that brought back way more than expected:
freshports.dvl=# SELECT vi.vid, vi.checksum FROM vuxml_import vi LEFT OUTER JOIN vuxml ON vuxml.vid = vi.vid WHERE vuxml.vid IS NULL; vid | checksum --------------------------------------+------------------------------------------------------------------ 44101b31-8ffd-11e7-b5af-a4badb2f4699 | 741a95e0d1f9bfac6d4514b6758bded53fd8f7d448de535a1bbb1d392f8b3e1b c7e8e955-6c61-11e7-9b01-2047478f2f70 | e1b83ea4453814902fa8e4a56b374ecbd5249c2bf85d5b69f6cabcab17e336e8 04bc4e23-9a70-42cb-9fec-3613632d34bc | c904c24aa6c4adc66bf91e763f94d087089eb8e0e0a8d9ea96bcaa7be5225cf2 2aedd15f-ca8b-11e6-a9a5-b499baebfeaf | 2cc5e7ee3a244fa7cdea684486747ad2d919b64b200c60f9487a076cdfcdb241 cf0b5668-4d1b-11e6-b2ec-b499baebfeaf | 5a688bbacca0b23bf520e30557a56c7131a5567332bc4c47ba1202c4ac756284 61b8c359-4aab-11e6-a7bd-14dae9d210b8 | 09b36bf7b1d2c2061a7d996bd4a7ef2c08436938cfa8bb4e36c3d5b899bd99c2 c17fe91d-4aa6-11e6-a7bd-14dae9d210b8 | d976f5a4e002c4f7b9d856c784b839859f8c47b969810067b6e85a2875ef28e4 42ecf370-4aa4-11e6-a7bd-14dae9d210b8 | bfa6339440f3b1abc283670efe5e57092dea75c8eb11b000bd92c4fee834d254 95564990-1138-11e6-b55e-b499baebfeaf | a796d57493c2417c53f2deeca77ab9f8ad2c16c7443e5b256eb9508724925504 f99a4686-e694-11e3-9032-000c2980a9f3 | f8a3fded846a14c6fc31e19ae8d1e494e1527e0780af35c61da6936539721795 080c5370-886a-11e3-9533-60a44c524f57 | 9defe99d9c46fddacf27012fcb40602ecae43d7b2b6e0e1841d50b77ced3a81c cd082cc6-1548-4b8d-a3aa-a007be611a29 | ee2c57c980e00ce6008062f8a7570693c90795ccb921c3c11f80f5222bed98d9 6887828f-0229-11e0-b84d-00262d5ed8ee | 326745864d9f432d5ea112f7d221af0e8e1504b0be1661d22e4d41ba24e91393 b2a6fc0e-070f-11e0-a6e9-00215c6a37bb | cd343481576433a6c93b9c14c89a8f2e81639b313191fbbf31af53a6cbf857a5 f6b6beaa-4e0e-11df-83fb-0015587e2cc1 | 4852600a147b5cf07a31687ff84cb763f57369ef9d7692eef7b2473340b27ec0 ff284bf0-3f32-11dc-a79a-0016179b2dd5 | 3cf223f0873590bab26e0d1956f7248728133940d67bd2f376749987911d38a2 28ce7102-4039-11db-a838-00148584c7dd | 59c37c8659e39107cb7924da42075d9e6a85b46276787281eb779a5fb20c721d 0a4cd819-0291-11db-bbf7-000c6ec775d9 | 0d62cb10756022a599fbe910f5f68e5eebd8e2a58c629a17c8bf0e7c7c48279f af2a60ed-da3e-11da-93e0-00123ffe8333 | 31eb8272e0fdcedfc3312c555940840cdb58388d2209332168ba89f6bec6df8f e8c6ade2-6bcc-11d9-8e6f-000a95bc6fae | 96c1c306b528dc30613899ee26e1e84345831b91e707e2e1594b5317cd152a03 14e8f315-600e-11d9-a9e7-0001020eed82 | 909f2c18bf385bb050158f5beb49bfde1f61a1a41b1e8a01c4885158ddca6c68 1f8dea68-3436-11d9-952f-000c6e8f12ef | d79b521048462deb7839d614bd75f6674a85f2be6de838495b1af8aa67b3447e bef4515b-eaa9-11d8-9440-000347a4fa7d | 6cc0d6b6d1920c0612dee6a1cab21bad3ea8bdb5838861539ae56ef632999c47 4764cfd6-d630-11d8-b479-02e0185c0b53 | 9ee4350334b0f20cc499c6fbbe59d05c746bf115bb4b185afe7e4541122c8935 c7705712-92e6-11d8-8b29-0020ed76ef5a | 9fbb9dc9b2f05b5920d49c9aeb1eb900c12b9ab66f2fc7c0850dd3512a31b196 6fd9a1e9-efd3-11d8-9837-000c41e2cdad | a466a00766924a3c3e4bcbb9a7565c4563a10576bfe22f895aab25f1332d5170 3362f2c1-8344-11d8-a41f-0020ed76ef5a | 0da7c9fd91e88127d39ffe66794bf98a4c220d6cde5cf156f50b1529c9e4da6d (27 rows) freshports.dvl=#
Ahh, some are cancelled, and FreshPorts is not storing them. Perhaps we should.
For example, this is from /usr/ports/security/vuxml/vuln/2017.xml
<vuln vid="44101b31-8ffd-11e7-b5af-a4badb2f4699"> <cancelled/> </vuln>
The vuxml table does have a status field. All ‘A’:
freshports.dvl=# select distinct status from vuxml; status -------- A (1 row)
Perhaps I need to start using that field and store cancelled vid.
Detecting cancelled vid
I was trying to detect and ignore cancelled vid at the input side. Despite what I said above, I don’t want to start storing them now. I tried:
my $cancelled = $node->getAttributeNode('cancelled');
No, that’s not right. I posted and got some answers, and found a solution:
my $cancelled = $node->getElementsByTagName('cancelled'); if ($cancelled->getLength() > 0) { print "cancelled is $cancelled\n"; print "skipping $cancelled $vid\n"; next; }
This will skip over the cancelled nodes and they will never make it into the vuxml_import table. That query for new nodes now brings back no results. For my purposes, that is correct.
Changed entries
By changed entries, we mean the checksum differs. Let’s start with this:
-- modified vids SELECT vi.vid, vi.checksum FROM vuxml_import vi LEFT OUTER JOIN vuxml ON vuxml.vid = vi.vid WHERE vuxml.checksum <> vi.checksum;
Deleted entries
This is similar to finding new vids.
-- find deleted vids SELECT vi.vid, vi.checksum FROM vuxml_import vi LEFT OUTER JOIN vuxml ON vuxml.vid = vi.vid WHERE vuxml.vid IS NULL;
Let’s make a plan
Here’s the approach I think I’ll take:
- open and parse the vuln.xml file
- traverse the XML, ignoring cancelled vulns, saving vid and checksum to a file
- truncate vuxml_import (delete all rows) – coded
- load file contents into vuxml_import – coded
- remove deleted vids from vuxml – simple database query
- insert new vids – get list from database, run through vuln.xml
- update changed vids – same
I’ve already inspected
vuxml.pm
and it handles insert and update well. I don’t have to code that; just call the save() function.
What I’m not sure about: I’m sure that “run through vuln.xml” means a traversal of all nodes.
Traversal of all nodes
Below, I have cobbled together some of the existing code. I have left out a bunch of if statements and error handling, just to get an idea of what is done.
This sample code needs to be amended to adjust for the above mentioned plan. Right now, it saves everything as it goes along. What I’ve done is pulled various lines from process_vuxml.pl just to get a starting point.
$dbh = FreshPorts::Database::GetDBHandle(); # declare the parser and parse vuln.xml my $parser = new XML::DOM::Parser; my $doc = $parser->parsefile ($filename); eval { # for each node in the doc... for my $node ($doc->findnodes('/vuxml/vuln')) { # get the vuln id / vid my $vid = $node->getAttributeNode('vid')->getValue(); # calculate the checksum, which will be stored in the database my $csum = sha256_hex(Encode::encode_utf8($node->toString)); # convert the node to a string. my $nodeString = $node->toString(); # get a file handle to that string and open it my $fh = IO::String->new(); $fh->open($nodeString); # save the vuln to the database my $vuxml = FreshPorts::vuxml->new( $dbh ); my $p = FreshPorts::vuxml_parsing->new(Stream => $fh, DBHandle => $dbh, UpdateInPlace => 1); # always pass in the checksum from our calculation # this will parse the xml and save the data to the database, all in one go. $p->parse_xml($csum); if ($p->database_updated()) { print "yes, the database was updated for $vid\n"; } else { print "no, the database was NOT updated for $vid\n"; next; } $fh->close; } # for my $node }; # eval # added after seeing it at https://metacpan.org/dist/XML-DOM/view/lib/XML/DOM/Parser.pod $doc->dispose; if ($@) { print "We've got a problem."; print "$0: $@\n"; FreshPorts::CommitterOptIn::RecordErrorDetails("error processing vuxml", $0); die "$0: $@\n"; } print "committing\n"; $dbh->commit();
Next: what actually invokes vuxml->save()?
That always perplexes me, everything I look at that code.
It’s here, within update_database_vuxml
$vuxml_id = $vuxml->save();
That function is called by update_database().
In turn, it is called by handle_end().
Ahh, I always forget how XML::Parser works.
Delete the missing vuxml entries
This will delete all entries from vuxml not found in vuxml_import:
WITH deleted_vids AS ( SELECT V.vid, V.checksum FROM vuxml V LEFT OUTER JOIN vuxml_import vi ON V.vid = vi.vid WHERE vi.vid IS NULL ) DELETE from vuxml V USING deleted_vids as DV WHERE V.vid = DV.vid;
2025-08-09 – This seems to work
I seem to have a solution.
[20:21 dvl-ingress01 dvl ~/scripts] % cat process_vuxml_new.pl #!/usr/local/bin/perl -w # # $Id: process_vuxml.pl,v 1.6 2013-01-16 15:37:57 dan Exp $ # # Copyright (c) 2001-2012 DVL Software # # much of this file is based on contributions from Matthew Seamon # # @{#} $Id: process_vuxml.pl,v 1.6 2013-01-16 15:37:57 dan Exp $ # # Split up the vuln.xml file into sections for individual # vulnerabilities. Save into files using the vid guid field as name. # Calculate SHA256 checksum for the XML snippet and write out to an # index file. #use 5.10.1; use strict; use warnings; use XML::DOM::XPath; use Encode; use Digest::SHA qw(sha256_hex); use autodie qw(:default); use IO::String; use Getopt::Long; use File::Temp (); # for debugging use Data::Dumper; use FreshPorts::database; use FreshPorts::vuxml; use FreshPorts::vuxml_parsing; use FreshPorts::vuxml_mark_commits; my $PSQL = "/usr/local/bin/psql"; my $filename; my $dryrun; my $showchecksums; my $printnodes; my $showreasons; my $nodeString; my $NumUpdates = 0; # From https://perldoc.perl.org/perlunifaq.html#What-is-a-%22wide-character%22%3f # to handle: Wide character in print at /usr/local/lib/perl5/site_perl/FreshPorts/vuxml_parsing.pm line 234, <> chunk 1.\n sub populate_vuxml_import($;$) { my $dbh = shift; my $doc = shift; print "There, the parsefile has completed\n"; # UNLINK = false is for debugging purposes my $fh_vid = File::Temp->new(UNLINK => 0, PERMS => 0644); my $fname_vid = $fh_vid->filename; my $fh = IO::String->new(); # my $vuxml = FreshPorts::vuxml->new( $dbh ); print "calculating the sha256 for each vuln\n"; eval { for my $node ($doc->findnodes('/vuxml/vuln')) { if ($dryrun && !$showchecksums) { print '.'; } my $vid = $node->getAttributeNode('vid')->getValue(); my $cancelled = $node->getElementsByTagName('cancelled'); if ($cancelled->getLength() > 0) { print "cancelled is $vid - skipping that one for import\n"; next; } my $csum = sha256_hex(Encode::encode_utf8($node->toString)); # write vid and checksum to the file print $fh_vid "$vid\t$csum\n"; } # for my $node }; # eval print "all sha256 saved for each vuln\n"; $fh_vid->close(); print 'finished with eval()' . "\n"; # if something went wrong in the eval, abort and don't do a commit if ($@) { print "We've got a problem."; print "$0: $@\n"; FreshPorts::CommitterOptIn::RecordErrorDetails("error processing vuxml", $0); die "$0: $@\n"; } my $file_path = "$fname_vid"; my $table_name = "vuxml_import"; my $copy_sql = "COPY $table_name FROM '$file_path' WITH (FORMAT TEXT, DELIMITER '\t', HEADER false)"; my $truncate_sql = "TRUNCATE vuxml_import"; # our goal is this: # echo "\\\copy vuxml_import from '/tmp/vnhvGB2yN0' WITH (FORMAT TEXT, HEADER false);" | psql "sslmode=require host=pg01.int.unixathome.org user=commits_dvl dbname=freshports.dvl" # sslcertmode=disable avoids could not open certificate file “/root/.postgresql/postgresql.crt”: Permission denied my $psql_command = "$PSQL \"sslmode=$FreshPorts::Config::ssl_mode host=$FreshPorts::Config::host user=$FreshPorts::Config::user dbname=$FreshPorts::Config::dbname sslcertmode=disable\""; my $copy_command = "\\copy vuxml_import from '$file_path' WITH (FORMAT TEXT, HEADER false);"; print "\$copy_command='$copy_command\n"; print "\$psql_command='$psql_command\n"; # save it to the database eval { $dbh->do($truncate_sql); print "committing truncate\n"; $dbh->commit(); print "echo \"$copy_command\" | $psql_command"; print "\n"; system("echo \"$copy_command;\" | $psql_command"); print "copy command has finished\n"; }; if ($@) { warn "Error during COPY: $@"; $dbh->rollback(); # Rollback on error } } # MAIN sub remove_deleted_vids($) { my $dbh = shift; my $DeleteMissingVuxml = "select DeleteMissingVuxml()"; # invoke the stored procedure: DeleteMissingVuxml() eval { $dbh->do($DeleteMissingVuxml); print "deleting missing vuxml\n"; $dbh->commit(); print "missing vuxml have been deleted\n"; }; if ($@) { warn "Error during delete: $@"; $dbh->rollback(); # Rollback on error } } sub get_list_of_modified_and_new_vids($) { # invoke the stored procedure: GetListOfVulnsForUpdate() which returns a list of new vulns # and vulns with new checksums my $dbh = shift; my %ListOfVIDs; my @row; my $VidsToUpdate = "select * from GetListOfVulnsForUpdate()"; # invoke the stored procedure: GetListOfVulnsForUpdate() eval { my $sth = $dbh->prepare($VidsToUpdate); $sth->execute || FreshPorts::Utilities::ReportError('warning', "Could not execute SQL statement\n--$VidsToUpdate--\n... maybe invalid?", 1); print "getting list of vulns to update\n"; while (@row = $sth->fetchrow_array()) { print "found vid: $row[0]\n"; $ListOfVIDs{$row[0]} = $row[0]; } $sth->finish(); $dbh->commit(); print "list of vulns to update has been obtained\n"; }; if ($@) { warn "Error during list of vulns: $@"; $dbh->rollback(); # Rollback on error } return %ListOfVIDs; } sub update_modified_vids($;$;$) # update_modified_vids($dbh, $doc, \%VIDsToUpdate); { # given a hash/array of vids, go through the vuln.xml files and # save each of them to the database. my $dbh = shift; my $doc = shift; my $VIDsRef = shift; # my $fh; my %VIDsToUpdate = %{$VIDsRef}; my %ProcessedVIDs; my $nodeString; # this code based on the existing process_vuxml.pl script my $fh = IO::String->new(); for my $node ($doc->findnodes('/vuxml/vuln')) { my $vid = $node->getAttributeNode('vid')->getValue(); # print "found vid: $vid\n"; my $cancelled = $node->getElementsByTagName('cancelled'); if ($cancelled->getLength() > 0) { print "cancelled is $vid - skipping that one again\n"; next; } # if this vid is to be updated if ($VIDsToUpdate{$vid}) { print "found vid to update: $vid\n"; # Keep track of what we have processed. # We use this to be sure we've processed everything. $ProcessedVIDs{$vid} = $vid; # Yes, we already calculated this the first time around # but we aren't storing it. Yet. # my $nodeString = $node->toString(); # my $csum = sha256_hex(Encode::encode_utf8($nodeString)); my $csum = sha256_hex(Encode::encode_utf8($node->toString)); # my $csum = sha256_hex(Encode::encode_utf8($node->toString)); $nodeString = $node->toString(); print "This is that node:\n$nodeString\n"; print "with csum=$csum\n"; $fh->open($nodeString) || die("FATAL: $0 could not \$fh->open" ); print "Getting a new vuxml_parsing\n"; my $p = FreshPorts::vuxml_parsing->new(Stream => $fh, DBHandle => $dbh, UpdateInPlace => 1); print "parsing the xml and setting \$csum\n"; # always pass in the checksum from our calculation $p->parse_xml($csum); if ($p->database_updated()) { print "yes, the database was updated for $vid\n"; $NumUpdates++; } else { print "no, the database was NOT updated for $vid\n"; next; } $fh->close; print 'invoking vuxml_mark_commits with ' . $vid . "\n"; my $CommitMarker = FreshPorts::vuxml_mark_commits->new(DBHandle => $dbh, vid => $vid); print 'invoking ProcessEachRangeRecord'. "\n"; my $i = $CommitMarker->ProcessEachRangeRecord(); print 'invoking ClearCachedEntries' . "\n"; $CommitMarker->ClearCachedEntries($vid); # # putting this in here while debugging # last; } } print "reconciling processed against our list\n"; # delete the ones we processed from the list of what we were looking for foreach my $key (keys %ProcessedVIDs) { delete $VIDsToUpdate{$key}; } print "Checking to see if we missed anything.\n"; # what do we have left? if (%VIDsToUpdate) { foreach my $key (keys %VIDsToUpdate) { print "Was unable to locate $key within the vuln.xml files.\n"; } print "FATAL: we did not find everything we expected.\n"; } else { print "We found and updated everything we expected.\n"; } } binmode STDOUT, ":encoding(UTF-8)"; #use feature qw(switch); print 'process_vuxml.pl starts' . "\n"; # Reads vuln.xml on stdin * NOT ANY MORE GetOptions ('filename:s' => \$filename, 'dryrun!' => \$dryrun, 'showchecksums!' => \$showchecksums, 'printnodes!' => \$printnodes, 'showreasons!' => \$showreasons); if ($dryrun) { print "this is a dry run\n"; } if ($showchecksums) { print "checksums will be displayed\n"; } if ($printnodes) { print "nodes will be displayed\n"; } if ($showreasons) { print "reasons will be displayed\n"; } my $start = time; print '(there is usually a delay before further output)' . "\n"; my $dbh = FreshPorts::Database::GetDBHandle() || die("FATAL: $0 could not get a db handle" ); my $parser = new XML::DOM::Parser; my $doc = $parser->parsefile ($filename); # save the incoming vuln.xml file to a staging database table populate_vuxml_import($dbh, $doc); remove_deleted_vids($dbh); # # NOTE inserts and updates are handled the same way # delete existing vid (if any) # insert # # Instead, we should collect new and modified together and do one traversal of the vuln.xlm document # my %VIDsToUpdate = get_list_of_modified_and_new_vids($dbh); print "invoking update_modified_vids()\n"; update_modified_vids($dbh, $doc, \%VIDsToUpdate); # added after seeing it at https://metacpan.org/dist/XML-DOM/view/lib/XML/DOM/Parser.pod $doc->dispose; my $end = time(); print "Total time: " . ($end - $start) . " seconds\n"; print "Number of updates: $NumUpdates\n"; if ($dryrun) { print "this was a dry run\n"; } print "committing\n"; $dbh->commit(); $dbh->disconnect(); undef $dbh; # # That's All Folks! # print 'process_vuxml.pl finishes' . "\n";
The database code:
[20:22 pg01 dvl ~/src/freshports-database-schema] % cat updates-2025-08-01-vuxml_import.sql -- for the COPY from files grant truncate on vuxml_import to commits; -- re https://news.freshports.org/2025/07/30/how-freshports-processes-vuxml-entries/ -- https://github.com/FreshPorts/freshports/issues/633 -- -- Table: public.vuxml_import -- DROP TABLE IF EXISTS public.vuxml_import; CREATE TABLE IF NOT EXISTS public.vuxml_import ( vid text COLLATE pg_catalog."default" NOT NULL, checksum text COLLATE pg_catalog."default" NOT NULL, CONSTRAINT vuxml_import_vid PRIMARY KEY (vid) ) TABLESPACE pg_default; ALTER TABLE IF EXISTS public.vuxml_import OWNER to postgres; REVOKE ALL ON TABLE public.vuxml_import FROM commits; REVOKE ALL ON TABLE public.vuxml_import FROM rsyncer; GRANT INSERT, DELETE, SELECT, UPDATE ON TABLE public.vuxml_import TO commits; GRANT ALL ON TABLE public.vuxml_import TO postgres; GRANT SELECT ON TABLE public.vuxml_import TO rsyncer; -- DROP INDEX IF EXISTS public.vuxml_import_vid; CREATE INDEX IF NOT EXISTS vuxml_import_vid ON public.vuxml_import USING btree (vid COLLATE pg_catalog."default" ASC NULLS LAST) TABLESPACE pg_default; grant truncate on vuxml_import to commits; -- delete the deleted vids. CREATE OR REPLACE FUNCTION DeleteMissingVuxml() returns void AS $$ -- if it's not in vuxml_import, remove it from vuxml WITH deleted_vids AS ( SELECT V.vid, V.checksum FROM vuxml V LEFT OUTER JOIN vuxml_import vi ON V.vid = vi.vid WHERE vi.vid IS NULL ) DELETE from vuxml V USING deleted_vids as DV WHERE V.vid = DV.vid; $$ LANGUAGE SQL VOLATILE; CREATE OR REPLACE FUNCTION GetListOfVulnsForUpdate() returns SETOF text AS $$ -- return a list of vid for updating SELECT vi.vid FROM vuxml_import vi LEFT OUTER JOIN vuxml ON vuxml.vid = vi.vid WHERE vuxml.checksum <> vi.checksum union SELECT vi.vid FROM vuxml_import vi LEFT OUTER JOIN vuxml ON vuxml.vid = vi.vid WHERE vuxml.vid IS NULL; $$ LANGUAGE SQL STABLE;
What’s next?
Start using this code on my host. That could just be a cp over the existing file.
Not today. Today we’re going to a wedding. Perhaps tomorrow.
The downside
If we do an import, and the import fails to populate vuxml_import with an accurate list of vulns, we can wind up deleting all vulns from the system. That is because the “delete missing vulns” process removes any vulns not found in vuxml_import. It is easy to reload the data, yet it is time consuming – hours, not minutes, to do that.
2025-08-14: For the record, processing all vuxml looks like this at the end:
reconciling processed against our list Checking to see if we missed anything. We found and updated everything we expected. Total time: 59167 seconds Number of updates: 5997 committing process_vuxml.pl finishes
Or about 16.5 hours…
And, about 20 minutes later, I realized none of that data has been saved. I must be missing a commit or have an extra rollback.
I just started another run. I’ll check it tomorrow morning.
2028-08-15: Success. This is what it looks like:
reconciling processed against our list Checking to see if we missed anything. We found and updated everything we expected. Total time: 63769 seconds Number of updates: 5998 committing process_vuxml.pl finishes $
The “Number of updates” differs because of a recent commit.
This looks good. Especially given:
freshports.dvl=# select count(*) from vuxml; count ------- 5999 (1 row) freshports.dvl=#
That number differs again because I ran a one-off (process just one vuln) yesterday to verify the database commit was occurring and saving the work.
This is the CPU load on the FreeBSD jail (dvl-ingress01):

The Full Monty
Let’s run it again, and save the output for future reference.
$ perl ./process_vuxml_new.pl --filename=/jails/freshports/usr/ports/security/vuxml/vuln.xml process_vuxml.pl starts (there is usually a delay before further output) There, the parsefile has completed Error in tempfile() using template freshports_vuxml_processing_XXXXX: Could not create temp file freshports_vuxml_processing_3wWXJ: Permission denied at ./process_vuxml_new.pl line 62. $ perl ./process_vuxml_new.pl --filename=/jails/freshports/usr/ports/security/vuxml/vuln.xml process_vuxml.pl starts (there is usually a delay before further output) There, the parsefile has completed calculating the sha256 for each vuln doing the findnodes() thing findnodes() has finished, iteration has started .............................................................................................................................................................. ....................................................................................................... 44101b31-8ffd-11e7-b5af-a4badb2f4699 - cancelled: skipping that one for import .... c7e8e955-6c61-11e7-9b01-2047478f2f70 - cancelled: skipping that one for import ........ 04bc4e23-9a70-42cb-9fec-3613632d34bc - cancelled: skipping that one for import ......... 2aedd15f-ca8b-11e6-a9a5-b499baebfeaf - cancelled: skipping that one for import ..................... cf0b5668-4d1b-11e6-b2ec-b499baebfeaf - cancelled: skipping that one for import 61b8c359-4aab-11e6-a7bd-14dae9d210b8 - cancelled: skipping that one for import c17fe91d-4aa6-11e6-a7bd-14dae9d210b8 - cancelled: skipping that one for import . 42ecf370-4aa4-11e6-a7bd-14dae9d210b8 - cancelled: skipping that one for import ....... 95564990-1138-11e6-b55e-b499baebfeaf - cancelled: skipping that one for import ......................................................................... f99a4686-e694-11e3-9032-000c2980a9f3 - cancelled: skipping that one for import ..... 080c5370-886a-11e3-9533-60a44c524f57 - cancelled: skipping that one for import .... cd082cc6-1548-4b8d-a3aa-a007be611a29 - cancelled: skipping that one for import ......................................................... 6887828f-0229-11e0-b84d-00262d5ed8ee - cancelled: skipping that one for import b2a6fc0e-070f-11e0-a6e9-00215c6a37bb - cancelled: skipping that one for import .......... f6b6beaa-4e0e-11df-83fb-0015587e2cc1 - cancelled: skipping that one for import ................................................ ff284bf0-3f32-11dc-a79a-0016179b2dd5 - cancelled: skipping that one for import ................ 28ce7102-4039-11db-a838-00148584c7dd - cancelled: skipping that one for import ... 0a4cd819-0291-11db-bbf7-000c6ec775d9 - cancelled: skipping that one for import ... af2a60ed-da3e-11da-93e0-00123ffe8333 - cancelled: skipping that one for import ..................................... e8c6ade2-6bcc-11d9-8e6f-000a95bc6fae - cancelled: skipping that one for import .... 14e8f315-600e-11d9-a9e7-0001020eed82 - cancelled: skipping that one for import ...... 1f8dea68-3436-11d9-952f-000c6e8f12ef - cancelled: skipping that one for import ........... bef4515b-eaa9-11d8-9440-000347a4fa7d - cancelled: skipping that one for import .. 4764cfd6-d630-11d8-b479-02e0185c0b53 - cancelled: skipping that one for import ...... c7705712-92e6-11d8-8b29-0020ed76ef5a - cancelled: skipping that one for import ...... 6fd9a1e9-efd3-11d8-9837-000c41e2cdad - cancelled: skipping that one for import 3362f2c1-8344-11d8-a41f-0020ed76ef5a - cancelled: skipping that one for import all sha256 saved for each vuln finished with eval() $copy_command='\copy vuxml_import from '/tmp/freshports_vuxml_processing_ixUmG' WITH (FORMAT TEXT, HEADER false); $psql_command='/usr/local/bin/psql "sslmode=require host=pg01.int.unixathome.org user=commits_dvl dbname=freshports.dvl sslcertmode=disable" committing truncate echo "\copy vuxml_import from '/tmp/freshports_vuxml_processing_ixUmG' WITH (FORMAT TEXT, HEADER false);" | /usr/local/bin/psql "sslmode=require host=pg01.int.unixathome.org user=commits_dvl dbname=freshports.dvl sslcertmode=disable" copy command has finished $output is 'COPY 5999' Imported 5999 items deleting missing vuxml vuln not present in xml files have been deleted from the database. getting list of vulns to update list of vulns to update has been obtained We have 0 vuxml entries to add/update invoking update_modified_vids() Nothing to update Total time: 22 seconds Number of updates: 0 committing process_vuxml.pl finishes $
Success. Let’s ship that and see what happens during the next commit.
[11:50 dvl-ingress01 dvl ~/scripts] % cp process_vuxml_new.pl process_vuxml.pl [11:50 dvl-ingress01 dvl ~/scripts] % diff -ruN process_vuxml.pl process_vuxml_new.pl [11:50 dvl-ingress01 dvl ~/scripts] %
psql \copy is silenting failing
I just started commit processing on dvl-ingress01. Immediately, the sanity checks failed:
FATAL: we imported items, which is less than our expected minimum value of 5900 FATAL: we were deleting way too may rows from vuxml: 5999 items, which is more than our expected minimum value of 10
Both of those should have terminated processing. Neither did. That’s the first problem to solve.
Next, I have to figure out why running the script from the command line works, but launching it from another script fails.
What comes to mind first is PATH.
The logs show:
$copy_command='\copy vuxml_import from '/tmp/freshports_vuxml_processing_HX1m1' WITH (FORMAT TEXT, HEADER false); $psql_command='/usr/local/bin/psql "sslmode=require host=pg01.int.unixathome.org user=commits_dvl dbname=freshports.dvl sslcertmode=disable" committing truncate echo "\copy vuxml_import from '/tmp/freshports_vuxml_processing_HX1m1' WITH (FORMAT TEXT, HEADER false);" | /usr/local/bin/psql "sslmode=require host=pg01.int.unixathome.org user=commits_dvl dbname=freshports.dvl sslcertmode=disable" copy command has finished $output is '' Imported items FATAL: we imported items, which is less than our expected minimum value of 5900
This fix may have to wait until Monday. The weekend is full.
I found time
I’ve taken a long lunch today and meeting with an architect friend. He’s doing plans, I’m looking at this.
This runs to success, so…:
$ pwd /usr/local/libexec/freshports $ id uid=10001(freshports) gid=10001(freshports) groups=10001(freshports) $ touch ~/signals/vuxml && ./process_vuxml.sh ...
process_vuxml.sh will launch process_vuxml.pl, I still suspect paths or something odd.
This works as my regular user, in a zsh shell
[18:14 dvl-ingress01 dvl ~/scripts] % echo "\\\copy vuxml_import from '/tmp/freshports_vuxml_processing_jDIF3' WITH (FORMAT TEXT, HEADER false);" | psql "sslmode=require host=pg01.int.unixathome.org user=commits_dvl dbname=freshports.dvl" COPY 6001
As the freshports user in a bourne shell:
$ echo "\copy vuxml_import from '/tmp/freshports_vuxml_processing_jDIF3' WITH (FORMAT TEXT, HEADER false);" | psql "sslmode=require host=pg01.int.unixathome.org user=commits_dvl dbname=freshports.dvl" COPY 6001
and this works too:
$ touch ~/signals/vuxml && ./process_vuxml.sh $ touch ~/signals/vuxml && perl job-waiting.pl
Both result in
copy command has finished $output is 'COPY 6002'