Details of the vuxml processing problem

For several weeks, FreshPorts had a vuxml processing problem. In this blog post, I will share what I found.

Introduction

Incoming events for FreshPorts are handled by a small Python script which checks for flags and incoming commits. This ensures that work is carried out serially because parallel work can have unintended consequences. After each task is completed, the flags are checked, and if work is to be done, that task is performed before commit processing continues.

If an incoming commit touches the security/vuxml/vuln.xml file, a flag is set to indicate that vuxml processing is ready to proceed.

That processing is invoked via this call from the process_vuxml.sh script:

/usr/local/bin/perl ./process_vuxml.pl < ${VULNFILE} >> ${LOGFILE}

What triggered the issue?

On Tue Oct 24 21:59:23 2017 UTC, with Revision 4906, I committed a change to process_vuxml.pl with this commit message:

Stop using HANDLE and start using File::IO

This will hopefully avoid: Name "main::HANDLE" used only once: possible typo at ./process_vuxml.pl line 89.

This should have been included in revision 4905

That patch was:

--- scripts/trunk/process_vuxml.pl	2017/10/09 15:10:01	4836
+++ scripts/trunk/process_vuxml.pl	2017/10/24 21:59:23	4906
@@ -19,7 +19,9 @@
 use warnings;
 use Digest::SHA qw(sha256_hex);
 use autodie qw(:default);
+use IO::File;
 
+use committer_opt_in;
 use database;
 use vuxml;
 use vuxml_parsing;
@@ -38,6 +40,9 @@
     my %vulns;
     my @vulns;
 
+    my $fh;
+    my $p;
+
     # slurp vuln.xml whole.
     local $/;
 
@@ -85,25 +90,27 @@
 
                 if ($updateRequired)
                 {
-                    open(HANDLE, '<', \$vulns{$v});
-            		my $p = FreshPorts::vuxml_parsing->new(Stream        => *HANDLE,
-                                                           DBHandle      => $dbh,
-                                                           UpdateInPlace => 1);
-                                                           
-            		$p->parse_xml($csum);
+                    $fh = IO::File->new();
+                    if ($fh->open('< ' . $vulns{$v})) {
+                		$p = FreshPorts::vuxml_parsing->new(Stream        => $fh,
+                                                               DBHandle      => $dbh,
+                                                               UpdateInPlace => 1);
+
+                		$p->parse_xml($csum);
+
+                        if ($p->database_updated())
+                        {
+                            print "yes, the database was updated\n";
+                        }
+                        else
+                        {
+                            print "no, the database was NOT updated\n";
+                            next;
+                        }
 
-                    close HANDLE;
-                    # process $vulns{$v} via vuxml_processing
-                    
-                    if ($p->database_updated())
-                    {
-                        print "yes, the database was updated\n";
-                    }
-                    else
-                    {
-                        print "no, the database was NOT updated\n";
-                        next;
+                        $fh->close;
                     }
+                    # process $vulns{$v} via vuxml_processing
 
                     print 'invoking vuxml_mark_commits with ' . $v . "\n";
         			my $CommitMarker = FreshPorts::vuxml_mark_commits->new(DBHandle => $dbh,

Please note (line numbers refer to the left side of the page, not actual line numbers in the source file):

  1. Line 17 declares $fh, a file handle
  2. Line 18 declares $p, perviously declared on line 28

How are vulns processed

This is an overview of how FreshPorts processing a new vuln.xml file.

  1. Split the file up into individual entries
  2. If the vid for this vuln is not found in the database, add it.
  3. If the vid is found, compare the current checksum with the value in the database
  4. If the checksums are the same, ignore this vuln entry; it has not changed.
  5. If the checksum differs, delete the vuln from the database and process this vuln as if it were new

Symptoms & debugging

No new vuxml entries were being processed. While debugging the code I thought it was failing on this this:

if ($fh->open('<' . $vulns{$v})) {

You can see that version of the code in earlier revisions of that gist.

Eventually I discovered that the parameters to IO::File open and open are not the same. I changed the above code to:

if ($fh->open(\$vulns{$v}, '<')) {

The next hurdle was this error:

process_vuxml.pl:no element found at line 1, column 0, byte 0 at /usr/local/lib/perl5/site_perl/mach/5.24/XML/Parser.pm line 187.

It was another week or more before I got back to debugging.

I soon discovered that the first vuln would be processed properly, without error. The second would error.

That would lead me to think this was a scope issue; a stale variable or similar.

After several solutions, I settled up on this change, which seemed to be the simplest:

--- scripts/trunk/process_vuxml.pl	2017/12/08 23:53:09	5009
+++ scripts/trunk/process_vuxml.pl	2017/12/10 18:06:26	5014
@@ -59,6 +59,7 @@
 	my $dbh;
 	$dbh = FreshPorts::Database::GetDBHandle();
 	if ($dbh->{Active}) {
+        $fh = IO::File->new();
         my $vuxml = FreshPorts::vuxml->new( $dbh );
           
         eval {
@@ -90,7 +91,6 @@
 
                 if ($updateRequired)
                 {
-                    $fh = IO::File->new();
                     if ($fh->open(\$vulns{$v}, '<')) {
                         $p = FreshPorts::vuxml_parsing->new(Stream        => $fh,
                                                             DBHandle      => $dbh,

As shown, the creation of $fh was moved outside a loop. $p is a Class variable. I discovered this in the vuxml_parsing.pm code:

# Class variables

our ($VuXML);

# Call like this:
#
# $v = new FreshPorts::vuxml_parsing(DBHandle      => $dbh,
#                                    Stream        => IO::File,
#                                    UpdateInPlace => $UpdateInPlace);
#
# DBHandle is a database handle from the DBI module. Required (well,
# if you want to write to a database it's necessary)
#
# Stream is optional, and defaults to reading STDIN.  Pass either an
# IO::Handle or a glob: *GLOB or a glob ref: \*GLOB
#
# UpdateInPlace is optional, defaults to TRUE.  Indicates whether or not
# the code should attempt to update existing VuXML entries.
#

sub new
{
    my $caller = shift;
    my $class  = ref($caller) || $caller;
    my %args   = ref( $_[0] ) eq 'HASH' ? %{ shift() } : @_;
    my $self;

    # There can be only one! Since we keep $self hanging around as a
    # Class variable, more than one instance would be a disaster.
    # Therefore, all calls to new() (except the first or when
    # DESTROY() has been called) just return a reference to the same
    # object.

    if ( $VuXML && $VuXML->isa(__PACKAGE__) ) {
        $self = $VuXML;
        return bless $self, $class;
    }

    $VuXML = $self = {};
    bless $self, $class;

This means that the value for $fh initially passed into $p (on line 35 of the first patch on this page) is the same $fh used on every subsequent call. That is what led me to moving the $fh = IO::File->new(); line.

With that, everything worked.

The later code changes

While looking at this, I reverted some of the original patch. In particular:

  1. The $my declarations on lines 17 & 18 were reverted.
  2. The committer_opt_in from line 9 was reverted. It seems to be an accidental add; it was never used.

My thanks to those that helped along the way. This was not an easy thing for me to solve.

Website Pin Facebook Twitter Myspace Friendfeed Technorati del.icio.us Digg Google StumbleUpon Premium Responsive

Leave a Comment

Scroll to Top