Improving the parsing of /usr/ports/UPDATING

Shaun Amott wrote in to tell me of a bug with the /usr/ports/UPDATING code. FreshPorts parses this file each time it is committed and loads the data into a table. When a port is displayed, any notes from UPDATING are displayed with that port. The purpose is to alert the reader to any potential instructions for updating that port.

The general format of this file is:

YYYYMMDD:
  AFFECTS: list of ports affected by this note
  AUTHOR: who wrote this

  body of the note

Shaun noted that FreshPorts was not handling the case where the AFFECTS clause is long and covers multiple lines. He noticed it when looking at www/horde. What he saw, in part, was:

2006-04-14

Affects: users of www/horde, www/horde-php5, www/horde-passwd,

Author: shaun@inerd.com

Reason:            deskutils/nag, deskutils/mnemo, deskutils/kronolith,
           mail/turba, mail/ingo, mail/imp, devel/chora

  The Horde ports no longer overwrite your existing configuration
  files during an upgrade: if you have modified any of the files in the
  config directory of any of these ports, they will be left untouched
  when you upgrade.

  It is recommended that the new .dist files are examined after an
  upgrade and any changes merged into your existing config files if
  necessary.

The problem is obvious when you read the Reason: those two lines are the continuation of the AFFECTS list and should appear up with Affects.

When I went to look at the code in question, scripts/process_updating.pl, I noticed that this code went into CVS back on Aug 1 2004. It was Travis Campbell who wrote the script and gave it to me. At that time, UPDATING was still new; exactly 5 months old. We did not anticipate multiple lines for AFFECTS. It took me about an hour to alter the script and test it.

The part of the script in question looked like this:

# start going through the file;
for (my $i = 0; $i < scalar @lines; $i++) {
    # encounter a line with a date
    if ($lines[$i] =~ m/^(\d{8}):/) {
        my ($affects, $author, $msg);
        my $date = $1;
        # parse the stuff between lines with dates
        for (my $j = $i + 1; $j < scalar @lines; $j++) {
            my $line = $lines[$j];
            last if ($line =~ m/^\d{8}:/);
            last if ($line =~ m/^\$FreeBSD:/);
            if ($line =~ m/\s+AFFECTS:\s+(.*)$/){
                $affects = $1;
            } elsif ($line =~ m/\s+AUTHOR:\s+(.*)$/) {
                $author = $1;
            } else {
                $msg .= $line . "\n";
            }
        }

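The important line is the one that matches AFFECTS: it captures only the rest of that single line, so any continuation lines fall through to the else branch and end up in the message body.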
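To see the failure in isolation, here is a minimal sketch that runs the same three-way branch over a few sample lines. The sample lines are modelled on the www/horde entry; their exact spacing is an assumption, and the loop is a cut-down stand-in for the real script, not a copy of it.

```perl
use strict;
use warnings;

# Sample lines modelled on the www/horde entry (spacing is assumed).
my @lines = (
    '  AFFECTS: users of www/horde, www/horde-php5, www/horde-passwd,',
    '           deskutils/nag, deskutils/mnemo, deskutils/kronolith,',
    '           mail/turba, mail/ingo, mail/imp, devel/chora',
    '  AUTHOR: shaun@inerd.com',
);

my ($affects, $author, $msg) = ('', '', '');
for my $line (@lines) {
    if ($line =~ m/\s+AFFECTS:\s+(.*)$/) {
        $affects = $1;          # only the rest of this one line is captured
    } elsif ($line =~ m/\s+AUTHOR:\s+(.*)$/) {
        $author = $1;
    } else {
        $msg .= $line . "\n";   # continuation lines land here by mistake
    }
}

print "AFFECTS: $affects\n";
print "MSG:\n$msg";
```

Running it shows the first AFFECTS line on its own, with the two continuation lines sitting in the message text.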

The solution I came up with was to keep processing the lines after the first AFFECTS until we reach an AUTHOR; everything in between belongs to the AFFECTS clause. I immediately found a problem with an entry that has no AUTHOR clause (e.g. 20051002). I altered my approach slightly so that a blank line also ends the AFFECTS clause, and came up with this solution:

# start going through the file;
for (my $i = 0; $i < scalar @lines; $i++) {
    # encounter a line with a date
    if ($lines[$i] =~ m/^(\d{8}):/) {
        my ($affects, $author, $msg, $InAffects);
        my $date = $1;

        $InAffects = 0;
        $affects   = '';
        # parse the stuff between lines with dates
        for (my $j = $i + 1; $j < scalar @lines; $j++) {
            my $line = $lines[$j];
            last if ($line =~ m/^\d{8}:/);
            last if ($line =~ m/^\$FreeBSD:/);  # last line of file
            if ($line =~ m/\s+AFFECTS:\s+(.*)$/){
                $affects  = $1;
                $InAffects = 1;
            } elsif ($line =~ m/\s+AUTHOR:\s+(.*)$/) {
                $author    = $1;
                $InAffects = 0;
            } elsif ($InAffects) {
                # the AFFECTS section is usually terminated by the AUTHOR section.
                # sometimes there is no AUTHOR, and we have a blank line instead
                # a non-blank line: capture its text without the leading whitespace
                # (the capture happens in the condition, so $1 is always set here)
                if ($line =~ m/^\s*(\S.*)$/) {
                    # line it up under the AFFECTS: banner
                    $affects .= "\n         " . $1;
                } else {
                    $InAffects = 0;
                }
            } else {
                $msg .= $line . "\n";
            }
        }

Yes, you probably can see better ways of doing the above. :)

This successfully parsed UPDATING and correctly updated the entry for www/horde.
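For anyone who wants to try the idea outside the full script, here is a self-contained sketch of the same state-flag approach. The sub name parse_updating, the hash keys, and the sample data are all hypothetical (the second sample entry is invented to show the no-AUTHOR case). Unlike the script above, it joins continuation lines with a single space rather than lining them up under the AFFECTS banner.

```perl
use strict;
use warnings;

# Hypothetical helper: turn the lines of UPDATING into a list of entries.
sub parse_updating {
    my @lines = @_;
    my (@entries, $entry, $in_affects);

    for my $line (@lines) {
        last if $line =~ m/^\$FreeBSD:/;      # trailer at the end of the file
        if ($line =~ m/^(\d{8}):/) {          # a dated line starts a new entry
            $entry = { date => $1, affects => '', author => '', msg => '' };
            push @entries, $entry;
            $in_affects = 0;
            next;
        }
        next unless $entry;                   # skip anything before the first date
        if ($line =~ m/^\s+AFFECTS:\s+(.*)$/) {
            $entry->{affects} = $1;
            $in_affects = 1;
        } elsif ($line =~ m/^\s+AUTHOR:\s+(.*)$/) {
            $entry->{author} = $1;
            $in_affects = 0;
        } elsif ($in_affects) {
            if ($line =~ m/^\s*(\S.*)$/) {
                $entry->{affects} .= ' ' . $1;   # continuation of AFFECTS
            } else {
                $in_affects = 0;                 # a blank line ends AFFECTS
            }
        } else {
            $entry->{msg} .= $line . "\n";
        }
    }
    return @entries;
}

# Sample input: the first entry is modelled on www/horde, the second is invented.
my @sample = (
    '20060414:',
    '  AFFECTS: users of www/horde, www/horde-php5, www/horde-passwd,',
    '           deskutils/nag, deskutils/mnemo',
    '  AUTHOR: shaun@inerd.com',
    '',
    '  The Horde ports no longer overwrite your existing configuration',
    '20000101:',
    '  AFFECTS: users of example/port',
    '',
    '  Body text.',
);

my @entries = parse_updating(@sample);
printf "%s: %s\n", $_->{date}, $_->{affects} for @entries;
```

Keeping the flag per entry means a missing AUTHOR line cannot leak the AFFECTS state into the next dated entry.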

To ensure I was not introducing a new bug, I compared the ports_updating table before and after running the new script. The only differences found were those related to fixing this bug.
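A before-and-after comparison like this can be sketched as below. This is only an illustration of the idea, not the commands I actually ran: the snapshot layout (a hash keyed by date and port, holding the AFFECTS text) and the sub name diff_snapshots are assumptions, and the sample rows are made up.

```perl
use strict;
use warnings;

# Hypothetical helper: list the keys whose value was added, removed or changed.
sub diff_snapshots {
    my ($before, $after) = @_;
    my %seen = map { $_ => 1 } keys %$before, keys %$after;
    my @changed;
    for my $key (sort keys %seen) {
        my $old = $before->{$key};
        my $new = $after->{$key};
        next if defined $old && defined $new && $old eq $new;   # unchanged row
        push @changed, $key;
    }
    return @changed;
}

# Made-up snapshots: key is "date port", value is the stored AFFECTS text.
my %before = (
    '20060414 www/horde' => 'users of www/horde,',
    '20060101 example/a' => 'users of example/a',
);
my %after = (
    '20060414 www/horde'     => 'users of www/horde, deskutils/nag',
    '20060414 deskutils/nag' => 'users of www/horde, deskutils/nag',
    '20060101 example/a'     => 'users of example/a',
);

my @changed = diff_snapshots(\%before, \%after);
print "$_\n" for @changed;
```

Every key that comes back should be explainable by the multi-line AFFECTS fix; anything else would point to a new bug.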

As a result of Shaun noticing this bug, ports that previously did not have UPDATING notes now have the correct entries displayed within FreshPorts. :)


1 thought on “Improving the parsing of /usr/ports/UPDATING”

  1. Please, would it be possible to get a copy of the complete script? Or give an address, where to fetch it from?
    I’ve just started to do the same by myself, but, of course, not yet so improved as your’s.

    Thanks in advance, Rick.
