Variation in ListId values

FreshPorts processes commits via email. That is, emails coming in from the mailing lists are received by FreshPorts and parsed, and the resulting data is loaded into the database.

Today a problem arose. I received these error messages:

This List-Id/Message-Id combination is not known to this script.
List-Id='SVN commit messages for the ports tree for head '
Message-Id='201407091415.s69EFKwL020997@svn.freebsd.org'

This List-Id/Message-Id combination is not known to this script.
List-Id='SVN commit messages for the ports tree for head '
Message-Id='201407091212.s69CCiqN066033@svn.freebsd.org'

This List-Id/Message-Id combination is not known to this script.
List-Id='"SVN commit messages for the entire doc trees \(except for " user" , " projects" , and " translations" \)" '
Message-Id='201407091109.s69B9Alk035048@svn.freebsd.org'

The issue: that’s not what FreshPorts is looking for in the List-Id field. We look for exact matches. Looking at all the values processed today, we see:

$ grep -h 'List-Id' *.txt.raw | sort | uniq | less
List-Id: "SVN commit messages for the entire doc trees \(except for "
List-Id: "SVN commit messages for the entire src tree \(except for "
List-Id: SVN commit messages for the ports tree for head

For some reason, and only on this particular FreshPorts installation, the ListId is slightly different. In most messages, the ListId is spread across two lines, but in these three messages the two lines were combined into one.
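
To make the failure concrete, here is a minimal sketch of an exact-match hash lookup against the two forms of the ports ListId. The one-entry %MailingLists table and the 'ports' value are stand-ins for illustration, not the real FreshPorts data.

```perl
use strict;
use warnings;

# Cut-down stand-in for the real table; 'ports' is a placeholder value.
my %MailingLists = (
    'SVN commit messages for the ports tree for head' => { repo => 'ports' },
);

# The usual form, and the combined form with its trailing space.
my $usual    = 'SVN commit messages for the ports tree for head';
my $combined = 'SVN commit messages for the ports tree for head ';

for my $ListId ($usual, $combined) {
    # A hash lookup compares the whole key, so one extra space is a miss.
    my $status = exists $MailingLists{$ListId} ? 'known' : 'not known';
    print "'$ListId' is $status\n";
}
```

The first lookup succeeds and the second does not, which matches the error messages above.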

Here is the code change I made:

$ svn diff branches.pm
Index: branches.pm
===================================================================
--- branches.pm	(revision 4631)
+++ branches.pm	(working copy)
@@ -23,10 +23,18 @@
      'process' => 'process_svn_mail',
      'repo'    => $FreshPorts::Config::Repo_DOC,
      },
+  '"SVN commit messages for the entire doc trees \(except for " user" , " projects" , and " translations" \)" <svn-doc-all.freebsd.org>' => {
+     'process' => 'process_svn_mail',
+     'repo'    => $FreshPorts::Config::Repo_DOC,
+     },
   'SVN commit messages for the ports tree for head' => {
      'process' => 'process_svn_mail',
      'repo'    => $FreshPorts::Config::Repo_PORTS,
      },
+  'SVN commit messages for the ports tree for head ' => {
+     'process' => 'process_svn_mail',
+     'repo'    => $FreshPorts::Config::Repo_PORTS,
+     },
   'SVN commit messages for all the branches of the ports tree' => {
     'process' => 'process_svn_mail',
     'repo'    => $FreshPorts::Config::Repo_PORTS,

But to understand what is going on, here is the whole data structure:

# these are the mailing lists associated with those branches
%FreshPorts::Branches::MailingLists = (
  '"SVN commit messages for the entire src tree \(except for "' => {
     'process' => 'process_svn_mail',
     'repo'    => $FreshPorts::Config::Repo_SRC,
     },
  '"SVN commit messages for the entire doc trees \(except for "' => {
     'process' => 'process_svn_mail',
     'repo'    => $FreshPorts::Config::Repo_DOC,
     },
  '"SVN commit messages for the entire doc trees \(except for " user" , " projects" , and " translations" \)" <svn-doc-all.freebsd.org>' => {
     'process' => 'process_svn_mail',
     'repo'    => $FreshPorts::Config::Repo_DOC,
     },
  'SVN commit messages for the ports tree for head' => {
     'process' => 'process_svn_mail',
     'repo'    => $FreshPorts::Config::Repo_PORTS,
     },
  'SVN commit messages for the ports tree for head ' => {
     'process' => 'process_svn_mail',
     'repo'    => $FreshPorts::Config::Repo_PORTS,
     },
  'SVN commit messages for all the branches of the ports tree' => {
    'process' => 'process_svn_mail',
    'repo'    => $FreshPorts::Config::Repo_PORTS,
     },
  'CVS commit messages for the ports tree' => {
     'process' => 'process_cvs_mail',
     'repo'    => '',
     },
  'CVS commit messages for the doc and www trees' => {
     'process' => 'process_cvs_mail',
     'repo'    => '',
     },
  '\*\*OBSOLETE\*\* CVS commit messages for the entire tree' => {
     'process' => 'process_cvs_mail',
     'repo'    => '',
     },
  '\*\*OBSOLETE\*\* CVS commit messages for the src tree' => {
     'process' => 'process_cvs_mail',
     'repo'    => '',
     },
  'CVS commit messages for the projects tree' => {
     'process' => 'process_cvs_mail',
     'repo'    => '',
     },
);

#
# given a list id, grab the properties for it
#
sub ListProperties($)
{
 my $ListId = shift;
 my $hash;

 # strip off the leading header. 
 $ListId =~ s/List-Id:\s+//;

 $hash = $FreshPorts::Branches::MailingLists{$ListId};

 return $hash;
}

Yes, this is Perl.

And yes, we should do this with a regex, matching the first part of the string as shown above. How would you do that?
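
One possible answer, as a minimal sketch rather than the code FreshPorts runs: keep an ordered list of anchored prefix patterns and return the properties of the first one that matches. The names ListPropertiesByPrefix and @MailingListPatterns are hypothetical, the plain strings 'src', 'doc', and 'ports' stand in for the $FreshPorts::Config::Repo_* values, and only the SVN lists are shown.

```perl
use strict;
use warnings;

# Hypothetical table: each entry is [ prefix pattern, properties ].
# The repo strings are placeholders for $FreshPorts::Config::Repo_*.
my @MailingListPatterns = (
    [ qr/^"?SVN commit messages for the entire src tree/,
      { process => 'process_svn_mail', repo => 'src' } ],
    [ qr/^"?SVN commit messages for the entire doc trees/,
      { process => 'process_svn_mail', repo => 'doc' } ],
    [ qr/^SVN commit messages for the ports tree for head\b/,
      { process => 'process_svn_mail', repo => 'ports' } ],
    [ qr/^SVN commit messages for all the branches of the ports tree\b/,
      { process => 'process_svn_mail', repo => 'ports' } ],
);

#
# given a list id, grab the properties for the first matching prefix
#
sub ListPropertiesByPrefix {
    my ($ListId) = @_;

    # strip off the leading header name, if present
    $ListId =~ s/^List-Id:\s+//;

    for my $entry (@MailingListPatterns) {
        my ($pattern, $properties) = @$entry;
        return $properties if $ListId =~ $pattern;
    }

    # no pattern matched: undef in scalar context
    return;
}

my $props = ListPropertiesByPrefix('List-Id: SVN commit messages for the ports tree for head ');
print defined $props ? "repo=$props->{repo}\n" : "unknown list\n";
```

Because each pattern is anchored at the start and stops before the point where the variation appears, the two-line and combined forms both resolve to the same entry. The list is scanned in order, so a more specific prefix would need to come before a more general one.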
