Processing git and svn on the same ingress node

Some blog posts help me think my way through to a solution. This is one of them.

Today I realized the code needs to handle both git and svn. I thought I would have one cut-over date after which all commits would go through git. I see now that this isn’t the way to go. The code has to be ready to import both git and svn commits. But not from the same tree. We don’t want duplicates.

We have three repos:

  1. doc
  2. ports
  3. src

So what next?

Today, I’m going to take an XML file from dev and see if devgit can import it. I fully expect errors.

Use of uninitialized value $Updates{"commit_hash"} in concatenation (.) or string at /usr/local/lib/perl5/site_perl/FreshPorts/xml_munge_git.pm line 622.
Use of uninitialized value $Updates{"FileRevision"} in concatenation (.) or string at /usr/local/lib/perl5/site_perl/FreshPorts/xml_munge_git.pm line 623.
Use of uninitialized value $FileRevision in concatenation (.) or string at /usr/local/lib/perl5/site_perl/FreshPorts/xml_munge_git.pm line 624.
no value set for incoming RepoName at /usr/local/lib/perl5/site_perl/FreshPorts/xml_munge_git.pm line 555.

And we have them. Yes, there is no commit_hash in the subversion XML.

The git ingress code is different because it has to handle a commit_hash, in both long (1cabbda44f7f82543402b6a988976020afda2c46) and short (1cabbda) forms.
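To make the long and short forms concrete, here is a minimal Python sketch, not the FreshPorts Perl code. It assumes a long hash is exactly 40 hexadecimal characters and a short hash is a hexadecimal prefix of at least 7 characters. The function name classify_commit_hash is made up for illustration.

```python
import re

# Assumption: a long git hash is 40 hex characters (SHA-1) and a short
# hash is an abbreviated hex prefix of 7 to 39 characters.
LONG_HASH = re.compile(r"[0-9a-f]{40}")
SHORT_HASH = re.compile(r"[0-9a-f]{7,39}")


def classify_commit_hash(value):
    """Return 'long', 'short', or None for a candidate commit hash.

    Hypothetical helper for illustration only.
    """
    value = value.strip().lower()
    if LONG_HASH.fullmatch(value):
        return "long"
    if SHORT_HASH.fullmatch(value):
        return "short"
    return None


print(classify_commit_hash("1cabbda44f7f82543402b6a988976020afda2c46"))
print(classify_commit_hash("1cabbda"))
```

A value such as r367890 is rejected because r is not a hexadecimal digit. A purely numeric value of seven or more digits would still look like a short hash, so a check like this cannot on its own tell a git hash from an svn revision number.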

There is no code in the git branch to handle importing svn commits.

What about keeping the two websites separate?

Let’s consider this scenario.

doc starts using git first. Then src, then ports.

Let git.freshports.org process the git commits. Let www.freshports.org process the svn commits.

When everything is transitioned to git, promote git.freshports.org to www.freshports.org

No, that won’t work. The databases are separate. The git website won’t have all the commits which were processed by the svn website….

It has to be one database

We know the git database can handle svn commits, because they are already present. We know the website can already display svn commits, because it already does.

Yes, it’s just the ingress side which has to be updated now. Or rather, svn-specific code merged into the git branch.

This might be feasible if we keep svn and git processing completely separate and don’t try to make one do both.

Avoiding concurrency issues

FreshPorts has always processed one commit at a time, in the order they are received. For cvs and svn, ‘order’ was defined by when the commit mailing list email was received. Commits received out of order will produce interesting results, such as a port version decreasing. There is no simple solution to that issue as far as I know.

For git, commits are again processed in order: the order in which they appear in the tree, not commit date order.

But with two input streams, I’d rather avoid having two commits being processed at the same time. It is unlikely that any concurrency issues would arise, but I’d rather just avoid that.

That means two separate message queues and processing, but only one consumer of those two queues.
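Here is a rough Python sketch of one consumer working through two queues, under a few assumptions of mine: each queue is a directory of files, the file names start with a sortable timestamp (as the queue file names in this post do), and sorting by name is an acceptable order across both queues. The names pending_commits and process_all are invented for the sketch.

```python
from pathlib import Path


def pending_commits(queue_dirs):
    """List the files waiting in every queue directory, oldest name first.

    Assumption: file names begin with a YYYY.MM.DD.HH.MM.SS style
    timestamp, so sorting by name approximates arrival order.
    """
    files = []
    for directory in queue_dirs:
        files.extend(p for p in Path(directory).iterdir() if p.is_file())
    return sorted(files, key=lambda p: p.name)


def process_all(queue_dirs, process):
    """Hand each pending file to `process`, strictly one at a time."""
    for path in pending_commits(queue_dirs):
        # A single loop in a single process: no two commits overlap.
        process(path)
```

Because there is only one loop, a commit from one queue cannot be processed while a commit from the other is still in flight. Whether name order is the right order between svn and git commits is a separate question that this sketch does not answer.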

The svn outline

For svn, commits are processed like this:

  1. email arrives
  2. raw email is dumped into ~ingress/message-queues/incoming/2020.11.29.17.43.12.53448.txt
  3. the above is all handled by ~ingress/.mailfilter and these configuration settings in /usr/local/etc/postfix/main.cf
    mailbox_command = /usr/local/bin/maildrop -d ${USER}
    setgid_group = maildrop
    
  4. fp-daemon.sh sees 2020.11.29.17.43.12.53448.txt
  5. XML is created and dumped into 2020.11.29.17.43.12.53448.txt.xml but in ~freshports/message-queues/recent/
  6. XML is processed and loaded into the database.
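The naming in steps 2 and 5 can be sketched as follows. This is only an illustration in Python, assuming the XML file keeps the raw email file name and gains an .xml suffix, as the example names above suggest. The function xml_path_for is hypothetical.

```python
from pathlib import Path


def xml_path_for(raw_email, xml_dir):
    """Map incoming/NAME.txt to xml_dir/NAME.txt.xml.

    Hypothetical helper: the real code may build the name differently.
    """
    return Path(xml_dir) / (Path(raw_email).name + ".xml")


print(xml_path_for(
    "message-queues/incoming/2020.11.29.17.43.12.53448.txt",
    "message-queues/recent",
).name)
```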

The git outline

For git processing, there is no incoming email. Instead, we poll the local working copy of the git repo after a git fetch.

  1. The FreeBSD periodic system invokes /usr/local/etc/periodic/everythreeminutes/215.fp_check_git_for_commits
  2. If a new commit is found, it is extracted from the repo and a new file is created: ~ingress/message-queues/incoming/2020.10.01.19.50.02.000000.4796a64ade4267608e861f717e443c0290b73b70.xml – yes, that is a timestamp and a commit hash in that filename.
  3. The freshports daemon notices a new file in the incoming directory.
  4. XML is processed and loaded into the database.
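The file name in step 2 carries enough information to be parsed back into its parts. Here is a minimal Python sketch, assuming the layout is timestamp, a numeric field, a 40-character commit hash, and an .xml suffix. I treat the 000000 field as an opaque number because its meaning is not spelled out here. parse_git_queue_name is a made-up name.

```python
import re
from datetime import datetime

# Assumed layout: YYYY.MM.DD.HH.MM.SS.<digits>.<40 hex chars>.xml
GIT_QUEUE_NAME = re.compile(
    r"(?P<stamp>\d{4}\.\d{2}\.\d{2}\.\d{2}\.\d{2}\.\d{2})"
    r"\.(?P<number>\d+)"
    r"\.(?P<commit_hash>[0-9a-f]{40})\.xml"
)


def parse_git_queue_name(name):
    """Return (timestamp, number, commit_hash), or None if the name
    does not follow the assumed git queue layout."""
    match = GIT_QUEUE_NAME.fullmatch(name)
    if match is None:
        return None
    when = datetime.strptime(match["stamp"], "%Y.%m.%d.%H.%M.%S")
    return when, match["number"], match["commit_hash"]


print(parse_git_queue_name(
    "2020.10.01.19.50.02.000000.4796a64ade4267608e861f717e443c0290b73b70.xml"
))
```

An svn name such as 2020.11.29.17.43.12.53448.txt.xml does not match, so the same check could help tell the two kinds of file apart.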

Joining the two outlines

The solution I see is to modify both outlines so they stop at creating the XML file, each in its own directory.

The freshports daemon then scans both directories and processes them accordingly.

The fp-daemon code (or more specifically, the code it invokes) will be modified so it only creates the XML and does not process it.
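The scanning step might look something like this in Python. It is a sketch under my own assumptions, not the daemon's actual code: each directory is paired with the loader for that repository type, and the loaders here (load_svn_xml, load_git_xml) are placeholders that only report what they were given.

```python
from pathlib import Path


def load_svn_xml(xml_file):
    # Placeholder for the svn-specific import (hypothetical).
    return ("svn", xml_file.name)


def load_git_xml(xml_file):
    # Placeholder for the git-specific import (hypothetical).
    return ("git", xml_file.name)


def scan_once(queues):
    """Visit each (directory, loader) pair and load its XML files in
    name order, one file at a time."""
    results = []
    for directory, loader in queues:
        for xml_file in sorted(Path(directory).glob("*.xml")):
            results.append(loader(xml_file))
    return results
```

Keeping one loader per directory means the svn and git code paths stay completely separate. The only shared part is the loop that decides which file goes next.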
