Aug 012006
 

PostgreSQL is my database of choice. Any who has worked with commercial databases will find more in common with PostgreSQL than any other open source database.

I moved to PostgreSQL for the transactions, the stored procecdures, and the relational integrity Sure, other databases have begun to catch up, but they still don’t have the polish and professional feel of PostgreSQL.

With FreshPorts, there are certain things that occur within the database that require external processes to take action. For example, when a change to a port is processed, the cache entry for that port needs to be deleted. At present, the code that processes the incoming commits does that. The problem is, there are several (at least four) pieces of code that can alter a port:

  1. process_cvs_mail.pl
  2. process_moved.pl
  3. process_updating.pl
  4. process_vuxml.pl

There may be others (check this blog for references to cache) that I have forgotten about. To complicate the issue, each of these scripts polls a flag to see if they need to be run. And each of these scripts needs to clear the cache.

I’ve found a better way!

PostgreSQL has a LISTEN/NOTIFY directive. It allows one process to issue a NOTIFY command and another to LISTEN for it. Is it much like the OBSERVER design model.

Why Python? I’ve always liked Python, despite never having used it. I’ve always been looking for a project suitable for my first Python program.

I started this adventure by installing psycopg. It was recommend by a few and I stumbled across it independently. Then I started looking for my first simple python program. I took one from /usr/local/share/examples/py-psycopg and used it. I was able to connect to the database and select now() and get a result. Easy!

The next step was to find a working Python example that used LISTEN. From there, it was a simple matter of copy/paste. The result appears below:

DSN = 'dbname=test user=dan'

import sys, psycopg, select

print "Opening connection using dns:", DSN
conn = psycopg.connect(DSN)
conn.autocommit(1)

curs = conn.cursor()

curs.execute("listen test")

print "Waiting for 'NOTIFY test'"
while 1:
  select.select([curs],[],[])==([],[],[])
  curs.execute("SELECT 1")
  notifies = curs.notifies()
  for n in notifies:
    # in real life, do something with each...
    print n[0]

Here’s it working in anger:

$ python listen3.py
Opening connection using dns: dbname=test user=dan
Waiting for 'NOTIFY test'

On another console, I typed this:

$ psql test
Welcome to psql 8.1.3, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
       \h for help with SQL commands
       \? for help with psql commands
       \g or terminate with semicolon to execute query
       \q to quit

test=# notify test;
NOTIFY
test=#

On the first screen, the word “test” will appear. Nifty!

Now, this gets more interesting. If you issue the NOTIFY within a transaction, the LISTENing process does not actually get a notification until the commit. This allows you to rollback a transaction without bothering the listeners. After all, if you rollback, it’s as if nothing happened.

What is even more interesting: if you issue multiple NOTIFY commands with the same name within one transaction, only one signal is received by each LISTENer. That’s nice. It means you can put your NOTIFY commands in different places without bothering to make sure JUST one is invoked. Nice. Very nice.

With the above approach, I plan to put a trigger on the Ports table. That will add an entry to the ports_cache_clearing table, and raise a ‘NOTIFY clear_ports_cache’. The listing Python script will then invoke another Python script to do something like this:

SELECT category, port from ports_cache_clearing;
FOR EACH category, port
   remove that cache file from disk
END FOR
DELETE FROM ports_cache_clearing;

You may think: what if a new entry is added to the ports_cache_clearing table between the SELECT and the DELETE. I would answer: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE (for more information).

Setting the transaction level ensures that my script will never see anything else added to the table. Thus, when I delete, I will be deleting only those rows that were present when the transaction started. Or more precisely, when the ISOLATION LEVEL was set.

I can see myself creating a NOTIFY for each of the scripts I listed above.

I can tell this will be fun.