Jul 252016
 

FreshPorts uses a cache for every port page (e.g. https://www.freshports.org/sysutils/bacula-server/). When an update to a port occurs, we must clear the cache for that port. This sounds simple, and it is. The various ways in which updates occur complicates the situation.

This post is all about fixing one edge case: the addition or removal of a port dependency (e.g. RUN_DEPENDS, LIB_DEPENDS).

When port A depends on port B, this fact is listed on both pages, but adding/removing a dependency is not properly handled (see this bug). Tonight, I have coded a fix. I’d like it to be less complex, but it has been fixed.

The table

This is the table in question:

freshports.org=# \d port_dependencies
         Table "public.port_dependencies"
         Column         |     Type     | Modifiers 
------------------------+--------------+-----------
 port_id                | integer      | not null
 port_id_dependent_upon | integer      | not null
 dependency_type        | character(1) | not null
Indexes:
    "port_dependencies_pkey" PRIMARY KEY, btree (port_id, port_id_dependent_upon, dependency_type)
Foreign-key constraints:
    "port_dependencies_port_id_dependent_upon_fkey" FOREIGN KEY (port_id_dependent_upon) REFERENCES ports(id) ON UPDATE CASCADE ON DELETE CASCADE
    "port_dependencies_port_id_fkey" FOREIGN KEY (port_id) REFERENCES ports(id) ON UPDATE CASCADE ON DELETE CASCADE

freshports.org=# 

Using the terminology from the first sentence, port A and port B are represented by port_id and port_id_dependent_upon respectively. Any new row or any deleted row must invalidate the cache for both port A and port B.

Keeping track of cache to clear

This is the table we use to clear the ports cache:

freshports.org=# \d cache_clearing_ports
                                     Table "public.cache_clearing_ports"
   Column   |            Type             |                             Modifiers                             
------------+-----------------------------+-------------------------------------------------------------------
 id         | integer                     | not null default nextval('cache_clearing_ports_id_seq'::regclass)
 port_id    | integer                     | not null
 category   | text                        | not null
 port       | text                        | not null
 date_added | timestamp without time zone | not null default now()
Indexes:
    "cache_clearing_ports_pkey" PRIMARY KEY, btree (id)
Foreign-key constraints:
    "cache_clearing_ports_port_id_fkey" FOREIGN KEY (port_id) REFERENCES ports(id) ON UPDATE CASCADE ON DELETE CASCADE

freshports.org=# 

We add rows to this table to indicate that a given port must be cleared from the cache.

On INSERT

On an insert to the port_dependencies table, let’s do this:

CREATE OR REPLACE FUNCTION port_dependencies_insert_clear_cache() RETURNS TRIGGER AS $$
   DECLARE
      l_cache_clearing_ports_id   int8;
      l_port      text;
      l_category  text;
   BEGIN
        --
        -- This function handles the addition of a new dependency.
        -- yes, we need to clear the cache for both port_id and port_id_dependent_upon
        -- from the cache_clearing_ports.  I figure there is a shorter way to this but
        -- I cannot think it through right now.
        --

        IF TG_OP = 'INSERT' THEN
         -- handle port A (port_id)
         SELECT port_id
           INTO l_cache_clearing_ports_id
           FROM cache_clearing_ports
          WHERE port_id = NEW.port_id;

          IF NOT FOUND THEN
            SELECT category, name
              INTO l_category, l_port
              FROM ports_all
             WHERE id = NEW.port_id;
            INSERT INTO cache_clearing_ports (port_id, category, port)
            VALUES (NEW.port_id, l_category, l_port);
          END IF;

         -- handle port B (port_id_dependent_upon)
         SELECT port_id
           INTO l_cache_clearing_ports_id
           FROM cache_clearing_ports
          WHERE port_id = NEW.port_id_dependent_upon;

          IF NOT FOUND THEN
            SELECT category, name
              INTO l_category, l_port
              FROM ports_all
             WHERE id = NEW.port_id_dependent_upon;
            INSERT INTO cache_clearing_ports (port_id, category, port)
            VALUES (NEW.port_id_dependent_upon, l_category, l_port);
          END IF;

          NOTIFY port_updated;
      END IF;

      -- when a port changes, add an entry to the cache clearing table
      RETURN NEW;
   END
$$ LANGUAGE 'plpgsql';

  DROP TRIGGER IF EXISTS port_dependencies_insert_clear_cache ON port_dependencies;
CREATE TRIGGER port_dependencies_insert_clear_cache
    AFTER INSERT on port_dependencies
    FOR EACH ROW
    EXECUTE PROCEDURE port_dependencies_insert_clear_cache();

That function is rather long, and I know it can be simplified, but that’s not on tonight’s agenda. If you are so inclined, I am happy to hear your suggestions. Specifically, I think I can do an INSERT INTO … WHERE NOT EXISTS and do the SELECT & INSERT in one step, without the need for IF FOUND.

On DELETE

Here is the code for the DELETE.

CREATE OR REPLACE FUNCTION port_dependencies_delete_clear_cache() RETURNS TRIGGER AS $$
   DECLARE
      l_cache_clearing_ports_id   int8;
      l_port      text;
      l_category  text;
   BEGIN
        --
        -- This function handles the deletion of a existing dependency.
        -- yes, we need to clear the cache for both port_id and port_id_dependent_upon
        -- from the cache_clearing_ports.  I figure there is a shorter way to this but
        -- I cannot think it through right now.
        --

        IF TG_OP = 'DELETE' THEN
         SELECT port_id
           INTO l_cache_clearing_ports_id
           FROM cache_clearing_ports
          WHERE port_id = OLD.port_id;

          IF NOT FOUND THEN
            SELECT category, name
              INTO l_category, l_port
              FROM ports_all
             WHERE id = OLD.port_id;
            INSERT INTO cache_clearing_ports (port_id, category, port)
            VALUES (OLD.port_id, l_category, l_port);
          END IF;

         SELECT port_id
           INTO l_cache_clearing_ports_id
           FROM cache_clearing_ports
          WHERE port_id = OLD.port_id_dependent_upon;

          IF NOT FOUND THEN
            SELECT category, name
              INTO l_category, l_port
              FROM ports_all
             WHERE id = OLD.port_id_dependent_upon;
            INSERT INTO cache_clearing_ports (port_id, category, port)
            VALUES (OLD.port_id_dependent_upon, l_category, l_port);
          END IF;

          NOTIFY port_updated;
      END IF;

      -- when a port changes, add an entry to the cache clearing table
      RETURN OLD;
   END
$$ LANGUAGE 'plpgsql';

  DROP TRIGGER IF EXISTS port_dependencies_delete_clear_cache ON port_dependencies;
CREATE TRIGGER port_dependencies_delete_clear_cache
    AFTER DELETE on port_dependencies
    FOR EACH ROW
    EXECUTE PROCEDURE port_dependencies_delete_clear_cache();

It is nearly identical to the first function. Again, those of you who want to suggest improvements there, feel free.

I will run this in dev for a while and see how it goes. Initial testing has been promising.

NOTE: there are no updates to this table, by design. Instead when we see a new commit to a port, we do a DELETE from port_dependencies WHERE port_id = :a.