VUXML – doing it faster – FreshPorts News

It takes a long time for FreshPorts to process a new vuln.xml file. A very long time. Why? Because it process all entries in the file. Every time.

Anton Berezin proposed an idea. I paraphrase, but in brief, he said:

Figuring out what changed is as easy as hashing an entry and storing it in a db

That’s a very good start. I’ve checked the code, and the sticking point is going to be is table:

freshports.org=# \d ports_vulnerable
    Table "public.ports_vulnerable"
 Column  |  Type   |     Modifiers      
---------+---------+--------------------
 port_id | integer | not null
 current | integer | not null default 1
 past    | integer | not null default 0
Indexes:
    "ports_vulnerable_pkey" PRIMARY KEY, btree (port_id)
Foreign-key constraints:
    "$1" FOREIGN KEY (port_id) REFERENCES ports(id) ON UPDATE RESTRICT ON DELETE CASCADE
Triggers:
    ports_vulnerable_delete_clear_cache AFTER DELETE ON ports_vulnerable FOR EACH ROW EXECUTE PROCEDURE ports_vulnerable_delete_clear_cache()

freshports.org=#

This table is used to decide whether or not to display a grey or black skull at the top of each port page. This table is populated with two different queries. The first, figures out the number of vulnerabilities, current and past, that affect a given port:

  SELECT count(distinct vuxml_id)
    INTO l_VulnCount
    FROM commit_log_ports_vuxml
   WHERE port_id = p_PortID;

The second, determines the number of vulnerabilities that currently affect the port:

    SELECT count(distinct vuxml_id)
      INTO l_VulnCurrent
      FROM commit_log_ports_vuxml CLPV, commit_log_ports CLP, ports P, ports_vulnerable PV
     WHERE CLP.commit_log_id = CLPV.commit_log_id
       AND CLPV.port_id      = CLP.port_id
       AND P.id              = CLP.port_id
       AND P.version         = CLP.port_version
       AND P.revision        = CLP.port_revision
       AND CLP.port_epoch    = P.portepoch
       AND PV.port_id        = P.id
       AND P.id              = p_PortID
     GROUP BY P.id;

NOTE: I am not sure of the reason for referencing ports_vulnerable here… My first thought is it is unnecessary. I’m not sure how it is to reference a table that is just about to get updated.

The difference between the two values, if any, is the number of past vulnerabilities.

The problem is an edge case where a vuln entry affects a port, then does not affect a port. For example, consider that a new vuln entry is created and mistakenly refers to PortA instead of PortB. At present, we wipe all entries from the ports_vulnerable table each time we process a new vuln.xml file. Thus, we start from nil each time. If we are now processing just one vuln at a time, we may have a lingering entry in ports_vulnerable which should be deleted.

How do we detect this situation?

Well, when we process the new vuln, we will delete the old one first, from the vuxml table. When this happens, various triggers are set off to delete related records. We can harness that to delete entries in ports_vulnerable as well.

And how?

Via this table:

freshports.org=# \d commit_log_ports_vuxml
                             Table "public.commit_log_ports_vuxml"
    Column     |  Type   |                              Modifiers                              
---------------+---------+---------------------------------------------------------------------
 id            | integer | not null default nextval('commit_log_ports_vuxml_id_seq'::regclass)
 commit_log_id | integer | not null
 port_id       | integer | not null
 vuxml_id      | integer | not null
Indexes:
    "commit_log_ports_vuxml_pkey" PRIMARY KEY, btree (id)
Foreign-key constraints:
    "$1" FOREIGN KEY (vuxml_id) REFERENCES vuxml(id) ON UPDATE CASCADE ON DELETE CASCADE
    "$2" FOREIGN KEY (port_id) REFERENCES ports(id) ON UPDATE CASCADE ON DELETE CASCADE
    "$3" FOREIGN KEY (commit_log_id) REFERENCES commit_log(id) ON UPDATE CASCADE ON DELETE CASCADE

freshports.org=#

When the vuxml entry is deleted, we can see all the ports affected by this vuln. We can then delete entries from the ports_vulnerable table as well.

Or so I think….