Mar 092021
 

In yesterdays blog post, Replacing a column search with a full text search solution, I wanted to mention how adding a LIMIT 10 to a fast running query made it run slower. I didn’t remember that issue until this morning. Fortunately, the fix was low cost.

The query

This is the query I ran:

EXPLAIN ANALYSE
WITH short_list AS (
    SELECT
        port_id
    FROM
        generate_plist
    WHERE
        textsearchable_index_col @@ to_tsquery('share/examples/acme.sh/dnsapi/dns_1984hosting.sh')
)
  select CL.commit_date - SystemTimeAdjust() AS last_commit_date, 
         P.id, 
         E.name as port,
         C.name as category, 
...
         P.uses  
       FROM short_list, ports P
       LEFT OUTER JOIN ports_vulnerable    PV  ON PV.port_id       = P.id
       LEFT OUTER JOIN commit_log          CL  ON P.last_commit_id = CL.id,
       element_pathname EP,
       categories C, element E 
 WHERE P.id = short_list.port_id
   AND P.element_id = EP.element_id and EP.pathname like '/ports/head/%'
   AND P.category_id  = C.id
   AND P.element_id   = E.id  AND E.status = 'A';

The plan (as found at https://explain.depesz.com/s/Bkkg) was fast enough for me.

Adding LIMIT 10

This was used on the FreshPorts search page (not in production, but on the dev website). It needs a limit, so I added LIMIT 10.

The plan changed to a sequential scan of a large table. Boom, 28 seconds. That is unsatisfactory. The plan is at https://explain.depesz.com/s/oJZB.

The Fix

The fix, as pointed out by Thomas Hurst in his reply to my tweet centered around a CTE (Common Table Expressions). That initially confused me, because the query was already using a CTE. That’s the WITH short_list AS part of the query.

Then I read the AS MATERIALIZED part of his reply and started reading up on that. I changed the query to:

EXPLAIN ANALYSE
WITH short_list AS MATERIALIZED (
    SELECT
        port_id
    FROM
        generate_plist
    WHERE
        textsearchable_index_col @@ to_tsquery('share/examples/acme.sh/dnsapi/dns_1984hosting.sh')
)
  select CL.commit_date - SystemTimeAdjust() AS last_commit_date, 
         P.id, 
         E.name as port,
         C.name as category, 
...
         P.uses  
       FROM short_list, ports P
       LEFT OUTER JOIN ports_vulnerable    PV  ON PV.port_id       = P.id
       LEFT OUTER JOIN commit_log          CL  ON P.last_commit_id = CL.id,
       element_pathname EP,
       categories C, element E 
 WHERE P.id = short_list.port_id
   AND P.element_id = EP.element_id and EP.pathname like '/ports/head/%'
   AND P.category_id  = C.id
   AND P.element_id   = E.id  AND E.status = 'A'
LIMIT 10;

You can see the change on line 2: AS MATERIALIZED.

The plan (as found at https://explain.depesz.com/s/UBps) takes 1.4s and is back to using an index scan.

Further reading:

Website Pin Facebook Twitter Myspace Friendfeed Technorati del.icio.us Digg Google StumbleUpon Premium Responsive