I noticed the server load was high and saw that several instances of the search-by-date query were still running.
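For anyone wanting to spot such queries themselves, something like the following works; this is a sketch assuming PostgreSQL 9.2 or later, where pg_stat_activity exposes the state and query columns (older releases use current_query instead):

```sql
-- Show active queries that have been running for more than a minute,
-- longest first. Assumes PostgreSQL 9.2+ column names.
SELECT pid,
       now() - query_start AS duration,
       query
FROM   pg_stat_activity
WHERE  state = 'active'
  AND  now() - query_start > interval '1 minute'
ORDER  BY duration DESC;
```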
Looking at the query in question:
freshports.org=# explain
freshports.org-# SELECT count(DISTINCT CL.id) AS count
freshports.org-# FROM commit_log_ports CLP, commit_log CL JOIN commit_log_branches CLB ON CL.id = CLB.commit_log_id
freshports.org-# JOIN system_branch SB ON SB.branch_name = 'head' AND SB.id = CLB.branch_id
freshports.org-# WHERE CL.commit_date BETWEEN '2014/11/07'::timestamptz + SystemTimeAdjust()
freshports.org-# AND '2014/11/07'::timestamptz + SystemTimeAdjust() + '1 Day'
freshports.org-# AND CLP.commit_log_id = CL.id;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=684.40..684.41 rows=1 width=4)
-> Merge Join (cost=43.20..684.40 rows=1 width=4)
Merge Cond: (clb.commit_log_id = cl.id)
-> Nested Loop (cost=2.55..3125771.74 rows=4875 width=8)
Join Filter: (clb.commit_log_id = clp.commit_log_id)
-> Index Scan using commit_log_ports_pkey on commit_log_ports clp (cost=0.00..544853.97 rows=630110 width=4)
-> Materialize (cost=2.55..618.00 rows=273 width=4)
-> Hash Join (cost=2.55..616.64 rows=273 width=4)
Hash Cond: (clb.branch_id = sb.id)
-> Seq Scan on commit_log_branches clb (cost=0.00..485.26 rows=33626 width=8)
-> Hash (cost=2.54..2.54 rows=1 width=4)
-> Seq Scan on system_branch sb (cost=0.00..2.54 rows=1 width=4)
Filter: (branch_name = 'head'::text)
-> Sort (cost=40.61..40.63 rows=9 width=4)
Sort Key: cl.id
-> Index Scan using commit_log_commit_date on commit_log cl (cost=0.01..40.47 rows=9 width=4)
Index Cond: ((commit_date >= ('2014-11-07 00:00:00+00'::timestamp with time zone + '00:00:00'::interval)) AND (commit_date <= (('2014-11-07 00:00:00+00'::timestamp with time zone + '00:00:00'::interval) + '1 day'::interval)))
(17 rows)
freshports.org=#
The analysis for this query is available at http://explain.depesz.com/s/uCT.
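Stripped to its shape, the FROM clause above mixes an implicit comma join with explicit JOIN syntax. A minimal sketch of the two styles, using hypothetical tables a, b, and c:

```sql
-- Mixed style: the comma-listed table (a) is only connected to the
-- explicit JOIN chain (b JOIN c) through the WHERE clause.
SELECT count(*)
FROM a, b JOIN c ON c.b_id = b.id
WHERE a.b_id = b.id;

-- The same query with explicit JOINs throughout:
SELECT count(*)
FROM b
JOIN a ON a.b_id = b.id
JOIN c ON c.b_id = b.id;
```

For inner joins the two spellings are semantically equivalent, and the planner is in principle free to reorder either, so whether the syntax alone explains the plan change is part of the puzzle.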
I redesigned the query to make use of explicit JOINs:
freshports.org=# explain
freshports.org-# SELECT count(DISTINCT CL.id) AS count
freshports.org-# FROM commit_log CL JOIN commit_log_ports CLP ON CLP.commit_log_id = CL.id
freshports.org-# AND CL.commit_date BETWEEN '2014/11/07'::timestamptz + SystemTimeAdjust()
freshports.org-# AND '2014/11/07'::timestamptz + SystemTimeAdjust() + '1 Day'
freshports.org-# JOIN commit_log_branches CLB ON CL.id = CLB.commit_log_id
freshports.org-# JOIN system_branch SB ON SB.branch_name = 'head'
freshports.org-# AND SB.id = CLB.branch_id;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=692.70..692.71 rows=1 width=4)
-> Nested Loop (cost=40.58..692.70 rows=1 width=4)
-> Nested Loop (cost=40.58..652.23 rows=1 width=8)
-> Hash Join (cost=40.58..651.95 rows=1 width=12)
Hash Cond: (clb.commit_log_id = cl.id)
-> Seq Scan on commit_log_branches clb (cost=0.00..485.26 rows=33626 width=8)
-> Hash (cost=40.47..40.47 rows=9 width=4)
-> Index Scan using commit_log_commit_date on commit_log cl (cost=0.01..40.47 rows=9 width=4)
Index Cond: ((commit_date >= ('2014-11-07 00:00:00+00'::timestamp with time zone + '00:00:00'::interval)) AND (commit_date <= (('2014-11-07 00:00:00+00'::timestamp with time zone + '00:00:00'::interval) + '1 day'::interval)))
-> Index Scan using system_branch_pkey on system_branch sb (cost=0.00..0.27 rows=1 width=4)
Index Cond: (id = clb.branch_id)
Filter: (branch_name = 'head'::text)
-> Index Scan using commit_log_ports_pkey on commit_log_ports clp (cost=0.00..40.25 rows=18 width=4)
Index Cond: (commit_log_id = cl.id)
(14 rows)
freshports.org=#
The analysis for that query is at http://explain.depesz.com/s/p3wn.
I’ll let readers describe what happened here…
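For readers who want to check their reading against real numbers rather than planner estimates, prefixing either query with EXPLAIN ANALYZE executes it and reports actual row counts and timings alongside the estimated costs:

```sql
-- EXPLAIN ANALYZE runs the query for real, so actual times and
-- row counts appear next to the planner's estimates.
EXPLAIN ANALYZE
SELECT count(DISTINCT CL.id) AS count
FROM commit_log CL
JOIN commit_log_ports CLP ON CLP.commit_log_id = CL.id
 AND CL.commit_date BETWEEN '2014/11/07'::timestamptz + SystemTimeAdjust()
                        AND '2014/11/07'::timestamptz + SystemTimeAdjust() + '1 Day'
JOIN commit_log_branches CLB ON CL.id = CLB.commit_log_id
JOIN system_branch SB ON SB.branch_name = 'head' AND SB.id = CLB.branch_id;
```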