In recent posts, I’ve spoken about a particular task taking about an hour to process. I think I have found one small improvement. A query has gone from 50ms to about 5ms with the addition of an index.
The query is:
SELECT count(vuxml_id)
FROM commit_log_ports_vuxml CLPV, commit_log_ports CLP, ports P
WHERE CLP.commit_log_id = CLPV.commit_log_id
AND CLPV.port_id = CLP.port_id
AND P.id = CLP.port_id
AND P.version = CLP.port_version
AND P.revision = CLP.port_revision
AND CLP.port_epoch = P.portepoch
AND P.id = 16419
GROUP BY P.id;
The output from explain analyze is:
GroupAggregate (cost=0.00..3171.43 rows=1 width=8) (actual time=59.342..59.343 rows=1 loops=1)
-> Nested Loop (cost=0.00..3171.41 rows=1 width=8) (actual time=52.756..59.322 rows=1 loops=1)
Join Filter: (clp.commit_log_id = clpv.commit_log_id)
-> Nested Loop (cost=0.00..76.31 rows=1 width=12) (actual time=0.048..0.425 rows=1 loops=1)
Join Filter: ((clp.port_version = p.version) AND (clp.port_revision = p.revision) AND (clp.port_epoch = p.portepoch))
-> Index Scan using ports_pkey on ports p (cost=0.00..8.27 rows=1 width=13) (actual time=0.020..0.022 rows=1 loops=1)
Index Cond: (id = 16419)
-> Index Scan using commit_log_ports_port_id on commit_log_ports clp (cost=0.00..67.65 rows=22 width=18) (actual time=0.016..0.255 rows=102 loops=1)
Index Cond: (clp.port_id = 16419)
-> Seq Scan on commit_log_ports_vuxml clpv (cost=0.00..3074.39 rows=1657 width=12) (actual time=49.140..56.781 rows=1657 loops=1)
Filter: (clpv.port_id = 16419)
Total runtime: 59.458 ms
What stands out? That sequential scan.
What does the table look like?
Table "public.commit_log_ports_vuxml"
Column | Type | Modifiers
---------------+---------+---------------------------------------------------------------------
id | integer | not null default nextval('commit_log_ports_vuxml_id_seq'::regclass)
commit_log_id | integer | not null
port_id | integer | not null
vuxml_id | integer | not null
Indexes:
"commit_log_ports_vuxml_pkey" PRIMARY KEY, btree (id)
Foreign-key constraints:
"$1" FOREIGN KEY (vuxml_id) REFERENCES vuxml(id) ON UPDATE CASCADE ON DELETE CASCADE
"$2" FOREIGN KEY (port_id) REFERENCES ports(id) ON UPDATE CASCADE ON DELETE CASCADE
"$3" FOREIGN KEY (commit_log_id) REFERENCES commit_log(id) ON UPDATE CASCADE ON DELETE CASCADE
Ahh, no index on port_id. Let’s fix that.
freshports.org=# create index commit_log_ports_vuxml_port_id on commit_log_ports_vuxml(port_id); CREATE INDEX
Now what’s the query plan?
freshports.org=# explain analyse
freshports.org-# SELECT count(vuxml_id)
freshports.org-# FROM commit_log_ports_vuxml CLPV, commit_log_ports CLP, ports P
freshports.org-# WHERE CLP.commit_log_id = CLPV.commit_log_id
freshports.org-# AND CLPV.port_id = CLP.port_id
freshports.org-# AND P.id = CLP.port_id
freshports.org-# AND P.version = CLP.port_version
freshports.org-# AND P.revision = CLP.port_revision
freshports.org-# AND CLP.port_epoch = P.portepoch
freshports.org-# AND P.id = 16419
freshports.org-# GROUP BY P.id;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=29.10..1643.72 rows=1 width=8) (actual time=4.788..4.788 rows=1 loops=1)
-> Nested Loop (cost=29.10..1643.70 rows=1 width=8) (actual time=3.237..4.778 rows=1 loops=1)
Join Filter: (clp.commit_log_id = clpv.commit_log_id)
-> Nested Loop (cost=0.00..76.31 rows=1 width=12) (actual time=0.049..0.428 rows=1 loops=1)
Join Filter: ((clp.port_version = p.version) AND (clp.port_revision = p.revision) AND (clp.port_epoch = p.portepoch))
-> Index Scan using ports_pkey on ports p (cost=0.00..8.27 rows=1 width=13) (actual time=0.019..0.021 rows=1 loops=1)
Index Cond: (id = 16419)
-> Index Scan using commit_log_ports_port_id on commit_log_ports clp (cost=0.00..67.65 rows=22 width=18) (actual time=0.017..0.241 rows=102 loops
Index Cond: (clp.port_id = 16419)
-> Bitmap Heap Scan on commit_log_ports_vuxml clpv (cost=29.10..1546.68 rows=1657 width=12) (actual time=0.313..2.269 rows=1657 loops=1)
Recheck Cond: (clpv.port_id = 16419)
-> Bitmap Index Scan on commit_log_ports_vuxml_port_id (cost=0.00..28.69 rows=1657 width=0) (actual time=0.298..0.298 rows=1657 loops=1)
Index Cond: (clpv.port_id = 16419)
Total runtime: 4.899 ms
OK. Sure, a savings of 40ms isn’t going to help me much. But I estimate this query is run about 1100 times. That saves me 44 second.
Let’s see if there are some other cost savings around…











