In recent posts, I’ve spoken about a particular task taking about an hour to process. I think I have found one small improvement. A query has gone from 50ms to about 5ms with the addition of an index.
The query is:
SELECT count(vuxml_id) FROM commit_log_ports_vuxml CLPV, commit_log_ports CLP, ports P WHERE CLP.commit_log_id = CLPV.commit_log_id AND CLPV.port_id = CLP.port_id AND P.id = CLP.port_id AND P.version = CLP.port_version AND P.revision = CLP.port_revision AND CLP.port_epoch = P.portepoch AND P.id = 16419 GROUP BY P.id;
The output from explain analyze is:
GroupAggregate (cost=0.00..3171.43 rows=1 width=8) (actual time=59.342..59.343 rows=1 loops=1) -> Nested Loop (cost=0.00..3171.41 rows=1 width=8) (actual time=52.756..59.322 rows=1 loops=1) Join Filter: (clp.commit_log_id = clpv.commit_log_id) -> Nested Loop (cost=0.00..76.31 rows=1 width=12) (actual time=0.048..0.425 rows=1 loops=1) Join Filter: ((clp.port_version = p.version) AND (clp.port_revision = p.revision) AND (clp.port_epoch = p.portepoch)) -> Index Scan using ports_pkey on ports p (cost=0.00..8.27 rows=1 width=13) (actual time=0.020..0.022 rows=1 loops=1) Index Cond: (id = 16419) -> Index Scan using commit_log_ports_port_id on commit_log_ports clp (cost=0.00..67.65 rows=22 width=18) (actual time=0.016..0.255 rows=102 loops=1) Index Cond: (clp.port_id = 16419) -> Seq Scan on commit_log_ports_vuxml clpv (cost=0.00..3074.39 rows=1657 width=12) (actual time=49.140..56.781 rows=1657 loops=1) Filter: (clpv.port_id = 16419) Total runtime: 59.458 ms
What stands out? That sequential scan.
What does the table look like?
Table "public.commit_log_ports_vuxml" Column | Type | Modifiers ---------------+---------+--------------------------------------------------------------------- id | integer | not null default nextval('commit_log_ports_vuxml_id_seq'::regclass) commit_log_id | integer | not null port_id | integer | not null vuxml_id | integer | not null Indexes: "commit_log_ports_vuxml_pkey" PRIMARY KEY, btree (id) Foreign-key constraints: "$1" FOREIGN KEY (vuxml_id) REFERENCES vuxml(id) ON UPDATE CASCADE ON DELETE CASCADE "$2" FOREIGN KEY (port_id) REFERENCES ports(id) ON UPDATE CASCADE ON DELETE CASCADE "$3" FOREIGN KEY (commit_log_id) REFERENCES commit_log(id) ON UPDATE CASCADE ON DELETE CASCADE
Ahh, no index on port_id. Let’s fix that.
freshports.org=# create index commit_log_ports_vuxml_port_id on commit_log_ports_vuxml(port_id); CREATE INDEX
Now what’s the query plan?
freshports.org=# explain analyse freshports.org-# SELECT count(vuxml_id) freshports.org-# FROM commit_log_ports_vuxml CLPV, commit_log_ports CLP, ports P freshports.org-# WHERE CLP.commit_log_id = CLPV.commit_log_id freshports.org-# AND CLPV.port_id = CLP.port_id freshports.org-# AND P.id = CLP.port_id freshports.org-# AND P.version = CLP.port_version freshports.org-# AND P.revision = CLP.port_revision freshports.org-# AND CLP.port_epoch = P.portepoch freshports.org-# AND P.id = 16419 freshports.org-# GROUP BY P.id; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------ GroupAggregate (cost=29.10..1643.72 rows=1 width=8) (actual time=4.788..4.788 rows=1 loops=1) -> Nested Loop (cost=29.10..1643.70 rows=1 width=8) (actual time=3.237..4.778 rows=1 loops=1) Join Filter: (clp.commit_log_id = clpv.commit_log_id) -> Nested Loop (cost=0.00..76.31 rows=1 width=12) (actual time=0.049..0.428 rows=1 loops=1) Join Filter: ((clp.port_version = p.version) AND (clp.port_revision = p.revision) AND (clp.port_epoch = p.portepoch)) -> Index Scan using ports_pkey on ports p (cost=0.00..8.27 rows=1 width=13) (actual time=0.019..0.021 rows=1 loops=1) Index Cond: (id = 16419) -> Index Scan using commit_log_ports_port_id on commit_log_ports clp (cost=0.00..67.65 rows=22 width=18) (actual time=0.017..0.241 rows=102 loops Index Cond: (clp.port_id = 16419) -> Bitmap Heap Scan on commit_log_ports_vuxml clpv (cost=29.10..1546.68 rows=1657 width=12) (actual time=0.313..2.269 rows=1657 loops=1) Recheck Cond: (clpv.port_id = 16419) -> Bitmap Index Scan on commit_log_ports_vuxml_port_id (cost=0.00..28.69 rows=1657 width=0) (actual time=0.298..0.298 rows=1657 loops=1) Index Cond: (clpv.port_id = 16419) Total runtime: 4.899 ms
OK. Sure, a savings of 40ms isn’t going to help me much. But I estimate this query is run about 1100 times. That saves me 44 second.
Let’s see if there are some other cost savings around…