p***@foo.me.uk
2012-12-04 15:06:48 UTC
Hi guys (and girls)
I've been banging my head over this for a few days now so if any of you kind
souls could take a minute to take a look at this I would be eternally
grateful.
I have a pretty straightforward query that is very slow by default, and
about 70 times faster when I set enable_bitmapscan=off. I would like to
convince the planner to use my lovely indexes.
The scenario is this; I have two tables, trade and position_effect. A trade
is a deal we do with somebody to exchange something for something else. It
has a time it was done, and is associated with a particular book for
accounting purposes. A position effect records changes to our position (e.g.
how much we have) of an particular asset. One trade can many position
effects (usually only 1,2 or 3)
For example, I do a trade of USD/GBP and I get two position effects, +1000
GBP and -1200USD
SCHEMA:
-------
The actual schema is a bit more complicated but I will put the important
parts here (if you think it important, the full schema for the two tables is
here: http://pastebin.com/6Y52aDFL):
CREATE TABLE trade
(
id bigserial NOT NULL,
time_executed timestamp with time zone NOT NULL,
id_book integer NOT NULL,
CONSTRAINT cons_trade_primary_key PRIMARY KEY (id),
)
CREATE INDEX idx_trade_id_book
ON trade
USING btree
(id_book, time_executed, id);
CREATE TABLE position_effect
(
id bigserial NOT NULL,
id_trade bigint NOT NULL,
id_asset integer NOT NULL,
quantity double precision NOT NULL,
CONSTRAINT cons_pe_primary_key PRIMARY KEY (id_trade, id_asset),
)
SETUP:
------
These tables are relatively large (~100 million rows in position effect).
The box is a pretty beastly affair with 512Mb of ram and 4x10 2Ghz cores.
The postgres configuration is here:
http://pastebin.com/48uyiak7
I am using a 64bit postgresql 9.2.1, hand compiled on a RedHat 6.2 box.
QUERY:
------
What I want to do is sum all of the position effects, for a particular asset
while joined to the trade table to filter for the time it was executed and
the book it was traded into:
SELECT sum(position_effect.quantity)
FROM trade, position_effect
WHERE trade.id = position_effect.id_trade
AND position_effect.id_asset = 1837
AND trade.time_executed >= '2012-10-28 00:00:00'
AND trade.id_book = 41
In this case there are only 11 rows that need to be summed. If I just let
postgres do its thing, that query takes 5000ms (Which when multiplied over
many books and assets gets very slow). I think this is because it is
bitmapping the whole position_effect table which is very large. If I disable
bitmap scans:
set enable_bitmapscan = off;
The query takes 43ms, and properly uses the indexes I have set up.
Slow version with bitmapscan enabled: http://explain.depesz.com/s/6I7
Fast version with bitmapscan disabled: http://explain.depesz.com/s/4MWG
I've been banging my head over this for a few days now so if any of you kind
souls could take a minute to take a look at this I would be eternally
grateful.
I have a pretty straightforward query that is very slow by default, and
about 70 times faster when I set enable_bitmapscan=off. I would like to
convince the planner to use my lovely indexes.
The scenario is this; I have two tables, trade and position_effect. A trade
is a deal we do with somebody to exchange something for something else. It
has a time it was done, and is associated with a particular book for
accounting purposes. A position effect records changes to our position (e.g.
how much we have) of an particular asset. One trade can many position
effects (usually only 1,2 or 3)
For example, I do a trade of USD/GBP and I get two position effects, +1000
GBP and -1200USD
SCHEMA:
-------
The actual schema is a bit more complicated but I will put the important
parts here (if you think it important, the full schema for the two tables is
here: http://pastebin.com/6Y52aDFL):
CREATE TABLE trade
(
id bigserial NOT NULL,
time_executed timestamp with time zone NOT NULL,
id_book integer NOT NULL,
CONSTRAINT cons_trade_primary_key PRIMARY KEY (id),
)
CREATE INDEX idx_trade_id_book
ON trade
USING btree
(id_book, time_executed, id);
CREATE TABLE position_effect
(
id bigserial NOT NULL,
id_trade bigint NOT NULL,
id_asset integer NOT NULL,
quantity double precision NOT NULL,
CONSTRAINT cons_pe_primary_key PRIMARY KEY (id_trade, id_asset),
)
SETUP:
------
These tables are relatively large (~100 million rows in position effect).
The box is a pretty beastly affair with 512Mb of ram and 4x10 2Ghz cores.
The postgres configuration is here:
http://pastebin.com/48uyiak7
I am using a 64bit postgresql 9.2.1, hand compiled on a RedHat 6.2 box.
QUERY:
------
What I want to do is sum all of the position effects, for a particular asset
while joined to the trade table to filter for the time it was executed and
the book it was traded into:
SELECT sum(position_effect.quantity)
FROM trade, position_effect
WHERE trade.id = position_effect.id_trade
AND position_effect.id_asset = 1837
AND trade.time_executed >= '2012-10-28 00:00:00'
AND trade.id_book = 41
In this case there are only 11 rows that need to be summed. If I just let
postgres do its thing, that query takes 5000ms (Which when multiplied over
many books and assets gets very slow). I think this is because it is
bitmapping the whole position_effect table which is very large. If I disable
bitmap scans:
set enable_bitmapscan = off;
The query takes 43ms, and properly uses the indexes I have set up.
Slow version with bitmapscan enabled: http://explain.depesz.com/s/6I7
Fast version with bitmapscan disabled: http://explain.depesz.com/s/4MWG
--
Sent via pgsql-performance mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Sent via pgsql-performance mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance