Discussion:
Optimize SELECT * from table WHERE foreign_key_id IN (key1,key2,key3,key4...)
(too old to reply)
Niels Kristian Schjødt
2013-03-05 14:00:48 UTC
Permalink
Hi,

I'm running a rails app, where I have a model called Car that has_many Images. Now when I tell rails to include those images, when querying say 50 cars, then it often decides to use a SELECT * from images WHERE car_id IN (id1,id2,id3,id4…) instead of doing a join.

Now either way it uses the index I have on car_id:

Index Scan using car_id_ix on adverts (cost=0.47..5665.34 rows=1224 width=234)
Index Cond: (car_id = ANY ('{7097561,7253541,5159633,6674471,...}'::integer[]))

But it's slow, it's very slow. In this case it took 3,323ms

Can I do anything to optimize that query or maybe the index or something?

The table has 16.000.000 rows
--
Sent via pgsql-performance mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Julien Cigar
2013-03-05 14:26:44 UTC
Permalink
Post by Niels Kristian Schjødt
Hi,
I'm running a rails app, where I have a model called Car that has_many Images. Now when I tell rails to include those images, when querying say 50 cars, then it often decides to use a SELECT * from images WHERE car_id IN (id1,id2,id3,id4…) instead of doing a join.
why do you want a join here ? if you don't need any "cars" data there is
no need to JOIN that table.
Now a select ... from ... where id in (id1, id2, ..., idn) isn't very
scalable.

Instead of passing id1, id2, ..., idn you'be better pass the condition
and do a where id in (select ... ), or where exists (select 1 ... where
...), or a join, or ...
Post by Niels Kristian Schjødt
Index Scan using car_id_ix on adverts (cost=0.47..5665.34 rows=1224 width=234)
Index Cond: (car_id = ANY ('{7097561,7253541,5159633,6674471,...}'::integer[]))
But it's slow, it's very slow. In this case it took 3,323ms
3ms isn't slow
Post by Niels Kristian Schjødt
Can I do anything to optimize that query or maybe the index or something?
your index is already used
Post by Niels Kristian Schjødt
The table has 16.000.000 rows
--
No trees were killed in the creation of this message.
However, many electrons were terribly inconvenienced.
--
Sent via pgsql-performance mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Niels Kristian Schjødt
2013-03-05 23:51:42 UTC
Permalink
Hi, thanks for answering. See comments inline.
Post by Niels Kristian Schjødt
Hi,
I'm running a rails app, where I have a model called Car that has_many Images. Now when I tell rails to include those images, when querying say 50 cars, then it often decides to use a SELECT * from images WHERE car_id IN (id1,id2,id3,id4…) instead of doing a join.
why do you want a join here ? if you don't need any "cars" data there is no need to JOIN that table.
I need both
Now a select ... from ... where id in (id1, id2, ..., idn) isn't very scalable.
Instead of passing id1, id2, ..., idn you'be better pass the condition and do a where id in (select ... ), or where exists (select 1 ... where ...), or a join, or …
I tried this now, and it doesn't seem to do a very big difference unfortunately…
Post by Niels Kristian Schjødt
Now either way it uses the index I
Index Scan using car_id_ix on adverts (cost=0.47..5665.34 rows=1224 width=234)
Index Cond: (car_id = ANY ('{7097561,7253541,5159633,6674471,...}'::integer[]))
But it's slow, it's very slow. In this case it took 3,323ms
3ms isn't slow
Sorry, it's 3323ms!
Post by Niels Kristian Schjødt
Can I do anything to optimize that query or maybe the index or something?
your index is already used
Okay this leaves me with - "get better hardware" or?
Post by Niels Kristian Schjødt
The table has 16.000.000 rows
--
No trees were killed in the creation of this message.
However, many electrons were terribly inconvenienced.
--
http://www.postgresql.org/mailpref/pgsql-performance
--
Sent via pgsql-performance mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Joshua D. Drake
2013-03-06 00:07:21 UTC
Permalink
Post by Niels Kristian Schjødt
Post by Julien Cigar
3ms isn't slow
Sorry, it's 3323ms!
Post by Julien Cigar
Post by Niels Kristian Schjødt
Can I do anything to optimize that query or maybe the index or something?
your index is already used
Okay this leaves me with - "get better hardware" or?
What does explain analyze say versus just explain.

JD
--
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC
@cmdpromptinc - 509-416-6579
--
Sent via pgsql-performance mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Jeff Janes
2013-03-06 16:44:00 UTC
Permalink
Post by Julien Cigar
3ms isn't slow
Post by Niels Kristian Schjødt
Post by Niels Kristian Schjødt
Sorry, it's 3323ms!
Can I do anything to optimize that query or maybe the index or something?
Post by Niels Kristian Schjødt
your index is already used
Okay this leaves me with - "get better hardware" or?
What does explain analyze say versus just explain.
Better yet, "explain (analyze, buffers)" with track_io_timing turned on.

Cheers,

Jeff

Julien Cigar
2013-03-06 12:33:05 UTC
Permalink
Post by Niels Kristian Schjødt
Hi, thanks for answering. See comments inline.
Post by Niels Kristian Schjødt
Hi,
I'm running a rails app, where I have a model called Car that has_many Images. Now when I tell rails to include those images, when querying say 50 cars, then it often decides to use a SELECT * from images WHERE car_id IN (id1,id2,id3,id4…) instead of doing a join.
why do you want a join here ? if you don't need any "cars" data there is no need to JOIN that table.
I need both
Now a select ... from ... where id in (id1, id2, ..., idn) isn't very scalable.
Instead of passing id1, id2, ..., idn you'be better pass the condition and do a where id in (select ... ), or where exists (select 1 ... where ...), or a join, or …
I tried this now, and it doesn't seem to do a very big difference unfortunately…
could you paste the full query, an explain analyze of it, and some
details about your config (how much ram ? what's your: shared_buffers,
effective_cache_size, cpu_tuple_cost, work_mem, ...) ?
Post by Niels Kristian Schjødt
Post by Niels Kristian Schjødt
Now either way it uses the index I
Index Scan using car_id_ix on adverts (cost=0.47..5665.34 rows=1224 width=234)
Index Cond: (car_id = ANY ('{7097561,7253541,5159633,6674471,...}'::integer[]))
But it's slow, it's very slow. In this case it took 3,323ms
3ms isn't slow
Sorry, it's 3323ms!
Post by Niels Kristian Schjødt
Can I do anything to optimize that query or maybe the index or something?
your index is already used
Okay this leaves me with - "get better hardware" or?
Post by Niels Kristian Schjødt
The table has 16.000.000 rows
--
No trees were killed in the creation of this message.
However, many electrons were terribly inconvenienced.
--
http://www.postgresql.org/mailpref/pgsql-performance
--
No trees were killed in the creation of this message.
However, many electrons were terribly inconvenienced.
--
Sent via pgsql-performance mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Loading...