Speed of exist


Bastiaan Olij

2013-02-19 06:34:56 UTC

Hi All,

Hope someone can help me a little bit here:

I've got a query like the following:
--
select Column1, Column2, Column3
from Table1
where exists (select 1 from Table2 where Table2.ForeignKey = Table1.PrimaryKey)
   or exists (select 1 from Table3 where Table3.ForeignKey = Table1.PrimaryKey)
--

Looking at the query plan it is doing a sequential scan on both Table2
and Table3.

If I remove one of the subqueries and turn the query into:
--
select Column1, Column2, Column3
from Table1
where exists (select 1 from Table2 where Table2.ForeignKey = Table1.PrimaryKey)
--

It is nicely doing an index scan on the index that is on Table2.ForeignKey.

As Table2 and Table3 are rather large the first query takes minutes
while the second query takes 18ms.
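
One way to compare the two plans described above is to run each query under EXPLAIN. This is only a sketch using the table and column names from the queries above; the BUFFERS option assumes PostgreSQL 9.0 or later. Keep in mind that EXPLAIN ANALYZE actually executes the statement, so the slow query will take its full running time.

```sql
-- Plan only (the query is not executed): check whether the subplans
-- show a Seq Scan or an Index Scan on Table2 and Table3.
explain
select Column1, Column2, Column3
from Table1
where exists (select 1 from Table2 where Table2.ForeignKey = Table1.PrimaryKey)
   or exists (select 1 from Table3 where Table3.ForeignKey = Table1.PrimaryKey);

-- Plan plus actual timings and buffer usage (the query is executed).
explain (analyze, buffers)
select Column1, Column2, Column3
from Table1
where exists (select 1 from Table2 where Table2.ForeignKey = Table1.PrimaryKey);
```

Posting both plans alongside the PostgreSQL version usually makes it easier for others on the list to see why the planner chose the sequential scans.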

Is there a way to speed this up, or a faster alternative for selecting
records from Table1 that have related records in Table2 or Table3?

Kindest Regards,

Bastiaan Olij

--
Sent via pgsql-performance mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Andy

2013-02-19 07:31:02 UTC

Limit each subquery to one row, i.e.:

select 1 from Table2 where Table2.ForeignKey = Table1.PrimaryKey fetch first 1 rows only
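
Spelled out in the full query from the original post, this suggestion would presumably look like the sketch below. In PostgreSQL, "fetch first 1 rows only" is the SQL-standard spelling of "limit 1". Note that an EXISTS subquery is generally only executed far enough to find its first row anyway, so the extra limit should not change the result; at most it might change the plan.

```sql
select Column1, Column2, Column3
from Table1
where exists (select 1 from Table2
              where Table2.ForeignKey = Table1.PrimaryKey
              fetch first 1 rows only)
   or exists (select 1 from Table3
              where Table3.ForeignKey = Table1.PrimaryKey
              fetch first 1 rows only);
```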

Andy.

*Andy Gumbrecht*
Research & Development
Orpro Vision GmbH

Bastiaan Olij

2013-02-19 07:36:48 UTC

Hi Andy,

I've tried that with the same result. One subquery works beautifully,
two subqueries with an OR and it starts to do a sequential scan...

Thanks,

Bastiaan Olij


Pavel Stehule

2013-02-19 07:39:31 UTC

Post by Bastiaan Olij
Hi Andy,
I've tried that with the same result. One subquery works beautifully,
two subqueries with an OR and it starts to do a sequential scan...

Try rewriting the OR as two SELECTs combined with UNION ALL.
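
A sketch of this rewrite, using the names from the original query, could look as follows. One caveat that is not raised in the thread: with a plain UNION ALL, a Table1 row that has matches in both Table2 and Table3 would be returned twice. The NOT EXISTS condition in the second branch is an addition made here to keep the result the same as the OR version.

```sql
-- Branch 1: rows of Table1 that have a match in Table2.
select Column1, Column2, Column3
from Table1
where exists (select 1 from Table2 where Table2.ForeignKey = Table1.PrimaryKey)

union all

-- Branch 2: rows that have a match in Table3 but were not already
-- returned by branch 1 (avoids duplicates without a sort or hash step).
select Column1, Column2, Column3
from Table1
where exists (select 1 from Table3 where Table3.ForeignKey = Table1.PrimaryKey)
  and not exists (select 1 from Table2 where Table2.ForeignKey = Table1.PrimaryKey);
```

Using UNION instead of UNION ALL would also remove the duplicates, but it would additionally collapse distinct Table1 rows that happen to have identical Column1, Column2 and Column3 values unless the primary key is part of the select list.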

Pavel


Bastiaan Olij

2013-02-19 10:18:55 UTC

Hi Pavel,

That is what I've done in this particular case, but there are places where
I use EXISTS checks in this way that are very cumbersome to write out
like that, so I'm hoping there is a way to make the optimizer handle
this kind of existence check.
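
For cases where writing out two full SELECTs is too cumbersome, one more compact form that may be worth testing is a single EXISTS whose subquery combines both lookups with UNION ALL. This variant is not suggested anywhere in the thread; it is an assumption offered here as a sketch, and whether the planner uses the indexes for it depends on the PostgreSQL version and table statistics, so it should be checked with EXPLAIN.

```sql
select Column1, Column2, Column3
from Table1
where exists (
    -- Both arms are correlated on Table1.PrimaryKey; the EXISTS is
    -- true as soon as either arm produces a row.
    select 1 from Table2 where Table2.ForeignKey = Table1.PrimaryKey
    union all
    select 1 from Table3 where Table3.ForeignKey = Table1.PrimaryKey
);
```

This keeps the outer query a single SELECT with a single predicate, so the rest of a larger statement does not have to be duplicated per branch.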

Cheers,

Bastiaan Olij
