John Lumby
2013-02-18 16:45:01 UTC
Re: Why am I getting great/terrible estimates with these CTE queries?

On 2012-10-09 23:09:21:
> You're assuming the case where the estimate is better is better for a
> reason ... but it's only better as a result of blind dumb luck.  The
> outer-level query planner doesn't know anything about the CTE's output
> except the estimated number of rows --- in particular, it doesn't drill
> down to find any statistics about the join column.

I am also struggling with a problem involving CTEs, although in my case
it is caused by huge *under*-estimation of cardinality rather than *over*-estimation.
The statement is quite complex, and the problem arises because there is a chain of
RECURSIVE CTEs, each defined as a query involving an earlier CTE and more tables.
Eventually there is no hope of making a good cardinality estimate.
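
To give the shape of the problem, here is a made-up skeleton of such a chain (simplified so that only the first CTE is written as recursive; apart from the CTE names, which are taken from the plan fragment below, every table and column name is invented for illustration):

WITH RECURSIVE subnet_tree AS (
        SELECT s.subnet_id, s.parent_id, 0 AS net_rel_level
        FROM   subnets s
        WHERE  s.parent_id IS NULL
    UNION ALL
        SELECT c.subnet_id, c.parent_id, t.net_rel_level + 1
        FROM   subnets c
        JOIN   subnet_tree t ON c.parent_id = t.subnet_id
),
winning_option_nums AS (          -- consumes the recursive CTE plus another table
        SELECT t.subnet_id, o.option_code, o.option_discriminator, t.net_rel_level
        FROM   subnet_tree t
        JOIN   options o ON o.subnet_id = t.subnet_id
),
subnet_inhrt_options_asc AS (     -- consumes the previous CTE plus yet another table
        SELECT w.subnet_id, w.option_code, w.option_discriminator, w.net_rel_level, a.assoc_value
        FROM   winning_option_nums w
        JOIN   option_assocs a ON a.option_code = w.option_code
)
SELECT *
FROM   winning_option_nums       winnum
JOIN   subnet_inhrt_options_asc  binoptasc
       ON  winnum.subnet_id            = binoptasc.subnet_id
       AND winnum.option_code          = binoptasc.option_code
       AND winnum.option_discriminator = binoptasc.option_discriminator
       AND winnum.net_rel_level        = binoptasc.net_rel_level;

By the time the final join is planned, both of its inputs are derived relations for which the planner has no column statistics at all, only a row-count guess.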
One CTE in particular has a cardinality estimate of 1 (I guess the actual
estimate is nearer zero and rounded up), but the actual count is over 100000.
The planner puts this CTE on the inner side of a nested loop, accessed by a simple linear CTE scan,
and the full query then takes over 20 minutes. The relevant fragment of the EXPLAIN (ANALYZE, BUFFERS) output is:
  ->  Nested Loop  (cost=0.00..0.06 rows=1 width=588) (actual time=2340.421..1201593.856 rows=105984 loops=1)
        Join Filter: ((winnum.subnet_id = binoptasc.subnet_id) AND (winnum.option_code = binoptasc.option_code) AND ((winnum.option_discriminator)::text = (binoptasc.option_discriminator)::text) AND (winnum.net_rel_level = binoptasc.net_rel_level))
        Rows Removed by Join Filter: 7001612448
        Buffers: shared hit=2290941
        ->  CTE Scan on winning_option_nums winnum  (cost=0.00..0.02 rows=1 width=536) (actual time=2338.422..2543.684 rows=62904 loops=1)
              Buffers: shared hit=2290941
        ->  CTE Scan on subnet_inhrt_options_asc binoptasc  (cost=0.00..0.02 rows=1 width=584) (actual time=0.000..9.728 rows=111308 loops=62904)
Whereas, by altering various statistics to be very wrong, the entire query runs in 21 seconds.
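
(As one concrete illustration of what "altering statistics" can mean, though not necessarily the alteration used for the 21-second run, a per-column n_distinct override deliberately set to a misleading value will distort the planner's estimates after the next ANALYZE; the names here are the invented ones from the skeleton above:)

ALTER TABLE options ALTER COLUMN subnet_id SET (n_distinct = 100000);  -- deliberately misleading override
ANALYZE options;   -- the override is applied when statistics are next gathered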
There have been several debates about how to address situations like this, where
no practical non-query-specific statistics-gathering scheme can ever hope to
gather enough statistics to model the later derived tables, e.g. the frowned-on
SELECTIVITY clause and ideas for query-specific statistics.
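
(As an aside, the usual interim workaround, sketched below with the CTE's defining query elided, is to split the statement, materialize the troublesome CTE into a temporary table, and ANALYZE it so that the rest of the query is planned against real statistics, but that forfeits the single-statement form:)

-- Hedged sketch of the materialize-and-ANALYZE workaround;
-- "<defining query>" stands for the SELECT that previously defined the CTE.
CREATE TEMPORARY TABLE winning_option_nums AS
    <defining query>;
ANALYZE winning_option_nums;     -- gives the planner a real row count and per-column stats
-- the remainder of the statement then joins against the temp table instead of the CTE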
Meanwhile, I have one other suggestion aimed specifically at problematic CTEs:
would it be reasonable to provide a new planner configuration option:

enable_nestloop_cte_inner (boolean)
Enables or disables the query planner's use of nested-loop join plans in which a CTE scan is the inner relation.
It is impossible to suppress such nested-loop joins entirely,
but turning this variable off discourages the planner from using one
if there are other methods available, such as sorting the CTE output for a merge join
or hashing it for a hash join.
The default is on.
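
(For clarity, the intended usage would mirror the existing enable_* settings. The first statement below is hypothetical, since enable_nestloop_cte_inner is only proposed here and exists in no PostgreSQL release; the closest existing knob is the much blunter enable_nestloop, which disables nested loops for every join in the statement:)

SET enable_nestloop_cte_inner = off;   -- hypothetical: the option proposed above
-- ... run the statement containing the CTE chain ...
RESET enable_nestloop_cte_inner;       -- hypothetical

SET enable_nestloop = off;             -- existing today, but affects all joins, not just CTE inners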
John
--
Sent via pgsql-performance mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance