Casey Allen Shobe
2013-02-01 17:11:53 UTC
So where I'm working, a performance issue was identified that affected many
functions, because the (SQL language) functions took an int argument used
it in a where clause against a column (config_id) that was stored in
varchar format, leading to an inefficient casting when the query was
parameterized. We could work around that with (select $3::text) instead of
just $3, but since the data is actually all numbers under 65k, we altered
the data type of the column to smallint, rather than editing a boatload of
functions with a hacky workaround.
For most functions, this fixed the problem.
However, it had a drastically-negative impact on the query in question,
which was originally taking 2 minutes, 45 seconds. After adding a couple
indexes with the config_id still as a varchar, that time is reduced down to
42 seconds. However when the data type is smallint, the query runs for
many hours - I let it run for 4.5 hours yesterday before cancelling it.
It's pretty clear that the planner is making horrid misestimates and
picking a terrible plan. I would appreciate any advice for getting this
into a better state.
Here are the explain plans:
When config_id is a varchar, it executes in 42 seconds:
http://explain.depesz.com/s/wuf
When config_id is a smallint, it runs too long to allow to complete, but
clearly the plan is bad:
http://explain.depesz.com/s/u5P
Here is the query, along with rowcounts and schema of every table involved
in the query:
http://pgsql.privatepaste.com/c66fd497c9
PostgreSQL version is 8.4, and most of our GUC's are default.
Thanks in advance for any suggestions.
functions, because the (SQL language) functions took an int argument used
it in a where clause against a column (config_id) that was stored in
varchar format, leading to an inefficient casting when the query was
parameterized. We could work around that with (select $3::text) instead of
just $3, but since the data is actually all numbers under 65k, we altered
the data type of the column to smallint, rather than editing a boatload of
functions with a hacky workaround.
For most functions, this fixed the problem.
However, it had a drastically-negative impact on the query in question,
which was originally taking 2 minutes, 45 seconds. After adding a couple
indexes with the config_id still as a varchar, that time is reduced down to
42 seconds. However when the data type is smallint, the query runs for
many hours - I let it run for 4.5 hours yesterday before cancelling it.
It's pretty clear that the planner is making horrid misestimates and
picking a terrible plan. I would appreciate any advice for getting this
into a better state.
Here are the explain plans:
When config_id is a varchar, it executes in 42 seconds:
http://explain.depesz.com/s/wuf
When config_id is a smallint, it runs too long to allow to complete, but
clearly the plan is bad:
http://explain.depesz.com/s/u5P
Here is the query, along with rowcounts and schema of every table involved
in the query:
http://pgsql.privatepaste.com/c66fd497c9
PostgreSQL version is 8.4, and most of our GUC's are default.
Thanks in advance for any suggestions.
--
Casey Allen Shobe
***@shobe.info
Casey Allen Shobe
***@shobe.info