Discussion:
xmlconcat performance
(too old to reply)
Merlin Moncure
2013-02-28 17:48:05 UTC
Permalink
I got a problem with the performance of a PL/PGsql stored procedure
outputting an xml.
Server version: PostgreSQL 8.3.6 on i686-pc-linux-gnu, compiled by GCC gcc
(GCC) 4.1.2 20080704 (Red Hat 4.1.2-46)
RAM installed: 4GB
Hard Disk: Seagate 500Gb SATA 2
This is a simplified content of the function showing the xmlconcat
behaviour.
CREATE OR REPLACE FUNCTION test_function (v_limit int)
RETURNS xml AS
$BODY$
DECLARE
v_xml xml;
BEGIN
FOR i IN 1..v_limit LOOP
v_xml := xmlconcat(v_xml, xmlelement(name content, 'aaaaaaa'));
END LOOP;
RETURN v_xml ;
END
$BODY$
LANGUAGE 'plpgsql' SECURITY DEFINER ;
As long as the v_limit parameter grows (and then the size of the output xml,
the time needed increase exponentially.
pang=# explain analyze select test_function(1000);
QUERY PLAN
--------------------------------------------------------------------------------------
Result (cost=0.00..0.26 rows=1 width=0) (actual time=65.430..65.431 rows=1
loops=1)
Total runtime: 65.457 ms
(2 rows)
pang=# explain analyze select test_function(5000);
QUERY PLAN
----------------------------------------------------------------------------------------
Result (cost=0.00..0.26 rows=1 width=0) (actual time=473.318..473.318
rows=1 loops=1)
Total runtime: 473.340 ms
(2 rows)
pang=# explain analyze select test_function(15000);
QUERY PLAN
------------------------------------------------------------------------------------------
Result (cost=0.00..0.26 rows=1 width=0) (actual time=4044.903..4044.904
rows=1 loops=1)
Total runtime: 4044.928 ms
(2 rows)
pang=# explain analyze select test_function(50000);
QUERY PLAN
--------------------------------------------------------------------------------------------
Result (cost=0.00..0.26 rows=1 width=0) (actual time=94994.337..94994.369
rows=1 loops=1)
Total runtime: 94994.396 ms
(2 rows)
I already tried to update to 8.3.23 service version but i didn't see any
improvement.
Do you have any suggestion about how to increase the performance of
xmlconcat?
My need is to use stored procedures that calls xmlconcat more than 50000
times, but it is unacceptable 94 seconds to complete the job.
Thanks in advance
typically for high performance string manipulation you have to do
things on more purely textual level and manipulate through arrays to
get really good performance. iterative string concatenation is
typically wrong approach -- you have to think in set terms.

also your database version is obsolete -- time to start thinking about upgrade.

merlin
--
Sent via pgsql-performance mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Merlin Moncure
2013-03-01 14:09:25 UTC
Permalink
Post by Merlin Moncure
I got a problem with the performance of a PL/PGsql stored procedure
outputting an xml.
Server version: PostgreSQL 8.3.6 on i686-pc-linux-gnu, compiled by GCC gcc
(GCC) 4.1.2 20080704 (Red Hat 4.1.2-46)
RAM installed: 4GB
Hard Disk: Seagate 500Gb SATA 2
This is a simplified content of the function showing the xmlconcat
behaviour.
CREATE OR REPLACE FUNCTION test_function (v_limit int)
RETURNS xml AS
$BODY$
DECLARE
v_xml xml;
BEGIN
FOR i IN 1..v_limit LOOP
v_xml := xmlconcat(v_xml, xmlelement(name content, 'aaaaaaa'));
END LOOP;
RETURN v_xml ;
END
$BODY$
LANGUAGE 'plpgsql' SECURITY DEFINER ;
As long as the v_limit parameter grows (and then the size of the output xml,
the time needed increase exponentially.
pang=# explain analyze select test_function(1000);
QUERY PLAN
--------------------------------------------------------------------------------------
Result (cost=0.00..0.26 rows=1 width=0) (actual time=65.430..65.431 rows=1
loops=1)
Total runtime: 65.457 ms
(2 rows)
pang=# explain analyze select test_function(5000);
QUERY PLAN
----------------------------------------------------------------------------------------
Result (cost=0.00..0.26 rows=1 width=0) (actual time=473.318..473.318
rows=1 loops=1)
Total runtime: 473.340 ms
(2 rows)
pang=# explain analyze select test_function(15000);
QUERY PLAN
------------------------------------------------------------------------------------------
Result (cost=0.00..0.26 rows=1 width=0) (actual
time=4044.903..4044.904
rows=1 loops=1)
Total runtime: 4044.928 ms
(2 rows)
pang=# explain analyze select test_function(50000);
QUERY PLAN
--------------------------------------------------------------------------------------------
Result (cost=0.00..0.26 rows=1 width=0) (actual
time=94994.337..94994.369
rows=1 loops=1)
Total runtime: 94994.396 ms
(2 rows)
I already tried to update to 8.3.23 service version but i didn't see any
improvement.
Do you have any suggestion about how to increase the performance of
xmlconcat?
My need is to use stored procedures that calls xmlconcat more than 50000
times, but it is unacceptable 94 seconds to complete the job.
Thanks in advance
typically for high performance string manipulation you have to do
things on more purely textual level and manipulate through arrays to
get really good performance. iterative string concatenation is
typically wrong approach -- you have to think in set terms.
also your database version is obsolete -- time to start thinking about upgrade.
merlin
Thank you for the reply Merlin but i don't fully get what you mean. (sorry,
i'm not a PostgreSQL expert)
How would you change the above example function in order to improve
performance?
What do you mean with "manipulate through arrays"?
well arrays, or simple aggregation. for example:
select string_agg(v, '') from (select 'aaaaaaa'::text as v from
generate_series(1,50000)) q;

runs in ~ 30 ms.

merlin
--
Sent via pgsql-performance mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Loading...