Discussion:
Why is my pg_xlog directory so huge?
(too old to reply)
Niels Kristian Schjødt
2013-03-18 09:14:52 UTC
Permalink
After installing my new server I just discovered something that doesn't seem right:

sudo du -h /var/lib/postgresql/9.2/main

4.0K /var/lib/postgresql/9.2/main/pg_snapshots
4.0K /var/lib/postgresql/9.2/main/pg_serial
4.0K /var/lib/postgresql/9.2/main/pg_tblspc
29M /var/lib/postgresql/9.2/main/pg_clog
6.8G /var/lib/postgresql/9.2/main/pg_log
104K /var/lib/postgresql/9.2/main/pg_stat_tmp
81G /var/lib/postgresql/9.2/main/base/27132
6.1M /var/lib/postgresql/9.2/main/base/12040
4.0K /var/lib/postgresql/9.2/main/base/pgsql_tmp
6.0M /var/lib/postgresql/9.2/main/base/12035
6.0M /var/lib/postgresql/9.2/main/base/1
81G /var/lib/postgresql/9.2/main/base
80K /var/lib/postgresql/9.2/main/pg_multixact/members
108K /var/lib/postgresql/9.2/main/pg_multixact/offsets
192K /var/lib/postgresql/9.2/main/pg_multixact
12K /var/lib/postgresql/9.2/main/pg_notify
4.0K /var/lib/postgresql/9.2/main/pg_twophase
160K /var/lib/postgresql/9.2/main/pg_subtrans
752K /var/lib/postgresql/9.2/main/pg_xlog/archive_status
202G /var/lib/postgresql/9.2/main/pg_xlog
496K /var/lib/postgresql/9.2/main/global
289G /var/lib/postgresql/9.2/main

As you can see the pg_xlog folder is 202G, which is more than my entire database - this seems wrong to me, however I have no clue why this would happen.

In short, this is my postgresql.conf

data_directory = '/var/lib/postgresql/9.2/main' # use data in another directory
hba_file = '/etc/postgresql/9.2/main/pg_hba.conf' # host-based authentication file
ident_file = '/etc/postgresql/9.2/main/pg_ident.conf' # ident configuration file
external_pid_file = '/var/run/postgresql/9.2-main.pid' # write an extra PID file
listen_addresses = '192.168.0.4, localhost' # what IP address(es) to listen on;
port = 5432 # (change requires restart)
max_connections = 300 # (change requires restart)
unix_socket_directory = '/var/run/postgresql' # (change requires restart)
wal_level = hot_standby # minimal, archive, or hot_standby
synchronous_commit = on # synchronization level; on, off, or local
checkpoint_segments = 100 # in logfile segments, min 1, 16MB each
checkpoint_timeout = 10min # range 30s-1h
checkpoint_completion_target = 0.9 # checkpoint target duration, 0.0 - 1.0
archive_mode = on # allows archiving to be done
archive_command = 'rsync -a %p ***@192.168.0.2:/var/lib/postgresql/9.2/wals/%f </dev/null' # command to use to archive a logfile segment
max_wal_senders = 1 # max number of walsender processes
wal_keep_segments = 32 # in logfile segments, 16MB each; 0 disables
hot_standby = on # "on" allows queries during recovery
log_line_prefix = '%t ' # special values:
datestyle = 'iso, mdy'
lc_messages = 'en_US.UTF-8' # locale for system error message
lc_monetary = 'en_US.UTF-8' # locale for monetary formatting
lc_numeric = 'en_US.UTF-8' # locale for number formatting
lc_time = 'en_US.UTF-8' # locale for time formatting
default_text_search_config = 'pg_catalog.english'
default_statistics_target = 100
maintenance_work_mem = 1GB
checkpoint_completion_target = 0.9
effective_cache_size = 22GB
work_mem = 160MB
wal_buffers = 4MB
shared_buffers = 4GB
--
Sent via pgsql-performance mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Magnus Hagander
2013-03-18 09:26:17 UTC
Permalink
On Mon, Mar 18, 2013 at 10:14 AM, Niels Kristian Schjødt
Post by Niels Kristian Schjødt
sudo du -h /var/lib/postgresql/9.2/main
<snip>
Post by Niels Kristian Schjødt
As you can see the pg_xlog folder is 202G, which is more than my entire database - this seems wrong to me, however I have no clue why this would happen.
My first guess would be that your archive_command is failing - so
check your logs for that. If that command fails, no xlog files will
ever be rotated (since it would invalidate your backups).
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
--
Sent via pgsql-performance mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Niels Kristian Schjødt
2013-03-18 14:08:42 UTC
Permalink
Okay, thanks. It' seems you were right! Now I have fixed the issue (it was an ssh key).
So I started a:
SELECT pg_start_backup('backup', true);

And when done, I executed a:
sudo -u postgres rsync -av --exclude postmaster.pid --exclude pg_xlog /var/lib/postgresql/9.2/main/ ***@192.168.0.2:/var/lib/postgresql/9.2/main/

Then I tried to finish off the backup by doing a:
SELECT pg_stop_backup();

But It keeps on telling me:
WARNING: pg_stop_backup still waiting for all required WAL segments to be archived (480 seconds elapsed)
HINT: Check that your archive_command is executing properly. pg_stop_backup can be canceled safely, but the database backup will not be usable without all the WAL segments.

And I could see in the log that it's some kind of permission issue. So I canceled it, and started the streaming replication on my slave, and it seems to work fine. However the pg_xlog dir on the master is still HUGE 153G - so how can I get this mess sorted, and cleaned up that directory?
Post by Magnus Hagander
On Mon, Mar 18, 2013 at 10:14 AM, Niels Kristian Schjødt
Post by Niels Kristian Schjødt
sudo du -h /var/lib/postgresql/9.2/main
<snip>
Post by Niels Kristian Schjødt
As you can see the pg_xlog folder is 202G, which is more than my entire database - this seems wrong to me, however I have no clue why this would happen.
My first guess would be that your archive_command is failing - so
check your logs for that. If that command fails, no xlog files will
ever be rotated (since it would invalidate your backups).
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
--
Sent via pgsql-performance mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Magnus Hagander
2013-03-18 14:38:56 UTC
Permalink
On Mon, Mar 18, 2013 at 2:08 PM, Niels Kristian Schjødt
Post by Niels Kristian Schjødt
Okay, thanks. It' seems you were right! Now I have fixed the issue (it was an ssh key).
SELECT pg_start_backup('backup', true);
SELECT pg_stop_backup();
WARNING: pg_stop_backup still waiting for all required WAL segments to be archived (480 seconds elapsed)
HINT: Check that your archive_command is executing properly. pg_stop_backup can be canceled safely, but the database backup will not be usable without all the WAL segments.
And I could see in the log that it's some kind of permission issue. So I canceled it, and started the streaming replication on my slave, and it seems to work fine. However the pg_xlog dir on the master is still HUGE 153G - so how can I get this mess sorted, and cleaned up that directory?
Once you have your archive_command working, it will transfer all your
xlog to the archive. Once it is, the xlog directory should
automatically clean up fairly quickly.

If you still have a permissions problem with the archive, you
obviously need to fix that first.

If you don't care about your archive you could set your
archive_command to e.g. /bin/true, and that will make it pretend it
has archived the files, and should clean it up quicker. But that will
mean you have no valid archive and thus no valid backups, until you
start over froma new base.
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
--
Sent via pgsql-performance mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Niels Kristian Schjødt
2013-03-18 15:41:19 UTC
Permalink
Thanks! it worked! :-)
Post by Magnus Hagander
On Mon, Mar 18, 2013 at 2:08 PM, Niels Kristian Schjødt
Post by Niels Kristian Schjødt
Okay, thanks. It' seems you were right! Now I have fixed the issue (it was an ssh key).
SELECT pg_start_backup('backup', true);
SELECT pg_stop_backup();
WARNING: pg_stop_backup still waiting for all required WAL segments to be archived (480 seconds elapsed)
HINT: Check that your archive_command is executing properly. pg_stop_backup can be canceled safely, but the database backup will not be usable without all the WAL segments.
And I could see in the log that it's some kind of permission issue. So I canceled it, and started the streaming replication on my slave, and it seems to work fine. However the pg_xlog dir on the master is still HUGE 153G - so how can I get this mess sorted, and cleaned up that directory?
Once you have your archive_command working, it will transfer all your
xlog to the archive. Once it is, the xlog directory should
automatically clean up fairly quickly.
If you still have a permissions problem with the archive, you
obviously need to fix that first.
If you don't care about your archive you could set your
archive_command to e.g. /bin/true, and that will make it pretend it
has archived the files, and should clean it up quicker. But that will
mean you have no valid archive and thus no valid backups, until you
start over froma new base.
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
--
Sent via pgsql-performance mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Loading...