I tried setting
juju config postgresql wal_e_storage_uri="file://localhost/backups/pg"
That triggered a snap install of wal-e, which was a step forward.
Then I tried to run a few actions.
The last command is just freezing…
I tried setting
juju config postgresql wal_e_storage_uri="file://localhost/backups/pg"
That triggered a snap install of wal-e, which was a step forward.
Then I tried to run a few actions.
The last command is just freezing…
After some discussion in Mattermost, I think that we’ve fixed this. When specifying a backup on the local filesystem, the charm will not make a local dir or setup permissions. That’s okay, as this may be out of scope for the postgres charm (discussion welcome!) For now, the following will get things to work:
# Setup the backup dir
juju ssh postgresql/0
sudo mkdir /var/backups/pg
sudo chown postgres /var/backups/pg
exit
# Set the config
juju config postgresql wal_e_storage_uri="file://localhost/var/backups/pg"
juju run-action postgresql/1 wal-e-backup --wait
@pengale - this is amazing, I’m testing that right now.
Now, the next challenge is the recover part…
I’d written a charm which must relate to a postgresql DB… and does a bit of the things highlighted here like making directories or setting up some basic permissions within the DB… I am eager to see how I can automate the restoration of a backup too
@emcp For sure. The reason I’m doing this now is that I’m in the process of writing backup procedures for the nextcloud charm that uses this postgres charm.
I’ve not yet released the charm, but having a proper backup/restore process in place feels like something important…
Any help greatly appreciated =) Drinks included.
I got stuck a little bit before where you are now automating via WALE … mostly because I was not ready to start digging into Point In Time Restorations and needed to just take a simpler approach with pg_restore
of a .sql
file…
all the helper charm does right now is … acquire db-admin
of the postgresql charm unit… and then from other charm pieces receive requests in the form of .sql snippets granting access downstream of itself
if it’s of interest can def. send it your way… but the db-admin makes me nervous, and I’ve only ever written these routines for LXD clouds
Yeah, the ol’way of just running
sudo -u postgres pg_dump foobar | gzip > foobar.sql.gz
Is legit anyways…
I’ve got 0 experience from wal-e, so I’m still exploring what this really leads me up to.
I’m rolling up a new db now to test a restore from this wal-e now to see where it gets me…
So, I started a new fresh unit.
Copied back the complete /var/backup/pg
Created a new directory /var/backup/restore and fixed permissions…
Then I ran…
juju config postgresql wal_e_storage_uri="file://localhost/var/backups/restore"
… and then I tried the restore action…
erik@pixel:~ > juju run-action postgresql/4 wal-e-restore storage-uri="file://localhost/var/backups/pg" confirm=true --wait
unit-postgresql-4:
UnitId: postgresql/4
id: "27"
results:
Stderr: |
wal_e.main INFO MSG: starting WAL-E
DETAIL: The subcommand is "backup-list".
STRUCTURED: time=2022-01-26T20:07:32.659825-00 pid=44274
wal_e.main INFO MSG: starting WAL-E
DETAIL: The subcommand is "backup-fetch".
STRUCTURED: time=2022-01-26T20:07:36.695406-00 pid=44393
wal_e.worker.file.file_worker INFO MSG: beginning partition download
DETAIL: The partition being downloaded is part_00000000.tar.lzo.
HINT: The absolute file key is var/backups/pg/basebackups_005/base_000000010000000000000006_00000040/tar_partitions/part_00000000.tar.lzo.
STRUCTURED: time=2022-01-26T20:07:36.936420-00 pid=44393
wal_e.main INFO MSG: starting WAL-E
DETAIL: The subcommand is "delete everything".
STRUCTURED: time=2022-01-26T20:07:46.024054-00 pid=44430
wal_e.main INFO MSG: deleting data in the store
STRUCTURED: time=2022-01-26T20:07:46.156494-00 pid=44430
wal_e.worker.base INFO MSG: deleting part of wal logs
DETAIL: The key being deleted is file://localhost/var/backups/restore/wal_005/000000010000000000000001.lzo.
STRUCTURED: time=2022-01-26T20:07:46.164442-00 pid=44430
Job for postgresql@12-main.service failed because the service did not take the steps required by its unit configuration.
See "systemctl status postgresql@12-main.service" and "journalctl -xe" for details.
Stdout: |
lxc
pg_control version number: 1201
Catalog version number: 201909212
Database system identifier: 7057599724181653945
Database cluster state: in production
pg_control last modified: Wed Jan 26 19:43:28 2022
Latest checkpoint location: 0/6000060
Latest checkpoint's REDO location: 0/6000028
Latest checkpoint's REDO WAL file: 000000010000000000000006
Latest checkpoint's TimeLineID: 1
Latest checkpoint's PrevTimeLineID: 1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID: 0:488
Latest checkpoint's NextOID: 24576
Latest checkpoint's NextMultiXactId: 1
Latest checkpoint's NextMultiOffset: 0
Latest checkpoint's oldestXID: 479
Latest checkpoint's oldestXID's DB: 1
Latest checkpoint's oldestActiveXID: 488
Latest checkpoint's oldestMultiXid: 1
Latest checkpoint's oldestMulti's DB: 1
Latest checkpoint's oldestCommitTsXid:0
Latest checkpoint's newestCommitTsXid:0
Time of latest checkpoint: Wed Jan 26 19:43:28 2022
Fake LSN counter for unlogged rels: 0/3E8
Minimum recovery ending location: 0/0
Min recovery ending loc's timeline: 0
Backup start location: 0/0
Backup end location: 0/0
End-of-backup record required: no
wal_level setting: logical
wal_log_hints setting: off
max_connections setting: 100
max_worker_processes setting: 8
max_wal_senders setting: 10
max_prepared_xacts setting: 0
max_locks_per_xact setting: 64
track_commit_timestamp setting: off
Maximum data alignment: 8
Database block size: 8192
Blocks per segment of large relation: 131072
WAL block size: 8192
Bytes per WAL segment: 16777216
Maximum length of identifiers: 64
Maximum columns in an index: 32
Maximum size of a TOAST chunk: 1996
Size of a large-object chunk: 2048
Date/time type storage: 64-bit integers
Float4 argument passing: by value
Float8 argument passing: by value
Data page checksum version: 1
Mock authentication nonce: 540e679209ed9eca2d25eba22ac23d86a0485dff5a6b1bfa48c44d7985a86bd0
2022-01-26 19:53:44.292 GMT [31861] LOG: skipping missing configuration file "/etc/postgresql/12/main/juju_recovery.conf"
2022-01-26 19:53:44.292 GMT [31861] LOG: skipping missing configuration file "/etc/postgresql/12/main/juju_recovery.conf"
2022-01-26 19:53:44 UTC [31861]: [3-1] db=,user= LOG: starting PostgreSQL 12.9 (Ubuntu 12.9-0ubuntu0.20.04.1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, 64-bit
2022-01-26 19:53:44 UTC [31861]: [4-1] db=,user= LOG: listening on IPv4 address "0.0.0.0", port 5432
2022-01-26 19:53:44 UTC [31861]: [5-1] db=,user= LOG: listening on IPv6 address "::", port 5432
2022-01-26 19:53:44 UTC [31861]: [6-1] db=,user= LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2022-01-26 19:53:44 UTC [31862]: [1-1] db=,user= LOG: database system was shut down at 2022-01-26 19:53:37 UTC
2022-01-26 19:53:44 UTC [31861]: [7-1] db=,user= LOG: database system is ready to accept connections
2022-01-26 19:53:45 UTC [31870]: [1-1] db=[unknown],user=[unknown] LOG: connection received: host=[local]
2022-01-26 19:53:45 UTC [31873]: [1-1] db=[unknown],user=[unknown] LOG: connection received: host=[local]
2022-01-26 19:53:45 UTC [31873]: [2-1] db=template1,user=postgres LOG: connection authorized: user=postgres database=template1 application_name=psql
2022-01-26 19:53:45 UTC [31873]: [3-1] db=template1,user=postgres LOG: disconnection: session time: 0:00:00.014 user=postgres database=template1 host=[local]
2022-01-26 19:53:46 UTC [31876]: [1-1] db=[unknown],user=[unknown] LOG: connection received: host=[local]
2022-01-26 19:53:46 UTC [31876]: [2-1] db=template1,user=postgres LOG: connection authorized: user=postgres database=template1 application_name=psql
2022-01-26 19:53:46 UTC [31876]: [3-1] db=template1,user=postgres LOG: disconnection: session time: 0:00:00.005 user=postgres database=template1 host=[local]
2022-01-26 19:53:46 UTC [31879]: [1-1] db=[unknown],user=[unknown] LOG: connection received: host=[local]
2022-01-26 19:53:46 UTC [31879]: [2-1] db=template1,user=postgres LOG: connection authorized: user=postgres database=template1 application_name=psql
2022-01-26 19:53:46 UTC [31879]: [3-1] db=template1,user=postgres LOG: disconnection: session time: 0:00:00.005 user=postgres database=template1 host=[local]
2022-01-26 19:53:46 UTC [31880]: [1-1] db=[unknown],user=[unknown] LOG: connection received: host=[local]
2022-01-26 19:53:46 UTC [31880]: [2-1] db=postgres,user=postgres LOG: connection authorized: user=postgres database=postgres
2022-01-26 19:53:46 UTC [31880]: [3-1] db=postgres,user=postgres LOG: disconnection: session time: 0:00:00.008 user=postgres database=postgres host=[local]
2022-01-26 19:53:46 UTC [31901]: [1-1] db=[unknown],user=[unknown] LOG: connection received: host=[local]
2022-01-26 19:53:46 UTC [31901]: [2-1] db=postgres,user=postgres LOG: connection authorized: user=postgres database=postgres
2022-01-26 19:53:46 UTC [31901]: [3-1] db=postgres,user=postgres LOG: disconnection: session time: 0:00:00.107 user=postgres database=postgres host=[local]
2022-01-26 19:54:01 UTC [33301]: [1-1] db=[unknown],user=[unknown] LOG: connection received: host=[local]
2022-01-26 19:54:01 UTC [33301]: [2-1] db=postgres,user=postgres LOG: connection authorized: user=postgres database=postgres
2022-01-26 19:54:01 UTC [33301]: [3-1] db=postgres,user=postgres LOG: disconnection: session time: 0:00:00.007 user=postgres database=postgres host=[local]
2022-01-26 19:55:01 UTC [35594]: [1-1] db=[unknown],user=[unknown] LOG: connection received: host=[local]
2022-01-26 19:55:01 UTC [35594]: [2-1] db=postgres,user=postgres LOG: connection authorized: user=postgres database=postgres
2022-01-26 19:55:01 UTC [35594]: [3-1] db=postgres,user=postgres LOG: disconnection: session time: 0:00:00.009 user=postgres database=postgres host=[local]
2022-01-26 19:56:02 UTC [36998]: [1-1] db=[unknown],user=[unknown] LOG: connection received: host=[local]
2022-01-26 19:56:02 UTC [36998]: [2-1] db=postgres,user=postgres LOG: connection authorized: user=postgres database=postgres
2022-01-26 19:56:02 UTC [36998]: [3-1] db=postgres,user=postgres LOG: disconnection: session time: 0:00:00.010 user=postgres database=postgres host=[local]
2022-01-26 19:57:01 UTC [37017]: [1-1] db=[unknown],user=[unknown] LOG: connection received: host=[local]
2022-01-26 19:57:01 UTC [37017]: [2-1] db=postgres,user=postgres LOG: connection authorized: user=postgres database=postgres
2022-01-26 19:57:01 UTC [37017]: [3-1] db=postgres,user=postgres LOG: disconnection: session time: 0:00:00.008 user=postgres database=postgres host=[local]
2022-01-26 19:58:01 UTC [37136]: [1-1] db=[unknown],user=[unknown] LOG: connection received: host=[local]
2022-01-26 19:58:01 UTC [37136]: [2-1] db=postgres,user=postgres LOG: connection authorized: user=postgres database=postgres
2022-01-26 19:58:01 UTC [37136]: [3-1] db=postgres,user=postgres LOG: disconnection: session time: 0:00:00.007 user=postgres database=postgres host=[local]
2022-01-26 19:58:44 UTC [31863]: [1-1] db=,user= LOG: checkpoint starting: time
2022-01-26 19:58:45 UTC [31863]: [2-1] db=,user= LOG: checkpoint complete: wrote 4 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.434 s, sync=0.026 s, total=0.495 s; sync files=4, longest=0.007 s, average=0.007 s; distance=2 kB, estimate=2 kB
2022-01-26 19:59:01 UTC [37244]: [1-1] db=[unknown],user=[unknown] LOG: connection received: host=[local]
2022-01-26 19:59:01 UTC [37244]: [2-1] db=postgres,user=postgres LOG: connection authorized: user=postgres database=postgres
2022-01-26 19:59:01 UTC [37244]: [3-1] db=postgres,user=postgres LOG: disconnection: session time: 0:00:00.008 user=postgres database=postgres host=[local]
2022-01-26 20:00:01 UTC [37292]: [1-1] db=[unknown],user=[unknown] LOG: connection received: host=[local]
2022-01-26 20:00:01 UTC [37292]: [2-1] db=postgres,user=postgres LOG: connection authorized: user=postgres database=postgres
2022-01-26 20:00:01 UTC [37292]: [3-1] db=postgres,user=postgres LOG: disconnection: session time: 0:00:00.007 user=postgres database=postgres host=[local]
2022-01-26 20:01:01 UTC [38825]: [1-1] db=[unknown],user=[unknown] LOG: connection received: host=[local]
2022-01-26 20:01:01 UTC [38825]: [2-1] db=postgres,user=postgres LOG: connection authorized: user=postgres database=postgres
2022-01-26 20:01:01 UTC [38825]: [3-1] db=postgres,user=postgres LOG: disconnection: session time: 0:00:00.010 user=postgres database=postgres host=[local]
2022-01-26 20:02:01 UTC [38844]: [1-1] db=[unknown],user=[unknown] LOG: connection received: host=[local]
2022-01-26 20:02:01 UTC [38844]: [2-1] db=postgres,user=postgres LOG: connection authorized: user=postgres database=postgres
2022-01-26 20:02:01 UTC [38844]: [3-1] db=postgres,user=postgres LOG: disconnection: session time: 0:00:00.008 user=postgres database=postgres host=[local]
2022-01-26 20:03:01 UTC [38864]: [1-1] db=[unknown],user=[unknown] LOG: connection received: host=[local]
2022-01-26 20:03:01 UTC [38864]: [2-1] db=postgres,user=postgres LOG: connection authorized: user=postgres database=postgres
2022-01-26 20:03:01 UTC [38864]: [3-1] db=postgres,user=postgres LOG: disconnection: session time: 0:00:00.008 user=postgres database=postgres host=[local]
2022-01-26 20:04:02 UTC [39562]: [1-1] db=[unknown],user=[unknown] LOG: connection received: host=[local]
2022-01-26 20:04:02 UTC [39562]: [2-1] db=postgres,user=postgres LOG: connection authorized: user=postgres database=postgres
2022-01-26 20:04:02 UTC [39562]: [3-1] db=postgres,user=postgres LOG: disconnection: session time: 0:00:00.009 user=postgres database=postgres host=[local]
2022-01-26 20:05:01 UTC [41654]: [1-1] db=[unknown],user=[unknown] LOG: connection received: host=[local]
2022-01-26 20:05:01 UTC [41654]: [2-1] db=postgres,user=postgres LOG: connection authorized: user=postgres database=postgres
2022-01-26 20:05:01 UTC [41654]: [3-1] db=postgres,user=postgres LOG: disconnection: session time: 0:00:00.008 user=postgres database=postgres host=[local]
2022-01-26 20:06:01 UTC [41680]: [1-1] db=[unknown],user=[unknown] LOG: connection received: host=[local]
2022-01-26 20:06:01 UTC [41680]: [2-1] db=postgres,user=postgres LOG: connection authorized: user=postgres database=postgres
2022-01-26 20:06:01 UTC [41680]: [3-1] db=postgres,user=postgres LOG: disconnection: session time: 0:00:00.011 user=postgres database=postgres host=[local]
2022-01-26 20:07:01 UTC [43092]: [1-1] db=[unknown],user=[unknown] LOG: connection received: host=[local]
2022-01-26 20:07:01 UTC [43092]: [2-1] db=postgres,user=postgres LOG: connection authorized: user=postgres database=postgres
2022-01-26 20:07:01 UTC [43092]: [3-1] db=postgres,user=postgres LOG: disconnection: session time: 0:00:00.009 user=postgres database=postgres host=[local]
2022-01-26 20:07:33 UTC [31861]: [8-1] db=,user= LOG: received fast shutdown request
2022-01-26 20:07:33 UTC [31861]: [9-1] db=,user= LOG: aborting any active transactions
2022-01-26 20:07:33 UTC [31861]: [10-1] db=,user= LOG: background worker "logical replication launcher" (PID 31869) exited with exit code 1
2022-01-26 20:07:33 UTC [31863]: [3-1] db=,user= LOG: shutting down
2022-01-26 20:07:33 UTC [31863]: [4-1] db=,user= LOG: checkpoint starting: shutdown immediate
2022-01-26 20:07:33 UTC [31863]: [5-1] db=,user= LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.024 s, sync=0.001 s, total=0.260 s; sync files=0, longest=0.000 s, average=0.000 s; distance=7754 kB, estimate=7754 kB
2022-01-26 20:07:34 UTC [31861]: [11-1] db=,user= LOG: database system is shut down
2022-01-26 20:07:46.422 GMT [44471] LOG: skipping missing configuration file "/etc/postgresql/12/main/juju_recovery.conf"
2022-01-26 20:07:46.422 GMT [44471] LOG: skipping missing configuration file "/etc/postgresql/12/main/juju_recovery.conf"
2022-01-26 20:07:46 UTC [44471]: [3-1] db=,user= LOG: starting PostgreSQL 12.9 (Ubuntu 12.9-0ubuntu0.20.04.1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, 64-bit
2022-01-26 20:07:46 UTC [44471]: [4-1] db=,user= LOG: listening on IPv4 address "0.0.0.0", port 5432
2022-01-26 20:07:46 UTC [44471]: [5-1] db=,user= LOG: listening on IPv6 address "::", port 5432
2022-01-26 20:07:46 UTC [44471]: [6-1] db=,user= LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2022-01-26 20:07:46 UTC [44473]: [1-1] db=,user= LOG: database system was interrupted; last known up at 2022-01-26 19:43:28 UTC
2022-01-26 20:07:46 UTC [44473]: [2-1] db=,user= LOG: creating missing WAL directory "pg_wal/archive_status"
2022-01-26 20:07:46 UTC [44473]: [3-1] db=,user= FATAL: using recovery command file "recovery.conf" is not supported
2022-01-26 20:07:46 UTC [44471]: [7-1] db=,user= LOG: startup process (PID 44473) exited with exit code 1
2022-01-26 20:07:46 UTC [44471]: [8-1] db=,user= LOG: aborting startup due to startup process failure
2022-01-26 20:07:46 UTC [44471]: [9-1] db=,user= LOG: database system is shut down
pg_ctl: could not start server
Examine the log output.
recovery.conf exists at /var/lib/postgresql/12/main/recovery.conf
status: completed
timing:
completed: 2022-01-26 20:07:47 +0000 UTC
enqueued: 2022-01-26 20:07:29 +0000 UTC
started: 2022-01-26 20:07:29 +0000 UTC
… and my unit isn’t so happy.
@erik-lonroth apologies. I’ve been on vacation for the past few days. I did not attempt to do a restore. From those logs, it’s hard to tell whether the charm is doing something that is breaking the backup, or if the backup is failing due to some internal issues (e.g., more permissions stuff).
I’ll try to set aside some time to take a look at it tomorrow …
(mentioning here to include what I mentioned in reply to Erik on chat.charmhub.io)
Looking at https://www.postgresql.org/docs/13/recovery-config.html (and the fact that stub’s blog post was from 2017) I suspect this is a charm bug that manifests with PG12 or greater. I’ll try and find time to confirm that tomorrow (would start by deploying on an older series than focal) and then we can file a bug against the charm if we can confirm this is a PG12 or greater problem.
And in fact I see https://bugs.launchpad.net/postgresql-charm/+bug/1924756 already covers this. I’ll try to find time to confirm it does work on older versions, and then update that bug with details.
Any news on this?
I would say this is a really great thing to get working for a database charm.
I can’t use it unless I can trust restoring works…
Can I help?
Hi Erik,
We’ve got some resources to work on this, so hopefully we’ll have a fix before too long. You could certainly help by verifying this works as expected on older series (bionic should be fine) and then testing again on focal once we’ve got a fix ready - we’ll update here once that’s happened.
We don’t have a specific timeline for that yet, but we did want to let you know someone will be working on it soon.
Thanks, Tom
I will for sure test it out!
I was looking for it and followed this solution. In my case what happened was the the backup went well, but it kept the archive_mode even after the backup was finished. So it continued to fill pg_wal
directory even after wal_e_storage config was set to empty juju config postgresql wal_e_storage_uri=
. It lead to a full disk and PG instance down.
Its not clear to me what is the status on this @mthaddon
Currently, we’re relying on manually creating backups for pgsql.
Anyone who has been able to perform a backup-restore cycle using this charm?
We’ve just published an update to the postgresql charm in the edge channel that should fix this. Can you test and let us know? If so we’ll publish that to the stable channel.