Skip to Content

PostgreSQL 8.4 Database Backups with pg_dump and cron

Do a Google Search for "postgresql backup" and you will find tons of information. Now add the word "cron" onto that list and try to make heads or tails of the results. You find lots of information, but none of it with clear concise "do this" type instructions (at least none that I found). So, here's a description of the problem, and the solution I found to solve it.

The problem

When you execute pg_dump, it will by default run as the currently logged in user. So this means your backup scripts may run from the command line just fine, but fail utterly when run as a scheduled task. Even if you specify what user to use for the command with the "-U" option, you still need to enter that user's password. The automated task runs without user input, so that password doesn't get entered. PostgreSQL does not support passing the password on the command line.


Update: There is support for passing credentials as environment variables, but I haven't thoroughly tested this yet.

So the problem then is how to run a scheduled task that will execute pg_dump with the correct database user AND provide the password.

To solve this problem we have to explore a little deeper into PostgreSQL.

The solution

Some of the Google solutions previously mentioned suggest using environment variables. I find this a little messy, and more importantly - couldn't get it working.

Some of those solutions suggest setting up the cron job under the "postgres" user's crontab. Again, I couldn't get this working without messing around with the pg_hba.conf configurations, which introduces some possible security issues.

The best method (i.e. the one that worked for me) was to make use of a .pgpass file. This is a simple file that contains authentication information in this form:

<server>:<port>:<database>:<username>:<password>

This file specifies what credentials to use if they are not specified. Multiple entries can exist in the file, and the wildcard character "*" is supported. The first line that matches the connection being attempted is used. So a "real" line may be:

localhost:*:sampledatabase:bob:supersecret

In my case, I had to play with the server and port values to get things working.

And this works without requiring any changes to the pg_hba.conf file. (assuming you are already set up to allow users on the local network authenticated access...)

There is one other step. The file MUST be readable ONLY by the user running the commands. For Linux that is read/write for the owner, nothing for group and other - or a mask of 600. If you fail to set this you'll be told about the permission error and the .pgpass file is not used.

So, because we are running a cron job, and cron runs as root, we need to create a /root/.pgpass file. And then execute the command "chmod 600 /root/.pgpass" (as the root user).

Once the file is created and has the proper permissions, you should be able to execute the following as the root user:

pg_dump sampledatabse > backup.sql

That should run even if there is no "root" database user.

Even safer

I took another step to help make things secure, and one to help with the maintenance.

First, I created a special user that is only to be used for backup scripts. I used this command:

sudo -u postgres createuser -s -P backupscripts

Yep, I named my user "backupscripts", because I'm soooo original. I assigned the password, and made sure the user is a super user (needed for pg_dump!). This user information is NOT given out and only the sysadmins know about it. This way, if we ever suspect the user account is compromised, we can just change it's password, or delete the user without affecting any of the other database users. So, instead of running "pg_dump sampledatabase > backup.sql", I modified this to :

pg_dump -U backupscripts sampledatabase > backup.sql

Regarding maintenance... This is the sort of thing that will be easily forgotten once it is running. So, make sure you document the setup. In my case, I simply added a note to my backup script that explained the need for the .pgpass file, with a link to pertinent documentation. Anyone trying to troubleshoot backup issues will eventually take a look at that script. This is all covered in the installation instructions for the application as well.

Conclusion

Making use of the .pgpass feature of PostgreSQL is easy, though not well covered. But, it can quickly get you back to "it just works" with only a minor distraction.

Now that database backups can be automated and scripted reliably, you can focus on other issues - such as what other arguments you may want to pass to pg_dump. I suggest taking a look at using the formatted output (that can then be used with pg_restore and allows a few more options to you!). Running "man pg_dump", or even "pg_dump --help" will give you a list of the possibilities. You'll need to find the ones that work for your situation.