Automating rsync over ssh
Consider the problem of automatically copying files from a remote server to a local machine. The following are the requirements:
- use rsync to copy files.
- use ssh to connect to the remote server.
- use cron to automate periodic syncs.
- keep it secure by using ssh key based authentication.
The first three requirements are fairly easy to meet & set up. It is the last one which took me some time to figure out.
I use key based authentication for ssh connections wherever possible. I also encrypt the private key using a passphrase. Here’s a guide that walks you through the setup. In normal use, I have to enter the passphrase only once per login session. Thereafter, it is cached using ssh-agent/gnome keyring. Thus, I never have to enter the remote server password or my private key passphrase when opening ssh connections to remote machines.
Crucially, when using rsync over ssh, it picks up the keys too & works without requiring further password/passphrase input from me! So it is clearly possible to automate rsync over ssh without requiring manual input.
The problem, however, comes when trying to run rsync in a cron job. As it happens, cron runs jobs in a restricted environment which doesn’t have access to the regular user’s environment. Thus, whatever magic enables rsync to talk to ssh-agent to get the private keys doesn’t work in a cron invoked shell.
What is that magic? Turns out, ssh-agent creates a local socket over which clients can talk to it. It exports the environment variable SSH_AUTH_SOCK to point to the socket file. On Ubuntu, and perhaps other flavours, I see this:
antrix@cellar:~$ env | grep SSH_AUTH_SOCK
SSH_AUTH_SOCK=/tmp/keyring-IvaiHt/ssh
As you can see, the socket is created in a randomly named temp directory that changes every login session. So to make rsync work in a cron environment, we just have to set up the SSH_AUTH_SOCK variable correctly and then rsync should be able to connect.
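Before wiring this into cron, it helps to confirm that a given SSH_AUTH_SOCK value really reaches the agent. Here is a minimal sketch that relies on the exit status of ssh-add -l (0 when keys are listed, 1 when the agent holds no keys, 2 when the agent cannot be contacted). The function name agent_status is my own and purely illustrative.

```shell
# agent_status: report whether the agent behind $SSH_AUTH_SOCK is usable.
# The function name is hypothetical; only ssh-add itself is a real tool.
agent_status() {
    ssh-add -l >/dev/null 2>&1
    case $? in
        0) echo "agent reachable, keys loaded" ;;
        1) echo "agent reachable, no keys loaded" ;;
        *) echo "agent not reachable" ;;   # exit 2, or ssh-add is missing
    esac
}
```

Running agent_status in a normal login shell and then again with the environment cron provides should show the difference described above.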
This is how I do it:
#!/bin/bash
# syncfiles.sh - run this script from a cron job to sync files
# Look for an rsync that is already running (grep -v grep drops the grep itself)
RUNNING=$(ps -ef | grep 'rsync -av' | grep -v grep)
if [ -z "$RUNNING" ]; then
    # Point SSH_AUTH_SOCK at the agent socket; take the first match if several exist
    export SSH_AUTH_SOCK=$(find /tmp/keyring-*/ -type s -user antrix -group antrix -name ssh | head -n 1)
    echo "================" >>/tmp/rsync.log
    date +"%D %T" >>/tmp/rsync.log
    rsync -av antrix@example.com:/path/to/sync/ /path/to/local/ >>/tmp/rsync.log 2>&1
    echo "================" >>/tmp/rsync.log
else
    # A sync is already in progress; do nothing
    exit 0
fi
Since the directory in which the socket will be created will vary with each session, we need to first find the correct file & then export it in the environment. The check for $RUNNING ensures that only one instance of the script runs at a time.
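The ps and grep check can misfire if some unrelated process happens to have 'rsync -av' in its command line. One possible alternative, which is not what the script above does, is a lock directory: mkdir either creates the directory or fails, so only one caller can win. This is a rough sketch; the lock path and the function names are assumptions made for illustration.

```shell
# Hypothetical lock location; any path writable by the cron user would do.
LOCK_DIR="${LOCK_DIR:-/tmp/syncfiles.lock}"

# acquire_lock: succeeds only for the first caller, because mkdir fails
# when the directory already exists.
acquire_lock() {
    mkdir "$LOCK_DIR" 2>/dev/null
}

# release_lock: remove the lock directory so the next run can proceed.
release_lock() {
    rmdir "$LOCK_DIR" 2>/dev/null
}

# Possible use inside the sync script:
#   if acquire_lock; then
#       trap release_lock EXIT
#       ... run rsync here ...
#   fi
```

One caveat with this approach: if the script is killed before the trap runs, the stale directory has to be removed by hand.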
With the script in place, set up a cron job as usual to run syncfiles.sh and everything should just work.
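For completeness, here is a small sketch of what the crontab entry could look like. The 15 minute interval and the script location are assumptions, not values from my setup, and print_cron_entry is just an illustrative helper name.

```shell
# print_cron_entry MINUTES SCRIPT: print a crontab line that runs SCRIPT
# every MINUTES minutes. Both arguments are illustrative assumptions.
print_cron_entry() {
    printf '*/%s * * * * %s\n' "$1" "$2"
}

# To append the entry to the current user's crontab, something like this
# should work (crontab -l lists the existing entries, crontab - reads stdin):
#   ( crontab -l 2>/dev/null; print_cron_entry 15 "$HOME/bin/syncfiles.sh" ) | crontab -
```

Editing the crontab by hand with crontab -e and pasting the printed line works just as well.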
There’s still the matter of entering the passphrase the first time to unlock the private keys. Thankfully, Ubuntu sets up a keyring that prompts me to enter the passphrase as soon as the script is executed for the first time. Since I rarely shut down this machine, even that task is done only once in a blue moon.