Automating rsync over ssh

Consider the problem of automatically copying files from a remote server to a local machine. The following are the requirements:

The first three requirements are fairly easy to meet and set up. It is the last one that took me some time to figure out.

I use key-based authentication for ssh connections wherever possible. I also encrypt the private key using a passphrase. Here’s a guide that walks you through the setup. In normal use, I have to enter the passphrase only once per login session. Thereafter, the unlocked key is cached by ssh-agent/GNOME Keyring. Thus, I never have to enter the remote server password or my private key passphrase when opening ssh connections to remote machines.

Crucially, when rsync runs over ssh, it picks up the keys too and works without requiring further password/passphrase input from me! So it is clearly possible to automate rsync over ssh without requiring manual input.

The problem, however, comes when trying to run rsync in a cron job. As it happens, cron runs jobs in a minimal environment that doesn’t inherit the variables of the regular user’s login session. Thus, whatever magic enables rsync to talk to ssh-agent to get the private keys doesn’t work in a cron-invoked shell.

What is that magic? Turns out, ssh-agent creates a local socket over which clients can talk to it. It exports the environment variable SSH_AUTH_SOCK to point to the socket file. On Ubuntu, and perhaps other flavours, I see this:

antrix@cellar:~$ env | grep SSH_AUTH_SOCK
SSH_AUTH_SOCK=/tmp/keyring-IvaiHt/ssh

As you can see, the socket is created in a randomly named temp directory that changes with every login session. So to make rsync work in a cron environment, we just have to set up the SSH_AUTH_SOCK variable correctly and then rsync should be able to connect.
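
To see the difference for yourself, here is a small sketch. It checks whether SSH_AUTH_SOCK points at a real socket in the current shell, and then looks at the same variable in a stripped-down environment that only roughly approximates what cron provides. The helper name agent_socket_ok is made up for this illustration, and the env -i call is an approximation, not exactly what cron does.

```shell
#!/bin/bash
# agent_socket_ok (hypothetical helper): succeed only if $1 names an existing socket file.
agent_socket_ok() {
    [ -n "$1" ] && [ -S "$1" ]
}

# In a normal login shell the agent socket is usually present.
if agent_socket_ok "${SSH_AUTH_SOCK:-}"; then
    echo "current shell: agent socket at $SSH_AUTH_SOCK"
else
    echo "current shell: no usable agent socket"
fi

# Rough stand-in for cron: env -i starts the child shell with an empty
# environment, so SSH_AUTH_SOCK is not inherited.
cron_like_value=$(env -i PATH=/usr/bin:/bin /bin/sh -c 'echo "${SSH_AUTH_SOCK:-unset}"')
echo "stripped environment: SSH_AUTH_SOCK is $cron_like_value"
```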

This is how I do it:

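#!/bin/bash
# syncfiles.sh - run this script from a cron job to sync files

# Look for an rsync that is still running from a previous invocation.
RUNNING=$(ps -ef | grep 'rsync -av' | grep -v grep)
# Quote the variable: ps output contains spaces, which breaks an unquoted test.
if [ -z "$RUNNING" ]; then
    # Find this session's agent socket; keep only the first match if there are several.
    export SSH_AUTH_SOCK=$(find /tmp/keyring-*/ -type s -user antrix -group antrix -name ssh | head -n 1)
    echo "================" >>/tmp/rsync.log
    date +"%D %T" >>/tmp/rsync.log
    rsync -av antrix@example.com:/path/to/sync/ /path/to/local/ >>/tmp/rsync.log 2>&1
    echo "================" >>/tmp/rsync.log
fi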

Since the directory in which the socket is created varies with each session, we first need to find the socket file and then export its path in the environment. The check for $RUNNING skips the run if a previous rsync is still in progress, so that only one instance is active at a time.
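
Grepping the process list works for me, but it can match unrelated rsync commands and leaves a small window between the check and the start of rsync. One alternative, which is not what my script does, is to take a lock with flock(1) from util-linux. A minimal sketch, assuming flock is installed and using a made-up lock file path:

```shell
#!/bin/bash
# Sketch only: serialize runs with flock(1) instead of grepping ps output.
LOCKFILE=/tmp/syncfiles.lock    # hypothetical path, not from the original script

# Open the lock file on file descriptor 9. The lock lives as long as this
# shell keeps the descriptor open and is released when the script exits.
exec 9>"$LOCKFILE"

if ! flock -n 9; then
    # Another run still holds the lock, so skip this one quietly.
    exit 0
fi

echo "lock acquired, safe to start rsync"
# ... the SSH_AUTH_SOCK lookup and the rsync command would go here ...
```

With -n, flock gives up immediately instead of waiting, which matches the behaviour of the $RUNNING check: a run that overlaps with a previous one simply does nothing.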

With the script in place, set up a cron job as usual to run syncfiles.sh and everything should just work.
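
For illustration, a crontab entry (added via crontab -e) could look like the following. Both the 30-minute schedule and the script location are assumptions made for this example; adjust them to wherever you saved syncfiles.sh and however often you want it to run.

```
# m    h  dom mon dow  command
*/30   *  *   *   *    /home/antrix/bin/syncfiles.sh
```

The script must be executable (chmod +x) for cron to run it this way.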

There’s still the matter of entering the passphrase the first time to unlock the private keys. Thankfully, Ubuntu sets up the keyring so that it prompts me to enter the passphrase as soon as the script is executed for the first time. Since I rarely shut down this machine, even that task is done only once in a blue moon.