Mini-HOWTO: Fixing rsync on Tiger (Mac OS X 10.4.x)

Introduction

Now that the horse has left the barn, I decided to finally implement the backup system I'd been thinking about for ages. Disk crashes can be great motivators.

Goal

Producing a working network backup / cloning system for Mac OS X systems. The system can be used for local backups as well, for example to FireWire disks.

Problems

Many files on HFS+, the Mac's most common file system, have metadata. This is partly a leftover from the past (resource forks), and partly a new development (ACLs, extended attributes). Plain rsync doesn't (yet) cope with this metadata.

Since OS X 10.4 (aka Tiger), Mac OS X has shipped with a modified version of rsync. An added option, -E, enables the transfer of extended attributes. This is done by encapsulating the resource fork, Finder data et al. in a synthetic file which is added to the rsync transfer list. The name of this file is formed by prepending ._ to the name of the original file, a technique which is also used when copying data from HFS+ partitions to non-Apple file systems such as NFS mounts. It may not be pretty or foolproof (what happens when both foo and ._foo exist?), but at least it's documented by Apple and not likely to change in the very near future. This rsync derivative is based on rsync-2.6.3.
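The naming rule described above can be sketched in a few lines of shell. This is only an illustration of the "prepend ._" convention, not code taken from Apple's rsync; the function names sidecar_name and sidecar_collides are made up for this example.

```shell
# Print the name of the synthetic metadata file for a given path:
# "._" is prepended to the last path component.
# (sidecar_name is a hypothetical helper, not part of rsync.)
sidecar_name() {
    dir=$(dirname -- "$1")
    base=$(basename -- "$1")
    printf '%s/._%s\n' "$dir" "$base"
}

# Succeed if something already exists under the synthetic name,
# i.e. the "both foo and ._foo exist" case mentioned above.
sidecar_collides() {
    [ -e "$(sidecar_name "$1")" ]
}
```

Running sidecar_collides over a source tree before a backup would at least show whether the awkward case occurs on your disk at all.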

However, Googling and testing have revealed four problems with Apple's rsync. In order of severity, worst first:

  1. The rsync sender will frequently crash with a Bus Error / Segmentation Fault after generating the file list, but before transferring any files. This turns out to be caused by a buffer overrun.
  2. When used with the --delete option, the rsync receiver will try to unlink the (fake) synthetic files, flooding the syslog with failure reports, possibly filling the entire boot disk.
  3. When files with extended attributes are transferred, the modification time will be set to the time of the transfer, even when the user has specified that modification times be preserved. As a result, using mtime to determine whether a file has changed is broken.
  4. Extended attributes have no modification time of their own. Since a file's mtime is not updated when its attributes are changed, only checksumming can be used to determine whether attribute data needs to be transferred. With default settings, this means that ALL extended attributes are ALWAYS copied.
  5. Update 20051215: I have received a report that, even with this patch, ACLs are not backed up properly. I don't use ACLs, and I haven't verified this claim.
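Problem 3 is easy to check for yourself after a test transfer: compare the modification time of a source file with that of its copy. The sketch below assumes a shell whose test builtin supports -nt and -ot (bash, which is /bin/sh on Tiger, does); same_mtime is a name invented for this example.

```shell
# Succeed if both files exist and neither is newer than the other,
# i.e. their modification times are equal.
# Returns 2 if either file is missing.
# (same_mtime is a hypothetical helper; -nt/-ot are shell extensions,
# not strict POSIX.)
same_mtime() {
    [ -e "$1" ] && [ -e "$2" ] || return 2
    [ ! "$1" -nt "$2" ] && [ ! "$1" -ot "$2" ]
}
```

For a file that carries a resource fork, something like same_mtime /path/to/original /path/to/copy should fail with the unpatched rsync and succeed once mtimes are preserved.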

The patch

Problems 1-3 are fixed by this patch. This patch is released under version 2 of the GNU GPL. I know of no fix for problem 4, but consider it mostly an annoyance.

Putting it all together

This requires familiarity with the Terminal. I have no .dmg or whatnot, since I wouldn't know how to create one (and there are licensing issues, see below). Following these steps should get you a working rsync, though.
  1. Update Tiger to 10.4.3 or later. Install Xcode, the Apple developer tools. If you don't have the disc (it ships with the Tiger install media), you can get the latest version from Apple's developer website (free registration required).
  2. Get the sources. Open the terminal, and type:
    	mkdir rsync-build
    	cd rsync-build
    	curl -O http://www.opensource.apple.com/darwinsource/10.4.3/rsync-20/rsync-2.6.3.tar.gz
    	curl -O http://www.opensource.apple.com/darwinsource/10.4.3/rsync-20/patches/EA.diff
    	curl -O http://www.opensource.apple.com/darwinsource/10.4.3/rsync-20/patches/PR-3945747-endian.diff
    	curl -O http://www.lartmaker.nl/rsync/rsync-tiger-fixes.diff
    
  3. If you don't already have it, install copyfile.h in /usr/include. Get it from Apple's developer website at http://www.opensource.apple.com/darwinsource/10.4.3/Libc-391.2.3/darwin/copyfile.h (again, free registration required). In the Terminal:
    	sudo mv -n copyfile.h /usr/include
    
    Copying to /usr/include requires root privileges; enter your password when prompted. The '-n' option to mv makes sure that you don't overwrite a (newer) installed version.

    NOTE: copyfile.h is NOT licensed under the GPL, but rather under the Apple Public Source License. You may want to review this license; I Am Not A Lawyer, so I cannot say and will not speculate on how this affects your rights.

  4. Unpack the rsync source, and apply the patches. In the Terminal:
    	tar zxf rsync-2.6.3.tar.gz
    	cd rsync-2.6.3
    	patch -p0 < ../EA.diff
    	patch -p0 < ../PR-3945747-endian.diff
    	patch -p0 < ../rsync-tiger-fixes.diff
    
  5. Configure and make rsync:
    	./configure --enable-ea-support
    	make
    
  6. You now have a patched rsync binary. If you're feeling brave, you can replace the Apple-supplied version with it (sudo cp -f rsync /usr/bin). Myself, I'd suggest installing it in /usr/local/bin (the default) by doing:
    	sudo make install
    
Note that this procedure is for a plain Xcode install. If you're using Fink you'll need to change bits (but then, you'll probably know how).
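If you install into /usr/local/bin, Apple's copy stays in /usr/bin, and which one a bare "rsync" runs depends on your PATH. Since both are built from rsync-2.6.3, the version banner probably won't tell them apart, so it may be worth checking the path instead. A minimal sketch (resolves_to is a name made up for this example):

```shell
# Succeed if the named command resolves to the expected absolute path
# in the current PATH.
# (resolves_to is a hypothetical helper built on the standard
# "command -v" lookup.)
resolves_to() {
    [ "$(command -v "$1")" = "$2" ]
}

# Example use (not run here):
#   resolves_to rsync /usr/local/bin/rsync || echo "plain rsync is not the patched one"
```

If the check fails, either put /usr/local/bin earlier in PATH or call the patched binary by its full path.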

As is documented on other sites, you'll want to make sure that the target drive has 'Ignore Ownership on This Volume' disabled (Finder: Get Info on the disk; the checkbox is under the 'Ownership & Permissions' tab). Also, it helps to turn Spotlight off for the target volume.

Bottom line

It Works For Me. I've run a few tests, both full and incremental, with ~60GB in just over half a million files with creation dates going back to 1994 (Pathways into Darkness, anyone?). With rsync running in server (daemon) mode (see the man pages) on a Mac mini, a no-changes full-filesystem 'incremental' backup takes 45 minutes over AirPort Extreme (and less over Ethernet), during which both machines are still mostly responsive. For reference, my command line is (all on one line!):
  sudo /usr/local/bin/rsync -aREx --delete --exclude='.Spotlight-*' --exclude '/private/var/vm/*' \
    / [IP-address of Mac mini]::PowerBookBackup
I have successfully booted the Mac mini from the resulting disk clone. Although I haven't stress-tested the system, all looked well (I could open Photoshop and iTunes with no problems). A similar procedure should work to an external disk attached to the source computer, although I haven't tested that configuration.
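The command above pushes to an rsync module called PowerBookBackup on the receiving machine. For readers who have not set up server mode before, here is a rough sketch of what the receiving side's configuration could look like. This is NOT my actual configuration: the target path, the uid/gid values and the helper name write_rsyncd_conf are assumptions for illustration; see the rsyncd.conf man page for what each parameter does.

```shell
# Write a minimal rsyncd.conf containing one writable module.
#   $1 = module name, $2 = directory that receives the backup,
#   $3 = file to write the configuration to.
# (write_rsyncd_conf is a hypothetical helper. uid/gid are set to
# root/wheel on the assumption that the daemon must be able to
# restore ownership; no access restrictions are configured here.)
write_rsyncd_conf() {
    cat > "$3" <<EOF
[$1]
    path = $2
    read only = no
    uid = root
    gid = wheel
EOF
}

# Example use (not run here; /Volumes/Backup is a made-up target):
#   write_rsyncd_conf PowerBookBackup /Volumes/Backup /etc/rsyncd.conf
#   sudo /usr/local/bin/rsync --daemon --config=/etc/rsyncd.conf
```

A module like this accepts writes from anyone who can reach the machine, so on anything but a trusted home network you would want to add restrictions such as "hosts allow".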

So why didn't I just use RsyncX? Googling revealed some (perceived?) compatibility issues between RsyncX and Tiger. Besides, RsyncX only works between Macs, and I really want to use my 1.5TB RAID-5 Linux box as backup target.

About the rsync -H option: there have been rumors of incompatibility with OSX. I'll have to find out; however, on my PowerBook's boot drive only 4050 of the >500000 files have a link count greater than one. Update 20051130: I've tried it, and it seems to work fine.
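To see how much -H matters on your own disk, you can count the files with more than one link. The sketch below uses only standard find options; count_multilinked is a name invented for this example, and it is not necessarily how I arrived at the figure above.

```shell
# Count regular files under a directory whose link count is greater
# than one, without crossing into other mounted file systems.
# Every name of a multiply-linked file is counted separately, and a
# file name containing a newline would be counted more than once.
# (count_multilinked is a hypothetical helper.)
count_multilinked() {
    find "$1" -xdev -type f -links +1 | wc -l | tr -d ' '
}

# Example use (not run here): sudo sh -c '. ./helpers.sh; count_multilinked /'
```

On a boot disk this needs root to see everything, and it takes a while.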

These bugs and fixes have been reported to Apple.

What's next

Getting rsync-on-X to play nice with rsync-on-Linux. Update 20051218: Works too. Using the same patched rsync source on Linux, I have backed up my entire OS X boot disk to a Linux server, then restored the files from the Linux server to a FireWire drive connected to my Mac. Booting the Mac from this copy works, and I can launch applications with no noticeable issues.

Porting all patches to rsync-2.6.6.

Watch this space!

DISCLAIMER

No warranties whatsoever. Do not ever trust a backup system you haven't thoroughly tested. I know that I claim it's working, but I might be lying or hallucinating (or misunderstanding important bits).

I have tried to include all information which I had hoped to find in one place when I started this journey a few days ago. Hope it's of some use to others.


JDB. [as they say: you'll get experience right after you needed it]