I’ve always been very serious about data safety. I’ve lost data more than once, so when I moved into the Linux world I searched high and low for a backup system that would suit me. At the time not many options existed, but one that stood out was rsync combined with cp and hard links.
To sum up the idea: you copy all the files on your hard drive to a backup system (another computer with a large attached hard drive). Then, when you want to update the backup, you copy the previous backup using hard links and back up only the files that have changed since then (think snapshots). Let me explain hard links.
In Linux there are symbolic links and hard links. Symbolic links work like links in most other OSes: if you have a file and make a symbolic link to it, deleting the link leaves the file in place, but deleting the original file breaks the link (its target is missing).
A hard link is a way for two (or more) file names to ‘point’ to the same data. If I have a file and make a hard link to it, deleting the link leaves the file in place, and deleting the file leaves the link working. The actual data exists as long as at least one hard link points to it; when the last hard link is removed, the data is released and the space is reclaimed.
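To make the hard link behaviour concrete, here is a minimal Python sketch. It is only an illustration: the file names and the hard_link_demo function are made up for this example, and it assumes a filesystem that supports hard links (as typical Linux filesystems do).

```python
import os
import tempfile


def hard_link_demo():
    """Show that the data survives until the last hard link is removed."""
    with tempfile.TemporaryDirectory() as workdir:
        original = os.path.join(workdir, "original.txt")  # hypothetical name
        link = os.path.join(workdir, "link.txt")          # hypothetical name

        with open(original, "w") as f:
            f.write("important data")

        # Create a second name for the same data (what 'ln' or 'cp -l' does).
        os.link(original, link)
        same_inode = os.stat(original).st_ino == os.stat(link).st_ino
        links_before = os.stat(original).st_nlink

        # Delete the "original" name; the data is still reachable via the link.
        os.remove(original)
        with open(link) as f:
            survived = f.read()
        links_after = os.stat(link).st_nlink

        return same_inode, links_before, survived, links_after


if __name__ == "__main__":
    print(hard_link_demo())
```

Both names share one inode, and the link count drops by one when a name is removed. The space is only freed when that count reaches zero.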
Using nothing more than a few command-line utilities, you can have very sophisticated incremental backups (like Apple’s Time Machine, except that the approach was around for years before Time Machine existed): rsync (copies only the files that have changed), ssh (lets this work over a network or the internet) and ‘cp -l’ (makes hard-link copies of files).
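The snapshot idea can be sketched in a few lines of Python. This is a local stand-in and not how pysshbackup itself works: copytree with os.link plays the role of ‘cp -l’, and a simple compare-and-copy loop plays the role of rsync (no ssh, no handling of deleted files). All function and directory names here are hypothetical.

```python
import filecmp
import os
import shutil
import tempfile


def hardlink_copy(prev_snapshot, new_snapshot):
    """Rough stand-in for 'cp -al': rebuild the tree, hard-linking each file."""
    shutil.copytree(prev_snapshot, new_snapshot, copy_function=os.link)


def update_snapshot(source, snapshot):
    """Stand-in for rsync: copy only new or changed files into the snapshot.

    Files deleted from the source are not removed here, to keep the sketch short.
    """
    changed = []
    for root, _dirs, files in os.walk(source):
        rel = os.path.relpath(root, source)
        target_dir = os.path.normpath(os.path.join(snapshot, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.join(target_dir, name)
            if os.path.exists(dst):
                if filecmp.cmp(src, dst, shallow=False):
                    continue  # unchanged: keep sharing data with the old snapshot
                # Remove the name first so the old snapshot's data is untouched.
                os.remove(dst)
            shutil.copy2(src, dst)
            changed.append(os.path.normpath(os.path.join(rel, name)))
    return sorted(changed)


def demo():
    with tempfile.TemporaryDirectory() as work:
        source = os.path.join(work, "source")
        snap_old = os.path.join(work, "snapshot_old")
        snap_new = os.path.join(work, "snapshot_new")
        os.makedirs(source)
        for name, text in (("a.txt", "alpha"), ("b.txt", "beta")):
            with open(os.path.join(source, name), "w") as f:
                f.write(text)

        shutil.copytree(source, snap_old)  # the first, full backup

        with open(os.path.join(source, "b.txt"), "w") as f:
            f.write("beta, edited")  # one file changes before the next backup

        hardlink_copy(snap_old, snap_new)
        changed = update_snapshot(source, snap_new)

        shared = os.path.samefile(os.path.join(snap_old, "a.txt"),
                                  os.path.join(snap_new, "a.txt"))
        with open(os.path.join(snap_old, "b.txt")) as f:
            old_b = f.read()
        with open(os.path.join(snap_new, "b.txt")) as f:
            new_b = f.read()
        return changed, shared, old_b, new_b


if __name__ == "__main__":
    print(demo())
```

The detail that matters is removing the changed file’s name before writing the new copy. Writing through the existing hard link would also alter the older snapshot, because both names point at the same data.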
I wanted a way to make this a little more automated, idiot-proof if you will, so I wrote a Python program called pysshbackup. I’ve been using it on my systems for some years and it works really well. It is command-line only, has a light menu system (similar to fdisk), and stores its settings in an XML file.
I recently uploaded the code to my GitHub account in case anyone thinks this may be up their alley as well.
As always, because I am part of the open source software community, which has given me all the tools I used to create the script, this is open source as well, under the GPL version 2.
The program can be found here: https://github.com/eugenecormier/pysshbackup
I’m not sure if this is useful to anyone, but in the spirit of openness: have at it!
~Eugene
The original information I was working from can be found here (last updated in 2004, but still relevant!): http://www.mikerubel.org/computers/rsync_snapshots/