SYNOPSIS

    From your Perl program:

     use File::RsyBak qw(backup);
     backup(
         source    => '/path/to/mydata',
         target    => '/backup/mydata',
         histories => [-7, 4, 3],         # 7 days, 4 weeks, 3 months
         extra_rsync_opts => [qw/--exclude Cache --exclude cache --exclude tmp --exclude temp/],
     );

    Or, just use the provided script from the command-line:

     % rsybak --source /path/to/mydata --target /backup/mydata --extra-rsync-opts-json '["--exclude","Cache","--exclude","cache","--exclude","tmp","--exclude","temp"]'

    Or, if you have this configuration in /etc/rsybak.conf or
    ~/rsybak.conf:

     [system]
     source = /path/to/mydata
     target = /backup/mydata
     ; also specify as JSON
     extra_rsync_opts = ["--exclude","Cache","--exclude","cache","--exclude","tmp","--exclude","temp"]

    you can just run:

     % rsybak --config-profile system

    Example resulting backup (after several runs so that backup history has
    accumulated):

     % ls /path/to/mydata/
     myfile
     anotherfile
     mydir/
    
     % ls /backup/mydata/
     current/
     hist.2013-10-31@12:04:17+00/
     hist.2013-11-01@12:09:31+00/
     hist.2013-11-02@12:09:41+00t/
     hist.2013-11-03@12:15:02+00/
     hist.2013-11-04@12:13:19+00/
     hist.2013-11-05@12:11:31+00/
     hist2.2013-10-08@12:07:50+00/
     hist2.2013-10-15@12:06:03+00/
     hist2.2013-10-21@12:02:42+00/
     hist2.2013-10-27@12:06:25+00t/
     hist3.2013-06-25@12:15:39+00/
     hist3.2013-08-31@12:05:31+00/
     hist3.2013-10-02@12:05:57+00/

    Each directory under /backup/mydata is a "snapshot" backup of
    /path/to/mydata:

     % ls /backup/mydata/current/
     myfile
     anotherfile
     mydir/
    
     % ls /backup/mydata/hist.2013-10-31@12:04:17+00/
     myfile
     anotherfile
     mydir/
    
     % ls /backup/mydata/hist3.2013-10-02@12:05:57+00/
     myfile
     anotherfile
     mydir/
     someoldfile

DESCRIPTION

    This module is basically just a wrapper around rsync to create a
    filesystem backup system. Some characteristics of this backup system:

      * Supports backup histories and history levels

      For example, you can create 7 level-1 backup histories (equals 7
      daily histories if you run backup once daily), 4 level-2 backup
      histories (equals 4 weekly histories) and 3 level-3 backup histories
      (roughly equals 3 monthly histories). The number of levels and the
      number of histories per level are customizable.

      * Backups (and histories) are not compressed/archived ("tar"-ed)

      They are just verbatim copies (produced by "rsync -a") of the source
      directory. The upside of this is ease of cherry-picking
      (taking/restoring individual files from backup). The downside is lack
      of compression and the backup not being a single archive file.

      This is because rsync needs two real directory trees when comparing.
      Perhaps if rsync supports tar virtual filesystem in the future...

      * Hardlinks are used between backup histories to save disk space

      This way, we can maintain several backup histories without wasting
      too much space duplicating data when there are not a lot of
      differences among them.

      * High performance

      Rsync is implemented in C and has been optimized for a long time. rm
      is also used instead of the pure-Perl File::Path::remove_tree.

      * Unix-specific

      There are ports of rsync and rm on Windows, but this module hasn't
      been tested on those platforms.
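
    To illustrate the hardlink point above, here is a minimal sketch
    (not code from this module) showing that two hardlinked names share
    a single inode, so an unchanged file costs almost no extra disk
    space in each additional history. The function name and file names
    are hypothetical, and the sketch assumes a Unix filesystem:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical demo: create a file, hardlink it under a second name,
# and report whether both names point at the same inode.
sub hardlink_demo {
    my $dir = tempdir(CLEANUP => 1);
    my ($cur, $hist) = ("$dir/current-myfile", "$dir/hist-myfile");

    open(my $fh, '>', $cur) or die "open: $!";
    print {$fh} "unchanged data\n";
    close($fh) or die "close: $!";

    # Same effect as "ln current-myfile hist-myfile": no data is copied.
    link($cur, $hist) or die "link: $!";

    my @cur_stat  = stat($cur);
    my @hist_stat = stat($hist);
    return {
        same_inode => ($cur_stat[1] == $hist_stat[1] ? 1 : 0),  # inode number
        nlink      => $cur_stat[3],                             # link count
    };
}
```

    Calling hardlink_demo() returns a hash reference whose same_inode
    field is true and whose nlink field is 2 on filesystems that support
    hardlinks.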

HOW IT WORKS

 First-time backup

    First, we lock the target directory to prevent other backup processes
    from interfering:

     mkdir -p TARGET
     flock    TARGET/.lock

    Then we copy source to temporary directory:

     rsync    SRC            TARGET/.tmp

    If copy finishes successfully, we rename temporary directory to final
    directory 'current':

     rename   TARGET/.tmp    TARGET/current
     touch    TARGET/.current.timestamp

    If copy fails in the middle, TARGET/.tmp will still be lying around and
    the next backup run will just continue the rsync process:

     rsync    SRC            TARGET/.tmp

    Finally, we remove lock:

     unlock   TARGET/.lock
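
    The steps above can be sketched in Perl as follows. This is an
    illustration under stated assumptions, not the module's actual
    implementation: first_backup() is a hypothetical helper, the copy
    step is passed in as a code reference so the flow can be tried
    without rsync, and the default rsync invocation is assumed.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Path qw(make_path);

# Hypothetical helper, not part of File::RsyBak's API.
# $copy is a code ref taking (source, destination) and returning true on
# success; the rsync default below is an assumed invocation.
sub first_backup {
    my ($src, $target, $copy) = @_;
    $copy ||= sub { system('rsync', '-a', "$_[0]/", $_[1]) == 0 };

    make_path($target);                                  # mkdir -p TARGET
    open(my $lock, '>>', "$target/.lock") or die "Cannot open lock: $!";
    flock($lock, LOCK_EX | LOCK_NB)                      # flock TARGET/.lock
        or die "Another backup seems to be running\n";

    # An interrupted earlier run leaves TARGET/.tmp behind; copying into
    # it again simply continues where that run stopped.
    my $ok = $copy->($src, "$target/.tmp");
    if ($ok) {
        rename("$target/.tmp", "$target/current") or die "rename: $!";
        open(my $ts, '>', "$target/.current.timestamp")  # touch
            or die "Cannot write timestamp: $!";
        close($ts);
    }

    close($lock);                                        # unlock
    return $ok ? 1 : 0;
}
```

    Because the rename only happens after a successful copy,
    TARGET/current never refers to a half-finished backup.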

 Subsequent backups (after TARGET/current exists)

    First, we lock the target directory to prevent other backup processes
    from interfering:

     flock    TARGET/.lock

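    Then we rsync the source to a temporary directory, using
    --link-dest=TARGET/current so that unchanged files are hardlinked to
    the previous backup instead of being copied again:

     rsync    --link-dest=TARGET/current SRC TARGET/.tmp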

    If rsync finishes successfully, we rename target directories:

     rename   TARGET/current TARGET/hist.<timestamp>
     rename   TARGET/.tmp    TARGET/current
     touch    TARGET/.current.timestamp

    If rsync fails in the middle, TARGET/.tmp will be lying around and the
    next backup run will just continue the rsync process.

    Finally, we remove lock:

     unlock   TARGET/.lock
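
    As a rough sketch of the two moving parts in this step, the
    following Perl builds an rsync argument list with --link-dest and a
    history directory name in the format shown in the SYNOPSIS listing.
    Both functions are hypothetical helpers, not the module's API. The
    "-a" option comes from the DESCRIPTION; the trailing slash on the
    source, the use of UTC for the "+00" suffix, and the choice of which
    time gets stamped are assumptions.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: argument list for a subsequent backup run.
# $target should be an absolute path, because rsync resolves a relative
# --link-dest against the destination directory.
sub rsync_argv {
    my ($src, $target, @extra_opts) = @_;
    return (
        'rsync', '-a',
        "--link-dest=$target/current",   # hardlink files that did not change
        @extra_opts,                     # e.g. ('--exclude', 'tmp')
        "$src/",                         # assumed: copy the directory's contents
        "$target/.tmp",
    );
}

# Hypothetical helper: name such as hist.2013-10-31@12:04:17+00.
# Assumes the timestamp is rendered in UTC.
sub history_name {
    my ($epoch) = @_;
    return 'hist.' . strftime('%Y-%m-%d@%H:%M:%S', gmtime($epoch)) . '+00';
}
```

    The list returned by rsync_argv() can be handed to system() in list
    form, which avoids shell quoting problems with paths and exclude
    patterns.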

 Maintenance of histories/history levels

    TARGET/hist.* are level-1 backup histories. Each backup run will
    produce a new history:

     TARGET/hist.<timestamp1>
     TARGET/hist.<timestamp2> # produced by the next backup
     TARGET/hist.<timestamp3> # and the next ...
     TARGET/hist.<timestamp4> # and so on ...
     TARGET/hist.<timestamp5>
     ...

    You can specify the number of histories to maintain at each level (or,
    with a negative number as in the SYNOPSIS, the number of days). If the
    number of histories exceeds the limit, older histories will be deleted,
    or one will be promoted to the next level if a higher level is
    specified.

    For example, with histories being set to [7, 4, 3], after
    TARGET/hist.<timestamp8> is created, TARGET/hist.<timestamp1> will be
    promoted to level 2:

     rename TARGET/hist.<timestamp1> TARGET/hist2.<timestamp1>

    TARGET/hist2.* directories are level-2 backup histories. After a while,
    they will also accumulate:

     TARGET/hist2.<timestamp1>
     TARGET/hist2.<timestamp8>
     TARGET/hist2.<timestamp15>
     TARGET/hist2.<timestamp22>

    When TARGET/hist2.<timestamp29> arrives, TARGET/hist2.<timestamp1> will
    be promoted to level 3: TARGET/hist3.<timestamp1>. After a while,
    level-3 backup histories too will accumulate:

     TARGET/hist3.<timestamp1>
     TARGET/hist3.<timestamp29>
     TARGET/hist3.<timestamp57>

    Finally, TARGET/hist3.<timestamp1> will be deleted after
    TARGET/hist3.<timestamp85> comes along.
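
    The timeline above can be reproduced with a small in-memory
    simulation. This is a simplified model, not the module's actual
    selection logic: it assumes count-based limits only and assumes that
    at each level the first history pushed out, and every Nth one after
    it (N being that level's limit), is promoted while the rest are
    deleted. The function name simulate_rotation() is hypothetical.

```perl
use strict;
use warnings;

# Simplified in-memory model of history rotation; no files are touched.
# $limits holds one count per level, e.g. [7, 4, 3]. Histories are
# represented by the number of the backup run that produced them.
sub simulate_rotation {
    my ($limits, $runs) = @_;
    my @levels   = map { [] } @$limits;   # histories per level, oldest first
    my @overflow = (0) x @$limits;        # how many were pushed out per level

    for my $ts (1 .. $runs) {
        my @incoming = ($ts);             # each run adds one level-1 history
        for my $i (0 .. $#levels) {
            push @{ $levels[$i] }, @incoming;
            @incoming = ();
            while (@{ $levels[$i] } > $limits->[$i]) {
                my $oldest = shift @{ $levels[$i] };
                # Assumed rule: promote the 1st, (N+1)th, (2N+1)th ...
                # history pushed out of this level; delete the others.
                # The highest level never promotes.
                my $promote = $i < $#levels
                    && $overflow[$i]++ % $limits->[$i] == 0;
                push @incoming, $oldest if $promote;
            }
            last unless @incoming;        # nothing moved up, stop here
        }
    }
    return \@levels;
}
```

    With limits [7, 4, 3], simulating 29 runs leaves histories 1, 8, 15
    and 22 at level 2, and simulating 120 runs leaves histories 29, 57
    and 85 at level 3, matching the sequence described above.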

HISTORY

    The idea for this module came about in 2006 as part of the Spanel
    hosting control panel project. We needed a daily backup system for
    shared hosting accounts that supported histories and cherry-picking.
    Previously we had been using rdiff-backup, a Python-based script. It
    was not very robust: the script chose to exit on many kinds of
    non-fatal errors instead of ignoring the errors and continuing the
    backup. It was also very slow: on a server with hundreds of accounts
    and millions of files, the backup process often took 12 hours or more.
    After evaluating several other solutions, we realized that nothing
    beats the raw performance of rsync. Thus we designed a simple backup
    system based on it.

    The first public release of this module was in Feb 2011. I have since
    used this script on various production servers as well as personal
    PCs/laptops.

FAQ

 How do I exclude some directories?

    Just use rsync's --exclude et al. Pass them to extra_rsync_opts.

 What is a good backup practice (using RsyBak)?

    Just follow the general practice. While this is not a place to discuss
    backups in general, some of the principles are:

      * back up regularly (e.g. once daily or more often)

      * automate the process (else you'll forget)

      * back up to another disk partition and to another computer

      * verify your backups often (what good are they if they can't be
      restored)

      * make sure your backup is secure (encrypted, correct permissions,
      etc)

 How do I restore backups?

    Backups are just verbatim copies of files/directories, so just use
    whatever filesystem tools you like.
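
    For instance, a single file could be cherry-picked from a snapshot
    with a few lines of Perl. This is only a sketch: restore_file() is a
    hypothetical helper, not something this module provides, and the
    paths in the usage note are the example paths from the SYNOPSIS.

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Copy qw(cp);
use File::Path qw(make_path);

# Hypothetical helper: copy one file out of a snapshot directory.
#   $snapshot  - e.g. /backup/mydata/current or one of the hist.* dirs
#   $rel_path  - path of the file relative to the snapshot
#   $dest_root - directory to restore into
sub restore_file {
    my ($snapshot, $rel_path, $dest_root) = @_;
    my $from = "$snapshot/$rel_path";
    my $to   = "$dest_root/$rel_path";

    -f $from or die "No such file in snapshot: $from\n";
    make_path(dirname($to));              # recreate parent directories
    cp($from, $to) or die "Copy failed: $!";
    return $to;
}
```

    For example, restore_file('/backup/mydata/current', 'mydir/somefile',
    '/path/to/mydata') would copy that one file back into the source
    tree, where 'mydir/somefile' stands for whichever file you need.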

 How do I do a remote backup?

    From your backup host:

     [BAK-HOST]% rsybak --source USER@SRC-HOST:/path --target /backup/dir

    Alternatively, you can back up locally on SRC-HOST first, then send
    the resulting backup to BAK-HOST.

SEE ALSO

    File::Backup

    File::Rotate::Backup

    Snapback2, which is a backup system using the same basic principle
    (rsync snapshots), created as early as 2004 (or earlier) by Mike
    Heins. Do check it out. I wish I had found it first before reinventing
    it in 2006 :-)