6. The Backup Plan

Archiving with tar

  • tar usage:
    # To archive files with tar, use the following syntax:
    $ tar -cf output.tar [SOURCES]
    # List files in archive:
    $ tar -tf archive.tar
    # Print more details when archiving or listing with -v:
    $ tar -tvf archive.tar 
    -rw-rw-r-- shaan/shaan       0 2013-04-08 21:34 file1
    -rw-rw-r-- shaan/shaan       0 2013-04-08 21:34 file2
    # Append files to an archive:
    $ tar -rvf original.tar new_file
    # Extracting files and folders:
    $ tar -xf archive.tar
    # Specify the folder for extraction:
    $ tar -xf archive.tar -C /path/to/extraction_directory
    # Extract only the specified files:
    $ tar -xvf file.tar file1 file4
    # Use stdin and stdout with tar:
    $ tar cvf - files/ | ssh user@example.com "tar xv -C Documents/"
    # Concatenate 2 archives:
    $ tar -Af file1.tar file2.tar
    # Append files only if they are newer than the copy in the archive
    # (the old copy stays in the archive, so the file is stored twice):
    $ tar -uf archive.tar filea
    # Compare the files in the archives with the filesystem:
    $ tar -df archive.tar
    afile: Mod time differs
    afile: Size differs
    # Delete files from the archive:
    $ tar -f archive.tar --delete file1 file2 ..
    # Compress the tar archive by adding one of these flags:
    #   -j                    bzip2
    #   -z                    gzip
    #   --lzma                lzma
    #   -a, --auto-compress   pick the algorithm from the archive's extension
    $ tar acvf archive.tar.gz filea fileb filec
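    # Exclude a set of files from archiving (--exclude is placed before the
    # file arguments, which works regardless of the tar version):
    $ tar -cf arch.tar --exclude "*.txt" *
    # or list the patterns in a file, one per line:
    $ cat list
    $ tar -cf arch.tar -X list *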
    # Excluding version control directories:
    $ tar --exclude-vcs -czvvf source_code.tar.gz eye_of_gnome_svn
    # Print the total number of bytes written:
    $ tar -cf arc.tar --exclude "*.txt" --totals *
    Total bytes written: 20480 (20KiB, 12MiB/s)
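  • The commands above can be combined into a small round trip. This is only a
    sketch under stated assumptions: the scratch directory and the names src,
    restore, and notes.txt are invented for the example, and -C is used on the
    create side as well so that member names are stored relative to src/.

```shell
#!/bin/bash
# Round trip: create an archive, list it, extract it elsewhere.
set -e

work=$(mktemp -d)                 # scratch directory (hypothetical layout)
trap 'rm -rf "$work"' EXIT
mkdir "$work/src" "$work/restore"
echo "alpha" > "$work/src/file1"
echo "beta"  > "$work/src/file2"
echo "notes" > "$work/src/notes.txt"

# Archive src/ without the *.txt files. -C changes into src/ first,
# so the members are stored as ./file1, ./file2.
tar -cf "$work/backup.tar" --exclude='*.txt' -C "$work/src" .

# List what was stored.
listing=$(tar -tf "$work/backup.tar")
echo "$listing"

# Extract into a different directory and read one file back.
tar -xf "$work/backup.tar" -C "$work/restore"
restored=$(cat "$work/restore/file1")
echo "file1 after restore: $restored"
```

    Listing the archive before extracting is a cheap way to confirm that the
    exclusion pattern did what was intended.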

Archiving with cpio

  • cpio reads the filenames to archive from stdin, one per line, and writes
    the archive to stdout:
    $ printf '%s\n' file1 file2 file3 | cpio -ov > archive.cpio
    # To list the contents:
    $ cpio -it < archive.cpio
    # To extract files:
    $ cpio -id < archive.cpio
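  • Because cpio expects one filename per line, it is usually fed by find. The
    sketch below assumes a scratch directory with invented names (src, restore)
    and skips itself if cpio is not installed; cpio's block-count summary on
    stderr is discarded.

```shell
#!/bin/bash
# Archive every regular file under a directory with find + cpio.
set -e
command -v cpio >/dev/null || { echo "cpio not installed; skipping"; exit 0; }

work=$(mktemp -d)                 # scratch directory (hypothetical layout)
trap 'rm -rf "$work"' EXIT
mkdir "$work/src" "$work/restore"
cd "$work/src"
touch file1 file2 file3

# find prints one path per line, which is the input format cpio expects.
find . -type f | cpio -o > "$work/archive.cpio" 2>/dev/null

# List the archive members.
members=$(cpio -it < "$work/archive.cpio" 2>/dev/null)
echo "$members"

# Extract into another directory; -d creates directories as needed.
cd "$work/restore"
cpio -id < "$work/archive.cpio" 2>/dev/null
restored=$(ls "$work/restore")
echo "$restored"
```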

Compressing data with gzip

  • To compress with gzip:
    $ gzip filename
    # gzip replaces the original file with filename.gz:
    $ ls
    # To extract:
    $ gunzip filename.gz
    # Read from stdin and write to stdout:
    $ cat file | gzip -c > file.gz
  • Create gzip tarball:
    $ tar -czvvf archive.tar.gz [FILES]
    # or:
    $ tar -cavvf archive.tar.gz [FILES]
  • bzip2 and bunzip2 are used in the same way; with tar, pass -j:
    $ tar -xjvf archive.tar.bz2
  • lzma and unlzma are used in the same way; with tar, pass --lzma before -f,
    because -f takes the next argument as the archive name:
    $ tar --lzma -cvvf archive.tar.lzma [FILES]
    $ tar --lzma -xvvf archive.tar.lzma -C extract_directory
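  • The difference between in-place compression and the -c (stdout) form is
    easiest to see side by side. A minimal sketch, assuming a scratch directory
    and an invented file name (notes.log):

```shell
#!/bin/bash
# gzip in place versus gzip to stdout.
set -e

work=$(mktemp -d)                 # scratch directory (hypothetical)
trap 'rm -rf "$work"' EXIT
cd "$work"
printf 'line %s\n' 1 2 3 > notes.log

# In place: notes.log is replaced by notes.log.gz ...
gzip notes.log
after_gzip=$(ls)
echo "after gzip:   $after_gzip"

# ... and gunzip puts the original back.
gunzip notes.log.gz
after_gunzip=$(ls)
echo "after gunzip: $after_gunzip"

# With -c the input file is left alone and the data goes to stdout.
gzip -c notes.log > copy.gz
roundtrip=$(gunzip -c copy.gz)
echo "$roundtrip"
```

    The -c form is the one to use in pipelines, since it never touches the
    input file.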

Archiving and compressing with zip

  • zip usage:
    # To archive with zip:
    $ zip archive_name.zip [SOURCE FILES/DIRS]
    # To archive recursively:
    $ zip -r archive.zip folder1 folder2
    # To extract:
    $ unzip file.zip
    # To update with newer files:
    $ zip file.zip -u newfile
    # To delete a file from the archive:
    $ zip -d arc.zip file.txt
    # To list archived files:
    $ unzip -l archive.zip
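  • The zip operations above can be chained to see their effect on one archive.
    This sketch assumes a scratch directory with invented names (folder1,
    a.txt, b.txt, newfile) and skips itself if zip or unzip is missing; -q only
    silences the progress output.

```shell
#!/bin/bash
# Create, update, prune, and list a zip archive.
set -e
command -v zip >/dev/null && command -v unzip >/dev/null \
  || { echo "zip/unzip not installed; skipping"; exit 0; }

work=$(mktemp -d)                 # scratch directory (hypothetical layout)
trap 'rm -rf "$work"' EXIT
cd "$work"
mkdir folder1
echo "one" > folder1/a.txt
echo "two" > folder1/b.txt

zip -q -r archive.zip folder1         # archive the folder recursively
echo "three" > newfile
zip -q -u archive.zip newfile         # -u also adds files not yet archived
zip -q -d archive.zip folder1/b.txt   # remove one member

listing=$(unzip -l archive.zip)
echo "$listing"
```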

Faster archiving with pbzip2

  • pbzip2 can use multiple cores to do the compression.
  • Usage:
    $ pbzip2 myfile.tar
    # or compress while archiving:
    $ tar cf myfile.tar.bz2 --use-compress-program=pbzip2 dir_to_compress/
    # For extraction:
    $ pbzip2 -dc myfile.tar.bz2 | tar x
    # or:
    $ pbzip2 -d myfile.tar.bz2
    # Specify the number of CPUs:
    $ pbzip2 -p4 myfile.tar
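  • A backup script may run on machines where pbzip2 is not installed. The
    sketch below is one possible way to handle that, not something taken from
    the notes above: it falls back to plain bzip2, which reads and writes the
    same .bz2 format, and skips itself if neither tool exists. The directory
    and file names are invented.

```shell
#!/bin/bash
# Compress a tarball with pbzip2 when available, else with bzip2.
set -e

compressor=$(command -v pbzip2 || command -v bzip2 || true)
[ -n "$compressor" ] || { echo "no bzip2-compatible tool; skipping"; exit 0; }

work=$(mktemp -d)                 # scratch directory (hypothetical layout)
trap 'rm -rf "$work"' EXIT
mkdir "$work/dir_to_compress" "$work/out"
echo "payload" > "$work/dir_to_compress/a.txt"

# tar pipes the archive through the chosen compressor.
tar -cf "$work/myfile.tar.bz2" --use-compress-program="$compressor" \
    -C "$work" dir_to_compress

# Decompress to stdout and let tar unpack the stream.
"$compressor" -dc "$work/myfile.tar.bz2" | tar -xf - -C "$work/out"
restored=$(cat "$work/out/dir_to_compress/a.txt")
echo "restored: $restored"
```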

Creating filesystems with compression

  • squashfs is a heavily compressed, read-only filesystem.
  • We need squashfs-tools to create squashfs files.
  • Usage:
    # To create a file:
    $ mksquashfs SOURCES compressedfs.squashfs
    # To mount a squashfs file, use a loopback mount (requires root):
    $ sudo mkdir /mnt/squash
    $ sudo mount -o loop compressedfs.squashfs /mnt/squash
    # Excluding files while creating the squashfs file:
    $ sudo mksquashfs /etc test.squashfs -e /etc/passwd /etc/shadow
    # or:
    $ cat excludelist
    $ sudo mksquashfs /etc test.squashfs -ef excludelist

Backup snapshots with rsync

  • Usage:
    # Copy a source directory to destination:
    $ rsync -av source_path destination_path
    # For example:
    $ rsync -av /home/slynux/data slynux@
    # -a: archive mode (recursive; preserves permissions, timestamps, symlinks)
    # -v: verbose
    # backup to remote server with ssh:
    $ rsync -av source_dir username@host:PATH
    # compressing data during transfer:
    $ rsync -avz source destination
    => If the source path ends with /, rsync copies the contents of that
    directory to the destination.
    => If the source path does not end with /, rsync copies the directory
    itself to the destination.
    # Exclude files with the --exclude PATTERN flag,
    # or read the patterns from a file with --exclude-from FILEPATH
    # Delete files from the destination that no longer exist in the source:
    $ rsync -avz SOURCE DESTINATION --delete
    # Schedule backups at intervals:
    $ crontab -e
    0 */10 * * * rsync -avz /home/code user@IP_ADDRESS:/home/backups
    => runs at 00:00, 10:00, and 20:00 every day.
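  • The trailing-slash rule and --delete are easy to check locally before
    pointing rsync at a real server. A minimal sketch, assuming a scratch
    directory with invented names (data, dest_a, dest_b, report, new) and
    skipping itself if rsync is not installed:

```shell
#!/bin/bash
# Show the effect of a trailing slash on the source, and of --delete.
set -e
command -v rsync >/dev/null || { echo "rsync not installed; skipping"; exit 0; }

work=$(mktemp -d)                 # scratch directory (hypothetical layout)
trap 'rm -rf "$work"' EXIT
mkdir "$work/data" "$work/dest_a" "$work/dest_b"
echo "hello" > "$work/data/report"

# Trailing slash: the contents of data/ land directly in dest_a/.
rsync -a "$work/data/" "$work/dest_a"
with_slash=$(ls "$work/dest_a")
echo "with slash:    $with_slash"

# No trailing slash: the directory data itself is created inside dest_b/.
rsync -a "$work/data" "$work/dest_b"
without_slash=$(ls "$work/dest_b")
echo "without slash: $without_slash"

# --delete removes destination files that are gone from the source.
rm "$work/data/report"
echo "fresh" > "$work/data/new"
rsync -a --delete "$work/data/" "$work/dest_a"
after_delete=$(ls "$work/dest_a")
echo "after delete:  $after_delete"
```

    Running with -n (dry run) first is a sensible habit before using --delete
    on a real backup target.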

Version control-based backup with Git

  • Create a bare Git repository to hold the backups:
    $ mkdir -p /home/backups/backup.git
    $ cd /home/backups/backup.git
    $ git init --bare
  • On a source machine, specify the git config:
    $ git config --global user.name  "Sarath Lakshman"
    $ git config --global user.email slynux@slynux.com
  • Back up at intervals:
    Add to crontab:
    0 */5 * * *  /home/data/backup.sh
    # Create the script:
    #!/bin/bash
    cd /home/data/source
    git add .
    git commit -am "Backup taken @ $(date)"
    git push
  • To revert back:
    $ git checkout 3131f9661ec1739f72c213ec5769bc0abefa85a9
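    # To make this permanent, switch back to the branch, restore the files
    # from that commit, and commit them:
    $ git checkout -
    $ git checkout 3131f9661ec1739f72c213ec5769bc0abefa85a9 -- .
    $ git commit -am "Restore @ $(date) commit ID: 3131f9661ec1739f72c213ec5769bc0abefa85a9"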
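  • The whole cycle (bare repository, source repository, push, restore) can be
    rehearsed on one machine. The sketch below makes several assumptions that
    are not in the notes above: both repositories live in a scratch directory,
    the remote is named origin, the identity "Backup Bot" is a placeholder, and
    the restore uses the "git checkout COMMIT -- ." form so the current branch
    is kept. It skips itself if git is not installed.

```shell
#!/bin/bash
# Rehearse a Git-based backup and restore with two local repositories.
set -e
command -v git >/dev/null || { echo "git not installed; skipping"; exit 0; }

work=$(mktemp -d)                 # scratch directory (hypothetical layout)
trap 'rm -rf "$work"' EXIT

# Stand-in for the backup server: a bare repository.
git init --quiet --bare "$work/backup.git"

# Stand-in for the source directory being backed up.
mkdir "$work/source"
cd "$work/source"
git init --quiet
git config user.name  "Backup Bot"            # placeholder identity
git config user.email "backup@example.com"
git remote add origin "$work/backup.git"

# First backup.
echo "v1" > data.txt
git add .
git commit --quiet -m "Backup taken @ $(date)"
first=$(git rev-parse HEAD)

# Second backup with changed data.
echo "v2" > data.txt
git commit --quiet -am "Backup taken @ $(date)"

# Restore the files of the first backup and record that as a new commit.
git checkout --quiet "$first" -- .
git commit --quiet -am "Restore @ $(date) commit ID: $first"
git push --quiet origin HEAD

restored=$(cat data.txt)
commits=$(git --git-dir="$work/backup.git" rev-list --count --all)
echo "data.txt is '$restored'; backup repository holds $commits commits"
```

    Restoring as a new commit keeps the full history, so the restore itself
    can be undone later.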

Creating entire disk images using fsarchiver

  • Usage:
    # Back up a partition:
    $ fsarchiver savefs backup.fsa /dev/sda1
    # Back up multiple partitions:
    $ fsarchiver savefs backup.fsa /dev/sda1 /dev/sda2
    # Restore a partition:
    $ fsarchiver restfs backup.fsa id=0,dest=/dev/sda1
    # Restore multiple partitions:
    $ fsarchiver restfs backup.fsa id=0,dest=/dev/sda1 id=1,dest=/dev/sdb1