===== 6. The Backup Plan =====
==== Archiving with tar ====
* tar usage: # To archive files with tar, use the following syntax:
$ tar -cf output.tar [SOURCES]
# List files in archive:
$ tar -tf archive.tar
file1
file2
# Print more details when archiving or listing with -v:
$ tar -tvf archive.tar
-rw-rw-r-- shaan/shaan 0 2013-04-08 21:34 file1
-rw-rw-r-- shaan/shaan 0 2013-04-08 21:34 file2
# Append files to an archive:
$ tar -rvf original.tar new_file
# Extracting files and folders:
$ tar -xf archive.tar
# Specify the folder for extraction:
$ tar -xf archive.tar -C /path/to/extraction_directory
# Extract only the specified files:
$ tar -xvf file.tar file1 file4
# Use stdin and stdout with tar:
$ tar cvf - files/ | ssh user@example.com "tar xvf - -C Documents/"
# Concatenate two archives (appends the contents of file2.tar to file1.tar):
$ tar -Af file1.tar file2.tar
# Append files only if they are newer than the copy in the archive (the old copy stays, so the file appears twice):
$ tar -uf archive.tar filea
# Compare the files in the archives with the filesystem:
$ tar -df archive.tar
afile: Mod time differs
afile: Size differs
# Delete files from an archive:
$ tar -f archive.tar --delete file1 file2 ...
# Apply compression to the tar archive:
-j for bzip2
-z for gzip
--lzma for lzma
-a or --auto-compress: auto algorithm selection based on the extension.
$ tar acvf archive.tar.gz filea fileb filec
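# Exclude a set of files from archiving (put the exclusion before the sources; some tar versions apply it only to the names that follow):
$ tar -cf arch.tar --exclude "*.txt" *
# or, with the patterns listed one per line in a file:
$ cat list
filea
fileb
$ tar -cf arch.tar -X list *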
# Excluding version control directories:
$ tar --exclude-vcs -czvvf source_code.tar.gz eye_of_gnome_svn
# Print the total number of bytes written:
$ tar -cf arc.tar --exclude "*.txt" --totals *
Total bytes written: 20480 (20KiB, 12MiB/s)
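* A minimal round-trip sketch of the commands above, assuming GNU tar or bsdtar and a throwaway directory from mktemp. Every path and file name here is illustrative, not part of the notes:

```shell
set -eu
# Hypothetical scratch area; all names below are made up for the demo.
work=$(mktemp -d)
mkdir -p "$work/src/docs" "$work/restore"
echo "alpha" > "$work/src/file1"
echo "beta"  > "$work/src/docs/notes.txt"

# Create the archive, skipping *.txt (exclusion placed before the source).
tar -cf "$work/backup.tar" -C "$work" --exclude='*.txt' src

# List the members, then extract into a separate directory.
listing=$(tar -tf "$work/backup.tar")
tar -xf "$work/backup.tar" -C "$work/restore"

# cmp exits non-zero if the restored copy differs from the original.
cmp "$work/src/file1" "$work/restore/src/file1"
echo "$listing"
```

Extracting into a separate directory and comparing is a cheap way to confirm that an archive is actually restorable before relying on it.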
==== Archiving with cpio ====
* cpio takes input filenames from stdin, one per line, and writes the archive to stdout: $ printf '%s\n' file1 file2 file3 | cpio -ov > archive.cpio
# to list content:
$ cpio -it < archive.cpio
# To extract files:
$ cpio -id < archive.cpio
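* Because cpio reads one path per line, it pairs naturally with find. A small sketch under that assumption, using a scratch directory and illustrative file names; it skips itself when cpio is not installed:

```shell
set -eu
work=$(mktemp -d)
mkdir -p "$work/src" "$work/restore"
echo "one" > "$work/src/file1"
echo "two" > "$work/src/file2"

if command -v cpio >/dev/null 2>&1; then
    # find prints one path per line, which is the input format cpio expects.
    (cd "$work/src" && find . -type f | cpio -o > "$work/archive.cpio")
    # -i extracts, -d creates leading directories as needed.
    (cd "$work/restore" && cpio -id < "$work/archive.cpio")
    cmp "$work/src/file1" "$work/restore/file1"
    cpio_status=restored
else
    cpio_status=skipped   # cpio is not installed on this machine
fi
echo "cpio round trip: $cpio_status"
```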
==== Compressing data with gzip ====
* To compress with gzip: $ gzip filename
$ ls
filename.gz
# to extract:
$ gunzip filename.gz
# read from stdin and write to stdout:
$ cat file | gzip -c > file.gz
* Create gzip tarball: $ tar -czvvf archive.tar.gz [FILES]
# or:
$ tar -cavvf archive.tar.gz [FILES]
* We can use bzip2 and bunzip2 in a similar way:
$ tar -xjvf archive.tar.bz2
* We can use lzma and unlzma in a similar way (the archive name must directly follow -f):
$ tar -cvvf archive.tar.lzma --lzma [FILES]
$ tar -xvvf archive.tar.lzma --lzma -C extract_directory
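* A short sketch of the stdin/stdout style of gzip use, assuming only gzip, gunzip, cmp and wc are available. The sample data and file names are arbitrary:

```shell
set -eu
work=$(mktemp -d)
# Repetitive sample data compresses well; the content is arbitrary.
i=0
while [ "$i" -lt 1000 ]; do
    echo "line $i of sample backup data"
    i=$((i + 1))
done > "$work/data.txt"

# -c writes to stdout, so the original file is left in place.
gzip -c "$work/data.txt" > "$work/data.txt.gz"

# Decompress to stdout and compare byte-for-byte with the original.
gunzip -c "$work/data.txt.gz" | cmp - "$work/data.txt"

# Arithmetic expansion strips any padding that wc adds on some systems.
orig_size=$(( $(wc -c < "$work/data.txt") ))
gz_size=$(( $(wc -c < "$work/data.txt.gz") ))
echo "original: $orig_size bytes, gzip: $gz_size bytes"
```

Unlike plain `gzip filename`, the `-c` form keeps the uncompressed file, which is usually what a backup script wants.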
==== Archiving and compressing with zip ====
* zip usage:
# To archive with zip:
$ zip archive_name.zip [SOURCE FILES/DIRS]
# To archive recursively:
$ zip -r archive.zip folder1 folder2
# To extract:
$ unzip file.zip
# To update with newer files:
$ zip file.zip -u newfile
# To delete a file from the archive:
$ zip -d arc.zip file.txt
# To list archived files:
$ unzip -l archive.zip
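* The zip commands above can be strung together as follows. This is only a sketch: the folder and file names are invented, and the block skips itself if zip or unzip is missing:

```shell
set -eu
work=$(mktemp -d)
mkdir -p "$work/folder1"
echo "keep" > "$work/folder1/keep.txt"
echo "drop" > "$work/folder1/drop.txt"

zip_status=skipped   # stays "skipped" if zip/unzip are not installed
listing=""
if command -v zip >/dev/null 2>&1 && command -v unzip >/dev/null 2>&1; then
    cd "$work"
    # -r recurses into the folder, -q keeps the output quiet.
    zip -qr archive.zip folder1
    # -d deletes one member from the existing archive.
    zip -qd archive.zip folder1/drop.txt
    # -l lists what is left without extracting anything.
    listing=$(unzip -l archive.zip)
    zip_status=done
fi
echo "$listing"
```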
==== Faster archiving with pbzip2 ====
* pbzip2 can use multiple cores to do the compression.
* usage: $ pbzip2 myfile.tar
# or
$ tar cf myfile.tar.bz2 --use-compress-program=pbzip2 dir_to_compress/
# For extraction:
$ pbzip2 -dc myfile.tar.bz2 | tar xf -
# or:
$ pbzip2 -d myfile.tar.bz2
# Specify the number of CPUs:
$ pbzip2 -p4 myfile.tar
==== Creating filesystems with compression ====
* squashfs is a heavily compressed, read-only filesystem.
* We need **squashfs-tools** to create squashfs files.
* Usage: # To create a file:
$ mksquashfs SOURCES compressedfs.squashfs
# To mount a squashfs file, use a loopback mount (requires root):
$ sudo mkdir /mnt/squash
$ sudo mount -o loop compressedfs.squashfs /mnt/squash
# Excluding files while creating the squashfs file:
$ sudo mksquashfs /etc test.squashfs -e /etc/passwd /etc/shadow
# or:
$ cat excludelist
/etc/passwd
/etc/shadow
$ sudo mksquashfs /etc test.squashfs -ef excludelist
==== Backup snapshots with rsync ====
* Usage: # Copy a source directory to destination:
$ rsync -av source_path destination_path
For example,
$ rsync -av /home/slynux/data slynux@192.168.0.6:/home/backups/data
# -a: archiving
# -v: verbose
# backup to remote server with ssh:
$ rsync -av source_dir username@host:PATH
# compressing data during transfer:
$ rsync -avz source destination
=> If the source path ends with a /, rsync copies the contents of that
directory into the destination.
=> If the source path has no trailing /, rsync copies the directory itself
into the destination.
# exclude files while archiving with the --exclude PATTERN flag,
# or use a file with: --exclude-from FILEPATH
# Delete files from the destination that no longer exist in the source:
$ rsync -avz SOURCE DESTINATION --delete
# Schedule backups at intervals:
$ crontab -e
0 */10 * * * rsync -avz /home/code user@IP_ADDRESS:/home/backups
=> runs at 00:00, 10:00 and 20:00 every day (every hour divisible by 10), not strictly every 10 hours.
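* The trailing-slash rule is easy to get wrong, so here is a local sketch of both cases. It assumes rsync is installed (it skips itself otherwise) and uses made-up directory names under a scratch directory:

```shell
set -eu
work=$(mktemp -d)
mkdir -p "$work/data" "$work/with_slash" "$work/without_slash"
echo "payload" > "$work/data/file1"

if command -v rsync >/dev/null 2>&1; then
    # Trailing slash: copy the contents of data/ into the destination.
    rsync -a "$work/data/" "$work/with_slash"
    # No trailing slash: copy the data directory itself into the destination.
    rsync -a "$work/data" "$work/without_slash"
    rsync_status=copied
else
    rsync_status=skipped   # rsync is not installed on this machine
fi
echo "rsync demo: $rsync_status"
```

After a successful run, file1 sits directly in with_slash/ but under without_slash/data/.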
==== Version control-based backup with Git ====
* Build git repository: $ mkdir -p /home/backups/backup.git
$ cd /home/backups/backup.git
$ git init --bare
* On a source machine, specify the git config: $ git config --global user.name "Sarath Lakshman"
$ git config --global user.email slynux@slynux.com
* Backup on intervals: Add in crontab:
0 */5 * * * /home/data/backup.sh
# Create the script:
#!/bin/bash
cd /home/data/source
git add .
git commit -am "Backup taken @ $(date)"
git push
* To revert to an earlier backup, restore its files into the working tree: $ git checkout 3131f9661ec1739f72c213ec5769bc0abefa85a9 -- .
# To make this permanent, commit the restored state:
$ git commit -am "Restore @ $(date) commit ID: 3131f9661ec1739f72c213ec5769bc0abefa85a9"
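* A self-contained sketch of the whole cycle (bare repository, backup commits, push, restore), run entirely in a scratch directory. Several details are assumptions for the demo: the remote is a local path named origin, the identity "Backup Bot" is hypothetical and passed per command so the global config stays untouched, and the block skips itself if git is missing:

```shell
set -eu
work=$(mktemp -d)
restored=""
# Hypothetical identity, supplied per command instead of via --global.
g() { git -c user.name="Backup Bot" -c user.email="backup@example.com" "$@"; }

if command -v git >/dev/null 2>&1; then
    git init -q --bare "$work/backup.git"      # the backup target
    git init -q "$work/source"                 # the data being backed up
    cd "$work/source"
    git remote add origin "$work/backup.git"   # git push needs a remote

    echo "version 1" > notes.txt
    g add .
    g commit -q -m "Backup taken @ $(date)"
    first=$(git rev-parse HEAD)

    echo "version 2" > notes.txt
    g commit -q -am "Backup taken @ $(date)"
    git push -q origin HEAD

    # Restore the files from the first backup, then record that as a new commit.
    git checkout -q "$first" -- .
    g commit -q -m "Restore @ $(date) commit ID: $first"
    restored=$(cat notes.txt)
    git_status=restored
else
    git_status=skipped   # git is not installed on this machine
fi
echo "git backup demo: $git_status"
```

Restoring with `git checkout <commit> -- .` keeps the branch history intact, so the restore itself becomes one more backup entry.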
==== Creating entire disk images using fsarchiver ====
* Usage: $ fsarchiver savefs backup.fsa /dev/sda1
# Backup multiple partitions:
$ fsarchiver savefs backup.fsa /dev/sda1 /dev/sda2
# Restore a partition with:
$ fsarchiver restfs backup.fsa id=0,dest=/dev/sda1
# Restore multiple partitions:
$ fsarchiver restfs backup.fsa id=0,dest=/dev/sda1 id=1,dest=/dev/sdb1