===== 6. The Backup Plan =====

==== Archiving with tar ====

* tar usage:

  # To archive files with tar, use the following syntax:
  $ tar -cf output.tar [SOURCES]

  # List files in an archive:
  $ tar -tf archive.tar
  file1
  file2

  # Print more details when archiving or listing with -v:
  $ tar -tvf archive.tar
  -rw-rw-r-- shaan/shaan 0 2013-04-08 21:34 file1
  -rw-rw-r-- shaan/shaan 0 2013-04-08 21:34 file2

  # Append files to an existing archive:
  $ tar -rvf original.tar new_file

  # Extract files and folders:
  $ tar -xf archive.tar

  # Specify the folder for extraction:
  $ tar -xf archive.tar -C /path/to/extraction_directory

  # Extract only the specified files:
  $ tar -xvf file.tar file1 file4

  # Use stdin and stdout with tar ("-" as the archive name):
  $ tar cvf - files/ | ssh user@example.com "tar xvf - -C Documents/"

  # Concatenate two archives:
  $ tar -Af file1.tar file2.tar

  # Only append files newer than the copy in the archive
  # (the archive will then contain both versions):
  $ tar -uf archive.tar filea

  # Compare the files in the archive with the filesystem:
  $ tar -df archive.tar
  afile: Mod time differs
  afile: Size differs

  # Delete files from an archive:
  $ tar -f archive.tar --delete file1 file2 ..

* Apply compression on the tar archive:
  * -j for bzip2
  * -z for gzip
  * --lzma for lzma
  * -a or --auto-compress: select the algorithm automatically from the extension.
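As a quick sanity check of the -a/--auto-compress behavior, the following sketch (directory and file names are made up) archives a folder, confirms the result really is a gzip stream, and extracts it back:

```shell
#!/bin/sh
# -a picks the compressor from the archive name: .tar.gz selects gzip.
set -e
workdir=$(mktemp -d)
cd "$workdir"

mkdir src
echo "hello" > src/filea
echo "world" > src/fileb

# Create a compressed archive; -a chooses gzip from the .gz extension
tar -acf archive.tar.gz src

# The result is a valid gzip stream...
gzip -t archive.tar.gz

# ...and tar can list and extract it as usual
tar -tf archive.tar.gz
mkdir restore
tar -xf archive.tar.gz -C restore
cmp src/filea restore/src/filea && echo "auto-compress OK"
```

The same archive could also be created with the explicit -z flag; -a is just a convenience that keeps the flag in sync with the file name.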
  $ tar acvf archive.tar.gz filea fileb filec

* Exclude a set of files from archiving:

  $ tar -cf arch.tar * --exclude "*.txt"

  # or take the exclude patterns from a file with -X:
  $ cat list
  filea
  fileb
  $ tar -cf arch.tar * -X list

  # Exclude version control directories:
  $ tar --exclude-vcs -czvvf source_code.tar.gz eye_of_gnome_svn

  # Print the total number of bytes written:
  $ tar -cf arc.tar * --exclude "*.txt" --totals
  Total bytes written: 20480 (20KiB, 12MiB/s)

==== Archiving with cpio ====

* cpio takes input filenames from stdin and writes the archive to stdout:

  $ echo file1 file2 file3 | cpio -ov > archive.cpio

  # To list the contents:
  $ cpio -it < archive.cpio

  # To extract files:
  $ cpio -id < archive.cpio

==== Compressing data with gzip ====

* To compress with gzip:

  $ gzip filename
  $ ls
  filename.gz

  # To extract:
  $ gunzip filename.gz

  # Read from stdin and write to stdout:
  $ cat file | gzip -c > file.gz

* Create a gzip tarball:

  $ tar -czvvf archive.tar.gz [FILES]
  # or:
  $ tar -cavvf archive.tar.gz [FILES]

* bzip2 and bunzip2 can be used in a similar way:

  $ tar -xjvf archive.tar.bz2

* lzma and unlzma can be used in a similar way (note that --lzma must come
  before -f, so that the archive name is what follows -f):

  $ tar --lzma -cvvf archive.tar.lzma [FILES]
  $ tar --lzma -xvvf archive.tar.lzma -C extract_directory

==== Archiving and compressing with zip ====

* zip usage:

  # To archive with zip:
  $ zip archive_name.zip [SOURCE FILES/DIRS]

  # To archive recursively:
  $ zip -r archive.zip folder1 folder2

  # To extract:
  $ unzip file.zip

  # To update the archive with newer files:
  $ zip file.zip -u newfile

  # To delete a file from the archive:
  $ zip -d arc.zip file.txt

  # To list the archived files:
  $ unzip -l archive.zip

==== Faster archiving with pbzip2 ====

* pbzip2 can use multiple cores to do the compression.
* Usage:

  $ pbzip2 myfile.tar
  # or:
  $ tar cf myfile.tar.bz2 --use-compress-prog=pbzip2 dir_to_compress/

  # For extraction:
  $ pbzip2 -dc myfile.tar.bz2 | tar x
  # or:
  $ pbzip2 -d myfile.tar.bz2

  # Specify the number of CPUs:
  $ pbzip2 -p4 myfile.tar

==== Creating filesystems with compression ====

* squashfs is a heavily compressed, read-only filesystem.
* We need **squashfs-tools** to create squashfs files.
* Usage:

  # To create a squashfs file:
  $ mksquashfs SOURCES compressedfs.squashfs

  # To mount a squashfs file we use a loopback mount:
  # mkdir /mnt/squash
  # mount -o loop compressedfs.squashfs /mnt/squash

  # Exclude files while creating the squashfs file:
  $ sudo mksquashfs /etc test.squashfs -e /etc/passwd /etc/shadow

  # or take the exclude list from a file with -ef:
  $ cat excludelist
  /etc/passwd
  /etc/shadow
  $ sudo mksquashfs /etc test.squashfs -ef excludelist

==== Backup snapshots with rsync ====

* Usage:

  # Copy a source directory to a destination:
  $ rsync -av source_path destination_path
  # For example:
  $ rsync -av /home/slynux/data slynux@192.168.0.6:/home/backups/data
  # -a: archive mode
  # -v: verbose

  # Back up to a remote server over ssh:
  $ rsync -av source_dir username@host:PATH

  # Compress data during the transfer:
  $ rsync -avz source destination

=> If there is a / at the end of source_path, rsync copies the contents of that directory to the destination.
=> If / is not present at the end of source_path, rsync copies that directory itself into the destination.

  # Exclude files while archiving with the --exclude PATTERN flag,
  # or read patterns from a file with --exclude-from FILEPATH.

  # Delete files at the destination that no longer exist at the source:
  $ rsync -avz SOURCE DESTINATION --delete

  # Schedule backups at intervals:
  $ crontab -e
  0 */10 * * * rsync -avz /home/code user@IP_ADDRESS:/home/backups

=> will back up every 10 hours.
==== Version control-based backup with Git ====

* Build the git repository on the backup machine:

  $ mkdir -p /home/backups/backup.git
  $ cd /home/backups/backup.git
  $ git init --bare

* On the source machine, set the git config:

  $ git config --global user.name "Sarath Lakshman"
  $ git config --global user.email slynux@slynux.com

* Back up at intervals. Add to crontab (runs every 5 hours):

  0 */5 * * * /home/data/backup.sh

  # Create the script:
  #!/bin/bash
  cd /home/data/source
  git add .
  git commit -am "Backup taken @ $(date)"
  git push

* To revert to an earlier state:

  $ git checkout 3131f9661ec1739f72c213ec5769bc0abefa85a9

  # To make this permanent:
  $ git commit -am "Restore @ $(date) commit ID: 3131f9661ec1739f72c213ec5769bc0abefa85a9"

==== Creating entire disk images using fsarchiver ====

* Usage:

  # Back up a partition:
  fsarchiver savefs backup.fsa /dev/sda1

  # Back up multiple partitions:
  fsarchiver savefs backup.fsa /dev/sda1 /dev/sda2

  # Restore a partition (id selects which filesystem in the archive):
  fsarchiver restfs backup.fsa id=0,dest=/dev/sda1

  # Restore multiple partitions:
  fsarchiver restfs backup.fsa id=0,dest=/dev/sda1 id=1,dest=/dev/sdb1
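Returning to the Git-based backup recipe above, the whole push-to-bare-repository flow can be sketched end-to-end on one machine; the repository paths and the committer identity below are placeholders:

```shell
#!/bin/sh
# A bare "server" repository plus a working copy that commits and pushes.
set -e
tmp=$(mktemp -d)

# The backup target: a bare repository (no working tree)
git init --bare -q "$tmp/backup.git"

# The data to back up, as its own repository
mkdir "$tmp/data"
cd "$tmp/data"
git init -q .
git config user.name "Backup Bot"          # placeholder identity
git config user.email "backup@example.com" # placeholder identity
git remote add origin "$tmp/backup.git"

echo "important" > notes.txt
git add .
git commit -qm "Backup @ $(date)"

# Push the current branch, whatever its name, to the bare repository
git push -q origin HEAD

# Every push from here on adds a new recoverable snapshot
git -C "$tmp/backup.git" log --oneline --all
```

In the recipe above, the bare repository simply lives on a remote host reached over ssh instead of a local path; the commit/push cycle in the cron script is identical.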