public:books:linux_shell_scripting_cookbook:chapter_2 [NervTech's Wiki]

General way to read content with cat:
```
cat file1 file2 file3 ...
```

Combine stdin with a file:

echo 'Text through stdin' | cat - file.txt

Squeezing runs of consecutive blank lines into a single blank line:
```
cat -s file
```

Other cat flags:

# Display tabs as ^I:
cat -T file.py

# Display line numbers:
cat -n file.txt
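
The flags above can be tried on a throwaway file. This is a minimal sketch assuming GNU cat; the file content is made up for illustration:

```shell
# Hypothetical demo file: a run of blank lines and one tab-indented line
demo=$(mktemp)
printf 'first\n\n\n\nsecond\n\tindented\n' > "$demo"

squeezed=$(cat -s "$demo")   # the three blank lines collapse into one
tabs=$(cat -T "$demo")       # the tab is shown as ^I
numbered=$(cat -n "$demo")   # every line is numbered, blank ones included

printf '%s\n' "$squeezed"
rm -f "$demo"
```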

We can start recording a session with:

script -t 2> timing.log -a output.session
...
# type commands here
...
exit

Replay the commands with:
```
scriptreplay timing.log output.session
```

List all the files and directories under a base path:

find base_path

# For instance:
find . -print

# We can use -print0 to use '\0' as the delimiting character.
# This is useful when a filename contains spaces.

Search based on filename or regular expression:

find /home/slynux -name "*.txt" -print

# Using the option iname to ignore case:
find . -iname "example*" -print

# or condition for multiple criteria:
find . \( -name "*.txt" -o -name "*.pdf" \) -print

# Using path argument:
find /home/users -path "*/slynux/*" -print

# Using the regex argument to match paths based on regular expressions:
find . -regex ".*\(\.py\|\.sh\)$"

# or iregex to ignore case:
find . -iregex ".*\(\.py\|\.sh\)$"

Negating arguments:

# Exclude things that match a pattern:
find . ! -name "*.txt" -print

Search based on directory depth:

# Only printing files in the current directory:
find . -maxdepth 1 -name "f*" -print

# Or using mindepth:
find . -mindepth 2 -name "f*" -print

# Note: place -maxdepth and -mindepth right after the base path, before any tests.
# They apply to the whole expression, and GNU find prints a warning when they appear after other tests.
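
The depth options can be checked on a small temporary tree. This is only a sketch; the directory layout and file names are invented for the demonstration:

```shell
# Hypothetical tree: f1.txt at the top, f2.txt one level down, f3.txt two levels down
root=$(mktemp -d)
mkdir -p "$root/sub/deep"
touch "$root/f1.txt" "$root/sub/f2.txt" "$root/sub/deep/f3.txt"

# Depth options first, then the tests
top_only=$(cd "$root" && find . -maxdepth 1 -name "f*" -print)
nested=$(cd "$root" && find . -mindepth 2 -name "f*" -print | sort)

echo "maxdepth 1: $top_only"
echo "mindepth 2:"
echo "$nested"
rm -rf "$root"
```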

Search based on file type:

find . -type d -print  # find directories
find . -type f -print # find regular files
find . -type l -print # find symlinks

Search on file times:

 # we can use the flags:
-atime : access time
-mtime : modification time
-ctime : change time

# The provided integer value is a number of days:
find . -type f -atime -7 -print # all files accessed within the last 7 days
find . -type f -atime 7 -print # all files accessed exactly 7 days ago
find . -type f -atime +7 -print # all files accessed more than 7 days ago

# We can also use the minute-based flags:
-amin
-mmin
-cmin

# We can also find files newer than a given file:
find . -type f -newer file.txt -print
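
A quick way to see the time tests in action is to backdate a file. The sketch below assumes GNU touch (for `-d '10 days ago'`) and uses -mtime rather than -atime, since access times are often not updated on modern mounts; the file names are hypothetical:

```shell
root=$(mktemp -d)
touch -d '10 days ago' "$root/old.log"   # backdate the modification time
touch "$root/new.log"

older=$(find "$root" -type f -mtime +7 -print)    # modified more than 7 days ago
recent=$(find "$root" -type f -mtime -7 -print)   # modified within the last 7 days
newer=$(find "$root" -type f -newer "$root/old.log" -print)

echo "older:  ${older##*/}"
echo "recent: ${recent##*/}"
echo "newer:  ${newer##*/}"
rm -rf "$root"
```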

Search based on file size:

find . -type f -size +2k # files larger than 2 KiB
find . -type f -size -2k # files smaller than 2 KiB (sizes are rounded up to whole units)

# Instead of 'k' (KiB), we can use 'M' (MiB) or 'G' (GiB)

Deleting file matches:
```
find . -type f -name "*.swp" -delete
```

Match based on file permissions:

find . -type f -perm 644 -print

# For instance:
find . -type f -name "*.php" ! -perm 644 -print

# Or search based on user:
find . -type f -user slynux -print

Executing commands with find:

find . -type f -user root -exec chown slynux {} \;

# In the previous command:
# '{}' will be replaced by each filename.

# If we want to run the command once with a list of files as parameters,
# we just replace '\;' with '+'
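
The difference between the two terminators can be made visible by letting the executed command print how many arguments it received. This is an illustrative sketch with made-up file names:

```shell
root=$(mktemp -d)
touch "$root/a.c" "$root/b.c" "$root/c.c"

# With \; the command runs once per file, so we get one output line per file
per_file=$(find "$root" -type f -name "*.c" -exec sh -c 'echo "$#"' sh {} \; | wc -l)

# With + the file names are appended to a single invocation, which sees all of them
batched=$(find "$root" -type f -name "*.c" -exec sh -c 'echo "$#"' sh {} +)

echo "runs with \\;: $per_file"
echo "arguments in one run with +: $batched"
rm -rf "$root"
```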

To concatenate multiple files for instance:

find . -type f -name "*.c" -exec cat {} \; > all_c_files.txt

To copy all the .txt files that are older than 10 days to a directory OLD:
```
 find . -type f -mtime +10 -name "*.txt" -exec cp {} OLD  \;
```

-exec accepts only a single command. If we need to run several commands per file, we have to put them in a script file and call that script from -exec.

Combine exec with printf:

find . -type f -name "*.txt" -exec printf "Text file: %s\n" {} \;

Skipping specified directories in find:

find devel/source_path  \( -name ".git" -prune \) -o \( -type f -print \)
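
To check what -prune actually skips, the same expression can be run against a small fake checkout. The tree below is hypothetical:

```shell
root=$(mktemp -d)
mkdir -p "$root/.git/objects" "$root/src"
touch "$root/.git/config" "$root/.git/objects/pack" "$root/src/main.c"

# .git is matched and pruned, so find never descends into it;
# everything else falls through to the right-hand side of -o
kept=$(find "$root" \( -name ".git" -prune \) -o \( -type f -print \))

echo "$kept"
rm -rf "$root"
```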

Converting multiple lines to a single line output:
```
cat example.txt | xargs
```
Converting single-line into multiple-line output:
```
cat example.txt | xargs -n 3
```

Specify delimiter:

echo "splitXsplitXsplitXsplit" | xargs -d X

Provide one/more arguments from a file listing to a command:

cat args.txt | xargs -n 1 ./cecho.sh

# To provide n arguments we use the prototype:
INPUT | xargs -n X

# to provide all the arguments at once:
cat args.txt | xargs ./ccat.sh

We can specify the -I flag to provide a replacement string (only with one argument per command execution):
```
cat args.txt | xargs -I {} ./cecho.sh -p {} -l
```

Using xargs with find:

find . -type f -name "*.txt"  -print | xargs rm -f

# Safer implementation is:
find . -type f -name "*.txt" -print0 | xargs -0 rm -f
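
The reason the -print0 / -0 pair is safer shows up as soon as a filename contains a space. A small sketch, with invented file names:

```shell
root=$(mktemp -d)
touch "$root/plain.txt" "$root/with space.txt" "$root/keep.log"

# Each NUL-terminated name is passed as exactly one argument
safe=$(find "$root" -type f -name "*.txt" -print0 | xargs -0 -n 1 echo | wc -l)

# Both .txt files are removed, including the one with a space in its name
find "$root" -type f -name "*.txt" -print0 | xargs -0 rm -f

remaining=$(ls "$root")
echo "arguments seen: $safe, remaining: $remaining"
rm -rf "$root"
```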

Count number of lines of C code:

find source_code_dir_path -type f -name "*.c" -print0 | xargs -0 wc -l

# One could also consider using the SLOCCount utility

Using a while loop in a subshell instead of xargs:

cat files.txt | xargs -I {} cat {}

# is equivalent to:
cat files.txt | ( while read -r arg; do cat "$arg"; done )

Simple translation:

echo "HELLO WHO IS THIS" | tr 'A-Z' 'a-z'

Simple substitution ciphers:

echo 12345 | tr '0-9' '9876543210' # encrypt
echo 87654 | tr '9876543210' '0-9' # decrypt

# ROT13 encryption:
echo "tr came, tr saw, tr conquered." | tr 'a-zA-Z' 'n-za-mN-ZA-M'
# decryption:
echo "ge pnzr, ge fnj, ge pbadhrerq." | tr 'a-zA-Z' 'n-za-mN-ZA-M'

Converting tab to space:
```
tr '\t' ' ' < file.txt
```

Deleting characters:

echo "Hello 123 world 456" | tr -d '0-9'

Complementing character set:

echo hello 1 char 2 next 4 | tr -d -c '0-9 \n'

Squeezing characters with tr:

echo "GNU is       not     UNIX. Recursive   right ?" | tr -s ' '

Compute a sum from a file:

# Assuming sum.txt contains one number per line:
cat sum.txt | echo $(( $(tr '\n' '+' ) 0 ))

tr also accepts character classes such as alnum, alpha, cntrl, digit, graph, lower, print, punct, space, upper and xdigit. Quote them so that the shell does not treat the brackets as a glob pattern:
```
tr '[:class:]' '[:class:]'
```
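
A few concrete uses of character classes, as a small sketch (the input strings are arbitrary):

```shell
# Translate one class into another
upper=$(echo "hello world 42" | tr '[:lower:]' '[:upper:]')

# Delete everything that is not a digit or a newline
digits=$(echo "hello world 42" | tr -d -c '[:digit:]\n')

# Squeeze repeated whitespace characters
squeezed=$(echo "a   b    c" | tr -s '[:space:]')

echo "$upper"
echo "$digits"
echo "$squeezed"
```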

To compute the checksum we can use:

$ md5sum filename
68b329da9893e34099c7d8ad5cb9c940 filename

# We can redirect the output to file:
$ md5sum filename > file_sum.md5

# Prototype is:
$ md5sum file1 file2 file3 ...

# This will output one line per file.

To verify the integrity of a file:

$ md5sum -c file_sum.md5
# This will output a message whether checksum matches or not.
# Alternatively:
$ md5sum -c *.md5

Usage of SHA-1 is similar: replace md5sum with sha1sum.

We can compute checksums for a whole directory with md5deep or sha1deep:

$ md5deep -rl directory_path > directory.md5
# -r to enable recursive traversal
# -l for using relative paths. By default it writes absolute file paths in the output

# Alternatively we can use find:
$ find directory_path -type f -print0 | xargs -0 md5sum >> directory.md5
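
The create-then-verify cycle can be exercised end to end on a scratch file. The sketch below relies only on the exit status of `md5sum -c`; the helper name `check` and the file names are made up:

```shell
root=$(mktemp -d)
echo "original content" > "$root/data.txt"

# Store the checksum with a relative name so the list can be moved around
( cd "$root" && md5sum data.txt > data.md5 )

check() {   # hypothetical helper: reports whether the stored checksum still matches
    if ( cd "$root" && md5sum -c data.md5 >/dev/null 2>&1 ); then
        echo OK
    else
        echo FAILED
    fi
}

before=$(check)                       # file untouched
echo "tampered" >> "$root/data.txt"
after=$(check)                        # file modified

echo "before=$before after=$after"
rm -rf "$root"
```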

Encryption with crypt:

$ crypt <input_file >output_file
Enter passphrase:

# alternatively, we can provide the passphrase on the command line:
$ crypt PASSPHRASE <input_file >encrypted_file

# to decrypt:
$ crypt PASSPHRASE -d <encrypted_file >output_file

Encryption with gpg:

$ gpg -c filename

# to decrypt:
$ gpg filename.gpg

Encoding with base64 (this is an encoding, not encryption):

$ base64 filename > outputfile

# or:
$ cat file | base64 > outputfile

# To decode:
$ base64 -d file > outputfile

# or:
$ cat base64_file | base64 -d > outputfile
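
A short round trip shows that base64 is reversible without any key. A minimal sketch using GNU coreutils base64:

```shell
encoded=$(printf 'hello' | base64)
decoded=$(printf '%s' "$encoded" | base64 -d)

echo "encoded: $encoded"
echo "decoded: $decoded"
```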

md5sum and sha1sum have also been used to store password hashes, but both are no longer considered secure for that purpose; bcrypt or sha512sum are recommended instead.

Generate shadow password with openssl:

$ openssl passwd -1 -salt SALT_STRING PASSWORD
$1$SALT_STRING$323VkWkSLHuhbt1zkSsUG.

Sort a given set of files:

$ sort file1.txt file2.txt > sorted.txt

# or:
$ sort file1.txt file2.txt -o sorted.txt

# For numerical sorting:
$ sort -n file.txt

# To sort in reverse order:
$ sort -r file.txt

# To sort by month:
$ sort -M months.txt

# To merge 2 sorted files:
$ sort -m sorted1 sorted2

# To remove duplicate lines from the sorted output:
$ sort file1.txt file2.txt | uniq

To check if a file is already sorted, we check the exit status of sort -C:

#!/bin/bash
#Desc: Sort
sort -C filename ;
if [ $? -eq 0 ]; then
   echo Sorted;
else
   echo Unsorted;
fi
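
The same check can be written without `$?` by testing the command directly. The function name `is_sorted` is hypothetical and only wraps what the script above does:

```shell
is_sorted() {
    # sort -C prints nothing; it only sets the exit status
    if sort -C "$1"; then
        echo "Sorted"
    else
        echo "Unsorted"
    fi
}

tmp=$(mktemp)
printf 'a\nb\nc\n' > "$tmp"
first=$(is_sorted "$tmp")
printf 'b\na\nc\n' > "$tmp"
second=$(is_sorted "$tmp")

echo "$first / $second"
rm -f "$tmp"
```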

Sort by a column in text file:

$ cat data.txt
1  mac    2000
2  winxp    4000
3  bsd    1000
4  linux    1000

# we use the -k flag to specify the column to use:

# Sort reverse by column1
$ sort -nrk 1  data.txt
4  linux    1000 
3  bsd    1000 
2  winxp    4000 
1  mac    2000 
# -nr means numeric and reverse

# Sort by column 2
$ sort -k 2  data.txt
3  bsd    1000 
4  linux    1000 
1  mac    2000 
2  winxp    4000

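Specify a range for the key:

$ cat data.txt
1010hellothis
2189ababbba
7464dfddfdfd

# -k takes FIELD[.CHAR] positions, so a range of characters is written as field.char.
# To sort numerically on characters 2 to 3 of the line:
$ sort -nk 1.2,1.3 data.txt

# To use the first character as key:
$ sort -nk 1.1,1.1 data.txt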

# To use a \0 separator:
$ sort -z data.txt | xargs -0
#Zero terminator is used to make safe use with xargs

# To ignore leading blanks and use dictionary order:
$ sort -bd unsorted.txt

Usage of uniq:
```
$ sort unsorted.txt | uniq
```
Display only the lines that are not repeated:
```
$ uniq -u sorted.txt
```

Count how many times each line appears:

$ sort unsorted.txt | uniq -c
      1 bash
      1 foss
      2 hack

Find the duplicate lines:
```
$ sort unsorted.txt  | uniq -d
hack
```

Specify start and width for uniqueness computation:

$ cat data.txt
u:01:gnu 
d:04:linux 
u:01:bash 
u:01:hack

$ sort data.txt | uniq -s 2 -w 2
d:04:linux 
u:01:bash

Terminate lines with \0 separator:
```
$ uniq -z file.txt
```

Create a temporary file:

$ filename=`mktemp`
$ echo $filename
/tmp/tmp.8xvhkjF5fH

Create temporary directory:

$ dirname=`mktemp -d`
$ echo $dirname
/tmp/tmp.NI8xzW7VRX

To just generate a filename without actually creating it:

$ tmpfile=`mktemp -u`
$ echo $tmpfile
/tmp/tmp.RsGmilRpcT
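
Temporary files are usually paired with a cleanup trap so they do not accumulate. The following is a sketch of that pattern, run in a subshell so that the trap fires right away; it is one common idiom, not something the notes above prescribe:

```shell
result=$(
    tmpfile=$(mktemp) || exit 1
    trap 'rm -f "$tmpfile"' EXIT     # runs when this subshell exits
    echo "scratch data" > "$tmpfile"
    echo "$tmpfile"                  # report the name that was used
)

if [ -e "$result" ]; then
    echo "still there: $result"
else
    echo "cleaned up: $result"
fi
```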

Create temp file according to template:
```
$ mktemp test.XXX
test.2tc
```

Splitting a file:

$ split -b 10k data.file
$ ls
data.file  xaa  xab  xac  xad  xae  xaf  xag  xah  xai  xaj

To use numeric suffixes:
```
$ split -b 10k data.file -d -a 4
```

Specify a filename prefix:

$ split -b 10k data.file -d -a 4 split_file

To split based on number of lines:
```
$ split -l 10 data.file
```
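
Split pieces can be joined back with cat, which is an easy way to confirm that nothing was lost. A sketch assuming GNU split (for `-d`); the prefix `part_` is an arbitrary choice:

```shell
root=$(mktemp -d)
seq 1 25 > "$root/data.file"

# 10 lines per piece, numeric two-digit suffixes, prefix "part_"
( cd "$root" && split -l 10 -d -a 2 data.file part_ )

pieces=$(ls "$root" | grep -c '^part_')

# The shell expands part_* in sorted order, so cat restores the original content
if cat "$root"/part_* | cmp -s - "$root/data.file"; then
    same=yes
else
    same=no
fi

echo "pieces=$pieces identical=$same"
rm -rf "$root"
```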

csplit can be used to split based on file content:

csplit server.log /SERVER/ -n 2 -s {*}  -f server -b "%02d.log"  ; rm server00.log

Extracting the name from name.extension:

file_jpg="sample.jpg"
name=${file_jpg%.*}
echo File name is: $name

Extracting the extension from name.extension:
```
extension=${file_jpg#*.}
```
Note that the operator % is non-greedy (i.e. it removes the shortest match), whereas the operator %% is greedy:
```
$ VAR=hack.fun.book.txt
$ echo ${VAR%.*}
hack.fun.book

$ echo ${VAR%%.*}
hack
```

We also have operator ## similar to # but greedy:

$ VAR=hack.fun.book.txt
$ echo ${VAR#*.}
fun.book.txt

$ echo ${VAR##*.}
txt
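
The four operators combine naturally when taking a full path apart. A small sketch on a hypothetical path:

```shell
path="/tmp/archive/backup.tar.gz"   # hypothetical path, the file need not exist

file=${path##*/}        # remove longest  */ prefix -> backup.tar.gz
dir=${path%/*}          # remove shortest /* suffix -> /tmp/archive
base=${file%%.*}        # remove longest  .* suffix -> backup
ext=${file#*.}          # remove shortest *. prefix -> tar.gz
last_ext=${file##*.}    # remove longest  *. prefix -> gz

echo "$dir | $file | $base | $ext | $last_ext"
```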

Rename all image files in the current directory:

#!/bin/bash
#Filename: rename.sh
#Desc: Rename jpg and png files
count=1;
for img in `find . -maxdepth 1 -type f \( -iname '*.png' -o -iname '*.jpg' \)`
do
  new=image-$count.${img##*.}
  echo "Renaming $img to $new"
  mv "$img" "$new"
  let count++
done

Renaming *.JPG to *.jpg:
```
$ rename 's/\.JPG$/.jpg/' *.JPG
```
Replace spaces with underscore:
```
$ rename 's/ /_/g' *
```

Convert from upper to lower case or the opposite:

$ rename 'y/A-Z/a-z/' *
$ rename 'y/a-z/A-Z/' *

Move all mp3 files into a folder:

$ find path -type f -name "*.mp3" -exec mv {} target_dir \;

Recursive rename:

$ find path -type f -exec rename 's/ /_/g' {} \;

Dictionary files are found in /usr/share/dict/

Check if word is part of dictionary:

#!/bin/bash
#Filename: checkword.sh
word=$1
grep "^$1$" /usr/share/dict/british-english -q 
if [ $? -eq 0 ]; then
  echo $word is a dictionary word;
else
  echo $word is not a dictionary word;
fi

# Usage as:
$ ./checkword.sh ful 
ful is not a dictionary word 

$ ./checkword.sh fool 
fool is a dictionary word

Alternatively, we can use aspell.
List all words in a file starting with a given word as follows:
```
$ look word filepath

# or:
$ grep "^word" filepath
```

Automate an input for a command with:

$ echo -e "1\nhello\n" | ./interactive.sh 
You have entered 1, hello

# -e flag for echo means 'interpret escape sequences'

The expect program can be used when the input order is not always the same.

Run several commands in parallel, for instance:

#!/bin/bash
#filename: generate_checksums.sh
PIDARRAY=()
for file in File1.iso File2.iso
do
  md5sum "$file" &
  PIDARRAY+=("$!") # $! : retrieves the PID of the last background process.
done
wait "${PIDARRAY[@]}"
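
Waiting on each PID separately also gives access to the exit status of every job. The sketch below (bash, because of the array) replaces md5sum with short dummy jobs, one of which fails on purpose:

```shell
pids=()
for n in 1 2 3; do
    ( sleep 0.1; [ "$n" -ne 2 ] ) &   # dummy job; job 2 returns a non-zero status
    pids+=("$!")
done

failures=0
for pid in "${pids[@]}"; do
    # wait PID returns the exit status of that particular job
    wait "$pid" || failures=$((failures + 1))
done

echo "failed jobs: $failures"
```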
