Shell - HowTo's

How to compare files ?

Usage :

cksum : compute checksum and file size
  • command line : cksum file1 file2
  • output : several fields :
    1. the file checksum
    2. the file size in bytes
    3. the file name
cmp : compare files byte by byte
  • command line : cmp file1 file2
  • output :
    • when files are identical : nothing
    • when files differ : the position (byte and line number) of the first difference
  • notes : works with both ASCII and binary files
comm : compare sorted files line by line
diff : compare files line by line
  • command line : diff file1 file2
  • output :
    • when files are identical : nothing
    • when files differ : the lines that differ
  • notes : meant for text files; with binary files, it only reports whether they differ

Example :

Compare 2 sets of files having the same names (such as 2 backups : ./directory1/* ./directory2/*) :

diff -b : ignore changes in the amount of whitespace
cd directory1; for i in *conf; do echo "$i"; diff -b "$i" "../directory2/$i"; done
diff -r : recursive
diff -r directory1 directory2
cmp -s : don't output differences, only set the exit status
cd directory1; for i in *conf; do echo -n "$i"; cmp -s "$i" "../directory2/$i"; echo " $?"; done
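
The cmp -s loop above can be turned into a small report. This is only a sketch : the temporary directories, the a.conf / b.conf files and their contents are made up for the demonstration.

```shell
#!/bin/bash
# Build 2 throw-away "backups" (hypothetical files, for the demo only)
workDir=$(mktemp -d)
mkdir "$workDir/directory1" "$workDir/directory2"
echo 'port=80'   > "$workDir/directory1/a.conf"
echo 'port=80'   > "$workDir/directory2/a.conf"
echo 'port=80'   > "$workDir/directory1/b.conf"
echo 'port=8080' > "$workDir/directory2/b.conf"

report=''
for f in "$workDir"/directory1/*.conf; do
    name=${f##*/}                                   # file name without its path
    # cmp -s prints nothing : only its exit status tells whether files match
    if cmp -s "$f" "$workDir/directory2/$name"; then
        report+="$name identical"$'\n'
    else
        report+="$name differs"$'\n'
    fi
done
printf '%s' "$report"
rm -r "$workDir"
```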

Read-only TMOUT variable, how to workaround and disable automatic logout ?

Situation :

Defining a timeout after which an inactive user is automatically logged out is good practice. It helps clean up connections (and the associated processes) and leaves resources for the "active" users. However, some BOFH (aka mean sysadmins) set the timeout to ridiculously low values.

Details :

Trying to alter TMOUT fails :
export TMOUT=0
-bash: TMOUT: readonly variable
unset TMOUT
-bash: unset: TMOUT: cannot unset: readonly variable

Solution :

Method 1 : gdb (sources : 1, 2)

  • Not tested / confirmed, may require some polish... Use at your own risk.
  • If your sysadmin made TMOUT readonly, chances are that /usr/bin/gdb (from gdb) is not available anyway (maybe building a static gdb binary could do the trick)
Add to ~/.bashrc :
# Disable the stupid auto-logout
unset TMOUT >/dev/null 2>&1
if [ $? -ne 0 ]; then
	gdb <<EOF >/dev/null 2>&1
	attach $$
	call unbind_variable("TMOUT")
	detach
	quit
EOF
fi

Method 2 : exec (source)

exec env TMOUT=durationSeconds bash
This lets you set any durationSeconds value, including 0. In that case, be very careful not to forget open connections here and there, or the BOFH will be mad.

Safe values :

  • 3 hours : exec env TMOUT=10800 bash
  • 3 days : exec env TMOUT=259200 bash
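
Why does the exec trick work ? The readonly attribute belongs to the shell that set it and is not passed on to child processes. The sketch below simulates this in a throw-away shell (the 300 value is arbitrary, and the read-only TMOUT is set by the script itself, not by a real sysadmin) :

```shell
#!/bin/bash
result=$(bash -c '
    readonly TMOUT=300
    export TMOUT
    # Changing TMOUT in place is refused (tried in a subshell, error message hidden)
    ( TMOUT=0 ) 2>/dev/null || echo "in-place assignment refused"
    # A new shell started through env gets its own, writable TMOUT
    env TMOUT=0 bash -c "echo newTMOUT=\$TMOUT"
')
echo "$result"
```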

How to write messages specifically to stdout or stderr ?

You just have to redirect the message to the matching file descriptor : >&1 for stdout, >&2 for stderr. Check it :
tmpScript=$(mktemp tmpFile.XXXXXXXX); echo -e '#!/bin/bash\necho stdout >&1\necho stderr >&2' > "$tmpScript"; echo -e '\nrunning script, no filter'; bash "$tmpScript"; echo -e '\nrunning script, hide stdout'; bash "$tmpScript" 1>/dev/null; echo -e '\nrunning script, hide stderr'; bash "$tmpScript" 2>/dev/null; [ -f "$tmpScript" ] && rm "$tmpScript"
running script, no filter
stdout
stderr

running script, hide stdout
stderr

running script, hide stderr
stdout
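
In a script, this is usually wrapped into small helper functions. A minimal sketch (say_out and say_err are hypothetical names) :

```shell
#!/bin/bash
say_out() { echo "$*"; }        # regular messages go to stdout
say_err() { echo "$*" >&2; }    # diagnostics go to stderr

# Capture each stream separately to check where messages actually land
onlyStdout=$( { say_out 'to stdout'; say_err 'to stderr'; } 2>/dev/null )
onlyStderr=$( { say_out 'to stdout'; say_err 'to stderr'; } 2>&1 >/dev/null )
echo "captured from stdout : $onlyStdout"
echo "captured from stderr : $onlyStderr"
```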

How to detect a hard link ?

Situation :

I suspect suspectFile of being a hard link, how can I make sure ?

Details :

A symlink (aka symbolic or soft link) is just a special file (with its own inode) containing the path to another file. This is why it's able to refer to files on any filesystem.

A hard link is not so special, actually. On *Nix filesystems (where "everything is a file", remember ?), directories are "files listing files" (including sub-directories). They do so by matching an inode number (i.e. where the data is found on the filesystem) to a label (the file name). And since this is just a list of inode / label pairs, nothing prevents several labels from sharing the same inode. Which is exactly what hard links do. (sources : 1, 2)
A "regular" file (or a directory) is just a hard link with a "1 to 1" inode / label relation, whereas hard links have a "1 to many" inode / label relation.

  • Unlike symlinks, with hard links it is not possible to differentiate the link from its target.
  • When found in different directories, hard links need not have the same file name.

Solution :

If you know the target :

Since a hard link shares an inode with its target, let's list their inode numbers :
ls -il suspectFile target
135340 -rw-r--r-- 2 kevin users 9042 Sep 19 15:05 suspectFile
135340 -rw-r--r-- 2 kevin users 9042 Sep 19 15:05 target

If you don't know the target :

Check inodes :
  1. ls -i suspectFile
    135340 suspectFile
  2. find /somewhere -inum 135340
    /path/to/suspectFile
    /different/path/to/someFile
Same as above (source) :
find /somewhere -inum $(ls -i suspectFile | cut -d' ' -f1)
/path/to/suspectFile
/different/path/to/someFile
Or even easier for the same result :
find /somewhere -samefile suspectFile
Investigate file properties (source) :
stat suspectFile
  File: suspectFile
  Size: 9042            Blocks: 24         IO Block: 4096   regular file
Device: fe04h/65028d    Inode: 135340      Links: 2		more than 1 link : this is a hard link
Access: (0644/-rw-r--r--)  Uid: ( 1000/   kevin)   Gid: ( 1000/   users)
Access: 2018-09-19 15:05:26.103989305 +0200
Modify: 2018-09-19 15:05:26.103989305 +0200
Change: 2018-09-20 14:11:38.766124455 +0200
 Birth: -
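
To experiment safely, the checks above can be replayed on a throw-away hard link. A sketch, with files created in a temporary directory just for the demonstration; it also uses the -ef test of Bash ("both names refer to the same device and inode") :

```shell
#!/bin/bash
workDir=$(mktemp -d)
echo 'some data' > "$workDir/target"
ln "$workDir/target" "$workDir/suspectFile"     # no -s : this is a hard link

# Same inode ?
if [ "$workDir/suspectFile" -ef "$workDir/target" ]; then
    sameInode=yes
else
    sameInode=no
fi

# How many names point to this inode below workDir ?
nbNames=$(find "$workDir" -samefile "$workDir/suspectFile" | wc -l)

ls -il "$workDir"
rm -r "$workDir"
```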

How to bind a function to a key ?

  1. If the key is a "special" one (function, CTRL-ed or ALT-ed), get its code with CTRL-V key. For example, CTRL-V F9 will output ^[[20~, where the leading ^[ is how the terminal displays the escape character, which is written \e in a binding. So the final code for F9 is \e[20~.
  2. Then : bind '"keyCode":"someCommand"' :
    • bind '"\e[20~":"echo \"Hello World\""' : this just displays the echo "Hello World" string after pressing F9, without actually executing the echo command.
    • bind '"\e[20~":"echo \"Hello World\"\n"' : thanks to the final \n (interpreted as Enter), the command is executed upon pressing F9. With this syntax, the bound command is always displayed before being executed. To run a command without displaying it, use bind -x instead : bind -x '"\e[20~":"echo Hello"'
    • Also works with compound commands : bind '"\e[20~":"for i in $(seq 10 -1 1); do echo $i; sleep 0.1; done; echo 'BOOOM\!'\n"'
    • Or : bind '"\e[20~":"nbPressed=$((nbPressed+1)); echo \"You pressed F9 $nbPressed times.\"\n"'
  3. To unbind the function : bind '"\e[20~":""'

Define key bindings in ~/.inputrc to make them permanent

# CTRL-F11 : display "/etc/apache2/sites-enabled/"
"\e[23;5~":"/etc/apache2/sites-enabled/"

# CTRL-F12 : display "/var/www/"
"\e[24;5~":"/var/www/"

If the key bindings must work through Putty, the configuration becomes :

# F11 : display "/etc/apache2/sites-enabled/"
"\e[23~":"/etc/apache2/sites-enabled/"

# F12 : display "/var/www/"
"\e[24~":"/var/www/"

~/.inputrc is a readline config file, not a Bash one, so it cannot be source'd. It is read at Bash startup, and can be re-read from a running shell with bind -f ~/.inputrc. (source, details)

How to redirect the inputs and outputs of a command ?

Standard file descriptors :

I/O name file descriptor
STDIN 0
STDOUT 1
STDERR 2
Child processes inherit open file descriptors. This is why pipes work. (source, details)
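
A quick way to see this inheritance at work (a sketch : file descriptor 3 and the temporary file are arbitrary choices) :

```shell
#!/bin/bash
tmpLog=$(mktemp)
exec 3>"$tmpLog"                            # open file descriptor 3 in the current shell
bash -c 'echo "written by the child" >&3'   # the child never opened fd 3 : it inherited it
exec 3>&-                                   # close fd 3
inherited=$(cat "$tmpLog")
rm -f "$tmpLog"
echo "$inherited"
```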

Redirect the outputs :

command 1>file.log 2>&1
which means :
  1. send STDOUT to file.log
  2. and send STDERR to "where STDOUT is", i.e. follow the same redirection
With Bash only, this can be shortened into :
command &>file.log
With nothing before >, only STDOUT is redirected (source). Commands below are equivalent :
  • command > whatever
  • command 1> whatever
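
Since 2>&1 means "where STDOUT is right now", the order of the redirections matters. A sketch illustrating it (noisy is a hypothetical function writing to both streams) :

```shell
#!/bin/bash
noisy() { echo 'out'; echo 'err' >&2; }
tmpLog=$(mktemp)

# 1. STDOUT to the file first, then STDERR to "where STDOUT is" : both end up in the file
noisy >"$tmpLog" 2>&1
bothInFile=$(cat "$tmpLog")

# 2. Reversed order : STDERR follows STDOUT *before* STDOUT is sent to the file,
#    so STDERR still reaches the original STDOUT (captured here by the command substitution)
escaped=$(noisy 2>&1 >"$tmpLog")
onlyOutInFile=$(cat "$tmpLog")

rm -f "$tmpLog"
```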

Discard the output :

You just have to redirect the output to /dev/null :
command 1>/dev/null 2>&1
works in most shell flavors (Korn, Bourne, ...), and even on Windows
command &>/dev/null
shorter but Bash only
exec > /path/to/logFile 2>&1
exec hack to discard / redirect the output of a group of commands

There is no "ninja hack" to this : if some output is still displayed despite the redirection, make sure you're actually redirecting the output of the right command.
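
When using the exec hack, the redirection stays in effect until the end of the script. If the original outputs are needed again later, they can be saved first. A sketch (file descriptors 3 and 4 are arbitrary free descriptors, and the log file is a temporary one) :

```shell
#!/bin/bash
tmpLog=$(mktemp)
exec 3>&1 4>&2              # keep copies of the current STDOUT / STDERR
exec >"$tmpLog" 2>&1        # from now on, everything goes to the log file
echo 'regular message'
echo 'error message' >&2
exec 1>&3 2>&4 3>&- 4>&-    # restore STDOUT / STDERR, close the copies
logged=$(cat "$tmpLog")
rm -f "$tmpLog"
echo "$logged"
```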

How to group shell commands ?

There are 2 ways to do this :

(listOfCommands)

This causes a subshell environment to be created, in which each of the commands of listOfCommands will be executed. Since commands are executed in a subshell, variable assignments do not remain in effect after the subshell completes.
a=42; echo "a=$a"; (b=12; echo -e "\ta=$a\n\tb=$b"; a=0; echo -e "\ta=$a\n\tb=$b"); echo -e "a=$a\nb=$b"
a=42
	a=42		$a is "passed" inside of the ()
	b=12
	a=0		$a can be overwritten inside of the ()
	b=12
a=42			the changed value of $a doesn't exist outside of the ()
b=			$b doesn't exist outside of ()

{ listOfCommands; }

Commands are executed in the current shell context, no subshell is created. The semicolon ; (or newline) is required after listOfCommands.
a=42; echo "a=$a"; { b=12; echo -e "\ta=$a\n\tb=$b"; a=0; echo -e "\ta=$a\n\tb=$b"; }; echo -e "a=$a\nb=$b"
a=42
	a=42		this is the same "$a" since it's the same shell context
	b=12
	a=0		we can do whatever we like
	b=12
a=0			being in the same shell context means variables stay altered outside of the {}
b=12			$b now exists outside of {}
In addition to the creation of a subshell, there is a subtle difference between these two constructs due to historical reasons : the braces { and } are reserved words, so they must be separated from the list by blanks.
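
The same difference applies to anything that belongs to the shell context, not only variables. For instance, with the current directory (a sketch; / is used only because it exists everywhere) :

```shell
#!/bin/bash
startDir=$PWD

( cd / )                # the directory change dies with the subshell
afterSubshell=$PWD

{ cd /; }               # same shell context : the directory change remains
afterBraces=$PWD

cd "$startDir"          # go back to where we started
echo "after () : $afterSubshell"
echo "after {} : $afterBraces"
```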

How to handle files whose name starts with a dash - ?

Situation :

For some strange reason, you end up with a file called -foo, which is rather embarrassing as you can't "rm -foo".

Details :

Indeed, - is interpreted as a command option prefix.

Solution :

prefix the file name with its path :
rm ./-foo
explicitly declare the end of command options with -- (sources : 1, 2) :
rm -- -foo
with a hack involving find and an inode number
not sure this actually works because there's some character substitution occurring (Bash normal behavior), and you'd have to circumvent this with "--" :

testDir='/tmp'; touch "$testDir/-foo"; inodeNumber=$(ls -i "$testDir/"*foo | cut -d' ' -f1); find "$testDir" -inum "$inodeNumber"

This lists much more than just /tmp/-foo.
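
The first 2 methods can be checked in a sandbox. A sketch, working in a temporary directory with 2 hypothetical files, -foo and -bar :

```shell
#!/bin/bash
workDir=$(mktemp -d)
cd "$workDir" || exit 1
touch ./-foo ./-bar

# The naive way fails : -foo is parsed as options
if rm -foo 2>/dev/null; then naiveRm=worked; else naiveRm=failed; fi

rm -- -foo              # method : end of options marker
rm ./-bar               # method : path prefix

remaining=$(ls -A | wc -l)
cd - >/dev/null
rmdir "$workDir"
echo "naive rm $naiveRm, $remaining file(s) left"
```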

How to cd into a directory among others starting with the same characters ?

Situation :

Let's imagine the current directory has several subdirectories such as 20161115_workOnSomeStuff, 20161121-25_support and 20161122_otherProject, and you want to cd into 20161121-25_support.

Solution at the end of this article

Details :

To do so :
  1. You type : cd 2 + TAB
  2. the shell completes :
    cd 201611
    but to go further you have to hit TAB + TAB again to see the available options
  3. the shell returns :
    20161115_workOnSomeStuff/	20161121-25_support/	20161122_otherProject/
    so now you have to figure out which character to type next : 2 + TAB + TAB again for available options
  4. still 2 options :
    20161121-25_support/	20161122_otherProject/
    Type the next character : 1 + TAB
  5. the shell completes :
    cd 20161121-25_support/
  6. then you just have to hit Enter : you're in !

Solution :

cd *rt

All the magic here is in the shell expansion of the wildcard * + rt to match any directory whose name ends in rt.
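
To try it without touching real directories, here is a sketch recreating the 3 directories of the example in a temporary location. Keep in mind this only works when the pattern matches a single directory.

```shell
#!/bin/bash
workDir=$(mktemp -d)
mkdir "$workDir/20161115_workOnSomeStuff" \
      "$workDir/20161121-25_support" \
      "$workDir/20161122_otherProject"

cd "$workDir"/*rt || exit 1     # only 1 directory name ends in "rt"
landedIn=${PWD##*/}             # last component of the current directory
echo "now in : $landedIn"

cd - >/dev/null
rm -r "$workDir"
```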

How to rename numerous files ?

  • Examples below work fine, but things can be done much more simply with rename.
  • Consider shell parameter expansion (such as ${i%.JPG}) to retrieve the file name or the file extension.

Make backup copies :

for i in *cfg; do cp "$i" "$i.old"; done; ls -lh

Writing $i_old for the target (with or without quotes) causes an error because the underscore character _ is allowed in variable names, and in such case, Bash will search a variable named i_old, which doesn't exist. (More on Bash variables : 1, 2)
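
When the underscore is really wanted, braces tell Bash where the variable name ends. A short sketch (server.cfg is a made-up file name) :

```shell
#!/bin/bash
unset i_old                 # make sure no variable of that name exists
i='server.cfg'

wrong="$i_old"              # Bash looks for a variable named i_old : empty result
right="${i}_old"            # braces delimit the variable name
alsoRight="$i.old"          # a dot can not be part of a variable name : no ambiguity

echo "wrong=[$wrong] right=[$right] alsoRight=[$alsoRight]"
```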

Change file extensions :

for i in *oldExtension; do mv "$i" "$(basename "$i" .oldExtension).newExtension"; done

copy-paste :
  1. file.cfg into file.cfg.old : for i in *cfg; do mv "$i" "$(basename "$i" .cfg).cfg.old"; done; ls -lh
  2. file.cfg.old into file.cfg :
    • for i in *old; do mv "$i" "$(basename "$i" .cfg.old).cfg"; done; ls -lh
    • OR : oldExtension='.DONE'; for i in path/to/directory/*"$oldExtension"; do mv "$i" "$(basename "$i" "$oldExtension")"; done; ls -lh

Rename *JPG files into *jpg :

  • for i in *JPG; do mv "$i" "$(basename "$i" .JPG).jpg"; done or
  • for i in *JPG; do mv "$i" "${i%.JPG}.jpg"; done

Change all the spaces in the name of .mpg files into _ :

  • for i in *mpg; do mv "$i" "$(echo "$i" | tr ' ' '_')"; done or
  • for i in *mpg; do mv "$i" "${i// /_}"; done (source)

Remove a substring from several file names such as xx - artist - title.mp3 (actually : replace it with an empty string) :

The Perl method (source) :
  • theory : for file in *; do mv "$file" "$(perl -e '$tmp=$ARGV[0]; $tmp=~s/before/after/; print $tmp' "$file")"; done
  • copy-paste : for file in *; do mv "$file" "$(perl -e '$tmp=$ARGV[0]; $tmp=~s/artist - //; print $tmp' "$file")"; done
The Bash method to change before into after (source)

for file in *; do mv "$file" "${file/before/after}"; done

${string/before/after}
replace 1st match of before with after
${string//before/after}
replace all matches of before with after
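
These expansions can be tested on a plain string before renaming anything. A sketch, using a made-up file name following the xx - artist - title.mp3 pattern :

```shell
#!/bin/bash
file='01 - artist - title.mp3'

baseName=${file%.*}             # remove the shortest trailing ".something"
extension=${file##*.}           # remove everything up to the last dot
firstOnly=${file/ - /_}         # replace the 1st " - " only
allMatches=${file// - /_}       # replace every " - "
noArtist=${file/artist - /}     # replace with an empty string

printf '%s\n' "$baseName" "$extension" "$firstOnly" "$allMatches" "$noArtist"
```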

How to enter a directory having a special non-printable character in its name ?

Let's imagine a situation where, after successfully executing cd path/to/directory, then ls -l, you end up with :
drwxr-xr-x 5 bob users	4096 jun.	25 21:19 normalDirectory_1
drwxr-xr-x 6 bob users	4096 nov.	18 07:46 normalDirectory_2
drwxr-xr-x 4 bob users	4096 dec.	 1 18:48 normalDirectory_3
drwxr-xr-x 3 bob users	4096 nov.	28 10:00 ?weirdDirectory		what's wrong with this one ?

This ?weirdDirectory contains a non-printable character in its name represented by a ? by ls. BTW, ls can do more for us and report the octal code of this special character with its -b flag (but this is of no help here).
This is even worse when the special character is the 1st one, because we can't even be saved by Bash completion.

All of this is inspired by a true story, while trying to browse directories on a remote server (where locales were possibly quite f***ed up) via PuTTY + screen (possibly poorly configured also). That server had directories — wherever they came from — with letters such as éèà... and since I couldn't type them in my terminal (even copy-pasting), I could not enter those directories.

<spoiler_alert>We'll have to create such a directory for the demonstration, which lets the cat out of the bag on the solution</spoiler_alert>

Let's create a directory named €uro, then enter it (as said earlier, this is a piece of cake when sitting in front of a properly configured host, and everything below is meaningless) :

  1. let's start by finding the octal code of the symbol here : 240
  2. echo -e "\0240uro"
    �uro
  3. create the directory : mkdir "$(echo -e "\0240uro")"
  4. display it : ls -l will return
    drwxr-xr-x 2 bob users 4,0K déc.	3 17:17 ?uro/
  5. enter it : cd "$(ls -d *uro)"

Actually, cd $(ls -d ?uro) would have done the job perfectly but might be confusing : ls -l "displays" the directory name with a ? because the terminal itself is not able to display it, whereas the ? in the cd $(ls ...) command is a wildcard representing a single character.
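
The whole scenario can be replayed in a sandbox. In this sketch, the weird character is the control character \001 instead of the one used above : this is an assumption made only because such a byte is accepted whatever the locale. The wildcard can also be given to cd directly, without ls.

```shell
#!/bin/bash
workDir=$(mktemp -d)
weirdName=$(printf '\001uro')       # 1st character is not printable
mkdir "$workDir/$weirdName"

ls -b "$workDir"                    # -b shows the special character as an escape code

cd "$workDir"/*uro || exit 1        # the wildcard matches whatever we can't type
if [ "${PWD##*/}" = "$weirdName" ]; then entered=yes; else entered=no; fi
echo "entered : $entered"

cd - >/dev/null
rm -r "$workDir"
```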

How to chain shell commands ?

There are several flavors of command chaining, each of them having its own interest :
command1; command2; command3
Run command1, then command2, then command3. The exit status is that of command3.
command1 & command2 & command3
All commands are started at once (command1 and command2 in the background) and return results as they arrive.
Try it : { sleep 2; echo "2s"; } & { sleep 1; echo "1s"; } &
command1 && command2
command2 is executed only if command1 exits on success.
Try it : true && echo 'I said TRUE.'; false && echo 'I said FALSE.'
command1 || command2
command2 is executed only if command1 exits on failure.
Try it : true || echo 'I said TRUE.'; false || echo 'I said FALSE.'
command1 && commandIfSuccess || commandIfFailure
Based on success / failure of command1, execute either commandIfSuccess or commandIfFailure
Albeit extremely convenient, this construct comes with serious warnings.
Compound &&'s :
true && true && echo 'Hello world'
Hello world
true && false && echo 'Hello world'
(nothing)
true && echo 'Hello' && echo 'world'
Hello
world
true && { false; echo 'hello'; }
hello
This is the typical case where the success of a single command triggers n extra commands. More on {} and ()

Warnings about the cmd1 && cmd2 || cmd3 construct :

  • this construct works as commandIfSuccess / commandIfFailure only as long as cmd2 never fails (this is where the magic comes from). Check it (source) :
    for cmd1 in true false; do for cmd2 in true false; do echo -e "\n$cmd1 && $cmd2 || cmd3"; (echo 'cmd1'; $cmd1) && (echo 'cmd2'; $cmd2) || echo 'cmd3'; done; done
  • from the example above, you must remember that, if cmd2 fails, both cmd2 and cmd3 will be executed :
    • true && { echo 'hello'; true; } || { echo 'world'; }
      hello
    • true && { echo 'hello'; false; } || { echo 'world'; }
      hello
      world
  • All of this is because lists of commands using && and || are executed with left associativity (What is left associativity ?)
Remember : cmd1 && cmd2 || cmd3 ...
  • is a hack simulating a ternary operator
  • may be used only when cmd2 is a command that cannot fail, typically something basic like an echo or setting a variable
  • should be replaced by
    if ...; then
    	...
    else
    	...
    fi
    otherwise
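
To compare both forms side by side, here is a sketch where cmd2 always fails (cmd2_fails and cmd3 are hypothetical functions that just record that they ran) :

```shell
#!/bin/bash
cmd2_fails() { steps+='2'; return 1; }
cmd3()       { steps+='3'; }

# The "ternary" hack : cmd3 runs although cmd1 (true) succeeded
steps=''
true && cmd2_fails || cmd3
withHack=$steps

# The if / else form : the else branch depends on cmd1 only
steps=''
if true; then
    cmd2_fails || steps+='!'    # the failure of cmd2 is handled explicitly
else
    cmd3
fi
withIf=$steps

echo "hack : $withHack / if-else : $withIf"
```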

How to get the IP address of the current host from a script ?

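Many articles answer this question by suggesting to parse the output of ifconfig (ip a should be preferred, actually) with the usual grep, cut and awk.

This is very nice, but the result is already available with : hostname -i. Note that hostname -i relies on name resolution, so it may return a loopback address on some hosts. Where available, hostname -I lists all the addresses configured on the network interfaces instead.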

How to run a resource-friendly command ?

Start a single process :

nice -n 19 myResourceGreedyCommand & ionice -c 3 -p $!

Start a CPU + HDD hungry find while limiting its impact on the system :

for item in list of items; do
	find haystack [find options] & pidFind=$!
	echo "Nicing"
	ionice -c 3 -p "$pidFind"
	renice -n 19 -p "$pidFind"
	echo "Finding..."
	wait "$pidFind"
done

This comes from a very specific real-life use case. It may not be reused as-is, but you get the idea.
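
Another option is to apply both priorities before the command even starts, since ionice and nice can launch a command themselves. A sketch (run_gently is a hypothetical helper; the -t flag of the util-linux ionice, which means "run the command even if the I/O priority could not be set", is assumed to be available) :

```shell
#!/bin/bash
run_gently() {
    if command -v ionice >/dev/null 2>&1; then
        # idle I/O class + lowest CPU priority
        ionice -t -c 3 nice -n 19 "$@"
    else
        # no ionice on this host : lowest CPU priority only
        nice -n 19 "$@"
    fi
}

# Without argument, nice prints the niceness it is running with
observedNiceness=$(run_gently nice)
echo "niceness inside run_gently : $observedNiceness"
```

Usage would then be : run_gently find haystack [find options]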