I like Bash scripting - Scripting's my favorite


The while read construct

Typical case : reading a file line by line :

tmpFile=$(mktemp)
echo -e 'Bob\nKevin\nStuart' > "$tmpFile"
while read name; do
	echo "Hello, '$name' !"
done < "$tmpFile"
rm "$tmpFile"
Hello, 'Bob' !
Hello, 'Kevin' !
Hello, 'Stuart' !

Preserve leading and trailing spaces with IFS= :

echo -e ' Bob \n Kevin \n Stuart ' > "$tmpFile";
while read name; do
	echo "Hello, '$name' !"
done < "$tmpFile"
while IFS= read name; do
	echo "Hello, '$name' !"
done < "$tmpFile"
rm "$tmpFile"
Hello, 'Bob' !
Hello, 'Kevin' !
Hello, 'Stuart' !
Hello, ' Bob ' !
Hello, ' Kevin ' !
Hello, ' Stuart ' !

Trying to understand the -r :

echo -e 'apples\t12\nbananas\t3\ncoconuts\t42' > "$tmpFile";
while read fruit number; do
	echo -e "Number of '$fruit' : '$number'"
done < "$tmpFile"
while read -r fruit number; do
	echo -e "Number of '$fruit' : '$number'"
done < "$tmpFile"
rm "$tmpFile"
Number of 'apples' : '12'
Number of 'bananas' : '3'
Number of 'coconuts' : '42'
Number of 'apples' : '12'
Number of 'bananas' : '3'
Number of 'coconuts' : '42'
Makes no difference here : -r only stops read from treating backslashes as escape characters, and this data contains none.
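
The -r option only matters when the input contains backslashes. Here is a minimal sketch with hypothetical sample data (a Windows-style path, not taken from the example above) where both forms of read give different results :

```bash
#!/usr/bin/env bash
# Hypothetical sample : a string containing backslashes.
sample='C:\temp\new'

# Without -r, read treats each backslash as an escape character and drops it.
read withoutR <<< "$sample"
# With -r, backslashes are kept literally.
read -r withR <<< "$sample"

printf '%s\n' "without -r : '$withoutR'"
printf '%s\n' "with -r    : '$withR'"
```

The first line displays 'C:tempnew', the second one 'C:\temp\new', which is why -r is the usual choice when reading raw data.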

It also works with process substitution :

while read line; do
	echo "$line"
done < <(ps -u $(whoami) | head -10)
Remember : done < <(command)

... and with here-strings too :

while read line; do
	echo "$line"
done <<< "$(ps -u $(whoami) | head -10)"
Remember : done <<< "$(command)" (the quotes keep the newlines intact on every Bash version)

The if then else fi construct

Because I can never remember this construct :
if condition; then
	commands
else
	other commands
fi
There's no need for { } surrounding the then and else blocks.

Script error : Bad substitution

Situation :

Script execution fails :
./myScript.sh: errorLine: ./myScript.sh: Bad substitution

Details :

myScript.sh tries to use a parameter expansion (such as ${var:offset:length}, ${var//pattern/replacement} or ${!var}) that is not supported by the shell used to execute it.

Solution :

  1. Have a look at the shebang line of myScript.sh. It's very likely that you'll see something like #!/bin/sh, because the author of this script "has always done like this and never had problems"
  2. Change the shebang to the shell you'd like to interpret myScript.sh (typically : #!/bin/bash)

cut vs awk : which is the fastest to extract data from a CSV ?

Situation :

I work a lot with CSV files, and one of the recurrent tasks is to extract the value of the nth field from the current line ($line). There are several methods to do so : Is one faster than the other ?

Details :

Let's script this :
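#!/usr/bin/env bash

# Values taken from the sample output below, except fieldSeparator which is assumed.
nbLines=5000
nbFieldsPerLine=100
fieldSize=10
fieldToExtract=76
fieldSeparator=','

tmpFile=$(mktemp --tmpdir='/run/shm')
resultFile=$(mktemp --tmpdir='/run/shm')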

showStep() {
	local stepDescription=$1
	echo -e "\n$stepDescription"
}

showMethod() {
	local methodDescription=$1
	showStep "Getting the ${fieldToExtract}th field with the '$methodDescription' method"
}

getDurationOfAction() {
	{ time "$1"; } 2>&1 | awk '/real/ { print $2 }'
}

showStep "Preparing source data : $nbLines lines of $nbFieldsPerLine fields ($fieldSize characters each)"
for((i=0; i<nbLines; i++)); do
	pwgen "$fieldSize" -N "$nbFieldsPerLine" -1 | xargs | tr ' ' "$fieldSeparator" >> "$tmpFile"
done

showMethod 'variable=$(echo | cut)'
getData_echoCutVariableMethod() {
	while IFS= read -r line; do
		data=$(echo "$line" | cut -d "$fieldSeparator" -f "$fieldToExtract")
	done < "$tmpFile"
}
getDurationOfAction getData_echoCutVariableMethod

showMethod 'echo | cut > resultFile'
getData_echoCutResultFileMethod() {
	while IFS= read -r line; do
		echo "$line" | cut -d "$fieldSeparator" -f "$fieldToExtract" > "$resultFile"
	done < "$tmpFile"
}
getDurationOfAction getData_echoCutResultFileMethod

showMethod 'variable=$(echo | awk)'
getData_echoAwkVariableMethod() {
	while IFS= read -r line; do
		# the field number is passed with -v instead of being spliced into the awk program
		data=$(echo "$line" | awk -F "$fieldSeparator" -v n="$fieldToExtract" '{ print $n }')
	done < "$tmpFile"
}
getDurationOfAction getData_echoAwkVariableMethod

showMethod 'echo | awk > resultFile'
getData_echoAwkResultFileMethod() {
	while IFS= read -r line; do
		echo "$line" | awk -F "$fieldSeparator" -v n="$fieldToExtract" '{ print $n }' > "$resultFile"
	done < "$tmpFile"
}
getDurationOfAction getData_echoAwkResultFileMethod

rm "$tmpFile" "$resultFile"
Preparing source data : 5000 lines of 100 fields (10 characters each)

Getting the 76th field with the 'variable=$(echo | cut)' method

Getting the 76th field with the 'echo | cut > resultFile' method

Getting the 76th field with the 'variable=$(echo | awk)' method

Getting the 76th field with the 'echo | awk > resultFile' method

Solution :

Summary :

  • cut is faster than awk
  • work on a RAMdisk whenever possible
  • remember cut can extract all specified fields from a file in a single operation :
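
As a sketch of that last point : when the whole file is the input, there is no need for a loop at all. The file contents and the field numbers below are hypothetical, for illustration only :

```bash
#!/usr/bin/env bash
# Hypothetical CSV data, written to a temporary file.
csvFile=$(mktemp)
printf '%s\n' 'a1,b1,c1,d1' 'a2,b2,c2,d2' > "$csvFile"

# A single cut process reads the whole file and keeps fields 2 and 4 of every line.
extracted=$(cut -d ',' -f 2,4 "$csvFile")
echo "$extracted"

rm "$csvFile"
```

This displays b1,d1 then b2,d2 : one process for the whole file instead of one (or more) per line.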

Alternate method for parsing a CSV file (source) :

The methods shown above retrieve CSV data fields in a 2-step process :
  1. read the line from the CSV file
  2. split the line into fields and store values in the corresponding variables
It is possible to get CSV values in a single operation like this :
while IFS=, read -r field1 field2; do
	# do something with "$field1" and "$field2"
done < input.csv
Depending on the context, one of these methods may be more appropriate :
  • The cut / awk methods are not the "pure Bash" ones but they can prove useful when the source has MANY fields (like a log file) and you're only interested in SOME of them, not ALL.
  • IFS= + read -r is the "proper" way of doing this, but every field up to the last one you need must be given a name, even if it is not used inside the loop (the last variable receives the rest of the line). This can make the while... line long and hurt code readability when there are MANY data fields.
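
Two possible ways to soften the "name ALL the fields" constraint, as a sketch only (the record and the variable names are hypothetical) : use _ as a throwaway name, or let read -a fill an indexed array.

```bash
#!/usr/bin/env bash
# Hypothetical record with 5 fields ; only the 2nd and 4th are of interest.
record='alice,42,paris,admin,2024'

# "_" is a throwaway name for unwanted fields ; the last variable ("rest")
# receives everything found after the 4th separator.
IFS=, read -r _ age _ role rest <<< "$record"
echo "age='$age' role='$role' rest='$rest'"

# Alternative : read -a splits the whole line into an indexed array.
IFS=, read -r -a fields <<< "$record"
echo "4th field : ${fields[3]}"
```

The array form keeps the while... line short whatever the number of fields, at the cost of referring to fields by index rather than by name.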

The case esac construct

something='hello world'
case $something in
	pattern1)
		echo 'foo'
		;;
	pattern2)
		echo 'bar'
		;;
esac
Here pattern1 and pattern2 stand for any shell patterns.
The word ($something) :
  • this is what the various cases will be matched against
  • tilde expansion, parameter expansion, command substitution, arithmetic expansion, and quote removal are performed before matching is attempted
The patterns :
  • shell pattern matching (not regexp), do not use quotes (details and examples)
  • tilde expansion, parameter expansion, command substitution, and arithmetic expansion are performed before matching is attempted
The ;; terminators :
  • the ;; are mandatory between clauses. Omitting them does not create a "fall-through" like in C or with the PHP switch (Bash 4 and later offer ;& for that purpose)
  • the first pattern that matches is executed, the following ones are skipped
Some of the extended pattern matching operators require the extglob shell option to be enabled (and will produce shell errors otherwise) :
shopt -s extglob
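
A minimal sketch of a complete case construct, mixing plain and extended patterns. The classify function and its categories are hypothetical, for illustration only :

```bash
#!/usr/bin/env bash
# Must be enabled before the patterns below are parsed.
shopt -s extglob

# classify is a hypothetical helper used only for this illustration.
classify() {
	case "$1" in
		+([0-9]))   echo 'integer' ;;   # extglob : one or more digits
		@(yes|no))  echo 'answer' ;;    # extglob : exactly one of the alternatives
		hello*)     echo 'greeting' ;;  # plain shell pattern
		*)          echo 'unknown' ;;   # default clause
	esac
}

classify 2024
classify yes
classify 'hello world'
classify '?'
```

The 4 calls display integer, answer, greeting and unknown. The clauses are tried from top to bottom, so the catch-all * has to come last.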

Bash functions

Function declaration syntax : function foo() {} or foo() {} (source) ?

There is (almost) no difference when working on GNU/Linux :

foo() {}
  • is the POSIX syntax
  • is more portable
  • is less likely to fail when moving scripts to other shells/proprietary Unices

Possible function declaration syntaxes

foo() commands
  • POSIX syntax
  • Supported by Bourne-like shells
function foo { commands; }
  • Korn shell syntax
  • Supported by Bash and Zsh for compatibility with Ksh
function foo() { commands; }
do not use this

More about shell functions (source)

List existing functions

You may have created + sourced some shell functions, either inline or from files such as ~/.bashrc and ~/.bash_aliases. You can list them all with : declare -F

Find where a function is defined

  • shopt -s extdebug; declare -F functionName
    functionName	lineNumber	/path/to/functions.sh
  • bash --debugger; declare -F functionName
    functionName	lineNumber	/path/to/functions.sh

View the code of a function

declare -f functionName
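
The 3 commands above can be tried together with this sketch (sayHello is a hypothetical function defined only for the demonstration) :

```bash
#!/usr/bin/env bash
# sayHello is a hypothetical function defined only for this illustration.
sayHello() {
	echo "Hello, $1 !"
}

# List the names of all the functions known to the current shell.
declare -F

# With extdebug enabled, declare -F also reports the line number and the source file.
shopt -s extdebug
declare -F sayHello
shopt -u extdebug

# Display the code of the function, as reformatted by Bash.
declare -f sayHello
```

Note that declare -f shows the code as Bash stored it, so the indentation and comments may differ from the source file.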

Functions and variable scope :

As stated here (source ?) :
  • All variables declared inside a function will be shared with the calling environment.
  • All variables declared local will not be shared.
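
A short sketch of both rules (the function and variable names are hypothetical) :

```bash
#!/usr/bin/env bash
# setVariables is a hypothetical function defined only for this illustration.
setVariables() {
	# no "local" : still visible to the caller after the function returns
	sharedVar='set inside the function'
	# "local" : disappears when the function returns
	local privateVar='local to the function'
}

setVariables
echo "sharedVar  : '${sharedVar}'"
echo "privateVar : '${privateVar:-<unset>}'"
```

After the call, sharedVar still holds its value whereas privateVar is unset, which is why declaring function variables local avoids accidental clashes with the caller's variables.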

Bash scripting : using the right shebang

The shebang is the first line of a script (shell, Python, Perl, ...), which tells the operating system which binary should be used to interpret and execute the script commands. Shebangs start with #!, optionally followed by a space.
As for shell scripts, and especially Bash scripts, there are several flavors of shebangs :

Shebang : #!/bin/sh
  Pros :
  • short and simple
  Cons :
  • expects /bin/sh to be a symlink to /bin/bash, which is common but not mandatory / may change
  • works only if not using Bash-specific commands or options. Otherwise, this could lead to weird bugs / undesired side-effects, which is why I discourage using this method : the #!/bin/bash shebang is safer with only 2 extra keystrokes
Shebang : #!/bin/bash
  Pros :
  • calling THE Bash binary with its absolute path : short, simple, efficient. This is the safest.
  Cons :
  • some may argue this is less portable because the Bash binary may not be /bin/bash but /usr/bin/bash or /usr/local/bin/bash or ... (but I guess symlinks would be created adequately in such situations anyway)
Shebang : #!/usr/bin/env bash
  Pros :
  • finds the Bash binary wherever it is (env runs the first bash found in $PATH) : different install path (system-wide), customized path (user-level setting). This is more portable.
  Cons :
  • could be used to execute a rogue Bash binary
  • users with customized $PATH could refer to different binaries and experience different behaviors of the same script


Bash loops

for loops :

More examples

  • ugly : start=37; stop=73; increment=7; for i in $(eval echo "{$start..$stop..$increment}"); do echo $i; done (source)
  • better : start=37; stop=73; increment=7; for((i=start; i<=stop; i+=increment)); do echo $i; done

for vs while (source) :

  • ugly : for line in $(cat file.txt); do echo $line; done (despite the variable name, this iterates over words, not lines)
  • ugly again (UUOC) : cat file.txt | while read line; do echo $line; done
  • better : while read line; do echo $line; done < file.txt (more)
    This is better because you don't need to spawn a sub-process with |, or with $(...), or start the external cat command.
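
Beyond performance, both forms don't iterate over the same things. A small sketch with a hypothetical 2-line input file (and the default IFS) :

```bash
#!/usr/bin/env bash
# Hypothetical input file : 2 lines, each containing a space.
inputFile=$(mktemp)
printf '%s\n' 'first line' 'second line' > "$inputFile"

# for + $(cat) splits on every whitespace : 4 iterations.
nbForLoops=0
for word in $(cat "$inputFile"); do
	nbForLoops=$((nbForLoops+1))
done

# while read consumes one line per iteration : 2 iterations.
nbWhileLoops=0
while IFS= read -r line; do
	nbWhileLoops=$((nbWhileLoops+1))
done < "$inputFile"

echo "for : $nbForLoops iterations, while : $nbWhileLoops iterations"
rm "$inputFile"
```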

An until loop :

a=5; until [ "$a" -eq 2 ]; do echo $a; a=$((a-1)); done

until ping -c 1 "$host" &>/dev/null; do echo -n '.'; sleep 1; done; echo OK    # "$host" : the machine to wait for

needle='some text'
tmpFile=$(mktemp --tmpdir tmp.XXXXXXXX)
loops=0
# "$url" must hold the address of the page to poll
until grep -i "$needle" "$tmpFile"; do
	wget "$url" -O "$tmpFile"
	loops=$((loops+1))
done
echo "Number of loops : $loops"

Bash arrays

Indexed arrays (source)

Initialize an array :

declare -a myArray=(john paul george ringo)

alternate syntax :
declare -a myArray

myArray=(john paul george ringo)

Display array values :

all at once :
echo "${myArray[@]}"
john paul george ringo
a single one :
echo "${myArray[2]}"
george

Browse array in a for loop :

for item in "${myArray[@]}"; do
	echo "$item"
done

Append a value to the array :

myArray+=(the_5th_guy)
echo "${myArray[@]}"
john paul george ringo the_5th_guy

Associative arrays (source)

With Bash 4, it is possible to use associative arrays (but not nested associative arrays : source).
As an alternative to associative arrays, you can use tuples.

Create an associative array :

declare -A myArray
myArray[foo]='this is "foo"'
myArray[bar]='this is "bar"'
myArray[baz]='this is "baz"'

Browse by keys :

for key in "${!myArray[@]}"; do
	echo -e "key :\t$key"
	echo -e "value :\t${myArray[$key]}\n"
done
key :	bar
value :	this is "bar"

key :	baz
value :	this is "baz"

key :	foo
value :	this is "foo"

Browse by values :

for value in "${myArray[@]}"; do
	echo -e "value :\t$value\n"
done
value : this is "bar"

value : this is "baz"

value : this is "foo"


Variables holding a path in shell scripts

This article is about coding style, which is not only personal but also far from perfect (by design). When it comes to declaring a variable in a shell script to store a path, I see at least 2 methods to do so, and so far I've not settled on one or the other (so writing this article may help) :

Method 1 : the trailing / is in the value :



Method 2 : no trailing / in the value :


Method 1 (the trailing / is in the value) :
  Pros :
  • Error using variable : somePath='/path/to/directory/' then baz="$somePath/baz" (instead of baz="${somePath}baz") would be interpreted as /path/to/directory//baz : ugly but works
  Cons :
  • Error declaring variable : somePath='/path/to/directory' (missing trailing /) then baz="${somePath}baz" will be interpreted as : /path/to/directorybaz : error
  • The { and } are slow to type, and add some visually-cryptic-characters around the variable name which is bad for readability.
Method 2 (no trailing / in the value) :
  Pros :
  • Error declaring variable : somePath='/path/to/directory/' (extra trailing /) then baz="$somePath/baz" will be interpreted as : /path/to/directory//baz : ugly but works
  • Better readability
  • Fewer keystrokes
  Cons :
  • Error using variable : somePath='/path/to/directory' then baz="${somePath}baz" (instead of baz="$somePath/baz") would be interpreted as /path/to/directorybaz : error
    This is less likely to happen because of the extra keystrokes involved.

Counting keystrokes :

Method 1 is 70 characters long, and method 2 is 67. But after removing characters that are typed in both methods (variable names, =, $, quotes, ...), we get (with { and } being 2 keystrokes each on a French keyboard) :

Conclusion :

Let's settle on method 2 ! (until we find further arguments)
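
Whatever the method, the "error declaring variable" cases can be made harmless by normalizing the directory before use. This is only a sketch, and joinPath is a hypothetical helper, not something the methods above rely on :

```bash
#!/usr/bin/env bash
# joinPath is a hypothetical helper : it strips one trailing "/" from the
# directory (if any) before appending the file name.
joinPath() {
	local directory=$1 fileName=$2
	echo "${directory%/}/$fileName"
}

joinPath '/path/to/directory'  'baz'
joinPath '/path/to/directory/' 'baz'
```

Both calls display /path/to/directory/baz, with or without the trailing / in the declaration.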


How to load tuples in a shell script ?

data='key1;value1 key2;value2'; for tuple in $data; do key=$(echo $tuple | cut -d ';' -f 1); value=$(echo $tuple | cut -d ';' -f 2); echo "tuple: '$tuple', key : '$key', value : '$value'"; done
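
The same split can be sketched without spawning echo and cut for every tuple, using parameter expansion only (same data as above) :

```bash
#!/usr/bin/env bash
data='key1;value1 key2;value2'

for tuple in $data; do
	# remove the longest suffix starting at the first ";"
	key=${tuple%%;*}
	# remove the shortest prefix ending at the first ";"
	value=${tuple#*;}
	echo "tuple: '$tuple', key : '$key', value : '$value'"
done
```

This relies on the separator not appearing inside the key, which is also what the cut version assumes.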

Real-world example : testing a list of user accounts on an FTP server :

ftpHost='my.ftp.server'; ftpPort=21; data='joe;123456 jack;password william;qwerty averell;averell'; for tuple in $data; do login=$(echo $tuple | cut -d ';' -f 1); password=$(echo $tuple | cut -d ';' -f 2); echo "Trying account '$login/$password' :"; curl --insecure --ftp-ssl --ftp-pasv --user "$login:$password" "ftp://$ftpHost:$ftpPort/"; echo; done


How to use booleans in Bash ?

Playing with true and false (source) :

theWorldIsFlat=true
# ...do something interesting...
if $theWorldIsFlat; then
	echo 'Be careful not to fall off!'
fi

Bash doesn't know booleans. This hack works because the variable expands to true at run time, and true is a command that returns a Unix success (the same goes for false, which returns a Unix failure).
If you're not convinced that the value of the variable is executed as a command, try these :

myVariable=true; if $myVariable; then echo OK; fi
myVariable=false; if $myVariable; then echo OK; fi
myVariable=ooops; if $myVariable; then echo OK; fi
bash: ooops : command not found

More fun ?

true && true && echo A || echo B
A
true && false && echo A || echo B
B

Testing return codes :

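#!/usr/bin/env bash

UNIX_SUCCESS=0
UNIX_FAILURE=1

returnBoolean() {
	local wantedReturnValue=$1
	case "$wantedReturnValue" in
		"$UNIX_SUCCESS")
			return "$UNIX_SUCCESS" ;;
		"$UNIX_FAILURE")
			return "$UNIX_FAILURE" ;;
	esac
}

for result in $UNIX_SUCCESS $UNIX_FAILURE; do
	returnBoolean $result
	returnCode=$?
	[ "$returnCode" -eq "$result" ] && echo OK || echo KO
done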

set and unset variables :

unset a; [ "$a" ] && echo OK || echo KO; a=1; [ "$a" ] && echo OK || echo KO; a=0; [ "$a" ] && echo OK || echo KO


"Variable variables" a.k.a dynamic variables

Usage :

A dynamic variable is a variable holding the name of another variable.

Example :

A basic example :

#!/usr/bin/env bash

myVar='Hello'
dynamicVar='myVar'
echo "The value of '$dynamicVar' is '${!dynamicVar}'."
Will output :

The value of 'myVar' is 'Hello'.

Checking a list of variables :

This snippet checks that none of the variables from the list is an empty string :
for variableToCheck in variableA variableB variableC; do
	echo "The variable '$variableToCheck' has value : '${!variableToCheck}'."
	if [ -z "${!variableToCheck}" ]; then
		echo "The variable '$variableToCheck' is empty." >&2	# deal with it !
	fi
done

When checking the input of a script for missing parameters, it is better to simply count the parameters rather than to test all the variables. Indeed, when expecting command parameterA parameterB and getting command foo, you can't tell which of parameter A or B is missing (unless you're using named parameters, of course).
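
Counting parameters could look like this sketch, where checkParameters is a hypothetical function standing for a script that expects exactly 2 positional parameters :

```bash
#!/usr/bin/env bash
# checkParameters is a hypothetical function standing for a whole script.
checkParameters() {
	local nbExpected=2
	if [ "$#" -ne "$nbExpected" ]; then
		echo "Usage : command parameterA parameterB (got $# parameter(s))" >&2
		return 1
	fi
	echo "parameterA='$1', parameterB='$2'"
}

checkParameters foo bar
checkParameters foo || echo 'missing parameter detected'
```

In a real script, the same test would be made on the script's own $# and would end with exit rather than return.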

List all permutations of a list of lists, then list all items :

#!/usr/bin/env bash

listOfLists='colors fruits cars'
colors='red green blue'
fruits='apple banana grape'
cars='ferrari porsche lada'

# generate all permutations of the list of lists
for aList in $listOfLists; do
	for anotherList in $listOfLists; do
		[ "$anotherList" = "$aList" ] && continue
		for oneMoreList in $listOfLists; do
			[ "$oneMoreList" = "$anotherList" ] || [ "$oneMoreList" = "$aList" ] && continue

			output="$output\nLISTS: 1.$aList 2.$anotherList 3.$oneMoreList"

			# we now have all the list of lists, let's list the contents of all those lists
			for item1 in ${!aList}; do
				for item2 in ${!anotherList}; do
					for item3 in ${!oneMoreList}; do
						output="$output\n$item1 $item2 $item3"
					done
				done
			done
			# contents over
		done
	done
done

echo -e "$output" | column -s ' ' -t

Bash tests : [ ... -x ... ], [ ... = ... ], [[ ... =~ ... ]], ...

File operators (source 1, 2) :

Option true if ...
-b file is a block device special file, such as /dev/sda :
brw-rw---T 1 root disk 8, 0 sept. 22 14:23 /dev/sda
-d file is a directory
-e file exists, whatever type of file it is
-f file is a regular file, not a directory or a device
-h -L file is a symbolic link
-s file has data (i.e. is not zero-sized)
! logical NOT. Must be followed by a whitespace.

String operators (source) :

It is a safe practice to always quote tested strings.

Option Usage
-n string is not null, i.e. has length > 0
-z string is null, i.e. has length == 0
is equal to
[ "$a" = "$b" ]
[ "$a" == "$b" ]
[[ "$a" == "$b" ]]
is not equal to
[ "$a" != "$b" ]
[[ "$a" != "$b" ]]

Integer operators (source) :

is equal to
[ "$a" -eq "$b" ]
[[ "$a" -eq "$b" ]]
(( a == b ))
is not equal to
[ "$a" -ne "$b" ]
[[ "$a" -ne "$b" ]]
Inside [[ ]], the =, == and != operators compare strings, not integers.
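
String and integer comparisons can disagree on the same values. A sketch with hypothetical values chosen to show it :

```bash
#!/usr/bin/env bash
a='01'; b='1'

# String comparison : '01' and '1' are different strings.
[[ "$a" == "$b" ]] && stringResult='equal' || stringResult='different'

# Integer comparison with -eq.
[ "$a" -eq "$b" ] && integerResult='equal' || integerResult='different'

# Arithmetic evaluation is another way to compare integers.
(( a == b )) && arithmeticResult='equal' || arithmeticResult='different'

echo "string : $stringResult, -eq : $integerResult, (( )) : $arithmeticResult"
```

Only the string comparison reports the values as different, so -eq / -ne (or (( ))) are the ones to use when the operands are numbers.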

Logical operators (sources : 1, 2) :

For compound tests, you can use things like :
if [ $condition1 ] && [ $condition2 ]
if [[ $condition1 && $condition2 ]]
if [ $condition1 -a $condition2 ]
if [ $condition1 ] || [ $condition2 ]
if [[ $condition1 || $condition2 ]]
if [ $condition1 -o $condition2 ]

Unary if :

Some languages have a unary if operator :
if(true) {
Bash lets you mimic a unary if operator, but this looks error-prone (read below). These look more reliable :
  • if [ "$myVar" -eq "$UNIX_SUCCESS" ]; then
  • if [[ "$myVar" == "$UNIX_SUCCESS" ]]; then

Unary if and booleans :

Be very careful with these constructs as they may have counter-intuitive results :
true; echo $?; false; echo $?; [ true ]; echo $?; [ false ]; echo $?; [[ true ]]; echo $?; [[ false ]]; echo $?
  • The [ whatever ] construct is short for [ -n whatever ] and tests the string length (source, details, examples)
    [ true ]; echo $?; [ false ]; echo $?; [ hello ]; echo $?; [ '' ]; echo $?
    0		actually testing the non-empty string true, so UNIX_SUCCESS
    0		same thing with the non-empty string false
    0		same thing with the non-empty string hello
    1		testing an empty string, so UNIX_FAILURE
  • Same goes on when whatever evaluates to a string (i.e. command result or value returned by a function) : it will be checked with -n :
    [ $(true) ]; echo $?; [ $(false) ]; echo $?; [ $(echo hello) ]; echo $?; [ $(echo -n '') ]; echo $?
    1		true returns a UNIX_SUCCESS but displays nothing (i.e. empty string)
    1		similar reason for false
    0		echo will always display a string...
    1		...unless properly silenced 
  • If you just want to test an exit status, you don't need [ ], just chain commands with the proper operators.
  • Same goes on for [[ ]].
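
Testing an exit status without [ ] can be sketched like this (isEven is a hypothetical function that only communicates through its exit status) :

```bash
#!/usr/bin/env bash
# isEven is a hypothetical function : it displays nothing and only sets its exit status.
isEven() {
	(( $1 % 2 == 0 ))
}

# No [ ] needed : if runs the command and branches on its exit status.
if isEven 4; then
	echo '4 is even'
fi

# Same idea with the && and || operators.
isEven 7 && echo '7 is even' || echo '7 is odd'

# Wrapping the call into [ $(...) ] would test its (empty) output instead.
[ $(isEven 4) ] && echo 'not reached' || echo 'empty output, so the test fails'
```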

Regex operators :

Test a match (source : 1, 2) :

filesystemSize='1234M'; [[ "$filesystemSize" =~ ^[0-9]+[KMGT] ]] && echo match || echo 'no match'
match
filesystemSize='123'; [[ "$filesystemSize" =~ ^[0-9]+[KMGT] ]] && echo match || echo 'no match'
no match
ip=''; regexp="([0-9]{1,3}\.){3}[0-9]{1,3}"; [[ $ip =~ $regexp ]] && echo OK || echo KO
How to check a string against a list of values ?
  • The regular expression itself mustn't be quoted.
  • If the regular expression becomes slightly complex (or contains a Bash variable), it should be stored in an extra variable.
  • Should the regular expression contain a SPACE character, it must be escaped : \ .

Test a no-match (source) :

string='abcdef'; [[ ! "$string" =~ 123 ]] && echo A || echo B
string='abcd123ef'; [[ ! "$string" =~ 123 ]] && echo A || echo B

Retrieve matched substrings (source) :

string='abcdef'; [[ $string =~ .(.).(.).(.) ]]; for i in {1..3}; do echo -n ${BASH_REMATCH[$i]}; done
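
A slightly more practical sketch, reusing the filesystemSize value seen earlier : the regular expression is stored in a variable, and the parenthesized groups are read back from BASH_REMATCH (the number and unit variable names are hypothetical) :

```bash
#!/usr/bin/env bash
filesystemSize='1234M'
# The regular expression is stored in a variable and used unquoted after =~.
regexp='^([0-9]+)([KMGT])$'

if [[ "$filesystemSize" =~ $regexp ]]; then
	# BASH_REMATCH[0] is the whole match, [1] and [2] are the parenthesized groups.
	number=${BASH_REMATCH[1]}
	unit=${BASH_REMATCH[2]}
	echo "number : '$number', unit : '$unit'"
fi
```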

About test, [ and [[ (source : 1, 2) :

[ (test command) and [[ ("new test" command) are used to evaluate expressions :

  • [ and test are available in POSIX shells.
  • [[ works only in Bash, Zsh and the Korn shell, and is more powerful.


Bash script flags

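Many flags can be raised or lowered to alter the behavior of Bash scripts (defensive scripting, debugging, ...).
Flags can be raised by 2 means : with set -flag inside the script, or by passing -flag to the interpreter (bash -flag myScript.sh, or on the shebang line).
To lower a flag : set +flag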
Flag Usage
-e Exit immediately if a simple-command exits with a non-zero status.
-n read commands but do not execute them
-u leave script and display an error message when using an unset variable
-v show the code as it is read
-x show the code as it is executed
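
A quick sketch of the -u and -n flags. Each snippet runs in a child Bash so that the flags don't affect the current shell ; unsetVariable is assumed not to exist in the environment :

```bash
#!/usr/bin/env bash
# Without -u, an unset variable silently expands to an empty string.
bash -c 'echo "value : [$unsetVariable]"'

# With -u, the same expansion is an error and the rest of the snippet is skipped.
uStatus=0
bash -u -c 'echo "value : [$unsetVariable]"; echo "not reached"' 2>/dev/null || uStatus=$?
echo "exit code with -u : $uStatus"

# -n only checks the syntax : nothing is executed, so nothing is displayed.
bash -n -c 'echo "never displayed"'
```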

Notes and examples with -e (details) :

-e is ignored for :

  • commands tested in the condition of an if, until or while block
  • commands of a && or || list, except the last one
  • commands whose return value is inverted via !

set -e; echo -n 'hello'; true; echo ' world'
Outputs : hello world
echo "set -e; echo -n 'hello'; false; echo ' world'" | bash
Outputs : hello (the echo | bash hack is just to be able to see the result, since because of the false, an exit is executed, forcing to leave the current shell)
set -e; dir='/tmp'; if [ -d "$dir" ] ; then echo "$dir exists"; else echo "$dir does not exist"; fi
Outputs : /tmp exists
No non-success exit code met, -e keeps sleeping.
set -e; dir='/aDirThatDoesNotExist'; if [ -d "$dir" ] ; then echo "$dir exists"; else echo "$dir does not exist"; fi
Outputs : /aDirThatDoesNotExist does not exist
A non-success exit code is met, but -e is muzzled by if.
set -e; dir='/tmp'; [ -d "$dir" ] && echo "$dir exists" || echo "$dir does not exist"
Outputs : /tmp exists
No non-success exit code met, -e keeps sleeping.
set -e; dir='/aDirThatDoesNotExist'; [ -d "$dir" ] && echo "$dir exists" || echo "$dir does not exist"
Outputs : /aDirThatDoesNotExist does not exist
A non-success exit code is met, but -e is muzzled by the && / || list.
#!/usr/bin/env bash
set -e
echo -n 'hello'
true
echo ' world'
Outputs : hello world
#!/usr/bin/env bash
set -e
echo -n 'hello'
false
echo ' world'
Outputs : hello, and returns the exit code 1
#!/usr/bin/env bash
set -e
echo -n 'hello'
if true; then
	echo -n ' wonderful'
fi
echo ' world'
Outputs : hello wonderful world
#!/usr/bin/env bash
set -e
echo -n 'hello'
if false; then
	echo -n ' wonderful'
fi
echo ' world'
Outputs : hello world, and returns the exit code 0.
A non-success exit code is met, but -e is muzzled by if.
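
The cases where -e fires or stays quiet can be compared side by side with this sketch. runWithE is a hypothetical helper : it runs each snippet in a child Bash started with -e, so the current shell is never closed.

```bash
#!/usr/bin/env bash
# runWithE is a hypothetical helper : it runs a snippet in a child Bash
# started with -e, then displays the exit code of that child.
runWithE() {
	local exitCode=0
	bash -e -c "$1" || exitCode=$?
	echo " [exit code : $exitCode]"
}

runWithE 'echo -n "plain false :";   false;                echo -n " still running"'
runWithE 'echo -n "if false :";      if false; then :; fi; echo -n " still running"'
runWithE 'echo -n "false || true :"; false || true;        echo -n " still running"'
runWithE 'echo -n "! true :";        ! true;               echo -n " still running"'
```

Only the first snippet stops before "still running" and reports the exit code 1 ; the 3 others run to the end with the exit code 0.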

Opportunity for a joke :

If you run set -e in a terminal, this will affect the current shell and any further command your "victim" types. At the 1st non-success return code met (which is VERY easy : try TAB-completing like cd TAB), an exit will be fired, closing the terminal.

If you _unintentionally_ run that joke on yourself, you can disable the -e flag with : set +e