I like Bash scripting - Scripting's my favorite (HowTo's)

How to (re-)indent code automatically ?

Sadly, we often have to dive into old + cluttered + hardly readable code that is not even indented properly (or not indented at all). One of the first things to do, then, is to improve readability with proper indentation. This won't magically turn lead into gold, but it helps, especially if it can be done effortlessly.

This is what reIndentCode.sh is about. Turning this :
a ( b
c d [ e ] f [
g h { j (
k ) } l m
n ] o ) p
q r
into this :
a ( b
	c d [ e ] f [
		g h { j (
				k ) } l m
		n ] o ) p
q r
This shell script is a quick-n-dirty solution I implemented when dealing with a 50KiB Perl script going awry. I bet it has many limitations, but it can be a starting point for next time / different situations.

How to get a cascading list of owner + group + permissions for a specified file ?

Situation :

Sometimes you end up in a situation where you cannot write into /this/is/a/very/long/pathTo/myFile, even though myFile has the proper bits set. So it would be useful to have a clear picture of owner + group + permissions for every directory, from the top down to myFile.

Solution :

The solution below works fine, so I'll leave it here (maybe it can become an inspiration for future needs), but there's actually a single command to do the very same thing : namei.
fullPathToStudy='/this/is/a/very/long/pathTo/myFile'; nbFields=$(echo "$fullPathToStudy" | grep -o '/' | wc -l); output=''; for ((i=1; i<=nbFields+1; i++)); do currentPath="$(echo "$fullPathToStudy" | cut -d '/' -f 1-$i)/"; [ -d "$currentPath" ] && output="$output\n$(ls -ld "$currentPath")"; done; [ -f "$fullPathToStudy" ] && output="$output\n$(ls -l "$fullPathToStudy")"; echo -e "$output" | awk '{print $1" "$3" "$4" "$NF}' | column -s ' ' -t
drwxr-xr-x	root	root		/
drwxr-xr-x	root	root		/this/
drwxr-xr-x	bob	developers	/this/is/
drwxr-xr-x	bob	developers	/this/is/a/
drwxr-xr-x	bob	developers	/this/is/a/very/
drwx------	bob	developers	/this/is/a/very/long/
drwx------	bob	developers	/this/is/a/very/long/pathTo/
-rw-------	bob	developers	/this/is/a/very/long/pathTo/myFile
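For reference, the one-liner above can also be written as a small function. This is only a sketch : listPathPermissions is a hypothetical name, it expects an absolute path, and (like the one-liner) it assumes that no path component contains spaces.

```shell
#!/usr/bin/env bash

# Hypothetical helper : prints permissions, owner, group and path of
# every component of an absolute path, from / down to the target.
listPathPermissions() {
	local target=$1
	local currentPath=''
	local component
	local -a components

	# Split the path on '/' (the leading '/' is removed first)
	IFS='/' read -r -a components <<< "${target#/}"

	{
		ls -ld /
		for component in "${components[@]}"; do
			[ -n "$component" ] || continue
			currentPath="$currentPath/$component"
			# Stop at the first component that does not exist (or cannot be reached)
			[ -e "$currentPath" ] || break
			ls -ld "$currentPath"
		done
	} | awk '{ printf "%-11s %-10s %-12s %s\n", $1, $3, $4, $NF }'
}

listPathPermissions '/this/is/a/very/long/pathTo/myFile'
```

If the path doesn't exist at all, only the line about / is displayed.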

How to find duplicate lines in a file ?

Situation :

Let's consider a file such as :
Lorem ipsum dolor sit amet,
consectetur adipiscing elit.
Etiam mollis viverra ligula,
Lorem ipsum dolor sit amet,
ut luctus magna imperdiet eget.
Ut consectetur laoreet venenatis.
Nulla euismod sapien nec sodales tempor.
Lorem ipsum dolor sit amet,
Suspendisse sagittis odio eu urna imperdiet,
vitae sollicitudin ante mattis.
How can I spot the duplicated lines ?

Solution :

echo -e "Lorem ipsum dolor sit amet,\nconsectetur adipiscing elit.\nEtiam mollis viverra ligula,\nLorem ipsum dolor sit amet,\nut luctus magna imperdiet eget.\nUt consectetur laoreet venenatis.\nNulla euismod sapien nec sodales tempor.\nLorem ipsum dolor sit amet,\nSuspendisse sagittis odio eu urna imperdiet,\nvitae sollicitudin ante mattis." | sort | uniq -c | sort -n | awk '$1 > 1 && $2 !~ "^(#.*)?$" {print $0}'

Same as above, with contents stored in a file :

sort fileWithDuplicateLines | uniq -c | sort -n | awk '$1 > 1 && $2 !~ "^(#.*)?$" {print $0}'
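If the counts are not needed, awk alone can list each duplicated line once, in order of appearance and without sorting. A minimal sketch (the function name listDuplicateLines is an assumption, and empty lines or comments are not filtered out here) :

```shell
#!/usr/bin/env bash

# Hypothetical helper : prints every line seen more than once,
# exactly one time (when its 2nd occurrence is read).
listDuplicateLines() {
	awk 'seen[$0]++ == 1'
}

printf '%s\n' 'Lorem ipsum' 'dolor sit amet' 'Lorem ipsum' 'consectetur' 'Lorem ipsum' | listDuplicateLines
```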

How to remove multi-line comments ?

Situation :

Considering this piece of code below, how can I remove the comments ?
not commented
/*comment part 1
comment part 2
comment part 3*/ but I want to keep the end of this line
not commented either
This snippet will be fed into lines hereafter :
echo -e 'not commented\n/*comment part 1\ncomment part 2\ncomment part 3*/ but I want to keep the end of this line\nnot commented either'

Solution :

There are plenty of different / context-specific / incompatible / incomplete ways to do this

tr | sed | tr method :

The idea is to :
  1. turn the whole input into a single giant line
  2. remove the comments
  3. turn what remains back into distinct lines
echo -e 'not commented\n/*comment part 1\ncomment part 2\ncomment part 3*/ but I want to keep the end of this line\nnot commented either' | tr '\n' 'X' | sed -r 's|(.*)/\*.*?\*/(.*)|\1\2|g' | tr 'X' '\n'
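Doesn't work if there are several distinct multi-line comments : sed regular expressions have no non-greedy quantifier, so .*? is still greedy and the leading (.*) swallows everything up to the last comment, which is the only one removed. Besides, any literal X found in the input would be turned into a newline :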
echo -e 'not commented\n/*comment 1 part 1\ncomment 1 part 2\ncomment 1 part 3*/ but I want to keep the end of this line\nnot commented either\n/* comment 2 part 1\ncomment 2 part 2\ncomment 2 part 3*/, keep this\nnot commented either.' | tr '\n' 'X' | sed -r 's|(.*)/\*.*?\*/(.*)|\1\2|g' | tr 'X' '\n'

Alternate solution :

Other (better !) method (inspired by) :

echo -e 'not commented\n/*comment 1 part 1\ncomment 1 part 2\ncomment 1 part 3*/ but I want to keep the end of this line\nnot commented either\n/* comment 2 part 1\ncomment 2 part 2\ncomment 2 part 3*/, keep this\nnot commented either.' | sed 's|/\*|\n&|g; s|*/|&\n|g' | sed '/\/\*/,/*\//d' | sed '/^$/d'

How to count occurrences of a character in a string ?

echo 'Hello world' | grep -o 'l' | wc -l
3
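The same count can be obtained without spawning grep and wc, with parameter expansion only. A sketch for single characters (countCharacter is a hypothetical name) : remove every occurrence of the character, then compare the lengths.

```shell
#!/usr/bin/env bash

countCharacter() {
	local string=$1
	local character=$2
	# Quoting the pattern makes it literal (so '*' or '?' can be counted too)
	local stringWithout=${string//"$character"/}
	echo $(( ${#string} - ${#stringWithout} ))
}

countCharacter 'Hello world' 'l'
```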

How to escape single quotes within single-quoted strings ?

Solution :

Details :

Explanation of how '"'"' is interpreted as just ' :

There are 3 successive quoted strings : aaa, b and ccc :
 aaa  b  ccc
'foo'"'"'bar'
^   ^^^^^   ^
1   23456   7
  • 1 (') : start 1st quotation (aaa) using single quotes
  • 2 (') : end 1st quotation
  • 3 (") : start 2nd quotation (b) using double-quotes
  • 4 (') : quoted character, the one we wanted to escape
  • 5 (") : end 2nd quotation
  • 6 (') : start 3rd quotation (ccc) using single quotes
  • 7 (') : end 3rd quotation
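A quick way to check this, together with two other spellings commonly used for the same result (a sketch ; the variable names are made up) :

```shell
#!/usr/bin/env bash

# 1. close the single quotes, add a double-quoted ', reopen the single quotes
withDoubleQuotes='foo'"'"'bar'
# 2. same idea, with a backslash-escaped ' between the two quoted parts
withBackslash='foo'\''bar'
# 3. Bash ANSI-C quoting, where \' is allowed inside $'...'
withAnsiCQuoting=$'foo\'bar'

echo "$withDoubleQuotes"
echo "$withBackslash"
echo "$withAnsiCQuoting"
```

The three echo commands display the same 7 characters : foo, a single quote, bar.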

How to concatenate strings ?

The good ol' method :

myString='hello'; myString="$myString world"; echo "$myString"
hello world

The += method :

  • This is a Bash-specific construct (i.e. /bin/bash only. It will fail in scripts starting with #!/bin/sh & al.)
  • Available since Bash 3.1 ; older versions don't support it
  • myString='hello'; myString+=" world"; echo "$myString"
    hello world
  • Unless the variable has been declared as an integer (declare -i), += concatenates, even when the values look like integers :
    value=42; value+=31; echo "$value"
    4231
  • To work around this and actually sum, use let :
    value=42; let value+=31; echo "$value"
    73
    You can also subtract, multiply and divide :
    value=42; let value-=40; echo "$value"; value=42; let value*=4; echo "$value"; value=42; let value/=2; echo "$value"
    2
    168
    21
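Two alternatives to let, shown here as a sketch : the (( )) arithmetic command, and the integer attribute set with declare -i, which changes what += does for that variable.

```shell
#!/usr/bin/env bash

value=42
(( value += 31 ))	# arithmetic context : this is a sum
echo "$value"

declare -i counter=42
counter+=31		# arithmetic addition too, because of 'declare -i'
echo "$counter"
```

Both echo commands display 73.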

How to get a file modification date ?

Please, please, PLEASE : don't parse the output of ls !!!

There are dedicated + convenient + reliable + simple commands to get a file modification date :
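For instance, with the GNU coreutils versions of stat and date (an assumption : these options differ on BSD / macOS). The function names are made up for this sketch :

```shell
#!/usr/bin/env bash

# Modification time as seconds since the Epoch (GNU stat)
getModificationEpoch() {
	stat -c '%Y' -- "$1"
}

# Modification time as a formatted date (GNU date : -r uses the file's modification time)
getModificationDate() {
	date -r "$1" '+%Y-%m-%d %H:%M:%S'
}

tmpFile=$(mktemp)
getModificationEpoch "$tmpFile"
getModificationDate "$tmpFile"
rm "$tmpFile"
```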

How to write all the outputs of a script into a file ?

Situation :

Solution 1 (badness=1) :

command1 > "/path/to/logFile" 2>&1
command2 >> "/path/to/logFile" 2>&1
command3 >> "/path/to/logFile" 2>&1
Does the job but will clutter your script making it barely readable. Ok for very short scripts.

Solution 2 (badness=100) :

(
command1
command2
command3
) > "/path/to/logFile" 2>&1
This hack looks so awkward I honestly couldn't have imagined that myself. I saw that in a 1000+ lines non-indented shell script full of UUoC, backticks value=`command`, parsing the output of ls, ...

Details :

Solution :

#!/usr/bin/env bash

showResult() {
	local blockType=$1
	echo "======== cat '$logFile' after '$blockType' block"
	cat "$logFile"
	echo '======== /cat'
	rm "$logFile"
	echo
	}


logFile="$0.log"
myVariable='initial value'

########## with '()' ##########
#	==> creates a subshell
echo "before () : $myVariable"
(
	myVariable='changed inside ()'
	echo 'hello'
	echo 'world'
	echo "in () : $myVariable"
) > "$logFile"
echo "after () : $myVariable"

showResult '()'


########## with '{}' ##########
#	==> no subshell
echo "before {} : $myVariable"
{
	myVariable='changed inside {}'
	echo 'hello'
	echo 'world'
	echo "in {} : $myVariable"
} > "$logFile"
echo "after {} : $myVariable"

showResult '{}'


########## with 'exec' ##########
#	==> uses file descriptors
echo "before exec : $myVariable"
exec 10>&1 20>&2 1>"$logFile" 2>&1

myVariable='changed inside exec'
echo 'hello'
echo 'world'
echo "in exec : $myVariable"

exec 2>&20 20>&- 1>&10 10>&-
echo "after exec : $myVariable"

showResult 'exec'
before () : initial value
after () : initial value
======== cat './test.sh.log' after '()' block
hello
world
in () : changed inside ()
======== /cat

before {} : initial value
after {} : changed inside {}
======== cat './test.sh.log' after '{}' block
hello
world
in {} : changed inside {}
======== /cat

before exec : changed inside {}
after exec : changed inside exec
======== cat './test.sh.log' after 'exec' block
hello
world
in exec : changed inside exec
======== /cat

Details :

exec can be used to redirect all the outputs. If, at some point of the script, we want to "stop redirecting the outputs", we have to (source) :

step | description | command | file descriptors
0 | before any redirection | (n/a) | 1 : /dev/stdout
1 | prepare for the recovery, then redirect | exec 3>&1 1>logFile | 3 : /dev/stdout, 1 : logFile
2 | use the output redirection : anything that should normally be written to screen goes to logFile | any command you like | same as step 1
3 | recover the original standard output (i.e. "stop redirecting") | exec 1>&3 3>&- | 1 : /dev/stdout, 3 : closed

Named file descriptors (aka "automatic file descriptor allocation")

Based on the example above :
  • I don't know whether there are restrictions (or risk of collisions) on file descriptor numbers...
  • file descriptors 3 to 9 are available (source)
  • starting from Bash 4.1 (May 2010) (source), it is possible (recommended?) to use "automatic file descriptor allocation" instead of picking numbers manually (source) :
#!/usr/bin/env bash

outFile=$(mktemp)

echo 'foo'

exec {fileDescriptor}>&1 1>"$outFile"
echo 'bar'

exec 1>&${fileDescriptor} {fileDescriptor}>&-
echo 'baz'

cat "$outFile"
rm "$outFile"
echo "The chosen file descriptor was : '$fileDescriptor'"
foo
baz
bar
The chosen file descriptor was : '10'
  • x<&y and x>&y both mean "make x a copy of y". The only difference is that they respectively refer to an input and an output file descriptor.
  • if y is -, x will be closed.
source

Snippet to copy-paste :

exec {previousStdout}>&1 {previousStderr}>&2 1>"/path/to/logFile" 2>&1

(script content goes here)

exec 2>&${previousStderr} {previousStderr}>&- 1>&${previousStdout} {previousStdout}>&-

How to detect whether a script is run interactively or not ?

Situation :

As most (if not all) questions found here, this question arose from a real-life situation. That time, I had to fix an old script written by someone who's not even part of the company anymore...
My readings and the tests I made led me to the conclusion that asking this question is a sign of poor script design. Functionalities like : should be enough to design a script that works fine (exactly the same way) whatever method is used to start it : manually / at / cron.
For best practices, have a look at :

Details :

#!/usr/bin/env bash
#	To fire this script via 'at' :
#	at $(date --date "now +1 minutes" '+%H%M') -f myScript.sh

exec 1> output.txt

########################################## ##########################################################

[ -z "$PS1" ] && interactive='no' || interactive='yes'; echo "interactive 1 : '$interactive'"

# when run :
#	manually			==>	yes
# 	in a script			==>	no
# 	in a script fired by 'at'	==>	yes
# 	in a script fired by 'cron'	==>	no


########################################## ##########################################################

# source
case $- in *i*) interactive='yes' ;; *) interactive='no' ;; esac; echo "interactive 2 : '$interactive'"

# when run :
#	manually			==>	yes
# 	in a script			==>	no
# 	in a script fired by 'at'	==>	no
# 	in a script fired by 'cron'	==>	no


########################################## ##########################################################

echo "dollarDash : '$-'"
# when run :
#	manually			==>	himBHs
# 	in a script			==>	hB
# 	in a script fired by 'at'	==>	s
# 	in a script fired by 'cron'	==>	hB

Solution :

There is no easy / standard / reliable way to distinguish use cases, but there are some workarounds :
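One commonly used workaround is to test whether a standard stream is attached to a terminal, with test -t. This is a sketch, not a definitive answer : it tells whether someone is sitting at a terminal, not how the script was started, and a script run manually with its input or output redirected is reported as non-interactive. The function names are made up.

```shell
#!/usr/bin/env bash

# -t N : true if file descriptor N is open and refers to a terminal
stdinIsTerminal() { [ -t 0 ]; }
stdoutIsTerminal() { [ -t 1 ]; }

if stdinIsTerminal && stdoutIsTerminal; then
	echo 'a terminal is attached : prompting the user is possible'
else
	echo 'no terminal (cron, at, pipe, redirection, ...) : do not prompt'
fi
```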

How to transpose line to column / column to line ?

"Single" input, line to column :
  • fieldSeparator=';'; echo "foo${fieldSeparator}bar${fieldSeparator}baz" | tr "$fieldSeparator" '\n'
"Single" input, column to line :
  • echo -e 'foo\nbar\nbaz' | xargs
  • fieldSeparator=';'; echo -e 'foo\nbar\nbaz' | xargs | tr ' ' "$fieldSeparator"
"Multiple" input : see https://unix.stackexchange.com/questions/520031/pivot-file-values#answer-520047

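For the column to line case with a "single" input, paste can join the lines directly with the separator. Unlike xargs, it doesn't interpret quotes or backslashes found in the input. A sketch (joinLines is a hypothetical name) :

```shell
#!/usr/bin/env bash

# -s : paste all the lines of the input one after the other
# -d : separator to put between them
joinLines() {
	local fieldSeparator=$1
	paste -s -d "$fieldSeparator" -
}

echo -e 'foo\nbar\nbaz' | joinLines ';'
```
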
How to compare code snippets ?

  1. Append these functions to ~/.bash_aliases :
    getFileSnippet() {
    	local fileToInspect=$1
    	local startLine=$2
    	local stopLine=$3
    	tmpFile=$(mktemp --tmpdir='/run/shm' tmp.XXXXXXXX)
    	sed -n "$startLine,${stopLine}p" "$fileToInspect" > "$tmpFile"
    	echo "$tmpFile"
    	}
    
    compareSnippets() {
    	[ $# -ne 6 ] && { echo 'Wrong number of arguments, 6 expected.'; return "${UNIX_FAILURE:-1}"; }
    	local fileToInspect1=$1
    	local startLine1=$2
    	local stopLine1=$3
    	local fileToInspect2=$4
    	local startLine2=$5
    	local stopLine2=$6
    
    	for argumentToCheck in fileToInspect1 fileToInspect2; do
    		[ -f "${!argumentToCheck}" ] || { echo "Argument '${!argumentToCheck}' is not a file."; return "${UNIX_FAILURE:-1}"; }
    	done
    
    	snippet1=$(getFileSnippet "$fileToInspect1" "$startLine1" "$stopLine1")
    	snippet2=$(getFileSnippet "$fileToInspect2" "$startLine2" "$stopLine2")
    	diff "$snippet1" "$snippet2"
    	rm "$snippet1" "$snippet2"
    	}
  2. compare code snippets with :
    compareSnippets fileA 117 124 fileB 159 166

How to remove the line matching a regexp and the following one ?

Situation :

I have :
foo1
bar1
foo2 REMOVE THIS LINE AND THE FOLLOWING ONE
bar2
foo3
bar3
I want :
foo1
bar1
foo3
bar3

Solution :

echo -e 'foo1\nbar1\nfoo2 REMOVE THIS LINE AND THE FOLLOWING ONE\nbar2\nfoo3\nbar3' | awk 'BEGIN {matchingLineNumber=-1}; /REMOVE/ {matchingLineNumber=NR; next}; NR==matchingLineNumber+1 {next}; {print}'

Details :

Let's consider the awk '...' part remembering :
BEGIN {matchingLineNumber=-1}; /REMOVE/ {matchingLineNumber=NR; next}; NR==matchingLineNumber+1 {next}; {print}
becomes :
BEGIN				{ matchingLineNumber = -1 }
/REMOVE/			{ matchingLineNumber = NR; next }
NR == matchingLineNumber + 1	{ next }
				{ print }
It works whatever the position of the matching line within the input :

echo -e 'foo2 REMOVE THIS LINE AND THE FOLLOWING ONE\nbar2\nfoo1\nbar1\nfoo3\nbar3' | awk 'BEGIN {matchingLineNumber=-1}; /REMOVE/ {matchingLineNumber=NR; next}; NR==matchingLineNumber+1 {next}; {print}'

echo -e 'foo1\nbar1\nfoo3\nbar3\nfoo2 REMOVE THIS LINE AND THE FOLLOWING ONE\nbar2' | awk 'BEGIN {matchingLineNumber=-1}; /REMOVE/ {matchingLineNumber=NR; next}; NR==matchingLineNumber+1 {next}; {print}'
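The same idea extends to "the matching line and the n following ones" with a countdown instead of a line number. A sketch, where removeMatchAndFollowing and its two parameters are made up for the example :

```shell
#!/usr/bin/env bash

# $1 : regular expression to look for
# $2 : number of lines to remove after each matching line
removeMatchAndFollowing() {
	awk -v pattern="$1" -v n="$2" '
		$0 ~ pattern	{ linesToSkip = n + 1 }
		linesToSkip > 0	{ linesToSkip--; next }
				{ print }'
}

echo -e 'foo1\nbar1\nfoo2 REMOVE THIS LINE AND THE FOLLOWING ONE\nbar2\nfoo3\nbar3' | removeMatchAndFollowing 'REMOVE' 1
```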

How to swap command-line arguments ?

Situation :

While building a one-liner, one of the commands outputs :
a b
whereas the next command expects :
b a
How may I swap them ?

Solution :
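Two possible ways, as a sketch assuming the two arguments are separated by whitespace and contain none themselves :

```shell
#!/usr/bin/env bash

# With awk, in the middle of a pipeline
echo 'a b' | awk '{ print $2, $1 }'

# With Bash only
read -r first second <<< 'a b'
echo "$second $first"
```

Both commands display : b a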

How to read a multiline variable line by line ?

Situation :

myVariable=$(echo {a..c} | tr ' ' '\n'); echo "myVariable : '$myVariable'"
myVariable : 'a
b			definitely a multiline variable
c'
echo "$myVariable" | while read aSingleLineOfMyVariable; do
	echo "a single line : '$aSingleLineOfMyVariable'"
done; echo "last value : '$aSingleLineOfMyVariable'"
a single line : 'a'	works fine inside the loop
a single line : 'b'
a single line : 'c'
last value : ''		undefined variable since it only exists in the subshell created while piping

Details :

tmpFile=$(mktemp); echo "$myVariable" > "$tmpFile"; while read aSingleLineOfMyVariable; do
	echo "a single line : '$aSingleLineOfMyVariable'"
done < "$tmpFile"; echo "last value : '$aSingleLineOfMyVariable'"; rm "$tmpFile"
a single line : 'a'
a single line : 'b'
a single line : 'c'
last value : ''		$aSingleLineOfMyVariable is lost when leaving the loop
This does the job but :
  • creating a temporary file is not very elegant
  • creating a temporary file may have a performance cost if repeated numerous times
  • may require write permissions to /tmp
  • the loop variable looks empty once the loop is over (the reason is explained further down)
while IFS= read aSingleLineOfMyVariable; do
	echo "a single line : '$aSingleLineOfMyVariable'"
done < <(printf '%s\n' "$myVariable"); echo "last value : '$aSingleLineOfMyVariable'"
a single line : 'a'
a single line : 'b'
a single line : 'c'
last value : ''		$aSingleLineOfMyVariable is lost again, continue reading
Whatever command is (including something like echo "$someVariable"), the <(command) construct (aka process substitution) expands to a file and, as such, can be fed into anything with < or >.

Solution :

previousIfs="$IFS"; IFS=$'\n'; for aSingleLineOfMyVariable in $myVariable; do
	echo "a single line : '$aSingleLineOfMyVariable'"
done; echo "last value : '$aSingleLineOfMyVariable'"; IFS="$previousIfs"
a single line : 'a'
a single line : 'b'
a single line : 'c'
last value : 'c'	$aSingleLineOfMyVariable is not lost this time 

This comment says this is because while loops create a subshell, whereas for loops don't. I'm afraid this is wrong... (it is !)

Let's check this :

Test #1 :
tmpFile=$(mktemp); echo -e 'a\nb\nc' > "$tmpFile"; while read item; do echo "item (during loop) : $item"; done < "$tmpFile"; echo "item (after loop) : $item"; rm "$tmpFile"
item (during loop) : a
item (during loop) : b
item (during loop) : c
item (after loop) :		empty variable
for item in {a..c}; do echo "item (during loop) : $item"; done; echo "item (after loop) : $item"
item (during loop) : a
item (during loop) : b
item (during loop) : c
item (after loop) : c		still exists outside of the loop

Looks like it was right, after all ?

Test #2 :
for i in 'in for loop'; do myVariable='foo'; echo "$i"; done; echo "myVariable : $myVariable"
in for loop
myVariable : foo

A variable set within a for loop still exists after the loop.

while true; do myVariable='bar'; echo 'in while loop (1)'; break; done; echo "myVariable : $myVariable";
while [ -z "$i" ]; do myVariable='baz'; echo 'in while loop (2)'; i=1; done; echo "myVariable : $myVariable"
in while loop (1)
myVariable : bar
in while loop (2)
myVariable : baz

A variable set within a while loop still exists after the loop, whichever way the loop ends : break or regular exit. This proves the comment linked above WRONG !

while read item; do myVariable=meu; echo "item : '$item', myVariable : '$myVariable'"; done < <(echo -e 'ga\nbu\nzo'); echo "item : '$item', myVariable : '$myVariable'"
item : 'ga', myVariable : 'meu'
item : 'bu', myVariable : 'meu'
item : 'zo', myVariable : 'meu'
item : '', myVariable : 'meu'

Trying to work around this with a process substitution makes no difference

unset myVariable; for i in whatever; do read myVariable; echo "myVariable : '$myVariable'"; done; echo "myVariable : '$myVariable'"
or :
unset myVariable i; while [ -z "$i" ]; do read myVariable; echo "myVariable : '$myVariable'"; i=foo; done; echo "myVariable : '$myVariable'"
any		typed on the keyboard while read is waiting for input
myVariable : 'any'
myVariable : 'any'

A variable set with read, both in for and while loops, survives the end of the loop.

tmpFile=$(mktemp); echo -e 'ga\nbu\nzo' > "$tmpFile"; while read item; do myVariable=meu; echo "item : '$item', myVariable : '$myVariable'"; done < "$tmpFile"; echo "item : '$item', myVariable : '$myVariable'"; rm "$tmpFile"
item : 'ga', myVariable : 'meu'
item : 'bu', myVariable : 'meu'
item : 'zo', myVariable : 'meu'
item : '', myVariable : 'meu'

A variable set with a while read construct doesn't survive the end of the loop. Would that be the reason ?

The revelation (source) :
Indeed, the while read construct is the explanation of this behavior. While consuming lines of input, while read myVariable...
  1. catches a line of input and stores it in myVariable
  2. if it succeeds (i.e. not the end of the input), the while loop continues normally
  3. otherwise (i.e. no more input data) :
    1. myVariable gets an empty value
    2. this makes read return a UNIX_FAILURE
    3. the loop condition is false : the while loop ends
  4. once outside of the while loop, myVariable looks empty. It's actually been overwritten with an empty value by the last read
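So, to keep the last line read once the loop is over, copy it into another variable inside the loop. A sketch (lastLine is a name chosen for the example) :

```shell
#!/usr/bin/env bash

myVariable=$(echo {a..c} | tr ' ' '\n')

lastLine=''
while IFS= read -r aSingleLineOfMyVariable; do
	lastLine=$aSingleLineOfMyVariable
done <<< "$myVariable"

echo "loop variable : '$aSingleLineOfMyVariable'"	# emptied by the final, failing read
echo "last line : '$lastLine'"
```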

How to convert a relative path into an absolute path to source from anywhere ?

Situation :

Consider script.sh that sources functions.sh like this :
. ./functions.sh

Details :

This implies script.sh and functions.sh are in the same directory, and it works only if script.sh is launched from its own directory. Otherwise, the relative path ./functions.sh cannot be resolved because ./ is interpreted as "the directory from which the command is launched".

Solution :

To work around this, you can automatically translate the relative path into an absolute path before source-ing :
# Include an external file even though the current script is not launched from its own directory
directoryOfThisScript="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
. "$directoryOfThisScript/functions.sh"

Alternate solution :

Either : followed by : . "$directoryOfThisScript/functions.sh"
These methods work in most cases. BUT I met a situation where they didn't do the job (something tricky / specific / weird, maybe). Unfortunately, I forgot to write about it here at that time and can't remember the details. I don't use these methods anymore.

Alternate solution :

Full snippet :
######################################### includes ##################################################
nameOfThisScript=$(basename "${BASH_SOURCE[0]}")
directoryOfThisScript="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"

configFile="$directoryOfThisScript/$nameOfThisScript.conf"
functionsFile="$directoryOfThisScript/functions.sh"

for fileToSource in "$configFile" "$functionsFile"; do
	source "$fileToSource" 2>/dev/null || {
		echo "File '$fileToSource' not found"
		exit 1
		}
done
######################################### /includes #################################################

How to return a value from a function ?

Usage :

There are several ways to do so :
return + $?
For integers only (and "not too big" ones, preferably). Actually, the returned number will be n modulo 256 :
foo() { return $1; }; for i in {254..258}; do foo $i; echo "$i $?"; done
254 254
255 255
256 0
257 1
258 2
echo + $(...)
For anything you like : numbers, strings, ...

Example :

#!/usr/bin/env bash

functionThatReturns() {
	return 42
	}

functionThatEchoes() {
	echo 42
	}

functionThatReturns
echo "1. '$?'"

result=$(functionThatReturns)
echo "2. '$result'"

functionThatEchoes
echo "3. '$?'"

result=$(functionThatEchoes)
echo "4. '$result'"

1. '42'		return + $? = OK


2. ''		return + $(...) = KO
42		this is the echo made in the function

3. '0'		this only means the function ended successfully


4. '42'		echo + $(...) = OK
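Both mechanisms can be combined : echo carries the value and return carries the success / failure status, so the caller can test and capture in one go. A sketch with a made-up function :

```shell
#!/usr/bin/env bash

# Hypothetical example : echoes the double of its argument,
# or fails (status 1, no output) if the argument is not an integer.
# (numbers with leading zeros would be read as octal by the arithmetic expansion)
doubleIfInteger() {
	[[ $1 =~ ^[-+]?[0-9]+$ ]] || return 1
	echo $(( $1 * 2 ))
}

for input in 21 'abc'; do
	if result=$(doubleIfInteger "$input"); then
		echo "'$input' ==> '$result'"
	else
		echo "'$input' ==> failure"
	fi
done
```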

How to get the position of a character in a string ?

In the results below, haystack='hello world' and needle is one of :
  • 'h' : haystack has needle as the 1st character
  • 'z' : haystack has needle 0 time
  • 'w' : haystack has needle exactly 1 time
  • 'o' : haystack has needle more than 1 time

echo "$haystack" | grep -oab "$needle" | grep -oE '[0-9]+'
  • needle='h' : 0 (0-based index)
  • needle='z' : UNIX_FAILURE
  • needle='w' : 6 (0-based index)
  • needle='o' : 4 and 7 (0-based index)

printf '%s\n' "$haystack" | grep -o . | grep -n "$needle" | grep -oE '[0-9]+'
  • needle='h' : 1 (1-based index)
  • needle='z' : UNIX_FAILURE
  • needle='w' : 7 (1-based index)
  • needle='o' : 5 and 8 (1-based index)

tmp="${haystack%%$needle*}"; echo ${#tmp}
  • needle='h' : 0 (0-based index)
  • needle='z' : 11 (length of $haystack)
  • needle='w' : 6 (0-based index)
  • needle='o' : 4 (position of 1st occurrence, 0-based index)

echo "$haystack" | awk -v x="$needle" '{ print index($0, x) }'
  • needle='h' : 1 (1-based index)
  • needle='z' : 0 (looks like a reliable "not found" indicator)
  • needle='w' : 7 (1-based index)
  • needle='o' : 5 (position of 1st occurrence, 1-based index)

How to display the xth, yth and zth lines of a stream of text ?

Let's imagine a command returned 10 lines of text, and we want to display only the 1st, 3rd and 8th lines.

With grep : let's ask grep to add line numbers, then select the lines to display (again with grep), then hide the line numbers :
for i in {a..j}; do echo "Line $i"; done | grep -n '' | grep -E '^(1|3|8):' | sed -r 's/^[0-9]+://'
Even simpler, with sed :
for i in {a..j}; do echo "Line $i"; done | sed -n '1p; 3p; 8p'

How to keep / remove the n leading / trailing characters of a string ?

Here are a few snippets with :
keep remove
leading
trailing
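Substring expansion, ${variable:offset:length}, covers the four cases. A sketch, assuming n is not greater than the length of the string (the variable names are made up) :

```shell
#!/usr/bin/env bash

myString='hello world'
n=3

keepLeading=${myString:0:n}			# the first n characters
keepTrailing=${myString: -n}			# the last n characters (mind the space before '-')
removeLeading=${myString:n}			# everything but the first n characters
removeTrailing=${myString:0:${#myString}-n}	# everything but the last n characters

echo "keep leading    : '$keepLeading'"
echo "keep trailing   : '$keepTrailing'"
echo "remove leading  : '$removeLeading'"
echo "remove trailing : '$removeTrailing'"
```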

How to interrupt a loop ?

An example is worth 1000 words :
#!/usr/bin/env bash

listOfCommands=': break continue return exit'
for command in $listOfCommands; do
	echo "With '$command'"

	for i in {1..3}; do
		echo -n ' ==> '
		[ "$i" -eq 2 ] && eval "$command"
		echo $i
	done
	echo -e 'This comes after the loop\n'

done
With ':'
 ==> 1
 ==> 2		nothing special, as expected
 ==> 3
This comes after the loop

With 'break'
 ==> 1
 ==> This comes after the loop		stop the whole loop block and continue the script after it

With 'continue'
 ==> 1
 ==>  ==> 3		skip the rest of the current iteration and start the next one
This comes after the loop

With 'return'
 ==> 1
 ==> ./test.sh: line 9: return: can only `return' from a function or sourced script		complain and interrupt nothing
2
 ==> 3
This comes after the loop

With 'exit'
 ==> 1
 ==> 		stop the script, return to the shell prompt

How to test if a variable is an integer ?

UNIX_SUCCESS=0
UNIX_FAILURE=1
isInteger() {
	[[ $1 =~ ^[-+]?[0-9]+$ ]] && echo $UNIX_SUCCESS || echo $UNIX_FAILURE
	}
[ $(isInteger "$myVariable") -eq "$UNIX_FAILURE" ] && {
	(shout how unhappy you feel)
	}
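Since a function's exit status is already a success / failure flag, the test can also be used directly, without echoing a code and comparing it. A sketch of that variant (isIntegerStatus is a hypothetical name, chosen so that it doesn't clash with the function above) :

```shell
#!/usr/bin/env bash

# The exit status of the function is the one of [[ ... ]]
isIntegerStatus() {
	[[ $1 =~ ^[-+]?[0-9]+$ ]]
}

for myVariable in 42 -7 '4.2' 'abc' ''; do
	if isIntegerStatus "$myVariable"; then
		echo "'$myVariable' is an integer"
	else
		echo "'$myVariable' is not an integer"
	fi
done
```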

How to parse a string character by character ?

You could try :
myString='hello'
length=$((${#myString}-1))
for i in $(eval echo "{0..$length}"); do
	echo $i : ${myString:i:1}
done
But this is even better (source) :
myString='hello'
for((i=0; i<${#myString}; i++)); do
	echo $i : ${myString:i:1}
done

How to store a command into a variable ?

Use eval.

If you want to store the result of a command (but not the command itself) into a variable, just use the $(command) construct :

now=$(date); echo "'now' was : $now"
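For the command itself, here is a sketch of the eval approach, next to an array-based alternative that avoids parsing a string a second time (worth considering whenever part of the command comes from untrusted input). The variable names are made up :

```shell
#!/usr/bin/env bash

# With eval : the string is parsed again by the shell when evaluated
myCommand='echo "hello world" | tr a-z A-Z'
eval "$myCommand"

# With an array : one element per word, no second parsing
# (works for a simple command only : no pipes nor redirections)
myCommandAsArray=(printf '%s\n' 'hello world')
"${myCommandAsArray[@]}"
```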

How to check a string against a list of values ?

With a for loop :

To make sure a command line parameter matches a value from a list :

#!/usr/bin/env bash

valueToCheck="$1"
listOfAcceptedValues='foo bar baz'

UNIX_SUCCESS=0
UNIX_FAILURE=1
valueToCheckIsInTheList=$UNIX_FAILURE

for value in $listOfAcceptedValues; do
	echo "Testing '$valueToCheck' against '$value'"
	[ "$valueToCheck" == "$value" ] && { echo "'$valueToCheck' is a valid value."; valueToCheckIsInTheList=$UNIX_SUCCESS; break; }
done

[ "$valueToCheckIsInTheList" -eq "$UNIX_FAILURE" ] && { echo "'$valueToCheck' is not a valid value."; exit $UNIX_FAILURE; }

With a case construct :

This suits cases where different input values imply different behaviors of the script. Otherwise, using case may be overkill, and the for loop method may be more suitable.

#!/usr/bin/env bash

valueToCheck="$1"

case "$valueToCheck" in
	'foo')
		echo "'$valueToCheck' is a valid value."
		# do something with 'foo'
		;;
	'bar')
		echo "'$valueToCheck' is a valid value."
		# do something different with 'bar'
		;;
	'baz')
		echo "'$valueToCheck' is a valid value."
		# do something different again with 'baz'
		;;
	*)
		echo "'$valueToCheck' is not a valid value."
		# deal with it !!!
		;;
esac

With a regular expression :

This method may return false positives if used improperly. Indeed, to check whether foo is within foo bar baz, we can "regexp match" foo bar baz against foo (it matches), but fo and f also match.

It _may_ sound more logical to do it the other way round : matching foo against foo bar baz, but this obviously can't work.

To workaround false positive matches of fo and f, we must use "word boundary detectors" :
  • (^|[[:space:]]) and ($|[[:space:]]) (which is very well explained here) (this is possibly wrong or obsolete)
  • or \b
#!/usr/bin/env bash

listOfAcceptedValues='foo bar baz'
listOfValuesToCheck="foo bar baz poo 123 bam ofo fo f ''"

for valueToCheck in $listOfValuesToCheck; do
	[[ "$listOfAcceptedValues" =~ "$valueToCheck" ]] && result1='' || result1=' not'
	[[ "$listOfAcceptedValues" =~ (^|[[:space:]])"$valueToCheck"($|[[:space:]]) ]] && result2='' || result2=' not'
	echo -e "'$valueToCheck'\tis : (1)$result1 a valid value, \t(2)$result2 a valid value."
done
'foo'	is : (1) a valid value,		(2) a valid value.
'bar'	is : (1) a valid value,		(2) a valid value.
'baz'	is : (1) a valid value,		(2) a valid value.
'poo'	is : (1) not a valid value,	(2) not a valid value.
'123'	is : (1) not a valid value,	(2) not a valid value.
'bam'	is : (1) not a valid value,	(2) not a valid value.
'ofo'	is : (1) not a valid value,	(2) not a valid value.
'fo'	is : (1) a valid value,		(2) not a valid value.
'f'	is : (1) a valid value,		(2) not a valid value.
''''	is : (1) not a valid value,	(2) not a valid value.
Other example :
validUsers='kevin stuart bob'; for user in alice bob bobby; do regex="\b$user\b"; [[ "$validUsers" =~ $regex ]] && echo "'$user' is valid" || echo "'$user': NOPE"; done
'alice': NOPE
'bob' is valid
'bobby': NOPE

Other methods :

There are other methods, but they don't all apply to the same use cases, especially if they have to check untrusted values (i.e. user input) which may contain wildcards or stuff like that.
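One of them, as a sketch : surround both the list and the value with spaces and let case look for an exact word. The quoted part of the pattern is matched literally, so wildcards in the value are harmless. isInList is a hypothetical name, the list is assumed to be space-separated, and a value that itself contains spaces (like 'foo bar') would still match.

```shell
#!/usr/bin/env bash

isInList() {
	local valueToCheck=$1
	local listOfAcceptedValues=$2
	case " $listOfAcceptedValues " in
		*" $valueToCheck "*)	return 0 ;;
		*)			return 1 ;;
	esac
}

for valueToCheck in foo fo 'f*' ''; do
	isInList "$valueToCheck" 'foo bar baz' && echo "'$valueToCheck' is valid" || echo "'$valueToCheck' : NOPE"
done
```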

How to generate a string made of repetitions of any character / substring ?

Repeat a single character :

With Perl (source) :
perl -e 'print "X"x42; print "\n"'
Check :
generatedString=$(perl -e 'print "X"x42; print "\n"'); echo $generatedString; echo ${#generatedString}
With Python :
python -c "print('X' * 42)"
Check :
generatedString=$(python -c "print('X' * 42)"); echo $generatedString; echo ${#generatedString}
With Bash (inspired by...) :
The complex and ugly solution :

numbers=$(echo -n {01..42}; echo ' '); echo ${numbers//???/X}
Check :
numbers=$(echo -n {01..42}; echo ' '); generatedString=${numbers//???/X}; echo $generatedString; echo ${#generatedString}

How does it work ?
  1. We generate a series of 2-digit numbers, separated by spaces and with a final space so that we have n repetitions of digit-digit-space (n being the length of the final string) : 01 02 03 04 05 06 07 08 09 10 11 ... 40 41 42[SPACE]
  2. Then, using shell parameter expansion (pattern substitution : ${variable//pattern/replacement}), we replace every occurrence of any 3-character sequence (our digit-digit-space) with the character to be repeated (here : X)
This means this method is highly dependent on n, the number of repetitions :
0 < n < 10
We need to generate a list of 1-digit space-separated numbers (1 2 3 4 5[SPACE]), then replace every occurrence of 2 characters (digit-space) with X.
0 < n < 100
Same as above, with a list of 2-digit space-separated numbers, and replacing occurrences of 3 characters.
0 < n < 1000
Same as above, with a list of 3-digit space-separated numbers, and replacing occurrences of 4 characters.
The short and elegant solution (source) :
printf 'X%.0s' {1..20}

Repeat a string :

With Perl :
perl -e 'print "ABCD "x4; print "\n"'
ABCD ABCD ABCD ABCD[SPACE]
With Python :
python -c "print('ABCD ' * 4)"
ABCD ABCD ABCD ABCD[SPACE]
With Bash :
The complex and ugly solution :
numbers=$(echo -n {1..4}; echo ' '); echo "${numbers//??/ABCD }"
ABCD ABCD ABCD ABCD[SPACE]

Double quotes are necessary here because the string to repeat ends on a space character :

  • numbers=$(echo -n {1..4}; echo ' '); echo -n ${numbers//??/ABCD}; echo '.' : ABCDABCDABCDABCD. No need for quotes.
  • numbers=$(echo -n {1..4}; echo ' '); echo -n ${numbers//??/ABCD }; echo '.' : ABCD ABCD ABCD ABCD. Without quotes : no trailing space.
  • numbers=$(echo -n {1..4}; echo ' '); echo -n "${numbers//??/ABCD }"; echo '.' : ABCD ABCD ABCD ABCD[SPACE]. With quotes : trailing space.

The short and elegant solution (source) :
printf 'ABCD %.0s' {1..20}
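When the number of repetitions is held in a variable, brace expansion can't be used directly ({1..$n} is not expanded, since brace expansion happens before variable expansion). A plain loop does the job ; a sketch with a made-up function name :

```shell
#!/usr/bin/env bash

# $1 : string to repeat
# $2 : number of repetitions
repeatString() {
	local stringToRepeat=$1
	local nbRepetitions=$2
	local result=''
	local i
	for (( i = 0; i < nbRepetitions; i++ )); do
		result+=$stringToRepeat
	done
	printf '%s\n' "$result"
}

repeatString 'X' 42
repeatString 'ABCD ' 4
```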

How to compare a variable against a list of characters ?

Generic case :

#!/usr/bin/env bash

answerIsValid=''

while [ -z "$answerIsValid" ]; do
	echo "Continue ? [yn]"
	read answer
	[[ "$answer" == [yYnN] ]] && answerIsValid=1 || echo -e "Invalid answer\n"
done
  • This is effectively a list of characters (a glob-style bracket expression, matching exactly one character), not a regular expression.
  • The characters list mustn't be quoted.

The list of characters can also be provided as a variable :

#!/usr/bin/env bash

answerIsValid=''
validCharacters='yYnN\[\]'

while [ -z "$answerIsValid" ]; do
	echo "Continue ? [yn]"
	read answer
	[[ "$answer" == [$validCharacters] ]] && answerIsValid=1 || echo -e "Invalid answer\n"
done
  • Special characters must be escaped.
  • The example above works either single or double-quoted.

This example script accepts [ or \[ as valid inputs (read, used without -r, strips the backslash), but .[ is rejected.

How to read command-line arguments in a script ?

Use getopts.
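
As a minimal sketch of what this looks like, here is a getopts loop for three made-up options : -v (a flag), -o <file> (an option taking a value) and -h. The option letters, the variable names and the parseOptions wrapper are assumptions chosen for the example. In a real script the loop usually sits at top level and exits instead of returning.

```shell
#!/usr/bin/env bash

# Hypothetical options : -v (flag), -o <file> (takes a value), -h (help).
parseOptions() {
	local OPTIND=1 option
	verbose=0
	outputFile=''
	while getopts ':vo:h' option; do
		case "$option" in
			v) verbose=1 ;;
			o) outputFile=$OPTARG ;;
			h) echo "Usage: ${0##*/} [-v] [-o file] [arguments...]"; return 0 ;;
			:) echo "Option -$OPTARG requires a value" >&2; return 1 ;;
			\?) echo "Unknown option: -$OPTARG" >&2; return 1 ;;
		esac
	done
	# Drop the options that were parsed, keep the remaining arguments.
	shift $((OPTIND - 1))
	remainingArgs=("$@")
}

parseOptions -v -o result.txt input1 input2
echo "verbose=$verbose outputFile=$outputFile remaining=${remainingArgs[*]}"
```

The leading colon in ':vo:h' switches getopts to silent mode, which is what lets the script print its own messages for a missing value (:) or an unknown option (?).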

How to explode a string into an array ?

The basics :

Explode :
IFS=', ' read -r -a myArray <<< "$stringToExplode"
Iterate over elements :
for element in "${myArray[@]}"; do
	echo "$element"
done
Iterate over elements using key/value pairs :
for index in "${!myArray[@]}"; do
	echo "$index ${myArray[index]}"
done
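
Note that IFS is a set of delimiter characters, not a delimiter string : with IFS=', ', the string is cut on every comma and on every space. The sketch below (sample string made up for the example) shows the difference. It uses the IFS=... read form, which changes IFS for the read command only, so there is nothing to restore afterwards.

```shell
#!/usr/bin/env bash

stringToExplode='alpha,beta gamma,delta'

# Split on commas only : "beta gamma" stays in one piece.
IFS=',' read -r -a onCommas <<< "$stringToExplode"

# Split on commas AND spaces.
IFS=', ' read -r -a onCommasAndSpaces <<< "$stringToExplode"

echo "Commas only       : ${#onCommas[@]} elements"
echo "Commas and spaces : ${#onCommasAndSpaces[@]} elements"
```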

Ready-to-use one-liners :

Explode string and iterate on values :

stringToExplode='Lorem ipsum dolor sit amet, ...'; oldIfs="$IFS"; IFS=' '; read -a myArray <<< "${stringToExplode}"; IFS="$oldIfs"; for element in "${myArray[@]}"; do echo "$element"; done

Lorem
ipsum
dolor
sit
amet,
...
Explode string and iterate on key/value pairs :

stringToExplode='Lorem ipsum dolor sit amet, ...'; oldIfs="$IFS"; IFS=' '; read -a myArray <<< "${stringToExplode}"; IFS="$oldIfs"; for index in "${!myArray[@]}"; do echo "$index ${myArray[index]}"; done

0 Lorem
1 ipsum
2 dolor
3 sit
4 amet,
5 ...

A word of warning :

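After exploding a string into an array, don't mistake ${#arrayName} for the number of elements or for the highest index : it is the length, in characters, of the first element (${arrayName[0]}). Below, it is 5 because "Lorem" has 5 characters, whatever the "MAX ID" label says :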

stringToExplode='Lorem ipsum dolor sit amet, consectetur adipiscing elit.'; oldIfs="$IFS"; IFS=' '; read -a myArray <<< "${stringToExplode}"; IFS="$oldIfs"; echo -e "MAX ID : ${#myArray}\nLength : ${#myArray[@]}"

MAX ID : 5
Length : 8

Arrays can be sparse, so you shouldn't derive the index of the last element from the length either :

stringToExplode='Lorem ipsum dolor sit amet, ...'; oldIfs="$IFS"; IFS=' '; read -a myArray <<< "${stringToExplode}"; IFS="$oldIfs"; echo -e "MAX ID : ${#myArray}\nLength : ${#myArray[@]}"; for index in "${!myArray[@]}"; do echo "$index ${myArray[index]}"; done; unset myArray[3]; echo -e "\nMAX ID : ${#myArray}\nLength : ${#myArray[@]}"; for index in "${!myArray[@]}"; do echo "$index ${myArray[index]}"; done

MAX ID : 5
Length : 6
0 Lorem
1 ipsum
2 dolor
3 sit
4 amet,
5 ...

MAX ID : 5
Length : 5
0 Lorem
1 ipsum
2 dolor
4 amet,
5 ...

To get the number of elements of an array, use ${#arrayName[@]} and forget about ${#arrayName}.
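
When the highest index or the last element is what you actually need, one option is to go through the list of indices given by ${!arrayName[@]}. A minimal sketch, with variable names made up for the example :

```shell
#!/usr/bin/env bash

myArray=(Lorem ipsum dolor sit amet)
unset 'myArray[3]'        # make the array sparse : indices are now 0 1 2 4

indices=("${!myArray[@]}")
nbElements=${#myArray[@]}
lastIndex=${indices[nbElements - 1]}

echo "Number of elements      : $nbElements"
echo "Highest index           : $lastIndex"
echo "Last element            : ${myArray[lastIndex]}"
echo "Length of first element : ${#myArray}"
```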

How to "retry until success" ?

Re-launch a failed command :

#!/bin/bash

# This function simulates the REAL job to do, which may fail.
doSomething() {
	random=$RANDOM
	echo "I'm working with '$1' and '$2' and '$3' (RANDOM = $random)"
	echo
	[ $random -lt 10000 ] && return 0 || return 1
	}

doSomethingUntilSuccess() {
	loopNumber=$((loopNumber+1))
	echo "Doing 'doSomethingUntilSuccess' (loop $loopNumber) ..."
	doSomething 'foo' 'bar' 42 || doSomethingUntilSuccess
	}

loopNumber=0
doSomethingUntilSuccess
echo 'DONE !'

Simpler, with an until loop :

#!/bin/bash

trap "{ echo 'CTRL-C detected... Bye-bye.'; exit 1; }" SIGINT

# This function is the job to redo if failed
doSomething() {
	random=$RANDOM
	echo "I'm working with '$1' and '$2' and '$3' (RANDOM = $random)"
	echo
	[ $random -lt 10 ] && return 0 || return 1
	}

loopNumber=1

until doSomething 'foo' 'bar' 42; do
	echo "$loopNumber loops so far, and running ..."
	loopNumber=$((loopNumber+1))
done

echo "DONE !"
echo "In $loopNumber loops"
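
Both versions above retry forever. If that is not acceptable, the until loop can be given a maximum number of attempts and a pause between them. The retry function below is a sketch written for this page (its name and arguments are assumptions, not a standard command) :

```shell
#!/usr/bin/env bash

# retry <maxAttempts> <delayInSeconds> <command> [arguments...]
# Runs the command until it succeeds or maxAttempts is reached.
retry() {
	local maxAttempts=$1 delay=$2 attempt=1
	shift 2
	until "$@"; do
		if ((attempt >= maxAttempts)); then
			echo "Giving up after $attempt attempts" >&2
			return 1
		fi
		attempt=$((attempt + 1))
		sleep "$delay"
	done
	echo "Succeeded after $attempt attempt(s)"
}

# Demo job : fails twice, then succeeds.
nbCalls=0
doSomething() {
	nbCalls=$((nbCalls + 1))
	[ "$nbCalls" -ge 3 ]
}

retry 5 0 doSomething 'foo' 'bar' 42
```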

How to sum numbers ?

Numbers listed in a file :

Making sums :
Pure Bash version :
sum=0; while read value; do sumBefore=$sum; sum=$((sum+value)); echo "$sumBefore + $value = $sum"; done < fileWithNumbers; echo $sum
bc version :
sum=0; while read value; do sumBefore=$sum; sum=$(echo "$sum+$value" | bc); echo "$sumBefore + $value = $sum"; done < fileWithNumbers; echo $sum
Awk version :
awk '{ sum += $1 } END { print sum }' fileWithNumbers
Which is the fastest ?
tmpFile=$(mktemp --tmpdir tmp.numbers.XXXXXXXX); >$tmpFile; for i in {1..1000}; do echo $i >> $tmpFile; done; echo 'Bash sum'; time (sum=0; while read value; do sum=$((sum+value)); done < $tmpFile; echo "$sum"); echo 'bc sum'; time (sum=0; while read value; do sum=$(echo "$sum+$value" | bc); done < $tmpFile; echo "$sum"); echo 'Awk sum'; time (awk '{ sum += $1 } END { print sum }' $tmpFile); rm $tmpFile
From the fastest to the slowest (bc comes last because the loop starts a new bc process for every line) :
  1. Awk
  2. Bash
  3. bc
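
For integers, another possibility is to avoid the loop altogether : paste joins the lines with +, and the resulting expression is evaluated once by Bash arithmetic. This is only a sketch (sumFile is a made-up name). It assumes a trusted, non-empty file holding one integer per line, without leading zeros (Bash would read them as octal).

```shell
#!/usr/bin/env bash

# sumFile <file> : prints the sum of the integers listed in <file>, one per line.
sumFile() {
	local expression
	expression=$(paste -sd+ "$1")    # e.g. "1+2+3+4"
	echo $((expression))
}

tmpFile=$(mktemp)
printf '%s\n' 1 2 3 4 > "$tmpFile"
sumFile "$tmpFile"
rm "$tmpFile"
```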

Numbers returned to STDOUT :

Dummy example :
echo -e 'result1=3\nresult2=4\nresult3=5' | awk -F '=' '{ sum += $2 } END {print sum}'
Total size, in GiB, of the filesystems backed by a /dev device (df -m reports MiB, hence the division by 1024) :
df -m | awk '/^\/dev/ { sum += $2 } END { print sum/1024 }'
Number of lines matching a given expression within a bunch of files (grep -c counts matching lines, not individual occurrences) :
grep -c 'a given expression' * | awk -F ':' '{ sum += $NF } END {print sum}'
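
grep -c reports the number of matching lines per file, so an expression appearing twice on the same line counts once. To count every occurrence, one possibility is grep -o, which prints each match on its own line. A small sketch on throw-away files (file names and contents made up for the example) :

```shell
#!/usr/bin/env bash

workDir=$(mktemp -d)
printf 'foo bar foo\nbaz\n' > "$workDir/file1"
printf 'foo\n' > "$workDir/file2"

# Matching lines : 1 in file1 + 1 in file2.
matchingLines=$(grep -c 'foo' "$workDir"/* | awk -F ':' '{ sum += $NF } END { print sum }')

# Occurrences : 2 in file1 + 1 in file2.
occurrences=$(grep -o 'foo' "$workDir"/* | wc -l)

echo "Matching lines : $matchingLines"
echo "Occurrences    : $occurrences"

rm -r "$workDir"
```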

How to check the number of arguments passed to a script ?

nbExpectedArgs=1
if [ $# -ne $nbExpectedArgs ]; then
	echo "Usage: $(basename "$0") <argument>"
	exit 1
fi
To display a longer error message with a usage function :
#!/usr/bin/env bash

nbExpectedArgs=3

usage() {
	cat <<-EOF
	Usage: $(basename "$0") <argument1> <argument2> <argument3>
	blah
	blah
	blah
	EOF
	}

if [ $# -ne $nbExpectedArgs ]; then
	usage
	exit 1
fi

The long error message is displayed with cat rather than echo.
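
A common variation, sketched below, sends the usage message to STDERR and wraps the test in a function so that it can be reused. The names checkArgs and usage, and the use of ${0##*/} in place of basename, are choices made for this example.

```shell
#!/usr/bin/env bash

nbExpectedArgs=3

usage() {
	# ${0##*/} removes everything up to the last "/", like basename does.
	echo "Usage: ${0##*/} <argument1> <argument2> <argument3>" >&2
}

# checkArgs "$@" : succeeds when it receives the expected number of arguments.
checkArgs() {
	if [ "$#" -ne "$nbExpectedArgs" ]; then
		usage
		return 1
	fi
}

# In a real script : checkArgs "$@" || exit 1
checkArgs 'one' 'two' 'three' && echo 'Right number of arguments'
checkArgs 'one' || echo 'Wrong number of arguments'
```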