./myScript.sh: errorLine: ./myScript.sh: Bad substitution
$()
construct or brace expansion) that is not supported by the shell used to execute it.
$line
). There are several methods to do so :
Is one faster than the other ?
#!/usr/bin/env bash

nbLines=5000
nbFieldsPerLine=100
fieldSize=10
fieldSeparator=';'
fieldToExtract=76

tmpFile=$(mktemp --tmpdir='/run/shm')
resultFile=$(mktemp --tmpdir='/run/shm')

showStep() {
    stepDescription=$1
    echo -e "\n$stepDescription"
}

showMethod() {
    methodDescription=$1
    showStep "Getting the ${fieldToExtract}th field with the '$methodDescription' method"
}

function getDurationOfAction() {
    action=$1
    { time "$1"; } 2>&1 | awk '/real/ { print $2 }'
}

showStep "Preparing source data : $nbLines lines of $nbFieldsPerLine fields ($fieldSize characters each)"
for((i=0; i<"$nbLines"; i++)); do
    pwgen "$fieldSize" -N "$nbFieldsPerLine" -1 | xargs | tr ' ' "$fieldSeparator" >> "$tmpFile"
done

showMethod 'variable=$(echo | cut)'
getData_echoCutVariableMethod() {
    while read line; do
        data=$(echo "$line" | cut -d "$fieldSeparator" -f "$fieldToExtract")
    done < "$tmpFile"
}
getDurationOfAction 'getData_echoCutVariableMethod'

showMethod "echo | cut > $resultFile"
getData_echoCutResultFileMethod() {
    while read line; do
        echo "$line" | cut -d "$fieldSeparator" -f "$fieldToExtract" > "$resultFile"
    done < "$tmpFile"
}
getDurationOfAction getData_echoCutResultFileMethod

showMethod 'variable=$(echo | awk)'
getData_echoAwkVariableMethod() {
    while read line; do
        data=$(echo "$line" | awk -F "$fieldSeparator" '{print $'$fieldToExtract'}')
    done < "$tmpFile"
}
getDurationOfAction getData_echoAwkVariableMethod

showMethod "echo | awk > $resultFile"
getData_echoAwkResultFileMethod() {
    while read line; do
        echo "$line" | awk -F "$fieldSeparator" '{print $'$fieldToExtract'}' > "$resultFile"
    done < "$tmpFile"
}
getDurationOfAction getData_echoAwkResultFileMethod

rm "$tmpFile" "$resultFile"
Preparing source data : 5000 lines of 100 fields (10 characters each)
Getting the 76th field with the 'variable=$(echo | cut)' method
0m7.684s
Getting the 76th field with the 'echo | cut > resultFile' method
0m5.623s
Getting the 76th field with the 'variable=$(echo | awk)' method
0m12.033s
Getting the 76th field with the 'echo | awk > resultFile' method
0m10.156s
variable=$(echo | cut) is about 1.5 times faster than variable=$(echo | awk) (7.7 s vs 12.0 s in the run above)
Preparing source data : 5 lines of 1000000 fields (10 characters each)
Getting the 4th field with the 'variable=$(echo | cut)' method
0m4.868s
Getting the 4th field with the 'variable=$(echo | awk)' method
0m5.098s

Preparing source data : 5 lines of 1000000 fields (10 characters each)
Getting the 987654th field with the 'variable=$(echo | cut)' method
0m4.909s
Getting the 987654th field with the 'variable=$(echo | awk)' method
0m6.294s
variable=$(echo | ...) is slower than echo | ... > resultFile. Possible explanations :
Preparing source data : 1000 lines of 10 fields (10 characters each)
Getting the 8th field with the 'variable=$(echo | cut)' method
0m1.371s
Getting the 8th field with the 'echo | cut > /run/shm/tmp.ofVrnUNU8E' method
0m1.025s
Getting the 8th field with the 'variable=$(echo | awk)' method
0m2.355s
Getting the 8th field with the 'echo | awk > /run/shm/tmp.ofVrnUNU8E' method
0m1.941s

Preparing source data : 1000 lines of 10 fields (10 characters each)
Getting the 8th field with the 'variable=$(echo | cut)' method
0m1.396s
Getting the 8th field with the 'echo | cut > /tmp/tmp.BIxB7VAAn7' method
0m1.551s
Getting the 8th field with the 'variable=$(echo | awk)' method
0m2.516s
Getting the 8th field with the 'echo | awk > /tmp/tmp.BIxB7VAAn7' method
0m2.465s
while IFS=, read -r field1 field2; do
    # do something with "$field1" and "$field2"
done < input.csv
IFS= + read -r is the "proper" way of doing this, but it requires naming ALL the data fields, even those that are not used inside the loop. Moreover, this can make the while ... line longer and hurt code readability when there are MANY data fields.
function foo() {} or foo() {}
(source) ?
There is (almost) no difference when working on GNU/Linux :
functionName lineNumber /path/to/functions.sh
functionName lineNumber /path/to/functions.sh
The shebang is the first line of a script (shell, Python, Perl, ...). It tells the operating system which binary must be used to interpret and execute the script commands. A shebang starts with #!, optionally followed by a space.
As for shell scripts, and especially Bash scripts, there are several flavors of shebangs :
Shebang | Pros | Cons |
---|---|---|
#!/bin/sh | short and simple | |
#!/bin/bash | calling THE Bash binary with its absolute path : short, simple, efficient. This is the safest. | some may argue this is less portable because the Bash binary may not be /bin/bash but /usr/bin/bash or /usr/local/bin/bash or ... (but I guess symlinks would be created adequately in such situations anyway) |
#!/usr/bin/env bash | find the Bash binary wherever it is (env runs the 1st bash found in the PATH) : different install path (system-wide), customized path (user-level setting). This is more portable. | |
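Whichever shebang is used, a script can still end up being run by another shell (for instance when it is launched with sh myScript.sh). The sketch below shows one possible safeguard : it relies on the BASH_VERSION variable, which Bash sets for itself and which is normally unset in other shells. The function name isRunningUnderBash is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch : detect whether the current interpreter is Bash.
# BASH_VERSION is set by Bash itself ; other shells normally leave it unset.
isRunningUnderBash() {
    [ -n "${BASH_VERSION:-}" ]
}

if isRunningUnderBash; then
    interpreterStatus='bash'
else
    interpreterStatus='not bash'
fi

echo "interpreter : $interpreterStatus"
```

A script relying on Bash-only syntax could exit early with an explicit message when the check fails, rather than dying later on an obscure syntax error.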
source config.sh
declare -a myArray
myArray[0]='john'
myArray[1]='paul'
myArray[2]='george'
myArray[3]='ringo'
alternate syntax (it is valid to declare + assign at once) :
declare -a myArray=(john paul george ringo)
echo "${myArray[@]}"
john paul george ringo
echo "${myArray[2]}"
george
for loop :
for item in "${myArray[@]}"; do
    echo "$item"
done
john
paul
george
ringo
myArray+=('the_5th_guy')
echo "${myArray[@]}"
john paul george ringo the_5th_guy
declare -A myArray
myArray[foo]='this is "foo"'
myArray[bar]='this is "bar"'
myArray[baz]='this is "baz"'
for key in "${!myArray[@]}"; do
    echo -e "key :\t$key"
    echo -e "value :\t${myArray[$key]}\n"
done
key :   bar
value : this is "bar"

key :   baz
value : this is "baz"

key :   foo
value : this is "foo"
for value in "${myArray[@]}"; do
echo -e "value :\t$value\n"
done
value : this is "bar"

value : this is "baz"

value : this is "foo"
${#myArray}
is :
The ${#} construct has only 1 meaning : return the length of the enclosed string. What's tricky with ${#myArray} is that omitting the index actually refers to the 1st item of the array (aka myArray[0], sources : 1, 2).
${#myArray} is the length of the 1st item of myArray
${#myArray[@]}
:
3    number of fruits
5    length of string apple
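As a minimal sketch of the difference between both constructs, the snippet below uses a hypothetical fruits array whose contents were chosen to match the output above (only apple is known from that output, the other items are assumptions).

```bash
#!/usr/bin/env bash
# Hypothetical array : only 'apple' comes from the sample output,
# the 2 other items are made up for the example.
fruits=(apple banana cherry)

numberOfFruits=${#fruits[@]}    # number of items in the array
lengthOfFirstItem=${#fruits}    # same as ${#fruits[0]} : length of 'apple'

echo "$numberOfFruits number of fruits"
echo "$lengthOfFirstItem length of string ${fruits[0]}"
```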
0 Lorem
1 ipsum
2 dolor
3 sit
4 amet
Length : 5
0 Lorem
1 ipsum
2 dolor
4 amet no more myArray[3]
Length : 4
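A listing like the one above can be reproduced with the following sketch. It assumes the array holds the 5 words shown in the output ; the showArray helper is a hypothetical name. After unset, the remaining indexes are not renumbered, which is why index 3 is simply skipped.

```bash
#!/usr/bin/env bash
myArray=(Lorem ipsum dolor sit amet)

# Print every index with its value, then the number of items.
showArray() {
    local index
    for index in "${!myArray[@]}"; do
        echo "$index ${myArray[$index]}"
    done
    echo "Length : ${#myArray[@]}"
}

showArray
# Quoted so that the shell can not treat [3] as a glob pattern.
unset 'myArray[3]'
showArray
```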
This article is about coding style, which is not only personal but also far from perfect (by design). When it comes to declaring a variable that stores a path in a shell script, I see at least 2 methods, and so far I've still not settled on either of them (so writing this article may help) :
method 1 : the trailing / is in the value :
somePath='/path/to/directory/'
foo="${somePath}foo"
bar="${somePath}bar"
method 2 : no trailing / in the value :
somePath='/path/to/directory'
foo="$somePath/foo"
bar="$somePath/bar"
 | Pros | Cons |
---|---|---|
method 1 | | |
method 2 | | |
Method 1 is 70 characters long and method 2 is 67. But after removing the characters that are typed in both methods (variable names, =, $, quotes, ...), we get (with { and } being 2 keystrokes each on a French keyboard) :
Let's settle on method 2 ! (until we find further arguments)
horizontal list and a for loop :
data='key1;value1 key2;value2'; for tuple in $data; do key=$(echo $tuple | cut -d ';' -f 1); value=$(echo $tuple | cut -d ';' -f 2); echo "tuple: '$tuple', key : '$key', value : '$value'"; done
vertical list and a while read loop :
(the \ right after the opening " matters : without it, the 1st round has empty values. Likewise, if the closing " is on a new line, the last round has empty values)
data="\
key1 value1
key2 value2
key3 value3"
while read key value; do
echo "key: '$key', value: '$value'"
done <<< "$data"
key: 'key1', value: 'value1'
key: 'key2', value: 'value2'
key: 'key3', value: 'value3'
\ ?
(a \ was added at the end of the key2 line)
data="\
key1 value1
key2 value2\
key3 value3"
while read key value; do
echo "key: '$key', value: '$value'"
done <<< "$data"
key: 'key1', value: 'value1'
key: 'key2', value: 'value2key3 value3'
The \ added to the data is a line continuation : the \ + the non-printable [newline] that follows it are both removed, so the 2 lines are joined.
theWorldIsFlat=true
# ...do something interesting...
if $theWorldIsFlat; then
echo 'Be careful not to fall off!'
fi
Bash has no boolean type. This hack works because $theWorldIsFlat expands to true at run time : the true command is then executed and returns a Unix success (the same goes for false, which returns a Unix failure).
If you're not convinced that the variable simply expands to the true command, try these :
#!/usr/bin/env bash

UNIX_SUCCESS=0
UNIX_FAILURE=1

returnBoolean() {
    wantedReturnValue="$1"
    case "$wantedReturnValue" in
        "$UNIX_SUCCESS")
            return $(true)
            ;;
        "$UNIX_FAILURE")
            return $(false)
            ;;
    esac
}

for result in $UNIX_SUCCESS $UNIX_FAILURE; do
    returnBoolean $result
    returnCode=$?
    [ "$returnCode" -eq "$result" ] && echo OK || echo KO
done
unset a; [ "$a" ] && echo OK || echo KO; a=1; [ "$a" ] && echo OK || echo KO; a=0; [ "$a" ] && echo OK || echo KO
KO
OK
OK
#!/usr/bin/env bash

myVar='Hello'
dynamicVar="myVar"
echo "The value of '$dynamicVar' is '${!dynamicVar}'."
Will output :
The value of 'myVar' is 'Hello'.
for variableToCheck in variableA variableB variableC; do
    echo "The variable '$variableToCheck' has value : '${!variableToCheck}'."
    if [ -z "${!variableToCheck}" ]; then
        <deal with it !>
    fi
done
When it comes to checking the input of a script for missing parameters, it is better to simply count the parameters than to test every variable. Indeed, when expecting command parameterA parameterB and getting command foo, you can't tell which of parameter A or B is missing (unless you're using named parameters, of course).
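Counting parameters can be sketched with $#, which holds the number of positional parameters. The checkParameterCount helper and the expected count of 2 are hypothetical ; they only mirror the command parameterA parameterB example.

```bash
#!/usr/bin/env bash
# Hypothetical helper : succeeds only when called with exactly 2 parameters.
checkParameterCount() {
    local expectedCount=2
    if [ "$#" -ne "$expectedCount" ]; then
        echo "Usage : command parameterA parameterB (got $# parameter(s))" >&2
        return 1
    fi
}

checkParameterCount foo bar && echo 'parameters OK'
checkParameterCount foo || echo 'missing parameter(s)'
```

In a real script, the check would typically be called as checkParameterCount "$@" before doing anything else.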
#!/usr/bin/env bash

listOfLists='colors fruits cars'
colors='red green blue'
fruits='apple banana grape'
cars='ferrari porsche lada'
output=''

# generate all permutations of the list of lists
for aList in $listOfLists; do
    for anotherList in $listOfLists; do
        [ "$anotherList" = "$aList" ] && continue
        for oneMoreList in $listOfLists; do
            [ "$oneMoreList" = "$anotherList" ] || [ "$oneMoreList" = "$aList" ] && continue
            output="$output\nLISTS: 1.$aList 2.$anotherList 3.$oneMoreList"
            # we now have all the list of lists, let's list the contents of all those lists
            for item1 in ${!aList}; do
                for item2 in ${!anotherList}; do
                    for item3 in ${!oneMoreList}; do
                        output="$output\n$item1 $item2 $item3"
                    done
                done
            done
            # contents over
        done
    done
done

echo -e "$output" | column -s ' ' -t
Most of the tests below actually mean "true if file exists and is ...". For the sake of brevity, I've cut the "file exists" part because no file can be anything if it doesn't exist.
Option | true if ... |
---|---|
-b | file is a block device special file, such as /dev/sda : brw-rw---T 1 root disk 8, 0 sept. 22 14:23 /dev/sda |
-d | file is a directory |
-e | file exists, whatever type of file it is |
-f | file is a regular file, not a directory or a device |
-h -L | file is a symbolic link |
-r | file is readable |
-s | file has data (i.e. is not zero-sized) |
-t fd | file descriptor fd is open and refers to a terminal (which highlights an interactive shell when testing the 0, 1 or 2 file descriptors) |
-x | file is executable |
! | logical NOT : negates the test that follows. Must be followed by a whitespace. |
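Several of these options can be combined to classify a path, as in the sketch below. The describeFile helper and the file names are hypothetical. The -L test comes first because -e follows symbolic links and therefore fails on a dangling link.

```bash
#!/usr/bin/env bash
# Hypothetical helper : print a short description of the given path.
describeFile() {
    local file=$1
    if [ -L "$file" ]; then
        echo 'symbolic link'
    elif [ ! -e "$file" ]; then
        echo 'missing'
    elif [ -d "$file" ]; then
        echo 'directory'
    elif [ -f "$file" ] && [ ! -s "$file" ]; then
        echo 'empty regular file'
    elif [ -f "$file" ]; then
        echo 'regular file'
    else
        echo 'other'
    fi
}

workDir=$(mktemp -d)
touch "$workDir/empty"
echo 'data' > "$workDir/full"

describeFile "$workDir"
describeFile "$workDir/empty"
describeFile "$workDir/full"
describeFile "$workDir/absent"

rm -r "$workDir"
```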
It is a safe practice to always quote tested strings.
Option | Usage |
---|---|
-n | string length is not null (length > 0) |
-z | string length is zero (length == 0) |
nonEmptyString='hello'; [ -n "$nonEmptyString" ] && echo A || echo B; [ -z "$nonEmptyString" ] && echo C || echo D
A
D
emptyString=''; [ -n "$emptyString" ] && echo A || echo B; [ -z "$emptyString" ] && echo C || echo D
B
C
unset unsetString; [ -n "$unsetString" ] && echo A || echo B; [ -z "$unsetString" ] && echo C || echo D
B
C
[ "$a" = "$b" ] [ "$a" == "$b" ] [[ "$a" == "$b" ]]
[ "$a" != "$b" ] [[ "$a" != "$b" ]]
[ "$a" -eq "$b" ] [[ "$a" = "$b" ]] [[ "$a" == "$b" ]]
[ "$a" -ne "$b" ] [[ "$a" != "$b" ]]
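One caveat about the forms listed above, sketched with hypothetical values : -eq and -ne compare integers, whereas =, == and != compare strings, inside [ ] as well as inside [[ ]]. Both families only agree when the numbers are written the same way.

```bash
#!/usr/bin/env bash
# Hypothetical values : the same number, written in 2 different ways.
a='01'
b='1'

# Integer comparison : 01 and 1 are the same number.
[ "$a" -eq "$b" ] && numericTest='equal' || numericTest='different'

# String comparison : '01' and '1' are different strings.
[[ "$a" == "$b" ]] && stringTest='equal' || stringTest='different'

echo "numeric comparison : $numericTest"
echo "string comparison  : $stringTest"
```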
if [ $condition1 ] && [ $condition2 ]
if [[ $condition1 && $condition2 ]]
if [ $condition1 -a $condition2 ]
if [ $condition1 ] || [ $condition2 ]
if [[ $condition1 || $condition2 ]]
if [ $condition1 -o $condition2 ]
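The sketch below applies the first 2 forms of each group to real file tests (the helper names and the temporary file are hypothetical). The -a and -o forms are left out on purpose : POSIX marks them as obsolescent because they become ambiguous with some operands.

```bash
#!/usr/bin/env bash
# AND : both tests must succeed.
isNonEmptyRegularFile() {
    [ -f "$1" ] && [ -s "$1" ]
}

# OR : one successful test is enough.
isFileOrDirectory() {
    [[ -f "$1" || -d "$1" ]]
}

sampleFile=$(mktemp)
echo 'some data' > "$sampleFile"

isNonEmptyRegularFile "$sampleFile" && echo 'non-empty regular file'
isFileOrDirectory "$(dirname "$sampleFile")" && echo 'file or directory'
isFileOrDirectory "$sampleFile.missing" || echo 'neither file nor directory'

rm "$sampleFile"
```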
if
:if
operator :
if(true) { }
Bash makes it possible to mimic a unary if operator, but this looks error-prone (read below). Instead, this looks more reliable :
if [ "$myVar" -eq "$UNIX_SUCCESS" ]; then
fi
if [[ "$myVar" == "$UNIX_SUCCESS" ]]; then
fi
if and booleans :
0 1 0 0 0 0
The [ whatever ] construct is short for [ -n whatever ] and tests the string length (source, details, examples)
0    actually testing the non-empty string true, so UNIX_SUCCESS
0
0
1    testing an empty string, so UNIX_FAILURE
1    true returns a UNIX_SUCCESS but displays nothing (i.e. empty string)
1    similar reason for false
0    echo will always display a string...
1    ...unless properly silenced
[ ] , just chain commands with the proper operators.
[[ ]] .
[ (test command) and [[ ("new test" command) are used to evaluate expressions :
[ hello ] && echo ok || echo ko
ok    testing any non-empty string is a success (details)
[ '' ] && echo ok || echo ko
ko    testing an empty string is a failure (details)
[ -x /bin/true ] && echo ok
ok
[ -x /bin/true -a -e /bin/plop ] && echo ok || echo ko
ko
[ $(which cp) ] && echo ok || echo ko
ok
[ $(which plop) ] && echo ok || echo ko
ko    the $() construct returns an empty string, which causes the test to fail
[ "$(which plop)" ] && echo ok || echo ko
ko    no change when quoting a regular empty string
[ $(which cp) -a $(which ls) ] && echo ok || echo ko
ok
[ $(which cp) -a $(which plop) ] && echo ok || echo ko
bash: [: /bin/cp: unary operator expected
      this is because $(which cp) evaluates to /bin/cp and $(which plop) evaluates to nothing (i.e. unquoted empty string, aka no argument) resulting in : [ /bin/cp -a ]
[ "$(which cp)" -a "$(which plop)" ] && echo ok || echo ko
ko    no more error because "" is a valid argument to -a
[ '$(which cp)' -a '$(which plop)' ] && echo ok || echo ko
ok    testing literal strings $(which cp) and $(which plop) : not empty, then success
set -flag
set -o optionName
and set +o optionName
Read full details and examples about Bash script flags in the article dedicated to set.
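As a short sketch of the set -o / set +o pair, the snippet below switches the nounset option on and off and checks its state each time. The isOptionEnabled helper is a hypothetical name ; it relies on shopt -qo, which silently tests an option of the set -o family.

```bash
#!/usr/bin/env bash
# Hypothetical helper : succeeds when the given 'set -o' option is enabled.
isOptionEnabled() {
    shopt -qo "$1"
}

set -o nounset      # long form, same as the short flag : set -u
isOptionEnabled nounset && echo 'nounset is on'

set +o nounset      # + disables the option, same as : set +u
isOptionEnabled nounset || echo 'nounset is off'
```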