BASH, the Bourne-Again shell - (on my way to) mastering the command line


subshell

What is a subshell ?

A subshell happens when a shell fork()s a child process, and that child process does not exec() anything (source).
For more gory details about fork, exec, processes and subshells, read below.

The following commands create subshells:

There's a slight difference between :
  • a subshell
  • a child process that happens to be a shell
Check it with :
unset -v a; a=1
(echo "in the subshell, a is $a.")
sh -c 'echo "in the child shell, a is $a."'
in the subshell, a is 1.
in the child shell, a is .
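
Another way to observe the difference is to compare process IDs. This is only a sketch : it assumes Bash 4 or later (for BASHPID) and that bash can be found in PATH; the variable names are made up for the example.

```bash
#!/usr/bin/env bash
parentPid=$$

# a subshell is a fork of the current shell : $$ keeps the parent's value,
# only BASHPID reveals the PID of the forked process
subshellDollar=$( echo "$$" )
subshellBashpid=$( echo "$BASHPID" )

# a child shell is a new bash process : its own $$ is different
childDollar=$( bash -c 'echo "$$"' )

echo "parent      : $parentPid"
echo "subshell    : \$\$=$subshellDollar, BASHPID=$subshellBashpid"
echo "child shell : \$\$=$childDollar"
```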

More about fork, exec, processes and subshells :

https://tldp.org/LDP/abs/html/subshells.html

https://www.gnu.org/software/bash/manual/bash.html#Command-Execution-Environment


https://unix.stackexchange.com/questions/264169/on-fork-children-processes-and-subshells/264199#264199
- you're in a console (i.e. shell) and type : [command][ENTER]
- 'fork' makes a new process, with a new PID, that starts running in parallel from exactly where this one left off
- 'exec' replaces the currently-executing code with a new program loaded from somewhere, running from the beginning
	==> if this "new program" is the same binary (e.g. /bin/bash), there is nothing to 'exec'

so, when you spawn a new program :
- you first 'fork' yourself
- then 'exec' that program in the child
That is the fundamental theory of processes that applies everywhere, inside and outside of shells.

Subshells are forks, and every non-builtin command you run leads to both a 'fork' and an 'exec'.


https://unix.stackexchange.com/questions/264169/on-fork-children-processes-and-subshells/264192#264192
'fork', assuming all goes well, returns twice :
- one return is in the parent process (which has the original process ID)
- and the other in the new child process (a different process ID but otherwise sharing much in common with the parent process).

At this point, the child could 'exec' something, which would cause some "new" binary to be loaded into that process, though the child need not do that, and could run other code already loaded via the parent process (zsh functions, for example).
Hence, a fork may or may not result in a "completely new" process, if "completely new" is taken to mean something loaded via an 'exec' system call.

details on fork + exec
https://en.wikipedia.org/wiki/Fork%E2%80%93exec

Why does this matter ?

This matters because of variable scope.
I'm sure you already heard that :
What Happens in Vegas, Stays in Vegas.
Subshells are like Las Vegas : if you send a variable there, what happens there, stays there.
In other words : any change made to a variable within a subshell will be invisible to the parent shell.

Let's check :

#!/usr/bin/env bash
a=3

displayA() {
	local message=$1
	echo -e "a = $a\t$message"
	}

main() {
	displayA 'in main, just started'

	let "a+=1"
	displayA 'in main, after increment'

	{
	let "a+=1"
	displayA 'in curly braces, after increment within braces'
	}

	displayA 'back in main'

	(
	let "a+=1"
	displayA 'in parentheses, after increment within parentheses'
	)

	displayA 'back in main'

	{
	let "a+=1"
	displayA 'increment + display as a background process'
	} &
	wait
	displayA 'back in main'

	}

main
a = 3   in main, just started
a = 4   in main, after increment
a = 5   in curly braces, after increment within braces
a = 5   back in main
a = 6   in parentheses, after increment within parentheses
a = 5   back in main
a = 6   increment + display as a background process
a = 5   back in main
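
A classic place where this bites is a while read loop fed by a pipe. The sketch below assumes Bash with its default settings (the lastpipe option is off), where each side of a pipe runs in a subshell; the variable names are illustrative.

```bash
#!/usr/bin/env bash
# the loop runs in a subshell : the increments are lost
countPipe=0
printf 'a\nb\nc\n' | while read -r _line; do
    countPipe=$((countPipe + 1))
done
echo "after the pipe      : $countPipe"

# same loop fed by a here document : no pipe, no subshell
countHeredoc=0
while read -r _line; do
    countHeredoc=$((countHeredoc + 1))
done << EOF
a
b
c
EOF
echo "after the heredoc   : $countHeredoc"
```

Feeding the loop through a redirection instead of a pipe keeps it in the current shell, so the counter survives.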

process substitution : <() and >()

Process substitution :

Here Strings : <<<

Here Strings are a variant of here documents. The format is :
[n] <<< word

How it works

  1. word is submitted to
    • tilde expansion
    • parameter and variable expansion
    • command substitution
    • arithmetic expansion
    • quote removal
    but not to :
    • pathname expansion
    • word splitting
  2. a newline \n is appended
  3. the result is supplied as a single string to the command on its standard input (or file descriptor n if provided)
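
The steps above can be checked with a few one-liners. This is a minimal sketch, the variable names are only there for the example :

```bash
#!/usr/bin/env bash
# the whole word becomes the standard input of read
read -r first rest <<< 'hello big world'
echo "first=$first, rest=$rest"

# a newline is appended, so wc sees exactly one line
lineCount=$(wc -l <<< 'hello' | tr -d '[:space:]')
echo "lines : $lineCount"

# arithmetic expansion is performed on the word
sum=$(cat <<< $((2 + 3)))
echo "sum : $sum"
```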

Shell options and arguments

Shell commands are structured like :
command [options] [arguments]
with :

More definitions (sources : 1, 2) :

Considering a command :
  • argument : any word following command
  • option : an argument starting with a -
  • non-option argument (aka operand) : any other argument
With an example :
tail -n 3 myFile
  • tail is the command
  • -n is an option
  • 3 is the value of that option
  • myFile is what tail will work on, aka an operand
You can consider that a non-option argument is a parameter that command will process : string, file, directory, ...
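
To see the same split from inside a script, here is a sketch based on the getopts builtin. The function name describeCall and its default of 10 lines are assumptions made for the example, it does not run tail at all.

```bash
#!/usr/bin/env bash
describeCall() {
    local OPTIND=1 opt lines=10
    # 'n:' means : -n is an option that takes a value
    while getopts 'n:' opt; do
        case $opt in
            n) lines=$OPTARG ;;
            *) return 2 ;;
        esac
    done
    # drop the options that were parsed : what remains are the operands
    shift $((OPTIND - 1))
    printf 'option -n=%s operand=%s\n' "$lines" "$1"
}

describeCall -n 3 myFile
```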

system variables (aka environment variables + shell variables)

The title of this article : "system variables" is intended as a generic term meaning both the environment variables and the shell variables

There's a subtle difference between them, which is mostly a matter of scope (source) :

  • environment variables are shell variables that have been exported : they are passed on to every child process of the shell.
  • shell variables only exist in a given shell context (such as an interactive shell, or a subshell created while executing a script).
Since any shell variable can become an environment variable once exported, I see no need of listing them in distinct articles.
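
A quick check of that scope difference, as a sketch (the variable names shellOnly and exported are made up for the example) :

```bash
#!/usr/bin/env bash
unset -v shellOnly exported
shellOnly='known by this shell only'
export exported='passed on to children'

# sh -c starts a child process : it only receives exported variables
seenShell=$(sh -c 'echo "${shellOnly:-<unset>}"')
seenEnv=$(sh -c 'echo "$exported"')

echo "shell variable seen by the child       : $seenShell"
echo "environment variable seen by the child : $seenEnv"
```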

!!
previous command (source : man -P 'less -p "\\!\\!"' history)
Technically, this is not a variable but a command. It's listed here because this article is where I'll start searching when facing such cryptic things.
!$
last argument of the preceding command. Said to be equivalent to $_ (see also)
$!
PID of last job run in background. Try it : pwd & echo $!
$#
number of parameters passed to the current script or function
Check it :
#!/usr/bin/env bash

tmpScript=$(mktemp tmpScript.XXXXXXXX)

cat << 'EOF' > "$tmpScript"
#!/usr/bin/env bash
main() { echo $# : $@; }
main "$@"
EOF

chmod +x "$tmpScript"
./"$tmpScript"
./"$tmpScript" foo
./"$tmpScript" foo bar
./"$tmpScript" 'foo bar'
rm "$tmpScript"
0 :
1 : foo
2 : foo bar
1 : foo bar
$$
PID of current process
  • It _may_ sound like a good idea to use this as part of a temporary file name so that concurrent executions of a script won't collide, but mktemp will do that better.
  • It's very convenient to detect which flavor of shell is being used : ps $$
    PID	TTY	STAT	TIME	COMMAND
    18060	pts/1	Ss	0:01	/bin/bash
$*
all the parameters passed to the current script, as a space-separated string (source) :
  • $* and $@ hold a b c d (4 distinct elements)
  • "$*" holds a b c d (a single string having spaces). This is not iterable because of the explicit double-quotes.
  • "$@" holds a b c d (the 3 script parameters). The double quotes protect spaces when they are part of the parameter itself.
Check it :
#!/usr/bin/env bash
# run :	./script.sh a b 'c d'

echo -e '\nwith $* :'
for i in $*; do echo $i; done

echo -e '\nwith $@ :'
for i in  $@; do echo $i; done

echo -e '\nwith "$*" :'
for i in  "$*"; do echo $i; done

echo -e '\nwith "$@" :'
for i in  "$@"; do echo $i; done
with $* :
a
b
c
d

with $@ :
a
b
c
d

with "$*" :
a b c d

with "$@" :
a
b
c d
$-
current options set for the shell, the doc says it is not completely reliable
$0
name of the shell or current shell script (see also ${BASH_SOURCE[0]})
$?
decimal exit code of the previous process (aka "most recent pipeline").
It also holds the value a function handed back with return.
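Check it with this sketch (isEven is a hypothetical helper, not a Bash builtin) :

```bash
#!/usr/bin/env bash
# returns 0 (success) for even numbers, 1 for odd ones
isEven() { return $(( $1 % 2 )); }

statusEven=0; isEven 4 || statusEven=$?
statusOdd=0;  isEven 7 || statusOdd=$?

echo "isEven 4 : $statusEven"
echo "isEven 7 : $statusOdd"
```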
$@
all the parameters passed to the current script or function, as separate words. This is an "iterable list of strings" (source, see $* for details)
You should practically always use double quotes around "$@".
$_
final argument of previous command executed :
testFile='./test'; echo $_; echo hello >"$testFile"; echo $_; rm "$testFile"; echo $_
hello	this is the last argument of echo, even if there are some extra directives on the command line
./test	this refers to "$testFile" coming after rm : the value has been substituted
BASH_SOURCE[0]
this is almost a synonym of $0, with slight differences (source) :
${BASH_SOURCE[0]} :
  • holds the current script name, when executed or sourced
  • is read-only
  • is Bash-specific
  • can be shortened to $BASH_SOURCE (Bash allows referring to the element 0 of an array with the array name itself)
$0 :
  • holds the current script name only when executed (something else when sourced)
  • can be overwritten
  • is POSIX-compliant
Check it with this script :
#!/usr/bin/env bash
echo -e "
\$0:$0
\$BASH_SOURCE:$BASH_SOURCE
\${BASH_SOURCE[0]}:${BASH_SOURCE[0]}
" | column -s ':' -t
  • ./test.sh
    $0			./test.sh
    $BASH_SOURCE		./test.sh
    ${BASH_SOURCE[0]}	./test.sh
  • source test.sh
    $0			/bin/bash
    $BASH_SOURCE		test.sh
    ${BASH_SOURCE[0]}	test.sh
DISPLAY
  • indicates where the X server is. Format :
    host:displayNumber.screenNumber
    Numbers start at 0 !
  • To temporarily connect to a specific X server, just type :
    command_line -display (or --display ?) host:displayNumber.screenNumber
  • If you're connecting to the remote host via SSH, consider using the X11 forwarding option.
EDITOR
define which editor to use with :
http_proxy
Set it with :
export http_proxy=http://host:port
Same goes on for https_proxy and ftp_proxy.
HISTFILE
Upon Bash process termination, all typed commands are written into a history file (~/.bash_history by default) whose absolute path is stored in HISTFILE.
HOME
current user's home directory, and default for the cd builtin command
About unquoted tilde expressions : ~whatever (source) :
  • if whatever is a valid user name (like ~bob), the tilde expression is expanded into bob's home directory
  • if whatever is an empty string, ~ is expanded into $HOME
  • otherwise, whatever is left unchanged
HOSTNAME
The name of the current host
IFS
LANG (1, 2, 3)
  • sets the default locale, i.e. the locale used when no more specific setting (LC_COLLATE, LC_NUMERIC, LC_TIME, ...) is provided; it doesn't override any setting, it provides the base value.
  • prefixing a shell command with LANG=C is a common hack to ensure the output messages will be displayed in the default language (i.e. English), which is important for commands that select / filter text
  • list the available locales
LANGUAGE
  • not listed in the Bash reserved variables names (???)
  • usage :
    LANGUAGE=en find charlie
    	find: 'charlie': No such file or directory
    LANGUAGE=fr find charlie
    	find: 'charlie': Aucun fichier ou dossier de ce type
LC_NUMERIC
defines whether numbers should be displayed with a point or a comma as a decimal separator
LOGNAME
Depending on the Unix / Linux flavor :
  • either the name of the user that initially logged in, even though this user executed su afterwards (source : 1, 2)
  • or a synonym for USER
OLDPWD
The previous working directory as set by the cd builtin (see PWD)
PATH
Path to binaries. Since this is a colon (:)-separated list, to append a new value :
  • DON'T :
    PATH=path/to/some/directory
    this would overwrite the whole content of the variable
  • DO :
    PATH=$PATH:path/to/some/directory
PIPESTATUS
Array holding exit statuses of last executed foreground pipe. Indexes start at 0.
true | true | false | true | false | false | true; echo ${PIPESTATUS[*]}
0 0 1 0 1 1 0
true | true | false | true | false | false | true; echo "${PIPESTATUS[0]} ${PIPESTATUS[1]} ${PIPESTATUS[2]} ${PIPESTATUS[3]} ${PIPESTATUS[4]} ${PIPESTATUS[5]} ${PIPESTATUS[6]}"
0 0 1 0 1 1 0
PPID
The shell's parent process identifier (or parent's pid if you like)
PROMPT_COMMAND
The contents of this variable are executed as a regular Bash command just before Bash displays a prompt.
PS1
is the shell prompt (details)
PWD
The current working directory as set by cd (see OLDPWD). Using currentDir=$(pwd) is double work since PWD is already available.
RANDOM
SHELL
  • full pathname to the shell
  • if not set when the shell starts, Bash assigns to it the full pathname of the current user's login shell
SHLVL
nesting level of shells : incremented each time a new Bash instance is started (starts at 1; a subshell does not change it). Try it : echo -n $SHLVL; bash -c 'echo -n $SHLVL; bash -c "echo -n \$SHLVL"; echo -n $SHLVL; exit; '; echo -n $SHLVL will output :
12321
TMPDIR
directory in which Bash creates temporary files for its own use
UID
The numeric real user id of the current user, readonly.
USER
the current user name, replaces an overkill $(whoami) in scripts

This variable is often set as an environment variable by Bash login startup files, but it is not actually a Bash builtin (source).


WARNING: terminal is not fully functional, Error opening terminal: screen.xterm-256color.

Situation

Running some commands / binaries (man, less, nano, ...) in some specific conditions (like within an SSH session nested inside a screen session), you may get errors like the ones in the title.

Details

This is because the TERM environment variable is not / improperly set.

Solution

export TERM=xterm-256color
Or fix it permanently.

-bash: !whatever: event not found, exclamation marks, events, what are these ?

Long story short :
  1. this has to do with the shell history
  2. history entries are named events
  3. the shell interprets inputs like !something as commands to recall an event (details)

So, if you just want to use an exclamation mark (not followed by a space) in a string, just single-quote it :

echo Hello world !
Hello world !
echo Hello world ! Hello everybody !
Hello world ! Hello everybody !
echo ah!ah!ah!
bash: !ah!ah!: event not found
echo "ah!ah!ah!"
bash: !ah!ah!: event not found
echo 'ah!ah!ah!'
ah!ah!ah!
echo Hello everybody !!
Surprise... but remember !! can express more than just extreme happiness

Bash shortcut keys

Bash has Emacs (default, see table below) and Vi shortcut sets. To define your favorite shortcut style, run set -o emacs or set -o vi.
Shortcut : action
CTRL-a : move to the beginning of the line (conflicts with screen's CTRL-a-... shortcuts)
CTRL-c : send the SIGINT signal to stop the current process
ALT-d : delete from the cursor to the end of the current word
CTRL-d : exit the current shell (when the line is empty)
CTRL-e : move to the end of the line
CTRL-l : clear the screen
CTRL-q : resume from CTRL-s
CTRL-r : search history of commands
CTRL-s : pause output to the screen. Useful when running a verbose command :
for i in {1..1000}; do echo $i; sleep 0.1; done
CTRL-x CTRL-x : jump back and forth between the cursor position and the beginning of the line
CTRL-z : send the SIGTSTP signal to the current foreground process to suspend it. Use fg to bring it back to the foreground, or bg to let it continue in the background.
CTRL-_ : undo the last keystroke. Can be repeated to undo several keys back
ALT-. : paste the last argument of the previous command. Repeat to cycle back in previous commands

How to search + replace on a specific line of a file ?

Situation

I have :
testString='I like bananas.\nbananas are great.\nbananas are what I prefer.\nMy favorite fruit is bananas.'; echo -e "$testString"
I like bananas.
bananas are great.
bananas are what I prefer.
My favorite fruit is bananas.
I want :
I like bananas.
bananas are great.
oranges are what I prefer.
My favorite fruit is bananas.

Solution

With awk (source) :

  • return the changed line only :
    echo -e "$testString" | awk '/prefer/ { gsub("bananas","oranges",$0); print $0 }'
    oranges are what I prefer.
  • return all lines, including the changed one :
    echo -e "$testString" | awk '/prefer/ { gsub("bananas","oranges",$0); } { print $0 }'
    I like bananas.
    bananas are great.
    oranges are what I prefer.
    My favorite fruit is bananas.
  • to do the same to a text file, consider the -i inplace option of GNU Awk

With sed :

echo -e "$testString" | sed -r '/prefer/ s/bananas/oranges/'

I like bananas.
bananas are great.
oranges are what I prefer.
My favorite fruit is bananas.

testFile='./testFile'; echo -e "$testString" > "$testFile"; echo 'BEFORE :'; cat "$testFile"; sed -i '/prefer/ s/bananas/oranges/' "$testFile"; echo -e '\nAFTER :'; cat "$testFile"; rm "$testFile"

BEFORE :
I like bananas.
bananas are great.
bananas are what I prefer.
My favorite fruit is bananas.

AFTER :
I like bananas.
bananas are great.
oranges are what I prefer.
My favorite fruit is bananas.

At first sight, sed looked simpler to me than Awk, but Awk is also able to work on a specific field of every line, or on a specific field of a specific line.
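
As an illustration of "a specific field of a specific line", here is a sketch that replaces the 1st field of the 3rd line only. It reuses the test string of this article; picking the line by its number (NR == 3) rather than by a pattern is my choice for the example.

```bash
#!/usr/bin/env bash
testString='I like bananas.\nbananas are great.\nbananas are what I prefer.\nMy favorite fruit is bananas.'

# NR is the current line number, $1 the 1st field of that line
result=$(printf '%b\n' "$testString" | awk 'NR == 3 { $1 = "oranges" } { print }')
echo "$result"
```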


How commands are read by Bash : quotes, wildcards, expansions, substitutions

After you enter some text and press [ENTER], Bash will (source) :
  1. read input
  2. ignore the comment sign # and the rest of the line —if any
  3. break it up into words and operators, obeying the quoting rules (escape characters, simple and double quotes) :
    • backslash \ : preserve the literal value of the following character (except newline)
    • single quotes '' : preserve the literal value of each character enclosed within the quotes. A single quote may not occur between single quotes, even when preceded by a \.
    • double quotes "" : preserve the literal value of all characters enclosed within the quotes, except for $, ` and \.
    The * wildcard is not expanded within quotes.
  4. perform alias expansion
  5. parse the tokens into simple and compound commands (e.g. if, for, while, [[, case constructs)
  6. perform shell expansions (details at gnu.org, tldp.org) :
    1. brace expansion : {}
    2. tilde expansion : ~
    3. variables expansion : $myVariable is substituted with its value
    4. commands substitution :
      • $(command) (or `command`) is substituted with its output
      • command is executed in a subshell
      • details at : gnu.org, tldp.org
    5. arithmetic expansion : $((arithmeticExpression)) is substituted with its result
    6. process substitution
    7. word split : split the result of previous expansions by SPACE (actually, each character of $IFS is a delimiter).
    8. file name expansion : if any word contains *, ? or [, it is considered as a pattern and replaced with an alphabetically sorted list of file names matching the pattern (if any. Otherwise, the word is left as-is)
  7. redirections —if any
  8. execute the command
  9. wait for the command to complete and collect its exit status
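
The order of the expansions has visible consequences. For instance, brace expansion comes before variable expansion, so a variable can not be used as a bound of a {start..stop} sequence. A minimal sketch (variable names are illustrative) :

```bash
#!/usr/bin/env bash
n=3

# brace expansion runs first and sees {1..$n}, which is not a valid sequence :
# it is left untouched, then $n is expanded
literal=$(echo {1..$n})
echo "$literal"

# one possible workaround : a C-style for loop
sequence=''
for (( i = 1; i <= n; i++ )); do
    sequence+="$i "
done
sequence=${sequence% }    # remove the trailing space
echo "$sequence"
```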

Examples :

Quoting shell variables is mandatory to protect scripts from unexpected effects should unquoted variables contain SPACE characters (or any other character considered by the shell as a word separator, such as TAB or NEWLINE. Read more about IFS). Indeed, the SPACE character is one of the common argument separators for the shell. Here come the quotes (but they must be set wisely).
Let's play !

The naive stage

fileName='my file'; touch $fileName; ls -1
file
my
rm $fileName; ls -1
(both gone)
touch "$fileName"; ls -1
my file
rm "$fileName"; ls -1
(gone)
touch ${fileName}; ls -1
file
my
rm ${fileName}; ls -1
(both gone)
touch ${fileName}1 ${fileName}2; ls -1
file1
file2
my
ls -1 $fileName*
file1
file2
my
ls -1 "$fileName*"
ls: cannot access my file*: No such file or directory
rm ${fileName}1 ${fileName}2; ls -1
(all gone)
touch "${fileName}1" "${fileName}2"; ls -1
my file1
my file2
ls -1 $fileName*
no double quotes
ls: cannot access my: No such file or directory
ls: cannot access file*: No such file or directory
ls -1 "$fileName*"
* inside double quotes
ls: cannot access my file*: No such file or directory
ls -1 "$fileName"*
* outside double quotes
my file1
my file2

Explanations

Considering we have run : fileName='my file'; touch "${fileName}1" "${fileName}2", (so we actually have files my file1 and my file2), let's list files :

ls $fileName*

  1. variable expansion : ls my file*
  2. word split (already split up) : ls my file*
  3. file name expansion : no file matches the pattern file*, so no change : ls my file*
  4. result

ls "$fileName*"

  1. word/operator breakup : $fileName is recognized as a variable and will be substituted
  2. variable expansion : ls "my file*"
  3. word split : because of the quotes, the [SPACE] between my and file can not be used to split : ls "my file*"
  4. file name expansion : because of the quotes again, the * is taken literally, nothing to expand. No file named exactly my file* (i.e. having a * in the file name) found, so no change : ls "my file*"
  5. result

ls "$fileName"*

  1. word/operator breakup : $fileName is recognized as a variable and will be substituted
  2. variable expansion : ls "my file"*
  3. word split : (as above) nothing to split : ls "my file"*
  4. file name expansion : files my file1 and my file2 are found and substituted : ls my file1 my file2
  5. result
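
Word splitting can also be observed without touching any file, by counting the arguments a command actually receives. A sketch (countArgs is a hypothetical helper) :

```bash
#!/usr/bin/env bash
# prints the number of arguments it was called with
countArgs() { echo "$#"; }

fileName='my file'
unquoted=$(countArgs $fileName)      # split on the SPACE : 2 words
quoted=$(countArgs "$fileName")      # protected by the quotes : 1 word

echo "unquoted : $unquoted argument(s)"
echo "quoted   : $quoted argument(s)"
```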

Regular Expressions in shell context

Since I already dealt with that topic earlier and spread information in many places, here's a collection of hyperlinks to the corresponding articles / sections until I clean this up :

On UTF-8-capable systems (and generally speaking : with extended charsets), character lists such as a-z may include special characters like à, é or î. Thus, character ranges (a-z) can not be used in regular expressions to discriminate ASCII / non-ASCII characters. To do so, the solution is to build a complete list of all the characters to match : abcdefghijklmnopqrstuvwxyz (source).
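
The explicit list can be used like this. It is only a sketch : the function name isAsciiLowercase is made up, and it uses a Bash [[ ]] pattern match rather than a regular expression tool.

```bash
#!/usr/bin/env bash
# succeeds only if $1 is a single character from the explicit list
isAsciiLowercase() {
    [[ $1 == [abcdefghijklmnopqrstuvwxyz] ]]
}

for char in e é 1; do
    if isAsciiLowercase "$char"; then
        echo "$char : ASCII lowercase letter"
    else
        echo "$char : something else"
    fi
done
```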


Unset variables in Bash / Ksh

For any further reference, here's what happens during an if statement in Bash and in Ksh while the tested variable is actually unset :
Shell Output
if [ ${undefinedVariable} -eq 0 ]; then echo 'this is the "THEN" part'; else echo 'this is the "ELSE" part'; fi
Bash
-bash: [: -eq: unary operator expected
this is the "ELSE" part
Ksh
ksh: [: argument expected
this is the "ELSE" part
if [ ${undefinedVariable} -ne 0 ]; then echo 'this is the "THEN" part'; else echo 'this is the "ELSE" part'; fi
Bash
-bash: [: -ne: unary operator expected
this is the "ELSE" part
Ksh
ksh: [: argument expected
this is the "ELSE" part
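
Both shells complain because the unquoted ${undefinedVariable} expands to nothing, so the test becomes [ -eq 0 ]. Here is a sketch of two common guards (treating an unset variable as 0 is an assumption, pick whatever default suits your script) :

```bash
#!/usr/bin/env bash
unset -v undefinedVariable

# guard 1 : quote the expansion and give it a default value
if [ "${undefinedVariable:-0}" -eq 0 ]; then
    branch='THEN'
else
    branch='ELSE'
fi
echo "this is the \"$branch\" part"

# guard 2 : test explicitly whether the variable is set
if [ -z "${undefinedVariable+x}" ]; then
    state='unset'
else
    state='set'
fi
echo "undefinedVariable is $state"
```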

We note that :


Here Documents : <<EOF

Send several commands via SSH :

ssh bob@sshServer << EOC
command1
command2
EOC

Send several lines of text to a command via a | :

cat << EOL | grep base
Roses are #ff0000
Violets are #0000ff
All my base are belong to you
EOL

Create a new file :

Step-by-step version :

  1. cat << EOF > newFile.txt
  2. some text, line 1
  3. some text, line 2
  4. some text, line n
  5. EOF

Big-bang version :

cat << EOF > newFile.txt
some text, line 1
some text, line 2
some text, line n
EOF
  • Data is written to newFile.txt
  • EOF is the "end of file" tag (aka stop token). Any string can be used instead.
cat << EOF > newFile.txt
line 1, not indented
    line 2, space-indented
	line 3, TAB-indented
EOF
cat newFile.txt; rm newFile.txt
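  • Indentation is kept as-is in the output, with spaces as well as with TAB (when typing this at an interactive prompt, TAB triggers completion instead of being inserted).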
  • The stop token must have no leading whitespace.

Stop token hacks

Single-quoted stop token (source)

This disables :
  • variable expansion (changing $USER into stuart)
  • command substitution (changing $(command) into the result of executing command)
  • arithmetic expansion (changing $((1+1)) into 2)
cat << EOF
current user : $USER
today : $(date +"%a %b %d")
2 apples + 1 banana is $((2+1)) fruits
EOF
current user : stuart
today : Fri Nov 17
2 apples + 1 banana is 3 fruits
cat << 'EOF'
simple quotes : $USER
today : $(date +"%a %b %d")
2 apples + 1 banana is $((2+1)) fruits
EOF
simple quotes : $USER
today : $(date +"%a %b %d")
2 apples + 1 banana is $((2+1)) fruits

Dash (-) -prefixed stop token (source) :

This is a cosmetic hack improving readability of scripts since it allows indenting the heredocs too. The - in the stop token suppresses leading tabs in the output.

  • this has no effect on lines indented with spaces, including the line of the stop token itself which must be TAB-indented. If space-indented, the stop token line becomes invisible which causes an error :
    ./myScript.sh: line 63: warning: here-document at line 23 delimited by end-of-file (wanted `EOF')
    ./myScript.sh: line 64: syntax error: unexpected end of file
  • there must be no space between << and -
  • when pasting the code below into a terminal running GNU screen, the TAB characters are stripped out and both versions seem to produce the same output
cat << EOF
indent : none
  indent : SPACE * 2
	indent : TAB * 1

EOF

cat <<-EOF
indent : none
  indent : SPACE * 2
	indent : TAB * 1

EOF
indent : none
  indent : SPACE * 2
	indent : TAB * 1

indent : none
  indent : SPACE * 2
indent : TAB * 1

What if I want to output some special characters without disabling variable expansion ?

value=42; exampleFile='/tmp/myExampleFile.txt'; cat << EOF > "$exampleFile"
if (\$something > $value)
    blah

if (\$anything < ($value/2))
    pooh
else
    nomnom

if (true || false || whatever)
    who_cares
EOF
cat "$exampleFile"; rm "$exampleFile"
outputs :
if ($something > 42)
    blah

if ($anything < (42/2))
    pooh
else
    nomnom

if (true || false || whatever)
    who_cares

How to pipe a multiline heredoc into a command ?

You can do things like :
cat << EOSQL | sqlplus -s / as sysdba | grep -Ev '^$'
SELECT DISTINCT(TRUNC(last_refresh)) FROM dba_snapshot_refresh_times;
query1;
query2;
EOSQL
Or even :
echo -e "query1;\nquery2;" | sqlplus -s / as sysdba | grep -Ev '^$'
But it is simpler to do :
sqlplus -s / as sysdba << EOSQL | grep -Ev '^$'
query1;
query2;
EOSQL
Remember :
  • cat << EOF is fine when redirecting into a file
  • when redirecting to a command, command << EOF looks more appropriate (see useless use of cat)

How to remove the header line from a command output ?

Let's imagine a command (such as a DB query) that outputs something like :
HEADER
data line 1
data line 2
data line 3
You can retrieve all lines except the header ...
... with grep :
echo -e "HEADER\ndata line 1\ndata line 2\ndata line 3" | grep -v 'HEADER'
This requires knowing how to match the header line, i.e. knowing HEADER.
... with sed :
  • echo -e "HEADER\ndata line 1\ndata line 2\ndata line 3" | sed -n '2,$ p'
  • or even simpler : echo -e "HEADER\ndata line 1\ndata line 2\ndata line 3" | sed 1d
... with tail :
echo -e "HEADER\ndata line 1\ndata line 2\ndata line 3" | tail -n +2

Why does this {start..stop..step} output {start..stop..step} instead of a sequence of numbers ?

The "step" feature of the brace expansion is a new feature of Bash 4 (source). To get the Bash version, run bash --version or echo "$BASH_VERSION".
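
A script can also check the major version by itself before relying on this feature. A sketch, assuming seq is available as a fallback on older versions :

```bash
#!/usr/bin/env bash
# BASH_VERSINFO[0] holds the major version number of the running Bash
if (( BASH_VERSINFO[0] >= 4 )); then
    steps=$(echo {0..10..5})
else
    # the unquoted $(seq ...) joins the lines with spaces
    steps=$(echo $(seq 0 5 10))
fi
echo "$steps"
```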

How to display the n leading / trailing characters from each line of a file ?

Leading characters :

Given the data file :
for i in {1..1000}; do echo $RANDOM >> data.txt; done
sed can do it :
sed -r 's/(^.{3}).*$/\1/g' data.txt
But it's overkill as cut can do it way easier :
cut -c -3 data.txt

Trailing characters :

No such option in cut, so let's use sed :
echo hello | sed -r 's/.*(.{3})$/\1/g'
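
Bash itself can do it too, with a substring expansion and a negative offset. This is only a sketch, and lastChars is a hypothetical helper name :

```bash
#!/usr/bin/env bash
# prints the last $1 characters of each line read on standard input
lastChars() {
    local n=$1 line
    while IFS= read -r line; do
        # the space before the minus sign is required
        echo "${line: -n}"
    done
}

echo hello | lastChars 3
```

Mind the space in ${line: -n} : without it, Bash would read the :- "use default value" expansion instead.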

Job control

Job control is nothing but the ability to stop / suspend / resume the execution of processes. A jobId is displayed when starting a process in the background :
user@host $ emacs & vlc &
[1] 10367
[2] 10368
Here, emacs is the 1st command we've launched, 1 is its jobId and 10367 is its PID.

List the current jobs :

jobs
[1]-	Running	emacs &
[2]+	Running	vlc &
jobs -l
[1]-	10367	Running	emacs &
[2]+	10368	Running	vlc &
Fields of the output :
  • [n] : job ID
    • to be used with fg, bg, wait, kill, ...
    • the job ID must be prefixed by a %
  • + or - :
    • + : current job
    • - : previous job
  • 10367 : PID (shown by jobs -l)
  • Job status :
    • Running : currently running (not stopped / suspended)
    • Stopped : job is suspended

The %jobId syntax used to refer to a job is also known as jobspec.

Suspend a running job :

Resume a suspended job :

  • In the foreground : fg %2
  • In the background : bg %5
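
Jobspecs also work with kill and wait. The sketch below assumes it runs in a fresh Bash with no other background job, so that the sleep command gets the job ID 1 :

```bash
#!/usr/bin/env bash
sleep 30 &          # becomes job %1
jobPid=$!

kill %1             # sends SIGTERM to the job, same as : kill "$jobPid"

status=0
wait %1 || status=$?
echo "job with PID $jobPid ended with status $status"
```

The status should be 143, i.e. 128 + 15 (SIGTERM).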

Curly brackets & shell Brace Expansion

Syntax Description Example
{value1,value2,value_n}
String generation
Generate as many strings as the number of parameters, including a prefix and/or a suffix
echo foo{1,2,3}bar
foo1bar foo2bar foo3bar
  • {start..stop}
  • {start..stop..step}
Generate a string for each parameter from the specified interval, including a prefix and/or a suffix
  • the step parameter appears with Bash 4 (details)
  • this construct should be preferred to seq because it won't start subprocesses
  • to pass start, stop and step as variables, see this example
echo test_{1..2}{a..b}_
test_1a_ test_1b_ test_2a_ test_2b_
echo {a..z..7}
a h o v
more examples
${parameter:-default}
Use default value
If parameter is unset or null, default (which may be an expansion) is substituted. Otherwise, the value of parameter is substituted.
This can be used to make some function parameters optional (by giving them a default value when omitted) :
myFunction() {
	local myVariable=$1
	local myOtherVariable=${2:-42}
	
	}
key='value'; echo ${key:-'nothing'}; unset key; echo ${key:-'nothing'}
value
nothing
key='value'; default=42; echo ${key:-$default}; unset key; echo ${key:-$default}
value
42
${parameter:=default}
Assign default value
If parameter is unset or null, default (which may be an expansion) is assigned to parameter. The value of parameter is then substituted.
  • if parameter is set : do nothing
  • otherwise (i.e. parameter is unset or null) : assign default to parameter (i.e. alters parameter)
key='value'; result=${key:='nothing'}; echo $key; unset key; result=${key:='nothing'}; echo $key
value
nothing
${parameter:+value}
Use value if parameter exists
If parameter exists, substitute (e.g. return) value (which may be an expansion).
Otherwise (parameter is null or unset), return nothing.
key='value'; echo ${key:+'key exists'}; unset key; echo ${key:+'key exists'}
key exists
(empty line)
${parameter?message}
Return parameter, or display message if unset
value=42; echo ${value?this variable is unset.}; unset value; echo ${value?this variable is unset.}
42
bash: value: this variable is unset.
${parameter:offset:length}
Substring Expansion
expands to up to length characters of parameter starting at the character specified by offset (0-indexed)
myString='0123456789'; echo ${myString:4:3}
456
if :length is omitted, go all the way to the end
myString='0123456789'; echo ${myString:4};
456789
if offset is negative (use parentheses!), the starting point is determined by counting backward from the end of parameter
myString='0123456789'; echo ${myString:(-4):3}
678
if length is negative (no parentheses required), length characters are trimmed from the end of parameter
myString='0123456789'; echo ${myString:5:-2}
567
The last character of a string : ${parameter:(-1):1}
${#myString}
String length
length of the string myString
Some documents / websites may state that ${#myArray} represents the number of items in the array myArray. This is wrong / obsolete (maybe it worked, with warnings, in previous Bash versions) : ${#myArray} is the length of the first item, and the number of items is ${#myArray[@]}.
(details)
Looks like this works fine with multi-byte characters
myString='foo bar'; echo ${#myString}
7
a=$(echo -e '\u2639\u263a'); echo -e "$a\t${#a}"
☹☺	2
${parameter#pattern}
Remove pattern from the beginning of parameter
The pattern is matched against the beginning of parameter. The result is the expanded value of parameter with the shortest match deleted.
This can be used to retrieve the extension of a file name.
myString='abcdef'; echo ${myString#abc}; echo ${myString#[abc]}; myString='aaabbbccc'; echo ${myString#a*b}
def
bcdef
bbccc
name=file.txt; echo ${name#*.}
txt
${parameter##pattern}
As above, but the longest match is deleted.
This can be used to retrieve the current directory name.
myString='aaabbbccc'; echo ${myString##a*b}
ccc
cd /var/log && echo ${PWD##*/}
log
${parameter%pattern}
Remove pattern from the end of parameter
The pattern is matched against the end of parameter. The result is the expanded value of parameter with the shortest match deleted.
This can be used to :
  • retrieve a file name without extension. If the extension is known, basename can do it very easily :
    name=file.txt; echo ${name%.*}
    file
  • retrieve the directory name given an absolute or relative file name :
    for fileName in /absolute/path/to/file relative/path/to/file fileWithoutPath; do echo ${fileName%/*}; done
    /absolute/path/to
    relative/path/to
    fileWithoutPath		ooops : this is not a directory
    Workaround :
    for fileName in /absolute/path/to/file relative/path/to/file fileWithoutPath; do [[ "$fileName" =~ / ]] || fileName="./$fileName"; echo ${fileName%/*}; done
    /absolute/path/to
    relative/path/to
    .			at least we get a directory
  • build a logfile name based on the script name :
    script='/path/to/my script.sh'; logFile=$(basename "$script"); logFile=${logFile%.*}'.log'; echo "$logFile"
    my script.log
myString='aaabbbccc'; echo ${myString%a*b}; echo ${myString%b*c}
aaabbbccc
aaabb
${parameter%%pattern}
As above, but the longest match is deleted.
myString='aaabbbccc'; echo ${myString%%b*c}
aaa
myString='abcdef-123'; echo ${myString%%-+([0-9])}
abcdef
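The four operators #, ##, % and %% can be combined to split a path into its parts. A minimal sketch, with a made-up path and variable names :

```shell
#!/usr/bin/env bash
# Hypothetical path, only for this illustration
filePath='/var/backups/site.2024.tar.gz'

fileName=${filePath##*/}    # longest  match of */ removed from the beginning : site.2024.tar.gz
dirName=${filePath%/*}      # shortest match of /* removed from the end       : /var/backups
shortExt=${fileName##*.}    # longest  match of *. removed from the beginning : gz
longExt=${fileName#*.}      # shortest match of *. removed from the beginning : 2024.tar.gz
baseName=${fileName%%.*}    # longest  match of .* removed from the end       : site
noLastExt=${fileName%.*}    # shortest match of .* removed from the end       : site.2024.tar

printf '%s\n' "$dirName" "$fileName" "$baseName" "$shortExt" "$longExt" "$noLastExt"
```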
${parameter/search/replace}
In parameter, replace the 1st match of search with replace.
myString='abcd abcd'; echo ${myString/cd/CD}
abCD abcd
${parameter//search/replace}
As above, but every match of search is replaced.
myString='abcd abcd'; echo ${myString//cd/CD}
abCD abCD
  1. ${parameter^}
  2. ${parameter^^}
  3. ${parameter,}
  4. ${parameter,,}
Return parameter with :
  1. 1st character uppercase
  2. all characters uppercase
  3. 1st character lowercase
  4. all characters lowercase
(source)
myString='hello, world'; echo ${myString^}; echo ${myString^^}; myString=${myString^^}; echo ${myString,}; echo ${myString,,}
Hello, world
HELLO, WORLD
hELLO, WORLD
hello, world

Output a sequence of characters or numbers (details):

ascending numbers :
echo {2..8} : 2 3 4 5 6 7 8
reverse order letters :
echo {z..a} : z y x w v u t s r q p o n m l k j i h g f e d c b a
descending numbers with leading zeros (source) :
echo {100..00..10} : 100 090 080 070 060 050 040 030 020 010 000

Shell exit codes

Code Meaning
0 success
(aka "UNIX_SUCCESS" in my scripts)
1 catchall for general errors
(aka "UNIX_FAILURE" in my scripts)
2 misuse of shell builtins
124 specific case with timeout
126 command invoked cannot execute
127 command not found
128 invalid argument to exit
128 + n fatal error signal n
130 script terminated by CTRL-c
255* exit status out of range
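Some of these codes can be reproduced with a short sketch. showStatus and noSuchCommand_xyz are hypothetical names, and the temporary file only serves as "a file that exists but is not executable" :

```shell
#!/usr/bin/env bash
# Run a command line in a child Bash and print its exit code.
# "showStatus" is a hypothetical helper name.
showStatus() {
    local status=0
    bash -c "$1" 2>/dev/null || status=$?
    echo "$status"
}

notExecutable=$(mktemp)            # regular file without the execute permission

showStatus 'true'                  # 0   : success
showStatus 'exit 3'                # 3   : code chosen by the command itself
showStatus "$notExecutable"        # 126 : command found but cannot be executed
showStatus 'noSuchCommand_xyz'     # 127 : command not found
showStatus 'kill -TERM $$'         # 143 : 128 + 15 (killed by SIGTERM)
showStatus 'exit 256'              # 0   : out of range, only the low 8 bits are kept

rm -f "$notExecutable"
```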

Exit codes over 255

In some special cases (such as programs launching shell commands as a child process), the exit code may be reported on 2 bytes :
Bits Meaning
15-8 shell command (child process) exit code
7 =1 if a core dump was produced
6-0 signal number that killed the process
  • So, given the 32512 exit code, which gives 0111 1111 0000 0000 in binary, we can deduce that the shell exit code was 127.
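  • Faster solution : shell exit code = value / 256 (integer division, i.e. a right shift by 8 bits). Modulo 255 gives the same result as long as the low byte is 0 and the shell exit code is below 255 :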
    • echo $((32512%255))
      127
    • with bc :
      32512%255
      127
      The modulo operator % fails when bc is started with its -l flag (source). Workaround : update aliases or start bc with /usr/bin/bc
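The three fields of the table above can also be extracted with Bash arithmetic (shifts and masks). A minimal sketch, where decodeStatus is a hypothetical helper name :

```shell
#!/usr/bin/env bash
# Split a 16-bit status into its three fields.
# "decodeStatus" is a hypothetical helper name.
decodeStatus() {
    local status=$1
    echo "exit code   : $(( (status >> 8) & 255 ))"    # bits 15-8
    echo "core dumped : $(( (status >> 7) & 1 ))"      # bit 7
    echo "signal      : $(( status & 127 ))"           # bits 6-0
}

decodeStatus 32512    # exit code 127, no core dump, no signal
decodeStatus 9        # exit code 0, no core dump, killed by signal 9
```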

How to amend and replay the previous command ?

With !! :

!! replays the previous command :
$ apt-get install package
$ sudo !!
To amend and replay a previous command :
!!:s/wrong/right

With ^wrong^right^ :

  1. ls /rome
    ls: cannot access '/rome': No such file or directory
    Ooops
  2. Fix it with the ^wrongString^rightString^ syntax :
    ^rome^home^
    This performs string substitution, then displays and executes the amended command :
    ls /home
    lost+found	bob	kevin	stuart

Control Bash history with HISTCONTROL

HISTCONTROL=ignoredups
Log consecutive repeated commands only once
HISTCONTROL=ignorespace
Don't log commands prefixed with a space
HISTCONTROL=ignoreboth
Equivalent to HISTCONTROL=ignoredups:ignorespace

The HISTCONTROL value is a colon-separated (:) list.


How to customize the shell prompt ?

Foreword :

To proceed :
  1. open ~/.bashrc in your favorite text editor
  2. make changes, save and exit
  3. load changes :
    source ~/.bashrc

A basic colorless prompt : stuart@myWorkstation:~[0]$ :

export PS1='\u@\h:\w[\j]\$ '
Flag Usage
\u userName
\h hostname (up to first .)
\w current working directory
\j number of jobs currently managed by the shell

Add colors + a green / red square indicating the status of the previous command (more about colors) : stuart@myWorkstation [0] ~$ :

NOCOLOR="\[\e[0m\]"
RED="\[\e[1;31m\]"
GREEN="\[\e[0;32m\]"
YELLOW="\[\e[1;33m\]"
BLUE="\[\e[1;34m\]"
SQUARE="\342\226\210"

export PS1="\`if [ \$? = 0 ]; then echo '${GREEN}'; else echo '${RED}'; fi\`$SQUARE $RED\u$NOCOLOR@$BLUE\h $NOCOLOR[$YELLOW\j$NOCOLOR] \w\$ "

Improvement : also display the current Git branch (if any) : stuart@myWorkstation [0][master] ~$ :

NOCOLOR="\[\e[0m\]"
RED="\[\e[1;31m\]"
GREEN="\[\e[0;32m\]"
YELLOW="\[\e[1;33m\]"
BLUE="\[\e[1;34m\]"
SQUARE="\342\226\210"

export PS1="\`if [ \$? = 0 ]; then echo '${GREEN}'; else echo '${RED}'; fi\`$SQUARE $RED\u$NOCOLOR@$BLUE\h $NOCOLOR[$YELLOW\j$NOCOLOR]\`git branch &>/dev/null; if [ \$? = 0 ]; then git branch 2>/dev/null | awk '/^\*/ {print \"[$BLUE\"\$2\"$NOCOLOR]\"}'; fi\` \w\$ "
  • the final \$ is an actual $ displayed as part of the prompt
  • to be dynamic, $PS1 must embed code, so that this code is executed again each time the prompt is displayed. Variables expanded when $PS1 is assigned actually are constants.
  • I've not been able (so far) to use the $(...) construct rather than `...` for command substitution
  • This runs git branch twice, which I don't like. See below for a fix.

Improvement of this improvement (still gives stuart@myWorkstation [0][master] ~$ ) :

NOCOLOR="\[\e[0m\]"
RED="\[\e[1;31m\]"
GREEN="\[\e[0;32m\]"
YELLOW="\[\e[1;33m\]"
BLUE="\[\e[1;34m\]"
SQUARE="\342\226\210"

PROMPT_COMMAND='[ $? = 0 ] && squareColor="$GREEN" || squareColor="$RED"; \
currentGitBranch=$(git branch 2>/dev/null | awk '"'"'/^\*/ {print $2}'"'"'); \
[ -n "$currentGitBranch" ] && displayBranch="[$BLUE$currentGitBranch$NOCOLOR]" || displayBranch=''; \
PS1="$squareColor$SQUARE $RED\u$NOCOLOR@$BLUE\h $NOCOLOR[$YELLOW\j$NOCOLOR]$displayBranch \w\$ "'

Fix of the improvement of this improvement : (virtualenv) stuart@myWorkstation [0][master] ~$ :

NOCOLOR="\[\e[0m\]"
RED="\[\e[1;31m\]"
GREEN="\[\e[0;32m\]"
brightGreen="\[\e[1;32m\]"
YELLOW="\[\e[1;33m\]"
BLUE="\[\e[1;34m\]"
lightPurple="\[\e[1;35m\]"
SQUARE="\342\226\210"

PROMPT_COMMAND='[ $? = 0 ] && squareColor="$GREEN" || squareColor="$RED"; \
[ -z "$VIRTUAL_ENV" ] && displayVirtualEnv='' || displayVirtualEnv="($brightGreen$(basename "$VIRTUAL_ENV")$NOCOLOR) "; \
currentGitBranch=$(git branch 2>/dev/null | awk '"'"'/^\*/ {print $2}'"'"'); \
[ -n "$currentGitBranch" ] && displayBranch="[$lightPurple$currentGitBranch$NOCOLOR]" || displayBranch=''; \
PS1="$displayVirtualEnv$squareColor$SQUARE $RED\u$NOCOLOR@$BLUE\h $NOCOLOR[$YELLOW\j$NOCOLOR]$displayBranch \w\$ "'
$VIRTUAL_ENV is defined in virtualenvironmentBaseDir/bin/activate
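The same logic can also be written as a shell function assigned to PROMPT_COMMAND, which avoids the nested quotes. This is only a sketch of an alternative, not a tested replacement for the definitions above : buildPrompt is a hypothetical name, and ${VIRTUAL_ENV##*/} is used here in place of basename.

```shell
#!/usr/bin/env bash
NOCOLOR='\[\e[0m\]'
RED='\[\e[1;31m\]'
GREEN='\[\e[0;32m\]'
brightGreen='\[\e[1;32m\]'
YELLOW='\[\e[1;33m\]'
BLUE='\[\e[1;34m\]'
lightPurple='\[\e[1;35m\]'
SQUARE='\342\226\210'

# "buildPrompt" is a hypothetical name. Bash runs it before displaying each prompt.
buildPrompt() {
    local previousStatus=$?    # read first, before any other command overwrites $?
    local squareColor displayVirtualEnv='' displayBranch='' currentGitBranch

    if [ "$previousStatus" -eq 0 ]; then squareColor=$GREEN; else squareColor=$RED; fi

    # name of the active Python virtual environment, if any
    if [ -n "${VIRTUAL_ENV:-}" ]; then
        displayVirtualEnv="($brightGreen${VIRTUAL_ENV##*/}$NOCOLOR) "
    fi

    # current Git branch, if any (git branch runs only once)
    currentGitBranch=$(git branch 2>/dev/null | awk '/^\*/ {print $2}')
    if [ -n "$currentGitBranch" ]; then
        displayBranch="[$lightPurple$currentGitBranch$NOCOLOR]"
    fi

    PS1="$displayVirtualEnv$squareColor$SQUARE $RED\u$NOCOLOR@$BLUE\h $NOCOLOR[$YELLOW\j$NOCOLOR]$displayBranch \w\$ "
}

PROMPT_COMMAND=buildPrompt
```

To try it, paste these lines into ~/.bashrc and reload it with source ~/.bashrc.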

How to display colors in Bash ?

Displaying colors is a shared responsibility between the shell and the terminal running it. Colors and decorations (bold, italic, background color, ...) are specified with ANSI escape sequences.

List of color codes (full list with details and examples) :

color codes are (mostly) like : 0;xx = dark, 1;xx = light
Black       0;30		White         1;37
Dark Gray   1;30		Light Gray    0;37
Red         0;31		Light Red     1;31
Green       0;32		Light Green   1;32
Brown       0;33		Yellow        1;33
Blue        0;34		Light Blue    1;34
Purple      0;35		Light Purple  1;35
Cyan        0;36		Light Cyan    1;36

Demo : display all available colors :

for lightOrDark in 0 1; do for colorCode in {30..37}; do echo -e "\e[${lightOrDark};${colorCode}m${lightOrDark};${colorCode}\e[0m"; done; done
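In scripts, it is convenient to wrap the escape sequences in a small function that also resets the color (code 0) after the text. A minimal sketch, where colorize is a hypothetical helper name :

```shell
#!/usr/bin/env bash
# "colorize" is a hypothetical helper : colorize <colorCode> <text>
colorize() {
    # \e[<code>m switches the color on, \e[0m resets it
    printf '\e[%sm%s\e[0m\n' "$1" "$2"
}

colorize '1;31' 'error : something went wrong'
colorize '0;32' 'success'
```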

Bash wildcards and patterns

Usage

Bash has a built-in pattern matching parser (which must not be mistaken for actual regular expressions). Wildcards can be used with any command that accepts file names as arguments (i.e. almost anything).

When receiving a command :
  1. Bash searches for wildcard substitutions to perform on file names : are there any existing files matching the wildcards ?
    • yes : perform substitutions
    • no : leave the command as-is
  2. execute the command obtained from the step above
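The two cases of step 1 can be observed in an empty temporary directory. A minimal sketch, assuming the default Bash options (neither nullglob nor failglob is set) :

```shell
#!/usr/bin/env bash
workDir=$(mktemp -d)
cd "$workDir" || exit 1

withoutMatch=$(echo *.log)    # no file matches : the pattern is left as-is
touch a.log b.log
withMatch=$(echo *.log)       # files match : the pattern is replaced by their names

echo "before : $withoutMatch"    # before : *.log
echo "after  : $withMatch"       # after  : a.log b.log

cd / && rm -r "$workDir"
```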

Wildcards / patterns :

The period . is NOT a wildcard in Bash.
Wildcard replaced by
~ the path to the home directory of the current user
Within quotes, the ~ character is NOT substituted. Check :
echo ~ '~' "~"
/home/bob ~ ~
use $HOME instead
? any single character
* any sequence of characters (including the empty string)
[abcde] exactly one character from the list : abcde (see examples below)
[a-e] exactly one character from the range : "a" to "e"
[!abcde] any single character that is not listed : abcde
[!a-e] any single character that is not within the range : "a" to "e"
{foo,bar} exactly one entire word among the options given
`myCommand` anything between the backticks is executed as a command and replaced by its output
Even though this works fine, it is now old-fashioned, and the $(myCommand) construct should be preferred.

Some more "exotic" things (source) :

Since I'm more "fluent" in regular expressions than Bash wildcards, I'll explain with regular expressions
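Those require extglob to be enabled : shopt -s extglob. It is not enabled by default in non-interactive shells (scripts), although interactive shells often have it turned on, e.g. by bash-completion (source)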
regular expression	match						corresponding Bash wildcard
x?			0 or 1 occurrence of x (i.e. optional x)	?(x)
x+			1 or more occurrences of x			+(x)
x*			0 or more occurrences of x			*(x)
[^x]			anything but x					!(x)
touch /tmp/myFile_{foo,bar,baz}

ls /tmp/myFile* | grep -E 'ba[rz]'; ls /tmp/myFile*[rz]
ls /tmp/myFile* | grep -E 'o+'; ls /tmp/myFile*+(o)

ls /tmp/myFile* | grep -E '[^o]$'; ls /tmp/myFile*!(o)		<-- ko :-(
ls /tmp/myFile* | grep -E '[^o].$'; ls -1 /tmp/myFile??!(o)?

rm /tmp/myFile_{foo,bar,baz}
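These patterns are not limited to file name expansion : they also work in [[ ... ]] tests. A minimal sketch reusing the file names above as plain strings (the variable names are made up) :

```shell
#!/usr/bin/env bash
shopt -s extglob    # enable the extended patterns before they are used

notFoo=''
oneOrMoreO=''
for fileName in myFile_foo myFile_bar myFile_baz; do
    # !(foo) : anything but foo
    if [[ $fileName == myFile_!(foo) ]]; then notFoo+="$fileName "; fi
    # f+(o) : f followed by 1 or more o
    if [[ $fileName == myFile_f+(o) ]]; then oneOrMoreO+="$fileName "; fi
done

echo "anything but foo : $notFoo"        # myFile_bar myFile_baz
echo "f, then 1+ o     : $oneOrMoreO"    # myFile_foo
```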

Example

Command with a wildcard Returns
ls *
echo *
all non-hidden files from the current directory
ls *n
echo *n
all non-hidden files whose name ends with n
ls *n* all non-hidden files whose name contains n, even as the last character
ls *.* all non-hidden files whose name contains a ., even as the last character, but NOT as the first character as this would be a hidden file
ls *foo* all non-hidden files whose name contains foo, or just displays ls: cannot access '*foo*': No such file or directory if no such file exists
echo *foo* all non-hidden files whose name contains foo, or just displays *foo* if no such file exists (no wildcard substitution made before "echoing")
ls ?[ae]* all non-hidden files whose name has either a or e as its 2nd character
touch foo[123]; ls foo[123] no matching file found, so no substitution is possible : Bash takes foo[123] literally, then creates and lists a file named foo[123]
echo BEFORE; touch foo1 foo2 foo3; ls; echo AFTER; rm foo[123]; ls
BEFORE
foo1  foo2  foo3
AFTER
(void)
files matching the wildcard exist, so Bash made the corresponding substitutions before running rm

(non-)?(login|interactive) shells

Different flavors of shell :

Shell type : login
Technical definition :
  • shell whose first character of argument zero is -
  • or one started with --login
Description : This is a shell that is opened after authenticating, either locally (e.g. at boot time) or via SSH : the prompt is displayed after a successful login (going through /bin/login). The login shell is the first process that executes under your user ID when you log in for an interactive session.
Use case :
  • virtual terminal (Ctrl-Alt-Fn)
  • SSH

Shell type : non-login
Description : The prompt is displayed without re-authenticating.
Use case : terminal emulator within a GUI :
  • gnome-terminal
  • xfce4-terminal

Shell type : interactive
Technical definition :
  • a shell started :
    • without non-option arguments (unless -s is specified)
    • and without the -c option
    • and whose standard input and error are both connected to terminals
  • or one started with -i
PS1 is set and $- includes i if bash is interactive, allowing a shell script or a startup file to test this state.
Description : The shell :
  • waits for keyboard input to execute commands.
  • can be either a login or non-login shell.
Use case :
  • virtual terminal (Ctrl-Alt-Fn)
  • SSH
  • emulator such as gnome-terminal, xfce4-terminal or xterm

Shell type : non-interactive
Description : A non-interactive shell is usually present when a shell script is running. It is non-interactive because it is processing a script and not waiting for user input between commands. For these shell invocations, only the environment inherited from the parent shell is used.
Use case :
  • scripts
  • daemons / "rc scripts"
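Both properties can be tested from within Bash : $- contains i in an interactive shell, and the read-only shell option login_shell is set in a login shell. A minimal sketch, where describeShell is a hypothetical helper name :

```shell
#!/usr/bin/env bash
# "describeShell" is a hypothetical helper name.
describeShell() {
    local interactivity loginState

    # $- lists the current shell flags : i means "interactive"
    case "$-" in
        *i*) interactivity='interactive' ;;
        *)   interactivity='non-interactive' ;;
    esac

    # login_shell is set by Bash itself when started as a login shell
    if shopt -q login_shell; then loginState='login'; else loginState='non-login'; fi

    echo "$interactivity $loginState"
}

describeShell
```

Run from a script, this should report a non-interactive non-login shell. Typed at the prompt of an SSH session, it should report an interactive login shell.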

And because an image is worth 1000 words (full-size version) :

Files read :

  • interactive login shell :
    1. /etc/profile
    2. the first file found (and only this one) among :
      • ~/.bash_profile
      • ~/.bash_login
      • ~/.profile
  • interactive non-login shell :
    1. /etc/bash.bashrc
    2. ~/.bashrc
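  • non-interactive login shell (e.g. bash --login myScript.sh) :
    1. same files as an interactive login shell
  • non-interactive non-login shell (scripts) :
    1. None by default : the environment is inherited from the parent process. If $BASH_ENV is set, the file it names is read.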

How to set environment variables for a specific daemon ?

This applies specifically to Shinken not being able to send requests outside of our network because of proxy settings : the variables set in the ~/.* files above simply had no effect
Finally, the quick and dirty solution was to add to /etc/default/shinken (at the top of the file, after the heading comments) :
export http_proxy="http://kevin:password@proxyHost:port/"
export https_proxy="http://kevin:password@proxyHost:port/"

Other version / doesn't agree (!)

Upon user login, the OS starts a shell. Bourne shells read commands from ~/.profile when invoked as the login shell. Bash (which is a Bourne shell) reads commands from ~/.bash_profile when invoked as the login shell. If ~/.bash_profile doesn't exist, it reads from ~/.profile instead.

A shell launched at any other time (terminal emulator within a GUI environment, or through an SSH connection) :

  • is NOT a login shell but an interactive shell
  • doesn't read ~/.profile or ~/.bash_profile
  • reads ~/.bashrc

Therefore :

  • ~/.profile is the place to put stuff that applies to your whole session, such as programs that you want to start when you log in (but not graphical programs, they go into a different file), and environment variable definitions.
  • ~/.bashrc is the place to put stuff that applies only to bash itself, such as alias and function definitions, shell options, and prompt settings. (You could also put key bindings there, but for bash they normally go into ~/.inputrc.)
  • ~/.bash_profile can be used instead of ~/.profile, but you also need to include ~/.bashrc if the shell is interactive. I recommend the following contents in ~/.bash_profile :
    if [ -r ~/.profile ]; then . ~/.profile; fi
    case "$-" in *i*) if [ -r ~/.bashrc ]; then . ~/.bashrc; fi;; esac

~/.bashrc is not executed when opening a new SSH session

When opening a new connection :
  1. If ~/.bash_profile exists, it's executed, then "STOP". If it doesn't exist, go on to the next step.
  2. if ~/.profile exists, it's executed, then "STOP". If it doesn't exist, go on to the next step.
  3. if ~/.bashrc exists, it's executed, then "STOP".

~/.bash_profile or ~/.profile must end with :

. ~/.bashrc


When Bash outputs : -bash: /bin/ls: Argument list too long

Situation

When a directory contains numerous files, trying to ls, cp, mv, rm these files leads to a Bash error : Argument list too long

Solution

To take action on these files with Bash, use a construct such as :

for i in *tmp; do command "$i"; done
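This works because the limit applies to the arguments passed to a single external command, whereas the list expanded by for stays inside the shell and each command only receives one file name. A minimal sketch on a temporary directory, with made-up file names and rm as the example command :

```shell
#!/usr/bin/env bash
workDir=$(mktemp -d)
cd "$workDir" || exit 1
touch 'report 1.tmp' 'report 2.tmp' 'keep.txt'

processed=0
for i in *tmp; do
    [ -e "$i" ] || continue    # the pattern stays literal when nothing matches
    rm -- "$i"                 # quoted : safe with spaces in file names
    processed=$((processed + 1))
done
echo "$processed file(s) processed"    # 2 file(s) processed

remaining=$(echo *)                    # keep.txt
cd / && rm -r "$workDir"
```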

Alternate solution