Awk - Description, flags and examples


next

immediately stop processing the current record and go on to the next one
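A minimal sketch : records matching the pattern are skipped, the others are printed :

```shell
# skip any record matching /b/, print the rest
echo -e 'a\nb\nc' | awk '/b/ {next} {print}'
# output :
# a
# c
```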

exit

stop reading input immediately ; the END rule, if any, is still executed
see examples : 1, 2
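A minimal sketch : processing stops at the first record matching the pattern :

```shell
# stop reading input at the first record matching /b/
echo -e 'a\nb\nc' | awk '/b/ {exit} {print}'
# output :
# a
```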

break

jump out of the innermost enclosing for, while or do-while loop
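A minimal sketch :

```shell
# leave the for loop as soon as i reaches 3
echo | awk '{ for (i=1; i<=5; i++) { if (i==3) break; print i } }'
# output :
# 1
# 2
```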

if else

checkPageExists() {
	local page=$1
	curl -sI "$page" | awk '/^HTTP\/1\.1/ {
		if ($2=="200")
			exit 0
		else
			exit 1
		}'
	}

main() {
	
	for page in $pageList; do
		checkPageExists "$page" || continue
		
	done
	}

Awk examples

An example is worth a thousand words, so enjoy your reading !

Tutorials & basic examples :

List "enabled" repositories :

The URL used below lists repositories such as :
[docker-ce-stable]
name=Docker CE Stable - $basearch
baseurl=https://download.docker.com/linux/centos/$releasever/$basearch/stable
enabled=1
gpgcheck=1
gpgkey=https://download.docker.com/linux/centos/gpg

[docker-ce-stable-source]
name=Docker CE Stable - Sources
baseurl=https://download.docker.com/linux/centos/$releasever/source/stable
enabled=0
gpgcheck=1
gpgkey=https://download.docker.com/linux/centos/gpg

I want to list the IDs of those having enabled=1 :
curl -s https://download.docker.com/linux/centos/docker-ce.repo | awk '
	BEGIN		{ repoId=""; }
	/^ *$/		{ repoId=""; }
	/^\[.*\]/	{ repoId=$0; }
	/^enabled=1/	{ if (repoId!="") { result=gensub(/[\[\]]/, "", "g", repoId); print result; } }'
docker-ce-stable

Find which is the link and which is the target in a symlink :

The snippet below is just to play with Awk, since it can be replaced by the much more efficient readlink command.
while read line; do
	echo -e "\t$line"
	echo "$line" | awk '{ link=$1; target=""; for(i=2; i<=NF; i++) { if($i!="->") link=link" "$i; else break; } for(j=i+1; j<NF; j++) target=target""$j" "; target=target""$NF; print " LINK: \047"link"\047, TARGET: \047"target"\047"; }'
	echo
done < <(find -type l -exec ls -l {} + | awk '{ for (i=1; i<9; i++) $i=""; print $0; }')
Explanations :
find -type l -exec ls -l {} + | awk '{ for (i=1; i<9; i++) $i=""; print $0; }'
turn this :
lrwxrwxrwx 1 kevin developers 57 Apr 5 11:03 'path/to/link' -> 'path/to/target'
into this :
path/to/link -> path/to/target
i.e. remove the metadata shown by ls -l, which is made of the first 8 fields of the line.
\047
the octal escape code letting Awk's print display single quotes ' (see comments of this answer)


print, printf, sprintf

print :

  • with no argument, print the whole input line :
    echo -e 'line 1\nline 2\nline 3' | awk '{print}'
    line 1
    line 2
    line 3
  • with 1 argument, print it :
    echo -e 'line 1\nline 2\nline 3' | awk '{print $2}'
    1
    2
    3
  • with more than 1 argument :
    • when the arguments are separated by commas : print arguments separated by SPACE (default) or the specified OFS
    • when the arguments are separated by spaces : print arguments concatenated
    echo -e 'line 1\nline 2\nline 3' | awk '{print $2,$1}'; echo -e 'line 1\nline 2\nline 3' | awk '{print $2 $1}'
    1 line
    2 line
    3 line
    1line
    2line
    3line
  • Appends a newline \n to the output (printf doesn't) :
    echo | awk '{print "Hello world"}'; echo | awk '{printf "Hello world"}'

printf :

  • Doesn't add a trailing \n
  • Supports the C-style printf(string, expression list) syntax :
    echo | awk '{printf("%d is The Answer to The Great Question.", 42)}'

sprintf :

Returns, without printing it, what printf would have printed with the same arguments.
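For instance, to build a formatted string and reuse it :

```shell
# sprintf returns the formatted string instead of printing it
echo | awk '{ s = sprintf("%05.2f", 3.14159); print "[" s "]" }'
# output :
# [03.14]
```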

switch case

Here's a very basic example (switch is gawk-specific ; not-so-perfect, but you'll get the idea) :
echo 'abc' | awk '{
	switch ($0) {
		case /[[:lower:]]+/:
			print "lowercase"
			break
		case /[[:upper:]]+/:
			print "uppercase"
			break
		}
	}'
As a one-liner that can be pasted into the shell :
echo 'abc' | awk '{ switch ($0) { case /[[:lower:]]+/: print "lowercase"; break; case /[[:upper:]]+/: print "uppercase"; break; } }'

Make music with awk

Run this script :
#!/usr/bin/env bash

value1=120
# initial value : 160
# tested values : 200, 140, 100
#	decreasing from the initial value : more bass sounds

value2=0.5678
# initial value : 0.87055
# tested values : 0.747, 0.777, 0.789
#	increasing values above 3.xxx : extreme bass sounds (?), hardly audible
#	around 0.5xxxxx : nice chime sounds

value3=13
# initial value : 10
# tested values : 13, 17, 26
#	increasing values : more high-pitched sounds
#	26 makes some 'D2-R2' blips

value4=128	# no effect so far :-(
# initial value : 128

awk "function wl() {
		rate=64000;
		return (rate/$value1)*($value2^(int(rand()*$value3)))};
	BEGIN {
		srand();
		wla=wl();
		while(1) {
			wlb=wla;
			wla=wl();
			if (wla==wlb)
				{wla*=2;};
			d=(rand()*10+5)*rate/4;
			a=b=0; c=$value4;
			ca=40/wla; cb=20/wlb;
			de=rate/10; di=0;
			for (i=0;i<d;i++) {
				a++; b++; di++; c+=ca+cb;
				if (a>wla)
					{a=0; ca*=-1};
				if (b>wlb)
					{b=0; cb*=-1};
				if (di>de)
					{di=0; ca*=0.9; cb*=0.9};
				printf(\"%c\",c)};
			c=int(c);
			while(c!=$value4) {
				c<$value4?c++:c--;
				printf(\"%c\",c)};};}" | aplay -r 64000

Awk internal keywords

BEGIN
a BEGIN rule is executed once only, before the first input record is read (example)
BEGINFILE
see ENDFILE
END
an END rule is executed once only, after all the input is read (example)
ENDFILE
This is a gawk extension. The ENDFILE rule :
  • is called when gawk has finished processing the last record in an input file. For the last input file, it will be called before any END rules
  • is executed even for empty input files
  • makes it possible to catch errors

system

A basic example

echo -e 'line A\nline B\nline C\nD' | awk '/^line/ { system("echo "$NF) }'
  • this command is absolutely useless : I just needed a dummy working example
  • don't forget that Awk variables must stay outside of quotes

How to store the result of a system command into a variable ? (source)

There are several methods to run a shell command from Awk :

with system(myCommand)

myVariable=system(myCommand) stores the return code of myCommand into myVariable
  • awk 'BEGIN { result1=system("true"); result2=system("false"); print result1" "result2; }'
    0 1

with myCommand | getline

awk 'BEGIN {
	myCommand = "date --iso-8601=seconds"
	myCommand | getline myResultVariable
	close(myCommand)
	print "Current date = "myResultVariable
	}'

Mixing Awk and Bash tests :

[ -e someFile ] && rm someFile; for i in 1 2; do
	echo -e 'hello someFile world' | awk '{ print "\ninput : "$0; if(system("[ -f "$2" ]")) { print $2" exists" } }'
	ls someFile; touch someFile
done; rm someFile
  1. this begins by making sure that the file someFile does not exist
  2. then there's a for loop that runs twice
  3. each run echoes some text to awk via a pipe :
    1. display the input value as-is
    2. make a shell-based test using system
    3. display some text accordingly
  4. ls to confirm someFile exists or not, then it is touched
  5. loop
  6. remove someFile at the very end (Boy scout rule)
input : hello someFile world
someFile exists							mmmkay
ls: cannot access 'someFile': No such file or directory		make up your mind !

input : hello someFile world
someFile							
This is because Awk and Bash disagree on what makes a "success" :
	success		failure
Awk	anything else	0
Bash	0		anything else
  1. when the file exists, [ -f "$2" ] is a Bash success : 0
  2. system returns this code as-is
  3. but for Awk : if(system(Bash success)) is false
    echo | awk '{ if(system("true")) print "ok" }'
    (nothing)
    
    echo | awk '{ if(system("false")) print "ok" }'
    ok						
To make this work, the solution is to negate the result of the Bash test in the original command :
if(system("[ ! -f "$2" ]")) {
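Putting it together, a sketch of the fixed command :

```shell
touch someFile
# with the negated test, "exists" is printed exactly when the file does exist
echo 'hello someFile world' | awk '{ if (system("[ ! -f " $2 " ]")) print $2 " exists" }'
rm someFile
# output :
# someFile exists
```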

awk vs gawk

A standard Debian setup comes with /usr/bin/awk, a symlink managed by the alternatives system, which has basic / limited functionality. Once gawk is installed :
ls -l $(which awk)
lrwxrwxrwx 1 root root 21 Oct 11 2016 /usr/bin/awk -> /etc/alternatives/awk*
md5sum /etc/alternatives/awk $(which gawk)
23a5b5a3d9ba0d2c6277dbdaf2557033	/etc/alternatives/awk
23a5b5a3d9ba0d2c6277dbdaf2557033	/usr/bin/gawk

Once gawk is installed, it can be invoked with awk.


Awk internal variables

$n
the nth element of the current line ($0 being the whole line itself) :
for i in {1..4}; do echo 'a b c d' | awk '{print "Item '$i' of line \""$0"\" is "$'$i'"."}'; done
FILENAME
  • name of the current input file
  • - when reading from standard input
  • empty string inside a BEGIN rule
FS
Field Separator. Can be set with -F
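Both forms below are equivalent :

```shell
# -F on the command line, or assigning FS in a BEGIN rule
echo 'a:b:c' | awk -F ':' '{print $2}'
echo 'a:b:c' | awk 'BEGIN { FS=":" } { print $2 }'
# output (twice) :
# b
```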
NF
  • number of fields in the current line :
    echo 'a b c' | awk '{print NF}'; echo 'joe jack william averell' | awk '{print NF}';
    3
    4
  • It is often used to refer to the last field of a line :
    echo 'a b c' | awk '{print $NF}'; echo 'joe jack william averell' | awk '{print $NF}';
    c
    averell
NR
number of records processed so far (usually this is simply the number of the current line, starting at 1) :
for i in {a..e}; do echo $i; done | awk '{ print "line "NR":\t" $0}'
line 1: a
line 2: b
line 3: c
line 4: d
line 5: e
OFS
Output Field Separator. It is automatically inserted between fields by print. Defaults to a single space.
This is not a CLI flag, it goes into the "action" part :
  • echo {a..z} | awk '{OFS="."; print $1,$3,$5,$7}'
    a.c.e.g
  • No need to repeat the definition for every line of input :
    echo {a..z} | awk 'BEGIN{OFS="PLOP"} {print $1,$3,$5,$7}'
    aPLOPcPLOPePLOPg
RS
Records Separator
  • defaults to \n (NEWLINE) : by default Awk considers 1 record == 1 line of input
  • gawk also accepts a regular expression

Tailor files / strings / substrings with Awk / Bash / PERL / sed

awk

Extract specific fields from log files :

  • awk '$9 == "searchedKeyword" {print $7}' file.log | sort | uniq -c | sort -nr | head -n 10
  • awk '$6 ~ "30." {print $5" "$6}' file | ...

~ is the Awk operator to match a regular expression.

Bash (source)

Replace a substring :

  • ${string/substring/replacement} : replace 1st occurrence
  • ${string//substring/replacement} : replace all occurrences
  • myString='Hello World'; echo ${myString//[eo]/ab} : outputs Habllab Wabrld

Test whether a string matches a RegExp (source) :

testString='Hello World'; if [[ $testString =~ ^.*o.*o.*$ ]]; then echo "MATCHES"; else echo "DOESN'T MATCH"; fi

PERL

Apply a regExp to a string :

  • perl -e '$ARGV[0]=~ m/..(.)/; print $1' abcdef
  • echo AZERqsdfWXCV | xargs perl -e '$ARGV[0]=~ m/.{4}(.{4}).*(.)$/; print "$1 $2"'

sed

Extract (in CSV format) URL + hit/miss + generation time from a Varnish log :

sed -r 's/.*GET ([^ ]*).*(hit|miss) ([0-9.]*).*/\1;\2;\3/' access.log > result.log

Extract (in CSV format) URL + HTTP error code from Lighttpd log :

sed -r 's/^.*GET ([^ ]*).*HTTP\/1\.1" ([0-9]*).*$/\1;\2;/' /var/log/lighttpd/www.example.com.log > result.log

Same as above with HTTP 500 errors only + sorting results by descending number of occurrences :

logFile='/var/log/lighttpd/www.example.com.log'; resultFile='./result.csv'; tmpFile=$(mktemp --tmpdir tmp.result.XXXXXXXX); grep '" 500 ' $logFile | sed -r 's/^.*GET ([^ ]*).*HTTP\/1\.." ([0-9]*).*$/\1;\2;/' > $tmpFile; cat $tmpFile | sort | uniq -c | sort -nr > $resultFile; rm $tmpFile

Using grep 1st because sed can't find a match on every line, as we're reporting only on HTTP 500 errors.

Extract (in CSV format) several fields from Apache logs stored in a year/month/day directory tree :

resultFile='~/result.csv'; tmpFile=$(mktemp --tmpdir tmp.XXXXXXXX); csvHeader='web server;IP;HTTP method;URL used by method;full URL;'; echo $csvHeader > $tmpFile; logFilePath='/path/to/logfiles/'; startYear='2013'; endYear='2013'; startMonth='04'; endMonth='04'; startDay='01'; endDay='18'; for year in $(seq $startYear $endYear); do for month in $(seq $startMonth $endMonth); do for day in $(seq $startDay $endDay); do [ ${#month} -eq 1 ] && month='0'$month; [ ${#day} -eq 1 ] && day='0'$day; logFile=$logFilePath/$year/$month/$day/$year$month$day'-access.log'; echo "PROCESSING $logFile ..."; grep 'example.com' $logFile | grep -v 'GET' | sed -r 's/^.*(webServer(1|2)).* ([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+) .*\] "([A-Z]*) (.*) HTTP.*" [0-9]+ [0-9]+ "([^"]+)".*/;\1;\3;\4;\5;\6/' | sort | uniq >> $tmpFile; done; done; done; cat $tmpFile | sort | uniq -c | sort -nr >> $resultFile; rm $tmpFile

awk

Usage

Awk is a programmable text filter. Its input can be one or more files, or the standard input (typically fed via a pipe).

Output goes to the standard output.

An Awk script is made of 3 blocks :

  1. pre-process : BEGIN
  2. process
  3. post-process : END

Awk reads the input line by line, then applies the specified filter(s) to detect whether or not to process the current line. Before starting processing a line, Awk splits it into fields and stores fields values in $1 (1st field), $2, ..., $NF (last field). $0 is the whole input line. The fields separator (specified with FS) defaults either to [SPACE] or [TAB] (details).

There is no need to use grep together with Awk, since Awk itself can select the lines to process with a regular expression.
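For instance, both commands below give the same result, so the grep can be dropped :

```shell
echo -e 'foo\nbar\nbaz' | grep 'ba' | awk '{print $1}'
echo -e 'foo\nbar\nbaz' | awk '/ba/ {print $1}'
# output (twice) :
# bar
# baz
```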

Filters :

For each criterion below : how to select matching lines, then how to select non-matching lines.
line number within input
awk 'NR==n {doSomething}'
echo -e 'a\nb\nc\nd' | awk 'NR==3'
c
echo -e 'a\nb\nc\nd' | awk 'NR==3 {next}; {print}'
a
b
d
line vs regular expression
awk '/regEx/ {doSomething}'
  • echo -e 'foo\nbar\nbaz' | awk '/bar/ {print $0}'
    bar
  • echo -e 'foo\nPool ID : 1234\nbar\nID du pool : 4321\nbaz' | awk '/(Pool ID|ID du pool)/ {print $NF}'
    1234
    4321
awk '!/regEx/ {doSomething}'
echo -e 'foo\nbar\nbaz' | awk '!/a/ {print $0}'
foo
(source, example)
line vs number of fields echo -e 'field1\nfield1\tfield2\nfield1\tfield2\tfield3' | awk 'NF == 2 {print $0}'
field1	field2
field vs number echo -e 'foo\t12\nbar\t34\nbaz\t56' | awk '$2 > 25 {print $0}'
bar	34
baz	56
Awk is smart enough to strip leading zeroes :
echo {01..10} | awk '$3 > 2 { print "ok" }'
ok
echo {01..10} | awk '$3 > 3 { print "ok" }'
(void)
echo {01..10} | awk '$3 >= 3 { print "ok" }'
ok
echo {0001..10} | awk '$3 >= 3 { print "ok" }'
ok
Trying to filter data based on line numbers returned by grep -n with a construct like :
grep -n --color=always [options] | awk -F ':' '$n > x {doSomething}'
may fail because of the returned color codes.
  • echo -e 'FOO\nBAR\nBAZ' | grep -n --color=always '.A' | awk -F ':' '$1>2 {print $0}'
    (nothing : $1 starts with a color escape code, so the numeric comparison fails)
  • echo -e 'FOO\nBAR\nBAZ' | grep -n '.A' | awk -F ':' '$1>2 {print $0}'
    3:BAZ
field vs string
awk '$n == "value" {doSomething}'
for i in {1..3}; do echo "foo$i bar$i baz$i"; done | awk '$2 == "bar2" {print $0}'
foo2 bar2 baz2
awk '$n != "value" {doSomething}'
for i in {1..3}; do echo "foo$i bar$i baz$i"; done | awk '$2 != "bar2" {print $0}'
foo1 bar1 baz1
foo3 bar3 baz3
field vs regular expression
awk '$n ~ /regEx/ {doSomething}'
  • for i in {1..3}; do echo "foo$i bar$i baz$i"; done | awk '$2 ~ /a.1/ {print $0}'
    foo1 bar1 baz1
  • find the shortest path :
    echo -e "bla dir1/\nbla dir1/dir2/\nbla dir1/dir2/dir3/" | awk '$NF ~ /^[^/]*\/$/ {print $NF}'
    dir1/
awk '$n !~ /regEx/ {doSomething}'
for i in {1..3}; do echo "foo$i bar$i baz$i"; done | awk '$2 !~ /a.1/ {print $0}'
foo2 bar2 baz2
foo3 bar3 baz3
field vs regular expression with if / else construct (source) for i in {1..3}; do echo "foo$i bar$i baz$i"; done | awk '{ if($2 ~ "a.2") {print "MATCH : "$2 } else {print "NO MATCH"} }'
NO MATCH
MATCH : bar2
NO MATCH
several conditions
awk 'condition1 logicalOperator condition2 logicalOperator ... conditionN {doSomething}'
logicalOperator can be (source) :
  • && : logical AND
  • || : logical OR
for i in {1..3}; do echo "foo$i bar$i baz$i"; done | awk '$2 ~ "^ba.." && $3 == "baz3" {print $0}'
foo3 bar3 baz3
for i in {1..3}; do echo "foo$i bar$i baz$i"; done | awk '$1 ~ "1$" || $3 ~ "3$" {print $0}'
foo1 bar1 baz1
foo3 bar3 baz3
for i in {6..22}; do echo "a b c d e f g h $i"; done | awk '$NF==7 || $NF==21 {print $0}'
a b c d e f g h 7
a b c d e f g h 21
echo | awk '1==1 && (2==1 || 3==3) { print "ok" }'
ok

Numerical field with trailing unit letter or text :

If the numerical value has a trailing unit letter, the comparison silently becomes a string comparison and gives wrong results :

echo -e "foo\t8U\nbar\t34U\nbaz\t56U" | awk '$2 > 25 {print $0}'
foo	8U	ooops !
bar	34U
baz	56U
solution (strtonum() is gawk-specific) :
echo -e "foo\t8U\nbar\t34U\nbaz\t56U" | awk 'strtonum($2) > 25 {print $0}'
bar	34U
baz	56U

Try it :

strtonum() looks smart enough to handle trailing units (source) :
awk 'BEGIN {
	print "trailing unit (single letter) : " strtonum("123U")
	print "trailing unit (word) : " strtonum("123potatoes")
	print "leading unit (single letter) : " strtonum("Y123")
	print "leading unit (word) : " strtonum("banana123")
	}'
trailing unit (single letter) : 123	OK
trailing unit (word) : 123		OK
leading unit (single letter) : 0	KO
leading unit (word) : 0			KO
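A portable alternative to the gawk-only strtonum() : adding 0 forces a numeric conversion, which also stops at the first non-numeric character :

```shell
echo -e "foo\t8U\nbar\t34U\nbaz\t56U" | awk '$2+0 > 25 {print $0}'
# output :
# bar	34U
# baz	56U
```
Like strtonum(), this converts leading-unit strings such as "Y123" to 0.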

Flags

Flag Usage
-F x use x as the input Field separator
  • x can be several characters long : echo 'GAABCDBUABCDZOABCDMEU' | awk -F 'ABCD' '{print $1,$2,$3,$4}'
  • default field separator : any run of spaces and/or tabs and/or newlines (excluding leading and trailing runs) (details)
-i awkLibrary
--include awkLibrary
load the specified library awkLibrary (example)
-v variable=value
  • declare a variable (example)
  • use multiple -v to declare several variables : -v variable1=value1 -v variable2=value2
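For instance, to pass a shell variable into Awk :

```shell
greeting='Hello'
# the shell variable is passed into the Awk program via -v
echo 'world' | awk -v prefix="$greeting" '{ print prefix " " $1 }'
# output :
# Hello world
```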

Exit Status

Awk's exit status is 0 unless an exit statement supplied another value ; gawk uses 2 to report a fatal error.

Example

Process log files :

Count occurrences of an error message in a log file :

This code removes the leading [10-Oct-2012 18:15:46 UTC] timestamp (the first 3 fields) from every logfile line. This is why Awk is told to display all fields starting from the 4th :
awk '/^\[/ { for (i=4;i<=NF;i++) printf $i " ";print ""}' logFile

printf adds no newline after printing. print does.

Then count occurrences :
awk '/^\[/ { for (i=4;i<=NF;i++) printf $i " ";print ""}' logFile | sort | uniq -c | sort -nr

From a multiple-field line, display fields starting from the 4th :

In a log file such as :
[13-Nov-2013 03:03:35 Europe/Paris] PHP Warning: Memcached::touch(): ... in ....php on line 45
[13-Nov-2013 03:04:42 Europe/Paris] PHP Warning: file_get_contents(http://...): HTTP/1.0 404 Not Found in ...php on line 202
...
let's say you'd like to remove the date/time field to group and count similar errors. To do so :

awk '{ for (i=1;i<=3;i++) $i="";print }' file.log | awk '{sub(/^[ \t]+/, ""); print}' | sort | uniq -c | sort -nr

  • the 1st Awk command replaces the first 3 fields with an empty string, so that the line only contains the remaining fields, starting from the 4th as required
  • the 2nd Awk command just removes leading whitespaces (source)
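As a side note, both steps can be merged into a single Awk invocation (a sketch on a single sample line) :

```shell
# blank the first 3 fields, then strip the leading whitespace left behind
echo '[13-Nov-2013 03:03:35 Europe/Paris] PHP Warning: some error' \
	| awk '{ for (i=1;i<=3;i++) $i=""; sub(/^[ \t]+/, ""); print }'
# output :
# PHP Warning: some error
```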

Match a keyword from a variable :

Looks like you can't use a variable name within the // operator to select the matching lines (the variable name is taken as a literal string) :

  • DON'T : echo -e 'apple\nbanana\ncarrot' | awk -v letterToMatch='b' '/letterToMatch/'
  • DO : echo -e 'apple\nbanana\ncarrot' | awk -v letterToMatch='b' '$1 ~ letterToMatch'
extra examples to illustrate the above :

echo -e 'fruit: apple\nfruit: banana\nvegetable: carrot' | awk -v stuffToMatch='b' '/stuffToMatch/'
(nothing)
echo -e 'fruit: apple\nfruit: banana\nvegetable: carrot' | awk -v stuffToMatch='b' '$0 ~ /stuffToMatch/'
(nothing)
⇒ no match found, as said above


echo -e 'fruit: apple\nfruit: banana\nvegetable: carrot' | awk -v stuffToMatch='b' 'stuffToMatch'
fruit: apple
fruit: banana
vegetable: carrot
⇒ matches everything


echo -e 'fruit: apple\nfruit: banana\nvegetable: carrot' | awk -v stuffToMatch='b' '$0 ~ stuffToMatch'
fruit: banana
vegetable: carrot

echo -e 'fruit: apple\nfruit: banana\nvegetable: carrot' | awk -v stuffToMatch='b' '$1 ~ stuffToMatch'
vegetable: carrot

echo -e 'fruit: apple\nfruit: banana\nvegetable: carrot' | awk -v stuffToMatch='b' '$2 ~ stuffToMatch'
fruit: banana
⇒ these 3 examples work as expected

for httpCode in 301 302 304; do echo -n "Code $httpCode : "; awk -v needle="$httpCode" '$6 ~ needle {print " "}' logFile | wc -l; done

Selecting PID's :

ps --ppid 1 | awk '/d$/ {print $1}'
Lists processes whose parent's PID is 1, then selects processes whose name ends in 'd', and prints the corresponding PID, which is the line field #1.

Specifying the field separator :

awk -F ':' '{ print "username: " $1 "\t\tuid:" $3 }' /etc/passwd

List all ports and PIDs on which a Mongodb instance is listening :

netstat -laputen | awk '/mongo/ {print "IP:port = "$4"\tPID = "$9}' | sort | uniq

Select non-empty lines :

echo -e 'A\tB\tC\tD\nE\tF\tG\tH\n\nI\tJ\tK\tL' | awk '!/^$/ {print $3}'

C
G
K

Dark wizardry ?

This awk command made me scratch my head quite a bit : it returns, on a single output line, fields from 2 distinct input lines, which need not even be contiguous.
To figure this out, I simplified it and let the magic happen :
echo -e 'key1\tvalue1\nkey2\tvalue2' | awk '/key1|key2/ { printf $2 " " }'
value1 value2

Explanation :

  • the input (either an echo, a line "piped" in, or a whole file) is perfectly "normal" : there is no hack regarding field separators or end of line markers.
  • the /key1|key2/ part of the awk command is a "normal" regular expression alternation
  • the printf $2 " " part simply prints the 2nd field of each matching line, followed by a space
So what's the trick ?
Let's have a deeper look at how awk works and what we're instructing it to do with the echo | awk command above :
  1. no pre-process, so let's start eating lines and doing things
  2. awk splits the input into distinct lines
  3. awk reads the 1st line : key1\tvalue1
  4. does it match the regular expression ? Yes, so print the 2nd field and a space character : value1 
  5. done with this line, continue with the next line
  6. read the 2nd line : key2\tvalue2
  7. does it match the regular expression ? Yes again, so print the 2nd field and a space character : value2 
  8. The trick is that awk does not automatically add a newline character after printing. So the output of any step is printed right after the output of the previous step. This is why, at this step of the procedure, the output looks like : value1 value2 
  9. done with this line, no next line
  10. no post-process
  11. the end !
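To get a proper trailing newline after the last value, an END rule can print it once all records are processed :

```shell
# the END rule prints the final newline exactly once
echo -e 'key1\tvalue1\nkey2\tvalue2' | awk '/key1|key2/ { printf $2 " " } END { print "" }'
# output : "value1 value2 " followed by a newline
```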