awk

awk -f awk-program-file input-file

awk 'awk-program' input-file

awk program

Program element

search-pattern { program-action }

The curly braces around program-action are required. If the action, including its braces, is left out, the default action print is invoked.

If no search-pattern is specified, all lines will be matched.

program-action consists of zero, one, or more statements. If more than one statement is used, they are separated by semicolons (;) or newlines.
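The three forms can be sketched as follows. The input lines and the shell variable names are made up for illustration; they show a pattern without an action, an action without a pattern, and an action with two statements.

```shell
input=$(printf 'apple 1\nbanana 2\ncherry 3\n')

# Pattern only: the default action prints every matching line.
pattern_only=$(printf '%s\n' "$input" | awk '/an/')

# Action only: the action runs for every line.
action_only=$(printf '%s\n' "$input" | awk '{ print $2 }')

# Two statements in one action, separated by a semicolon.
two_statements=$(printf '%s\n' "$input" | awk '/cherry/ { n = $2 * 2; print $1, n }')

printf '%s\n' "$pattern_only" "$action_only" "$two_statements"
```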

Search patterns

BEGIN

BEGIN matches before the first line of the input file is read.

END

END matches after the last line of the input file has been read.
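A small sketch of BEGIN and END working together, assuming three lines of made-up input: BEGIN prints a header before any line is read, the middle rule counts lines, and END reports the total.

```shell
result=$(printf 'a\nb\nc\n' | awk '
  BEGIN { print "start" }            # runs before any input is read
        { count++ }                  # runs once for every input line
  END   { print "lines:", count }    # runs after the last line
')
printf '%s\n' "$result"
```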

/regular-expression/

var ~ /regular-expression/

Matches if var matches the regular expression regular-expression
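As an illustration, the following sketch tests only the second field against a regular expression. The names and roles in the input are invented for the example.

```shell
# Print the first field of every line whose second field starts with "adm".
result=$(printf 'alice admin\nbob user\ncarol admin\n' | awk '$2 ~ /^adm/ { print $1 }')
printf '%s\n' "$result"
```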

/begin-of-block-regex/,/end-of-block-regex/

Matching starts with the first line that matches begin-of-block-regex and stops after the line that matches end-of-block-regex (that line is still included). If a later line matches begin-of-block-regex again, a new block starts.

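Each side of the range can be any pattern, for example a match on a variable: var ~ /re1/,/re2/. Here only the start pattern is tested against var; /re2/ is matched against the whole line.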

awk '$1 ~ /seven/,/nine/ {print $2}' input-file

Contents of input-file:

one:        1
two:        2
three:      3
four:       4
five:       5
six:        6
seven:      7
eight:      8
nine:       9
ten:       10
eleven:    11
twelve:    12
thirteen:  13
fourteen:  14
fifteen:   15
sixteen:   16
seventeen: 17
eighteen:  18
nineteen:  19
twenty:    20
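Assuming the twenty lines above are the content of input-file, the range command can be traced with the sketch below. The helper function numbers is hypothetical and only reproduces that listing. The range opens at seven and closes at nine, then opens again at seventeen (whose first field also contains "seven") and closes at nineteen.

```shell
# Hypothetical helper that reproduces the sample input.
numbers() {
cat <<'EOF'
one:        1
two:        2
three:      3
four:       4
five:       5
six:        6
seven:      7
eight:      8
nine:       9
ten:       10
eleven:    11
twelve:    12
thirteen:  13
fourteen:  14
fifteen:   15
sixteen:   16
seventeen: 17
eighteen:  18
nineteen:  19
twenty:    20
EOF
}

result=$(numbers | awk '$1 ~ /seven/,/nine/ { print $2 }')
printf '%s\n' "$result"   # prints 7, 8, 9, 17, 18, 19, one per line
```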

var_1==var_2
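A comparison can serve as a pattern on its own. The sketch below, with invented input, prints the lines whose first two fields are equal, prefixed with the line number.

```shell
result=$(printf '3 3\n4 5\nx x\n' | awk '$1 == $2 { print NR ": " $0 }')
printf '%s\n' "$result"
```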

Built in functions

print x, y, z

If print is called without any arguments, $0 will be printed.
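The following sketch contrasts the three common ways of calling print on one made-up input line: arguments separated by commas, arguments placed next to each other (string concatenation), and no arguments at all.

```shell
result=$(printf 'a b c\n' | awk '{
  print $1, $3    # comma: the values are joined with a space by default
  print $1 $3     # no comma: the values are concatenated
  print           # no arguments: same as print $0
}')
printf '%s\n' "$result"
```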

printf

printf format-string, expr_1, expr_2,... expr_n

c: single character
d: decimal number
e: exponential
f: float
g: e or f, whichever is shorter
o: octal number
s: string
x: hexadecimal number

When print outputs a number, the default format is %.6g; it can be overridden with OFMT.
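The conversion characters listed above can be tried out in a BEGIN block, which needs no input file. The values are arbitrary and chosen only to make each conversion visible.

```shell
result=$(awk 'BEGIN {
  printf "%c|%d|%o|%x\n", "A", 42, 8, 255
  printf "%s|%5.2f|%e\n", "pi", 3.14159, 1500
  printf "%g|%g\n", 0.0001, 100000000
}')
printf '%s\n' "$result"
```

Note that printf, unlike print, does not append a newline by itself, so each format string ends in \n.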

length

length (variable)
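A short sketch with made-up input, printing the length of the whole line and of its first field:

```shell
result=$(printf 'hello\nhi there\n' | awk '{ print length($0), length($1) }')
printf '%s\n' "$result"
```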

substr

substr (variable, position-first-char, length)
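Character positions are counted from 1. The sketch below uses an arbitrary string and also shows that the length argument may be omitted, in which case the rest of the string is returned.

```shell
result=$(awk 'BEGIN {
  s = "abcdefgh"
  print substr(s, 3, 4)   # 4 characters starting at position 3
  print substr(s, 6)      # length omitted: rest of the string
}')
printf '%s\n' "$result"
```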

split

split (some_string, array_var [, field-separator])

NOTE: the first element of array_var after the split has index 1 (not 0, as C programmers might expect).

awk '{split ($0, a, "e"); print a[1]}' sample_file

Contents of sample_file:

just some text
to demonstrate
how split works

Output:

just som
to d
how split works
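split also returns the number of elements it stored, and the separator argument is optional. A minimal sketch with invented strings; the array names part and w are arbitrary:

```shell
result=$(awk 'BEGIN {
  n = split("2024-01-15", part, "-")
  print n, part[1], part[3]
  m = split("  two   words ", w)   # no separator given: FS is used
  print m, w[2]
}')
printf '%s\n' "$result"
```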

if

if (condition) action

if (condition) action
else           action

if (condition) {
  action_1;
  action_2;
  ..
  action_n;
}

if (condition) {
  action_1;
  action_2;
  ..
  action_n
}
else {
  action_m;
  ..
  action_q
}
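As a concrete instance of the forms above, this sketch classifies made-up numbers with a chained if / else if / else. The variable name sign is arbitrary.

```shell
result=$(printf '5\n12\n-3\n' | awk '{
  if ($1 < 0) {
    sign = "negative"
  } else if ($1 < 10) {
    sign = "small"
  } else {
    sign = "large"
  }
  print $1, sign
}')
printf '%s\n' "$result"
```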

while
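A minimal sketch of a while loop, assuming the usual form while (condition) action: it walks the fields of each made-up input line from last to first and prints them in reverse order.

```shell
result=$(printf 'a b c\nd e\n' | awk '{
  i = NF
  while (i >= 1) {
    # a space after every field except the last one printed
    printf "%s%s", $i, (i > 1 ? " " : "\n")
    i--
  }
}')
printf '%s\n' "$result"
```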

Built in variables

NF

Number of fields in the current line. The following script prints only the lines that contain exactly two words:

awk ' NF == 2 ' <filename>
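The same filter can be tried without a file by piping sample text into awk. The input is invented; the second command additionally shows that $NF refers to the last field of a line.

```shell
result=$(printf 'one\ntwo words\nthree little words\nagain two\n' | awk 'NF == 2')
last=$(printf 'a b c\n' | awk '{ print NF, $NF }')
printf '%s\n' "$result" "$last"
```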

NR

Number of the current record, that is, the line number awk is currently processing.

awk ' {printf "%3d %s\n", NR, $0} ' u
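A sketch of the numbering command on made-up input supplied through a pipe instead of the file u. The second command uses the line number as a pattern to select a single line.

```shell
numbered=$(printf 'alpha\nbeta\ngamma\n' | awk '{ printf "%3d %s\n", NR, $0 }')
second=$(printf 'alpha\nbeta\ngamma\n' | awk 'NR == 2')
printf '%s\n' "$numbered" "$second"
```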

FILENAME

RS

Record separator. Default: newline.

FS

Field separator. Default: whitespace (blanks and tabs). Can be changed with the -F command line option.
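The -F option sets the field separator FS, which can equally be assigned in a BEGIN block. The sketch below uses colon-separated sample lines invented for the example; both commands produce the same output.

```shell
by_option=$(printf 'root:x:0\nbin:x:1\n' | awk -F: '{ print $1, $3 }')
by_begin=$(printf 'root:x:0\nbin:x:1\n' | awk 'BEGIN { FS = ":" } { print $1, $3 }')
printf '%s\n' "$by_option" "$by_begin"
```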

OFS

ORS
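OFS is the output field separator that print places between comma-separated arguments (default: a space), and ORS is the output record separator that print appends after each record (default: a newline). A minimal sketch with made-up input:

```shell
result=$(printf 'a b c\nd e f\n' | awk 'BEGIN { OFS = "-"; ORS = ";" } { print $1, $2, $3 }')
printf '%s\n' "$result"
```

Because ORS is no longer a newline here, both records end up on one output line.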

OFMT

The format print uses for numeric output. Default: %.6g. See also printf.
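The effect of OFMT can be observed by printing the same number before and after changing it. The value is arbitrary; the last statement illustrates that integer values are printed as integers regardless of OFMT.

```shell
result=$(awk 'BEGIN {
  x = 3.14159265
  print x            # default OFMT "%.6g"
  OFMT = "%.2f"
  print x
  print 42           # integers are not affected by OFMT
}')
printf '%s\n' "$result"
```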

$0, $1...

$0 is the whole line being processed. $1 is the first word, $2 the second and so on. The words are obtained by splitting the line on the field separator FS.

The field variables are not read only, so $2="foo bar" is perfectly valid.
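Assigning to a field rebuilds $0 from the fields, joined with the output field separator. In this sketch on made-up input, NF stays at 3 after the assignment because the rebuilt line is not split again.

```shell
result=$(printf 'a b c\n' | awk '{ $2 = "foo bar"; print; print NF }')
printf '%s\n' "$result"
```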

Mathematical operators

Arrays

awk supports one-dimensional, associative arrays. The index can be a string or a number.

a[1]="one"
b["two"]="2"
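A common use of string indexes is counting occurrences. The sketch below tallies invented color names in an array called count (an arbitrary name) and uses the in operator to test whether an index exists.

```shell
result=$(printf 'red\nblue\nred\nred\nblue\n' | awk '
  { count[$1]++ }                  # the first field is the array index
  END {
    print "red", count["red"]
    print "blue", count["blue"]
    print ("green" in count)       # 0: no such index
  }
')
printf '%s\n' "$result"
```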

Examples

Thanks

Thanks to Dom Bragge who notified me of an error on this page.


	René Nyffenegger's collection of things on the web