René Nyffenegger's collection of things on the web

awk

awk -f awk-program-file input-file
awk-program-file is a file that contains an awk program.
awk 'awk-program' input-file

awk program

program-element_1
program-element_2
...
program-element_n

Program element

search-pattern { program-action }
The curly braces around program-action are required. If { program-action } is left out entirely, the default action print is invoked.
If no search-pattern is specified, all lines will be matched.
program-action consists of zero, one, or more statements. If more than one statement is used, they are separated by a semicolon (;) or a newline.
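As a minimal sketch of the three forms a program element can take (the three-line input is made up for illustration):

```shell
# Hypothetical sample input, piped in so that no file is needed.
out=$(printf 'apple 1\nbanana 2\ncherry 3\n' | awk '
  /banana/                         # pattern only: the default action prints the line
  { n++; print n ": " $1 }         # action only: runs for every line, two statements
  $2 == 3 { print "last:", $1 }    # pattern and action
')
echo "$out"
```

The elements are tried in order for every input line, so the line containing banana is printed by the first element before the second element prints its counter.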

Search patterns

BEGIN

BEGIN matches before the first line of the input file is read.

END

END matches after the last line of the input file is read.
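A short sketch of BEGIN and END framing the per-line processing (the input and the variable name n are made up for illustration):

```shell
out=$(printf 'a\nb\nc\n' | awk '
  BEGIN { print "start" }            # runs before any input is read
        { n++ }                      # runs once for every input line
  END   { print n, "lines read" }    # runs after the last line
')
echo "$out"
```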

/regular-expression/

Matches every line that matches the regular expression regular-expression.

var ~ /regular-expression/

Matches if var matches the regular expression regular-expression
var can be (and usually is) one of $0, $1, ...
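The difference between matching the whole line and matching a single field can be sketched as follows (the sample input and the shell variable names are made up for illustration):

```shell
input='ab 12
cd ab
ef 34'
# /ab/ is tested against the whole line ($0).
whole=$(printf '%s\n' "$input" | awk '/ab/')
# $1 ~ /ab/ is tested against the first field only.
first=$(printf '%s\n' "$input" | awk '$1 ~ /ab/')
echo "$whole"
echo "---"
echo "$first"
```

The first command prints two lines, the second only the line whose first word contains ab.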

/begin-of-block-regex/,/end-of-block-regex/

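Matching starts at the first line that matches begin-of-block-regex and stops after the next line that matches end-of-block-regex; both lines are included. If begin-of-block-regex matches again later, a new block starts.
Either side of the range can be any pattern, for example var ~ /re1/,/re2/. Note that only the first pattern tests var here; /re2/ is matched against the whole line ($0).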
The following example demonstrates the use of this feature:
awk '$1 ~ /seven/,/nine/ {print $2}' input-file
Here's the input-file:
one:        1
two:        2
three:      3
four:       4
five:       5
six:        6
seven:      7
eight:      8
nine:       9
ten:       10
eleven:    11
twelve:    12
thirteen:  13
fourteen:  14
fifteen:   15
sixteen:   16
seventeen: 17
eighteen:  18
nineteen:  19
twenty:    20
And here's the output:
7
8
9
17
18
19

var_1==var_2

Matches if var_1 is equal to var_2.
Possible operators:
  • ==
  • <
  • <=
  • !=
  • >
  • >=
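A minimal sketch of a comparison used as search pattern, assuming a made-up input whose second field is a number:

```shell
# Numeric comparison on the second field.
big=$(printf 'one 1\nten 10\ntwenty 20\n' | awk '$2 >= 10 { print $1 }')
# String comparison on the first field.
not_ten=$(printf 'one 1\nten 10\ntwenty 20\n' | awk '$1 != "ten" { print $1 }')
echo "$big"
echo "$not_ten"
```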

Built in functions

print

print x, y, z
If print is called without any arguments, $0 will be printed.
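The following sketch (with a made-up one-line input) shows the three common ways to call print. Arguments separated by commas are joined with the output field separator, arguments without commas are concatenated directly:

```shell
out=$(echo 'a b c' | awk '{
  print $1, $3     # comma: joined with a space (the default OFS)
  print $1 $3      # no comma: plain concatenation
  print            # no arguments: prints $0
}')
echo "$out"
```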

printf

printf format-string, expr_1, expr_2,... expr_n
format-string can contain the following conversion characters, each preceded by a %:
  • c: single character
  • d: decimal number
  • e: exponential
  • f: float
  • g: e or f, whichever is shorter
  • o: octal number
  • s: string
  • x: hexadecimal number
When print outputs a non-integer number, it uses the format %.6g by default; this can be overridden with OFMT. printf always uses the format given in format-string.
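A small sketch that exercises each of the conversion characters listed above (the values are arbitrary). Unlike print, printf does not add a newline, so \n has to be written explicitly:

```shell
out=$(awk 'BEGIN {
  printf "%c|%d|%s|%x|%o\n", "A", 42, "str", 255, 8
  printf "%e|%f|%g\n", 1234.5, 1234.5, 1234.5
}')
echo "$out"
```

The first line comes out as A|42|str|ff|10; the second line shows the same number in exponential, fixed and shortest notation.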

length

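length (variable)
Returns the number of characters in variable. If called without an argument, the length of $0 is returned.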

substr

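substr (variable, position-first-char, length)
Returns the part of variable that starts at position-first-char and is length characters long. The first character has position 1.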
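A minimal sketch of length and substr on a made-up input line. The last call leaves out the length argument, in which case substr returns everything up to the end of the string:

```shell
out=$(echo 'hello world' | awk '{
  print length($0)        # number of characters in the line
  print substr($0, 1, 5)  # 5 characters starting at position 1
  print substr($0, 7)     # from position 7 to the end
}')
echo "$out"
```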

split

split (some_string, array_var [, field-separator])
Note: to get the first element of array_var after the split, use the index 1 (not 0, as C programmers might be used to).
awk '{split ($0, a, "e"); print a[1]}' sample_file
sample_file:
just some text
to demonstrate
how split works
Output:
just som
to d
how split works

if

if (condition) action
if (condition) action
else           action
if (condition) {
  action_1;
  action_2;
  ..
  action_n;
}
if (condition) {
  action_1;
  action_2;
  ..
  action_n
}
else {
  action_m;
  ..
  action_q
}
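The if ... else form can be sketched like this, with a made-up input of one number per line and an arbitrary threshold of 10:

```shell
out=$(printf '5\n12\n' | awk '{
  if ($1 > 10) {
    print $1, "is big"
  } else {
    print $1, "is small"
  }
}')
echo "$out"
```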

while
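
A while loop repeats an action as long as its condition is true. A minimal sketch, assuming a made-up one-line input, that walks over all fields of a line:

```shell
out=$(echo 'a b c' | awk '{
  i = 1
  while (i <= NF) {    # NF is the number of fields in the line
    print i, $i        # $i is the field with number i
    i++
  }
}')
echo "$out"
```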

Built in variables

NF

Number of fields (words) in the current line.
The following script prints only the lines that contain exactly two words:
awk ' NF == 2 ' <filename>

NR

Number of the current record, that is, the number of the line awk is currently processing.
The following example numbers the lines in a file:
awk ' {printf "%3d %s\n", NR, $0} ' input-file

FILENAME

Current filename.

FS

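Input field separator. Default: a single space, which makes awk split fields on runs of spaces and tabs. Can be changed with the -F command line option.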

RS

Record separator. Default: newline. Can be changed by assigning to RS, for example in a BEGIN action.

OFS

The output field separator, default: space.
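A short sketch of FS and OFS working together, assuming a made-up colon-separated input. Both variables are set in a BEGIN action so that they are in effect before the first line is read:

```shell
out=$(printf 'root:x:0\nbin:x:1\n' | awk '
  BEGIN { FS = ":"; OFS = " -> " }   # split input on colons, join output with " -> "
  { print $1, $3 }                   # the comma is replaced by OFS
')
echo "$out"
```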

ORS

The output record separator, default: newline.

OFMT

The format used when print outputs non-integer numbers, default: %.6g. See also printf.
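The effect of OFMT can be sketched as follows (the value and the variable name x are arbitrary). Integer values are printed as integers regardless of OFMT:

```shell
out=$(awk 'BEGIN {
  x = 3.14159265
  print x            # formatted with the default %.6g
  OFMT = "%.2f"
  print x            # formatted with the new OFMT
  print 42           # integers are not affected
}')
echo "$out"
```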

$0, $1...

$0 is the whole line being processed. $1 is the first word, $2 the second and so on. The words are obtained by splitting the line on the input field separator FS.
The field variables are not read-only, so $2="foo bar" is perfectly valid.
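A minimal sketch of assigning to a field, using a made-up one-line input. After the assignment, $0 is rebuilt from the fields, but the line is not split again, so NF stays the same even though the new $2 contains a space:

```shell
out=$(echo 'a b c' | awk '{
  $2 = "foo bar"
  print          # $0 now contains the new second field
  print NF       # still 3 fields
}')
echo "$out"
```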

Mathematical operators

  • +
  • -
  • *
  • /
  • %
  • ++
  • --
  • op=
    op being one of +, -, *, /, %
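A few of these operators in use, sketched on a made-up input of one number per line (sum and n are arbitrary variable names):

```shell
out=$(printf '3\n4\n5\n' | awk '
  { sum += $1; n++ }                     # add up the values and count the lines
  END { print sum, sum / n, sum % n }    # total, average, remainder
')
echo "$out"
```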

Arrays

awk supports one-dimensional arrays. The index can be a string or an integer:
a[1]="one"
b["two"]="2"
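String indexes make it easy to count how often a word occurs. A minimal sketch, assuming a made-up input with one word per line and an array named count:

```shell
out=$(printf 'red\nblue\nred\n' | awk '
  { count[$1]++ }                               # one counter per distinct word
  END { print count["red"], count["blue"] }
')
echo "$out"
```

Array elements do not need to be declared; an element that has not been set yet counts as 0 when it is incremented.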

Examples

Thanks

Thanks to Dom Bragge who notified me of an error on this page.