Using awk

Cooperation with the shell

In all the examples thus far, the awk program is in a file and is retrieved using the -f flag, or it appears on the command line enclosed in single quotes, as in the following example:

   awk '{ print $1 }' ...

Since awk uses many of the same characters as the shell does, such as $ and ", surrounding the awk program with single quotes ensures that the shell passes the entire program unchanged to the awk interpreter.
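The effect of the quotes can be observed directly. The following is a minimal sketch, assuming a POSIX shell; the input line and the value SHELLARG are made up for illustration and do not come from the examples in this topic.

```shell
# Give the shell's own $1 a value so that the difference is visible.
set -- SHELLARG

# Single quotes: the shell leaves $1 alone, so awk sees its own field
# reference and prints the first field of the input line.
single=$(echo 'alpha beta' | awk '{ print $1 }')

# Double quotes: the shell substitutes $1 before awk starts, so awk is
# handed the program { print SHELLARG }. SHELLARG is then an unset awk
# variable, which prints as an empty string.
double=$(echo 'alpha beta' | awk "{ print $1 }")

echo "single-quoted: [$single]"
echo "double-quoted: [$double]"
```

The first line of output shows the field alpha; the second shows empty brackets, because the shell rewrote the program before awk ever saw it.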

Now, consider writing a command addr that searches a file addresslist for name, address, and telephone information. Suppose that addresslist contains names and addresses in which a typical entry is a multiline record such as the following:

G. R. Emlin
600 Mountain Avenue
Murray Hill, NJ 07974

Records are separated by a single blank line.

You want to be able to search the address list by issuing commands like the following:

addr Emlin

To do this, create a program of the following form:

   awk '
   BEGIN   { RS = "" }
   /Emlin/
   ' addresslist

The problem is how to get a different search pattern into the program each time it is run.

There are several ways to do this. One way is to create a file called addr that contains the following lines:

   awk '
   BEGIN   { RS = "" }
   /'$1'/
   ' addresslist

The quotes are critical here. The awk program is only one argument, even though there are two sets of quotes, because quotes do not nest. The $1 is outside the quotes, visible to the shell, which then replaces it with the pattern Emlin when you invoke the command addr Emlin.
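The first approach can be tried without creating any permanent files. The following is a sketch under stated assumptions: a shell function named addr stands in for the addr file, the address list is written to a temporary directory, and the second entry is invented sample data.

```shell
# Throwaway address list; the second entry is invented sample data.
dir=$(mktemp -d)
cat > "$dir/addresslist" <<'EOF'
G. R. Emlin
600 Mountain Avenue
Murray Hill, NJ 07974

A. N. Other
1 Example Street
Springfield, XX 00000
EOF

# A shell function stands in for the addr file: inside it, $1 is the
# first argument, just as it would be in a script.
addr() {
    awk '
    BEGIN { RS = "" }
    /'$1'/
    ' "$dir/addresslist"
}

found=$(addr Emlin)    # awk receives the pattern /Emlin/
printf '%s\n' "$found"
rm -rf "$dir"
```

Only the three lines of the Emlin record are printed. Because $1 is unquoted here, a pattern containing spaces would be split into several arguments; writing the middle piece as '"$1"' is one way to guard against that.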

A second way to implement addr relies on the fact that the shell substitutes for $ parameters within double quotes:

   awk "
   BEGIN   { RS = \"\" }
   /$1/
   " addresslist

Here you must protect the quotes defining RS with backslashes so that the shell passes them on to awk uninterpreted. $1 is recognized as a parameter, however, so the shell replaces it with the pattern when you invoke the following command:

addr pattern
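The same backslash protection applies to any awk field reference inside a double-quoted program. The following sketch is an illustration rather than part of the addr command described here: it assumes a shell function in place of the addr file, a one-record temporary address list, and an added FS setting so that each line of a record becomes one field.

```shell
dir=$(mktemp -d)
printf '%s\n' 'G. R. Emlin' '600 Mountain Avenue' 'Murray Hill, NJ 07974' \
    > "$dir/addresslist"

# Stand-in for the addr file. Inside double quotes the shell expands $1,
# so the awk field reference is written \$1 and every awk quote is \".
addr() {
    awk "
    BEGIN { RS = \"\"; FS = \"\\n\" }
    /$1/ { print \$1 }
    " "$dir/addresslist"
}

name=$(addr Murray)    # awk receives: /Murray/ { print $1 }
echo "$name"
rm -rf "$dir"
```

The unescaped $1 is replaced by the shell with the search pattern, while \$1 reaches awk as $1 and selects the first line of the matching record.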

A third way to implement addr is to use ARGV to pass the regular expression to an awk program that explicitly reads through the address list with getline:

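   awk '
   BEGIN   { RS = ""
             # compare with 0 so that the loop also ends if the file cannot be read
             while ((getline < "addresslist") > 0)
                if ($0 ~ ARGV[1])
                   print $0
   } ' "$@"

All processing is done in the BEGIN action. Because the program consists only of a BEGIN action, awk never tries to open the pattern in ARGV[1] as an input file.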

Notice that you can pass any regular expression to addr; in particular, you can retrieve parts of an address or telephone number, as well as a name.
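For instance, the ARGV approach can be exercised with patterns that match a ZIP code or a street name. The following is a sketch under stated assumptions: a shell function stands in for the addr file, the address list lives in a temporary directory, the second entry is invented, the getline loop tests its return value, and the arguments are passed as "$@".

```shell
dir=$(mktemp -d)
cat > "$dir/addresslist" <<'EOF'
G. R. Emlin
600 Mountain Avenue
Murray Hill, NJ 07974

A. N. Other
1 Example Street
Springfield, XX 00000
EOF

# Stand-in for the addr file, run in a subshell so that the cd stays
# local and the relative name "addresslist" resolves.
addr() (
    cd "$dir" || exit 1
    awk '
    BEGIN { RS = ""
            while ((getline < "addresslist") > 0)
                if ($0 ~ ARGV[1])
                    print $0
    }' "$@"
)

by_zip=$(addr '0797[0-9]')          # match on part of the ZIP code
by_street=$(addr 'Avenue|Street')   # alternation matches both records
printf '%s\n' "$by_zip"
echo '---'
printf '%s\n' "$by_street"
rm -rf "$dir"
```

The first search prints only the Emlin record; the second prints both records, one after the other, because each contains one of the two alternatives.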


© 2003 Caldera International, Inc. All rights reserved.
SCO OpenServer Release 5.0.7 -- 11 February 2003