Using awk

Word frequencies

This section shows how to use associative arrays for counting. Suppose you want to count the number of times each word appears in the input, where a word is any contiguous sequence of characters other than blanks and tabs. The following program prints the word frequencies, sorted in decreasing order:

        { for (w = 1; w <= NF; w++) count[$w]++ }
   END  { for (w in count) print count[w], w | "sort -nr" }
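
The first statement uses the array count to accumulate the number of times each word is used: each field $w serves as a subscript, and the corresponding element is incremented. Once the input has been read, the second for loop pipes each final count, along with its word, into the sort command. Because for (w in count) visits the subscripts in an unspecified order, sort -nr is what puts the lines into decreasing numeric order. Running this program on the first two paragraphs of this chapter produces output that starts as follows:
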
   6 awk
   4 the
   4 programs
   4 for
   4 and
   3 you
   ...
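
To see the counting at work on a small input, the same program can be wrapped in a shell function and fed a few lines. This is only a minimal sketch: the function name word_freq and the sample text are assumptions made for illustration and do not come from the original program.

```shell
# word_freq (hypothetical name): print "count word" lines, most frequent first.
# Reads the files named as arguments, or standard input if none are given.
word_freq() {
    awk '
        # Each field is a word; use it as a subscript and bump its counter.
        { for (w = 1; w <= NF; w++) count[$w]++ }
        # After the last line, hand every count and word pair to sort.
        END { for (w in count) print count[w], w | "sort -nr" }
    ' "$@"
}

# Sample input (made up for this sketch): "the" x3, "cat" x2, "dog" x1.
printf 'the cat the\ncat the dog\n' | word_freq
# Prints:
#   3 the
#   2 cat
#   1 dog
```

The sample input was chosen so that no two words share a count. Words that occur equally often appear in whatever order sort gives to ties.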

© 2003 Caldera International, Inc. All rights reserved.
SCO OpenServer Release 5.0.7 -- 11 February 2003