Automating frequent tasks

How to control program performance

As mentioned earlier, in any shell script, 90% of the computational load is imposed by about 10% of the script. The main bottlenecks to look out for are the number of processes the script generates and the amount of data it reads from or writes to disk.

To improve the performance of a shell script, you need to be constantly aware of these considerations. Any activity that takes place in a main loop is likely to yield a big performance improvement if you can find a way to reduce the amount of disk I/O or the number of processes it requires. Activities that require a large data file may be sped up by switching to several smaller files, if possible. (A small file is one that is less than eight or ten kilobytes long; for technical reasons such files can be opened and scanned more rapidly than larger files.)
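
As a rough illustration of reducing the number of processes in a main loop, the following sketch counts to a limit in two ways. The function names are hypothetical, and the second form assumes a shell that supports $(( )) arithmetic (the Korn shell does; the original Bourne shell does not). The first form starts a new expr process on every pass through the loop; the second does the same work inside the shell.

```shell
# Slower form: every iteration starts a separate expr process.
# (count_with_expr is a hypothetical name used for illustration.)
count_with_expr() {
    i=0
    while [ "$i" -lt "$1" ]
    do
        i=`expr "$i" + 1`
    done
    echo "$i"
}

# Faster form: the arithmetic is done by the shell itself,
# so no extra process is created inside the loop.
# Assumes a shell with $(( )) arithmetic, such as ksh.
count_with_builtin() {
    i=0
    while [ "$i" -lt "$1" ]
    do
        i=$((i + 1))
    done
    echo "$i"
}

count_with_expr 200
count_with_builtin 200
```

Both functions print the same result; running each one under time with a larger limit should show how much of the elapsed time is spent creating processes.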

The standard development cycle, which should be applied to shell procedures as to other programs, is to write code, get it working, thoroughly test it, measure it, and optimize the important parts (outlined above), looping back to earlier stages wherever necessary. The time(C) command is a useful tool for optimizing shell scripts; it reports how long a command took to execute:

   $ time ls
   real	0m0.06s
   user	0m0.03s
   sys	0m0.03s

The values reported by time are the elapsed time while the command ran (the real time), the time the system spent executing system calls on behalf of the command (the sys time), and the time spent executing the command's own code (the user time). In practice, only the first value, the real time, is relevant at this level. Note that this is the output from the Korn shell's built-in time command; the Bourne shell output may vary. (If you have the Development System, the timex(ADM) command offers additional facilities.)

Because the SCO OpenServer system is multi-tasking, it is impossible to accurately judge how long a program is taking to run by any other means; a seemingly slow process may be the result of an unusually heavy load being placed on the computer by some other user or process. Each timing test should be run several times, because the results are easily disturbed by variations in system load.
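
One way to repeat a timing test is sketched below. The helper name time_runs is hypothetical, and the sketch assumes a shell with a built-in time and $(( )) arithmetic, such as the Korn shell. It runs the same command a given number of times under time so that the real times can be compared and any run disturbed by system load can be discounted.

```shell
# Run a command several times under time.
# Usage: time_runs COUNT COMMAND [ARGS...]
# (time_runs is a hypothetical helper, not a standard command.)
time_runs() {
    runs=$1
    shift
    n=1
    while [ "$n" -le "$runs" ]
    do
        # Label each run on standard error, where time also reports.
        echo "run $n:" >&2
        time "$@"
        n=$((n + 1))
    done
}

# Time ls three times, discarding its listing but keeping the timings.
time_runs 3 ls > /dev/null
```

If one run shows a much larger real time than the others, the difference is more likely to reflect other activity on the system than the command itself.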

A useful technique is to encapsulate the body of a loop within a function, so that the sole activity within the loop is to call that function; you can then time the function, and time the loop as a whole. Alternatively, you can time individual steps in the process to see which of them are taking longest.
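
The technique might look like the following sketch. The names process_item and main_loop are hypothetical, and the loop body is only a stand-in for real work. The sketch assumes the shell's built-in time, as in the Korn shell; an external time command is a separate program and cannot run a shell function.

```shell
# Hypothetical loop body, moved into a function of its own.
# Here it simply converts its argument to upper case.
process_item() {
    echo "$1" | tr '[a-z]' '[A-Z]'
}

# The loop now does nothing except call the function.
main_loop() {
    for item in alpha beta gamma
    do
        process_item "$item"
    done
}

# Time a single call to the function, then the loop as a whole.
time process_item alpha
time main_loop
```

Comparing the two sets of figures gives an indication of how much of the loop's elapsed time is spent in the body and how much in the loop itself.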


© 2003 Caldera International, Inc. All rights reserved.
SCO OpenServer Release 5.0.7 -- 11 February 2003