Software factors that influence performance
The way in which applications are written
usually has a large impact on performance.
If they make inefficient use of processing power,
memory, disk, or other subsystems,
it is unlikely that you will improve the situation significantly
by tuning the operating system.
The efficiency of the algorithms used by an application,
and the way that it uses system services,
are usually beyond your control
unless you have access to the source code.
Some applications, such as large relational database systems,
provide extensive facilities for performance monitoring and tuning,
which you should study separately.
Some applications also document
the operating system tuning or configuration
that is needed to get the best performance.
For example, a database management system may use
raw access to a disk partition
rather than going through the buffer cache.
If so,
you can probably reduce the buffer cache size
to the minimum needed by the operating system,
networking, and any administration utilities.
This frees up memory for the application's own use.
Such applications may also require changes
to the default values of STREAMS
and interprocess communication resources
(semaphores, shared data, and message queues),
or the configuration of an additional
device driver
into the
kernel.
If you are using applications that have been developed locally,
and you have access to the source code for these,
there are several areas to look at when
optimizing their performance:
-
Use
prof(CP)
to examine the execution profile collected by the
monitor(S)
function; the profile shows
where a program spends most of its time. The
timex(ADM)
command can also provide valuable information about a
program's time spent executing in system and user mode,
its use of real memory,
and the amount of I/O data it processed.
-
Use
crash(ADM)
to discover the size of a process's
regions
(text, data, stack, shared libraries and so on), or
ps(C)
to find out the size of its swappable image (data and stack)
and its total
virtual memory
usage.
You can also use the
crash(ADM)
utility to find out what files a process has open.
-
Use
trace(CP)
to discover the names and arguments of the system calls
used by a program.
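To obtain the kind of user and system times that timex(ADM) reports,
but for a single region of code rather than a whole program,
a program can call the standard times() function around that region.
The following is a minimal sketch under stated assumptions:
the names cpu_cost, measure, syscall_heavy, and ticks_to_seconds are hypothetical,
and getppid() merely stands in for any cheap system call.

```c
#include <sys/times.h>
#include <unistd.h>

/* CPU time consumed by a region of code, in clock ticks. */
struct cpu_cost {
    clock_t user;    /* ticks spent executing in user mode */
    clock_t system;  /* ticks spent executing in system (kernel) mode */
};

/* Hypothetical workload: issue n cheap system calls. */
void syscall_heavy(long n)
{
    long i;

    for (i = 0; i < n; i++)
        (void)getppid();
}

/*
 * Run fn(arg) and store the CPU time it consumed in *out.
 * Returns 0 on success, or -1 if times() fails.
 */
int measure(void (*fn)(long), long arg, struct cpu_cost *out)
{
    struct tms before, after;

    if (times(&before) == (clock_t)-1)
        return -1;
    fn(arg);
    if (times(&after) == (clock_t)-1)
        return -1;
    out->user = after.tms_utime - before.tms_utime;
    out->system = after.tms_stime - before.tms_stime;
    return 0;
}

/* Convert clock ticks to seconds using the system's tick rate. */
double ticks_to_seconds(clock_t ticks)
{
    return (double)ticks / (double)sysconf(_SC_CLK_TCK);
}
```

A region whose system time is high relative to its user time
is a likely candidate for reducing the number of system calls it makes.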
Factors that you should take into account when assessing the
efficiency of an application's code are:
-
Does it use algorithms that make best use of time and storage?
-
How does the application handle shared access to data records or files?
If it holds locks for longer than necessary,
or locks more data than it needs,
other users are kept waiting and the system appears slow.
The application should at least warn
that the resource is locked by another user or process,
and perhaps allow read-only access.
Consideration should also be given to
who gets access when the resource is finally unlocked.
-
Are the optimization features of the compiler being used?
-
Are shell scripts being used
where a C program would be more efficient?
If you cannot avoid using shell scripts,
the Korn shell contains many built-in commands
that avoid the need to fork and exec subprocesses.
See
``Tuning script performance''
for more information.
-
Is it using large numbers of system calls?
System calls are expensive in processing
overhead
and may cause a
context switch
on the return from the call.
You can use
trace(CP)
to discover the system call usage of a program.
-
Is it using inefficient
read(S)
and
write(S)
system calls to move small numbers of characters at a time
between user space and kernel space?
If possible, use buffered I/O to avoid this.
-
Are formatted reads and writes to disk being used?
Unformatted (binary) reads and writes are much faster,
preserve numeric precision,
and generally need less disk space.
-
Is the application using memory efficiently?
Many older applications use disk extensively
since they were written in the days
of limited core storage and expensive memory.
-
Does the application call
malloc(S)
to allocate memory?
The version of the libc library in SCO OpenServer
makes much more use of dynamic data structures
in its implementation than did older versions.
This reduces the memory
that libc needs initially,
but it can cause a process's heap to grow faster.
malloc's memory allocation algorithm
uses the traditional first-fit strategy
and does not perform any garbage collection.
If malloc cannot find
a suitably sized unallocated piece of memory in its free list,
it will acquire more pages of memory for use
even if the total amount of memory available on the free list
is larger than that requested.
It is left to the application programmer
to try to avoid fragmenting memory.
Memory leakage can occur if you do not call
free(S)
to place blocks of memory back in the malloc pool
when you have finished with them.
-
Does the application group together routines that are used together?
This technique (which improves locality of reference)
tends to reduce the number of
text
pages
that need to be accessed when the program runs.
(The system does not load pages of program text into memory
when a program runs
unless they are needed for the program's execution.)
-
Does the application use shared libraries or dynamically linked libraries
(DLLs)?
The object code of shared libraries
can be used by several applications at the same time;
the object code of DLLs is also shared
and is only loaded when an application needs to access it.
Using either type of library
is preferable to using statically linked libraries
which cannot be shared.
-
Does the application use library routines and system calls
that are intended to enhance performance?
Examples of the APIs provided are:
-
Memory mapping makes the contents of a file available directly in a process's address space for processing
(see
mmap(S)).
-
Fixed-priority scheduling allows selected time-critical processes
to control how they are scheduled
and ensure that they execute when they have work to perform.
Applications can use the predictable scheduling behavior to improve
throughput
and reduce
contention
(see
sched_setparam(S)
and
sched_getparam(S)).
-
Support for high-performance
asynchronous I/O, semaphores and latches,
and high-resolution timers and
spin locks
for use by threaded applications (see
aio(FP),
semaphore(FP),
and
time(FP)).
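One common way for an application to limit heap fragmentation,
when it repeatedly allocates and releases objects of a single size,
is to keep released blocks on its own free list and reuse them.
The following sketch is an illustration only
and does not describe how malloc(S) itself is implemented;
struct pool and the pool_* functions are hypothetical names.

```c
#include <stddef.h>
#include <stdlib.h>

/* A pool of equally sized blocks that are recycled instead of freed. */
struct pool {
    void *free_list;    /* singly linked list of released blocks */
    size_t block_size;  /* size of every block handed out */
};

void pool_init(struct pool *p, size_t block_size)
{
    p->free_list = NULL;
    /* Each released block must be able to hold a link pointer. */
    p->block_size = block_size < sizeof(void *) ? sizeof(void *) : block_size;
}

/* Reuse a released block if one exists; otherwise fall back to malloc(). */
void *pool_get(struct pool *p)
{
    void *block = p->free_list;

    if (block != NULL) {
        p->free_list = *(void **)block;
        return block;
    }
    return malloc(p->block_size);
}

/* Return a block to the pool; the link is stored inside the block itself. */
void pool_put(struct pool *p, void *block)
{
    *(void **)block = p->free_list;
    p->free_list = block;
}

/* Hand every pooled block back to malloc's free list. */
void pool_destroy(struct pool *p)
{
    while (p->free_list != NULL) {
        void *next = *(void **)p->free_list;

        free(p->free_list);
        p->free_list = next;
    }
}
```

Because every block in the pool has the same size,
a released block always satisfies the next request,
so the heap does not grow while released blocks are available.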
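As an illustration of the memory-mapping interface listed above,
the following sketch maps a file read-only and scans it in place,
without copying it through a separate read buffer.
It assumes the standard mmap() interface;
count_byte_mapped is a hypothetical helper name.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Count the occurrences of byte c in the file at path by mapping
 * the file into memory. Returns the count, or -1 on error.
 */
long count_byte_mapped(const char *path, char c)
{
    struct stat st;
    char *base;
    long count = 0;
    off_t i;
    int fd;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {      /* a zero-length mapping is not allowed */
        close(fd);
        return 0;
    }

    base = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                  /* the mapping remains valid after close */
    if (base == MAP_FAILED)
        return -1;

    /* Pages are brought in on demand as the loop touches them. */
    for (i = 0; i < st.st_size; i++)
        if (base[i] == c)
            count++;

    munmap(base, (size_t)st.st_size);
    return count;
}
```

Calling count_byte_mapped(path, '\n') returns the number of newline characters in the file.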
© 2003 Caldera International, Inc. All rights reserved.
SCO OpenServer Release 5.0.7 -- 11 February 2003