Shared libraries

Tuning the shared library code

This section suggests ways to organize shared library code to improve performance. The suggestions apply to paging systems.

The non-shared C library contains several diverse groups of functions. Many processes use different combinations of these groups, making the paging behavior of any shared C library difficult to predict. A shared library should offer greater benefits for more homogeneous collections of code. For example, a database library probably could be organized to reduce system paging substantially, if its static and dynamic calling dependencies were more predictable.

Profile the code

First, profile the code that might go into the shared library (see prof(CP)).

Choose library contents

Based on the profiling information, decide what to include in the shared library. a.out file size is a static property, and paging is a dynamic property. These static and dynamic characteristics may conflict, so you have to decide whether the performance lost is worth the disk space gained. See ``Choosing library members'' for more information.

Organize to improve locality

Try to improve locality of reference by grouping dynamically related functions. If every call of funcA generates calls to funcB and funcC, try to put them in the same page. cflow(CP) (documented in the Programmer's Reference) generates this static dependency information. Combine it with profiling to see what things actually are called, as opposed to what things might be called.
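One way to combine the static and dynamic views is sketched below in C. It assumes that the call edges reported by cflow and the per-function call counts reported by the profiler have already been transcribed into an array by hand or by a script. The struct call_edge type, its fields, and keep_hot_edges() are hypothetical names used only for illustration; neither tool produces this structure directly.

```c
#include <assert.h>
#include <stddef.h>

/*
 * One static call dependency (for example, taken from cflow output),
 * annotated with the number of calls to the callee that the profiler
 * reported. This layout is an assumption made for the sketch.
 */
struct call_edge {
    const char   *caller;
    const char   *callee;
    unsigned long callee_calls;   /* dynamic count from profiling */
};

/*
 * Keep only the edges whose callee was called at least min_calls
 * times. Surviving edges are moved to the front of the array in their
 * original order. Returns the number of edges kept.
 */
static size_t keep_hot_edges(struct call_edge *edges, size_t n,
                             unsigned long min_calls)
{
    size_t i, kept = 0;

    assert(edges != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (edges[i].callee_calls >= min_calls)
            edges[kept++] = edges[i];
    }
    return kept;
}
```

The edges that remain describe what actually is called, so they are the candidates for placement in the same page. The threshold is a tuning choice, not a fixed rule.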

Align for paging

Arrange the shared library target's object files so that frequently used functions do not unnecessarily cross page boundaries. When arranging object files within the target library, be sure to keep the text and data files separate. You can reorder text object files without breaking compatibility; the same is not true for object files that define global data. Use name lists and disassemblies of the shared library target file to determine where the page boundaries fall.
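The page boundary check can be sketched in C as follows. The sketch assumes that you have already read each function's start address and size from a name list or disassembly, and that you know the page size of the target system. crosses_page() and page_of() are hypothetical helper names, and the 4096-byte page size used in the note after the code is only an example.

```c
#include <assert.h>

/* Page number that contains the given address. */
static unsigned long page_of(unsigned long addr, unsigned long page_size)
{
    assert(page_size > 0);
    return addr / page_size;
}

/*
 * Returns 1 if a function occupying the bytes [start, start + size)
 * spans more than one page, and 0 otherwise. A zero-size symbol is
 * treated as never crossing a boundary.
 */
static int crosses_page(unsigned long start, unsigned long size,
                        unsigned long page_size)
{
    if (size == 0)
        return 0;
    return page_of(start, page_size) !=
           page_of(start + size - 1, page_size);
}
```

For example, with a 4096-byte page, crosses_page(0x1ff0, 0x20, 4096) reports a crossing because the function begins 16 bytes before the boundary at 0x2000 and ends 16 bytes after it. A function larger than one page always crosses at least one boundary.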

After grouping related functions, break them into page-sized chunks. Although some object files and functions are larger than a single page, most of them are smaller. Use the infrequently called functions as glue between the chunks. Because the glue between pages is referenced less frequently than the page contents, the probability of a page fault decreases.
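The chunking step can be sketched in C as a simple in-order packing. The sketch assumes the functions are already sorted so that related functions are adjacent, and that their sizes are known. pack_chunks() and its parameters are hypothetical names; the gap it reports for each chunk is the space left over for the infrequently called glue functions.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Assign each function, in the order given, to a page-sized chunk.
 *
 *   sizes[i]  size in bytes of the i-th function
 *   chunk[i]  output: index of the chunk that holds function i
 *   gap[c]    output: bytes left unused at the end of chunk c
 *             (must have room for n entries)
 *
 * A new chunk is started whenever the next function would not fit in
 * the space remaining in the current one. A function larger than a
 * page gets a chunk to itself, and that chunk's gap is reported as 0.
 * Returns the number of chunks used.
 */
static size_t pack_chunks(const size_t *sizes, size_t n, size_t page_size,
                          size_t *chunk, size_t *gap)
{
    size_t i, nchunks = 0, used = 0;

    assert(page_size > 0);
    for (i = 0; i < n; i++) {
        if (nchunks == 0 || (used > 0 && used + sizes[i] > page_size)) {
            if (nchunks > 0)   /* close the previous chunk */
                gap[nchunks - 1] = used >= page_size ? 0 : page_size - used;
            nchunks++;
            used = 0;
        }
        chunk[i] = nchunks - 1;
        used += sizes[i];
    }
    if (nchunks > 0)           /* close the last chunk */
        gap[nchunks - 1] = used >= page_size ? 0 : page_size - used;
    return nchunks;
}
```

This is only a first approximation. It does not reorder functions to fill pages more tightly, because reordering could separate functions that were grouped for locality.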

After determining the branch table, arrange the library's object files without breaking compatibility. Put frequently used, unrelated functions together because they probably will be called randomly enough to keep the pages in memory. System calls go into another page as a group, and so on. The following example shows how to change the order of the C library's object files:

   Before        After

   #objects      #objects
   ...           ...
   printf.o      strcmp.o
   fopen.o       malloc.o
   malloc.o      printf.o
   strcmp.o      fopen.o
   ...           ...

Avoid hardware thrashing

You can improve performance by arranging the typical process to avoid cache entry conflicts. If a heavily used library had both its text and its data segment mapped to the same cache entry, the performance penalty would be particularly severe. Every library instruction would bring the text segment information into the cache. Instructions that referenced data would flush the entry to load the data segment.
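To make the conflict concrete, the following C sketch models a direct-mapped cache, in which each memory address can occupy exactly one cache entry. The direct-mapped model, the struct cache_geom type, and the geometry values in the note after the code are assumptions for illustration; consult the hardware documentation for the actual cache organization of your system.

```c
#include <assert.h>

/* Hypothetical direct-mapped cache geometry. */
struct cache_geom {
    unsigned long line_size;   /* bytes held by one cache entry */
    unsigned long num_lines;   /* number of entries in the cache */
};

/* Cache entry that the given address maps to. */
static unsigned long cache_index(const struct cache_geom *g,
                                 unsigned long addr)
{
    assert(g->line_size > 0 && g->num_lines > 0);
    return (addr / g->line_size) % g->num_lines;
}

/*
 * Returns 1 if addresses a and b compete for the same cache entry,
 * that is, they map to the same entry but belong to different lines
 * of memory. Returns 0 otherwise.
 */
static int cache_conflict(const struct cache_geom *g,
                          unsigned long a, unsigned long b)
{
    return cache_index(g, a) == cache_index(g, b) &&
           a / g->line_size != b / g->line_size;
}
```

With 512 entries of 16 bytes each (an 8192-byte cache), a text address of 0x400000 and a data address of 0x402000 conflict because they are exactly one cache size apart. Moving the data by one line, to 0x402010, removes the conflict in this model.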


© 2003 Caldera International, Inc. All rights reserved.
SCO OpenServer Release 5.0.7 -- 11 February 2003