On the cost of libraries



Following up on my earlier post, where I complained about the many 
libraries used by GNOME, I did some research into the actual costs.

Methodology:
Create a shared library (lib1.so) with 10000 symbols in it.
Create an app that links against this library and accesses all symbols in 
it twice. Subtracting the time taken by the second pass from the time 
taken by the first pass gives the approximate time spent in symbol 
lookup.
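
A setup like this is easiest to produce with a small generator. The sketch 
below is only an illustration of the method described above, not the code 
used for the measurements: the symbol names (sym0, sym1, ...), the 
call_all helper and the use of gettimeofday are all assumptions, and it 
assumes lazy binding so that the first call to each function pays for the 
lookup.

```python
# Sketch: generate C sources for a library with many symbols and an app
# that calls every symbol twice, timing each pass.
# All names (sym0..symN, call_all, now_usec) are hypothetical.

N_SYMBOLS = 10000


def library_source(n=N_SYMBOLS):
    """Return C source defining n exported functions sym0 .. sym(n-1)."""
    return "".join(f"int sym{i}(void) {{ return {i}; }}\n" for i in range(n))


def app_source(n=N_SYMBOLS):
    """Return C source that calls every symbol twice and times both passes."""
    decls = "".join(f"extern int sym{i}(void);\n" for i in range(n))
    calls = "".join(f"    sum += sym{i}();\n" for i in range(n))
    return f"""#include <stdio.h>
#include <sys/time.h>

{decls}
static double now_usec(void) {{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec * 1e6 + tv.tv_usec;
}}

/* With lazy binding, the first pass resolves each symbol on its first
   call; the second pass reuses the already resolved addresses. */
static long call_all(void) {{
    long sum = 0;
{calls}    return sum;
}}

int main(void) {{
    double t0 = now_usec();
    long a = call_all();
    double t1 = now_usec();
    long b = call_all();
    double t2 = now_usec();
    double lookup = (t1 - t0) - (t2 - t1);
    printf("sum=%ld lookup=%.2f usec/symbol\\n", a + b, lookup / {n});
    return 0;
}}
"""
```

Writing library_source() to lib1.c and app_source() to app.c gives the 
two files to compile; the library would be built as a shared object and 
the app linked against it.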

In order to compare the cost of having many libraries linked to the app, I 
also created 40 small libraries (just 2 symbols in each) that could be 
linked into the app.

Result:
The result depends on the link order of the libraries. If lib1.so is 
specified as the first library on the link line, then the speed of 
symbol lookup in that library is the same, independent of how many 
libraries are linked in. (This is due to the way the symbol resolver 
works: it looks up the symbol in each library in the RTLD_GLOBAL space, in 
the order they were loaded.)
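
The effect of load order can be shown with a toy model. This is a 
deliberately simplified sketch, not how the dynamic linker is implemented: 
each library is modelled as a Python dict, and the only thing counted is 
how many libraries get probed before the symbol is found. The library and 
symbol names are made up for the example.

```python
# Toy model of lookup order: each library is a dict of its symbols, and
# the resolver probes the libraries in load order until one has the symbol.
# Hashing details are ignored; only the number of probed libraries counts.

def make_libs(n_small, big_first, n_big_symbols=10000):
    """Build the load-order list: one big library plus n_small small ones."""
    big = {f"sym{i}": "lib1.so" for i in range(n_big_symbols)}
    small = [
        {f"small{k}_a": f"libsmall{k}.so", f"small{k}_b": f"libsmall{k}.so"}
        for k in range(n_small)
    ]
    return [big] + small if big_first else small + [big]


def probes_for(symbol, libs):
    """Return how many libraries are probed before symbol is found."""
    for count, lib in enumerate(libs, start=1):
        if symbol in lib:
            return count
    raise KeyError(symbol)


for n_small in (0, 10, 40):
    first = probes_for("sym42", make_libs(n_small, big_first=True))
    last = probes_for("sym42", make_libs(n_small, big_first=False))
    print(n_small, first, last)
```

With the big library first, the probe count stays at 1 no matter how many 
small libraries follow. With it last, every lookup has to miss in each 
small library first, so the count grows with the number of libraries.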

If, on the other hand, I link the small libraries before the large one, I 
get a speed difference.
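
The two link orders differ only in where lib1 appears on the command line. 
A small sketch that builds both command lines, assuming the small 
libraries are called libsmall1.so, libsmall2.so and so on (hypothetical 
names) and that everything lives in the current directory:

```python
# Sketch: build the gcc link line for either library order.
# -l1 refers to lib1.so; the libsmallK.so names are hypothetical.
# Note: toolchains that default to --as-needed may drop libraries the app
# does not reference, which would defeat the purpose of this test.

def link_command(n_small, big_first):
    """Return a gcc command linking app.c against lib1 and n_small small libs."""
    big = ["-l1"]
    small = [f"-lsmall{k}" for k in range(1, n_small + 1)]
    libs = big + small if big_first else small + big
    return " ".join(["gcc", "-o", "app", "app.c", "-L."] + libs)


print(link_command(2, big_first=True))
print(link_command(2, big_first=False))
```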

Here are some results on my 700 MHz Athlon, with the small libraries 
linked before lib1.so:
N libs   usecs/lookup   delta
=============================
1        2.43           
1+10     3.98           1.55
1+20     6.24           2.26
1+30     9.62           3.38
1+40    12.96           3.34

As we can see, linking to 40 extra libs increased the cost of symbol 
lookups by more than a factor of 5.
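
The delta column and the overall factor can be rechecked from the 
usecs/lookup column alone. A short sketch using the numbers from the table 
(the function names are just for this example):

```python
# Recompute the delta column and the overall slowdown from the table.
# Each row is (number of extra small libs, usecs per lookup).
measurements = [(0, 2.43), (10, 3.98), (20, 6.24), (30, 9.62), (40, 12.96)]


def deltas(rows):
    """Difference in usecs/lookup between consecutive rows."""
    return [round(b[1] - a[1], 2) for a, b in zip(rows, rows[1:])]


def slowdown(rows):
    """Ratio of the last row's lookup time to the first row's."""
    return rows[-1][1] / rows[0][1]


print(deltas(measurements))
print(round(slowdown(measurements), 2))
```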

Now, does this matter?

The extra cost is only paid on the first reference to a symbol; all 
following references to the symbol are unaffected. So this is mainly a 
startup time issue.

Symbol lookup for 10000 symbols took 26 msec with 1 library, and 132 msec 
with the 40 extra libs. Extrapolating to 60 libs (this is what gnucash 
had) gives about 185 msec.
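
The 185 msec figure is consistent with a straight-line extrapolation from 
the two measured totals. A sketch of that arithmetic, assuming the cost 
grows linearly with the number of extra libraries (the function and 
parameter names are made up for the example):

```python
# Linear extrapolation from the two measured totals:
# 26 msec with lib1 alone, 132 msec with 40 extra libraries.

def extrapolate(extra_libs, base_ms=26.0, ms_at_40=132.0):
    """Estimate total lookup time in msec for a given number of extra libs."""
    per_lib = (ms_at_40 - base_ms) / 40   # msec added per extra library
    return base_ms + per_lib * extra_libs


print(extrapolate(60))
```

Under this assumption each extra library adds about 2.65 msec, so of the 
roughly 185 msec total, about 159 msec is due to the extra libraries.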

185 milliseconds extra startup time is noticeable, but how many apps 
actually reference 10000 symbols at startup? Gtk+ 2.0 with dependencies 
currently exports about 4500 symbols, and it is conceivable that the full 
Gnome 2 API will export about 10000 symbols.

No app will use all the symbols in the library though, and the 
measurements I made are worst case (all symbols are found in the last 
library). In reality most symbols referenced are probably in the lower 
layers (a fan-out effect, as higher level functions call lower level 
ones), and lower level libraries are (hopefully) linked earlier.

As a rough guess, I think the penalty for linking to 60 libraries is on 
the order of 50 milliseconds extra startup time. I think this is something 
we can live with.

Note that this only discusses one aspect of libraries: the cost of 
splitting code into many libraries. Libraries do have other costs too, 
such as paging in code and fixups. Most of the fixup cost can be avoided 
by using ELF prelinking, and page-in costs could be lowered with a 
grope-like tool.

To summarize, I think our current model is good and adding libraries 
is not a huge problem for us, although I would recommend against random 
splitting of libraries (e.g. the 10 guppy libraries linked to by gnucash 
might serve as a bad example).

/ Alex




