Putting the library bloat thing in perspective



Now that we've been discussing which new languages we should incorporate
into GNOME, I think it's useful to perform a (rather crude) dependency
analysis on a typical GNOME desktop.

Why?
* we'll answer the question of 'which libs are most commonly used'

Why does this answer help?
* knowing the most-used dependencies tells us which libraries are most
commonly distributed, and so which software can be included most easily
without incurring library bloat.
* it helps us determine which libs to target as development platforms.
For instance, if libz is used twice as much as liby, you know libz is a
safer bet than liby as a platform lib.

You can also glean info on which GNOME binding is more actively used.

This is just information.  As such, it may help only in the context of a
decision process.  It may not help at all, either.  Feel free to
reproduce this on your computer and see what differences you encounter.

Let's see:

Running this small sh script:


# Collect the first field of every ldd line for every system binary.
: > /tmp/analysis
for a in /bin/* /usr/bin/* /sbin/* /usr/sbin/* ; do
      ldd "$a" 2>/dev/null | awk ' { print $1 } ' >> /tmp/analysis
done
# Drop the "not a dynamic executable" lines, then count and rank.
sort < /tmp/analysis | grep -v '^not$' > /tmp/sorted
uniq -c < /tmp/sorted > /tmp/counted
sort -n -r /tmp/counted > /tmp/final
rm -f /tmp/analysis /tmp/sorted /tmp/counted


gives me a /tmp/final for which the top ten lines are:

   2355	/lib/ld-linux.so.2
   2355	libc.so.6
   1086	libm.so.6
    850	libdl.so.2
    752	libz.so.1
    630	libpthread.so.0
    529	libresolv.so.2
    505	libX11.so.6
    490	libXext.so.6
    462	libgcc_s.so.1

So, as we can see, /lib/ld-linux.so.2 (the dynamic loader) is, no
surprise here, tied with libc as the most widely linked-to library, with
2355 linking binaries.  Then come the math, dl, zlib compression,
threading and resolver libs, followed by the X11 libs.

Alright.  Lines 11-20 say:

    458	libstdc++.so.5
    455	libfreetype.so.6
    451	libSM.so.6
    451	libICE.so.6
    439	libjpeg.so.62
    438	libexpat.so.0
    419	libXrender.so.1
    419	libfontconfig.so.1
    417	libXrandr.so.2
    417	libXft.so.2

The standard C++ lib.  FreeType, JPEG libraries, and font/X11 libs.

Lines 21-30 begin to show Qt/KDE libs.  Only at line 47 do GNOME libs
appear.  libgmodule is linked to 189 times.
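
A quick way to find where a given lib lands in the ranking is to let
grep number the lines.  This is only a sketch: rank_of is a helper name
made up for this example, and it assumes /tmp/final has the layout
produced by the script above (it also relies on grep's -m option, which
GNU and BSD grep have).

```shell
# rank_of PATTERN [FILE]
# Print "RANK:LINE" for the first entry of the ranked list matching PATTERN.
# FILE defaults to /tmp/final, the output of the analysis script.
rank_of() {
    grep -n -m 1 -e "$1" "${2:-/tmp/final}"
}

# Example (output depends on your own system):
#   rank_of libgmodule
```

The number before the colon is the rank; the rest is the count and the
lib name as they appear in /tmp/final.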

This is a preliminary analysis.  RPM DB analysis follows:

RPM analysis:

A quick script modification reveals which RPM packages are most required
by other software, by looking up which package in the RPM DB provides
each linked-to library.  Package analysis is more useful than plain lib
analysis because it lets us spot more quickly the relation between
packages, under the names they are distributed with, and the software
that uses them:


: > /tmp/analysis
for a in /bin/* /usr/bin/* /sbin/* /usr/sbin/* ; do
      ldd "$a" 2>/dev/null | awk ' { print $1 } ' >> /tmp/analysis
done
sort < /tmp/analysis | grep -v '^not$' > /tmp/sorted
# Map each library name to the package that provides it.
xargs rpm -q --whatprovides < /tmp/sorted > /tmp/inrpmdb
sort < /tmp/inrpmdb > /tmp/sorted
uniq -c < /tmp/sorted > /tmp/counted
sort -n -r /tmp/counted > /tmp/final
rm -f /tmp/analysis /tmp/sorted /tmp/inrpmdb /tmp/counted


Let's see results.  Top ten:

   8702	glibc-2.3.2-101.1
   4237	XFree86-libs-4.3.0-42
   1746	kdelibs-3.2.0-0.1
    760	glib2-2.2.3-1.1
    752	zlib-1.2.0.7-2
    713	krb5-libs-1.3.1-6
    616	netpbm-9.24-12
    480	pango-1.2.5-1.1
    464	freetype-2.1.4-5
    462	libgcc-3.3.2-1

No surprises here, right?  The KDE libs are very, very widely used here,
though.  That's no surprise for me either, because I'm more of a KDE
user (been waiting for G2.6 and Spatial Nautilus).  It's also because
most KDE libs come in a single package (bad release practice in my
opinion, but it seems to work for them, where everything is a bit more
centralized than in the GNOME camp).

From a statistical standpoint, the use frequencies were expected to be
distributed in this fashion (a 1/n style curve).  The first package is
used about twice as often as the second one.
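
If the curve really were 1/n, then rank times count would stay roughly
constant down the list.  Here is a minimal sketch to eyeball that,
assuming /tmp/final keeps the count in the first column; zipf_check is a
name invented for this example.

```shell
# zipf_check [FILE]
# For each line of the ranked list print: rank, count, rank * count.
# On a 1/n curve the third column stays roughly constant.
# FILE defaults to /tmp/final.
zipf_check() {
    awk '{ printf "%d %d %d\n", NR, $1, NR * $1 }' "${1:-/tmp/final}"
}

# Example: look at the first ten entries only.
#   zipf_check | head -n 10
```

For the first four packages listed above, the products are 8702, 8474,
5238 and 3040, so the 1/n fit holds for the top two and then drops off
faster.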

11-20:

    458	libstdc++-3.3.2-1
    457	openssl-0.9.7a-23
    439	libjpeg-6b-29
    438	expat-1.95.5-3
    419	fontconfig-2.2.1-6.1
    382	gtk2-2.2.4-5.1
    363	libart_lgpl-2.3.16-1
    324	libpng-1.2.2-17
    309	e2fsprogs-1.34-1
    291	XFree86-Mesa-libGL-4.3.0-42

My my, the KDE libs are used more than the standard C++ lib.  That's a
surprise.  Among the toolkits, gtk2 comes second.  Gurus may want to add
the GTK and GNOME library numbers into one number to see a really fair
comparison.  GNOME provides more "separate" packages (I find this is a
good thing).


Conclusions:

on an average distro (average = mine):
* KDE is targeted more often than the Qt lib.
* the X11 libs are pervasively used
* the GNOME lib separation makes my stats look toasted (but adding them
up by hand, which I will NOT do at this time of the morning, should
provide an accurate picture)
* esound is more used than aRTS
* gtk2 is used four times as much as gtk1
* only 4 packages are using DBUS
* neon is used 2 times, libghttp 2 times, libsoup 3 times
* id3lib is 5 times, libid3tag 1 time
* librsvg is linked to only 4 times (didn't GNOME have pervasive SVG
support????)
* guile is used 6 times
* wxGTK is used 1 time
* gnomevfs is used 104 times, as opposed to kio 203 times
* there's no Sun Java compliant Java VM
* gnome-libs-1.4 are used a LOT (149 times)
* bonoboui is more linked to than gnomeui


I'm certain some of you will spin those factoids in a negative way, and
some of you will find a way to use these facts to support your ill views
=).  I will certainly use these to spin my ill views and support my
negative ways as well =).  I would publish this as a paper (properly
formatted, spun and spit at, I could force it down my college teachers'
throats), but I'm just 2 lazy.

For the reader:

grep your /tmp/final for all libs the base GNOME depends upon, and sum
their frequencies (column #1) to compare the tally against KDE's libs.
grep /tmp/final for the bindings libs and find out which binding is used
the most.
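
The summing can be done in one go.  A rough sketch, assuming /tmp/final
keeps the count in column #1; sum_counts is a made-up helper name, and
the pattern in the example is only an illustration, not a vetted list of
GNOME's base libs.

```shell
# sum_counts PATTERN [FILE]
# Add up column 1 of every line in the ranked list that matches the
# extended regex PATTERN.  Prints 0 when nothing matches.
# FILE defaults to /tmp/final.
sum_counts() {
    grep -E -e "$1" "${2:-/tmp/final}" |
        awk '{ total += $1 } END { print total + 0 }'
}

# Example with a hypothetical pattern (check what it matches first,
# since a loose pattern will also pick up unrelated packages):
#   grep -E 'glib2|gtk2|pango' /tmp/final
#   sum_counts 'glib2|gtk2|pango'
```

Run it once with the GNOME pattern and once with the KDE one, then
compare the two numbers.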

Keep the discussion going.

Oh:  On the mono/java discussion, I think that:
* Mono is technically superior to Java and can run code faster and in
more languages.  Fast enough for 99% of GNOME apps' needs.
* Java is more strategically positioned than Mono and would attract a
larger developer base.  Devels can develop against the GNU Classpath,
which is mostly familiar to them, while Mono newcomers from the .NET
field would need to use a different set of class libs for lots of stuff.
* Java is here now, Mono is "nearly there", for 80% of development
needs.  This is an important factor that I haven't seen considered yet
at all.
* I like mono more than java (?)  That probably has to do with me
equating Miguel much more closely to "a good person" than James
Gosling.  After all, Miguel had much to do with this awesome Evolution
thingie which I use every single day.
* Both managed code environs will have to wait until the interested
parties make an official declaration.
* GNOME needs a modern managed code compiled environment, at the
platform level, so people can write GNOME libs in managed code, with a
minimum of bugs (think "no memory leaks", think "exception handling"). 
This needs to go hand in hand with some VERY unobtrusive glue that
allows C apps and libs to use the managed code libraries (preferably
100% transparent).  Think OS-support level if you need to.
* GNOME needs a managed code scripting environment.  This scripting
environment needs to be able to manipulate apps as a macro client (push
button, write in label, hide window), and needs to be able to
manipulate apps' MODELS as objects (documents, filesystems, stuff).  We
are so fortunate in depending on the executable bit.  We can write
secure scripts, and UNIX provides the administrator with powerful
facilities to absolutely restrict script execution no matter what
(noexec /tmp, noexec /home, stuff).
-- 
Manuel Amador (Rudd-O) <amadorm usm edu ec>
not signing this cuz I'm on a laptop



