Re: ARC & GNOME [Was: How we make decisions...]




Sean:

On Tue, 2004-12-07 at 15:55 -0600, Brian Cameron wrote:

Mike:


On Tue, 07 Dec 2004 14:09:46 -0600, Brian Cameron wrote:


I would also be very interested in dedicating time to improve the
overall developer documentation found on developer.gnome.org.

OK, just to clarify, Sean and I weren't talking about API reference docs,
more documentation on what libraries made breaking changes at which points
so ISVs can get information relevant to backwards compatibility. It's a
slight tangent to the API docs discussion :)

Right.  I did catch that, but I am also interested in working to generally
improve the developer documentation in GNOME, and not just API reference
docs.  In other words, getting better documentation about interface change
in libraries and applications from release-to-release would be very useful
to me.

One of my goals with such a site would be not only to document such
changes but serve as a center for education and collaboration to help
_avoid_ such breaks.  Documenting breaks might help developers to a
small extent, but whether a break is documented or not the users are
still screwed over due to apathy or carelessness.

Right.  There is a real need to both document change and also to ensure
that things do not break.  Though these two tasks are related, they are
probably separate tasks.  Improving the documentation tools and enhancing
the existing documentation will help to improve the first, while it
is probably necessary to use tools to assist with the second.

I'm attaching a script that we use here at Sun to verify that symbols
and functions are not changing from release to release.  The tool assumes
that if you were comparing, say GNOME 2.0 with 2.6, that you have both
the GNOME 2.0 libraries and the 2.6 libraries installed on the same
machine.  The script then generates symbol reports in /tmp that highlight
differences.  I believe I heard that the community has similar tools for
testing such things, but I wanted to share what we use here at Sun.
The script might require some hacking to make it work on Linux as well
as Solaris.
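
As a rough idea of what the Linux side of that hacking might look like,
here is a minimal sketch. It assumes GNU binutils, whose nm prints
space-separated "address type name" lines rather than the "|"-separated
table the attached script parses. The helper names exported_names and
list_exports are made up for illustration and are not part of the
attached scripts, and the set of type letters kept is just one
reasonable choice.

```sh
#!/bin/sh
# Sketch only: list the symbols a shared library exports, using GNU nm.

# exported_names: read "address type name" lines on stdin and print the
# names of defined global/weak symbols, sorted.  Undefined symbols have
# no address column (only two fields), so NF == 3 skips them.
exported_names() {
    awk 'NF == 3 && $2 ~ /^[TDBRWV]$/ {
             sub(/@.*/, "", $3)   # some nm versions append "@VERSION"
             print $3
         }' | sort -u
}

# list_exports <library>: hypothetical wrapper around nm for one library.
list_exports() {
    nm -D --defined-only "$1" | exported_names
}
```

Running list_exports on the old and the new copy of a library gives two
sorted lists that can be compared in the same way as the .func.list and
.data.list files the script writes to /tmp.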

If there are similar tools the community uses to do such checking, I'd
be interested in learning more about them.

Of particular irritation is when packagers take a library that is
perfectly versioned and otherwise developed sanely and make a package
that isn't.  Take Fedora Core and its curl package, for example.  They
have a single package that holds the latest major ABI/API version of
libcurl and curl.  So if you have App A that uses libcurl.so.2 and App B
that uses libcurl.so.3, you are completely screwed.  You can't even
(easily) do a manual rpm -i install of both versions because both
packages also include the curl command line utility.  This is a perfect
example of a library with proper versioning that packagers break, and
that's avoidable.  (For what it's worth, yes I did file a bug on Fedora
for this, although it has, after several months, still not even seen a
reply.)

However, I think that the gtk-doc tool might be a good place to document
library breakages.  Currently the gtk-doc tool doesn't provide the ability
to put comments in the code to highlight when interfaces change (when
structures change, when file-integration points change, when interaction
with environment variables changes, when enumerations change, etc.).
Currently the only thing gtk-doc provides that is in this arena is
information about what release new function interfaces were added.

If gtk-doc were enhanced to provide more information about interface change,
then this might be a good way to document such change.  gtk-doc could
certainly be enhanced to provide good information about what interface
changes happened from release-to-release.

If gtk-doc can notice interface change information, the tools should be
modified to raise utter hell when the interface *does* change, because
it shouldn't.  Ever.  Not until the 3.0 series.

gtk-doc isn't really written to automatically notice change in interfaces.
Instead it is a tool that takes comments provided in the code and builds
HTMLized docs from them.  While it isn't a good tool for identifying
change, it is a good way to clearly document changes from release-to-release.
Since the gtk-docs are what developers generally use to understand the
API, including such information there would ensure that people notice
such things.

For example, it isn't considered ABI breakage to add new values to an
enumeration in a minor release.  The gtk-docs currently don't highlight
such information, so the developer might only notice the change if they
compared old documentation with the new documentation (not very likely).
Therefore, a developer probably won't realize that making use of the
new enumeration value will mean that their compiled code won't work with
older versions of the library.  Including such information in the gtk-docs
would allow developers to make more informed choices about the interfaces
they choose to use.

I would like to see a general-purpose (and preferably cross-language
where possible) tool for generating an ABI description for a library and
allowing comparisons between two versions.  So during release you could
save a copy of the ABI definition, and during development you could
generate snapshots and compare them with the last release's version.  The
tool could then let you know whether ABI is broken (major version bump),
extended (minor version bump), or identical (revision or no version
bump).  The same tool could perhaps also be used for symbol table
generation for ELF libraries.
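
The broken/extended/identical decision described above can be sketched
in a few lines of shell. This is only an illustration under strong
assumptions: the inputs are two sorted files with one exported symbol
name per line, the function name classify_abi is invented here, and it
uses comm(1) rather than anything from the attached scripts. Symbol
names alone say nothing about changed function signatures or structure
layouts, so "identical" here only means that the set of symbol names is
unchanged.

```sh
#!/bin/sh
# Sketch only: classify the difference between two sorted symbol lists.
#
#   usage: classify_abi <old.list> <new.list>
#
#   broken    - at least one symbol was removed (major version bump)
#   extended  - symbols were only added         (minor version bump)
#   identical - same set of symbol names
classify_abi() {
    # comm -23 prints lines found only in the first file,
    # comm -13 prints lines found only in the second file.
    removed=`comm -23 "$1" "$2" | wc -l | tr -d ' '`
    added=`comm -13 "$1" "$2" | wc -l | tr -d ' '`

    if [ "$removed" -gt 0 ] ; then
	echo broken
    elif [ "$added" -gt 0 ] ; then
	echo extended
    else
	echo identical
    fi
}
```

Saving the symbol list at release time and running this against a
development snapshot would give the kind of early warning being asked
for, at least for removed symbols.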

Let me know if the attached scripts are at all useful to you.  They are
what we use to try and verify library compatibility from release-to-release.

--

Brian
#!/bin/sh
#
# @(#) - ReviewLibs.sh 1.2 04/07/30
#
#    Run LSD 2 on a number of GNOME libraries
#	David J. Brown - July 2004
#

# V1 is the directory where the earlier version of the system lives
# V2 is the directory hierarchy where the later version lives
V1=/home/bc99092/work/gnome/arc/check-libs/2.0-libs
V2=/usr/lib

LIBS1="libglib-2.0 \
"

LIBS2=" libgobject-2.0 \
	libgmodule-2.0 \
"
LIBS3="libpango-1.0 \
	libpangox-1.0 \
	libpangoft2-1.0 \
"
LIBS4="	libgtk-x11-2.0 \
	libatk-1.0 \
"
LIBS5="	libgdk-x11-2.0 \
	libgdk_pixbuf-2.0 \
	libgdk_pixbuf_xlib-2.0 \
"
LIBS6="	libglade-2.0 \
"
# Note: libxml2 is missing on S9u7/GNOME 2.6, so LIBS is defined here
# but not included in REVIEWLIBS below.
#
LIBS="	libxml2  \
"
REVIEWLIBS="$LIBS1 $LIBS2 $LIBS3 $LIBS4 $LIBS5 $LIBS6"

# main()
#
    for library in $REVIEWLIBS; do
	echo
	echo "--- Running symbols comparison on $library ---"
	GetSyms.sh ${V1}/${library}.so ${V2}/${library}.so
    done
#!/bin/sh
#
# @(#) - GetSyms.sh 1.3 04/07/30
#
#    Simple binary-level interfaces differences analysis tool
#
#    Given two versions of a library:
#	1. Collect each library's .so dependencies
#	2. Get symbols from each version of the library and compare them
#
#	David J. Brown - July 2004

# check_lib -
#	usage: check_lib <fullpathname_of_library>
#
#	Verify the library's existence, returning "NULL" if not found;
#	else follow any symlink to get the filename of the actual
#	ELF object and return that
#
check_lib() {
    LIB=$1
    if [ ! -f $LIB ] ; then
	echo NULL
    elif [ -L $LIB ] ; then
	# Chase symlink to find filename for real ELF object
	#
	LIB_real=`ls -l $LIB | nawk 'BEGIN {FS=">"}; {print $2}'`
	echo $LIB_real
    else
	echo $LIB
    fi
}

# so_dependencies -
#	Determine the various shared object dependencies
#	of a library
#
#    usage: so_dependencies <DIRECTORY_NAME> <LIBRARY_BASENAME>
#
so_dependencies() {
    DIR=$1; LIB=$2

    echo "Report for: $LIB" > /tmp/$LIB.report
    echo "" >> /tmp/$LIB.report

    echo "Direct shared object dependencies ($LIB)" >> /tmp/$LIB.report
    echo "" >> /tmp/$LIB.report
    dump -Lv ${DIR}/$LIB | egrep "NEEDED|SONAME|RUNPATH" >> /tmp/$LIB.report
    echo "" >> /tmp/$LIB.report

    echo "All shared object dependencies ($LIB)" >> /tmp/$LIB.report
    # Redirect stdout first so that 2>&1 also sends ldd's errors to the report
    ldd -r ${DIR}/$LIB >> /tmp/$LIB.report 2>&1
    echo "" >> /tmp/$LIB.report
}

# get_syms -
#	Extract the various categories of symbol of the given library
#
#    usage: get_syms <LIBRARY_BASENAME>
#
#	Uses: 
#	    /tmp/$LIB.syms	- a file listing all the symbols
#	Creates:
#	    /tmp/$LIB.imported
#	    /tmp/$LIB.data
#	    /tmp/$LIB.func
#
#	Working file (removed after use):
#	    /tmp/$LIB.defs
#
get_syms() {

    LIB=$1
    grep UNDEF /tmp/${LIB}.syms > /tmp/${LIB}.imported
    grep -v UNDEF /tmp/${LIB}.syms > /tmp/${LIB}.defs
    grep OBJT /tmp/${LIB}.defs | egrep -v "_DYN|_TABLE_|_etext|_end|_edata" \
	> /tmp/${LIB}.data
    grep FUNC /tmp/${LIB}.defs > /tmp/${LIB}.func
    rm /tmp/${LIB}.defs
}

# get_simple_lists -
#	Extract and sort simple lists for each symbol type 
#	of the given library
#
#    usage: get_simple_lists <LIBRARY_BASENAME>
#
#	Uses:
#	    /tmp/$LIB.imported
#	    /tmp/$LIB.data
#	    /tmp/$LIB.func
#	Creates:
#	    /tmp/$LIB.imported.list
#	    /tmp/$LIB.data.list
#	    /tmp/$LIB.func.list
#
get_simple_lists() {

    LIB=$1
    for syms in func data imported ; do
	if [ x$DEBUG = xtrue ] ; then
	    echo Doing syms: /tmp/${LIB}.$syms ...
	fi
	nawk 'BEGIN { FS="|" }; {print $8}' < /tmp/${LIB}.$syms \
		| sort > /tmp/${LIB}.${syms}.list
    done
}

# report_symbols -
#	Report the symbols in each category in a library-specific report
#
#    usage: report_symbols <LIBRARY_BASENAME>
#
#	Uses:
#	    /tmp/$LIB.imported.list
#	    /tmp/$LIB.data.list
#	    /tmp/$LIB.func.list
#	Creates:
#	    /tmp/$LIB.report
#
report_symbols() {
    LIB=$1

    count=1
    for syms in data func imported ; do
	sym_count=`wc /tmp/$LIB.${syms}.list | nawk '{print $1}'`

	case $syms in
	    data) 
		echo "Global Data defined ($sym_count): ($LIB)" \
		    >> /tmp/$LIB.report ;;
	    func) 
		echo "Functions defined ($sym_count): ($LIB)" \
		    >> /tmp/$LIB.report ;;
	    imported) 
		echo "Imports ($sym_count): ($LIB)" >> /tmp/$LIB.report 
		;;
	esac

	echo "" >> /tmp/$LIB.report
	sed 's/^/	/' /tmp/$LIB.${syms}.list >> /tmp/$LIB.report

	echo "" >> /tmp/$LIB.report
    done
}

# report_diffs -
#	Report on the library's symbol differences for each category
#	Use a simple method based on joining the symbols lists from each file
#
#    usage: report_diffs <LIB1_BASENAME> <LIB2_BASENAME>
#
#	Uses (files in /tmp):
#	    $REPORT 	- The overall comparison report file
#	    $LIB1.data.list, $LIB2.data.list
#	    $LIB1.func.list, $LIB2.func.list
#	    $LIB1.imported.list, $LIB2.imported.list
#	Creates (files in /tmp):
#	    $LIB1.jlist	- Joined list of unique symbols in each lib
#	Working files (in /tmp, removed after use):
#	    $LIB1.list, $LIB2.list
#
report_diffs() {
    LIB1=$1
    LIB2=$2
    GREP=/usr/xpg/bin/grep

    for syms in data func imported ; do

	# Count and list the deletions and additions per symbol category
	#
	case $syms in
	    data)
		LABEL="Global Data" ;;
	    func)
		LABEL="Functions" ;;
	    imported)
		LABEL="Imports" ;;
	esac

	# Determine the number of deletions, and additions -
	#    Simple algorithm:
	#	Label all syms in first file with a leading "1 "
	#	Label all syms in second file with a leading "2 "
	#	Use join(1) on field 2 (symbol name) to report only unique
	#
	sed 's/^/1 /' /tmp/$LIB1.$syms.list > /tmp/$LIB1.list
	sed 's/^/2 /' /tmp/$LIB2.$syms.list > /tmp/$LIB2.list
	join -v 1 -v 2 -j 2 /tmp/$LIB1.list /tmp/$LIB2.list > /tmp/$LIB1.jlist
	rm /tmp/$LIB1.list /tmp/$LIB2.list

	echo "$LABEL changes: (from $LIB1 to $LIB2)" >> /tmp/$REPORT 
	for change in deleted added; do
	    case $change in

	    deleted) 
		# Syms followed by a "<sp>1" are only in the first version
		#
		sym_count=`grep " 1" /tmp/$LIB1.jlist | wc | nawk '{print $1}'`
		echo "  $LABEL removed ($sym_count):" >> /tmp/$REPORT

		grep " 1" /tmp/$LIB1.jlist| nawk '{print $1}' \
		    | sed 's/^/	/' >> /tmp/$REPORT

		echo "" >> /tmp/$REPORT
		;;
	    added) 
		# Syms followed by a "<sp>2" are only in the second version
		#
		sym_count=`grep " 2" /tmp/$LIB1.jlist | wc | nawk '{print $1}'`
		echo "  $LABEL added ($sym_count):" >> /tmp/$REPORT

		grep " 2" /tmp/$LIB1.jlist| nawk '{print $1}' \
		    | sed 's/^/	/' >> /tmp/$REPORT
		echo "" >> /tmp/$REPORT
		;;
	    esac
	done

	echo "" >> /tmp/$REPORT
    done
}

usage() {
	echo $0 - Compare interfaces in two versions of a library
	echo "    usage: $0 <library-version1> <library-version2>"
	exit 1
}

# XXX - Set the following to false to skip the syms analysis phase
#	(for debugging the report generation mostly)
#
DO_ANALYZE=true
DEBUG=false

# main() 
#
#	Synopsis: "Let the games begin"
#
#
    if [ $# -lt 2 ] ; then 
	usage 
    fi

    LIBV1=$1
    LIBV2=$2

    #   Get directory and basenames of the libraries
    #
    DIR1=`dirname $LIBV1`
    DIR2=`dirname $LIBV2`
    echo DIR1 is $DIR1, DIR2 is $DIR2

    LIB1=`check_lib $LIBV1`
    if [ $LIB1 = NULL ] ; then
	echo "Warning: File $LIBV1 not found.  Exiting ..."
	exit 2
    fi

    LIB2=`check_lib $LIBV2`
    if [ $LIB2 = NULL ] ; then
	echo "Warning: File $LIBV2 not found.  Exiting ..."
	exit 3
    fi

    echo LIB1 \(non symlink\): $LIB1, LIB2 \(non symlink\): $LIB2

    #   Initiate the master report file
    #
    REPORT=`echo $LIB1 | sed 's/\.so.*//'`.report

    echo "" > /tmp/$REPORT
    echo "Report for: $LIB1 vs. $LIB2" >> /tmp/$REPORT
    echo "" >> /tmp/$REPORT

    # XXX debug reporting
    if [ x$DEBUG = xtrue ]; then
	echo REPORT file is $REPORT
	lines=`wc /tmp/$REPORT | nawk '{print $1}'`
	echo "After creating $REPORT, it is $lines lines"
    fi


    #   I. Determine direct (and all) shared object dependencies for 
    #	   the libraries
    #
    so_dependencies $DIR1 $LIB1
    so_dependencies $DIR2 $LIB2

    # XXX debug reporting
    if [ x$DEBUG = xtrue ]; then
	lines=`wc /tmp/$LIB1.report | nawk '{print $1}'`
	echo "After so_deps LIB1 report is $lines lines"
    fi

    #   II. Do Symbols analysis and differences
    #
    #	A family of files is generated:
    #
    # (Lib version 1)
    #	$LIB1.syms	-	Full symbols list
    #	$LIB1.defs	-	Exports (all global syms defined)
    #	$LIB1.func	-	Exports: Functions
    #	$LIB1.data	-	Exports: Global Data
    #	$LIB1.imported	-	Imports
    #
    #	$LIB1.func.list	-	Function Exports - Simple List
    #	$LIB1.data.list	-	Data Exports - Simple List
    #	$LIB1.imported.list -	Imports - Simple List
    #
    # (Lib version 2)
    #	$LIB2.syms	-	Full symbols list
    #	$LIB2.defs	-	Exports (all global syms defined)
    #	$LIB2.func	-	Exports: Functions
    #	$LIB2.data	-	Exports: Global Data
    #	$LIB2.imported	-	Imports
    #
    #	$LIB2.func.list	-	Function Exports - Simple List
    #	$LIB2.data.list	-	Data Exports - Simple List
    #	$LIB2.imported.list -	Imports - Simple List
    #
    # (Differences - note: named using Lib Ver. 1)
    #	$LIB1.jlist	 -	Symbol differences in a join file
    #

if [ x$DO_ANALYZE = xtrue ] ; then
    echo "Performing Symbols Analysis ..."

    # 1. First we want to find all the Function and Global-data
    #      items in the library's symbol table, whether defined or
    #      undefined:

    nm ${DIR1}/$LIB1 | grep "GLOB" > /tmp/${LIB1}.syms
    nm ${DIR2}/$LIB2 | grep "GLOB" > /tmp/${LIB2}.syms


    # 2. Next we separate out the undefined symbols from that
    #       list, since those are the imported (depended-upon)
    #       interfaces obtained from some other library/libraries.

    get_syms $LIB1
    get_syms $LIB2

    # 3. Extract simple symbol lists from each of the above 3 files
    #   
    #       nawk 'BEGIN {FS="|"}; {print $8}' < [in-file] | sort > [out-file]
    #   
    #       e.g.:
    #   	nawk 'BEGIN {FS="|"}; {print $8}' < libpango-1.0.so.0.0.5.func \
    #	   	| sort > libpango-1.0.so.0.0.5.func.list

    get_simple_lists $LIB1
    get_simple_lists $LIB2

    # 4. Determine per-category differences:
    #	diff -c libpango.5.func libpango.399.func > pango.func.diffs
    #
    #    Note: No longer required.  We now simply use join(1) in reporting
    #
    #    get_diffs $LIB1 $LIB2
fi
# (if DO_ANALYZE)


    #  III. Append symbols analysis and differences to report

    echo "Writing per-version reports ..."

    report_symbols $LIB1
    echo "   $LIB1.report"

    # XXX debug reporting
    if [ x$DEBUG = xtrue ]; then
	lines=`wc /tmp/$LIB1.report | nawk '{print $1}'`
	echo "After report_syms($LIB1), its report is $lines lines"
    fi

    report_symbols $LIB2
    echo "   $LIB2.report"

    # XXX debug reporting
    if [ x$DEBUG = xtrue ]; then
	lines=`wc /tmp/$LIB2.report | nawk '{print $1}'`
	echo "After report_syms($LIB2), its report is $lines lines"

    fi

    echo "Reporting Differences ..."

    # Append the symbols differences reports
    #
    report_diffs $LIB1 $LIB2

    # XXX debug reporting
    if [ x$DEBUG = xtrue ]; then
	lines=`wc /tmp/$REPORT | nawk '{print $1}'`
	echo "After report_diffs $REPORT is $lines lines"
    fi

    # Append symbols report from library's first version to the overall report
    #
    cat /tmp/$LIB1.report >> /tmp/$REPORT
    rm /tmp/$LIB1.report

    # XXX debug reporting
    if [ x$DEBUG = xtrue ]; then
	lines=`wc /tmp/$REPORT | nawk '{print $1}'`
	echo "Appended $LIB1.report, $REPORT is now $lines lines"
    fi

    # Append symbols report from library's second version to the overall report
    cat /tmp/$LIB2.report >> /tmp/$REPORT
    rm /tmp/$LIB2.report

    # XXX debug reporting
    if [ x$DEBUG = xtrue ]; then
	lines=`wc /tmp/$REPORT | nawk '{print $1}'`
	echo "Appended $LIB2.report, $REPORT is now $lines lines"
    fi
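
The join(1) technique that report_diffs relies on can be tried in
isolation on two small, already-sorted symbol lists. This is only a
sketch of the same idea and not a replacement for the script: the
function name symbol_changes and the temporary file names are invented
here, and the join field is spelled as -1 2 -2 2 instead of -j 2.

```sh
#!/bin/sh
# Sketch only: show which symbols appear in just one of two sorted lists.
#
#   usage: symbol_changes <old.list> <new.list>
#
# Each output line is "<symbol> 1" (only in the old list, i.e. removed)
# or "<symbol> 2" (only in the new list, i.e. added).
symbol_changes() {
    tmp=${TMPDIR:-/tmp}/symdiff.$$

    # Tag every symbol with the number of the file it came from
    sed 's/^/1 /' "$1" > "$tmp.1"
    sed 's/^/2 /' "$2" > "$tmp.2"

    # Join on field 2 (the symbol name); -v 1 -v 2 keeps only the
    # unpaired lines from either file
    join -v 1 -v 2 -1 2 -2 2 "$tmp.1" "$tmp.2"

    rm -f "$tmp.1" "$tmp.2"
}
```

For an old list containing a_func and b_func and a new list containing
b_func and c_func, this prints "a_func 1" followed by "c_func 2".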



