Re: easy login time win



On Friday at 15:39, Michael Meeks wrote:

> On Mon, 2004-11-22 at 22:38 -0500, Havoc Pennington wrote:
>> The basic situation here is that optimizing the parser can't possibly be
>> a big win; we need to avoid loading the zillion translations.
>> Avoiding the parsing entirely is the way to save a really noticeable
>> amount of time.
>
> 	Out of interest; are we storing the zillion translations in memory ? we
> saved a nice chunk of space with b-a-s when we incrementally added
> translations as necessary on-demand by re-parsing [ a very rare
> use-case ].

Another not-so-simple-in-practice proposal, but I still think it's
worth it.

Since translations are always present in MO files as well, perhaps the
solution is to move translations out of generated .schemas files for
default installations? 

Software that wants to use Schemas descriptions would then have to do
something like dgettext(GETTEXT_PACKAGE, "Original Schema string").
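
To make that concrete, here is a minimal sketch of the lookup, written in Python rather than C just for brevity. The helper name and the "my-app" domain are made up for illustration. gettext.dgettext() returns the original string unchanged when no catalog is found for the domain, so untranslated schemas would still show their C description.

```python
import gettext

def schema_description(domain, c_description):
    """Return the translation of a schema's C-locale description.

    `domain` stands in for the application's GETTEXT_PACKAGE.  If no MO
    catalog is installed for it, dgettext() returns the string unchanged.
    """
    return gettext.dgettext(domain, c_description)

# "my-app" is a hypothetical domain used only for this example.
print(schema_description("my-app", "Original Schema string"))
```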

The problem here is establishing a GConf key/schema <-> GETTEXT_PACKAGE
mapping (it will also probably slow down software that uses the
descriptions, but that's a non-issue since we're talking about
programs which interact with the user, such as gconf-editor).

This would not be hard to implement on the GConf side, but it is not
so trivial at the point where descriptions are actually asked for.
Getting the above mapping is the hardest part: gconf_engine_get_schema()
can easily support that, but we need all applications to "register"
their schemas with the required translation domain.

Quite a twist in the "How to install GConf schemas" section, but I
think it's worth deviating somewhat here for decent speed gains.  This
can be easily profiled: just strip all <locale> tags whose name is not
"C", for instance with the attached Python script.  My
$sysconfdir/gconf/schemas/ has gone from 14MB to around 1.6MB; I didn't
try restarting Gnome with these changes to see how much improvement we
could gain.  Does anyone have some time to try this?  Be sure to back
up your .schemas files first, and then run it as
"./strip-translation.py *.schemas".
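
For anyone without the libxml2 Python bindings, the same stripping can be sketched with the standard library only. This is an illustrative variant, not the attached script: it works on a string instead of rewriting files in place, and the sample schema below (key name and translations) is made up.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the shape of a .schemas file.
SAMPLE = """<gconfschemafile><schemalist><schema>
  <key>/schemas/apps/example/some_key</key>
  <locale name="C"><short>Example</short></locale>
  <locale name="sr"><short>Primer</short></locale>
  <locale name="de"><short>Beispiel</short></locale>
</schema></schemalist></gconfschemafile>"""

def strip_translations(xml_text):
    """Return xml_text with every non-"C" <locale> element removed."""
    root = ET.fromstring(xml_text)
    # Snapshot the element list first so removals don't disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            # A <locale> without a name attribute is kept, as in the script.
            if child.tag == "locale" and child.get("name", "C") != "C":
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")

stripped = strip_translations(SAMPLE)
print(len(SAMPLE.encode("utf-8")), "->", len(stripped.encode("utf-8")), "bytes")
```

Comparing the two byte counts over a real schemas directory gives the same kind of size figure as above, without touching the installed files.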

Note that we can still maintain backwards compatibility here, but
we'd probably need to introduce a separate data store (i.e. a file) to
hold the schema files -> GETTEXT_PACKAGE mapping (we cannot put it in
.schemas itself unless older GConf versions have no problems with a
new tag).
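
As a sketch of what that separate store might look like, assume a plain text file with one "<schemas file> <gettext domain>" pair per line. The format, the file contents and the function name are all invented here for illustration; nothing like this exists in GConf today.

```python
def parse_domain_map(text):
    """Parse lines of the form '<schemas file> <gettext domain>'."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        schemas_file, domain = line.split(None, 1)
        mapping[schemas_file] = domain
    return mapping

# Hypothetical contents of the side file.
EXAMPLE = """\
# schemas file            gettext domain
my-app.schemas            my-app
other-app.schemas         other-app
"""
domains = parse_domain_map(EXAMPLE)
```

Older GConf versions would never look at this file, so existing .schemas files stay valid for them.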

> 	I guess, the most obvious thing to do is to a) glup lots of small files
> together, and then b) split them up into lots of small files again on
> translation lines :-)

I think the problem here is maintaining backwards compatibility.  Or
am I missing something?

> 	Either way; we see the same parsing performance problems with the
> gnome-mime-data (only ~300ms on start though), and I guess
> ultimately .desktop files too.

I doubt .desktop files have this same problem: they're commonly very
slow, and I believe most of the time gets spent on disk seeks, not on
parsing translations.

Unless you were actually talking about the "many small files" problem
in the first place :)

Cheers,
Danilo

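#!/usr/bin/env python
# Strip every non-"C" <locale> element from the given GConf .schemas
# files, rewriting each file in place.  Back up the files first.
import sys, libxml2

def remove_translations(node):
    # Remember the next sibling before a node is possibly unlinked.
    child = node.children
    while child:
        nextnode = child.next
        if child.name == 'locale' and child.hasProp('name') and child.prop('name') != 'C':
            # A translated <locale>: drop it from the tree.
            child.unlinkNode()
            child.freeNode()
        elif child.name != 'locale':
            # Descend into everything else; <locale name="C"> is kept as-is.
            remove_translations(child)
        child = nextnode

filenames = sys.argv[1:]

for filename in filenames:
    try:
        ctxt = libxml2.createFileParserCtxt(filename)
        ctxt.parseDocument()
        doc = ctxt.doc()
        if doc.name == filename:
            remove_translations(doc)

            # Overwrite the original file and make sure it gets closed.
            out = open(filename, 'w')
            try:
                out.write(doc.serialize('utf-8', 1))
            finally:
                out.close()
        doc.freeDoc()
    except Exception:
        sys.stderr.write("Some error with '%s'.\n" % (filename))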


