[xml] "atoms" for name/attribute strings



Forgive me for throwing out an idea that might have large consequences for libxml2 (whose distilled wisdom I greatly respect from my brief acquaintance so far), but it seems to me that it might greatly speed up tree processing after parsing.

If the parser "atomized" all name (and namespace) and attribute (xmlChar *) strings (like Lisp atoms or Python interned strings: each distinct value hashed to a single unique copy), then a client that had atomized the name/attribute strings it finds "interesting" could do all its input tree processing by comparing string pointers, not string values.

This could be a parser configuration setting (like entity expansion) that would only affect clients that requested it, and wouldn't necessarily affect a whole lot of the code. (In fact, it might be completely localizable in the form of an allocation function called for such names.)
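To make the idea concrete, here is a rough sketch of what such an atomizing allocation function might look like. None of this is existing libxml2 code: AtomTable, atom_table_new, atom_intern, atom_table_free, and the fixed bucket count are all names and choices I made up for illustration, and I use plain char where libxml2 would use xmlChar (an unsigned char).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define ATOM_BUCKETS 256            /* hypothetical fixed size; no resizing */

typedef struct Atom {
    struct Atom *next;              /* next entry in the same bucket */
    char *str;                      /* the single canonical copy */
} Atom;

typedef struct AtomTable {
    Atom *buckets[ATOM_BUCKETS];
} AtomTable;

/* Simple multiplicative string hash (djb2 style). */
static unsigned long atom_hash(const char *s)
{
    unsigned long h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Create an empty table; returns NULL if allocation fails. */
AtomTable *atom_table_new(void)
{
    return calloc(1, sizeof(AtomTable));
}

/* Return the canonical copy of s, adding it to the table on first sight.
 * Equal strings always yield the same pointer. Returns NULL on
 * allocation failure. */
const char *atom_intern(AtomTable *t, const char *s)
{
    assert(t != NULL && s != NULL);

    Atom **slot = &t->buckets[atom_hash(s) % ATOM_BUCKETS];
    for (Atom *a = *slot; a != NULL; a = a->next)
        if (strcmp(a->str, s) == 0)
            return a->str;          /* already atomized */

    Atom *a = malloc(sizeof(*a));
    if (a == NULL)
        return NULL;
    size_t n = strlen(s) + 1;
    a->str = malloc(n);
    if (a->str == NULL) {
        free(a);
        return NULL;
    }
    memcpy(a->str, s, n);
    a->next = *slot;                /* push onto the bucket chain */
    *slot = a;
    return a->str;
}

/* Free the table and every string it owns. */
void atom_table_free(AtomTable *t)
{
    if (t == NULL)
        return;
    for (size_t i = 0; i < ATOM_BUCKETS; i++) {
        Atom *a = t->buckets[i];
        while (a != NULL) {
            Atom *next = a->next;
            free(a->str);
            free(a);
            a = next;
        }
    }
    free(t);
}
```

Under these assumptions, the parser would call atom_intern for each element and attribute name as it builds the tree, and a client would call it once up front for each name it cares about (say, `const char *para = atom_intern(t, "para");`). Matching a node then becomes `if (node_name == para)` instead of a strcmp. The strings have to live as long as the table does, so the table would presumably be tied to the document or parser context.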

Is this a good idea, looked at from any of you old grizzled veterans' points of view?

Am I volunteering to do the work? Maybe... ;-)

Cheers!
--Chris Ryland / Em Software, Inc. / www.emsoftware.com



