Re: memory allocations / libxml



On Mon, Mar 04, 2002 at 09:42:07AM +0000, Michael Meeks wrote:
> 
> On Fri, 2002-03-01 at 21:37, Havoc Pennington wrote:
> > > 	xmlStrndup
> > > 	xmlParseAttValue
> > 
> > UIHandler, no surprise there either.
> 
> 	I don't understand what that's meant to mean. There are various fixes
> possible here. Clearly, writing a custom on-stack XML parser for simple
> fragments would be one (not so trivial) possibility, but the number of
> discrete XML fragment parses should be extremely small; mostly people
> should be using the 'set_prop' API, which does no parsing in the common
> case.
> 
> 	Of course, it seems possible (to me), but is perhaps scads of coding,
> to dup the buffer being parsed, and then scribble '\0's on it when we
> hit significant lexical tokens [if we're doing a SAX parse], to avoid
> doing:
> 
> 700x
> 	tmp = xmlStrndup (part_of_buffer, 7);
> 	ctxt->uiCharacters (ctxt, tmp);
> 	xmlFree (tmp);
> 
> With 1 big dup instead [ or a chunked dup / but then the chunk reading
> becomes perhaps far more painful, I haven't looked at the code ].
> 
> 	Is that an easy thing to do, Daniel? Clearly I'd rather use libxml than
> write my own XML parser :-) We already use the fast SAX interface.
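For illustration, the "1 big dup" idea quoted above could be sketched roughly as follows. This is not libxml code: split_in_place, word_cb and sum_lengths are hypothetical names, and a single delimiter character stands in for the "significant lexical tokens" a real parser would look for. The buffer is copied once, '\0' is written over each delimiter, and the callback receives pointers into the copy, so there is one malloc/free pair instead of one per token.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback type, standing in for a SAX characters handler. */
typedef void (*word_cb)(void *user, const char *text);

/* Example callback: adds the length of each token to a size_t counter. */
static void sum_lengths(void *user, const char *text)
{
    *(size_t *)user += strlen(text);
}

/*
 * Copy buf once, overwrite every delim with '\0', and hand each non-empty
 * token to cb as a pointer into the copy. Pointers passed to cb are only
 * valid during the call. Returns the token count, or -1 if malloc fails.
 */
static int split_in_place(const char *buf, size_t len, char delim,
                          word_cb cb, void *user)
{
    assert(buf != NULL && cb != NULL);

    char *copy = malloc(len + 1);
    if (copy == NULL)
        return -1;
    memcpy(copy, buf, len);
    copy[len] = '\0';

    int count = 0;
    char *start = copy;
    for (size_t i = 0; i <= len; i++) {
        if (i == len || copy[i] == delim) {
            copy[i] = '\0';            /* terminate the token in place */
            if (copy + i > start) {    /* skip empty tokens */
                cb(user, start);
                count++;
            }
            start = copy + i + 1;
        }
    }

    free(copy);                        /* one free for the whole buffer */
    return count;
}
```

The trade-off is that the callback must not keep the pointers it is given, since they die with the copy.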

  No idea what you're talking about. Write your own parser, and maintain it!

> 	Also, I imagine that libxml could (probably) use alloca for short
> strings in this sort of copy/call/free sequence. Is that a feasible
> suggestion? That'd kill locking and malloc overhead and speed up the
> parser nicely.

  Again, no idea what you're talking about. alloca is not portable;
I won't use it.
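  For what it's worth, the copy/call/free sequence can avoid malloc for short strings without alloca, using a fixed-size stack buffer with a heap fallback. The following is only a sketch under assumptions, not something libxml does: call_with_copy, chars_cb, record_length and the 64-byte SHORT_STR_MAX threshold are all made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SHORT_STR_MAX 64   /* arbitrary threshold, an assumption */

/* Hypothetical callback type, standing in for a SAX characters handler. */
typedef void (*chars_cb)(void *user, const char *text);

/* Example callback: stores the length of the text it was given. */
static void record_length(void *user, const char *text)
{
    *(size_t *)user = strlen(text);
}

/*
 * Pass a NUL-terminated copy of the first len bytes of src to cb.
 * Short strings are copied into a stack buffer; longer ones fall back
 * to malloc. Returns 0 on success, -1 if the fallback allocation fails.
 */
static int call_with_copy(const char *src, size_t len,
                          chars_cb cb, void *user)
{
    assert(src != NULL && cb != NULL);

    char stack_buf[SHORT_STR_MAX];
    char *tmp = stack_buf;

    if (len >= sizeof stack_buf) {     /* no room for the bytes plus '\0' */
        tmp = malloc(len + 1);
        if (tmp == NULL)
            return -1;
    }

    memcpy(tmp, src, len);
    tmp[len] = '\0';
    cb(user, tmp);

    if (tmp != stack_buf)
        free(tmp);
    return 0;
}
```

Unlike alloca, this is plain standard C, and the stack usage is bounded by the constant rather than by the input.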

> 	The thing that worries me more about libxml, looking at an strace of
> its operation in gconf, bonobo-activation, nautilus etc., is this:
> 
> [snip]
>        I'm hoping this one is easy to fix; here is an strace -ttt trace
> of me calling xmlParseFile on a really quite small file :-) As you would
> expect, I don't imagine that we need the umpteen redundant read
> syscalls all returning 0 :-)
> 
>         Any chance of a fix? It gets worse with the bigger files I have:
> 
> [pid  4424] 1014818823.507812 read(10, "", 4096) = 0
> ...
> [pid  4424] 1014818823.524004 read(10, "", 4096) = 0
> [pid  4424] 1014818823.524106 close(10) = 0
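For reference, the behaviour being asked for in the quoted trace is a read loop that treats the first read() returning 0 as end of file and stops. A minimal POSIX sketch, which says nothing about how libxml's input layer is actually structured (drain_fd and its calls counter are hypothetical):

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * Read fd until end of file, stopping at the first read() that returns 0.
 * *calls receives the number of read() calls issued, so redundant reads
 * would show up in the count. Returns the total bytes read, or -1 on error.
 */
static long drain_fd(int fd, int *calls)
{
    char buf[4096];
    long total = 0;

    assert(calls != NULL);
    *calls = 0;

    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        (*calls)++;
        if (n > 0) {
            total += n;        /* a real parser would consume buf here */
            continue;
        }
        if (n == 0)
            break;             /* EOF: do not call read() again */
        if (errno == EINTR)
            continue;          /* interrupted by a signal, retry */
        return -1;
    }
    return total;
}
```

For input that arrives in a single read, this issues exactly two read() calls: one returning the data and one returning 0.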

   I will fix it the day you report it normally, i.e. either by
posting the problem on the mailing list or through bugzilla. Mails
ending up in my inbox land there with 500 others and may be deleted,
discarded or forgotten. Time you learn about process ...

Daniel

-- 
Daniel Veillard      | Red Hat Network https://rhn.redhat.com/
veillard redhat com  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/
