
Re: [xml] Freeing large documents extremely slow



On Thu, Feb 14, 2008 at 10:35:07PM -0500, Edward Z. Yang wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Over at phpdoc (the PHP Documentation Project), we regularly deal with
> very large XML files that occupy approximately 400-500 MB in the RAM.
> Lower-end, low RAM systems (the two sample ones we've seen were
> single-core, with 256-512 MB physical memory, so they have to use the
> swap partition.)

  If you are hitting swap, all bets are off w.r.t. attempting to
guess whether it's normal behaviour or not. If you can't reproduce the
delay on a machine which does not swap, then that means it's a
matter of memory allocation handling, and basically an OS/libc
level question, not something libxml2 related.
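
  One rough way to check the swap theory with libxml2 out of the
picture is to time the release of many small heap blocks and count
major page faults (faults that needed disk I/O) while doing it. The
sketch below is only an illustration, not libxml2 code: struct node,
build_chain() and timed_free() are hypothetical names, and the chain
of small blocks is an assumed stand-in for the nodes of a parsed
document. It uses getrusage(), so it assumes a POSIX system.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <sys/resource.h>

/* Hypothetical stand-in for a parsed document: a chain of small heap
 * blocks, loosely resembling the many small nodes of an XML tree. */
struct node {
    struct node *next;
    char payload[56];
};

struct free_stats {
    double seconds;     /* wall-clock time spent in the free loop */
    long major_faults;  /* page faults that required disk I/O */
    size_t freed;       /* number of nodes released */
};

/* Allocate up to `count` nodes; stops early if malloc() fails. */
struct node *build_chain(size_t count)
{
    struct node *head = NULL;
    for (size_t i = 0; i < count; i++) {
        struct node *n = malloc(sizeof *n);
        if (n == NULL)
            break;
        n->payload[0] = (char)i;  /* write to the block so its page is mapped */
        n->next = head;
        head = n;
    }
    return head;
}

/* Free the whole chain, recording elapsed time and major faults. */
struct free_stats timed_free(struct node *head)
{
    struct free_stats st = {0.0, 0, 0};
    struct rusage before, after;
    struct timespec t0, t1;

    getrusage(RUSAGE_SELF, &before);
    timespec_get(&t0, TIME_UTC);  /* C11 wall clock, not monotonic */

    while (head != NULL) {
        struct node *next = head->next;  /* reading the link touches the page */
        free(head);
        head = next;
        st.freed++;
    }

    timespec_get(&t1, TIME_UTC);
    getrusage(RUSAGE_SELF, &after);

    st.seconds = (double)(t1.tv_sec - t0.tv_sec)
               + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    st.major_faults = after.ru_majflt - before.ru_majflt;
    return st;
}
```

  Call timed_free(build_chain(n)) with n large enough to exceed
physical RAM and print the three fields. If major_faults climbs and
the time explodes only on the low-memory machine, the slowness is
coming from paging, since every block has to be brought back from
swap just to be released.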

> What appears to happen is that the allocation of memory goes reasonably
> quickly, but then when the PHP libxml extension attempts to deallocate
> the memory with xmlFreeDoc(), the system hangs, and we need to send a
> SIGINT to terminate the process.
> 
> Any ideas why this may be happening? Also, would this be a bug in the
> PHP XML bindings, or libxml itself?

  The only thing I can think of is that the PHP bindings have a bug
leading them to xmlFree() a wrong address; if you have memory
debugging activated, libxml2 will then try to dump the current
memory list to .memdump. That's the only case I know of where freeing
a document ends up taking noticeable time, sometimes looking hung.
  I strongly suggest attaching gdb to the php process and debugging
what's going on; that is the proper way to investigate. I can't do that
for you, and I can't debug it unless you can reproduce the problem with
xmllint.
  sorry,

Daniel

-- 
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard      | virtualization library  http://libvirt.org/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine  http://rpmfind.net/

