Re: [xml] iterating through an XML document?



On Thu, Jun 14, 2007 at 10:39:24PM +0200, Torsten Mohr wrote:
Hi,

  In general no. Please do not assume you will be able to get
libxml2 to ignore data. This may work or not, and the DTD is usually not a
guarantee because documents are usually not valid. Instead of trying to build
a dangerous pile of assumptions to avoid processing a few nodes,
please code the full algorithm, and skip those nodes there. You will avoid
wasting a lot of time on design, coding, testing and when your users
actually start to use the code. It's not like testing whether a node is text
and just whitespace is hard, so what?

Thanks for your hints.  OK, you convinced me easily; of course I want to
write proper code without any assumptions that at some point break my code.

Also, as an in-between solution I tried to iterate over the document (already
loaded) and remove those parts that are text nodes that contain only
whitespace.

It seems to me that having a loop over some node->children and removing
some of them in that same loop is somehow not a good idea; at least glibc
aborts my program due to double-freeing memory.  So I had to program it

  I don't see why this should not work

like this:

I really can't debug your code, no time for this, but I can give the following
advice: don't use isspace() for XML code, as it is locale dependent, and if
someone suddenly runs your code with a different locale the behaviour will be
different, and you really don't want that!
Test the character code points instead:
    http://www.w3.org/TR/REC-xml/#NT-S
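
As an illustration of the advice above, here is a minimal sketch of a
locale-independent whitespace test based on the four code points listed in
production S (#x20, #x9, #xD, #xA), together with a removal loop that unlinks
each node before freeing it. The names is_xml_space, is_xml_blank,
drop_blank_text and the struct node type are hypothetical stand-ins invented
for this example; struct node is not libxml2's xmlNode, and this is not the
code from the original post.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* XML whitespace per production S: space, tab, line feed, carriage return.
   Compared by code point, so the result does not depend on the locale. */
static int is_xml_space(unsigned char c)
{
    return c == 0x20 || c == 0x09 || c == 0x0A || c == 0x0D;
}

/* Returns 1 if s is empty or consists only of XML whitespace, else 0. */
static int is_xml_blank(const char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (!is_xml_space((unsigned char)*s))
            return 0;
    }
    return 1;
}

/* Hypothetical stand-in for a node in a sibling list (NOT libxml2's xmlNode). */
struct node {
    int is_text;          /* 1 for text nodes, 0 for anything else */
    char *content;        /* text content owned by the node, may be NULL */
    struct node *next;    /* next sibling, NULL at the end of the list */
};

/* Removes whitespace-only text nodes and returns the new list head.
   Each node is unlinked (its successor stored in *link) before it is
   freed, so the loop never reads freed memory or frees a node twice. */
static struct node *drop_blank_text(struct node *head)
{
    struct node **link = &head;

    while (*link != NULL) {
        struct node *cur = *link;
        int blank = cur->is_text &&
                    (cur->content == NULL || is_xml_blank(cur->content));

        if (blank) {
            *link = cur->next;   /* unlink first */
            free(cur->content);
            free(cur);           /* then free exactly once */
        } else {
            link = &cur->next;   /* keep the node, advance */
        }
    }
    return head;
}
```

With libxml2 itself, the unlink and free steps would presumably map to
xmlUnlinkNode() followed by xmlFreeNode(), with the next sibling saved in a
local variable before the node is freed. The library also provides
xmlIsBlankNode() for the blank-text check.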
   
Daniel

-- 
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard      | virtualization library  http://libvirt.org/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine  http://rpmfind.net/


