Re: [xml] iterating through an XML document?

On Thu, Jun 14, 2007 at 09:20:30PM +0200, Torsten Mohr wrote:

Thanks a lot for the answers I got to my question.  They pretty much
explained the behaviour.  My understanding of XML was different before.

I'd like to use XML as a file format for a program I want to write,
and in that file format I'd like to ignore everything that is not
XML.  Below you describe that by using a DTD I can skip the unwanted
whitespace when reading the XML (if I understand correctly).

I just googled for some DTD descriptions, and the ones I found
explained a lot.  But I still have no clue how to ignore whitespace
that comes from e.g. xmllint --format in.xml > out.xml.

Could anybody please give me a hint on what options to set or what DTD
to use so that a formatted document and one with all tags on a single
line without whitespace give the same result?  Is it possible to place
such a DTD as a string into memory?

  In general, no. Please do not assume you will be able to get libxml2
to ignore data. This may or may not work, and the DTD is usually not a
guarantee because documents are usually not valid. Instead of building a
dangerous pile of assumptions to avoid processing a few nodes, please code
the full algorithm and skip those nodes there. You will avoid wasting a lot
of time on design, coding, and testing, and again when your users actually
start to use the code. Testing whether a node is text made up only of white
space is not hard, so why avoid it?
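
As a rough illustration of "code the full algorithm, and skip those nodes
there", here is a minimal sketch. It uses Python's standard xml.dom.minidom
rather than libxml2 itself, and the names is_blank_text, significant_children
and shape are hypothetical helpers made up for this example. In libxml2's C
API, xmlIsBlankNode() should perform the equivalent per-node check.

```python
from xml.dom import minidom, Node


def is_blank_text(node):
    """True if node is a text node containing only whitespace."""
    return node.nodeType == Node.TEXT_NODE and node.data.strip() == ""


def significant_children(node):
    """Element and text children of node, with whitespace-only text skipped."""
    return [
        child
        for child in node.childNodes
        if child.nodeType in (Node.ELEMENT_NODE, Node.TEXT_NODE)
        and not is_blank_text(child)
    ]


def shape(node):
    """Nested (tag, children) tuples for elements, plain strings for text."""
    if node.nodeType == Node.TEXT_NODE:
        return node.data
    return (node.tagName, [shape(child) for child in significant_children(node)])


COMPACT = "<config><item>a</item><item>b</item></config>"
FORMATTED = """<config>
  <item>a</item>
  <item>b</item>
</config>"""

if __name__ == "__main__":
    compact_root = minidom.parseString(COMPACT).documentElement
    formatted_root = minidom.parseString(FORMATTED).documentElement
    # Both trees reduce to the same structure once blank text is skipped.
    print(shape(compact_root) == shape(formatted_root))
```

The parser still delivers every node; the decision to skip indentation is
made in the application code, so it does not depend on a DTD or on the
document being valid.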


Red Hat Virtualization group
Daniel Veillard      | virtualization library
veillard redhat com  | libxml GNOME XML XSLT toolkit | Rpmfind RPM search engine
