Re: [xml] iterating through an XML document?



Hi,

Thanks a lot for the answers I got to my question.  They pretty much
explained the behaviour.  My understanding of XML was different before.

I'd like to use XML as the file format for a program I want to write,
and in that file format I'd like to ignore everything that is not
XML.  Below you describe that by using a DTD I can skip the unwanted
whitespace when reading the XML (if I understand correctly).

I just googled for some DTD descriptions, and the ones I found
explained a lot.  But I still have no clue how to ignore whitespace
that comes from e.g. xmllint --format in.xml > out.xml.

Could anybody please give me a hint on which options to set or which DTD
to use, so that a formatted XML file and one with all tags on a single
line without whitespace give the same result?  Is it possible to keep
such a DTD as a string in memory?


Best regards,
Torsten.


But it seems that too many text nodes are output: even for nodes that
do not have any content, there is a text node with some whitespace
characters in it.

Do you know why this could happen?  How can I skip them?

Consider
<p><em>It's all</em> <b>exciting!</b></p>
and you'll see that the space between </em> and <b> is important.
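
To make that concrete, here is a minimal sketch that parses the example
and lists the children of <p>.  It uses Python's standard xml.dom.minidom
rather than libxml (an assumption made only to keep the example
self-contained); any conforming parser should report the same three
children, with the single space as a text node of its own.

```python
from xml.dom import minidom, Node

doc = minidom.parseString("<p><em>It's all</em> <b>exciting!</b></p>")
p = doc.documentElement

# Walk the direct children of <p>: elements are shown by tag name,
# text nodes by their exact content (repr makes the space visible).
for child in p.childNodes:
    if child.nodeType == Node.TEXT_NODE:
        print("text   ", repr(child.data))
    else:
        print("element", child.tagName)
```

Dropping the middle node would glue the two words together when the
paragraph is rendered.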

If you write a DTD, you can have libxml discard space in "element
only context", i.e. where no text is allowed other than spaces.
But otherwise you'll get all the spaces.

This is a consequence of how XML works, and is not limited to
libxml.
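
For a pure data format with no mixed content, one way to get the same
tree from formatted and unformatted input is to drop whitespace-only
text nodes after parsing.  The sketch below does this with Python's
standard xml.dom.minidom, not with libxml, and the helper names
strip_blank_text and normalized are made up for illustration.  Without
a DTD it has to guess: it treats an element as "element only" when it
has child elements and nothing but blank text between them.  That guess
is wrong for the <p> example above (it would delete the significant
space), which is exactly the information a DTD supplies.  In libxml2
itself, the XML_PARSE_NOBLANKS parser option is probably the closer
answer to "what options to set".

```python
from xml.dom import minidom, Node


def strip_blank_text(node):
    """Remove whitespace-only text children from elements that look
    element-only (a heuristic, not DTD-driven)."""
    children = list(node.childNodes)
    has_element = any(c.nodeType == Node.ELEMENT_NODE for c in children)
    # CDATA sections always count as real content here.
    has_cdata = any(c.nodeType == Node.CDATA_SECTION_NODE for c in children)
    text_nodes = [c for c in children if c.nodeType == Node.TEXT_NODE]
    all_blank = all(t.data.strip() == "" for t in text_nodes)

    if has_element and not has_cdata and all_blank:
        for t in text_nodes:
            node.removeChild(t)

    # Recurse into the remaining child elements.
    for c in list(node.childNodes):
        if c.nodeType == Node.ELEMENT_NODE:
            strip_blank_text(c)


def normalized(xml_text):
    """Parse, strip ignorable whitespace, and serialize the root element."""
    doc = minidom.parseString(xml_text)
    strip_blank_text(doc.documentElement)
    return doc.documentElement.toxml()


formatted = "<config>\n  <item>a</item>\n  <item>b</item>\n</config>"
compact = "<config><item>a</item><item>b</item></config>"
print(normalized(formatted) == normalized(compact))
```

Text inside a leaf element such as <item> is left untouched, so leading
or trailing spaces in real values survive.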

Liam


