[xml] Multiple CDATA blocks




Hi there,

We are mixing libxml-2 and the older gnome-xml. On one side we use
libxml-2 to write XML, and on that same side parsing the data back into
our original structure works perfectly.

On the other side we are (still) working with rather old technology
like ORBit-1.x, GLib-1.x and some software and libraries that depend on
this older technology, which our vendor is not yet planning to port.

So basically, yeah, I know that you people are going to slaughter me with
an axe after tearing off my fingernails one by one and, worst of all,
really really going to hate me for using these older libraries.

And.. that's okay .. for me (so I can live with that fact)

The thing is, however, that we have a problem. After we write a rather
large block of text into a CDATA-block (up to 10 kilobytes), the XML-file
contains two CDATA-blocks. The second one appended to the first one is
indeed the original text. Libxml-2, which wrote it that way, is fine with
that. They say that a CDATA-block is really nothing more than a signal
to the XML-parser not to interpret characters like <, > and &.
So I can understand that splitting the data across two such CDATA-blocks
gives the exact same result.
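
To illustrate that claim, here is a minimal sketch. It assumes Python's
standard xml.etree.ElementTree as a stand-in for a conforming parser; it
is neither libxml-2 nor gnome-xml, and the sample strings are made up.

```python
import xml.etree.ElementTree as ET

# The same text, once in a single CDATA section and once split over
# two adjacent CDATA sections (hypothetical sample data).
single = "<mytag><![CDATA[A lot of &<> text, more &<> text]]></mytag>"
split = "<mytag><![CDATA[A lot of &<> text,]]><![CDATA[ more &<> text]]></mytag>"

single_text = ET.fromstring(single).text
split_text = ET.fromstring(split).text

# A conforming parser hands back identical character data for both.
print(single_text == split_text)
```

Both documents should yield the same text, with &, < and > left
untouched, which is why the writer is free to split the block.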

However

When we parse this using the older gnome-xml library, the second
CDATA-block fails to parse. It's as if there were no CDATA-block around
this second piece of data at all. So it starts throwing parser errors at
characters like &, < and >.

This is what libxml-2 wrote:

<mytag><![CDATA[
A lot of text 
A lot of &<> text
A lot of text
A lot of text]]><![CDATA[
A lot of text
A lot of text
A lot of text
A lot of &<> text
A lot of &<> text]]>
</mytag>


It looks like gnome-xml interpreted it like this:


<mytag><![CDATA[
A lot of text 
A lot of &<> text
A lot of text
A lot of text]]>
A lot of text
A lot of text
A lot of text
A lot of &<> text
A lot of &<> text
</mytag>
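
A quick way to see why that reading must fail: without the second
CDATA wrapper the raw & and < are no longer well-formed XML. The sketch
below assumes Python's standard xml.etree.ElementTree as the parser
(not gnome-xml itself) and a shortened, made-up version of the data.

```python
import xml.etree.ElementTree as ET

# The document as it would look if the second CDATA wrapper were dropped.
misread = "<mytag><![CDATA[A lot of &<> text]]>\nA lot of &<> text\n</mytag>"

try:
    ET.fromstring(misread)
    parsed_ok = True
except ET.ParseError as err:
    # The bare "&<>" outside any CDATA section is rejected by the parser.
    parsed_ok = False
    print("parse error:", err)
```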


I have little control over what gets inserted into that CDATA-field.
It doesn't always happen either.

Could it be that libxml-2 behaves this way when the string "]]>"
occurs in my data?
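
For reference, the usual way a writer copes with "]]>" inside CDATA
content is to end the section in the middle of that sequence and open a
new one. The sketch below shows that general technique in Python; the
helper name to_cdata and the sample data are hypothetical, and I have not
confirmed that this is what libxml-2 does in my case.

```python
import xml.etree.ElementTree as ET

def to_cdata(text):
    """Wrap text in CDATA, splitting wherever the text contains "]]>"."""
    # "]]>" becomes "]]", end of section, start of a new section, then ">".
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"

# Hypothetical payload that happens to contain the terminator sequence.
data = "if (a[b[0]]> 1 && c < d) ..."
doc = "<mytag>" + to_cdata(data) + "</mytag>"

print(doc)
# A conforming parser reassembles the original text from both sections.
print(ET.fromstring(doc).text == data)
```

If the writer works like this, the number of CDATA-blocks in the output
depends on the data, which would fit with it not always happening.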




-- 
Philip Van Hoof, Software Developer @ Cronos
home: me at freax dot org
gnome: pvanhoof at gnome dot org
work: Philip dot VanHoof at Cronos dot Be
http://www.freax.be, http://www.freax.eu.org



