Re: [xml] Content normalization



Hi,

I really don't like the idea of having to parse the output I generate
once again...

The problem where this appears is in SyncML handling. SyncML is often
used to send vCards, vEvents etc., which are required to have \r\n as the
line ending. The XML output is then converted into WBXML and sent to some
device like a mobile phone or PDA. The problem is that these devices
expect the vCards to have \r\n as the line ending: they don't replace
the character reference, and they also don't normalize \r\n to \n.

The question is what to do.
The option of filtering the output again seems awkward to me...
And if I understand the XML spec correctly, sending a \r\n as _output_
is considered valid (http://www.w3.org/TR/REC-xml/#NT-S) since they are
to be removed on input anyway. So it should be possible to choose
whether to escape \r or not.

Eric Haszlakiewicz wrote:
On Fri, Jul 01, 2005 at 11:12:48AM -0400, Daniel Veillard wrote:

On Fri, Jul 01, 2005 at 05:07:41PM +0200, Armin Bauer wrote:

but what i would like to get is the output with the "&#13;" replaced by
0x0D.
so is it possible to disable this normalization of text nodes?

 No, this is a requirement to be conformant to the XML spec, not negotiable,
and any XML parser will do it anyway.
    http://www.w3.org/TR/REC-xml/#sec-line-ends
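
To make the line-end rule concrete, here is a minimal sketch. It assumes Python's standard xml.etree.ElementTree rather than libxml2, and the vCard-like payload is made up for illustration. It shows that a parser turns a literal \r\n in the input into \n, while a CR written as a character reference comes through as a real 0x0D.

```python
import xml.etree.ElementTree as ET


def parsed_text(document: bytes) -> str:
    """Return the text content of the root element after parsing."""
    return ET.fromstring(document).text


# Literal CR LF in the input: line-end normalization applies on parsing.
literal = parsed_text(b"<Data>BEGIN:VCARD\r\nEND:VCARD</Data>")

# CR written as a character reference: not touched by line-end normalization.
escaped = parsed_text(b"<Data>BEGIN:VCARD&#13;\nEND:VCARD</Data>")

print(repr(literal))
print(repr(escaped))
```

This is why a serializer that wants the CR to survive a round trip through a conforming parser has to write it as a character reference.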


        While the XML spec is nice to follow, telling him that doesn't help him
solve his problem.

        Armin, there is a function that you can use to get the data contents
of that element in a way that expands any character references like &#xD;.
In perl XML::DOM land it is called expandEntityRefs, but I don't know what
it is when working directly with libxml2.  You can use that if you need just
that one element.
        Translating &#xD; within an entire xml doc is something you'd probably
have to do yourself.  Writing a function to search through the output
for that shouldn't be too hard, or pipe the output through sed (if it's
convenient to do so).
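
A minimal sketch of such a search-and-replace function, in Python for brevity. The name unescape_cr is hypothetical, and it assumes the serialized document is available as a byte string and that the CR reference never appears as literal text inside a CDATA section or comment. It covers both the decimal and the hexadecimal spelling of the reference.

```python
import re

# Numeric character reference to U+000D: decimal (&#13;) or hex (&#xD;),
# with optional leading zeros. XML only allows a lowercase "x" here.
_CR_REF = re.compile(rb"&#(?:0*13|x0*[dD]);")


def unescape_cr(serialized: bytes) -> bytes:
    """Replace every CR character reference with a raw 0x0D byte."""
    return _CR_REF.sub(b"\r", serialized)


if __name__ == "__main__":
    doc = b"<Data>BEGIN:VCARD&#13;\nEND:VCARD&#13;\n</Data>"
    print(repr(unescape_cr(doc)))
```

Note that the result is meant for a consumer that does not normalize line ends; a regular XML parser reading it back would turn the raw \r\n into \n again.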

eric
