Re: [xml] Content normalization
- From: Daniel Veillard <veillard redhat com>
- To: Armin Bauer <armin bauer desscon com>
- Cc: xml gnome org, Eric Haszlakiewicz <erh+libxml nimenees com>
- Subject: Re: [xml] Content normalization
- Date: Thu, 7 Jul 2005 17:49:48 -0400
On Thu, Jul 07, 2005 at 10:57:05PM +0200, Armin Bauer wrote:
> The problem where this appears is in SyncML handling. SyncML is often
> used to send VCards, VEvents etc., which are required to have \r\n as a
> line ending. The XML output is then converted into WBXML and sent to some
> device such as a mobile phone or PDA. The problem is that these devices
> expect the VCards to have \r\n as the line ending, and they neither replace
> the character reference nor normalize \r\n to \n.
Well, then it's not expected to be processed as XML, so why does this
concern libxml2?
> The question is what to do.
No, the question is "are the applications you're using or communicating
with XML applications?" If they neither do normalization nor replace entity
references, then the answer is no. You're not manipulating XML but something
which looks like XML and isn't expected to be processed as such. I'm not fond
of adding yet another set of APIs or switches to cope with yet another pseudo
XML parsing framework. That's why XML made errors fatal and is stringent
about spec violations. Those tools don't let you benefit from
standardization; complain about them, not about libxml2 being compliant.
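To make the two behaviours concrete, here is a minimal sketch using Python's
standard xml.etree.ElementTree (backed by expat) as a stand-in for any
compliant parser. The element name Data and the vCard fragment are
illustrative assumptions, not taken from a real SyncML message.

```python
import xml.etree.ElementTree as ET

# Serialized form with the carriage return escaped as a numeric
# character reference (the form discussed in this thread).
escaped = "<Data>BEGIN:VCARD&#13;\nEND:VCARD&#13;\n</Data>"

# The same payload with literal CR LF line endings in the XML text.
literal = "<Data>BEGIN:VCARD\r\nEND:VCARD\r\n</Data>"

# A compliant parser replaces &#13; with the CR character, so the
# application sees the \r\n line endings that vCard requires.
text_from_escaped = ET.fromstring(escaped).text

# A compliant parser normalizes a literal CR LF to a single LF on input,
# so the CR never reaches the application.
text_from_literal = ET.fromstring(literal).text

print(repr(text_from_escaped))
print(repr(text_from_literal))
```

Only the escaped form survives a round trip through a compliant parser with
its CR intact, which is why the serializer emits the reference.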
> The option of filtering the output again seems awkward to me...
They process XML in broken ways, that's why.
> And if I understand the XML specs correctly, sending a \r\n as _output_
> is considered valid (http://www.w3.org/TR/REC-xml/#NT-S) since they are
> to be removed on input anyway. So it should be possible to choose
> whether to escape \r or not.
No, because *any* compliant XML parser must do the replacement on input
and replace numeric character references by the character with the associated
code point when passing data to the application. They have problems processing
XML input generated by libxml2 only because they are not compliant. They
are not XML-compliant applications, period. Sorry, you will have to cope
with this, as it has no place in an XML toolkit IMHO; or use the HTML
serializer, or serialize the trees yourself, or post-process. The added work
on your side is only due to non-compliance on theirs.
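If post-processing is the route taken, a minimal sketch could look like the
following. It is written in Python for brevity; the helper name unescape_cr
is hypothetical and not part of libxml2, and it assumes the serialized
document is already available as a string. It blindly rewrites every
reference to U+000D, including any that appear literally inside CDATA
sections or comments, so treat it as a starting point only.

```python
import re

# Numeric character references for U+000D (carriage return), in decimal
# or hexadecimal form, with optional leading zeros:
# &#13;  &#013;  &#xD;  &#x0d;
_CR_REF = re.compile(r"&#(?:0*13|x0*[dD]);")


def unescape_cr(serialized: str) -> str:
    """Replace references to U+000D with a literal carriage return.

    The result is meant for consumers that do not resolve character
    references. A compliant XML parser reading it back would normalize
    the resulting CR LF pairs to LF.
    """
    return _CR_REF.sub("\r", serialized)


doc = "<Data>BEGIN:VCARD&#13;\nEND:VCARD&#xD;\n</Data>"
fixed = unescape_cr(doc)
```

The output of this step should be fed only to the non-compliant consumer;
keep the escaped form for anything that parses XML properly.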
Daniel
--
Daniel Veillard | Red Hat Desktop team http://redhat.com/
veillard redhat com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/