Re: [xml] Is ignoring namespace and DTDs an xmllint or libxml2 problem?



On Sat, Jan 07, 2012 at 06:54:22PM +0100, Michael Ludwig wrote:
Thomas Gagne wrote on 06.01.2012 at 11:47 (-0500):
I'm unclear how I can reformat the file

   tgagne ubuntu:~/tmp$ cat a.xml
   <ns:a>
   </ns:a>

without getting the errors

   tgagne ubuntu:~/tmp$ xmllint --format --recover a.xml
   a.xml:1: namespace error : Namespace prefix ns on a is not defined
   <ns:a>
         ^

Is it a problem with xmllint or libxml2?

These days most would say it's a problem with your XML, which is not
namespace-valid.

  Yes, but it should *just* raise the namespace error, which it does; it
should not cause information loss.

Then, my copy of xmllint (Windows, 20707) reformats the doc alright, it
just also emits the warning you're seeing and, significantly, drops the
part of the name before the colon.

You'd need a parser configuration that has namespaces switched off. That
should be available as "xmllint --sax1" (the old SAX 1 didn't know about
namespaces), but even then xmllint emits the warning and drops the part
before the colon, which I think is a bug.
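To illustrate what "namespaces switched off" means for the name, here is a minimal sketch using Python's standard-library xml.sax (expat-based), not libxml2 or xmllint. It only shows the general idea: a parser that does no namespace processing reports the qualified name exactly as written. The names NameCollector and element_names are made up for this example.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class NameCollector(ContentHandler):
    """Hypothetical handler that records element names as the parser reports them."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        # Called only when namespace processing is off; name is the raw qualified name.
        self.names.append(name)


def element_names(xml_text):
    """Parse xml_text with namespace processing disabled and return the element names."""
    parser = xml.sax.make_parser()
    # Explicitly switch namespace processing off (assumption: the default
    # expat-based reader from the standard library is in use).
    parser.setFeature(feature_namespaces, False)
    handler = NameCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_text.encode("utf-8")))
    return handler.names


if __name__ == "__main__":
    # The undeclared prefix is kept as part of the name, nothing is dropped.
    print(element_names("<ns:a>\n</ns:a>"))
```

With namespace processing off, the undeclared prefix is simply part of the element name, so no information is lost. This says nothing about how xmllint --sax1 behaves; it is only a picture of the configuration being asked for.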

  The SAX1 parser is namespace-aware; it is just that the callback API isn't.

I hate the idea of information loss in the context of XML processing,
and since support for XML Namespaces is still an optional feature,
I fixed the SAX2 default callbacks to not lose the prefix in this
case, or if an attribute uses an undefined namespace. SAX2 should now
be fixed whether or not a parser dictionary is used.
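As a rough picture of the "do not lose the prefix" policy, here is a small Python sketch. It is not the libxml2 patch, and resolve_name and its bindings argument are hypothetical names used only for illustration. The idea is that when a prefix has no binding, the whole qualified name is kept as the local name with no namespace, instead of dropping the part before the colon.

```python
def resolve_name(qname, bindings):
    """Split a qualified name into (namespace_uri, local_name).

    bindings maps prefixes to namespace URIs; the empty string is used
    here for a default namespace (an assumption of this sketch, and only
    meaningful for element names).
    """
    prefix, sep, local = qname.partition(":")
    if not sep:
        # No prefix at all: use the default namespace, if any.
        return (bindings.get(""), qname)
    uri = bindings.get(prefix)
    if uri is None:
        # Undefined prefix: keep the full "prefix:local" text so that
        # serializing the tree again loses nothing.
        return (None, qname)
    return (uri, local)


if __name__ == "__main__":
    print(resolve_name("ns:a", {}))
    print(resolve_name("ns:a", {"ns": "urn:example"}))
```

The namespace error would still be reported separately; the sketch only covers what name ends up in the tree.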

  I will push the patches once I'm off the train ;-)

Daniel

-- 
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
daniel veillard com  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/


