Re: [xml] Strange validation errors in libxml 2.9.0 with MSVC 2010
- From: Zoltán Ördögh (GMail) <csimbi gmail com>
- To: "xml gnome org" <xml gnome org>
- Subject: Re: [xml] Strange validation errors in libxml 2.9.0 with MSVC 2010
- Date: Mon, 13 Jan 2014 22:16:48 -0500
Hi all,
Just to let you know, I found at least two causes that trigger this bogus DTD validation error, and I've found a workaround for each:
1. If your path contains backslashes, it won't work; this can be worked around by replacing backslashes with forward slashes before writing the path.
2. If your path contains spaces, it won't work; this can be worked around by replacing each space with %20. There are a lot of spaces in typical paths on the Windows platform, starting with the project dir, for example: <username>\My Documents\Visual Studio 2012\Projects\<ProjectName>\Debug\mydtd.dtd. How many spaces did you count?
With both of these workarounds in place, my programs validate XML documents normally, without any hiccups.
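For anyone who wants to automate the two workarounds, here is a minimal sketch in Python (the same string replacements carry over to C). The function names and the sample path are made up for illustration; nothing here is part of libxml itself.

```python
def dtd_path_to_system_id(path: str) -> str:
    """Apply both workarounds to a Windows path before writing it out."""
    # Workaround 1: backslashes become forward slashes.
    normalized = path.replace("\\", "/")
    # Workaround 2: each space becomes %20.
    return normalized.replace(" ", "%20")


def doctype_line(root: str, dtd_path: str) -> str:
    """Build a DOCTYPE declaration that uses the rewritten path."""
    return f'<!DOCTYPE {root} SYSTEM "{dtd_path_to_system_id(dtd_path)}">'


# Hypothetical project path, for illustration only.
sample = r"My Documents\Visual Studio 2012\Projects\Demo\Debug\mydtd.dtd"
print(dtd_path_to_system_id(sample))
print(doctype_line("root", sample))
```

Note that this only handles the two characters I ran into; other characters that are not valid in a URI would need escaping as well.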
I'd expect that:
- in the future, libxml will spit out a more sensible error message than "Validation failed: no DTD found" for such errors - after all, it was the file that was not found.
- issue #1 is technically a bug in libxml; however, it's platform-specific and the workaround is pretty easy, so I am guessing that it will most likely be left alone - unless of course the libxml author(s) want to claim that it's standards compliant (in which case the platform-specific handling will have to be added).
- issue #2 should be fixed, as it is in fact a real bug in libxml, regardless of the platform.
All you need to do is check the external entities in the XML standard:
http://www.w3.org/TR/REC-xml/#sec-external-ent
and follow it all the way to the format of SystemLiteral:
http://www.w3.org/TR/REC-xml/#NT-SystemLiteral
You'll find that space is, in fact, a valid character - and so is backslash:
    SystemLiteral ::= ('"' [^"]* '"') | ("'" [^']* "'")
Only the quote character that delimits the literal is forbidden inside it; both backslashes and spaces (plus a bunch of other characters that I guess won't work, such as 0x0A and 0x0D) are permitted in SystemLiteral.
If you keep reading further, you'll notice that the tightly restricted
PubidLiteral also allows for spaces specifically (but not backslashes):
    PubidLiteral ::= '"' PubidChar* '"' | "'" (PubidChar - "'")* "'"
    PubidChar    ::= #x20 | #xD | #xA | [a-zA-Z0-9] | [-'()+,./:=?;!*#@$_%]
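To make the difference between the two productions concrete, here is a rough Python sketch that transcribes them into regular expressions. The helper names are my own, and the patterns only mirror the two grammar rules quoted above; they do not check the general XML character restrictions.

```python
import re

# SystemLiteral ::= ('"' [^"]* '"') | ("'" [^']* "'")
SYSTEM_LITERAL = re.compile("\"[^\"]*\"|'[^']*'")

# PubidChar ::= #x20 | #xD | #xA | [a-zA-Z0-9] | [-'()+,./:=?;!*#@$_%]
PUBID_CHAR = r"[ \r\na-zA-Z0-9\-'()+,./:=?;!*#@$_%]"
# Same set without the apostrophe, for the single-quoted form.
PUBID_CHAR_NO_APOS = r"[ \r\na-zA-Z0-9\-()+,./:=?;!*#@$_%]"

# PubidLiteral ::= '"' PubidChar* '"' | "'" (PubidChar - "'")* "'"
PUBID_LITERAL = re.compile(
    '"' + PUBID_CHAR + '*"|' + "'" + PUBID_CHAR_NO_APOS + "*'"
)


def is_system_literal(text: str) -> bool:
    """True if text, including its quotes, matches SystemLiteral."""
    return SYSTEM_LITERAL.fullmatch(text) is not None


def is_pubid_literal(text: str) -> bool:
    """True if text, including its quotes, matches PubidLiteral."""
    return PUBID_LITERAL.fullmatch(text) is not None


# Hypothetical Windows-style path with a backslash and a space.
windows_path = '"C:\\My Documents\\mydtd.dtd"'
print(is_system_literal(windows_path))  # backslash and space are allowed
print(is_pubid_literal(windows_path))   # backslash is not a PubidChar
```

So a path with backslashes and spaces is a perfectly good SystemLiteral as far as the grammar goes, while the same string would be rejected as a PubidLiteral only because of the backslashes.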
I hope you'll have time to fix it eventually; thank you in advance.
Happy New Year to you all!