Re: [xml] file path vs URI [patch]
- From: Petr Sumbera <Petr Sumbera Sun COM>
- To: xml gnome org
- Date: Fri, 02 Nov 2007 14:54:44 +0100
While looking further into the code I came up with rather simple
sanity patches which solve both of my issues (described below in my
previous email).
Some more background:
xmlCtxtReadFile() takes an input parameter 'filename', which is
described as a filename or URL. This function calls
xmlLoadExternalEntity(), which expects a URL on input. I have added
code which handles filenames starting with //.
Practically the same issue exists in xmlCtxtReadFile(), which also calls
xmlLoadExternalEntity().
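The sanitizing step can also be sketched as a standalone helper. This is only an illustration: skip_extra_leading_slashes() is a hypothetical name, not a libxml2 function, and unlike a single filename++ it collapses any number of redundant leading slashes, which is an assumption about the desired behaviour.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of libxml2): skip redundant leading
 * slashes so that "//etc/passwd" is seen as "/etc/passwd".
 * Returns a pointer into the original string; nothing is allocated.
 */
static const char *
skip_extra_leading_slashes(const char *filename)
{
    if (filename == NULL)
        return NULL;
    /* Advance while the first two characters are both '/'. */
    while (filename[0] == '/' && filename[1] == '/')
        filename++;
    return filename;
}
```

Because the result points into the caller's buffer, it stays valid only as long as the original string does. Relative names and real URLs such as "http://host/x" pass through unchanged, since they do not begin with "//".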
Please see my attached patch and consider changing libxml2 in a similar
way. Any comments are welcome.
Thanks,
Petr
Petr Sumbera wrote:
Hi guys,
Sorry if this is not well described or is strangely formulated, as
I'm not an XML expert at all.
On UNIX systems a file path starting with "//" is accepted and
processed without complaint. For example:
"cat //etc/passwd" will show the content of /etc/passwd
But this seems to be against the URI specification (RFC 2396), or at
least its implementation in libxml2.
http://www.ietf.org/rfc/rfc2396.txt
libxml2 interprets such a path/URI(?) as a relative reference in which
"//" introduces a network location. RFC 2396 says the following about
relative references in section 1.4:
"In contrast, a relative identifier refers to a resource by describing
the difference within a hierarchical namespace between the current
context and an absolute identifier of the resource."
Does libxml2 really follow this? There seems to be a simple fallback
from absolute reference to relative (in xmlParseURIReference()). I don't
see a context anywhere.
Simple example why I'm concerned:
-bash-3.2$ head -n 2 /var/svc/manifest/network/ntp.xml
<?xml version="1.0"?>
<!DOCTYPE service_bundle SYSTEM
"/usr/share/lib/xml/dtd/service_bundle.dtd.1">
-bash-3.2$ xmllint --valid --noout //var/svc/manifest/network/ntp.xml
//var/svc/manifest/network/ntp.xml:2: I/O error : failed to load
external entity "//var/usr/share/lib/xml/dtd/service_bundle.dtd.1"
<!DOCTYPE service_bundle SYSTEM
"/usr/share/lib/xml/dtd/service_bundle.dtd.1">
^
//var/svc/manifest/network/ntp.xml:2: warning: failed to load external
entity "//var/usr/share/lib/xml/dtd/service_bundle.dtd.1"
<!DOCTYPE service_bundle SYSTEM
"/usr/share/lib/xml/dtd/service_bundle.dtd.1">
^
//var/svc/manifest/network/ntp.xml:15: validity error : Validation
failed: no DTD found !
<service_bundle type='manifest' name='SUNWntpr:xntpd'>
^
-bash-3.2$
---
Note that the DTD is searched for in the wrong path,
"//var/usr/share/lib/xml/dtd/service_bundle.dtd.1", instead of
"/usr/share/lib/xml/dtd/service_bundle.dtd.1" (basically "var" is
treated as the server).
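The wrong path can be reproduced with a small sketch of how an absolute-path reference is resolved once the base is read as a network path. This is a simplified illustration, not libxml2's code: resolve_abs_path() is a hypothetical function, and it handles only a base that starts with "//" and a reference that starts with a single "/".

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical illustration (not libxml2 code): resolve an absolute-path
 * reference such as "/usr/share/x.dtd" against a base that begins with
 * "//", treating the text up to the next '/' as the authority (server),
 * as RFC 2396 does for a network path.
 * Returns 0 on success, -1 if the inputs do not fit this case or the
 * output buffer is too small.
 */
static int
resolve_abs_path(const char *base, const char *ref, char *out, size_t outlen)
{
    const char *auth, *end;
    size_t authlen;
    int n;

    if (base == NULL || ref == NULL || out == NULL)
        return -1;
    /* Only handle "//authority/..." bases and "/path" references. */
    if (strncmp(base, "//", 2) != 0 || ref[0] != '/' || ref[1] == '/')
        return -1;

    auth = base + 2;                 /* start of the authority part */
    end = strchr(auth, '/');         /* authority ends at the next '/' */
    authlen = (end != NULL) ? (size_t)(end - auth) : strlen(auth);

    /* The authority is kept from the base; the path is replaced by ref. */
    n = snprintf(out, outlen, "//%.*s%s", (int)authlen, auth, ref);
    return (n >= 0 && (size_t)n < outlen) ? 0 : -1;
}
```

Called with the base "//var/svc/manifest/network/ntp.xml" and the reference "/usr/share/lib/xml/dtd/service_bundle.dtd.1", this fills the buffer with "//var/usr/share/lib/xml/dtd/service_bundle.dtd.1", the same path shown in the xmllint error: "var" is kept as the authority and only the path after it is replaced.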
It may be a problem of xmllint not converting (sanitizing) the path
into a URI. Is it?
And why am I bothering with all of this? We have a utility which
shows the same symptoms after updating libxml2 from 2.6.10 to
2.6.23. As it uses functions such as xmlReadFile (which takes a
filename on input), it may be related to the problem described above.
The difference is that the problem described above was present even in
2.6.10, while with that version our utility worked fine. Any ideas?
Thanks for any comments,
Petr
--- libxml2-2.6.30/parser.c.orig Tue Aug 14 06:43:11 2007
+++ libxml2-2.6.30/parser.c Fri Nov 2 04:42:37 2007
@@ -12259,6 +12259,11 @@
if (options)
xmlCtxtUseOptions(ctxt, options);
ctxt->linenumbers = 1;
+
+ /* sanitize filename starting with // so it can be used as URI */
+ if (filename != NULL && strlen(filename) >= 2
+ && filename[0]=='/' && filename[1]=='/')
+ filename++;
inputStream = xmlLoadExternalEntity(filename, NULL, ctxt);
if (inputStream == NULL) {
@@ -13516,6 +13521,11 @@
xmlCtxtReset(ctxt);
+ /* sanitize filename starting with // so it can be used as URI */
+ if (filename != NULL && strlen(filename) >= 2
+ && filename[0]=='/' && filename[1]=='/')
+ filename++;
+
stream = xmlLoadExternalEntity(filename, NULL, ctxt);
if (stream == NULL) {
return (NULL);