Re: [xml] file path vs URI [patch]



While looking further into the code, I came up with rather simple sanity changes which solve both of my issues (described below in my previous email).

Some more background:

xmlCtxtReadFile() takes an input parameter 'filename', which is documented as a filename or URL. This function calls xmlLoadExternalEntity(), which expects a URL on input. I have added code which handles filenames starting with // by skipping the first slash before the call.

Practically the same issue exists in xmlCtxtReadFile(), which also calls xmlLoadExternalEntity().
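
For illustration only, the check can be written as a standalone helper. This is a minimal sketch and not the patch itself: the name skip_extra_leading_slash is hypothetical and is not a libxml2 function. Like the patch, it drops exactly one slash, so a name starting with "///" would still start with "//" afterwards.

```c
#include <stddef.h>

/* Hypothetical helper (not part of libxml2): if the name starts with "//",
 * return a pointer past the first slash so that the result is no longer
 * read as a network-path reference. Any other input is returned unchanged. */
const char *skip_extra_leading_slash(const char *filename)
{
    /* filename[1] is only read when filename[0] is '/', so a one-character
     * or empty string is safe without calling strlen(). */
    if (filename != NULL && filename[0] == '/' && filename[1] == '/')
        return filename + 1;
    return filename;
}
```

With this helper, "//etc/passwd" becomes "/etc/passwd", while "/etc/passwd" and NULL are passed through untouched.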

Please see my attached patch and consider changing libxml2 in a similar way. Any comments are welcome.

Thanks,

Petr

Petr Sumbera wrote:
Hi guys,

Sorry if this is not well described or is strangely formulated; I'm not an XML expert at all.

On UNIX systems a file path starting with "//" is accepted and processed without complaint. For example:

"cat //etc/passwd" will show content of /etc/passwd

But this seems to go against the URI specification (RFC 2396), or at least against its implementation in libxml2.

http://www.ietf.org/rfc/rfc2396.txt

libxml2 treats such a path/URI(?) as a relative reference in which "//" introduces a network location. RFC 2396 says the following about relative references in section 1.4:

"In contrast, a relative identifier refers to a resource by describing the difference within a hierarchical namespace between the current context and an absolute identifier of the resource."

Does libxml2 really follow this? There seems to be a simple fallback from absolute reference to relative reference (in xmlParseURIReference()). I don't see the context anywhere.

A simple example of why I'm concerned:
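
To make the "//" reading concrete, here is a rough sketch of how a reference starting with "//" splits under the RFC 2396 rule net_path = "//" authority [ abs_path ]. It is an illustration under that assumption only, not libxml2's parser; the function name split_net_path and its parameters are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical illustration (not libxml2 code): split a reference that
 * starts with "//" into its authority and the remainder.
 * Returns 1 and fills authority/rest on success, 0 if ref does not start
 * with "//" or the authority does not fit into the buffer. */
int split_net_path(const char *ref, char *authority, size_t authsz,
                   const char **rest)
{
    const char *start;
    size_t len;

    if (ref == NULL || authority == NULL || rest == NULL)
        return 0;
    if (ref[0] != '/' || ref[1] != '/')
        return 0;

    start = ref + 2;
    /* The authority runs up to the next '/', '?' or '#', or to the end. */
    len = strcspn(start, "/?#");
    if (len >= authsz)
        return 0;

    memcpy(authority, start, len);
    authority[len] = '\0';
    *rest = start + len;
    return 1;
}
```

For "//var/svc/manifest/network/ntp.xml" this yields the authority "var" and the remainder "/svc/manifest/network/ntp.xml", which is not what a UNIX user means by that path.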

-bash-3.2$ head -n 2 /var/svc/manifest/network/ntp.xml
<?xml version="1.0"?>
<!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">
-bash-3.2$ xmllint --valid --noout //var/svc/manifest/network/ntp.xml
//var/svc/manifest/network/ntp.xml:2: I/O error : failed to load external entity "//var/usr/share/lib/xml/dtd/service_bundle.dtd.1"
<!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">
       ^
//var/svc/manifest/network/ntp.xml:2: warning: failed to load external entity "//var/usr/share/lib/xml/dtd/service_bundle.dtd.1"
<!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">
       ^
//var/svc/manifest/network/ntp.xml:15: validity error : Validation failed: no DTD found !
<service_bundle type='manifest' name='SUNWntpr:xntpd'>
                                                     ^
-bash-3.2$

---

Note that the DTD is searched for at the wrong path "//var/usr/share/lib/xml/dtd/service_bundle.dtd.1" instead of "/usr/share/lib/xml/dtd/service_bundle.dtd.1" (basically, "var" is treated as the server).
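
The wrong path is what one would expect if the DTD's system identifier is resolved as an absolute-path reference against the document name taken as a base URI: the path is replaced but the base's authority is kept. The sketch below reproduces just that one step for scheme-less bases. It is a simplified illustration, not libxml2's resolver, and the function name resolve_abs_path is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical illustration (not libxml2 code): resolve a reference that
 * starts with a single '/' against a scheme-less base. If the base starts
 * with "//", its authority part is kept and only the path is replaced.
 * Returns 0 on success, -1 on bad input or if the result does not fit. */
int resolve_abs_path(const char *base, const char *ref,
                     char *out, size_t outsz)
{
    size_t authlen = 0;
    int n;

    if (base == NULL || ref == NULL || out == NULL)
        return -1;
    /* Only absolute-path references ("/..." but not "//...") are handled. */
    if (ref[0] != '/' || ref[1] == '/')
        return -1;

    /* "//" plus everything up to the next '/', '?' or '#' is the authority. */
    if (base[0] == '/' && base[1] == '/')
        authlen = 2 + strcspn(base + 2, "/?#");

    n = snprintf(out, outsz, "%.*s%s", (int)authlen, base, ref);
    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

With the base "//var/svc/manifest/network/ntp.xml" and the reference "/usr/share/lib/xml/dtd/service_bundle.dtd.1" the result is "//var/usr/share/lib/xml/dtd/service_bundle.dtd.1", matching the error message above. With the base "/var/svc/manifest/network/ntp.xml" the reference comes back unchanged.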

It may be a problem in xmllint, which does not convert (sanitize) the path into a URI. Is it?

And why am I bothering about all of this? We have a utility which shows the same symptoms after updating libxml2 from 2.6.10 to 2.6.23. As it uses functions like xmlReadFile (which takes a filename on input), it may be hitting a problem similar to the one described above. The difference is that the xmllint problem described above was present even in 2.6.10, while our utility worked fine with that version. Any ideas?

Thanks for any comments,

Petr


--- libxml2-2.6.30/parser.c.orig        Tue Aug 14 06:43:11 2007
+++ libxml2-2.6.30/parser.c     Fri Nov  2 04:42:37 2007
@@ -12259,6 +12259,11 @@
     if (options)
        xmlCtxtUseOptions(ctxt, options);
     ctxt->linenumbers = 1;
+
+    /* sanitize filename starting with // so it can be used as URI */
+    if (filename != NULL && strlen(filename) >= 2
+       && filename[0]=='/' && filename[1]=='/')
+       filename++;
     
     inputStream = xmlLoadExternalEntity(filename, NULL, ctxt);
     if (inputStream == NULL) {
@@ -13516,6 +13521,11 @@
 
     xmlCtxtReset(ctxt);
 
+    /* sanitize filename starting with // so it can be used as URI */
+    if (filename != NULL && strlen(filename) >= 2
+        && filename[0]=='/' && filename[1]=='/')
+        filename++;
+
     stream = xmlLoadExternalEntity(filename, NULL, ctxt);
     if (stream == NULL) {
         return (NULL);

