Re: [xml] xml:base missing on result from XInclude?
- From: Susanne Oberhauser-Hirschoff <froh suse com>
- To: Daniel Veillard <veillard redhat com>, Alexey Neyman <stilor att net>
- Cc: xml gnome org
- Subject: Re: [xml] xml:base missing on result from XInclude?
- Date: Tue, 22 Apr 2014 10:11:46 +0000
Hi Daniel, Alexey,
Alexey Neyman <stilor att net> writes:
> I think I know what is causing the issue. The code in
> xmlXIncludeLoadDoc looks at the url argument to see if it is relative
> path - to do so, it looks for slashes in the path. The problem is that
> xmlXIncludeLoadNode() passes down URIs that are relative to the
> top-level document, not to the most recent inclusion. Therefore, in the
> example below the url in xmlXIncludeLoadDoc() is just '3.xml', not
> '../3.xml' - and thus, the code wrongly considers it to be based in
> the same directory as the current included file.
Thanks for fixing this. Maybe this whole "check for a slash to tell if
xml:base fixup is needed" logic is flawed, though?
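To make the concern concrete, here is a rough Python sketch of the heuristic. It is not libxml2 code: the function names are made up for illustration, and plain path arithmetic stands in for what uri.c:xmlBuildRelativeURI really does. The second predicate is only one possible alternative, not necessarily what any patch implements.

```python
import posixpath


def relative_uri(url, base):
    """Path of `url` relative to the directory holding `base` (plain paths only)."""
    base_dir = posixpath.dirname(base) or "."
    return posixpath.relpath(url, base_dir)


def needs_fixup_slash_check(url, base):
    # Heuristic under discussion: fix up only if the relative URI contains a slash.
    return "/" in relative_uri(url, base)


def needs_fixup_any_change(url, base):
    # Hypothetical alternative: fix up whenever the included document
    # is not the including document itself.
    return relative_uri(url, base) != posixpath.basename(base)


# Same directory, different file: the slash check sees no reason for a fixup.
print(needs_fixup_slash_check("test/2.xml", "test/1.xml"))
print(needs_fixup_any_change("test/2.xml", "test/1.xml"))
```

With both files in the same directory the relative URI is just `2.xml`, so the slash check reports that no fixup is needed even though the base URI of the included content has changed.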
I'm using libxml2 2.9.1 and lxml 3.2.1.
Given these example files (similar to your examples, Alexey), I get no
xml:base fixup at all:
### sample files ##################################################
# generate three example files
mkdir test
cd test
cat >1.xml <<EOF
<?xml version="1.0"?>
<top xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:include href="2.xml"/>
</top>
EOF
cat >2.xml <<EOF
<?xml version="1.0"?>
<elem1 xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:include href="3.xml"/>
</elem1>
EOF
cat >3.xml <<EOF
<?xml version="1.0"?>
<elem2>
<a fileref="x.svg"/>
</elem2>
EOF
### wrong output ##################################################
# expect xml:base fixup. Get none :(
xmllint --xinclude 1.xml
<?xml version="1.0"?>
<top xmlns:xi="http://www.w3.org/2001/XInclude">
<elem1 xmlns:xi="http://www.w3.org/2001/XInclude">
<elem2>
<a fileref="x.svg"/>
</elem2>
</elem1>
</top>
###################################################################
The xml:base is not just the directory, it also contains the file name,
right? The whole XInclude test suite behaves like that, see below.
So it _should_ look like this, shouldn't it? This is what I get with
the attached patch to libxml:
### correct output ################################################
xmllint --xinclude 1.xml
<?xml version="1.0"?>
<top xmlns:xi="http://www.w3.org/2001/XInclude">
<elem1 xmlns:xi="http://www.w3.org/2001/XInclude" xml:base="2.xml">
<elem2 xml:base="3.xml">
<a fileref="x.svg"/>
</elem2>
</elem1>
</top>
###################################################################
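As a sanity check on why the file name belongs in xml:base, the resolution chain for the output above can be sketched with Python's standard urljoin. The absolute file:///test/ location and the sub/3.xml variant are assumptions made up for this illustration; they are not part of the sample files.

```python
from urllib.parse import urljoin

# Hypothetical absolute location of the top-level document.
top = "file:///test/1.xml"

# Each xml:base is resolved against the base URI of the enclosing element.
base_elem1 = urljoin(top, "2.xml")         # xml:base="2.xml" on <elem1>
base_elem2 = urljoin(base_elem1, "3.xml")  # xml:base="3.xml" on <elem2>

# A relative reference inside <elem2> is resolved against that base.
svg = urljoin(base_elem2, "x.svg")
print(base_elem1)
print(base_elem2)
print(svg)

# Hypothetical variant: if 3.xml lived in a subdirectory, the base would
# carry the directory and the file name, and x.svg would follow it.
moved_base = urljoin(top, "sub/3.xml")
print(urljoin(moved_base, "x.svg"))
```

Resolution replaces the last path segment of the base, so a base that names the included file works the same way whether or not the directory changes.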
The XInclude test suite agrees when run with the attached script,
like this:
###################################################################
cvs -d:pserver:anonymous@dev.w3.org:/sources/public \
co 2001/XInclude-Test-Suite XInclude-Test-Suite
cd XInclude-Test-Suite
python3 PATH-TO/run-tests-with-lxml.py
###################################################################
This gets about 15 fewer failures when run with the patch below, and
afaict from a review with/without the patch, there are no additional
ones. So it should be an improvement :)
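For anyone who wants to repeat the with/without comparison less manually, a small helper along these lines might do. It is only a sketch: it assumes the two runs were captured as text and that the status lines look like the "pass:", "fail:", "untested:" and "### diff:" lines the attached script prints; the function names are made up.

```python
import re

# "pass: ID", "fail: ID: message", "untested: ID: message"
STATUS = re.compile(r"^(pass|fail|untested): (.*?)(?:: .*)?$")
# "### diff: ID ####..." banner printed before a unified diff
DIFF_BANNER = re.compile(r"^### diff: (.*?) #*$")


def collect(output):
    """Map each test id to its status in one captured run."""
    results = {}
    for line in output.splitlines():
        m = STATUS.match(line)
        if m:
            results[m.group(2)] = m.group(1)
            continue
        m = DIFF_BANNER.match(line)
        if m:
            results[m.group(1)] = "diff"
    return results


def regressions(before, after):
    """Ids that passed in `before` but no longer pass in `after`."""
    old, new = collect(before), collect(after)
    return sorted(tc for tc, status in new.items()
                  if status != "pass" and old.get(tc) == "pass")


def newly_passing(before, after):
    """Ids that did not pass in `before` but pass in `after`."""
    old, new = collect(before), collect(after)
    return sorted(tc for tc, status in new.items()
                  if status == "pass" and old.get(tc) != "pass")
```

Feeding it the output of a run against the unpatched library and a run against the patched one gives the list of regressions (hopefully empty) and the list of newly passing cases.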
S.
Do xml:base fixup for file name changes in the same directory, too.
The "if it contains no slash, it needs no fixup" logic breaks the
XInclude test suite.
Index: libxml2-2.9.1/xinclude.c
===================================================================
--- libxml2-2.9.1.orig/xinclude.c
+++ libxml2-2.9.1/xinclude.c
@@ -1685,7 +1685,7 @@ loaded:
 #endif
 
     /*
-     * Do the xml:base fixup if needed
+     * Do the xml:base fixup as needed
      */
     if ((doc != NULL) && (URL != NULL) && (xmlStrchr(URL, (xmlChar) '/')) &&
         (!(ctxt->parseFlags & XML_PARSE_NOBASEFIX)) &&
@@ -1695,28 +1695,26 @@ loaded:
         xmlChar *curBase;
 
         /*
-         * The base is only adjusted if "necessary", i.e. if the xinclude node
-         * has a base specified, or the URL is relative
+         * The xml:base is adjusted as necessary.  Possibly the
+         * xinclude node has a base specified?
          */
         base = xmlGetNsProp(ctxt->incTab[nr]->ref, BAD_CAST "base",
                             XML_XML_NAMESPACE);
         if (base == NULL) {
             /*
-             * No xml:base on the xinclude node, so we check whether the
-             * URI base is different than (relative to) the context base
+             * No xml:base on the xinclude node.  Compute the base
+             * from the URL of the included document, if possible
+             * relative to the context base.  See
+             * uri.c:xmlBuildRelativeURI for the relative/absolute
+             * magic.
              */
             curBase = xmlBuildRelativeURI(URL, ctxt->base);
             if (curBase == NULL) { /* Error return */
                 xmlXIncludeErr(ctxt, ctxt->incTab[nr]->ref,
                                XML_XINCLUDE_HREF_URI,
                                "trying to build relative URI from %s\n", URL);
-            } else {
-                /* If the URI doesn't contain a slash, it's not relative */
-                if (!xmlStrchr(curBase, (xmlChar) '/'))
-                    xmlFree(curBase);
-                else
-                    base = curBase;
             }
+            base = curBase;
         }
         if (base != NULL) { /* Adjustment may be needed */
             node = ctxt->incTab[nr]->inc;
#!/usr/bin/env python3
# (C) 2014 Susanne Oberhauser-Hirschoff <froh suse com>
# The MIT license applies http://opensource.org/licenses/MIT
"""
# Run the XInclude test suite through lxml:
# get the test suite
cvs -d:pserver:anonymous@dev.w3.org:/sources/public \
co 2001/XInclude-Test-Suite XInclude-Test-Suite
cd XInclude-Test-Suite
# run this script
python3 PATH-TO/run-tests-with-lxml.py
"""
from lxml import etree, objectify
tests = objectify.parse('testdescr.xml').getroot()
feature2xmllint_option = {
    'xpointer-scheme': '',
    'unexpanded-entities': None,
    'unparsed-entities': None,
    'lang-fixup': None,
}
class TC: pass

tcs = list()
for suite in tests.testcases:
    basedir = suite.get('basedir')
    creator = suite.get('creator')
    for case in suite.testcase:
        tc = TC()
        tc.basedir = basedir
        tc.creator = creator
        tc.id = case.get('id')
        tc.file = case.get('href')
        # success, error or optional
        tc.type = case.get('type')
        if tc.type == 'error':
            tc.result_file = None
        else:
            tc.result_file = case.output
        required_features = case.get('features')
        if required_features is None:
            tc.required_features = list()
        else:
            tc.required_features = required_features.split()
        tcs.append(tc)
for tc in tcs:
    if tc.required_features is None:
        tc.xmllint_options = ['']
    else:
        tc.xmllint_options = tuple(feature2xmllint_option[f]
                                   for f in tc.required_features)
    if None in tc.xmllint_options:
        tc.unhandled_features = tuple(
            filter(
                lambda x: None is feature2xmllint_option[x],
                tc.required_features
            ))
    else:
        tc.unhandled_features = None
def xinclude_expand(tc):
    filename = "{tc.basedir}/{tc.file}".format(tc=tc)
    got = etree.parse(filename)
    got.xinclude()
    result = ['<?xml version="1.0"?>']
    result.extend(etree.tostring(got, encoding=str).splitlines())
    return filename, result
import difflib

for tc in tcs:
    if tc.unhandled_features != None:
        print("untested: {tc.creator}-{tc.id}: can't handle options {tc.unhandled_features}\n".format(tc=tc))
        continue
    try:
        tofile, got = xinclude_expand(tc)
        fromfile = "{tc.basedir}/{tc.result_file}".format(tc=tc)
        with open(fromfile) as f:
            expected = f.read().splitlines()
        diff = difflib.unified_diff(expected, got,
                                    fromfile=fromfile,
                                    tofile="lxml.etree.parse( {} ).xinclude().tostring()".format(tofile),
                                    lineterm='')
        diff = list(diff)
        if len(diff) == 0:
            print("pass: {tc.creator}-{tc.id}".format(tc=tc))
        else:
            print("###{:#<64}".format(" diff: {tc.creator}-{tc.id} ".format(tc=tc)))
            for line in diff: print(line)
            print('###################################################################')
    except Exception as e:
        if tc.type == 'error':
            print("pass: {tc.creator}-{tc.id}: expected error {e}".format(tc=tc,e=e))
        else:
            print("fail: {tc.creator}-{tc.id}: unexpected error {e}".format(tc=tc,e=e))
--
Susanne Oberhauser SUSE LINUX Products GmbH
+49-911-74053-574 Maxfeldstraße 5
Processes and Infrastructure 90409 Nürnberg
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 16746 (AG Nürnberg)