Re: [xml] xpointer error



Daniel Veillard writes:
  Honestly I don't remember; I implemented that 8 years ago and it
hasn't seen much use since then. It's likely there are bugs, and while
string ranges have some regression tests, they seem to test only
rather small ranges.

I hope it will see some more use... Yours was the only complete
implementation of XPointer that I was able to find in a stand-alone parser.

What I'm trying to do is check whether I can use the string-range()
function of the xpointer() scheme instead of the completely unsupported
string-range() scheme (yes, the same name) defined by the TEI([1],[2]),
for the purpose of creating a modern text corpus with so-called
stand-off annotation (where the corpus markup resides outside of the
source text).

  Somehow I don't think you can use /div and expect to have ranges
working under a child of the div; maybe you misunderstood the spec.

I don't think I misunderstood it; here's a relevant example from the
spec/draft[3].

"The following expression selects the fifth of those exclamation marks
appearing in any text node in the document and the character immediately
following that exclamation mark:

string-range(/,"!",1,2)[5]
"

So you implemented that correctly, it's just the error message that pops
up in some cases :-)
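
For illustration only, the semantics of the spec example quoted above can
be approximated in Python over a flat string. This is a rough sketch, not
libxml2's algorithm: the helper name string_range is hypothetical, the
document is treated as one string (node boundaries are ignored), and the
offset is assumed to be at least 1.

```python
def string_range(text, needle, offset=1, length=None):
    """Approximate the xpointer() string-range() function on a flat string.

    Hypothetical helper for illustration. Offsets are 1-based and relative
    to the start of each match. An empty needle matches before every
    character and after the last one, as the draft describes.
    """
    if length is None:
        length = len(needle)
    ranges = []
    start = 0
    while start <= len(text):
        idx = text.find(needle, start)
        if idx == -1:
            break
        begin = idx + offset - 1
        ranges.append(text[begin:begin + length])
        # Matches do not overlap; step by at least one character so that
        # the empty needle still advances.
        start = idx + max(len(needle), 1)
    return ranges


sample = "Hi! Yes! No! Go! Wow!? End!"
# XPath predicates are 1-based, so [5] corresponds to index 4 here.
print(string_range(sample, "!", 1, 2)[4])  # prints "!?"
```

With an empty needle and [1], as in my test case, the first match sits
before the first character, so the offset and length simply count
characters from the start of the text.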

But the error raised by libxml2 doesn't look great either.

I'll report it in bugzilla, if it helps. There's one range() error
already reported there that looks uglier than this one (this one is just
a message, the other looks like an implementation error).

Best,

  Piotr

Notes:
[1]: http://www.tei-c.org/release/doc/tei-p5-doc/en/html/SA.html#SASOso
[2]: http://www.w3.org/2005/04/xpointer-schemes/
[3]: http://www.w3.org/TR/xptr-xpointer/#stringrange

On Wed, Nov 26, 2008 at 04:04:15PM +0100, Piotr Bański wrote:
Hello,

Daniel -- first of all, thanks so much for implementing the xpointer()
scheme -- I can try it out at last :-)

There is a problem that I get under both libxml 2.6.32 and 2.7.2,
illustrated by the following, which is a test case for extracting
substrings for the purpose of text annotation:

include2.xml

<?xml version="1.0" encoding="UTF-8"?>
<body><p><xi:include xmlns:xi="http://www.w3.org/2003/XInclude"
href="source.xml" xpointer="xpointer(string-range(/div,'',1,47)[1])"/></p>
    <p><xi:include xmlns:xi="http://www.w3.org/2003/XInclude"
href="source.xml" xpointer="xpointer(string-range(/div,'',50,22)[1])"/></p>
    <p><xi:include xmlns:xi="http://www.w3.org/2003/XInclude"
href="source.xml"
xpointer="xpointer(string-range(/div,'',73,11)[1])"/></p></body>


source.xml

<?xml version="1.0" encoding="UTF-8"?>
<div><p>To make a prairie it takes a clover and one bee,
One clover, and a bee,
And revery.
The revery alone will do,
If bees are few</p></div>

-----------
$ xmllint --xinclude --debug include2.xml
Internal error at
/usr/src/ports/libs/libxml2/libxml2-2.6.32-2/src/libxml2-2.6.32/xpointer.c:2409
Internal error at
/usr/src/ports/libs/libxml2/libxml2-2.6.32-2/src/libxml2-2.6.32/xpointer.c:2409
DOCUMENT
version=1.0
encoding=UTF-8
URL=include2.xml
standalone=true
  ELEMENT body
    ELEMENT p
      INCLUDE START
      TEXT
        content=To make a prairie it takes a clover and ...
      INCLUDE END
    TEXT compact
      content=
    ELEMENT p
      INCLUDE START
      TEXT
        content=One clover, and a bee,
      INCLUDE END
    TEXT compact
      content=
    ELEMENT p
      INCLUDE START
      TEXT
        content=And revery.
      INCLUDE END
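
As a sanity check on the offsets used in include2.xml, the expected
substrings can be computed in plain Python, treating the text of
source.xml as one string with 1-based positions. This is only a sketch of
what the three pointers should select; it does not call libxml2, the
helper name is made up, and it assumes each line break in source.xml is a
single newline character.

```python
# Text content of the <p> in source.xml, assuming "\n" line endings.
SOURCE_TEXT = (
    "To make a prairie it takes a clover and one bee,\n"
    "One clover, and a bee,\n"
    "And revery.\n"
    "The revery alone will do,\n"
    "If bees are few"
)


def substring(text, position, length):
    """Return `length` characters starting at 1-based `position`."""
    begin = position - 1
    return text[begin:begin + length]


# The (position, length) pairs from the three xi:include elements.
for position, length in [(1, 47), (50, 22), (73, 11)]:
    print(repr(substring(SOURCE_TEXT, position, length)))
```

Under those assumptions the three slices are the first line without its
trailing comma, the whole second line, and the whole third line, which
agrees with the TEXT nodes in the debug output.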


The error goes away in two cases: when I comment out the latter two
<p>s, or when I put a designated character at the very beginning of
the source.xml text, and match against it, e.g.:

<p><xi:include xmlns:xi="http://www.w3.org/2003/XInclude"
    href="source.xml"
    xpointer="xpointer(string-range(/div,'&#160;',3,48)[1])"/></p>

The error context in xpointer.c is in xmlXPtrAdvanceChar():

     if (pos > len) {
         /* Strange, the indx in the text node is greater than it's len */
         STRANGE
         pos = len;
     }

And I can't see what I could possibly be doing wrong here. Thought I'd
let you know, in case the problem was xmllint's.

The relevant fragment of the xpointer draft is at
http://www.w3.org/TR/xptr-xpointer/#stringrange

Best regards,




