[xml] libxml2 testXPath chokes on W3C strict/loose.dtd



I think I may be getting somewhere.  The output below suggests that the
XPath component of libxml2 is choking on the DTD, specifically on the
comments in the entity definitions!  That's weird!  There should be
nothing wrong with the line breaks before the comments.

My test follows.  I've attached /tmp/b.html in case you'd like to
repeat it.

-------------------

lsiden lsiden libxml2-2.5.8 $ ./testXPath  -i /tmp/b.html
'//*[ id="study-text"]'
http://www.w3.org/TR/html4/loose.dtd:31: error: xmlParseEntityDecl:
entity HTML.Version not terminated
  -- Typical usage:
  ^
http://www.w3.org/TR/html4/loose.dtd:31: error: Extra content at the end
of the document
  -- Typical usage:
  ^
xpath.c:10772 Internal error: no document
Object is a Node Set :
NodeSet is NULL !
--------------------

I also tried this with strict.dtd, then with loose.dtd.  It chokes on
both.  
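
For illustration, here is a minimal sketch of the same kind of failure.
It is built on some assumptions: it uses Python's standard expat binding
rather than libxml2, and the two sample documents are hypothetical
miniatures modeled on the error output above, not the real loose.dtd.
The HTML 4 DTDs are written in SGML syntax, which permits "-- ... --"
comments inside a declaration, while XML only permits "<!-- ... -->"
comments between declarations.  If that is what is happening at
loose.dtd:31, an XML parser should reject the first sample and accept
the second.

```python
import xml.parsers.expat

# Hypothetical miniature of the declaration reported at loose.dtd:31.
# The entity value and comment text are placeholders, not the real DTD.
SGML_STYLE = """<!DOCTYPE html [
<!ENTITY % HTML.Version "placeholder"
  -- Typical usage: --
>
]>
<html/>"""

# Same declaration with the comment moved out into XML comment syntax.
XML_STYLE = """<!DOCTYPE html [
<!-- Typical usage: -->
<!ENTITY % HTML.Version "placeholder">
]>
<html/>"""


def parses_as_xml(text):
    """Return (True, None) if expat accepts text, else (False, message)."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(text, True)
    except xml.parsers.expat.ExpatError as err:
        return False, str(err)
    return True, None


if __name__ == "__main__":
    for label, text in (("SGML-style", SGML_STYLE), ("XML-style", XML_STYLE)):
        ok, message = parses_as_xml(text)
        print(label, "accepted" if ok else "rejected: " + message)
```

This only shows that an XML parser stops at a comment placed inside an
entity declaration.  It does not say how libxml2 or XML::LibXML ought
to handle a document that names one of the HTML 4 DTDs.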

It might be helpful to others in the future if you replaced the error
message returned by XML::LibXML::findnodes(), "empty xpath found ...",
with the actual reason for the failure, if such a thing is at all
possible (I realize that it may not be).  This error message caused me
to spend many hours examining the XPath expression instead of the real
problem, which lies in parsing the DTD.

There is a thread about this with Daniel V. from June '02: see 
http://mail.gnome.org/archives/xml/2002-June/msg00019.html.  Apparently,
this has come up already, but I can't tell how it was resolved.  In any
case, if libxml2 doesn't parse the strict.dtd or loose.dtd posted by
W3C, then what does it parse?

Respectfully,
Larry Siden


Title: Breishit
Hello world
