Re: [xml] Undefined character entities and libxml



On Fri, Dec 16, 2005 at 02:57:01PM -0500, Jon Smirl wrote:
I have documents with the XHTML 1.1 doctype:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

These documents contain character entities like &nbsp;

I would like to be able to parse these documents into a tree without
generating errors like this:
root.xml:26: parser error : Entity 'nbsp' not defined
host/">Parent Directory</a>/</td><td class="m">&nbsp;</td><td class="s">- &nbsp;

I built all of the catalogs needed for XHTML 1.1 and I can run the
documents through "xmllint -valid -nonet -noout doc.xml" without
errors. (Was there some place to get the xhtml 1.1 DTD as an rpm? I
could only find the xhtml 1.0 rpm)

  No, I only packaged the 1.0 version, not the 1.1.

Is it possible to run the docs through libxml in non-validating mode
without having the entities defined and not get error reports?

  It's an error, but not a fatal one: the document has an external
subset, but that subset is not loaded, so the parser never sees the
entity declarations.

  You will need to load the DTD while still not validating. That is
an intermediate parser mode; add the
  --loaddtd : fetch external DTD
option for this.
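
  As a rough illustration of the difference, here is a small sketch.
It assumes xmllint is on the PATH, and it uses a tiny stand-in DTD
(the file name mini.dtd and its contents are made up for this example)
instead of the real XHTML 1.1 DTD, so that nothing has to be fetched.
For the real documents, the catalogs would resolve the XHTML DTD and
--nonet would keep the parser off the network.

```shell
#!/bin/sh
# Work in a scratch directory so the demo leaves the current one alone.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical stand-in for the external subset: it declares only nbsp.
cat > mini.dtd <<'EOF'
<!ELEMENT td (#PCDATA)>
<!ENTITY nbsp "&#160;">
EOF

# Document that points at the external subset and uses &nbsp;.
cat > doc.xml <<'EOF'
<?xml version="1.0"?>
<!DOCTYPE td SYSTEM "mini.dtd">
<td>&nbsp;</td>
EOF

# Without --loaddtd the external subset is never read,
# so the parser has no declaration for nbsp.
plain_err=$(xmllint --noout doc.xml 2>&1)

# With --loaddtd the DTD is read, but the document is not validated,
# so the entity declaration is available to the parser.
loaded_err=$(xmllint --loaddtd --noout doc.xml 2>&1)
loaded_status=$?

printf 'without --loaddtd: %s\n' "${plain_err:-<no messages>}"
printf 'with --loaddtd:    %s\n' "${loaded_err:-<no messages>}"
```

  The second run should be quiet about nbsp; the exact wording and
severity of the message in the first run may differ between libxml2
versions.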

xmllint with --noent has no effect on the error messages.

  Yes, that option asks the parser to substitute the entities it finds
with their replacement text in the tree, but if they are not found ...
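
  To make the role of --noent concrete, a minimal sketch follows. It
again assumes xmllint is installed and uses a made-up stand-in DTD
(mini.dtd) rather than the XHTML 1.1 one. Once the DTD is loaded,
--noent decides whether the reference stays in the tree or is replaced
by the entity's content.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical stand-in DTD and a document that uses &nbsp;.
printf '%s\n' '<!ELEMENT td (#PCDATA)>' '<!ENTITY nbsp "&#160;">' > mini.dtd
printf '%s\n' '<!DOCTYPE td SYSTEM "mini.dtd">' '<td>a&nbsp;b</td>' > doc.xml

# DTD loaded, no substitution: the reference is kept as &nbsp; in the output.
kept=$(xmllint --loaddtd doc.xml 2>&1)

# DTD loaded plus --noent: the reference is replaced by the entity's content.
substituted=$(xmllint --loaddtd --noent doc.xml 2>&1)

printf '%s\n' "$kept" "$substituted"
```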

Daniel

-- 
Daniel Veillard      | Red Hat http://redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/


