Re: [xml] ignore undefined entity references while parsing?

On Sun, Mar 02, 2008 at 08:29:32PM +0100, Hans Martin wrote:

I assume this would break any XML standard, but still...

I would like to parse XML with all entity references ignored,
i.e. defined or undefined entities neither substituted
nor removed. So far, with libxml2 2.6.31 and Python 2.4,
it works only for the defined ones.

Is there a clean way to do this? Currently, I quote the
ampersands of my input and unquote later. This works
well, but feels hacky. Is there a better way?

  Predefined entities (lt, gt, amp, quot and apos) are always
substituted by XML parsers. I'm afraid what you're trying to do
is raw text processing, not XML processing, and is better done
with text processing tools than with XML ones.
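
The quote-and-unquote workaround from the question is exactly this kind of
text processing wrapped around the parser. A minimal sketch follows. It
assumes Python 3 and the standard library's xml.etree.ElementTree rather
than the libxml2 bindings used in the thread, and the function names are
made up for illustration:

```python
import xml.etree.ElementTree as ET


def parse_keeping_entities(xml_text):
    # Hypothetical helper: escape every ampersand before parsing, so the
    # parser sees "&amp;name;" and hands back the literal text "&name;".
    escaped = xml_text.replace("&", "&amp;")
    return ET.fromstring(escaped)


def serialize_restoring_entities(root):
    # Hypothetical helper: the serializer writes each literal "&" as
    # "&amp;", so undoing that restores the original entity references.
    out = ET.tostring(root, encoding="unicode")
    return out.replace("&amp;", "&")


doc = "<p>a &lt; b &undefined; c &amp; d</p>"
root = parse_keeping_entities(doc)
text_seen_by_application = root.text
round_tripped = serialize_restoring_entities(root)
```

Here the application sees the references as plain text in root.text, and
serializing gives back the original markup. The blanket replace also touches
ampersands inside CDATA sections and comments, so this only suits input
where that is acceptable.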


Red Hat Virtualization group
Daniel Veillard      | virtualization library
veillard redhat com  | libxml GNOME XML XSLT toolkit | Rpmfind RPM search engine
