[xml] 3 pb with the xmlTextReader APi from Python

From: "Meunier, Jean-Luc" <Jean-Luc Meunier xrce xerox com>
To: <xml gnome org>
Subject: [xml] 3 pb with the xmlTextReader APi from Python
Date: Thu, 12 Jan 2006 15:46:03 +0100

Hi all,

I've found three problems with xmlTextReader when used from Python. I'm providing my code and a test example so that you can reproduce them, or dismiss them if I have simply misused the API.

Some brief context: I'm interested in processing an XML file in "semi-streaming" mode: the input XML is copied unchanged to the output, except for a series of sub-trees (identified, for instance, by their node name, e.g. the <PAGE> nodes), which I want to process as DOM using the Expand() method of the xmlTextReader API. It sounds nice, but copying turns out not to be so easy.
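For illustration, here is a minimal sketch of the "semi-streaming" loop described above. It assumes the libxml2 Python bindings are installed and that readerForDoc(), Read(), NodeType(), Name(), Expand(), Next() and serialize() behave as in the bindings I know; process_pages is a hypothetical helper name, and the sketch only collects the expanded sub-trees rather than copying the rest of the document.

```python
try:
    import libxml2  # libxml2 Python bindings (not in the standard library)
except ImportError:
    libxml2 = None

ELEMENT_NODE = 1  # xmlTextReader node type for an element start tag


def process_pages(xml_text, target="PAGE"):
    """Stream over xml_text and expand every <target> element into a DOM sub-tree.

    Returns the serialized form of each expanded sub-tree.
    """
    if libxml2 is None:
        raise RuntimeError("the libxml2 Python bindings are required")
    reader = libxml2.readerForDoc(xml_text, None, None, 0)
    expanded = []
    ret = reader.Read()
    while ret == 1:
        if reader.NodeType() == ELEMENT_NODE and reader.Name() == target:
            node = reader.Expand()       # build the DOM for this sub-tree only
            expanded.append(node.serialize())
            ret = reader.Next()          # skip past the sub-tree just expanded
        else:
            ret = reader.Read()          # plain streaming step
    return expanded
```

A call such as process_pages('<doc><PAGE n="1"><a/></PAGE><x/></doc>') should hand back one entry per <PAGE> element; everything outside those elements is only ever seen through the streaming interface.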

The (little) problems:

Pb 1 - how to process the XML declaration, e.g. <?xml version="1.0"?>

Pb 2 - the QuoteChar() method seems to always return " even if a ' was used to enclose an attribute value, e.g. a='123'

Pb 3 - in text nodes and attribute values, entities are handled strangely by the Value() method: for instance, an &amp; becomes a & in the returned string

Actually, rdr.CurrentDoc().encodeEntitiesReentrant(rdr.Value()) gives the correct output, which makes it even stranger to me.
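As a possible workaround for Pb 2 and Pb 3 when writing the copy, the decoded strings can be re-escaped on output. The sketch below uses only the Python standard library (xml.sax.saxutils) instead of encodeEntitiesReentrant(); the helper names are hypothetical, and it does not preserve the quote character used in the input, it only guarantees that the output is well-formed.

```python
from xml.sax.saxutils import escape, quoteattr


def serialize_text(value):
    """Re-escape a decoded text value (&, <, >) so it is valid character data."""
    return escape(value)


def serialize_attribute(name, value):
    """Emit name=value; quoteattr escapes the value and picks a safe quote character."""
    return "%s=%s" % (name, quoteattr(value))
```

With these helpers, a text value containing a bare & is written back as &amp;, and an attribute value is wrapped in double quotes unless it contains a double quote itself, in which case single quotes are used.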

These problems can be seen with the attached xmldump.py, which simply copies its input to its output. A test file is attached as well.

Thanks for your help/comments,

Attachment: xmldump.py
Description: xmldump.py

Attachment: test_simple.xml
Description: test_simple.xml

Follow-Ups:
- Re: [xml] 3 pb with the xmlTextReader APi from Python
  - From: Daniel Veillard
