Re: [xml] using xmlTextReader efficiently



On Wed, Aug 23, 2006 at 08:52:09PM -0700, Todd Ditchendorf wrote:
I'd like to use xmlTextReader to parse and validate source documents  
against DTDs and Relax NG. I'm somewhat familiar with this already,  
as I've used libxml2 before.

It just dawned on me, however, that it would be a good idea to ask  
the list to confirm some assumptions I've made about using  
xmlTextReader. I've looked at the source, but I'm not the most  
experienced C programmer, so I wanted to double check with the experts.

Basically, I want to verify that you can use xmlTextReader like an  
ideal SAX parser that doesn't build an entire tree structure of  
source documents in memory.

  Right, basically it's a bit like a tree parser, but with just a
sliding window of the document being constructed at any given time:
at a minimum, the current node and its ancestors.

I assume that this example:

http://xmlsoft.org/examples/reader2.c

does not build an in-memory tree of the entire source doc being  
validated, but rather only holds small portions of the document at a  
time. Is that correct?

  yes

Additionally, when taking similar action with a RELAX NG schema, does  
the same hold true for the source document (obviously, the schema doc  
has to be parsed into a tree in memory)?

  yes, except that with *some* RNG schema constructs larger parts of the
tree need to be available. In RNG you may need to accumulate data, either
document data (the libxml2 way) or regexp data (the derivation method).

Daniel

-- 
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard      | virtualization library  http://libvirt.org/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine  http://rpmfind.net/


