Re: [Libxmlplusplus-general] Parser abstraction

From: Stefan Seefeld <seefeld sympatico ca>
To: libxmlplusplus-general lists sourceforge net
Subject: Re: [Libxmlplusplus-general] Parser abstraction
Date: Sat, 01 Feb 2003 15:27:20 -0500

Christophe de Vienne wrote:

The way parse_stream or an eventual parse_chunk is implemented could be shared, since only the parser state is initialised differently (and only partially, since some options are the same). Even parse_file and parse_memory could share the same implementation in both classes if we used more low-level libxml calls.


ok. Well, right now they don't share *any* code. And even if they used
those common function(s), it would amount to a single line (or two). So code
reuse can't be an issue here.

My main concern really is the semantic difference between DOM and SAX.
My point really is that you decide once whether you want to use an event driven
approach, or whether you want a full in-memory representation which you can
walk through and manipulate.
There isn't anything those two things have in common. Yes, document construction
can be (and in libxml2 is) implemented on top of a SAX interface, but that's
an 'implementation detail'.

Imagine I have several documents to parse with the same options. I could instantiate a single parser instance that would become a factory of Documents, parameterized once and for all, without having to pass these options each time to a factory function (which would mean storing them somewhere).


ok, I can see that (though I'm not sure there is anything that favors
using the same options for all documents). I do think that in 90% of all
cases creation of a document is an atomic action, at least as far as the user's
code is concerned. In that light I'd at least provide a factory function
doing all this, possibly implemented as:

Document *create_document_from_file(const std::string &filename, const options &o)
{
  DomParser parser(o);
  Document *document = parser.parse(filename);
  return document;
}

so we are all happy :-)
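
To make the two styles concrete, here is a minimal self-contained sketch in which the parser acts as a reusable factory and the free function is a thin wrapper around it. Everything in it is a hypothetical stand-in rather than the actual libxml++ API: the Options and Document structs, their fields, and the use of std::unique_ptr (chosen here only to make ownership of the returned document explicit) are assumptions, and no real parsing takes place.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-ins; none of these are the real libxml++ classes.
struct Options {
  bool validate = false;
  bool substitute_entities = false;
};

struct Document {
  std::string source;  // where the document came from
  bool validated;      // whether validation was requested for it
};

// Configured once, then reused as a factory for any number of documents.
class DomParser {
public:
  explicit DomParser(const Options &o) : options_(o) {}

  // A real implementation would read and parse the file here.
  std::unique_ptr<Document> parse(const std::string &filename) const {
    return std::unique_ptr<Document>(
        new Document{filename, options_.validate});
  }

private:
  Options options_;
};

// One-shot wrapper for callers who treat document creation as atomic.
std::unique_ptr<Document> create_document_from_file(
    const std::string &filename, const Options &o = Options()) {
  return DomParser(o).parse(filename);
}
```

A caller with many documents keeps one DomParser around and calls parse() repeatedly; a caller with a single document uses create_document_from_file() and never sees the parser.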

Moreover, considering the underlying C layer, we might not have implemented an accessor for a particular option. Having the parser state initialised by the parser instance and its accessors before the real parsing starts would leave the possibility of altering it through a cobj() accessor in an inherited class.


see my other mail: I don't think using 'cobj()' accessors should be encouraged.
It totally breaks encapsulation.

The inputs are in common. In libxml for example, xmlParseChunk is the same for a SAX or a DOM parser: a Document will be produced if I don't specify a saxHandler.


right, and I think that is very unfortunate. It's two functions lumped into one,
and its semantics are very different depending on the arguments you pass.

Still in libxml, the domparser is built on top of the saxparser: the two concepts share more things than it seems at first sight.


again, if you derived your dom parser privately from the sax parser,
i.e. if you used 'derived from' in the sense of 'implemented by', I would
(possibly) agree.
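
As an illustration of what 'implemented by' could look like, here is a minimal sketch of a DOM parser that derives privately from a SAX-style parser. The class layout, the on_element callback, the parse_document name and the toy tokenizer (each whitespace-separated word counts as one "element" event) are all assumptions made up for the example; this is neither libxml2's nor libxml++'s actual design.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical event-driven parser. The tokenizer is a toy: every
// space-separated word is reported as one "element" event.
class SaxParser {
public:
  virtual ~SaxParser() {}

  void parse(const std::string &input) {
    std::string word;
    for (char c : input) {
      if (c == ' ') {
        flush(word);
      } else {
        word += c;
      }
    }
    flush(word);
  }

protected:
  virtual void on_element(const std::string &name) = 0;

private:
  void flush(std::string &word) {
    if (!word.empty()) {
      on_element(word);
      word.clear();
    }
  }
};

struct Document {
  std::vector<std::string> elements;
};

// Private inheritance: DomParser is implemented in terms of SaxParser but
// cannot be used as one, so the event interface stays an internal detail.
class DomParser : private SaxParser {
public:
  std::unique_ptr<Document> parse_document(const std::string &input) {
    doc_.reset(new Document());
    SaxParser::parse(input);  // events arrive in on_element() below
    return std::move(doc_);
  }

private:
  void on_element(const std::string &name) override {
    doc_->elements.push_back(name);
  }

  std::unique_ptr<Document> doc_;
};
```

With this layout, code outside DomParser cannot convert a DomParser* to a SaxParser* or call the SAX-level parse() on it; the only thing a user sees is parse_document() returning a Document.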

Regards,
		Stefan

Follow-Ups:
- Re: [Libxmlplusplus-general] Parser abstraction
  - From: Jonathan Wakely
- Re: [Libxmlplusplus-general] Parser abstraction
  - From: Christophe de Vienne

References:
- Re: [Libxmlplusplus-general] Parser abstraction
  - From: Christophe de Vienne
- Re: [Libxmlplusplus-general] Parser abstraction
  - From: Stefan Seefeld
- Re: [Libxmlplusplus-general] Parser abstraction
  - From: Christophe de Vienne
