Re: [Libxmlplusplus-general] Parser abstraction



Christophe de Vienne wrote:

The way parse_stream or a possible parse_chunk is implemented could be shared, since only the parser state is initialised differently (and only partially, since some options are the same). Even parse_file and parse_memory could share the same implementation in both classes if we used more low-level libxml calls.

ok. Well, right now they don't share *any* code. And even if they did use
those common functions, it would amount to a single line (or two). So code
reuse can't be the issue here.

My main concern really is the semantic difference between DOM and SAX.
My point really is that you decide once whether you want to use an event driven
approach, or whether you want a full in-memory representation which you can
walk through and manipulate.
There isn't anything those two things have in common. Yes, document construction
can be (and in libxml2 is) implemented on top of a SAX interface, but that's
an 'implementation detail'.
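
To make the contrast concrete, here is a minimal sketch of the two usage styles. All names (EventHandler, Node, ElementCounter, count_elements) are made up for illustration and are not libxml++ or libxml2 API; no real parsing happens here.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical event-driven interface: the parser calls you, nothing is stored.
struct EventHandler {
  virtual ~EventHandler() = default;
  virtual void on_start_element(const std::string &name) = 0;
  virtual void on_end_element(const std::string &name) = 0;
};

// Hypothetical in-memory representation: you own a tree and walk it at will.
struct Node {
  std::string name;
  std::vector<std::unique_ptr<Node>> children;
};

// Event-driven use: state lives in the handler and is updated as events arrive.
struct ElementCounter : EventHandler {
  std::size_t count = 0;
  void on_start_element(const std::string &) override { ++count; }
  void on_end_element(const std::string &) override {}
};

// Tree use: the same question is answered by a recursive walk after parsing.
std::size_t count_elements(const Node &node) {
  std::size_t total = 1;  // count this node itself
  for (const auto &child : node.children) total += count_elements(*child);
  return total;
}
```

The user code has a different shape in each case (callbacks versus a walk over an owned structure), which is the semantic difference I mean.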

Imagine I have several documents to parse with the same options. I could instantiate a single parser instance that would become a factory of Document, parameterized once and for all, without having to pass these options each time to a factory function (which would mean storing them somewhere).

ok, I can see that (though I'm not sure there is anything favoring the use
of the same options for all documents). I do think that in 90% of all
cases creation of a document is an atomic action, at least as far as the user's
code is concerned. In that light I'd at least provide a factory function
doing all this, possibly implemented as:

Document *create_document_from_file(const std::string &filename, const options &o)
{
  DomParser parser(o);
  Document *document = parser.parse(filename);
  return document;
}

so we are all happy :-)
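
For what it's worth, a self-contained version of that idea might look like the sketch below. Options, Document and the parse() body are stand-ins invented for illustration (the stub does not read any file), and I use std::unique_ptr instead of a raw pointer only to make ownership explicit; none of this is the actual libxml++ interface.

```cpp
#include <memory>
#include <string>

// Hypothetical option set; the single field is a placeholder.
struct Options {
  bool validate = false;
};

// Stand-in for a parsed document; it only records how it was produced.
struct Document {
  std::string source;
  bool validated;
};

// Parser configured once, then usable as a factory of documents.
class DomParser {
public:
  explicit DomParser(const Options &o) : options_(o) {}

  // Stub: a real implementation would read and parse the file here.
  std::unique_ptr<Document> parse(const std::string &filename) const {
    return std::unique_ptr<Document>(new Document{filename, options_.validate});
  }

private:
  Options options_;
};

// One-shot convenience wrapper for the common "atomic" case.
std::unique_ptr<Document> create_document_from_file(const std::string &filename,
                                                    const Options &o) {
  return DomParser(o).parse(filename);
}
```

Both styles coexist: keep a DomParser around when many documents share the same options, or call create_document_from_file when you just want one document.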

Moreover, considering the underlying C layer, we might not have implemented an accessor for a particular option. Having the parser state initialised by the parser instance and its accessors before the real parsing starts would leave open the possibility of altering it through a cobj() accessor in a derived class.

see my other mail: I don't think using 'cobj()' accessors should be encouraged.
It totally breaks encapsulation.

The inputs are in common. In libxml, for example, xmlParseChunk is the same for a SAX or a DOM parser: a Document will be produced if I don't specify a saxHandler.

right, and I think that is very unfortunate. It's two functions lumped into one,
and its semantics are very different depending on the arguments you pass.


Still in libxml, the DOM parser is built on top of the SAX parser: the two concepts share more than it seems at first sight.

again, if you derived your DOM parser privately from the SAX parser,
i.e. if you used 'derived from' in the sense of 'implemented in terms of',
I would (possibly) agree.
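
A rough sketch of what I mean by private derivation follows. The class bodies are toy stand-ins: parse() merely replays a list of element names as events, and build() returns the collected names in place of a real document tree. Only the inheritance relationship is the point.

```cpp
#include <string>
#include <vector>

// Hypothetical event-driven parser; parse() just replays the given names
// as start-element events, standing in for real parsing.
class SaxParser {
public:
  virtual ~SaxParser() = default;
  void parse(const std::vector<std::string> &element_names) {
    for (const auto &name : element_names) on_start_element(name);
  }

protected:
  virtual void on_start_element(const std::string &name) = 0;
};

// Private inheritance: the DOM parser is implemented in terms of the SAX
// parser but is not usable as one. Callers only see build().
class DomParser : private SaxParser {
public:
  // Returns the collected element names as a stand-in for a document tree.
  std::vector<std::string> build(const std::vector<std::string> &input) {
    collected_.clear();
    parse(input);  // accessible here, hidden from users of DomParser
    return collected_;
  }

private:
  void on_start_element(const std::string &name) override {
    collected_.push_back(name);
  }
  std::vector<std::string> collected_;
};
```

With this arrangement, user code cannot write `SaxParser *p = &dom_parser;` or call `dom_parser.parse(...)`, so the SAX machinery stays an implementation detail.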

Regards,
		Stefan




