RE: [libxml++] libxml++ future

From: Murray Cumming Comneon com
To: libxmlplusplus-general lists sourceforge net
Subject: RE: [libxml++] libxml++ future
Date: Fri, 26 Sep 2003 08:21:21 +0200
> From: Christophe de Vienne [mailto:cdevienne alphacent com] 
> 1 - postfix private members instead of prefixing them with an
> underscore
> 
> target version : 1.0
> 
> The ISO C++ standard reserves names with a leading underscore for the
> implementation (in the global namespace, or anywhere if the underscore
> is followed by an uppercase letter). One shouldn't use them.
> Although there is no real risk of a problem here, I think it would be
> cleaner.

Fine by me.
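
For reference, the change in point 1 would look roughly like this. The class and member names are made up for illustration and are not taken from the libxml++ sources.

```cpp
#include <string>
#include <utility>

// Hypothetical class, only to illustrate the naming convention.
class Element
{
public:
  explicit Element(std::string name) : name_(std::move(name)) {}

  const std::string& get_name() const { return name_; }

private:
  // Before: std::string _name;   (leading underscore)
  std::string name_;              // After: trailing underscore
};
```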

> 2 - wrap xmlIO.
> 
> target version : 1.0
> 
> The xmlIO interface allows the creation of our own input/output
> buffers. Wrapping them is an elegant and efficient way to avoid
> needless copies of potentially big strings.
> 
> Think about how to send a document to a stream. Currently we 
> have to do :
> 
> std::ostream & output = std::cout; // could be any ostream of course
> std::string tmp = document.write_to_string();
> output << tmp;
> 
> In the above code, the entire document is written to a buffer by
> libxml, then copied to a std::string by libxml++, which is finally
> returned by write_to_string(). Even with a COW (copy-on-write)
> implementation of std::string, we'll need twice as much memory as the
> size of the document. With a non-COW implementation it is even worse:
> it may be copied 3 or 4 times.
> 
> I wrote a small wrapper for xmlOutputBuffer and implemented a
> Document::write_to_stream() function. The previous code becomes:
> 
> std::ostream & output = std::cout; // std::cout is still just an example
> document.write_to_stream(output);
> 
> The advantage is much more than just writing 1 line instead of 2. The
> entire serialized document is never in memory: libxml writes to the
> buffer in small pieces, which are immediately sent to the stream by
> the wrapper. A patch demonstrating this is on the patch manager if you
> want to experiment with it. The wrapper allows the user to very easily
> define their own OutputBuffer. I modified the dom_build example to
> test it, and it works pretty well.
> 
> Another possibility is to wrap xmlInputBuffer. Although we can (and
> did) implement parse_stream without it, it would let us implement
> xmlTextReader.getRemainder() in an elegant way (cf. 3).

Here is a link for others:
http://www.xmlsoft.org/xmlio.html

I'll look at the patch itself.

> *******************************************************************************
> 3 - wrap xmlTextReader
> 
> target version : 1.0 ?
> 
> First, some references if you want to know more about what I'm
> talking about:
> * libxml2 xmlTextReader implementation :
> http://xmlsoft.org/xmlreader.html
> * C# xmlTextReader interface :
> http://dotgnu.org/pnetlib-doc/System/Xml/XmlTextReader.html
> 
> I know this interface is not part of the XML specification, which is
> one argument against implementing it.
> However, I think it is worth it: it will answer some needs that SAX
> and DOM do not satisfy for many people, and I bet some new users may
> get interested in libxml++ if we implement such a thing.
> 
> I think we can give it an API very close to the C# one, 
> thanks to the xmlIO 
> wrappers.

For this and the xmlIO thing, please be very careful about giving us aims
that cannot be achieved quickly. API stability is also very useful, and we
can do difficult things in a later version if necessary. In my opinion, we
should have frozen already (though it's lucky that we got the namespace
support recently). I am only waiting for a first release of glibmm 2.3.0, so
that we can start libxml++ 1.1 at the same time.

> *******************************************************************************
> 4 - wrap xmlTextWriter
> 
> target version : it's too early to know
> 
> This interface is far less advanced than xmlTextReader. I don't think
> it's time to think seriously about it, but it's a logical step after
> xmlTextReader.
> An idea to keep for the future?
> 
> *******************************************************************************
> 5 - use a string type which handles UTF-8
> 
> target version : 1.2
> 
> This point has been discussed in the past. I will just sum up the
> state of the discussions at this time.
> The main debate was: do we impose a specific class, or do we turn
> libxml++ into a templated library to let the user choose which class
> he wants?
> This debate ended with a vote for/against templates, with a quite
> balanced result.
> 
> We do, however, have an alternative: explicit instantiation. This
> would consist of implementing the lib with templates, but not
> including the implementations in the headers.
> Instead, we would explicitly instantiate the template classes in the
> dynamic lib with a chosen string type (very probably Glib::ustring).
> Programs using this default string type wouldn't need to be recompiled
> at each minor release (the need to recompile being the main argument
> against templates).
> At the same time, users who want to use another string type (QString
> for example, or even std::string or char *) could still do it, at the
> price of recompiling their application at each release of libxml++,
> even if the API doesn't change.
> 
> - Is this solution acceptable for you ?
> - Is there any issue about LGPL with template libraries ?

No, I think this is ridiculous and doesn't solve any real problem.
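
For anyone not familiar with the technique being proposed, explicit instantiation looks roughly like the sketch below. BasicNode and Node are made-up names, and std::string stands in for Glib::ustring so that the snippet has no dependencies. In the proposal, the first part would sit in a public header and the rest in a source file compiled into the shared library.

```cpp
#include <string>

// --- would live in a public header: declarations only ---
template <typename String>
class BasicNode
{
public:
  explicit BasicNode(const String& name);
  const String& get_name() const;

private:
  String name_;
};

// --- would live in a source file built into the shared library ---
template <typename String>
BasicNode<String>::BasicNode(const String& name) : name_(name) {}

template <typename String>
const String& BasicNode<String>::get_name() const { return name_; }

// Explicit instantiation for the default string type. Code using
// BasicNode<std::string> links against this and never sees the
// member definitions above.
template class BasicNode<std::string>;

// Hypothetical convenience name for the default instantiation.
typedef BasicNode<std::string> Node;
```

Anyone wanting a different string type would have to compile the member definitions themselves, which is where the recompile-on-every-release cost comes from.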
 
> *******************************************************************************
> 6 - Implement node iterators
> 
> target version : ?
> 
> This point was also discussed earlier. We couldn't reach a decision on
> a clean API.
> Since xmlNode has internal pointers to the other nodes of the tree
> (next, prev, children, parent), we could easily implement iterators
> that walk the tree in different ways:
> 
> - children_iterator: explores all the children of a node.
> - depth_first_traversal_iterator: explores all nodes with a
> depth-first algorithm, starting from a node and ending when the whole
> subtree has been explored.
> - breadth_first_traversal_iterator: the same, but breadth first.
> 
> These iterators could be bidirectional. The question is how to define
> the end() element.
> Each of them would have a const version.
> I'll try to put together something more complete than last time.
> Any ideas are welcome.
> 
> 
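
A rough sketch of what the depth-first iterator in point 6 might look like. Node is a minimal stand-in for xmlNode that keeps only the four link pointers, and depth_first_iterator is a made-up name. It is forward-only and uses a null pointer for end(), which sidesteps the open question of how end() should behave for a bidirectional iterator rather than answering it.

```cpp
#include <cstddef>
#include <iterator>

// Minimal stand-in for xmlNode: only the link fields are kept.
struct Node
{
  const char* name;
  Node* parent;
  Node* children;  // first child
  Node* next;      // next sibling
  Node* prev;      // previous sibling
};

// Forward-only, pre-order walk over the subtree rooted at root.
class depth_first_iterator
{
public:
  using iterator_category = std::forward_iterator_tag;
  using value_type = Node;
  using difference_type = std::ptrdiff_t;
  using pointer = Node*;
  using reference = Node&;

  // A default-constructed iterator plays the role of end().
  depth_first_iterator() : root_(nullptr), current_(nullptr) {}
  explicit depth_first_iterator(Node* root) : root_(root), current_(root) {}

  Node& operator*() const { return *current_; }
  Node* operator->() const { return current_; }

  depth_first_iterator& operator++()
  {
    if (current_->children)
    {
      current_ = current_->children;  // go down first
      return *this;
    }
    // Otherwise climb until a next sibling exists, never leaving the subtree.
    while (current_ != root_ && !current_->next)
      current_ = current_->parent;
    current_ = (current_ == root_) ? nullptr : current_->next;
    return *this;
  }

  depth_first_iterator operator++(int)
  {
    depth_first_iterator old(*this);
    ++(*this);
    return old;
  }

  bool operator==(const depth_first_iterator& other) const
  { return current_ == other.current_; }
  bool operator!=(const depth_first_iterator& other) const
  { return current_ != other.current_; }

private:
  Node* root_;
  Node* current_;
};
```

A loop such as `for (depth_first_iterator it(&root); it != depth_first_iterator(); ++it)` then visits a node before its children, and the children in sibling order.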
> *******************************************************************************
> 7 - make a better XPath support
> 
> target version : ?
> 
> I'm not very familiar with XPath. I don't know if the current 
> support we have 
> is enough for common uses. Any feedback on this would be appreciated.

Until someone says that it isn't good enough, I'll assume that it's good
enough.

> *******************************************************************************
> The end.
> 
> If you reached this point, thank you for reading :-)
> 
> I'm looking forward to comments/ideas,

Well done Christophe.

Murray Cumming
murrayc usa net
www.murrayc.com