[libxml++] wrapper for the xmlreader interface



So, I was chatting with Murray on IRC about wanting namespace awareness in
libxml++.  He challenged me to do some hacking on it myself, so I had a
go this afternoon.  Daniel Veillard indicated that libxml2 is never
likely to get SAX2, and that for namespaces he'd encourage the use of
the XmlTextReader interface.

Please be aware that my C++ skills are definitely in the "apprentice"
category, and I probably indented incorrectly to someone's great
irritation.

That said, here's a preliminary wrapper and example program for the
XmlTextReader stuff from libxml2.  It's a diff against today's CVS HEAD
plus these new files I added:

        examples/reader/Makefile.am
        examples/reader/main.cc
        examples/reader/example.xml
        libxml++/parsers/reader.cc
        libxml++/parsers/reader.h

It's only partially complete: in particular, only the constructor that
takes a filename is done.

The biggest trouble so far is that the underlying XmlTextReader stuff
handles allocation and freeing of things differently than the normal
parsers.  This means that the construct() and destruct() functions just
don't work.  I disabled them entirely for the purposes of this wrapper,
but it needs figuring out properly -- a bit beyond the scope of one
evening's hacking for me, unfamiliar as I am with the magic of libxmlpp
internals.  However, this disabling meant I couldn't implement the two
methods which allow sniffing of the current Node and Document. 
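
To make the "disable, then put back" idea concrete, here's a minimal sketch of the pattern on its own. It's only an illustration under stated assumptions: NodeHook, set_node_hook(), current_node_hook(), sample_hook() and ScopedHookDisable are made-up names standing in for libxml2's node registration hook and xmlRegisterNodeDefault(), so that the snippet builds without libxml2. It isn't what the wrapper should finally do about construct/destruct.

```cpp
// Stand-in for libxml2's node registration callback type (assumption).
typedef void (*NodeHook)(void* node);

// Stand-in for the process-wide hook that libxml2 keeps internally.
static NodeHook g_hook = nullptr;

// Like xmlRegisterNodeDefault(): installs a new hook and returns the old one.
NodeHook set_node_hook(NodeHook hook) {
  NodeHook old = g_hook;
  g_hook = hook;
  return old;
}

// Returns whichever hook is currently installed.
NodeHook current_node_hook() { return g_hook; }

// A do-nothing hook, standing in for libxml++'s construct().
void sample_hook(void*) {}

// Clears the hook for as long as the object lives and restores the
// previous value on destruction.
class ScopedHookDisable {
public:
  ScopedHookDisable() : old_(set_node_hook(nullptr)) {}
  ~ScopedHookDisable() { set_node_hook(old_); }

  // Copying would restore the hook twice, so forbid it.
  ScopedHookDisable(const ScopedHookDisable&) = delete;
  ScopedHookDisable& operator=(const ScopedHookDisable&) = delete;

private:
  NodeHook old_;
};
```

Note that the hook is global, so while such an object is alive every parser in the process runs without it, not just the reader.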

The other thing to note is that every time you get an xmlChar* back from
a method, it's your responsibility to free it with xmlFree().
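
One possible way to take that burden off the caller is to hand the buffer to a scoped owner straight away and copy it into a std::string. What follows is a rough sketch, not part of the patch: XmlCh, XmlChDeleter, XmlChPtr, fake_accessor() and take_string() are hypothetical names, and std::free() stands in for xmlFree() so the snippet compiles without libxml2.

```cpp
#include <cstdlib>
#include <cstring>
#include <memory>
#include <string>

// Stand-in for libxml2's xmlChar, which is a typedef for unsigned char.
typedef unsigned char XmlCh;

// Counts released buffers so the behaviour can be checked.
static int g_frees = 0;

// Deleter for buffers handed back by the reader.  Here it calls
// std::free(); against libxml2 it would call xmlFree() instead (assumption).
struct XmlChDeleter {
  void operator()(XmlCh* p) const {
    std::free(p);
    ++g_frees;
  }
};

// unique_ptr never invokes the deleter on a null pointer.
typedef std::unique_ptr<XmlCh, XmlChDeleter> XmlChPtr;

// Hypothetical stand-in for an accessor such as name(): returns a
// heap-allocated copy that the caller owns.
XmlCh* fake_accessor(const char* text) {
  std::size_t n = std::strlen(text) + 1;
  XmlCh* buf = static_cast<XmlCh*>(std::malloc(n));
  if (buf) std::memcpy(buf, text, n);
  return buf;
}

// Takes ownership of raw, copies it into a std::string, and releases the
// buffer when owner goes out of scope.  NULL becomes an empty string.
std::string take_string(XmlCh* raw) {
  XmlChPtr owner(raw);
  if (!owner) return std::string();
  return std::string(reinterpret_cast<const char*>(owner.get()));
}
```

With something like this, a call such as take_string(r->name()) would need no matching free at the call site, at the cost of one extra copy per string.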

I hope that this is a good enough start for someone with more clues than
I to improve.  Alternatively, I'd be happy to receive a few pointers
from folk on how best to make the wrapper more C++ idiomatic, and in
particular solve the construct/destruct issues.

The *good news* is that this stuff works for parsing namespace-enabled
XML documents.

cheers

-- Edd


? examples/reader
? libxml++/parsers/reader.cc
? libxml++/parsers/reader.h
Index: configure.in
===================================================================
RCS file: /cvsroot/libxmlplusplus/libxml++/configure.in,v
retrieving revision 1.19
diff -u -r1.19 configure.in
--- configure.in	7 Feb 2003 10:41:25 -0000	1.19
+++ configure.in	11 Feb 2003 23:35:40 -0000
@@ -60,6 +60,7 @@
     examples/dom/Makefile
     examples/sax_parser/Makefile
     examples/sax_exception/Makefile
+    examples/reader/Makefile
 
   xml++-config
   libxml++-1.0.pc
Index: examples/Makefile.am
===================================================================
RCS file: /cvsroot/libxmlplusplus/libxml++/examples/Makefile.am,v
retrieving revision 1.5
diff -u -r1.5 Makefile.am
--- examples/Makefile.am	4 Feb 2003 05:46:40 -0000	1.5
+++ examples/Makefile.am	11 Feb 2003 23:35:40 -0000
@@ -1,3 +1,3 @@
-SUBDIRS = dom_build dom_parser dom sax_parser sax_exception
+SUBDIRS = dom_build dom_parser dom sax_parser sax_exception reader
 
 EXTRA_DIST = README Makefile.am_fragment
Index: libxml++/libxml++.h
===================================================================
RCS file: /cvsroot/libxmlplusplus/libxml++/libxml++/libxml++.h,v
retrieving revision 1.11
diff -u -r1.11 libxml++.h
--- libxml++/libxml++.h	3 Feb 2003 18:47:27 -0000	1.11
+++ libxml++/libxml++.h	11 Feb 2003 23:35:41 -0000
@@ -11,6 +11,7 @@
 #include <libxml++/exceptions/parse_error.h>
 #include <libxml++/parsers/domparser.h>
 #include <libxml++/parsers/saxparser.h>
+#include <libxml++/parsers/reader.h>
 #include <libxml++/nodes/node.h>
 #include <libxml++/nodes/commentnode.h>
 #include <libxml++/nodes/element.h>
Index: libxml++/parsers/Makefile.am
===================================================================
RCS file: /cvsroot/libxmlplusplus/libxml++/libxml++/parsers/Makefile.am,v
retrieving revision 1.1
diff -u -r1.1 Makefile.am
--- libxml++/parsers/Makefile.am	12 Nov 2002 13:00:59 -0000	1.1
+++ libxml++/parsers/Makefile.am	11 Feb 2003 23:35:41 -0000
@@ -1,7 +1,7 @@
 INCLUDES = -I$(top_srcdir) @LIBXML_CFLAGS@
 
-h_sources_public = parser.h saxparser.h domparser.h
-cc_sources = parser.cc saxparser.cc domparser.cc
+h_sources_public = parser.h saxparser.h domparser.h reader.h
+cc_sources = parser.cc saxparser.cc domparser.cc reader.cc
 
 
 noinst_LTLIBRARIES = libparsers.la
@@ -9,4 +9,4 @@
 
 # Install the headers:
 library_includedir=$(includedir)/libxml++-1.0/libxml++/parsers
-library_include_HEADERS = $(h_sources_public)
\ No newline at end of file
+library_include_HEADERS = $(h_sources_public)
include $(top_srcdir)/examples/Makefile.am_fragment

#Build the executable, but don't install it.
noinst_PROGRAMS = example

#List of source files needed to build the executable:
example_SOURCES = main.cc

EXTRA_DIST = example.xml
<?xml version="1.0"?>
<gjob:Helping xmlns:gjob="http://www.gnome.org/some-location">
  <gjob:Jobs>

    <gjob:Job>
      <gjob:Project ID="3"/>
      <gjob:Application>GBackup</gjob:Application>
      <gjob:Category>Development</gjob:Category>

      <gjob:Update>
	<gjob:Status>Open</gjob:Status>
	<gjob:Modified>Mon, 07 Jun 1999 20:27:45 -0400 MET DST</gjob:Modified>
        <gjob:Salary>USD 0.00</gjob:Salary>
      </gjob:Update>

      <gjob:Developers>
        <gjob:Developer>
        </gjob:Developer>
      </gjob:Developers>

      <gjob:Contact>
        <gjob:Person>Nathan Clemons</gjob:Person>
	<gjob:Email>nathan windsofstorm net</gjob:Email>
        <gjob:Company>
	</gjob:Company>
        <gjob:Organisation>
	</gjob:Organisation>
        <gjob:Webpage>
	</gjob:Webpage>
	<gjob:Snailmail>
	</gjob:Snailmail>
	<gjob:Phone>
	</gjob:Phone>
      </gjob:Contact>

      <gjob:Requirements>
      The program should be released as free software, under the GPL.
      </gjob:Requirements>

      <gjob:Skills>
      </gjob:Skills>

      <gjob:Details>
      A GNOME based system that will allow a superuser to configure
      compressed and uncompressed files and/or file systems to be backed
      up with a supported media in the system.  This should be able to
      perform via find commands generating a list of files that are passed
      to tar, dd, cpio, cp, gzip, etc., to be directed to the tape machine
      or via operations performed on the filesystem itself. Email
      notification and GUI status display very important.
      </gjob:Details>

    </gjob:Job>

  </gjob:Jobs>
</gjob:Helping>

#ifdef HAVE_CONFIG_H
#include <config.h>
#endif

#include <libxml++/libxml++.h>

#include <iostream>

using namespace std;

void
processNode(xmlpp::XmlTextReader *r)
{
  xmlChar *n;

  n=r->name();
  cout << r->depth() << " " << r->nodeType() << " ";
  cout << n << " " << r->isEmptyElement() << "\n";
  xmlFree(n);

  switch (r->nodeType()) {
  case xmlpp::XmlTextReader::TEXT:
	{
	  xmlChar *v=r->value();
	  cout << "[[\n" << v << "\n]]\n";
	  xmlFree(v);
	}
  break;
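  case xmlpp::XmlTextReader::START_ELEMENT:
	{
	  // namespaceUri() is NULL for an element in no namespace, and
	  // streaming a NULL pointer is undefined, so guard both values.
	  xmlChar *p=r->namespaceUri();
	  xmlChar *l=r->localName();
	  cout << "{ " << (p ? (const char*)p : "") << " : "
	       << (l ? (const char*)l : "") << " }\n";
	  if (p) xmlFree(p);
	  if (l) xmlFree(l);
	}
  break;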
  default:
	break;
  }
}

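int
main(int argc, char* argv[])
{
  int ret;

  try {
	// The constructor throws if the file can't be opened.
	xmlpp::XmlTextReader r("example.xml");

	// read() returns 1 for each node, 0 at the end of the
	// document and -1 on error.
	ret = r.read();
	while (ret == 1) {
	  processNode(&r);
	  ret = r.read();
	}
  } catch (const std::exception& e) {
	cerr << "Exception caught: " << e.what() << "\n";
	return 1;
  }

  if (ret != 0) {
	cerr << "Failed to parse example.xml" << "\n";
	return 1;
  }

  cout << "All done\n";
  return 0;
}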
/* reader.h
 * libxml++ and this file are copyright (C) 2003 by Edd Dumbill, and
 * are covered by the GNU Lesser General Public License, which should be
 * included with libxml++ as the file COPYING.
 */

#ifndef __LIBXMLPP_PARSERS_READER_H
#define __LIBXMLPP_PARSERS_READER_H

#include <libxml/xmlreader.h>
#include "libxml++/parsers/parser.h"
#include <list>
#include <map>

#include <iostream>

namespace xmlpp {



/** XmlTextReader
 */
class XmlTextReader
{
public:
  typedef enum {
	START_ELEMENT = 1,
	ATTRIBUTE = 2,
	TEXT = 3,
	CDATA = 4,
	ENTITY_REFERENCE = 5,
	ENTITY_DECLARATION = 6,
	PI = 7,
	COMMENT = 8,
	DOCUMENT = 9,
	DOCTYPE = 10,
	DOCUMENT_FRAGMENT = 11,
	NOTATION = 12,
	END_ELEMENT = 15
  } NodeType;

  XmlTextReader(std::istream& in, const char* URI);
  XmlTextReader(const char* URI);
  virtual ~XmlTextReader();

  // iterators
  int read();
  xmlChar* readInnerXml();
  xmlChar* readOuterXml();
  xmlChar* readString();
  int readAttributeValue();

  // accessors
  int attributeCount();
  xmlChar* baseUri();
  int depth();
  int hasAttributes();
  int hasValue();
  int isDefault();
  int isEmptyElement();
  xmlChar* localName();
  xmlChar* name();
  xmlChar* namespaceUri();
  int nodeType();
  xmlChar* prefix();
  int quoteChar();
  xmlChar* value();
  xmlChar* xmlLang();
  int readState();

  // methods
  int close();
  xmlChar* getAttributeNo(int no);
  xmlChar* getAttribute(const xmlChar* name);
  xmlChar* getAttributeNs(const xmlChar* localName,
						  const xmlChar* namespaceURI);

  // xmlTextReaderGetRemainder -- TODO

  xmlChar* lookupNamespace(const xmlChar* prefix);
  int moveToAttributeNo(int no);
  int moveToAttribute(const xmlChar* name);
  int moveToAttributeNs(const xmlChar* localName,
						const xmlChar* namespaceURI);
  int moveToFirstAttribute();
  int moveToNextAttribute();
  int moveToElement();
  int normalization();

  // extensions

  int setParserProp(int prop, int value);
  int getParserProp(int prop);

  // TODO:
  // xmlNodePtr currentNode
  // xmlDocPtr currentDoc
  
protected:
private:
  void handleException(const exception& e);
  void _init();

  xmlTextReaderPtr _reader;
  xmlRegisterNodeFunc _old_reg;
  xmlDeregisterNodeFunc _old_dereg;
  exception* _exception;
};


} // namespace xmlpp

#endif //__LIBXMLPP_PARSERS_READER_H



/* reader.cc
 * libxml++ and this file are copyright (C) 2003 by Edd Dumbill, and
 * are covered by the GNU Lesser General Public License, which should be
 * included with libxml++ as the file COPYING.
 */

#include "libxml++/parsers/reader.h"

namespace xmlpp {

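  XmlTextReader::XmlTextReader(std::istream& in, const char* URI)
  {
	// TODO: reading from a stream isn't implemented yet.  Leave
	// _reader NULL so the destructor doesn't free a garbage pointer.
	_init();
	_reader = NULL;
  }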

  XmlTextReader::XmlTextReader(const char* URI)
  {
	_init();
	_reader = xmlNewTextReaderFilename(URI);
	if (!_reader)
	  throw internal_error("Couldn't create reader (file not found?)");
  }

  XmlTextReader::~XmlTextReader()
  {
	if (_reader != NULL)
	  xmlFreeTextReader(_reader);
	if (_old_reg != NULL)
	  xmlRegisterNodeDefault(_old_reg);
	if (_old_dereg != NULL)
	  xmlDeregisterNodeDefault(_old_dereg);
  }

  void
  XmlTextReader::_init()
  {
	// the reader interface has different semantics
	// for nodes so for now we'll avoid registering
	// nodes
	_old_reg=xmlRegisterNodeDefault(NULL);
	_old_dereg=xmlDeregisterNodeDefault(NULL);
  }

  // iterators

  int
  XmlTextReader::read()
  {
	// TODO: xmlTextReaderRead returns 1 if a node was read, 0 at
	// the end of the document and -1 on error: in the error case
	// we probably should generate some sort of parse error exception
	return xmlTextReaderRead(_reader);
  }

  xmlChar*
  XmlTextReader::readInnerXml()
  {
	return xmlTextReaderReadInnerXml(_reader);
  }

  xmlChar*
  XmlTextReader::readOuterXml()
  {
	return xmlTextReaderReadOuterXml(_reader);
  }

  xmlChar*
  XmlTextReader::readString()
  {
	return xmlTextReaderReadString(_reader);
  }

  int
  XmlTextReader::readAttributeValue()
  {
	return xmlTextReaderReadAttributeValue(_reader);
  }

  // accessors

  int
  XmlTextReader::attributeCount()
  {
	return xmlTextReaderAttributeCount(_reader);
  }

  xmlChar*
  XmlTextReader::baseUri()
  {
	return xmlTextReaderBaseUri(_reader);
  }

  int
  XmlTextReader::depth()
  {
	return xmlTextReaderDepth(_reader);
  }

  int
  XmlTextReader::hasAttributes()
  {
	return xmlTextReaderHasAttributes(_reader);
  }

  int
  XmlTextReader::hasValue()
  {
	return xmlTextReaderHasValue(_reader);
  }

  int
  XmlTextReader::isDefault()
  {
	return xmlTextReaderIsDefault(_reader);
  }

  int
  XmlTextReader::isEmptyElement()
  {
	return xmlTextReaderIsEmptyElement(_reader);
  }

  xmlChar*
  XmlTextReader::localName()
  {
	return xmlTextReaderLocalName(_reader);
  }

  xmlChar*
  XmlTextReader::name()
  {
	return xmlTextReaderName(_reader);
  }

  xmlChar*
  XmlTextReader::namespaceUri()
  {
	return xmlTextReaderNamespaceUri(_reader);
  }

  int
  XmlTextReader::nodeType()
  {
	return xmlTextReaderNodeType(_reader);
  }

  xmlChar*
  XmlTextReader::prefix()
  {
	return xmlTextReaderPrefix(_reader);
  }

  int
  XmlTextReader::quoteChar()
  {
	return xmlTextReaderQuoteChar(_reader);
  }

  xmlChar*
  XmlTextReader::value()
  {
	return xmlTextReaderValue(_reader);
  }

  xmlChar*
  XmlTextReader::xmlLang()
  {
	return xmlTextReaderXmlLang(_reader);
  }

  int
  XmlTextReader::readState()
  {
	return xmlTextReaderReadState(_reader);
  }

  // methods

  int
  XmlTextReader::close()
  {
	return xmlTextReaderClose(_reader);
  }

  xmlChar*
  XmlTextReader::getAttributeNo(int no)
  {
	return xmlTextReaderGetAttributeNo(_reader, no);
  }

  xmlChar*
  XmlTextReader::getAttribute(const xmlChar* name)
  {
	return xmlTextReaderGetAttribute(_reader, name);
  }

  xmlChar*
  XmlTextReader::getAttributeNs(const xmlChar* localName,
								const xmlChar* namespaceURI)
  {
	return xmlTextReaderGetAttributeNs(_reader, localName, namespaceURI);
  }

  // xmlTextReaderGetRemainder -- TODO

  xmlChar*
  XmlTextReader::lookupNamespace(const xmlChar* prefix)
  {
	return xmlTextReaderLookupNamespace(_reader, prefix);
  }

  int
  XmlTextReader::moveToAttributeNo(int no)
  {
	return xmlTextReaderMoveToAttributeNo(_reader, no);
  }

  int
  XmlTextReader::moveToAttribute(const xmlChar* name)
  {
	return xmlTextReaderMoveToAttribute(_reader, name);
  }

  int
  XmlTextReader::moveToAttributeNs(const xmlChar* localName,
								   const xmlChar* namespaceURI)
  {
	return xmlTextReaderMoveToAttributeNs(_reader, localName, namespaceURI);
  }

  int
  XmlTextReader::moveToFirstAttribute()
  {
	return xmlTextReaderMoveToFirstAttribute(_reader);
  }

  int
  XmlTextReader::moveToNextAttribute()
  {
	return xmlTextReaderMoveToNextAttribute(_reader);
  }

  int
  XmlTextReader::moveToElement()
  {
	return xmlTextReaderMoveToElement(_reader);
  }

  int
  XmlTextReader::normalization()
  {
	return xmlTextReaderNormalization(_reader);
  }

  // extensions

  int
  XmlTextReader::setParserProp(int prop, int value)
  {
	return xmlTextReaderSetParserProp(_reader, prop, value);
  }

  int
  XmlTextReader::getParserProp(int prop)
  {
	return xmlTextReaderGetParserProp(_reader, prop);
  }

  // TODO:
  // xmlNodePtr currentNode
  // xmlDocPtr currentDoc
  // -- as we're not currently registering/deregistering
  // nodes I can't implement these just now.

} // namespace xmlpp

