Re: [xml] GSoC using libxml for indexing XML document
- From: Daniel Veillard <veillard redhat com>
- To: Tomáš Pospíšil <killteck seznam cz>
- Cc: xml gnome org
- Subject: Re: [xml] GSoC using libxml for indexing XML document
- Date: Sun, 5 Jun 2011 11:19:53 +0800
On Fri, Jun 03, 2011 at 05:34:03PM +0200, Tomáš Pospíšil wrote:
> Hi Daniel and all hackers,
> I'm a GSoC student creating a new XML index in PostgreSQL, which uses libxml for handling XML documents. My idea
> for the index is to use node offsets and a Patricia trie for mapping structural information to our
> internal representation of the tree index. So how can I get the offsets of nodes?
> P.S. I already searched the list history and understand that libxml was not designed for this kind of XML
> handling, but my use scenario is quite different from typical usage. Daniel suggested xmlByteConsumed and
> xmlTextReaderByteConsumed, but I see that they are not exactly what I need.

Well, xmlByteConsumed is, to some extent, about relating nodes to offsets.
Of course things are often more complex, because:
- not all XML documents are made of one continuous stream (an entity, in
XML speak), e.g. when entities or XInclude are used.
- libxml2 may convert the encoding on the fly, and the encoder needs
to work on batches of data to provide adequate performance, so at
the parser level it's usually very hard to get a precise offset
into the source.
If it doesn't do what you want, well, you didn't define precisely
what you wanted either: "offset of nodes" can be interpreted in many
ways ...
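
To illustrate the encoding point, here is a minimal sketch in plain Python. It does not use libxml2, and the sample document and the `offsets` helper are made up for this example. It assumes a source stored as ISO-8859-1 and compares the byte position of a tag in that source with its position once the same text is held as UTF-8, the encoding libxml2 uses internally.

```python
# Hypothetical sample document; the accented letters take one byte each
# in ISO-8859-1 and two bytes each in UTF-8.
TEXT = "<doc><t>caf\u00e9 cr\u00e8me</t><n/></doc>"

SOURCE_BYTES = TEXT.encode("iso-8859-1")  # what the file on disk would contain
UTF8_BYTES = TEXT.encode("utf-8")         # the same characters after conversion


def offsets(tag):
    """Return the position of `tag` under three readings of "offset".

    `tag` must be plain ASCII so that it encodes identically everywhere.
    """
    needle = tag.encode("ascii")
    return {
        "source_bytes": SOURCE_BYTES.index(needle),  # byte offset in the file
        "utf8_bytes": UTF8_BYTES.index(needle),      # byte offset after conversion
        "characters": TEXT.index(tag),               # character offset
    }


if __name__ == "__main__":
    for tag in ("<t>", "<n/>"):
        print(tag, offsets(tag))
```

In this sample, `<t>` sits at the same position in all three readings, while `<n/>` is two bytes further along in the UTF-8 form, one byte for each accented character that precedes it. Which of these numbers counts as the offset of the node is the part that would need to be pinned down.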
Daniel
--
Daniel Veillard | libxml Gnome XML XSLT toolkit http://xmlsoft.org/
daniel veillard com | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library http://libvirt.org/