Re: [xml] GSoC using libxml for indexing XML document



On Sun, 2011-06-05 at 11:19 +0800, Daniel Veillard wrote:
> On Fri, Jun 03, 2011 at 05:34:03PM +0200, Tomáš Pospíšil wrote:
> > Hi Daniel and all hackers,
> >
> > I'm a GSoC student creating a new XML index in PostgreSQL that uses
> > LibXML for handling XML documents. My idea for the index is to use
> > node offsets and a Patricia trie to map structural information to
> > our internal tree index representation. So how can I get the offsets
> > of nodes?

Before you get too far, you might want to look at xqilla and dbxml.
Presumably your goal will be to support XQuery, and you'll want the
parser and XPath 2 engine for that (unless PostgreSQL already has XQuery
support?).  You might still continue on your current course, but take a
look, if you haven't already.

Patricia trees can be fast (PAT, from a University of Waterloo spin-off
called OpenText, used them years ago), but they are difficult to update.
If you plan to support XQuery Update, e.g. to replace a single text node
in a 500 MByte XML document, they might not be the best choice.
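
To make the update problem concrete, here is a rough sketch. It does not
use libxml at all: it relies on the expat binding in Python's standard
library, only because it exposes byte offsets with little code. The
function name element_offsets and the two tiny sample documents are made
up for illustration. The point is that replacing one text node shifts the
byte offset of every node after it, so any index keyed on offsets
(Patricia trie or otherwise) has to touch all of those entries.

```python
import xml.parsers.expat


def element_offsets(xml_bytes):
    """Return [(path, byte_offset)] for each start tag, in document order."""
    offsets = []
    stack = []
    parser = xml.parsers.expat.ParserCreate()

    def start(name, attrs):
        stack.append(name)
        # CurrentByteIndex is the parser's position for the current event.
        offsets.append(("/" + "/".join(stack), parser.CurrentByteIndex))

    def end(name):
        stack.pop()

    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.Parse(xml_bytes, True)
    return offsets


# Hypothetical documents: only the text inside <a> differs.
before = b"<doc><a>old</a><b>x</b><c>y</c></doc>"
after = b"<doc><a>much longer text</a><b>x</b><c>y</c></doc>"

index_before = element_offsets(before)
index_after = element_offsets(after)

# Paths whose stored offset is no longer valid after the edit.
stale = [
    path
    for (path, old), (_, new) in zip(index_before, index_after)
    if old != new
]
print(stale)  # → ['/doc/b', '/doc/c']
```

In a two-node toy that is nothing, but in a 500 MByte document an edit
near the start would invalidate nearly every stored offset.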

Liam


-- 
Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/
Pictures from old books: http://fromoldbooks.org/



