Re: [xml] xml diff and patch support -- new node type




--- Daniel Veillard <veillard redhat com> wrote:

On Mon, Jan 31, 2005 at 04:55:25PM -0800, Mark Vakoc wrote:
Daniel,

I'm midway through implementing what I hope to be a rather flexible diff
(and eventually patch) implementation for libxml2.  Flexible in the sense
that

  Cool, I sent a mail a couple of years ago asking for someone
to implement a diff :-)


Yeah, I've been toying with it for a while but finally got serious about
finishing it.  Got to admit the algorithm is more complex than I would have
thought initially....
 
Couple of general things about it:
* the diff API will only take URIs rather than xmlDocPtr's because it is
destructive to the documents and also requires access to all fields to
store hashes, relative positions, misc items (including _private) for the
source and target document

  Well you make the code so you handle the restrictions, but what about 
an API doing an xmlCopyDoc ? And do you need to modify both documents ?

xmlCopyDoc should be no problem, I'll worry about that later.  Right now it
only modifies the source (left) document, but I'm not done yet :)


* I'm adding an xmlHashMultiTable that is a hash multimap where the sort key
need not be unique.  It is also implemented to allow front and back access
to

  Not sure I understand :-) Let's see the API!


Basically I needed a queue that can quickly look up nodes in it by a hash value
that guarantees I can pull out matching entries in the order they were added to
the queue.  The table may contain multiple matches for the same hash value. 
Here's the API for it so far:

/*
 * Hash Multimap
 */
XMLPUBFUN xmlHashMultiTablePtr XMLCALL
                        xmlHashMultiCreate(int size);
XMLPUBFUN void XMLCALL
                        xmlHashMultiFree(xmlHashMultiTablePtr table, 
                                        xmlHashDeallocator f);
XMLPUBFUN int XMLCALL
                        xmlHashMultiAddEntry(xmlHashMultiTablePtr table, 
                                        unsigned long hash, 
                                        void* userdata);
XMLPUBFUN xmlHashMultiEntryPtr XMLCALL
                        xmlHashMultiLookup(xmlHashMultiTablePtr table, 
                                        unsigned long hash);
XMLPUBFUN void* XMLCALL
                        xmlHashMultiGetData(xmlHashMultiEntryPtr entry);
XMLPUBFUN xmlHashMultiEntryPtr XMLCALL
                        xmlHashMultiNext(xmlHashMultiEntryPtr entry);
XMLPUBFUN int XMLCALL
                        xmlHashMultiSize(xmlHashMultiTablePtr table);
XMLPUBFUN int XMLCALL
                        xmlHashMultiRemove(xmlHashMultiTablePtr table, 
                                        xmlHashMultiEntryPtr entry, 
                                        xmlHashDeallocator f);
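
For readers who want to see the idea in concrete form, here is a minimal
standalone sketch of such a structure: a hash multimap that also keeps a
queue in insertion order, so that entries sharing a hash value come back
oldest first.  None of this is the xmlHashMulti* implementation described
above, which is not shown in this message; the names MultiTable, MultiEntry,
multiCreate, multiAdd, multiLookup, multiNext and multiFree are hypothetical,
and the bucket layout is only one plausible way to do it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins, NOT the libxml2 or xmlHashMulti* types. */
typedef struct MultiEntry {
    unsigned long hash;            /* caller-supplied hash value */
    void *data;                    /* caller-owned payload */
    struct MultiEntry *bucketNext; /* next entry in the same bucket */
    struct MultiEntry *queueNext;  /* next entry in insertion order */
} MultiEntry;

typedef struct {
    MultiEntry **bucketHead;       /* oldest entry of each bucket */
    MultiEntry **bucketTail;       /* newest entry of each bucket */
    MultiEntry *front, *back;      /* ends of the insertion-order queue */
    int nbuckets;
    int count;
} MultiTable;

MultiTable *multiCreate(int nbuckets) {
    MultiTable *t;
    assert(nbuckets > 0);
    t = calloc(1, sizeof(*t));
    if (t == NULL) return NULL;
    t->bucketHead = calloc((size_t)nbuckets, sizeof(MultiEntry *));
    t->bucketTail = calloc((size_t)nbuckets, sizeof(MultiEntry *));
    if (t->bucketHead == NULL || t->bucketTail == NULL) {
        free(t->bucketHead);
        free(t->bucketTail);
        free(t);
        return NULL;
    }
    t->nbuckets = nbuckets;
    return t;
}

/* Append an entry; duplicate hash values are allowed. Returns 0 or -1. */
int multiAdd(MultiTable *t, unsigned long hash, void *data) {
    unsigned long b = hash % (unsigned long)t->nbuckets;
    MultiEntry *e = calloc(1, sizeof(*e));
    if (e == NULL) return -1;
    e->hash = hash;
    e->data = data;
    /* Append to the bucket chain so chain order equals insertion order. */
    if (t->bucketTail[b] != NULL) t->bucketTail[b]->bucketNext = e;
    else t->bucketHead[b] = e;
    t->bucketTail[b] = e;
    /* Append to the global queue. */
    if (t->back != NULL) t->back->queueNext = e;
    else t->front = e;
    t->back = e;
    t->count++;
    return 0;
}

/* Oldest entry with this hash value, or NULL if there is none. */
MultiEntry *multiLookup(MultiTable *t, unsigned long hash) {
    MultiEntry *e = t->bucketHead[hash % (unsigned long)t->nbuckets];
    while (e != NULL && e->hash != hash) e = e->bucketNext;
    return e;
}

/* Next-oldest entry with the same hash value as e, or NULL. */
MultiEntry *multiNext(MultiEntry *e) {
    unsigned long hash = e->hash;
    for (e = e->bucketNext; e != NULL && e->hash != hash; e = e->bucketNext)
        ;
    return e;
}

/* Free the table and its entries; payloads stay owned by the caller. */
void multiFree(MultiTable *t) {
    MultiEntry *e = t->front, *next;
    while (e != NULL) {
        next = e->queueNext;
        free(e);
        e = next;
    }
    free(t->bucketHead);
    free(t->bucketTail);
    free(t);
}
```

A caller would look up the first match with multiLookup() and then walk the
remaining matches with multiNext() until it returns NULL.  Removal is left
out to keep the sketch short; it would need to unlink an entry from both the
bucket chain and the queue.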

  Hum, that is annoying. It's gonna break a lot of stuff. If such nodes are
never exposed afterward, I suggest not adding it to ElementType.

Just a heads up and want to make sure adding a new value to the
xmlElementType enum is ok before I commit to that.  Should have a patch
ready in a week or two.

  I would rather make a #define for the new element type and keep
it from escaping the scope of the xmldiff C module.


Ok, I can keep it entirely within the xmldiff module.  I'll just have to remove
those nodes manually before calling any xmlFreeDoc, no problem.  When I'm done
I may be able to get away with just storing the info in existing node fields,
though those are filling up.  So far node->_private stores a hash
value of the subtree rooted at that node, node->extra is used for bitwise
flags, and node->line is used for relative position of a child.
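
To make the bookkeeping above concrete, here is a small standalone sketch of
computing a subtree hash bottom-up and caching it on each node, along with
each child's relative position.  It is only an illustration under stated
assumptions: DiffNode is a hypothetical stand-in for xmlNode (its hash, flags
and pos fields play the roles of _private, extra and line), and FNV-1a is my
choice of hash function, since the message does not say which one the diff
code uses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for xmlNode; NOT the real libxml2 struct. */
typedef struct DiffNode {
    const char *name;
    const char *content;        /* may be NULL */
    struct DiffNode *children;  /* first child */
    struct DiffNode *next;      /* next sibling */
    unsigned long hash;         /* role of node->_private: subtree hash */
    unsigned short flags;       /* role of node->extra: bitwise flags */
    unsigned short pos;         /* role of node->line: index among siblings */
} DiffNode;

/* FNV-1a constants (32-bit variant); an assumption, see the lead-in. */
#define FNV_OFFSET 2166136261UL
#define FNV_PRIME  16777619UL

/* Fold the bytes of s into h; a NULL string leaves h unchanged. */
static unsigned long mixString(unsigned long h, const char *s) {
    if (s == NULL) return h;
    for (; *s != '\0'; s++)
        h = (h ^ (unsigned char)*s) * FNV_PRIME;
    return h;
}

/*
 * Hash the subtree rooted at node, children in document order, and
 * cache the result on every node visited. Also records each child's
 * position among its siblings.
 */
unsigned long hashSubtree(DiffNode *node) {
    unsigned long h = FNV_OFFSET;
    unsigned short pos = 0;
    DiffNode *child;

    assert(node != NULL);
    h = mixString(h, node->name);
    h = (h ^ 0xffUL) * FNV_PRIME;   /* separator between name and content */
    h = mixString(h, node->content);
    for (child = node->children; child != NULL; child = child->next) {
        child->pos = pos++;
        h = (h ^ hashSubtree(child)) * FNV_PRIME;
    }
    node->hash = h;
    return h;
}
```

With hashes cached this way, two subtrees whose hashes differ cannot be
identical, so a diff can skip comparing them node by node; equal hashes
still need a real comparison if collisions matter.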


