"we don't know what the data are but we expect to process them cleanly" this sounds very similar to not knowing the encoding of a piece of text,
Similar, maybe. Though my heuristics would have to answer a question with far fewer possibilities: does an input buffer contain references to (pre-)defined entities or not? I still think it's possible, but it's not efficient (at least compared to a developer using the proper call for a given piece of input).
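A minimal sketch of such a heuristic, in plain C without libxml2, assuming only the five entities predefined by XML 1.0 are of interest. The name `has_predefined_entity_ref` is hypothetical. Note what it cannot do: it only reports that something looking like a reference is present, not whether the caller meant it as a reference or as literal text, which is exactly the ambiguity under discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The five entities predefined by XML 1.0. */
static const char *const predefined_refs[] = {
    "&lt;", "&gt;", "&amp;", "&apos;", "&quot;"
};

/* Hypothetical helper: returns 1 if the NUL-terminated buffer contains
 * at least one reference to a predefined entity, 0 otherwise. */
static int has_predefined_entity_ref(const char *buf)
{
    const char *p;
    size_t i;

    assert(buf != NULL);

    /* Visit every '&' and compare what follows against each candidate. */
    for (p = strchr(buf, '&'); p != NULL; p = strchr(p + 1, '&')) {
        for (i = 0; i < sizeof predefined_refs / sizeof predefined_refs[0]; i++) {
            if (strncmp(p, predefined_refs[i], strlen(predefined_refs[i])) == 0)
                return 1;
        }
    }
    return 0;
}
```

The scan walks the whole buffer on every call, whereas a developer who knows what the input is can simply pick the right function up front.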
you may try to apply heuristics, but they will come back to bite you no matter what.
Maybe, but I'll bite back... 8)
In both cases there is only one solution: educate the users/developers.
Well, I'm sure that you're well aware that some (ideal) solutions are hard to realise...
it's perfectly clean to use direct access to the node structure. Just check that the target strings are not from the doc dictionary (in which case don't free them)
How do I check this (OK, I can look that up, but probably you know it OTTOYH)?
and otherwise use xmlMalloc/xmlFree to manipulate the target text nodes. Of course if you use non-predefined entities, then you will have to add entity references to the element children list.
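To illustrate what "adding entity references to the children list" implies, here is a rough sketch in plain C (no libxml2 calls) that splits a content string into text runs and entity references. All names (`segment`, `split_content`, `is_name_char`) are hypothetical, the name check is a simplified ASCII-only approximation of the XML Name production, and character references such as `&#38;` are left in the text runs.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

typedef enum { SEG_TEXT, SEG_ENTITY_REF } seg_kind;

typedef struct {
    seg_kind kind;
    const char *start; /* points into the input; not NUL-terminated */
    size_t len;        /* for SEG_ENTITY_REF: the name without '&' and ';' */
} segment;

/* Simplified, ASCII-only stand-in for the XML NameChar production. */
static int is_name_char(char c)
{
    return isalnum((unsigned char)c) || c == '_' || c == '-' || c == '.' || c == ':';
}

/* Splits content into at most max segments; returns the number written. */
static size_t split_content(const char *content, segment *out, size_t max)
{
    size_t n = 0;
    const char *text = content; /* start of the pending text run */
    const char *p = content;

    assert(content != NULL && out != NULL);

    while (*p != '\0' && n < max) {
        if (*p == '&') {
            const char *q = p + 1;
            while (is_name_char(*q))
                q++;
            if (q > p + 1 && *q == ';') {
                /* Flush the text collected before the reference. */
                if (p > text) {
                    out[n++] = (segment){ SEG_TEXT, text, (size_t)(p - text) };
                    if (n == max)
                        return n;
                }
                out[n++] = (segment){ SEG_ENTITY_REF, p + 1, (size_t)(q - (p + 1)) };
                p = q + 1;
                text = p;
                continue;
            }
        }
        p++;
    }
    /* Flush any trailing text. */
    if (p > text && n < max)
        out[n++] = (segment){ SEG_TEXT, text, (size_t)(p - text) };
    return n;
}
```

With libxml2, each `SEG_TEXT` would presumably become a text node (for instance via `xmlNewTextLen()`) and each `SEG_ENTITY_REF` a reference node (via `xmlNewReference()`), appended in order with `xmlAddChild()`. That mapping is an assumption about how the sketch would be wired up, not something taken from tree.c.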
I'll take xmlNodeSetContent() as a template to implement an "...addContent" with entity support.

Thanks & Ciao,
Markus

P.S.: Attached is a diff of tree.c (against the current CVS head) which would add some documentation for the xmlNode[Set|Add]Content functions.
Attachment: tree.diff.txt