Re: [xml] functions which never return



On Fri, Nov 26, 2004 at 12:11:28PM +0100, Olivier Sirven wrote:
Hi,

I am using libxml2 to parse RSS, RDF and Atom content and I have found 2 
functions which, in some cases, never return:
- xmlSAXParseMemory
I call the function this way:
xmlSAXParseMemory(NULL, data, dataLength, 1);
If data contains invalid data (like invalid XHTML) the call will never 
return... The only way I've found to manage that issue is to call the function 
this way: xmlSAXParseMemory(NULL, data, dataLength, 0); but I wonder if it 
won't be too strict while parsing data?

  The recovery flag should *NOT* be used in normal operations.
I am actually tempted not to fix this bug, to prevent people from doing this.
If the RSS streams are broken, they are broken, and XML says the data should
be dropped on the floor as soon as the problem is found. People MUST fix
their streams; workarounds should not be attempted at the aggregator level
without human intervention. WONTFIX.

- xmlNodeDump
When the node given to xmlNodeDump contains some invalidly encoded strings, 
xmlNodeDump will never return. I've managed this issue by scanning the node 
and its children for invalid strings and correcting them.

  How did that string make it into the document? If it is the result of parsing
then this must be debugged and fixed. If this is the result of using the 
API to add strings to an existing document, you must ensure that they are
correct UTF-8 strings before passing them to the API as xmlChar *.
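
As an illustration of the check described above, here is a minimal sketch of a
UTF-8 well-formedness test in plain C. The name is_valid_utf8 is hypothetical
and not part of libxml2; libxml2 itself offers xmlCheckUTF8() in
<libxml/xmlstring.h> for a similar purpose. The sketch only checks the
encoding, not whether every code point is a legal XML character.

```c
#include <stddef.h>

/* Hypothetical helper, not part of libxml2: returns 1 if the
 * NUL-terminated string s is well-formed UTF-8, 0 otherwise. */
static int is_valid_utf8(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    while (*p != 0) {
        unsigned int cp;   /* code point decoded so far */
        unsigned int min;  /* smallest code point allowed for this length */
        int extra;         /* continuation bytes still expected */

        if (*p < 0x80) {                       /* plain ASCII byte */
            p++;
            continue;
        } else if ((*p & 0xE0) == 0xC0) {      /* 2-byte sequence */
            cp = *p & 0x1F; extra = 1; min = 0x80;
        } else if ((*p & 0xF0) == 0xE0) {      /* 3-byte sequence */
            cp = *p & 0x0F; extra = 2; min = 0x800;
        } else if ((*p & 0xF8) == 0xF0) {      /* 4-byte sequence */
            cp = *p & 0x07; extra = 3; min = 0x10000;
        } else {
            return 0;  /* stray continuation byte or invalid lead byte */
        }
        p++;

        while (extra-- > 0) {
            /* A terminating NUL also fails this test, so a truncated
             * sequence is rejected without reading past the string. */
            if ((*p & 0xC0) != 0x80)
                return 0;
            cp = (cp << 6) | (*p & 0x3F);
            p++;
        }

        if (cp < min)                          /* overlong encoding */
            return 0;
        if (cp >= 0xD800 && cp <= 0xDFFF)      /* UTF-16 surrogate */
            return 0;
        if (cp > 0x10FFFF)                     /* beyond Unicode range */
            return 0;
    }
    return 1;
}
```

A caller would run such a check on each string before casting it to xmlChar *
and handing it to the tree API, and reject or re-encode the string when the
check fails.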

Can one imagine a future release of libxml having functions to detect and 
prevent infinite recursion?

  In the first case, probably not, to avoid abuse of the XML specification.
In the second case yes, but you will have to provide reproducible test
cases; see http://xmlsoft.org/bugs.html for guidelines.

Daniel

-- 
Daniel Veillard      | Red Hat Desktop team http://redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/


