[xml] Trouble parsing html



I've been successfully parsing "real world" HTML using
the following:


htmlParserCtxtPtr parserContext =
    htmlCreateMemoryParserCtxt(cString, strlen(cString));
htmlCtxtUseOptions(parserContext, 0);  /* options is an int bitmask; 0 = none */
htmlParseDocument(parserContext);
receiver = parserContext->myDoc;


However, while trying to solve an issue that shows up when
I output this later on, I tried the following instead, so
that I could specify the encoding when parsing:


htmlParserCtxtPtr parserContext = xmlNewParserCtxt();
receiver = htmlCtxtReadMemory(parserContext, cString, strlen(cString),
                              url, encoding, 0);
/* or:
receiver = htmlCtxtReadDoc(parserContext, cString, url, encoding, 0);
*/


but with this second method, the only thing that gets
put into the tree is the comments from the original markup
(pointed to by cString), and nothing else. Am I
missing a step in the second method, or do those
functions only expect completely valid HTML when
parsing?



 




