[xml] HTML parser encoding in SAX



Hi,

I've searched the documentation and the mailing list for an answer but couldn't find one. I hope I'm not missing something obvious.
I'm using the HTML parser in push mode (calling htmlCreatePushParserCtxt) to parse HTML files that may arrive in a variety of encodings (my application knows in advance which encoding will arrive).
I'm trying to use xmlParseCharEncoding to determine which xmlCharEncoding value to pass to htmlCreatePushParserCtxt.

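/* encoding_str is the encoding name my application already knows */
xmlCharEncoding current_xml_encoding = xmlParseCharEncoding(encoding_str);

... (error handling here) ...

/* arguments: SAX handler, user data, initial chunk, chunk size, filename, encoding */
myCtxt = htmlCreatePushParserCtxt(&myParserStruct, nevermind, NULL, 0, NULL, current_xml_encoding);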

Not all the encodings that I need have a matching xmlCharEncoding value.
Registering an encoding handler doesn't seem to help, since htmlCreatePushParserCtxt only takes an xmlCharEncoding value and I can't create a push parser context with a given encoding handler.
Is there another way to do it without patching the source and without pre-converting all the HTML to UTF-8?

Thanks.



