[xml] Supporting additional encodings in a push parser?
- From: Nick Kew <nick webthing com>
- To: xml gnome org
- Date: Wed, 15 Nov 2006 19:16:13 +0000
I'm looking to improve I18n support in my libxml2-based Apache
filter modules, such as mod_proxy_html.
To work well with the Apache architecture, these
use a push parser:
    ctxt->parser = htmlCreatePushParserCtxt(ctxt->sax, ctxt,
                                            buf, m->start, 0, enc);
"enc" is an xmlCharEncoding set by xmlParseCharEncoding or
xmlDetectCharEncoding. Combined with charset sniffing, the filter
automatically supports every charset that libxml2 handles natively.
Now, libxml2's encoding module lets me register new charsets,
for example by registering iconv-based conversion functions:
    xmlCharEncodingHandlerPtr charenc
        = xmlNewCharEncodingHandler(encoding, iconv_in, iconv_out);
    xmlRegisterCharEncodingHandler(charenc);
But there's a missing link: xmlCharEncoding is an enum, and
registering a new encoding handler doesn't create a new value I
can use with xmlParseCharEncoding, htmlCreatePushParserCtxt, etc.
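To make the gap concrete, here is a toy model of the situation. It is only a sketch and not libxml2 code: every toy_* name is made up for illustration, and it assumes nothing beyond a fixed enum of built-in charsets sitting next to a name-keyed table of handlers registered at run time.

```c
#include <string.h>

/* Toy model only: none of these names exist in libxml2. */

/* A closed set of built-in charsets, fixed at compile time. */
typedef enum {
    TOY_ENC_ERROR  = -1,  /* name not recognised */
    TOY_ENC_NONE   = 0,
    TOY_ENC_UTF8   = 1,
    TOY_ENC_LATIN1 = 2
} toy_encoding;

/* Handlers registered at run time are keyed by name, not by enum value. */
#define TOY_MAX_HANDLERS 16
static const char *toy_registry[TOY_MAX_HANDLERS];
static int toy_registry_len = 0;

/* Add a named handler; returns 0 on success, -1 if the table is full. */
static int toy_register_handler(const char *name)
{
    if (toy_registry_len >= TOY_MAX_HANDLERS)
        return -1;
    toy_registry[toy_registry_len++] = name;
    return 0;
}

/* Look a handler up by name; returns 1 if registered, 0 otherwise. */
static int toy_has_handler(const char *name)
{
    int i;
    for (i = 0; i < toy_registry_len; i++)
        if (strcmp(toy_registry[i], name) == 0)
            return 1;
    return 0;
}

/* Map a name to the enum: only the built-in names can ever match. */
static toy_encoding toy_parse_encoding(const char *name)
{
    if (strcmp(name, "UTF-8") == 0)
        return TOY_ENC_UTF8;
    if (strcmp(name, "ISO-8859-1") == 0)
        return TOY_ENC_LATIN1;
    return TOY_ENC_ERROR;
}
```

In this model, after toy_register_handler("KOI8-U") the handler can be found by name, but toy_parse_encoding("KOI8-U") still returns TOY_ENC_ERROR, so there is no enum value to hand to a constructor that accepts only the enum.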
Is there a workaround that'll enable me to register new
charsets *and* use them in a push parser, other than
transcoding the input myself before it reaches the parser?
--
Nick Kew
Application Development with Apache - the Apache Modules Book
http://www.apachetutor.org/