Re: [xml] setting the default charset ?



On Fri, Jul 27, 2001 at 06:49:25PM +0200, Cyrille Chepelov wrote:
So, in the worst case, we could pass the older files through iconv() to make
sure they're UTF-8 and let libxml2 handle the result.

  Well, if you use the libxml2 framework, it will be done progressively.
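
The worst-case fallback mentioned above (converting a legacy buffer to UTF-8
before handing it to the parser) might look roughly like the sketch below. It
uses only POSIX iconv; to_utf8() is a hypothetical helper name, not part of
libxml2, and the 4x output bound is an assumption that holds for common
legacy charsets.

```c
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a libxml2 function): convert `len` bytes of
 * `src`, encoded as `from`, into a freshly allocated NUL-terminated
 * UTF-8 string. Returns NULL on any failure; the caller frees the result. */
char *to_utf8(const char *from, const char *src, size_t len)
{
    iconv_t cd = iconv_open("UTF-8", from);
    if (cd == (iconv_t)-1)
        return NULL;                 /* unknown or unsupported charset */

    /* Assumption: 4 output bytes per input byte is enough room. */
    size_t cap = len * 4 + 1;
    char *dst = malloc(cap);
    if (dst == NULL) {
        iconv_close(cd);
        return NULL;
    }

    char *in = (char *)src;          /* iconv() takes a non-const pointer */
    size_t inleft = len;
    char *out = dst;
    size_t outleft = cap - 1;        /* keep one byte for the terminator */

    size_t rc = iconv(cd, &in, &inleft, &out, &outleft);
    iconv_close(cd);
    if (rc == (size_t)-1) {          /* invalid or truncated input */
        free(dst);
        return NULL;
    }
    *out = '\0';
    return dst;
}
```

The resulting buffer is plain UTF-8, so it can be passed to the parser
without any further encoding hint.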

Yes; I know the default encoding: it's whatever nl_langinfo(CODESET) says
(but I'd prefer to have to tell libxml myself rather than having libxml try
to guess). Of course, if it turns out the document I'm feeding to the parser

  Libxml will never look at locales, I guarantee this!

has an encoding specification, that specification shall override the default
I would give through the (still) hypothetical API.

Something like 
      int xmlSetParserEncoding(xmlParserCtxtPtr ctxt,
                               const char *encoding);
would be nice. (I initially thought that would be what xmlSwitchEncoding()
was supposed to do, but it didn't quite work. And I'm afraid I don't really
understand what the libxml-parserinternals page says on this function).

  xmlSwitchEncoding will put an iconv filter for this encoding between your
source and the parser, more precisely a converter from this encoding to
UTF-8.
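
To illustrate the idea of a filter sitting between the source and the parser,
here is a conceptual sketch built directly on POSIX iconv. It is not libxml2's
actual implementation: utf8_filter, filter_open(), filter_feed(),
filter_close() and CHUNK_MAX are hypothetical names chosen for this example.
Each chunk read from the source is converted to UTF-8 as it arrives, and the
bytes of a multibyte character cut off at a chunk boundary are kept for the
next call.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

#define CHUNK_MAX 4096               /* largest chunk accepted per call */

/* Hypothetical filter state: an iconv descriptor plus the bytes of an
 * incomplete multibyte character left over from the previous chunk. */
typedef struct {
    iconv_t cd;
    char    pending[16];
    size_t  npending;
} utf8_filter;

/* Prepare a filter converting from `from` to UTF-8. Returns 0 or -1. */
int filter_open(utf8_filter *f, const char *from)
{
    f->npending = 0;
    f->cd = iconv_open("UTF-8", from);
    return f->cd == (iconv_t)-1 ? -1 : 0;
}

/* Convert one chunk, writing UTF-8 into `out` (capacity `outcap`).
 * Returns the number of bytes written, or -1 on error. */
long filter_feed(utf8_filter *f, const char *chunk, size_t len,
                 char *out, size_t outcap)
{
    char work[sizeof f->pending + CHUNK_MAX];
    if (len > CHUNK_MAX)
        return -1;

    /* Prepend what was left over from the previous call. */
    memcpy(work, f->pending, f->npending);
    memcpy(work + f->npending, chunk, len);

    char *in = work;
    size_t inleft = f->npending + len;
    char *dst = out;
    size_t outleft = outcap;

    f->npending = 0;
    if (iconv(f->cd, &in, &inleft, &dst, &outleft) == (size_t)-1) {
        /* EINVAL means the input ended in the middle of a character;
         * anything else (bad sequence, output full) is treated as fatal. */
        if (errno != EINVAL || inleft > sizeof f->pending)
            return -1;
        memcpy(f->pending, in, inleft);
        f->npending = inleft;
    }
    return (long)(dst - out);
}

void filter_close(utf8_filter *f)
{
    iconv_close(f->cd);
}
```

A caller would invoke filter_feed() once per block read from the file and
pass the converted bytes on to the parser, which then only ever sees UTF-8.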

Perhaps even a shorthand like
      xmlDocPtr xmlParseFileWithEncoding(const char *filename,
                                         const char *default_encoding);
 (where default_encoding == NULL means "do what xmlParseFile() did") could be
useful (well, in my case, it certainly would).

  This could be easily added.

That would definitely be helpful!

  In the meantime, use xmlSwitchEncoding().

Daniel

-- 
Daniel Veillard      | Red Hat Network http://redhat.com/products/network/
veillard redhat com  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/



