Re: [xml] Ignoring Character Encodings



From: "Daniel Veillard" <veillard redhat com>
On Thu, Apr 11, 2002 at 12:30:44PM +0100, Richard Jinks wrote:
...
  I think removing the encoding information on the DOCUMENT top node
should be sufficient. Just free it and set it to NULL.

...

If there isn't a proper way using the API that I've missed, I've got a patch
which just adds an extra run-time flag in xmlSetFeature(), and an if statement
in xmlSwitchEncoding() that will just break out of the function if an encoding
is already set (and the flag is set, of course).

  It depends on the serialization routine you're using; you did not say
which function that is. There are specific functions for saving in
a given encoding.
    http://xmlsoft.org/encoding.html

Daniel

Ah. I think I might not have explained myself properly. I'm concerned with
the parsing of the document, not with the subsequent output.

I'm feeding the document in through the push parser in large chunks, but as
libxml gets to the encoding declaration, it suddenly switches encoding on me.
It then starts reporting errors that the UTF-8 chars I'm giving it don't match
the ISO-8859-1 (for example) encoding the document says it's in, and used to
be in before I got to it. Subsequently, the errors from the libxml encoder
cause the parse to fail.
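
To make the mismatch concrete, here is a minimal sketch of the same situation.
It does not use libxml at all: it swaps in the Expat push parser from the
Python standard library, only because that parser accepts an encoding override
when it is created. The sample document and the parse_chunks() helper are made
up for illustration. The body bytes are UTF-8 while the declaration still
claims ISO-8859-1.

```python
import xml.parsers.expat

# Hypothetical sample: the declaration says ISO-8859-1, but the body has
# already been converted to UTF-8 ("café" with é as the bytes C3 A9).
DATA = (b'<?xml version="1.0" encoding="ISO-8859-1"?>'
        b'<doc>caf\xc3\xa9</doc>')


def parse_chunks(data, chunk_size=16, override_encoding=None):
    """Push `data` through Expat in chunks and return the character data.

    If override_encoding is given, Expat uses it in place of the encoding
    named in the XML declaration.
    """
    parser = xml.parsers.expat.ParserCreate(encoding=override_encoding)
    pieces = []
    # The handler may fire several times for one text node, so collect
    # the pieces and join them at the end.
    parser.CharacterDataHandler = pieces.append
    for start in range(0, len(data), chunk_size):
        parser.Parse(data[start:start + chunk_size], False)
    parser.Parse(b"", True)  # signal end of input
    return "".join(pieces)


if __name__ == "__main__":
    # Declared encoding honoured: the UTF-8 bytes are misread as Latin-1.
    print(repr(parse_chunks(DATA)))
    # Declared encoding ignored: the bytes are read as the UTF-8 they are.
    print(repr(parse_chunks(DATA, override_encoding="utf-8")))
```

In this sketch the parser that honours the declaration returns mangled text
rather than failing outright, so it only approximates what I see from libxml.
The second call is the behaviour I'm after: keep parsing in the encoding I
supply and pay no attention to the declaration.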

By your answers, I think you might have assumed that I was trying to save the
document out in a different encoding, unless I have misunderstood you?

Thanks again,
Richard

