Re: [xml] encoding issue using libxml with swish-e

Hi Tref,

Are you aware of the footer appended to your email:
"This email is intended only for the use of the individual or entity
named above and contains information that is confidential. No
confidentiality is waived or lost by any mis-transmission. If you
received this correspondence in error, please notify the sender and
immediately delete it from your system. You must not disclose, copy or
rely on any part of this correspondence if you are not the intended
recipient. Any communication directed to clients via this message is
subject to our Agreement and relevant Project Schedule. Any information that
is transmitted via email which may offend may have been sent without
knowledge or the consent of Areeba."

You don't seriously expect to get help from a public mailing list
with this footer attached, do you?

Anyway, I know nothing about swish-e, but I can assure you
that there are no encoding issues with libxml2, only Frequently
Occurring Misunderstandings (we are talking about libxml2 here; if
swish-e is using old versions, I'm not sure [but still believe that
they are OK]).

libxml2 will happily and correctly parse an XML file
containing the ASCII character string cin&#233;math&#232;que
(the accented letters written as numeric character references)
when the XML prolog is:
<?xml version="1.0" ?> or
<?xml version="1.0" encoding="UTF-8" ?> or
<?xml version="1.0" encoding="ISO-8859-1" ?> or
many more possibilities.
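
To illustrate the point about the prolog, here is a minimal sketch. It
is an assumption-laden stand-in: it uses Python's standard-library
parser (expat-based), not libxml2, and the element name "word" and the
helper parse_word are made up for the example. It only shows that an
all-ASCII file using numeric character references parses to the same
string whatever encoding the prolog declares.

```python
import xml.etree.ElementTree as ET

# The body is pure ASCII: the accented letters are written as numeric
# character references, so the declared encoding cannot change the result.
# "word" is a hypothetical element name used only for this sketch.
BODY = "<word>cin&#233;math&#232;que</word>"

PROLOGS = [
    '<?xml version="1.0" ?>',
    '<?xml version="1.0" encoding="UTF-8" ?>',
    '<?xml version="1.0" encoding="ISO-8859-1" ?>',
]


def parse_word(prolog: str) -> str:
    """Parse prolog + BODY from ASCII bytes and return the element text."""
    data = (prolog + BODY).encode("ascii")
    return ET.fromstring(data).text


# One parsed string per prolog; all three should be identical.
results = [parse_word(p) for p in PROLOGS]
```

With libxml2 itself the parsed text would be handed back as UTF-8
bytes rather than as a Python string, but the content should be the
same in all three cases.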

Whether libxml2 will correctly parse accented characters entered
directly in the XML file cannot be judged by whether they display
correctly in your favorite non-XML-aware editor, as you don't know
which encoding the editor assumes. Run hd (hex dump) on the file and
check whether the bytes of the accented characters match the
encoding you declare in the prolog.
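
As a rough idea of what to look for in the hex dump, the sketch below
prints the bytes of the same word in both encodings. The helper names
hex_bytes and is_valid_utf8 are hypothetical, and this is Python
standing in for hd, not anything libxml2 provides.

```python
# The word is spelled with escapes so this source file stays pure ASCII.
WORD = "cin\u00e9math\u00e8que"


def hex_bytes(text: str, encoding: str) -> str:
    """Return the encoded bytes of text as space-separated hex pairs."""
    return " ".join(f"{b:02x}" for b in text.encode(encoding))


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as UTF-8 without errors."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


utf8_hex = hex_bytes(WORD, "utf-8")         # e-acute is the two bytes c3 a9
latin1_hex = hex_bytes(WORD, "iso-8859-1")  # e-acute is the single byte e9

print("UTF-8:     ", utf8_hex)
print("ISO-8859-1:", latin1_hex)
```

If the dump shows a lone e9 or e8 but the prolog says UTF-8 (or says
nothing, which means UTF-8 by default), the file and its declaration
disagree. Note that the reverse check is weaker: any byte sequence is
formally valid ISO-8859-1, so only the UTF-8 direction can be tested
mechanically.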

But I assume that's not the problem, since using numeric character
references doesn't fix it.

So I assume it's the most common pitfall:

*** The entire libxml2 API uses UTF-8 character strings ***

Your application and all middle layers like swish-e must be willing and
able to accept and pass on UTF-8 strings. Before these strings are
sent to the GUI/OS for display, they must be converted to
the encoding the GUI/OS expects.
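
A minimal sketch of that last conversion step, under assumptions: the
bytes in from_parser stand for what a UTF-8 based API would hand back,
ISO-8859-1 is just an example of what the display side might expect,
and utf8_to_display is a hypothetical helper. I don't know how swish-e
actually handles this.

```python
def utf8_to_display(utf8_bytes: bytes, display_encoding: str) -> bytes:
    """Re-encode UTF-8 bytes into the encoding the display layer expects.

    Characters the target encoding cannot represent are replaced with '?'.
    """
    text = utf8_bytes.decode("utf-8")
    return text.encode(display_encoding, errors="replace")


# Stand-in for a string returned by a UTF-8 based parser API.
from_parser = b"cin\xc3\xa9math\xc3\xa8que"

# Correct: convert before handing the bytes to an ISO-8859-1 consumer.
for_latin1_display = utf8_to_display(from_parser, "iso-8859-1")

# Incorrect: the UTF-8 bytes are passed through unchanged and the
# consumer interprets them as ISO-8859-1, so each accented letter
# turns into two garbage characters.
mojibake = from_parser.decode("iso-8859-1")
```

The two-garbage-characters-per-accent pattern in the second case is
the usual sign that a middle layer skipped the conversion.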

Peter Jacobi
