RE: [xml] encoding, SAX callbacks

On Mon, 26 Feb 2001, James McCann wrote:

> Some of the arguments I have read in this thread against seem to assume
> that every project which uses XML will encounter documents in many
> encodings and that it is better to simply use UTF8 so that the app doesn't
> encounter "a myriad" of undesired encodings.  In the project I am working
> on, we are using XML to send data over the network from one app to another.

I just HAVE to make a response to this thread (quoting out of context, of
course, although I -have- read the entire thread).

All of ISO-8859-* can be encoded into UTF.
There ARE encodings which cannot (Big5 and EUC are the only examples I know of).
Maintaining both the original doc and the converted form is unreasonably bulky
when the XML document is very, very large...
(but hey, that's what SAX is for, no? :)
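
The first point is easy to check for yourself. A minimal sketch, assuming Python and its bundled codec tables (the `iso8859_N` codec names are Python's, and `round_trips` is a made-up helper name): every byte that is assigned in a given ISO-8859 part is decoded, pushed through UTF-8, and converted back.

```python
# Sketch: check that every character defined in the ISO-8859-* family
# survives a round trip through UTF-8. Codec names are Python's own.

PARTS = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16]  # there is no ISO-8859-12


def round_trips(part):
    """Return True if every defined byte of ISO-8859-<part> round-trips via UTF-8."""
    codec = "iso8859_%d" % part
    for value in range(256):
        raw = bytes([value])
        try:
            text = raw.decode(codec)
        except UnicodeDecodeError:
            continue  # byte is unassigned in this part; nothing to convert
        utf8 = text.encode("utf-8")
        if utf8.decode("utf-8").encode(codec) != raw:
            return False
    return True


results = {part: round_trips(part) for part in PARTS}
```

`results` maps each part number to whether its defined bytes came back unchanged.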

I work with -many- encodings in docs.  And anyone who reads docs produced
by tools other than ones they have full control over should expect
multiple encodings.

So I think your circumstance is a "special case" and the "general case" is
that UTF8 acceptance is The Right Way.

And while I'm probably being rude with this, I consider apps only capable
of handling one encoding - PARTICULARLY if that encoding is ISO-8859-1
(or worse yet, IBM ASCII) - extremely arrogant.  Call it a pet peeve.
Doesn't normally affect me unless I have to -use- such tools, and I
frequently work in languages other than English...

*grumble* people who only speak one language *grumble*

G'day, eh? :)
        - Teunis, who's not -entirely- serious here...  but please, if
          your software's going out into the open world, make it

And a library for handling conversions: iconv().  ('xcept it's not under
Windows - but there's an equiv.  The code's in the standard libs anyways)
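
For the curious, here is roughly what chunk-at-a-time conversion looks like. This is only a sketch and it does NOT call iconv(): it uses Python's standard `codecs` module as a stand-in, and `transcode_chunks` plus the sample document are invented for illustration. The idea is the same one that suits a SAX-style parser: convert each piece as it arrives instead of keeping both the original and the converted document in memory.

```python
import codecs


def transcode_chunks(chunks, source_encoding, target_encoding="utf-8"):
    """Yield re-encoded byte chunks without holding the whole document in memory."""
    decoder = codecs.getincrementaldecoder(source_encoding)()
    encoder = codecs.getincrementalencoder(target_encoding)()
    for chunk in chunks:
        # The decoder buffers any multi-byte sequence split across chunks.
        yield encoder.encode(decoder.decode(chunk))
    # final=True makes the decoder raise if the input ended mid-character.
    yield encoder.encode(decoder.decode(b"", final=True), final=True)


# Hypothetical sample: an ISO-8859-1 document arriving in two pieces.
latin1_doc = "<note>caf\u00e9</note>".encode("iso-8859-1")
pieces = [latin1_doc[:9], latin1_doc[9:]]
utf8_doc = b"".join(transcode_chunks(pieces, "iso-8859-1"))
```

The incremental decoder is what makes arbitrary chunk boundaries safe: a character split across two reads is held back until its remaining bytes show up.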
