
RE: [xml] encoding, SAX callbacks



On Mon, 26 Feb 2001, James McCann wrote:

> Some of the arguments I have read in this thread against seem to assume
> that every project which uses XML will encounter documents in many
> encodings and that it is better to simply use UTF8 so that the app doesn't
> encounter "a myriad" of undesired encodings.  In the project I am working
> on, we are using XML to send data over the network from one app to another.

I just HAVE to respond to this thread (quoting out of context, of
course, although I -have- read the entire thread).

<rant>
All of ISO-8859-* can be encoded into UTF.
There ARE encodings which cannot (Big5 and EUC are the only examples I know
of).
Maintaining both the original doc + converted form is unreasonably bulky
when the XML document is very, very large...
(but hey, that's what SAX is for, no? :)

I work with -many- encodings in docs.  And anyone who reads docs produced
by tools other than ones they have full control over should expect
multiple encodings.

So I think your circumstance is a "special case" and the "general case" is
that UTF8 acceptance is The Right Way.

And while I'm probably being rude with this, I consider apps only capable
of handling one encoding - PARTICULARLY if that encoding is ISO-8859-1
(or worse yet, IBM ASCII) - extremely arrogant.  Call it a pet peeve.
It doesn't normally affect me unless I have to -use- such tools, and I
frequently work in languages other than English...
</rant>

*grumble* people who only speak one language *grumble*

G'day, eh? :)
	- Teunis, who's not -entirely- serious here...  but please, if
	  your software's going out into the open world, make it
	  international!

And a library for handling conversions: iconv().  ('xcept it's not under
Windows - but there's an equivalent.  The code's in the standard libs anyways)
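
For anyone who wants to see the shape of such a conversion, here is a
minimal sketch. It is not iconv() itself: it uses Python's standard codecs
module as a stand-in for the same idea, and it converts chunk by chunk so
that the original and the converted form never both sit in memory whole.
The names transcode_to_utf8 and convert, and the chunk sizes, are made up
for this illustration.

```python
import codecs
import io


def transcode_to_utf8(src, dst, source_encoding, chunk_size=4096):
    """Copy bytes from src to dst, converting source_encoding to UTF-8.

    src and dst are binary file-like objects. Input is read in chunks,
    so only one chunk is held in memory at a time.
    """
    decoder = codecs.getincrementaldecoder(source_encoding)(errors="strict")
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        # The incremental decoder buffers any partial multi-byte sequence
        # that happens to be split across two chunks.
        dst.write(decoder.decode(chunk).encode("utf-8"))
    # Flush whatever the decoder was still holding at end of input.
    dst.write(decoder.decode(b"", final=True).encode("utf-8"))


def convert(data, source_encoding):
    """Convenience wrapper for small in-memory inputs (hypothetical helper)."""
    out = io.BytesIO()
    # A tiny chunk size is used here only to exercise the chunking path.
    transcode_to_utf8(io.BytesIO(data), out, source_encoding, chunk_size=3)
    return out.getvalue()


if __name__ == "__main__":
    latin1_bytes = b"caf\xe9"  # "café" in ISO-8859-1
    print(convert(latin1_bytes, "iso-8859-1"))
```

With errors="strict" the decoder raises on input that is not valid in the
declared encoding, which is usually what you want when the encoding
declaration of a document cannot be trusted.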




