Re: [xml] utf-8 encoding and xmlSAXParseMemory



* Daniel Veillard <veillard redhat com> [2006-05-02 23:10]:
> The problem is "how do you know it's ISO-8859-1 and not another
> variant?"

You don’t. But if it’s data on the English or French part of the
web, then invalid bytes are ISO-8859-1 with 99.999% certainty.

> You can't guarantee not to generate false positives (i.e. corrupt
> data), which is why the XML Working Group declared this had to
> be a fatal error.

I know. I wouldn’t use that approach for mission-critical
systems. Mostly, I use it to deal with web pages and feeds,
which are generally all kinds of dirty and broken anyway.
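
For illustration, here is a minimal sketch in C of the fallback described
above: well-formed UTF-8 sequences pass through untouched, and any byte
that is not part of one is assumed to be ISO-8859-1 and re-encoded as
UTF-8. The function names (utf8_seq_len, latin1_fallback_to_utf8) are
made up for this example and are not part of libxml2; the caller is
assumed to supply an output buffer of at least 2*n bytes.

```c
#include <stddef.h>
#include <string.h>

/* Length (1..4) of the well-formed UTF-8 sequence starting at s,
 * or 0 if the first byte does not begin one within the n bytes left. */
static size_t utf8_seq_len(const unsigned char *s, size_t n)
{
    unsigned long cp, min;
    size_t len, i;

    if (n == 0) return 0;
    if (s[0] < 0x80) return 1;                    /* plain ASCII */
    if ((s[0] & 0xE0) == 0xC0)      { len = 2; cp = s[0] & 0x1F; min = 0x80; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; cp = s[0] & 0x0F; min = 0x800; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; cp = s[0] & 0x07; min = 0x10000; }
    else return 0;              /* stray continuation byte or 0xF8..0xFF */

    if (len > n) return 0;                        /* truncated sequence */
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0;      /* not a continuation byte */
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    /* Reject overlong forms, surrogates, and values above U+10FFFF. */
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    return len;
}

/* Copy `in` to `out`, re-encoding every invalid byte as if it were
 * ISO-8859-1. `out` must have room for 2*n bytes. Returns the number
 * of bytes written. This is a guess about the data, not a guarantee. */
size_t latin1_fallback_to_utf8(const unsigned char *in, size_t n,
                               unsigned char *out)
{
    size_t i = 0, o = 0;

    while (i < n) {
        size_t len = utf8_seq_len(in + i, n - i);
        if (len > 0) {
            memcpy(out + o, in + i, len);         /* valid: copy as-is */
            o += len;
            i += len;
        } else {
            /* Invalid bytes are always >= 0x80 here, so they map to
             * U+0080..U+00FF, which is a two-byte UTF-8 sequence. */
            unsigned char b = in[i++];
            out[o++] = (unsigned char)(0xC0 | (b >> 6));
            out[o++] = (unsigned char)(0x80 | (b & 0x3F));
        }
    }
    return o;
}
```

The cleaned buffer can then be handed to the parser as UTF-8. Note that
this never fails, which is exactly the trade-off discussed above: a byte
sequence in some other legacy encoding that happens to look like valid
UTF-8 (or that is not really ISO-8859-1) is silently turned into the
wrong characters.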

Regards,
-- 
Aristotle Pagaltzis // <http://plasmasturm.org/>


