
Re: [xml] utf-8 encoding and xmlSAXParseMemory



* Daniel Veillard <veillard redhat com> [2006-05-02 23:10]:
> The problem is "how do you know it's ISO-8859-1 and not another
> variant?"

You don’t. But if it’s data from the English or French part of the
web, then bytes that are invalid as UTF-8 are ISO-8859-1 with
99.999% certainty.

> You can't guarantee not to generate false positives (i.e. corrupt
> data), which is why the XML Working Group declared this had to
> be a fatal error.

I know. I wouldn’t use that approach for mission-critical
systems. Mostly, I use it to deal with web pages and feeds,
which are generally all kinds of dirty and broken anyway.
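
For illustration, here is a minimal Python sketch of that kind of
fallback: decode as UTF-8, and reinterpret only the bytes that fail
as ISO-8859-1. The names `latin1_fallback` and `decode_lenient` are
made up for this example and are not part of libxml2 or any other
library; the assumption that stray bytes are ISO-8859-1 is exactly
the guess discussed above.

```python
import codecs


def _latin1_fallback(err):
    """Error handler: reinterpret undecodable bytes as ISO-8859-1."""
    # Only decoding errors are handled; anything else is re-raised.
    if not isinstance(err, UnicodeDecodeError):
        raise err
    bad = err.object[err.start:err.end]
    # Every byte value maps to a code point in ISO-8859-1, so this
    # decode cannot fail. Resume decoding right after the bad bytes.
    return bad.decode("iso-8859-1"), err.end


# Hypothetical handler name, registered once at import time.
codecs.register_error("latin1_fallback", _latin1_fallback)


def decode_lenient(data: bytes) -> str:
    """Decode as UTF-8, guessing ISO-8859-1 for invalid sequences."""
    return data.decode("utf-8", errors="latin1_fallback")


if __name__ == "__main__":
    # Valid UTF-8 followed by a stray ISO-8859-1 byte (0xEF).
    print(decode_lenient(b"caf\xc3\xa9 na\xefve"))
```

The result can then be re-encoded as UTF-8 before it is handed to the
parser. The false positive mentioned above is easy to reproduce: the
ISO-8859-1 text "Ã©" is the byte pair C3 A9, which is also valid
UTF-8 for "é", so it passes through silently as the wrong character.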

Regards,
-- 
Aristotle Pagaltzis // <http://plasmasturm.org/>


