Re: my worry about the recent libxml change



[sorry for the repost ...]

On Thu, Mar 22, 2001 at 06:28:00PM -0800, Darin Adler wrote:
> The current code in xml-i18n-tool, OAF, and Nautilus depends on the
> following property: Text in XML files, localized strings that come from
> gettext, file names, GTK widget labels, and other strings all use the same
> character set (the local one for each locale, often Latin-1).
> 
> Adding code to libxml to properly handle character sets when reading and
> writing XML will retroactively decide that existing files are in some
> particular character set, when they are actually in a mix of character sets.

  And hence no conforming XML parser should ever hand them back to you:
    http://www.w3.org/TR/REC-xml#charencoding

--------------
It is a fatal error when an XML processor encounters an entity with an
encoding that it is unable to process. It is a fatal error if an XML
entity is determined (via default, encoding declaration, or higher-level
protocol) to be in a certain encoding but contains octet sequences that
are not legal in that encoding. It is also a fatal error if an XML
entity contains no encoding declaration and its content is not legal
UTF-8 or UTF-16.
--------------

  Again, I have been saying for a year that this is a libxml1 problem,
and urging people to switch to a conforming parser like libxml2.

> Making libxml DOM trees in memory always use UTF-8 will break all the code
> that puts strings in and takes them out without doing any translation.

It won't break the code; it shows that the code is already broken. That
is a significant difference. I don't want to break your code, but I want
it to be fixed, and I will help people with the transition, as I said a
year ago when releasing the first libxml2 version. The point is that
until recently nobody gave a fuck about what I was saying. You are
getting more pressure now; well, I'm sorry ...
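
For code that currently stores local-charset strings straight into the
tree, the fix amounts to converting at the boundary. Here is a minimal
sketch, assuming the local charset is Latin-1 (as in the example Darin
gives). The helper name latin1_to_utf8 is hypothetical and hand-written
for illustration; it is not a libxml function, and other local charsets
would need a real conversion library.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of libxml): convert a NUL-terminated
 * Latin-1 string to a newly allocated UTF-8 string.
 * The caller must free() the result. Returns NULL on allocation failure.
 */
char *latin1_to_utf8(const char *in)
{
    const unsigned char *s = (const unsigned char *) in;
    /* Each Latin-1 byte expands to at most two UTF-8 bytes. */
    unsigned char *out = malloc(2 * strlen(in) + 1);
    unsigned char *p = out;

    if (out == NULL)
        return NULL;

    while (*s != '\0') {
        if (*s < 0x80) {
            /* ASCII range is identical in both encodings. */
            *p++ = *s;
        } else {
            /* U+0080..U+00FF become a two-byte UTF-8 sequence. */
            *p++ = (unsigned char) (0xC0 | (*s >> 6));
            *p++ = (unsigned char) (0x80 | (*s & 0x3F));
        }
        s++;
    }
    *p = '\0';
    return (char *) out;
}
```

The converted string is what would be handed to the tree; the reverse
conversion would be applied when taking strings back out for display in
a Latin-1 locale.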

> I don't see how to make a program that works compatibly with both the old
> and new versions of libxml. I have no idea how to address this issue in the
> code for the various packages.

  You can test xmlParserVersion, which is exported by both libxml1 and
libxml2, and apply the translation where needed, i.e. when the version
string is greater than "1.8.11":

[root rpmfind /root]# nm /usr/lib/libxml.so.1.8.11 | grep xmlParserVersion
0005528c D xmlParserVersion

[root orchis /root]# nm /usr/lib/libxml.so.1.8.12 | grep xmlParserVersion
0004d58c D xmlParserVersion
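
A rough sketch of such a runtime check follows. The function names
version_compare and needs_utf8_translation are hypothetical, and the
sketch assumes the version is a dotted numeric string such as "1.8.11";
the exact format reported by a given release should be checked. The
comparison is numeric per component because a plain strcmp() would rank
"1.8.9" above "1.8.11".

```c
#include <stdlib.h>

/*
 * Hypothetical helper: compare two dotted version strings component by
 * component, numerically. Returns <0, 0 or >0 like strcmp().
 * Missing components count as 0, so "1.8" equals "1.8.0".
 */
int version_compare(const char *a, const char *b)
{
    while (*a != '\0' || *b != '\0') {
        char *end_a;
        char *end_b;
        unsigned long na = strtoul(a, &end_a, 10);
        unsigned long nb = strtoul(b, &end_b, 10);

        if (na != nb)
            return na < nb ? -1 : 1;

        /* Stop if neither string made progress (non-numeric suffix). */
        if (end_a == a && end_b == b)
            break;

        /* Skip the separating dot, if any. */
        a = (*end_a == '.') ? end_a + 1 : end_a;
        b = (*end_b == '.') ? end_b + 1 : end_b;
    }
    return 0;
}

/*
 * Hypothetical policy function: returns 1 when the parser version is
 * newer than 1.8.11, i.e. when strings going in and out of the tree
 * are assumed to need translation to and from UTF-8.
 */
int needs_utf8_translation(const char *parser_version)
{
    return version_compare(parser_version, "1.8.11") > 0;
}
```

In a real program the argument would be the xmlParserVersion string
exported by whichever library the program was linked against, for
example needs_utf8_translation(xmlParserVersion).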


> I hope someone can prove with testing or coding or both that I am wrong, and
> this change can be done compatibly.

  I hope it can too. But if the final question left is:
    "Should we prefer keeping the existing broken platform over
     adherence to the standard and better I18N support?"

  then my definitive answer is a resounding NO!
I just hope we won't end up stuck on this question.

Daniel

-- 
Daniel Veillard      | Red Hat Network http://redhat.com/products/network/
veillard redhat com  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/



