Re: [xml] Is it time to update xmlunicode.c?



On Fri, Aug 10, 2012 at 01:17:59PM +0200, Nikolai Weibull wrote:
> Hi!

  Hello Nikolai,

> I noticed that xmlunicode.c is based off of Unicode 4.0.1.  Is it time
> to update it to Unicode 6.1.0?

  Good question!
  Let's see ... the definition of Unicode affects 2 parts of libxml2:
    - the core parser: http://www.w3.org/TR/REC-xml/#charsets
      defines the character range, and that can't be extended.
      Productions related to specific characters have a predefined range,
      e.g. [4]      NameStartChar
           [4a]     NameChar
      and character classes like those defined in appendix B don't need
      to be upgraded. Basically the XML parser should not be affected by
      any update there.
    - the XML schemas datatype part driven by
      http://www.w3.org/TR/xmlschema-2/
      and this affects the character classes:
      http://www.w3.org/TR/xmlschema-2/#charcter-classes
      where it dumps a version of the Blocks database:
      "The following table specifies the recognized block names (for more
       information, see the "Blocks.txt" file in [Unicode Database])."
      the reference within is
      The Unicode Consortium. The Unicode Character Database. Available at:
      http://www.unicode.org/Public/3.1-Update/UnicodeCharacterDatabase-3.1.0.html
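
  To illustrate the first point (the core parser), here is a minimal
sketch of a check for production [4] NameStartChar, assuming the ranges
of XML 1.0 Fifth Edition. The ranges are hard-coded from the
Recommendation, so a newer Unicode database cannot change the result.
The names cp_range, name_start_ranges and is_name_start_char are
hypothetical, this is not the actual libxml2 code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Inclusive code point range. Hypothetical type, for illustration only. */
typedef struct {
    unsigned int lo;
    unsigned int hi;
} cp_range;

/*
 * Ranges copied from production [4] NameStartChar of XML 1.0
 * (Fifth Edition). They are fixed by the Recommendation itself and are
 * not derived from any particular version of the Unicode database.
 */
static const cp_range name_start_ranges[] = {
    { ':', ':' },         { 'A', 'Z' },         { '_', '_' },
    { 'a', 'z' },         { 0xC0, 0xD6 },       { 0xD8, 0xF6 },
    { 0xF8, 0x2FF },      { 0x370, 0x37D },     { 0x37F, 0x1FFF },
    { 0x200C, 0x200D },   { 0x2070, 0x218F },   { 0x2C00, 0x2FEF },
    { 0x3001, 0xD7FF },   { 0xF900, 0xFDCF },   { 0xFDF0, 0xFFFD },
    { 0x10000, 0xEFFFF },
};

/* Return true if code point c may start an XML Name. */
static bool is_name_start_char(unsigned int c)
{
    size_t n = sizeof(name_start_ranges) / sizeof(name_start_ranges[0]);

    for (size_t i = 0; i < n; i++) {
        if (c >= name_start_ranges[i].lo && c <= name_start_ranges[i].hi)
            return true;
    }
    return false;
}
```

  With this table, is_name_start_char('A') is true while
is_name_start_char('0') and is_name_start_char(0xD7) are false,
whatever Unicode version the rest of the library is built against.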

 Based on that, I think it would only affect the character classes in
the schema types. So apparently I already updated from the 3.1.0 of
the original spec to 4.0.1. But I wonder if updating is actually right
w.r.t. XSD. Honestly I don't know. One option would be to try it and
see what it gives on the regression tests.

 They say "However, implementors are encouraged to support the blocks
defined in any future version of the Unicode Standard." but that was
written a long time ago and could lead to some incompatibilities.
One thing is sure: all the groups listed in the spec should be kept
defined, and I hope with the same ranges. New blocks should be okay,
I would guess.
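
  The compatibility rule above (every block listed in the spec stays
defined with the same range, new blocks are fine) can be sketched as a
comparison of two block tables. Everything here is hypothetical: the
block_def type, the function names, and the two small sample tables,
which hold only a few rows for illustration. Real tables would be
generated from the corresponding Blocks.txt files. The Greek row
reflects my recollection that the block named "Greek" in the XSD table
is called "Greek and Coptic" in later Unicode versions; check it
against the actual data.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One block entry, named as in the XSD \p{IsName} syntax (no spaces). */
typedef struct {
    const char *name;
    unsigned int start;
    unsigned int end;
} block_def;

typedef enum { BLOCK_SAME, BLOCK_CHANGED, BLOCK_MISSING } block_status;

/* Illustrative sample rows only, not the full tables. */
static const block_def old_sample[] = {
    { "BasicLatin",        0x0000, 0x007F },
    { "Latin-1Supplement", 0x0080, 0x00FF },
    { "Greek",             0x0370, 0x03FF },
};
static const block_def new_sample[] = {
    { "BasicLatin",        0x0000, 0x007F },
    { "Latin-1Supplement", 0x0080, 0x00FF },
    { "GreekandCoptic",    0x0370, 0x03FF },
};

/* Look up a block by name; returns NULL if it is not in the table. */
static const block_def *find_block(const block_def *table, size_t n,
                                   const char *name)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(table[i].name, name) == 0)
            return &table[i];
    }
    return NULL;
}

/* Classify one old block against the new table. */
static block_status check_block(const block_def *old_block,
                                const block_def *new_table, size_t new_n)
{
    const block_def *found;

    assert(old_block != NULL && new_table != NULL);
    found = find_block(new_table, new_n, old_block->name);
    if (found == NULL)
        return BLOCK_MISSING;
    if (found->start != old_block->start || found->end != old_block->end)
        return BLOCK_CHANGED;
    return BLOCK_SAME;
}

/* Count old blocks that are missing or have a different range in the
 * new table. Blocks that exist only in the new table are ignored. */
static size_t count_incompatible(const block_def *old_table, size_t old_n,
                                 const block_def *new_table, size_t new_n)
{
    size_t bad = 0;

    for (size_t i = 0; i < old_n; i++) {
        if (check_block(&old_table[i], new_table, new_n) != BLOCK_SAME)
            bad++;
    }
    return bad;
}
```

  On the sample rows, count_incompatible(old_sample, 3, new_sample, 3)
reports one problem, the "Greek" name that is no longer present. That
is the kind of case where the old name would have to be kept as an
alias for schemas written against the XSD 1.0 table.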

 Liam, maybe you have an informed point of view on the question: should
we update to a more recent version of Unicode for XSD datatypes v1.0?

  I would tentatively be okay with it, assuming this doesn't cause
regressions in the tests run by "make check".

Daniel

-- 
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
daniel veillard com  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/


