Re: [xslt] XPath number formatting



On Fri, Apr 07, 2006 at 12:13:20PM +0200, Vincent Lefevre wrote:
>   http://bugzilla.gnome.org/show_bug.cgi?id=337565
> 
> There are 2 problems with the current behavior:
> 
> 1. A number that is written by xsltproc must be readable by any
> XSLT/XPath processor, and the "style e" is not a number according to
> the XPath recommendation. More precisely, converting a string like
> 1.23456789012e+11 into a number should produce a NaN (this is again
> a bug in libxml2, and xalan does it right). So, this means that data

  Round tripping is mandatory at least internally
  http://bugzilla.gnome.org/show_bug.cgi?id=53364

> produced by xsltproc won't be readable by other XSLT/XPath processors
> like xalan, because libxml2 breaks the specs. For an interchange
> format (one of the main goals of XML), this is not acceptable.

  Please read the previous discussions about this, such as
  http://mail.gnome.org/archives/xml/2001-April/msg00080.html
This deviation in libxml2 has been there nearly forever.
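
For reference, XPath 1.0 defines the strings that number() must accept as
optional whitespace, an optional minus sign, a Number (Digits ('.' Digits?)?
or '.' Digits), then optional whitespace; any other string converts to NaN.
A minimal sketch of that lexical check follows. The function names are
hypothetical and are not part of libxml2; this only illustrates the grammar,
not how any processor implements it.

```c
/* XML whitespace (the S production): space, tab, CR, LF. */
static int is_xml_space(char c)
{
    return c == ' ' || c == '\t' || c == '\r' || c == '\n';
}

static int is_ascii_digit(char c)
{
    return c >= '0' && c <= '9';
}

/* Hypothetical helper, not a libxml2 function.
 * Returns 1 if s matches  S? '-'? Number S?  where
 *   Number ::= Digits ('.' Digits?)? | '.' Digits
 * and 0 otherwise (the cases where number() should yield NaN). */
static int is_xpath1_number_string(const char *s)
{
    int int_digits = 0;
    int frac_digits = 0;

    while (is_xml_space(*s))
        s++;
    if (*s == '-')
        s++;
    while (is_ascii_digit(*s)) {
        s++;
        int_digits++;
    }
    if (*s == '.') {
        s++;
        while (is_ascii_digit(*s)) {
            s++;
            frac_digits++;
        }
    }
    /* At least one digit is required on one side of the point. */
    if (int_digits == 0 && frac_digits == 0)
        return 0;
    while (is_xml_space(*s))
        s++;
    /* Anything left over (such as an exponent) is not allowed. */
    return *s == '\0';
}
```

Under this check "123.45" and " -12 " are accepted, while "1.23456789012e+11"
is rejected because of the exponent part.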

> 2. Even inside xsltproc, the current behavior may break things when
> one wants to do string manipulations like digit extraction (this is
> my case: I had integers between 0 and 2^53).

  I don't think the XPath specs can mandate correct behaviour for such
integer values.

> It seems that the patch corrects integers only. There's the same
> problem with large non-integers (between 10^9 and 2^52?). Also, the
> generated strings do not always allow the numbers to be distinguished:
> 
> <?xml version="1.0"?>
> <xsl:stylesheet version="1.0"
>                 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
> <xsl:output method="text"/>
> <xsl:param name="x"/>
> <xsl:param name="y"/>
> <xsl:template match="/">
>   <xsl:variable name="nx" select="number($x)"/>
>   <xsl:variable name="ny" select="number($y)"/>
>   <xsl:variable name="cmp">
>     <xsl:choose>
>       <xsl:when test="$nx = $ny">equal</xsl:when>
>       <xsl:otherwise>different</xsl:otherwise>
>     </xsl:choose>
>   </xsl:variable>
>   <xsl:value-of select="concat('The following numbers:&#10;  ', $nx,
>                         '&#10;  ', $ny, '&#10;are ', $cmp, '.&#10;')"/>
> </xsl:template>
> </xsl:stylesheet>
> 
> $ xsltproc --param x 4503599627370497 --param y 4503599627370498 diff-numbers.xsl diff-numbers.xsl
> 
> The following numbers:
>    4.5035996273705e+15
>    4.5035996273705e+15
> are different.
> 
> Two different numbers must have different strings, as the XPath spec
> says: "beyond the one required digit after the decimal point there
> must be as many, but only as many, more digits as are needed to
> uniquely distinguish the number from all other IEEE 754 numeric
> values."
> 
> > The only "problem" I see is that this may mean that libxslt needs
> > to start using GMP (or some other "BigNum" library);
> 
> Isn't the C library sufficient? (BTW, glibc uses GMP for this purpose.)

  The divergence from the standard has been recorded and present since the
beginning of libxml2 XPath and libxslt support. At the time there was
clearly no coherence between implementations. Maybe this needs to be fixed,
but I won't take a single other processor's output as an argument, even
if it's Saxon...
  Following the spec is important, but number formatting is clearly one
of the areas where XPath and XSLT 1.0 were broken. Now if people
want to fix this, I'm not against it, but:
   - I don't want an extra library requirement
   - I want compatible behaviour on all the platforms supported by
     libxml2

That probably means writing some not-so-trivial code to handle larger than
necessary integers/fractional numbers. I think there is something somewhat
similar in the Schemas type support for the Decimal type, which requires
at least 18 digits of precision (we support 24, see xmlSchemaValDecimal in
xmlschemastypes.c). 52 digits is IMHO totally out of scope for this.

Since this hasn't been raised since 2001, and the behaviour has been recorded
as such since then, I don't feel an urge to fix it myself. As usual,
patches are welcome, but let's be clear about the scope of what really
needs to be done and the means to get there. Adding a dependency on a new
library is not okay, and neither is using non-portable types or code (this
includes long long).
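
To make the scope concrete, here is a minimal sketch of the "as many digits
as needed" rule using only the C library and plain double, with no extra
library and no long long. The helper name is hypothetical and is not a
libxml2 function. It assumes the platform's printf and strtod families are
correctly rounded (and that snprintf is available), which is exactly the
cross-platform question raised above. It also keeps the %g output style, so
it shows only how many digits are needed, not the XPath string format.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not a libxml2 function.
 * Writes into buf the rendering of d with the smallest %g precision
 * (1..17 significant digits) that strtod() reads back as exactly d,
 * and returns that precision. Intended for finite values only: for NaN
 * the comparison never succeeds and the 17-digit form is left in buf. */
static int shortest_roundtrip(double d, char *buf, size_t size)
{
    int prec;

    for (prec = 1; prec <= 17; prec++) {
        snprintf(buf, size, "%.*g", prec, d);
        if (strtod(buf, NULL) == d)
            return prec;
    }
    /* 17 significant digits always suffice for an IEEE 754 double. */
    return 17;
}
```

With the two values from the example above (2^52 + 1 and 2^52 + 2), a fixed
15-digit format gives 4.5035996273705e+15 for both, whereas this loop stops
at 16 digits and produces 4503599627370497 and 4503599627370498, given a
32-byte buffer and a correctly rounding C library.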

Daniel

-- 
Daniel Veillard      | Red Hat http://redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/

