Re: [xslt] XPath number formatting



On 2006-04-07 11:55:11 +0200, Tim Van Holder wrote:
> William M. Brack wrote:
[...]
> >  However, there doesn't seem to be any good definition of what an
> >  "integer" is. In particular, should a number which is greater
> >  than MAX_INT on some system still be treated as an "integer"? The
> >  existing code within our library doesn't do this - instead, it
> >  treats it as a "non-integer" and outputs the value in exponential
> >  form. I believe this is incorrect, but would welcome any opposing
> >  views.
> 
> Failing any specific mentions of programming languages or system
> libraries in the spec, I would think that things like MAX_INT are
> irrelevant.  I would assume that "integer" has the mathematical
> meaning.  Even (modern versions) of COBOL can work with integers
> that don't fit in MAX_INT (32 digits or more).

I agree. In fact, I started to write an article to comp.text.xml
about what an integer should be until I realized there was the
same problem with non-integers and that the "style e" form was
forbidden in any case. As I said in my last comment on

  http://bugzilla.gnome.org/show_bug.cgi?id=337565

there are 2 problems with the current behavior:

1. A number that is written by xsltproc must be readable by any
XSLT/XPath processor, and the "style e" is not a number according to
the XPath recommendation. More precisely, converting a string like
1.23456789012e+11 into a number should produce a NaN (this is again
a bug in libxml2, and xalan does it right). So, this means that data
produced by xsltproc won't be readable by other XSLT/XPath processors
like xalan, because libxml2 breaks the specs. For an interchange
format (one of the main goals of XML), this is not acceptable.

2. Even inside xsltproc, the current behavior may break things when
one wants to do string manipulations like digit extraction (this is
my case: I had integers between 0 and 2^53).
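
To make point 1 concrete, here is a rough C sketch of the XPath 1.0
string-to-number rule: optional whitespace, an optional minus sign, a
Number (Digits with an optional fraction, or '.' followed by Digits),
then optional whitespace; any other string converts to NaN. The names
is_xpath_number and xpath_string_to_number are hypothetical helpers
written for this illustration, not libxml2 functions, and the sketch
assumes the "C" locale for strtod.

```c
#include <assert.h>
#include <math.h>    /* NAN */
#include <stdlib.h>  /* strtod */

/* Hypothetical helper (not a libxml2 API): returns 1 if s has the form
 * whitespace? '-'? Number whitespace?, where
 * Number ::= Digits ('.' Digits?)? | '.' Digits, and 0 otherwise. */
static int is_xpath_number(const char *s)
{
    size_t i = 0, int_digits = 0, frac_digits = 0;

    assert(s != NULL);
    while (s[i] == ' ' || s[i] == '\t' || s[i] == '\r' || s[i] == '\n')
        i++;
    if (s[i] == '-')            /* a leading '+' is not allowed */
        i++;
    while (s[i] >= '0' && s[i] <= '9') {
        i++;
        int_digits++;
    }
    if (s[i] == '.') {
        i++;
        while (s[i] >= '0' && s[i] <= '9') {
            i++;
            frac_digits++;
        }
    }
    if (int_digits == 0 && frac_digits == 0)
        return 0;               /* "", "-", "." are not numbers */
    while (s[i] == ' ' || s[i] == '\t' || s[i] == '\r' || s[i] == '\n')
        i++;
    return s[i] == '\0';        /* anything left over, such as "e+11", fails */
}

/* Hypothetical helper: converts s as described above, giving NaN for
 * any string that is_xpath_number() rejects. Assumes the "C" locale. */
static double xpath_string_to_number(const char *s)
{
    return is_xpath_number(s) ? strtod(s, NULL) : NAN;
}
```

With this sketch, "123456789012" converts to a number, whereas
"1.23456789012e+11" is rejected because of the exponent part and
converts to NaN.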

It seems that the patch corrects integers only. There's the same
problem with large non-integers (between 10^9 and 2^52?). Also, the
generated strings do not always make it possible to distinguish the
numbers:

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:param name="x"/>
<xsl:param name="y"/>
<xsl:template match="/">
  <xsl:variable name="nx" select="number($x)"/>
  <xsl:variable name="ny" select="number($y)"/>
  <xsl:variable name="cmp">
    <xsl:choose>
      <xsl:when test="$nx = $ny">equal</xsl:when>
      <xsl:otherwise>different</xsl:otherwise>
    </xsl:choose>
  </xsl:variable>
  <xsl:value-of select="concat('The following numbers:&#10;  ', $nx,
                        '&#10;  ', $ny, '&#10;are ', $cmp, '.&#10;')"/>
</xsl:template>
</xsl:stylesheet>

$ xsltproc --param x 4503599627370497 --param y 4503599627370498 diff-numbers.xsl diff-numbers.xsl

The following numbers:
   4.5035996273705e+15
   4.5035996273705e+15
are different.

Two different numbers must have different strings, as the XPath spec
says: "beyond the one required digit after the decimal point there
must be as many, but only as many, more digits as are needed to
uniquely distinguish the number from all other IEEE 754 numeric
values."
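
The collision can be reproduced outside xsltproc with a small C
sketch. It assumes that 14 significant digits is what is being
printed (an assumption chosen only because "%.14g" gives the same
strings as in the output shown, not something taken from the libxml2
source) and that printf and strtod are correctly rounded, as in
glibc. The helper names are hypothetical. The second function
searches for the smallest number of significant digits that still
converts back to the same double, which is the quantity the quoted
sentence of the spec is about.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: format x with 14 significant digits. The
 * precision is an assumption that happens to reproduce the strings
 * in the output shown. */
static void format_14_digits(double x, char *buf, size_t n)
{
    assert(buf != NULL && n > 0);
    snprintf(buf, n, "%.14g", x);
}

/* Hypothetical helper: 1 if both strings are identical. */
static int strings_collide(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}

/* Smallest number of significant digits (1..17) whose decimal form
 * converts back to exactly x. Assumes x is finite and that the C
 * library rounds correctly in both directions. */
static int shortest_sig_digits(double x)
{
    char buf[64];

    for (int p = 1; p < 17; p++) {
        snprintf(buf, sizeof buf, "%.*e", p - 1, x);
        if (strtod(buf, NULL) == x)
            return p;
    }
    return 17;  /* 17 digits always suffice for an IEEE 754 double */
}
```

Under these assumptions, 4503599627370497 and 4503599627370498 both
format as 4.5035996273705e+15 with 14 digits, while each of them
needs 16 significant digits to be told apart from its neighbours.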

> The only "problem" I see is that this may mean that libxslt needs
> to start using GMP (or some other "BigNum" library);

Isn't the C library sufficient? (BTW, glibc uses GMP for this purpose.)
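
As a rough illustration of what I mean, the sketch below prints
integer-valued doubles with "%.0f", including one that does not fit
in any 64-bit integer type. It assumes a C library whose printf
outputs all the digits exactly (glibc does; the C standard is less
strict for very long outputs). The helper names are hypothetical and
this is not the code of the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write the integer-valued double x in plain
 * decimal notation. Returns the length snprintf reports. */
static int format_integral_double(double x, char *buf, size_t n)
{
    assert(buf != NULL && n > 0);
    return snprintf(buf, n, "%.0f", x);
}

/* Hypothetical helper: 1 if the string contains no exponent part. */
static int is_plain_decimal(const char *s)
{
    return strchr(s, 'e') == NULL && strchr(s, 'E') == NULL;
}
```

With glibc, format_integral_double(18446744073709551616.0, ...) (that
is 2^64) gives the string 18446744073709551616, and
4503599627370497.0 gives 4503599627370497, with no exponent in either
case.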

-- 
Vincent Lefèvre <vincent vinc17 org> - Web: <http://www.vinc17.org/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.org/blog/>
Work: CR INRIA - computer arithmetic / SPACES project at LORIA

