Re: [xslt] XPath number formatting



Vincent Lefevre wrote:

Unfortunately I wasn't there at this time, but it is based on wrong
assumptions. Bjorn Reese said:

  Very simple. XPath uses ANSI/IEEE Std 754 (1985) for floating point
  numbers. This standard uses a fixed numbers of digits (bits,
  actually) to represent numbers, which means that it cannot
  accurately represent, say, the number pi as it would require an
  infinite number of digits.

Until there, this is true (53 bits + 1 sign bit for the mantissa).

  It also means that big numbers cannot be represented accurately with
  the regular 9.9 floating point notation. If we were going to try this,
  the least significant digit would be garbage.

The representation required by XPath is *not* a "9.9 floating point
notation". With the representation required by XPath, there is
absolutely no garbage.

Let me start by clarifying that at the time of the cited discussion,
only decimal point numbers were implemented. Support for integer
numbers was submitted later by somebody else.

In the following I will try to elaborate on the reasoning behind the
cited discussion. Although the discussion was only addressing decimal
point numbers, I believe that the basic rationale can be extended to
integer numbers as well.

By "9.9 floating point notation" I was referring to the notation without
an exponent (e.g. the %f specifier for printf). This makes my above
statement trivially true: with at most 17 significant decimal digits
available in an IEEE 754 double, we cannot accurately represent numbers
with 18 digits or more.
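
A small sketch of this point, in Python rather than the C of the actual
formatting code. It assumes only that Python floats are IEEE 754 doubles
and that the % operator follows the printf conventions; the value 1e23
is just a convenient example that has no exact double representation.

```python
# Python floats are IEEE 754 doubles; "%f" is the notation without exponent.
x = 1e23  # this literal is not exactly representable as a double

fixed = "%f" % x                   # like printf("%f", x)
integer_digits = fixed.split(".")[0]

print(fixed)                       # many more than 17 digits are printed
print(len(integer_digits))         # count of digits before the decimal point
print("%.17g" % x)                 # 17 significant digits, with an exponent
```

The digits printed by %f beyond the 17th describe the exact value of the
stored double, not the value that was originally written.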

There tend to be two diverging points of view about how to regard the
digits beyond 17 (called excessive digits in the following), depending
on what you want to use the number for.

People who need to convert numbers into strings and back without loss
of precision want to use the excessive digits to convey as much
information as possible to be able to round numbers correctly when
they are converted back from strings to numbers.
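
The round-trip concern of this first group can be sketched as follows
(Python, assuming IEEE 754 doubles; the value 0.1 + 0.2 and the digit
counts 15 and 17 are chosen only for illustration):

```python
# Converting a double to a string and back loses information if too few
# digits are written out.
x = 0.1 + 0.2

short = "%.15g" % x   # 15 significant digits
full = "%.17g" % x    # 17 significant digits

print(short, float(short) == x)   # the short string does not round-trip
print(full, float(full) == x)     # the 17-digit string does
```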

People who need to convert numbers into strings to be presented to
users want to avoid that users believe that the excessive digits convey
accurate information. An example of this point of view is expressed in
the design rule "Never give an answer with more precision than is
warranted." See:

  http://catless.ncl.ac.uk/Risks/24.17.html#subj1

If you belong to the first group of people, then the excessive digits
are important and certainly not to be regarded as garbage. If you belong
to the second group of people then the excessive digits are garbage, and
potentially harmful (see above link).

In the cited discussion I adopted the point of view of the second group
because their scenario seemed to be much more common for XSLT.

  In such cases it is customary to use scientific notation instead,
  and this is what we did. It is not 100% standard compliant, but the
  standard would generate garbage, so we chose the lesser of the two
  evils.

Again this is wrong. Note that Bjorn didn't give any example for which
there would be a problem (but perhaps he wrongly thought about a 9.9
floating point notation).

See above.

But [-2^53,2^53] is a natural integer range when the number type is
IEEE-754 double precision (and integer types are not available),
like in XPath. And it is important to follow the spec concerning
these integers.

I did not write the integer part of the formatting code, nor do I use
it, so I have no objection to extending the range to 53-bit numbers.
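
For reference, the 53-bit limit can be checked directly. This is a
Python sketch assuming IEEE 754 doubles; the name `limit` is only for
illustration.

```python
limit = 2 ** 53

# Integers up to 2^53 survive a conversion to double and back.
below = int(float(limit - 1))
at = int(float(limit))

# 2^53 + 1 has no double representation; it rounds to a neighbour.
above = int(float(limit + 1))

print(below == limit - 1)   # exact
print(at == limit)          # exact
print(above == limit + 1)   # no longer exact
```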

However, we must decide how to handle both decimal and integer numbers
outside the legal range.

My preference is to use scientific notation, since this is also the
solution chosen by later versions of the specs.
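
A minimal sketch of what such a rule could look like, in Python. The
function name `format_number`, the 2^53 threshold and the choice of 17
significant digits are assumptions made for illustration; this is not
the existing formatting code, and it only deals with finite numbers of
large magnitude.

```python
def format_number(value, max_plain=2.0 ** 53):
    """Hypothetical helper: plain notation inside the range, exponent outside.

    Assumes `value` is a finite float. Very small fractions are not given
    any special treatment in this sketch.
    """
    if abs(value) <= max_plain:
        if value == int(value):
            return str(int(value))   # integers are printed without a fraction
        return repr(value)           # shortest string that round-trips
    # Outside the range: 17 significant digits in scientific notation.
    return "%.16e" % value


print(format_number(42.0))
print(format_number(float(2 ** 53)))
print(format_number(1e23))
```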

