Re: [xml] libxml2 version 2.9.0: xpath.c:xmlXPathStringEvalNumber() - not 100% fails to parse float with leading '+'

One thing I don't see a lot of, which could be useful in this case, is an environment variable.  It can default to the current behavior and be set to select another.  This sometimes matters because some people use parsers that don't always interpret things as we do ... I had trouble with an MS Server not accepting a POST from nano_http because of a CR/LF disagreement ... as if the world doesn't know about this problem and should just code around it always.   Arghh.

I accept a lot of XML from mom/pop companies that are not sophisticated.  I had so much trouble with people not being able to capitalize properly (saleAmt vs SaleAmt vs Saleamt), and I was sick of the support calls for nonsense, so I made an environment variable to ignore case (our XML NEVER has tags with the same letters in different cases).  This is of course completely insane and follows no standard except one of my favorites: 20 fewer phone calls every day for months as new customers sign up to send orders by XML.

If anyone is interested, I have some mods to libxml2 along those lines.  For example, I have one that eliminates the headers on XML files, for internal exchanges where I know the encoding and so forth and don't want to waste the bytes. That one is controlled by another environment variable.

Anyway, I only suggest this if someone is actually getting XML with a leading plus and can't do much about it.  It is sometimes easier to just accept it.
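
To make the idea concrete, here is a minimal sketch of such a switch. It is not code from libxml2 or from my mods: the variable name XPATH_ALLOW_LEADING_PLUS and both helper functions are made up for illustration. The default (variable unset) leaves the input untouched, so the strict XPath behavior stays in place unless someone opts in.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical variable name; libxml2 itself reads no such variable. */
#define LENIENT_PLUS_ENV "XPATH_ALLOW_LEADING_PLUS"

/* Pure policy check: lenient mode is on only for the exact value "1". */
int lenient_flag_from(const char *value) {
    return value != NULL && strcmp(value, "1") == 0;
}

/* Reads the (hypothetical) environment variable; defaults to strict. */
int lenient_plus_enabled(void) {
    return lenient_flag_from(getenv(LENIENT_PLUS_ENV));
}

/* In lenient mode, if the first non-blank character is '+' and it is
 * followed by a digit or '.', return a pointer just past the '+'.
 * In every other case the original pointer is returned unchanged. */
const char *strip_leading_plus(const char *s, int lenient) {
    const char *p = s;

    if (!lenient)
        return s;
    while (*p == ' ' || *p == '\t' || *p == '\r' || *p == '\n')
        p++;
    if (*p == '+' && (p[1] == '.' || (p[1] >= '0' && p[1] <= '9')))
        return p + 1;
    return s;
}
```

A caller would then pass strip_leading_plus(text, lenient_plus_enabled()) to whatever number parser it already uses, for example xmlXPathStringEvalNumber() after a cast to const xmlChar *. Requiring a digit or '.' after the '+' keeps input such as "+-1" from slipping through.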


On 5/8/2013 5:18 AM, Daniel Veillard wrote:
On Mon, May 06, 2013 at 12:41:22PM +0000, Kuhnke, Christoph (I/EF-56, extern) wrote:
Dear creators of libxml2, 

 Might be better to subscribe to the list so I don't have to dig your
messages out of the mailing-list bounces, which I tend to forget to do!

I think I found a slight mistake in function xpath.c:xmlXPathStringEvalNumber()
of libxml2 version 2.9.0.

For data type "float" in xml files resp. xml schemas 
[1] states "The mantissa must be a decimal number."
[2] contains details: 
	An optional leading sign is allowed. 
	If the sign is omitted, "+" is assumed. 
and names "+100000.00" as valid example for a decimal.

In contrast, when the first non-whitespace character of its argument is a '+',
xpath.c:xmlXPathStringEvalNumber() stops parsing and returns 0.0.

To make libxml2 conform to [1] I suggest the following change:

Replace 3 lines 10098 through 10100 (incl.) 
by the following 7 lines 

    if ((*cur != '.') && ((*cur < '0') || (*cur > '9')) && (*cur != '-') && (*cur != '+')) {
    if (*cur == '+') {
	isneg = 0;

Please let me know, what you think about my proposal.

[1] Lexical representation
[2] Lexical representation
  Ah well I am sorry but that's the wrong reference ;-)
Surprisingly enough, xpath.c:xmlXPathStringEvalNumber() is part of the
implementation of XPath (version 1), which says:

 "a string that consists of optional whitespace followed by an optional
 minus sign followed by a Number followed by whitespace is converted to
 the IEEE 754 number that is nearest (according to the IEEE 754
 round-to-nearest rule) to the mathematical value represented by the
 string; any other string is converted to NaN"

[30]    Number     ::=    Digits ('.' Digits?)?
                          | '.' Digits
[31]    Digits     ::=    [0-9]+

 Basically, no "+" is allowed by XPath 1 at the beginning of numbers,
so the function is, I think, correct.
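
 As an illustration of that rule only, the grammar above can be checked by hand before converting. This is a standalone sketch, not the code in xpath.c: the function name xpath1_string_to_number is made up, and it leans on strtod() for the final conversion, which assumes the default "C" locale so that '.' is the decimal point.

```c
#include <math.h>
#include <stdlib.h>

/* Whitespace as used by XPath 1.0: space, tab, CR, LF. */
static int is_xpath_space(char c) {
    return c == ' ' || c == '\t' || c == '\r' || c == '\n';
}

static int is_ascii_digit(char c) {
    return c >= '0' && c <= '9';
}

/* Hypothetical helper, not libxml2 code. Accepts only
 *   S? '-'? Number S?
 * with Number ::= Digits ('.' Digits?)? | '.' Digits
 * and returns NaN for anything else (leading '+', exponents, ...). */
double xpath1_string_to_number(const char *s) {
    const char *p = s;
    const char *start;
    int digits = 0;

    while (is_xpath_space(*p))
        p++;
    start = p;                      /* conversion starts at the sign */
    if (*p == '-')
        p++;
    while (is_ascii_digit(*p)) {
        p++;
        digits++;
    }
    if (*p == '.') {
        p++;
        while (is_ascii_digit(*p)) {
            p++;
            digits++;
        }
    }
    if (digits == 0)                /* rejects "", "-", "." and "+1" */
        return NAN;
    while (is_xpath_space(*p))
        p++;
    if (*p != '\0')                 /* trailing junk such as "e5" */
        return NAN;
    return strtod(start, NULL);     /* assumes the "C" locale */
}
```

 With this sketch " 12.5 " and "-.5" convert normally, while "+100000.00" and "1e5" both come back as NaN, which is the behavior the XPath 1 text asks for.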
 Libxml2 actually has an XML Schemas (version 1) datatype implementation,
and it does check float values. The code is in
  xmlSchemaValAtomicType() in xmlschemastypes.c
(see around line 2393), and it does accept a leading "+".

  It's a feature, not a bug, at least at the libxml2 level, at the
W3C spec level it is a slight incoherence...



Eric S. Eberhard
2933 W Middle Verde Road
Camp Verde, AZ  86322

928-567-3727  work                      928-301-7537  cell
