[xml] Thoughts about normalization in XML Schemata


I noticed that the values in xmlschemastypes.c are compared using a
tricky on-the-fly normalization. This raises the question of which
form the values should be stored in for IDC key comparison:
normalized or not. Types derived from string or normalizedString can
have different values for the "whiteSpace" facet, so I see the
following options for comparison:
1. The values are already normalized; just compare with xmlStrEqual
2. The values are left as found; this needs:
   1. The value of the whitespace facet of the simple types involved
   2. String comparison functions which can compare values with
      different whitespace facets
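
To make option 2 more concrete, here is a rough sketch of what such a
comparison function might look like. It is not libxml2 code: WsMode,
WsCursor, wsInit, wsNext and wsEqual are hypothetical names invented for
this illustration. It assumes NUL-terminated UTF-8 input and the three
whiteSpace values from the spec (preserve, replace, collapse). Each
string is walked with its own cursor, which yields the characters of the
normalized form one at a time, so no copy is allocated.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for illustration; none of this is libxml2 API. */
typedef enum { WS_PRESERVE, WS_REPLACE, WS_COLLAPSE } WsMode;

typedef struct {
    const unsigned char *p;  /* current read position */
    WsMode mode;             /* whiteSpace facet value of the type */
} WsCursor;

/* XML whitespace: space, tab, line feed, carriage return. */
static int isXmlSpace(unsigned char c) {
    return c == 0x20 || c == 0x09 || c == 0x0A || c == 0x0D;
}

static void wsInit(WsCursor *c, const char *s, WsMode mode) {
    c->p = (const unsigned char *) s;
    c->mode = mode;
    /* "collapse" drops leading whitespace. */
    if (mode == WS_COLLAPSE)
        while (isXmlSpace(*c->p))
            c->p++;
}

/* Return the next byte of the normalized form, or 0 at the end. */
static unsigned char wsNext(WsCursor *c) {
    unsigned char ch = *c->p;

    if (ch == 0)
        return 0;
    if (c->mode == WS_PRESERVE || !isXmlSpace(ch)) {
        c->p++;
        return ch;
    }
    if (c->mode == WS_REPLACE) {
        c->p++;
        return 0x20;  /* tab, LF and CR all become a space */
    }
    /* WS_COLLAPSE: a run of whitespace yields one space,
     * unless the run reaches the end of the string. */
    while (isXmlSpace(*c->p))
        c->p++;
    return (*c->p == 0) ? 0 : 0x20;
}

/* Return 1 if both values are equal after applying their
 * respective whiteSpace normalization, 0 otherwise. */
int wsEqual(const char *a, WsMode ma, const char *b, WsMode mb) {
    WsCursor ca, cb;
    unsigned char x, y;

    assert(a != NULL && b != NULL);
    wsInit(&ca, a, ma);
    wsInit(&cb, b, mb);
    do {
        x = wsNext(&ca);
        y = wsNext(&cb);
        if (x != y)
            return 0;
    } while (x != 0);
    return 1;
}
```

Comparing bytes is enough here because the four whitespace characters
are ASCII and never occur inside a multi-byte UTF-8 sequence. The cost
per comparison is a few extra branches per byte compared to a plain
xmlStrEqual on pre-normalized values.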

Which would perform better? Identity constraints mean a _lot_
of comparison operations.

AFAIK the only mechanism that would _need_ the value to be normalized
beforehand is the "pattern" facet.

What do you think? Could such multi-whitespace-type comparison
functions be written, or would they slow things down too much?

By the way, if PSVI for elements is going to be implemented, the
normalized value must be accessible as well; this would mean normalizing
on every request. One idea would be to store a flag: once the value has
been normalized, it replaces the stored value and need not be normalized
again the next time.
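
A minimal sketch of the flag idea, under some assumptions: the type and
function names (CachedValue, cachedNew, cachedGetNormalized, cachedFree)
are made up for this example and are not libxml2 API, the whiteSpace
facet is taken to be "collapse", and the raw value is assumed not to be
needed once the normalized one has been requested. Since a collapsed
string is never longer than the original, the swap can be done in place.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical structure for illustration; not a libxml2 type. */
typedef struct {
    char *value;     /* raw value until first request, then normalized */
    int normalized;  /* nonzero once value holds the normalized form */
} CachedValue;

static int isWs(unsigned char c) {
    return c == 0x20 || c == 0x09 || c == 0x0A || c == 0x0D;
}

/* Collapse whitespace in place: trim both ends and reduce every
 * inner run of whitespace to a single space. */
static void collapseInPlace(char *s) {
    const char *in = s;
    char *out = s;

    while (isWs((unsigned char) *in))
        in++;
    while (*in != 0) {
        if (isWs((unsigned char) *in)) {
            while (isWs((unsigned char) *in))
                in++;
            if (*in != 0)
                *out++ = ' ';
        } else {
            *out++ = *in++;
        }
    }
    *out = 0;
}

/* Store a private copy of the raw value; returns NULL on failure. */
CachedValue *cachedNew(const char *raw) {
    CachedValue *v;
    size_t len;

    assert(raw != NULL);
    v = malloc(sizeof(*v));
    if (v == NULL)
        return NULL;
    len = strlen(raw) + 1;
    v->value = malloc(len);
    if (v->value == NULL) {
        free(v);
        return NULL;
    }
    memcpy(v->value, raw, len);
    v->normalized = 0;
    return v;
}

/* Normalize on the first request only; later calls return the
 * already normalized string without doing any work. */
const char *cachedGetNormalized(CachedValue *v) {
    assert(v != NULL && v->value != NULL);
    if (!v->normalized) {
        collapseInPlace(v->value);
        v->normalized = 1;
    }
    return v->value;
}

void cachedFree(CachedValue *v) {
    if (v != NULL) {
        free(v->value);
        free(v);
    }
}
```

If the raw value has to stay available as well, the structure would need
a second pointer instead of overwriting the first, at the cost of extra
memory per value.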


