[xslt] bug in str:tokenize

I think I've found a number of bugs in the str:tokenize implementation
in libxslt.  It's hard to tell exactly which ones are bugs, because
the EXSLT specification isn't very specific about edge cases such as
leading, trailing, or repeated delimiters.

If you call str:tokenize('/foo//bar/', '/'), you'll get:


For the first one: Leading instances of the delimiter always seem to
make it into the first token.  I don't think that's right.

For the second one: The delimiter made it into the token because of the
double slash.

For the third one: Trailing instances of the delimiter always produce an
empty token element.

To play around even more, str:tokenize('//foo', '/') produces:


And str:tokenize('foo///bar', '/') produces:


I'm pretty certain that delimiter characters should never appear in
tokens, and certainly never appear alone as tokens.  And I *think* that
empty tokens should be stripped (which the implementation in fact tries
to do, but it misses the case where the delimiter is the final
character of the string).
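
To make the proposed behavior concrete, here is a minimal sketch in
Python.  It is not the libxslt code and not the eventual patch; the
function name tokenize is made up for illustration, and it assumes the
reading argued for above: every character in the delimiter string is a
separator, delimiters never end up in tokens, and empty tokens are
dropped.

```python
def tokenize(string, delimiters):
    """Split `string` on any character in `delimiters`.

    Assumed semantics (the interpretation proposed in this post, not
    something the EXSLT text spells out): delimiter characters never
    appear in a token, and empty tokens are stripped.
    """
    tokens = []
    current = []
    for ch in string:
        if ch in delimiters:
            # End of a token; only emit it if it is non-empty.
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    # Flush the last token. A trailing delimiter leaves `current`
    # empty, so no empty token is produced at the end.
    if current:
        tokens.append("".join(current))
    return tokens


print(tokenize("/foo//bar/", "/"))  # ['foo', 'bar']
print(tokenize("//foo", "/"))       # ['foo']
print(tokenize("foo///bar", "/"))   # ['foo', 'bar']
```

Under that reading, all three calls from this message yield only the
tokens foo and bar, with no slashes and no empty elements.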

If nobody disagrees, I'll try to fix up str:tokenize tonight and send a
patch.  Also, somebody should speak with the EXSLT people to get them to
be more specific.

