On Fri, Aug 03, 2007 at 08:22:37PM +0530, Ashwin wrote:
> > Hi,
> >
> > While testing the xmlFARegExec function if I give the following input
> >
> > (0|1|2|3|4|5|6|7|8|9) (0,10) (The rule to be used for matching the
> > input expression)
> >
> > expression to be matched 1234567891 (length 10)
> >
> > This is returning a failure, if I reduce the expression by 1 it works
> > fine. Is this because of incorrect usage?
>
> Hum, no, looks like a bug, strange ...
>
> paphio:~/XML -> ./testRegexp '(0|1|2|3|4|5|6|7|8|9){0,10}' '1234567891'
> Testing (0|1|2|3|4|5|6|7|8|9){0,10}:
>   1234567891: Fail
> paphio:~/XML -> ./testRegexp '(0|1|2|3|4|5|6|7|8|9){0,10}' '123456789'
> Testing (0|1|2|3|4|5|6|7|8|9){0,10}:
>   123456789: Ok
>
> Could be worth bugzilla'ing, that's something I should be able to fix!
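For reference, the result one would expect from a digit alternation with a {min,max} range can be stated without libxml2 at all: every character must be a digit and the length must lie within the range. The sketch below is only that reference check, not libxml2 code; the helper name expected_digit_range_match is hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical reference check, NOT part of libxml2: returns 1 if s
 * consists only of ASCII digits and its length lies in [min, max],
 * which is what (0|1|...|9){min,max} is expected to accept. */
static int expected_digit_range_match(const char *s, size_t min, size_t max)
{
    assert(s != NULL);
    size_t len = strlen(s);
    for (size_t i = 0; i < len; i++) {
        if (!isdigit((unsigned char)s[i]))
            return 0;   /* character outside the alternation */
    }
    return len >= min && len <= max;
}
```

Under this reading, both the 10-digit and the 9-digit input should match {0,10}, and only inputs longer than 10 digits should fail.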
Hi,

With regard to the above problem, I made some modifications to the code that generates the epsilon transitions. This seems to have solved the problem when a range with a minimum of 0 is specified. However, now if the input is longer than the specified range, the regexp exec function returns success instead of failure. The code change is as follows, in file xmlregexp.c:
    /* xmlregexp.c, starting at line 1522 */
    if (atom->min == 0) {
        xmlFAGenerateEpsilonTransition(ctxt, atom->start, atom->stop);
        newstate = xmlRegNewState(ctxt);
        xmlRegStatePush(ctxt, newstate);
        ctxt->state = newstate;
        xmlFAGenerateEpsilonTransition(ctxt, atom->start, newstate);

        counter = xmlRegGetCounter(ctxt);
        ctxt->counters[counter].min = atom->min - 1;
        ctxt->counters[counter].max = atom->max - 1; // These three lines were not part of the earlier if condition
        counter = -1;
    } else {
        counter = xmlRegGetCounter(ctxt);
        ctxt->counters[counter].min = atom->min - 1;
        ctxt->counters[counter].max = atom->max - 1;
    }
I have to admit that this is nothing but a shabby workaround, if one can even call it that. Passing the value of counter as -1 to the functions xmlFAGenerateCountedTransition and xmlFAGenerateCountedEpsilonTransition, which immediately follow this bit of code, seems to solve the problem.

The basis for the change is a hunch that the problem lies in counted transitions being generated where they are not required. Consider the following cases (I used xmlRegexpPrint to print the output below on the console). On the left-hand side is the case where the range minimum was 0 and a string of the maximum length was returning failure (after the code change this now returns success).
Left-hand side (LHS):

    The input string is 012
    Testing (0|1|2){0,3}:
    regexp: '(0|1|2){0,3}'
    4 atoms:
     00 atom: charval once char 0
     01 atom: charval once char 1
     02 atom: charval once char 2
     03 atom: subexpr once start -572662307 end 2
    5 states:
     state: START 0, 8 transitions:
      trans: removed
      trans: removed
      trans: removed
      trans: removed
      trans: count based 0, epsilon to 4
      trans: counted 0, char 0 atom 0, to 2
      trans: counted 0, char 1 atom 1, to 2
      trans: counted 0, char 2 atom 2, to 2
     state: NULL
     state: 2, 5 transitions:
      trans: count based 0, epsilon to 4
      trans: removed
      trans: counted 0, char 0 atom 0, to 2
      trans: counted 0, char 1 atom 1, to 2
      trans: counted 0, char 2 atom 2, to 2
     state: NULL
     state: FINAL 4, 0 transitions:
    1 counters:
     0: min -1 max 2
    122: Fail

Right-hand side (RHS):

    The input String is 122
    Testing (0|1|2){1,3}:
    regexp: '(0|1|2){1,3}'
    4 atoms:
     00 atom: charval once char 0
     01 atom: charval once char 1
     02 atom: charval once char 2
     03 atom: subexpr once start 4 end 2
    4 states:
     state: START 0, 4 transitions:
      trans: removed
      trans: char 0 atom 0, to 2
      trans: char 1 atom 1, to 2
      trans: char 2 atom 2, to 2
     state: NULL
     state: 2, 5 transitions:
      trans: count based 0, epsilon to 3
      trans: removed
      trans: counted 0, char 0 atom 0, to 2
      trans: counted 0, char 1 atom 1, to 2
      trans: counted 0, char 2 atom 2, to 2
     state: FINAL 3, 0 transitions:
    1 counters:
     0: min 0 max 2
    122: Ok
I made the change on the assumption that the part highlighted on the LHS should be similar to the RHS (I might of course be completely wrong). However, now for cases like (0|1|2|3){0,3}, if I give the input 1231 it returns success.
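The over-long input being accepted is what one would expect if the loop transitions no longer consult a counter. The toy model below illustrates this under stated assumptions: it is not libxml2's automaton, the function toy_counted_match and its use_counter flag are hypothetical, and use_counter = 0 merely stands in for transitions generated without a counter (as happens when counter is passed as -1).

```c
#include <assert.h>
#include <string.h>

/* Toy model (NOT libxml2 code) of a loop over an alternation such as
 * (0|1|2|3){min,max}. alphabet lists the allowed characters.
 * use_counter = 0 models loop transitions generated without a counter. */
static int toy_counted_match(const char *input, const char *alphabet,
                             int min, int max, int use_counter)
{
    assert(input != NULL && alphabet != NULL);
    int count = 0;
    for (const char *p = input; *p != '\0'; p++) {
        if (strchr(alphabet, *p) == NULL)
            return 0;               /* character not in the alternation */
        if (use_counter && count >= max)
            return 0;               /* counted loop refuses another pass */
        count++;
    }
    if (use_counter && count < min)
        return 0;                   /* too few repetitions */
    return 1;
}
```

In this model, the input 1231 against (0|1|2|3){0,3} is rejected when the counter is consulted and accepted when it is not, which matches the behaviour seen after the change: the lower-bound problem disappears, but so does the upper bound.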
Regards,
Ashwin