Re: [xml] PEReference patch part 2 and (non-)future of ctxt->token



On Tue, Jun 04, 2002 at 05:05:27PM +0200, Peter Jacobi wrote:
Hi Daniel, All,

Find attached the second step for correctly validating obscure uses of 
PEReferences. In addition it eliminates the last (non-zero) assignments to 
ctxt->token, see discussion below.

  Hum, those patches touch some of the hardest parts of the parser,
sorry for the delay but it took me a bit of time before being able to 
focus on this.

What's the patch doing: when including a PEReference (other than in
an entity value), it doesn't set input to xmlNewEntityInputStream but
to newly defined xmlNewBlanksWrapperEntityStream. This consists
of a rewritten PEReference with spaces added at front and end. A test
whether the current input is already the one from 
xmlNewBlanksWrapperEntityStream prevents infinite recursion.
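
For illustration, a minimal sketch of the idea described above, not the
code from the patch. XML 1.0 requires the replacement text of a parameter
entity included in the DTD to be enlarged by one leading and one following
space, which is what the two-sided padding provides. The names
wrap_pe_reference, toy_input, is_blanks_wrapper and should_wrap are
hypothetical and do not exist in libxml2.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of libxml2: build the text " %name; "
 * for a parameter entity called `name`. The caller frees the result. */
char *wrap_pe_reference(const char *name)
{
    size_t len = strlen(name);
    /* space + '%' + name + ';' + space + NUL terminator */
    char *buf = malloc(len + 5);
    if (buf == NULL)
        return NULL;
    buf[0] = ' ';
    buf[1] = '%';
    memcpy(buf + 2, name, len);
    buf[len + 2] = ';';
    buf[len + 3] = ' ';
    buf[len + 4] = '\0';
    return buf;
}

/* Toy input record (an assumption of this sketch): the flag marks an
 * input that was produced by the wrapper itself. */
struct toy_input {
    const char *cur;
    int is_blanks_wrapper;
};

/* A PE reference is wrapped only when the current input is not already
 * a wrapper; otherwise " %name; " would be wrapped again without end. */
int should_wrap(const struct toy_input *in)
{
    return !in->is_blanks_wrapper;
}
```

With this, wrap_pe_reference("percent") yields " %percent; ", and the
reference met inside that wrapper input is expanded directly because
should_wrap() returns 0 for it.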

The old ctxt->token = ' ' assignments which generate only one sided 
blank padding are removed.

One more test of IS_BLANK is replaced by skipped = xmlSkipBlanks(),
as the IS_BLANK macro doesn't pop input streams.
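
To make the difference concrete, here is a toy model under stated
assumptions, not the libxml2 implementation: the context is reduced to a
stack of NUL-terminated buffers, and every toy_* name is hypothetical. A
peek-style test only inspects the active input, while a skip routine can
pop an exhausted entity input and keep consuming blanks from the one below.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_INPUTS 8

/* Toy parser context: a stack of NUL-terminated input buffers. */
struct toy_ctxt {
    const char *inputs[TOY_MAX_INPUTS]; /* current position in each input */
    int top;                            /* index of the active input */
};

int toy_is_blank(int c)
{
    return c == ' ' || c == '\t' || c == '\n' || c == '\r';
}

/* Peek only: looks at the active input and never pops it. */
int toy_peek_is_blank(const struct toy_ctxt *ctxt)
{
    return toy_is_blank((unsigned char)*ctxt->inputs[ctxt->top]);
}

/* Skip blanks, popping exhausted inputs along the way.
 * Returns the number of blank characters consumed. */
int toy_skip_blanks(struct toy_ctxt *ctxt)
{
    int skipped = 0;
    for (;;) {
        const char *p = ctxt->inputs[ctxt->top];
        if (*p == '\0' && ctxt->top > 0) {
            ctxt->top--;          /* entity input exhausted: pop it */
            continue;
        }
        if (!toy_is_blank((unsigned char)*p))
            return skipped;
        ctxt->inputs[ctxt->top] = p + 1;
        skipped++;
    }
}
```

If the active input is an exhausted entity and the input below it starts
with two spaces, toy_peek_is_blank() reports no blank, whereas
toy_skip_blanks() pops the entity and returns 2.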

  Hum, a couple of pieces of bad news:
    - first, the patches are corrupted in some ways; I had to ask patch
      to ignore space differences to be able to apply them (I applied
      both patches).
    - skipped wasn't defined in your patch; I had to explicitly define it
      as a local variable, otherwise gcc would simply not compile the file

but the main problem is that once applied even really basic stuff like
test/bigentname.xml breaks in the regression tests. I mean nearly all 
tests using entities broke as a result.
Trying with the test you provided also results in an error:

paphio:~/XML -> ./xmllint --valid t4a.xml 
t4a.dtd:1: error: Entity value required
<!ENTITY % percent "&#x25;">
                 ^
t4a.dtd:1: error: xmlParseEntityDecl: entity percent not terminated
<!ENTITY % percent "&#x25;">
                  ^
t4a.dtd:1: error: Extra content at the end of the document
<!ENTITY % percent "&#x25;">
                  ^
t4a.xml:3: error: Entity 'abc' not defined
<root>&abc;</root>
           ^
paphio:~/XML -> 

  I assume it's something you tested on your side so the problem probably arose
from the patch problems.

Could you make the diff on a Linux or Unix box and send it in a way that is
sure not to be corrupted by mail agents, possibly also sending your parser.c
file along so conflicts can be resolved if needed?
I'm sorry but I can't commit the patch if it breaks so many things...

The attached t4a.xml and t4a.dtd are the test case which needs this 
second part of the PEReference patch.

  Well, but it doesn't work here :-\

So, in the context of further performance patches, I'm tempted to 
eliminate all uses of ctxt->token, which will help to streamline some 
code.

Yup. Any chance you can get access to a box where CVS/diff/editors won't mess
with the definition of the blanks and end of lines and where you would be able
to run "make tests"?

There is a stylistic issue to decide: with the elimination of ctxt->token
the following four expressions are identical after preprocessing:

RAW
CUR
NXT(0)
*ctxt->input->cur
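
A small sketch of why the spellings coincide, assuming simplified macro
definitions that no longer consult ctxt->token. The structs are toy
stand-ins rather than the real libxml2 types, the pointer field is called
cur in this toy struct, and toy_spellings_agree is a hypothetical helper.

```c
#include <assert.h>

/* Toy stand-ins; only the shape matters, not the real libxml2 structs. */
struct toy_input { const unsigned char *cur; };
struct toy_ctxt  { struct toy_input *input; };

/* Assumed simplified forms: with no token to check, each macro is a
 * plain read at the current position. */
#define RAW      (*ctxt->input->cur)
#define CUR      (*ctxt->input->cur)
#define NXT(val) (ctxt->input->cur[(val)])

/* Returns 1 when all four spellings read the same character. */
int toy_spellings_agree(struct toy_ctxt *ctxt)
{
    return RAW == CUR && CUR == NXT(0) && NXT(0) == *ctxt->input->cur;
}
```

Since the expansions are the same read, picking one spelling is purely a
readability choice.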

Do you have a preference which of these to use in the future?
 
  CUR, definitely,

  thanks !

Daniel

-- 
Daniel Veillard      | Red Hat Network https://rhn.redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/


