Re: [xml] Substitution of nested entity references



On Thu, Apr 18, 2002 at 11:34:24AM +0200, Henke, Markus wrote:
Well, that task is actually challenging my modest abilities... :)
I guess I'm lacking knowledge of some of libxml's internals.
However, I don't want to give up without a fight, so maybe we can
clarify some points I'm thinking about!?

- Is it necessary (and possible (how)) to track the "state"
  of an entity reference, that is, if the reference is
  already parsed, e.g. as part of an element content?

   If ent->children is NULL, then it likely hasn't been parsed yet.
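
   To make that heuristic concrete, here is a minimal sketch in plain C. It
is a toy model, not libxml code: ToyEntity, ToyNode and the toy_* functions
are hypothetical names invented for illustration, and "parsing" is reduced to
copying the replacement text. The only idea borrowed from the reply above is
that a NULL child list suggests the entity has not been expanded yet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-ins for illustration only; these are NOT libxml types. */
typedef struct ToyNode {
    char *text;              /* expanded replacement text */
} ToyNode;

typedef struct ToyEntity {
    const char *name;
    const char *content;     /* raw replacement text from the declaration */
    ToyNode *children;       /* NULL until the first reference expands it */
    int parse_count;         /* how many times the content was expanded */
} ToyEntity;

/* Mirrors the heuristic: no child list means "probably not parsed yet". */
int toy_entity_is_parsed(const ToyEntity *ent)
{
    return ent != NULL && ent->children != NULL;
}

/* Expand on the first reference, reuse the cached child list afterwards.
 * Returns NULL only if memory allocation fails. */
ToyNode *toy_entity_reference(ToyEntity *ent)
{
    assert(ent != NULL && ent->content != NULL);
    if (ent->children == NULL) {
        size_t len = strlen(ent->content);
        ToyNode *node = malloc(sizeof(*node));
        if (node == NULL)
            return NULL;
        node->text = malloc(len + 1);
        if (node->text == NULL) {
            free(node);
            return NULL;
        }
        memcpy(node->text, ent->content, len + 1);
        ent->children = node;
        ent->parse_count++;
    }
    return ent->children;
}

/* Drop the cached child list; the entity then looks unparsed again. */
void toy_entity_free(ToyEntity *ent)
{
    if (ent != NULL && ent->children != NULL) {
        free(ent->children->text);
        free(ent->children);
        ent->children = NULL;
    }
}
```

   In this model a second call to toy_entity_reference() returns the same
child list and leaves parse_count at 1, which is the kind of "state" the
question asks about.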

- An entity that occurs as part of element content is parsed
  using "xmlParseReference(xmlParserCtxtPtr ctxt)" in parser.c,
  which also contains the logic to build the entity declaration
  and the appropriate child node list.
  What's the reason that an entity reference found in an
  attribute value is handled differently?

  I explained that already: the parser cannot build the tree at that
point. The tree is built only after the SAX callback that provides the
attribute value.

  Is it worth thinking about reusing that logic to parse entity
  refs in attribute values (and if so, how can we get the parser
  context in "xmlStringGetNodeList()")?
  Is "xmlParseReference()" called for any appearance of an entity
  reference or just once per reference?
  If the latter, how is it achieved?

  General handling of entities is *extremely complex*; you can't parse
an entity at declaration time. The idea of factoring layers can also
completely break conformance (make sure you know all the rules concerning
well-formedness and those related to entity references before even starting
to think about changing the code that runs at parse time. Beware, Dragons
Ahead!!!)

- Why isn't the child node list for entity declarations built
  while the doctype declaration is parsed?

  Because you can perfectly well define a completely broken entity, and
the document will still be well-formed if you don't reference it!!!
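
  The rule above can be sketched with a toy checker in plain C. None of this
is libxml code: ToyDecl and the toy_* functions are hypothetical names, the
balance check ignores tag names, and predefined entities such as &amp; and
references nested inside entity content are not handled. It only shows why
checking is deferred: a declaration is stored as raw text, and its content is
examined only when the body actually references it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy declaration record; not a libxml type. */
typedef struct ToyDecl {
    const char *name;
    const char *content;   /* raw replacement text, stored unparsed */
} ToyDecl;

/* Crude balance check: every start tag needs an end tag. Tag names are
 * not compared, and <empty/> tags do not change the nesting depth. */
int toy_content_is_balanced(const char *s)
{
    int depth = 0;
    for (const char *p = s; *p != '\0'; p++) {
        if (*p != '<')
            continue;
        const char *close = strchr(p, '>');
        if (close == NULL)
            return 0;               /* unterminated tag */
        if (p[1] == '/') {
            if (--depth < 0)
                return 0;           /* end tag without a start tag */
        } else if (close[-1] != '/') {
            depth++;                /* start tag, not <empty/> */
        }
        p = close;
    }
    return depth == 0;
}

/* Check only the entities that the body references as &name;.
 * Returns 1 if every referenced entity is declared and balanced. */
int toy_body_is_well_formed(const ToyDecl *decls, size_t ndecls,
                            const char *body)
{
    assert(body != NULL);
    for (const char *p = strchr(body, '&'); p != NULL;
         p = strchr(p + 1, '&')) {
        const char *semi = strchr(p, ';');
        if (semi == NULL)
            return 0;               /* unterminated reference */
        size_t len = (size_t)(semi - p - 1);
        const ToyDecl *found = NULL;
        for (size_t i = 0; i < ndecls; i++) {
            if (strlen(decls[i].name) == len &&
                strncmp(decls[i].name, p + 1, len) == 0) {
                found = &decls[i];
                break;
            }
        }
        if (found == NULL || !toy_content_is_balanced(found->content))
            return 0;               /* undeclared or broken entity */
        p = semi;                   /* continue after this reference */
    }
    return 1;
}
```

  With one sound and one broken declaration, a body that references neither
or only the sound one passes, while a body that references the broken one is
rejected. Checking every declaration eagerly would reject all three bodies.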

That may be enough for now (although there's more...). Maybe these are
some naive questions, but I'm a bit confused now from debugging
through libxml for much too long...  =)

  It is very complex, in part because XML entity handling is complex,
in part due to SAX, and in part due to my lack of global vision when I
coded this initially.

Daniel

-- 
Daniel Veillard      | Red Hat Network https://rhn.redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/


