Re: [xml] entity handling concept
- From: Daniel Veillard <veillard redhat com>
- To: Christian Glahn <christian glahn uibk ac at>
- Cc: xml gnome org
- Subject: Re: [xml] entity handling concept
- Date: Wed, 18 Apr 2001 16:30:53 -0400
On Wed, Apr 18, 2001 at 09:19:12PM +0200, Christian Glahn wrote:
> hi everybody,
>
> during the weekend I tried to work through the entity reference
> concept in the document tree. As it is written in the TODO file,
> the handling of % entity references has to be rethought.

Not necessarily, at least not until we want to be able to edit and save
DTDs, and that's definitely not urgent.

> While I am implementing a small subsystem to handle XML files, this
> problem became important for me.

What kind of application are you writing that needs to write back PE
references? Are you sure you really understand them?

> I would like to read, set up and write
> document trees without worrying about any PERefs I happen to need.

PE references are only allowed within markup declarations in the
external subset. I assume you understand the difference between
parameter entities and general entities.

> As I understand the document tree handling, there is no way to do
> such a job without rethinking the way entity references are handled
> in the current version of libxml2 (for me that is 2.3.5, but I believe
> this is also true for 2.3.6).

But PE references *cannot* be handled in a document tree; they
are *in essence* unstructured, macro-like facilities.

> I went through the code of tree.c and realized that &entity; nodes
> store the entity name and the value of the entity in the node
> nested inside the document tree rather than using it as a reference.

&entities; are *not* PE references.

> Also, the child pointers of the node (*children and *last) point to
> the entity itself. I don't know why the latter is done, since the
> entity node is not needed anymore to handle this specific node.

Okay, entity nodes in the tree are not for PE references; they are for
general entities (external parsed or internal). There are good reasons
to keep both the tree form and the raw content. If you don't understand
why, think about saving, validating and copying nodes.

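To make the saving/validating/copying point concrete, here is a minimal sketch in Python. It is a toy model, not libxml2's actual structures: the names Entity, EntityRef, serialize and text_content are made up for illustration, and the only assumption carried over from the discussion is that a reference node points at a shared declaration holding both raw text and parsed content.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Entity:
    """A declared general entity: raw replacement text plus its parsed form."""
    name: str
    raw: str                                              # text as declared
    children: List["Node"] = field(default_factory=list)  # parsed content


@dataclass
class EntityRef:
    """An &name; node in the tree; it points at the shared declaration."""
    entity: Entity


Node = Union[str, EntityRef]


def serialize(nodes):
    """Saving: write the reference back, not its expansion."""
    return "".join(
        f"&{n.entity.name};" if isinstance(n, EntityRef) else n for n in nodes
    )


def text_content(nodes):
    """Validating or reading: look through the reference at parsed content."""
    return "".join(
        text_content(n.entity.children) if isinstance(n, EntityRef) else n
        for n in nodes
    )


company = Entity(name="company", raw="Example Corp", children=["Example Corp"])
paragraph = ["Written at ", EntityRef(company), "."]
copied = list(paragraph)  # shallow copy: still shares the one declaration

print(serialize(paragraph))     # Written at &company;.
print(text_content(paragraph))  # Written at Example Corp.
```

In this model, saving needs the reference name, reading needs the parsed children, and a copy can keep pointing at the same declaration, which is roughly why both forms are worth keeping.
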
> While thinking about representing PERefs in the Dtd, I would rather
> favor a representation that follows the current implementation
> of "normal" entities (&entities;). This means storing the data (tree)
> underneath the %entity; node rather than replacing the entity reference
> entirely with the imported data.

The problem is that you cannot build a sane tree for PE references.
Just to take an example, look at:

    <!ATTLIST html
      %i18n;
      xmlns %URI; #FIXED 'http://www.w3.org/1999/xhtml'
      >

and

    <!ENTITY % button.content
      "(#PCDATA | p | %heading; | div | %lists; | %blocktext; |
       table | %special; | %fontstyle; | %phrase; | %misc;)*">

Of course %special; etc. are themselves defined in terms of other macros ...
I don't see how to map this to a tree in any useful way for libxml.

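A small sketch of why this is hard, assuming PE references are expanded by plain text substitution. The entity table and its values are invented for illustration (they are not taken from the XHTML DTD), the helper name expand is hypothetical, and the sketch ignores details of the XML specification such as the spaces added around included replacement text.

```python
import re

# Hypothetical, simplified parameter entity table. The values are
# illustrative placeholders, not copied from any real DTD.
PARAM_ENTITIES = {
    "heading": "h1 | h2",
    "lists": "ul | ol",
    "block": "p | %heading; | %lists;",
}

# Matches a parameter entity reference such as %heading;
PE_REF = re.compile(r"%([A-Za-z_][\w.\-]*);")


def expand(text, entities, depth=0):
    """Recursively replace %name; references with their replacement text."""
    if depth > 20:
        raise RecursionError("parameter entity nesting too deep")

    def substitute(match):
        # The replacement text may itself contain further PE references.
        return expand(entities[match.group(1)], entities, depth + 1)

    return PE_REF.sub(substitute, text)


content_model = "(#PCDATA | %block; | table)*"
expanded = expand(content_model, PARAM_ENTITIES)
print(expanded)  # (#PCDATA | p | h1 | h2 | ul | ol | table)*
```

The replacement text "h1 | h2" is only a slice of a content model: it has no node boundary of its own and only makes sense once pasted into the surrounding declaration.
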
> Such an implementation could be made transparent to other parts of
> the library (and its extensions) by providing small wrapper functions
> (e.g. getNodeNextSibling() or setNodePrevSibling() etc.) to access
> the data.

And that would iterate over what kind of nodes?

> The bad thing about this concept is that it keeps the whole data tree
> in memory for each reference node. This could be a problem if such
> entities are used intensively to import large sub-trees.

The bad thing is that parameter entities are not structured; they
cannot be mapped onto a libxml tree in any sane way. And unless you
need to save back modified DTDs, I don't see why you would need to
try to store the unstructured format that DTDs are. And honestly,
if you want to do DTD modification with PE support, libxml2 doesn't
sound like the right tool for the job.

Daniel
--
Daniel Veillard | Red Hat Network http://redhat.com/products/network/
veillard redhat com | libxml Gnome XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/