RE: [xml] Adding an entity in a textnode

The problem (as Daniel wrote before) is that to insert an entity reference
you have to add an entity reference child to the parent element.
If you insert a text child with the content "afdhgadf&sfdjh;sfgjs", the text
node will contain exactly this character sequence and nothing else! The XML
serialization of it would look like this: "afdhgadf&amp;sfdjh;sfgjs"
(assuming the text node is contained in a well-formed document, otherwise no
meaningful serialization exists), which differs from your sequence in that
the ampersand has been escaped.
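
To make the escaping concrete, here is a minimal sketch. It uses Python's
standard xml.etree.ElementTree rather than libxml or the PHP bindings, so the
names (Element, tostring, fromstring) are assumptions for illustration and not
the API discussed in this thread; an element's .text stands in for the text
node. The behaviour on this particular point should be the same:

```python
import xml.etree.ElementTree as ET

# Build <p> with text content that contains a literal ampersand.
p = ET.Element("p")
p.text = "afdhgadf&sfdjh;sfgjs"

# Serializing escapes the ampersand so the output stays well-formed.
serialized = ET.tostring(p, encoding="unicode")
print(serialized)

# Parsing the output back yields the original character sequence,
# i.e. the text content itself never contained an entity reference.
roundtrip = ET.fromstring(serialized).text
print(roundtrip == p.text)
```

So the "&amp;" exists only in the serialized form; the node content is still
the plain characters that were put in.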


-----Original Message-----
From: christoph riedl [mailto:linux daemon gmx de]
Sent: Tuesday, December 10, 2002 5:40 PM
To: xml gnome org
Subject: Re: [xml] Adding an entity in a textnode

Either define German characters as entities, or use their numerical values.
Here is an XML document that contains the Unicode values for all German
umlauts:

  [astaroth:~]$ cat test.xml
    &#xc4; &#xe4;
    &#xd6; &#xf6;
    &#xdc; &#xfc;

Note that your resulting HTML files will never contain &uuml; and similar.

They will contain either the numeric entity, as in my test.xml, or the
character itself, depending on the encoding you use in the resulting HTML.
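
A rough illustration of that last point, as a sketch only: it uses Python's
standard xml.etree.ElementTree, not libxml's HTML output, so details may
differ. If the text holds the character itself, the chosen output encoding
decides whether a numeric reference or the raw character is written:

```python
import xml.etree.ElementTree as ET

p = ET.Element("p")
p.text = "german umlaut \u00fc"  # the character ü itself, not the text "&uuml;"

# An ASCII output cannot represent ü, so a numeric reference is written.
as_ascii = ET.tostring(p, encoding="us-ascii")
# A UTF-8 output can represent ü, so the character is written directly.
as_utf8 = ET.tostring(p, encoding="utf-8")

print(as_ascii)
print(as_utf8.decode("utf-8"))
```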

Perhaps I have not been specific enough.
It does not make any difference whether I use &uuml; or &#xdf;.
The problem is that if I (generally speaking) add a text node that contains
an entity AS TEXT to the document, the leading & of the entity is always
replaced by &amp; in the output.
So the output of create_text_node ( "some text &#xdf; more text" )
(NOTE: the &#xdf; is pure text and NOT an entity node!)
is always "some text &amp;#xdf; more text". Therefore I will never see the
special character in the HTML output. You only get &#xdf; (the & actually is a

I hope this clarifies my problem.

christoph riedl wrote:
Hello everyone.
I'm basically working on a PHP project, but as PHP uses libxml, I guess this
is the place to get a solution to my problem.
Unfortunately I don't know where my "problem" is generated, so I can only
describe the symptoms in the hope that one of you can help me out.

I'm building up an XML document from scratch. Then I create a text node with
the following content: "german umlaut &uuml;". When I then dumpmem the whole
document, the result looks like "german umlaut &amp;uuml;".
The whole project is intended for generating HTML files in the end (after
additional processing via XSLT). So a text node containing an entity would be
perfectly OK.
Is there a way that I can prevent this replacement of & with &amp; in the
output?
My current workaround for this dilemma looks something like:
ereg_replace ( "&amp;", "&", result_from_dump_mem() );
As you might guess, this is very inefficient.

I would be very thankful for any note that would point me toward a solution
or otherwise clear up where this comes from and why it is the way it is.

Christoph Riedl
