Re: [xml] Adding an entity in a textnode

Either define German characters as entities, or use their numerical
Here is an XML document that contains the Unicode values for all German 

  [astaroth:~]$ cat test.xml
    &#xc4; &#xe4;
    &#xd6; &#xf6;
    &#xdc; &#xfc;

Note that your resulting HTML files will never contain &uuml; and similar.

They will either contain the numeric entity, like my test.xml, or the
character itself, depending on the encoding you use in the resulting HTML.
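To illustrate the point about encodings, here is a minimal sketch in Python rather than PHP, using the standard-library xml.etree.ElementTree instead of libxml. The element name "p" and the sample text are made up for the example. It only shows the general serializer behaviour described above: the same character comes out as a numeric reference or as itself, depending on the output encoding.

```python
import xml.etree.ElementTree as ET

# Build a tiny document whose text holds the real character (U+00FC),
# not an entity reference.
root = ET.Element("p")
root.text = "german umlaut \u00fc"

# ASCII cannot represent U+00FC, so the serializer falls back to a
# numeric character reference.
as_ascii = ET.tostring(root, encoding="us-ascii")

# UTF-8 can represent it, so the character itself is written out.
as_utf8 = ET.tostring(root, encoding="utf-8")

print(as_ascii)
print(as_utf8)
```

In neither case does a named entity such as &uuml; appear in the output.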

Perhaps I have not been specific enough.
It does not make any difference whether I use &uuml; or &#xdf;
The problem is that if I (generally speaking) add a text node that contains
an entity AS TEXT to the document, the leading & of the entity is always
replaced by &amp; in the output.
So the output of create_text_node ( "some text &#xdf; more text" ) 
(NOTE: the &#xdf; is pure text and NOT an entity node!)
is always "some text &amp;#xdf; more text". And therefore I will never see the
special char in the HTML output. You only get &#xdf; (the & actually is a

I hope this clarifies my problem.
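The behaviour described here is not specific to PHP's bindings. As a rough analogue (an assumption on my part, not the PHP API itself), the same thing can be reproduced with Python's standard xml.dom.minidom: a text node holds literal characters, so a literal & has to be escaped on output, while the real character passes through untouched. The element name "p" is made up for the example.

```python
from xml.dom.minidom import Document

doc = Document()

# Case 1: the "&" is ordinary text, so the serializer escapes it.
escaped_elem = doc.createElement("p")
escaped_elem.appendChild(doc.createTextNode("some text &#xdf; more text"))
escaped = escaped_elem.toxml()

# Case 2: the text node holds the character itself (U+00DF),
# so there is no "&" to escape.
literal_elem = doc.createElement("p")
literal_elem.appendChild(doc.createTextNode("some text \u00df more text"))
literal = literal_elem.toxml()

print(escaped)
print(literal)
```

So the escaping is the serializer doing its job: a text node never contains markup, only characters.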

christoph riedl wrote:
Hello everyone.
I'm basically working on a PHP project, but as PHP uses libxml, I guess this
is the place to get a solution to my problem. 
Unfortunately I don't know where my "problem" is generated, so I can only
describe the
symptoms in the hope that one of you can help me out.

I'm building up an XML document from scratch. Then I create a text node with the
following content: "german umlaut &uuml;". When I then dumpmem the whole document,
the result looks like "german umlaut &amp;uuml;".
The whole project is intended for generating HTML files in the end (after
additional processing via XSLT). So a text node containing an entity would be
perfectly OK.
Is there a way that I can prevent this replacement of & with &amp; in the output?
My current workaround for this dilemma looks something like:
ereg_replace ( "&amp;", "&", result_from_dump_mem() );
As you might guess, this is very inefficient.
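One alternative to patching the serialized output is to decode entity references before the text node is created, so the node holds the real character from the start. The sketch below shows the idea in Python with the standard html and xml.dom.minidom modules. It is only an illustration, not the PHP API, and make_text_node is a hypothetical helper name.

```python
import html
from xml.dom.minidom import Document


def make_text_node(doc, raw):
    """Hypothetical helper: decode entity references such as &uuml; or
    &#xdf; into real characters, then create the text node."""
    return doc.createTextNode(html.unescape(raw))


doc = Document()
root = doc.createElement("p")
doc.appendChild(root)
root.appendChild(make_text_node(doc, "german umlaut &uuml;"))

result = root.toxml()
print(result)
```

Unlike a blanket replacement of &amp; on the finished output, this leaves a genuine ampersand in the text correctly escaped.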

I would be very thankful for any note that would point me toward a solution
or otherwise
clear up where this comes from and why it is the way it is.

Christoph Riedl
