RE: [xml] Adding an entity in a textnode



Hi,
The problem (as Daniel wrote before) is that to insert an entity reference you
have to add an entity reference node as a child of the parent element.
If you insert a text child with the content "afdhgadf&sfdjh;sfgjs", this text
node will contain exactly this character sequence and nothing else. The XML
serialization of it would look like this: "afdhgadf&amp;sfdjh;sfgjs"
(assuming the text node is contained in a well-formed document; otherwise no
meaningful serialization exists). This differs from your sequence in that
the ampersand has been escaped.
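
To make the difference between the stored text and its serialization concrete, here is a minimal sketch. It uses Python's standard xml.dom.minidom instead of libxml or the PHP binding, purely as an assumption for illustration; the element name "doc" is made up. The same DOM behaviour is what you are seeing in PHP.

```python
from xml.dom import minidom

# Build a tiny document with one element and one text child.
doc = minidom.Document()
root = doc.createElement("doc")  # hypothetical element name
doc.appendChild(root)

text = doc.createTextNode("afdhgadf&sfdjh;sfgjs")
root.appendChild(text)

stored = text.data          # the characters the text node actually holds
serialized = root.toxml()   # the XML serialization of the element

print(stored)       # the raw character sequence, ampersand included
print(serialized)   # the same text with the ampersand escaped
```

The node's data is unchanged; only the serializer turns the ampersand into &amp; so that the output parses back to the same text.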

Cheers,
  Marcus

-----Original Message-----
From: christoph riedl [mailto:linux daemon gmx de]
Sent: Tuesday, December 10, 2002 5:40 PM
To: xml gnome org
Subject: Re: [xml] Adding an entity in a textnode


Either define the German characters as entities, or use their numeric
values. Here is an XML document that contains the Unicode values for the
German special characters:

  [astaroth:~]$ cat test.xml
  <doc>
    &#xc4; &#xe4;
    &#xd6; &#xf6;
    &#xdc; &#xfc;
    &#xdf;
  </doc>

Note that your resulting HTML files will never contain &uuml; and similar.

They will contain either the numeric character reference, as in my test.xml,
or the character itself, depending on the encoding you use for the
resulting HTML.
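
A small sketch of that last point, again assuming Python's xml.dom.minidom as a stand-in for libxml: once parsed, the numeric references are ordinary characters, and the output encoding decides whether they are written as characters or as numeric references.

```python
from xml.dom import minidom

# The references from test.xml, on one line for brevity.
source = "<doc>&#xc4; &#xe4; &#xd6; &#xf6; &#xdc; &#xfc; &#xdf;</doc>"
root = minidom.parseString(source).documentElement

# No encoding given: a str is returned and the characters appear as themselves.
as_unicode = root.toxml()

# ASCII cannot represent these characters, so the serializer falls back
# to numeric character references. The result is bytes.
as_ascii = root.toxml(encoding="ascii")

print(as_unicode)
print(as_ascii)
```

Neither form contains a named entity such as &uuml;, which matches the note above.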

Perhaps I have not been specific enough.
It does not make any difference whether I use &uuml; or &#xdf;.
The problem is that if I (generally speaking) add a text node that contains
an entity reference AS TEXT to the document, the leading & of the reference
is always replaced by &amp; in the output.
So the output of create_text_node ( "some text &#xdf; more text" )
(NOTE: the &#xdf; is pure text and NOT an entity node!)
is always "some text &amp;#xdf; more text". Therefore I will never see the
special character in the HTML output. You only get the literal text &#xdf;
(the & actually is an &amp;).

I hope this clarifies my problem.
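
One possible way around this, sketched under assumptions: decode the references into real characters before creating the text node, so that there is no ampersand left to escape. The sketch uses Python's html.unescape and xml.dom.minidom rather than the PHP functions from this thread, and the element name "p" is made up.

```python
import html
from xml.dom import minidom

raw = "some text &#xdf; more text"

# Turn character references (numeric or HTML-named) into the characters
# they stand for. After this the string holds a real sharp s.
decoded = html.unescape(raw)

doc = minidom.Document()
root = doc.createElement("p")  # hypothetical element name
doc.appendChild(root)
root.appendChild(doc.createTextNode(decoded))

# Serializing to ASCII writes the character as a numeric reference again.
serialized = root.toxml(encoding="ascii")
print(serialized)
```

The text node now holds the character itself, and the serializer picks a representation that fits the output encoding.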


christoph riedl wrote:
Hello everyone.
I'm basically working on a PHP project, but as PHP uses libxml, I guess this
is the place to get a solution to my problem.
Unfortunately I don't know where my "problem" is generated, so I can only
describe the symptoms in the hope that one of you can help me out.

I'm building up an XML document from scratch. Then I create a text node with
the following content: "german umlaut &uuml;". When I then dumpmem the whole
thing, the result looks like "german umlaut &amp;uuml;".
The whole project is intended for generating HTML files in the end (after
additional processing via XSLT), so a text node containing an entity would
be perfectly OK.
Is there a way to prevent this replacement of & with &amp; in the
text node?
My current workaround for this dilemma looks something like:
ereg_replace ( "&amp;", "&", result_from_dump_mem() );
As you might guess, this is very inefficient.
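
Besides the cost, a blanket replacement like this can leave output that is no longer well-formed XML, which matters if it is fed to XSLT afterwards. A minimal sketch of that effect, assuming Python's xml.dom.minidom in place of the PHP calls; the sample text and the helper is_well_formed are made up for illustration.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

doc = minidom.Document()
root = doc.createElement("p")  # hypothetical element name
doc.appendChild(root)
root.appendChild(doc.createTextNode("AT&T says &uuml;"))

serialized = root.toxml()                   # ampersands are escaped here
patched = serialized.replace("&amp;", "&")  # the blanket workaround


def is_well_formed(xml_text):
    """Return True if the string parses as XML, False otherwise."""
    try:
        minidom.parseString(xml_text)
        return True
    except ExpatError:
        return False


print(is_well_formed(serialized))
print(is_well_formed(patched))
```

The patched string contains a bare ampersand and a reference to an undeclared entity, so an XML parser rejects it.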

I would be very thankful for any note that points me toward a solution
or otherwise clears up where this comes from and why it is the way it is.

Regards,
Christoph Riedl





_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
xml gnome org
http://mail.gnome.org/mailman/listinfo/xml






