Re: [xml] Question about UTF-8 output / character references



On Sun, Nov 13, 2005 at 12:55:48AM +0100, Geert Jansen wrote:
Hi,

I'm sorry if this is a newbie question. I'm serializing an XML 
document using xmlSaveDoc() with UTF-8 as the output encoding. It 
seems that non-ASCII Unicode characters are always saved as XML 
character references, not as UTF-8 encoded characters. So for example, 
a euro sign is always output as &#x20AC; and never as the three bytes 
\xe2, \x82, \xac.

  Hmm, strange... it may depend on where the characters are located too.

Is this correct? When I study the source, it seems that 
xmlEscapeEntities() always outputs character references. However, the 
comment near the top of the function seems to indicate that the function 
is only used when there is no encoding. Is this still true?

  Does your document have doc->encoding set?

Daniel

-- 
Daniel Veillard      | Red Hat http://redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/


