[xml] Question about UTF-8 output / character references


I'm sorry if this is a newbie question. I'm serializing an XML document using xmlSaveDoc() with UTF-8 as the output encoding. It seems that non-ASCII Unicode characters are always saved as XML character references, not as UTF-8 encoded characters. For example, a euro sign is always output as &#x20AC; and never as the three bytes \xe2, \x82, \xac.
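
To make the expected bytes concrete, here is a minimal sketch of the standard UTF-8 encoding rule in plain C. It does not use libxml2 and says nothing about what xmlSaveDoc() does internally; the helper name utf8_encode is made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of libxml2): encode one Unicode code point
 * as UTF-8 into out, which must hold at least 4 bytes.
 * Returns the number of bytes written, or 0 for an invalid code point
 * (a surrogate or a value above U+10FFFF). */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {                      /* 1 byte: 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                     /* 2 bytes: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                   /* 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                     /* surrogates are not valid scalar values */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                 /* 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                             /* beyond the Unicode range */
}
```

Calling utf8_encode(0x20AC, buf) writes the three bytes 0xE2, 0x82, 0xAC, which is what I would expect to find in the serialized output in place of the character reference.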

Is this correct? When I study the source, it seems that xmlEscapeEntities() always outputs character references. However, the comment near the top of the function seems to indicate that the function is only used when there is no encoding. Is this still true?

