Re: [xml] xmllint: Why does it convert UTF-8 to numeric entity refs?

On Mon, 4 Aug 2003, Daniel Veillard wrote:

(It would be nice to have an optional flag to xmllint allowing a choice
of output between characters and numeric refs.)

  There are just too many options already, and this would require propagating
one more set of flags for the serialization routine. I see no justification
for adding more complexity to the serializer. The outputs are equivalent
anyway: as soon as the data is processed by an XML parser, the two versions
cannot be distinguished.

Yes, I understand this. I should have explained that my main concern is
when using "--format" to reformat source XML code. If the purpose of
running xmllint is to provide code that is easier for a human to work
with (as opposed to a parser), UTF-8 is often preferable to ASCII +
numeric entity references. For example, one may not have a parsing
XML editor and may need to use a plain Unicode-aware text editor like Vim
to edit XML documents that contain a lot of Chinese, Arabic, etc.
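
To make the difference concrete, here is a minimal sketch. It uses
Python's standard xml.etree.ElementTree module purely as a stand-in (it
is not xmllint or libxml2, and the element name and sample text are made
up for illustration). It shows that the two forms parse to the same
content, while the serialized text a human has to read differs.

```python
import xml.etree.ElementTree as ET

# The same (hypothetical) element, written two ways.
with_refs = "<title>&#20013;&#25991;</title>"   # ASCII + numeric references
with_chars = "<title>\u4e2d\u6587</title>"      # the literal characters

parsed_refs = ET.fromstring(with_refs)
parsed_chars = ET.fromstring(with_chars)

# After parsing, the two versions cannot be distinguished.
same_text = parsed_refs.text == parsed_chars.text

# Serializing to ASCII forces numeric references for non-ASCII characters.
ascii_form = ET.tostring(parsed_chars, encoding="us-ascii").decode("ascii")

# Serializing to a Unicode string keeps the characters readable.
unicode_form = ET.tostring(parsed_refs, encoding="unicode")

print(same_text)
print(ascii_form)
print(unicode_form)
```

For a parser the first and second forms are interchangeable; for someone
editing the file in Vim, only the second is legible.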

Anyway, there are other programs out there to do XML reformatting
(although some of them are buggy), so I would not consider this a high
priority for xmllint.


David Sewell

David Sewell, Managing Editor
Electronic Imprint, The University of Virginia Press
PO Box 400318, Charlottesville, VA 22904-4318 USA
Courier: 310 Old Ivy Way, Suite 302, Charlottesville VA 22903
Email: dsewell virginia edu   Tel: +1 434 924 9973
