[xml] --html option breaks encoding
- From: Damien Deschodt <damien d gmx com>
- To: xml gnome org
- Subject: [xml] --html option breaks encoding
- Date: Sun, 12 Nov 2017 21:06:44 +0100
Hello,
I am trying to produce an EPUB from some HTML documentation, after running a few XPath queries. XPath
requires well-formed XML, so I tried to clean the input up with xmllint. Unless I'm mistaken, the --html
option breaks the encoding.
Without --html:
$ echo '<title>Introduction — Vue.js</title>' | xmllint --encode UTF-8 --format -
[...]
<title>Introduction — Vue.js</title>
With --html (which seems to be required for entire documents), the "—" gets mangled in the output:
$ echo '<title>Introduction — Vue.js</title>' | xmllint --html --htmlout --encode UTF-8 --format -
[...]
<title>Introduction — Vue.js</title>
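One possible explanation, offered as an assumption rather than something established in this message: the input carries no encoding declaration, so an HTML parser may fall back to a single-byte charset such as ISO-8859-1 and read each of the three UTF-8 bytes of the dash as a separate character. The Python sketch below only illustrates that mechanism with the standard library; it does not call xmllint or libxml2.

```python
# Illustration only: what happens when UTF-8 bytes are read as ISO-8859-1.
# Assumption: the parser falls back to a single-byte charset because the
# input declares no encoding. This is a guess, not confirmed in this thread.

em_dash = "\u2014"                      # the dash used in the <title> above

raw = em_dash.encode("utf-8")           # three bytes: e2 80 94
misread = raw.decode("iso-8859-1")      # each byte becomes one character

print(raw.hex(" "))                     # the bytes as stored in the input
print([hex(ord(c)) for c in misread])   # the three code points after misreading

# The damage is reversible as long as nothing else touched the text:
repaired = misread.encode("iso-8859-1").decode("utf-8")
print(repaired == em_dash)
```

If this is indeed the cause, declaring the charset inside the input itself (for example with a `<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">` element ahead of the title) would be the first thing to try.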
$ xmllint --version
xmllint: using libxml version 20904
compiled with: Threads Tree Output Push Reader Patterns Writer SAXv1 FTP HTTP DTDValid HTML Legacy C14N
Catalog XPath XPointer XInclude Iconv ICU ISO8859X Unicode Regexps Automata Expr Schemas Schematron Modules
Debug Zlib Lzma
Thanks in advance.