Re: [xml] xmllint --html problem?
- From: Daniel Veillard <veillard redhat com>
- To: Elizabeth Mattijsen <liz dijkmat nl>
- Cc: xml gnome org
- Subject: Re: [xml] xmllint --html problem?
- Date: Fri, 9 Nov 2001 08:46:03 -0500
On Fri, Nov 09, 2001 at 01:03:40PM +0100, Elizabeth Mattijsen wrote:
> Does the following sequence of commands indicate a problem in the HTML
> parsing of libxml or not?
No
> # xmllint --version
> xmllint: using libxml version 20409
> # xmllint --html --encode UTF8 71.html >71.xml 2>/dev/null
This parses an HTML resource and saves an HTML resource.
I assume there were errors (hidden by 2>/dev/null), so I don't have much context.
> # xmllint --noout 71.xml
> 71.xml:53: error: Input is not proper UTF-8, indicate encoding !
> ophy of Education, The</a><br/>Edited by Michael A. Peters (New Zealand)Ã? &
> ^
> 71.xml:53: error: Bytes: 0xC3 0x20 0x50 0x61
> ophy of Education, The</a><br/>Edited by Michael A. Peters (New Zealand)Ã? &
> File "71.html" available on request: it's about 53K which I thought would
> be too large to send to the list right away...
You're asking the XML parser to parse an HTML resource; it fails,
which is not surprising.
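The byte dump in the error report shows what the XML parser tripped over: 0xC3 is the lead byte of a two-byte UTF-8 sequence and must be followed by a continuation byte in the range 0x80-0xBF, but here it is followed by 0x20 (a space). A rough way to reproduce that check outside libxml is sketched below. It assumes an iconv command is installed, and is_valid_utf8 is a hypothetical helper name made up for this illustration, not part of xmllint.

```shell
#!/bin/sh
# is_valid_utf8 (hypothetical helper): succeeds only if standard input is
# well-formed UTF-8. Assumes iconv is available; converting UTF-8 to UTF-8
# makes iconv exit non-zero when it meets an invalid byte sequence.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# The four bytes from the error report: 0xC3 0x20 0x50 0x61,
# written as octal escapes because POSIX printf has no \x escapes.
if printf '\303\040\120\141' | is_valid_utf8; then
    echo "reported bytes: valid UTF-8"
else
    echo "reported bytes: invalid UTF-8"
fi

# For comparison: 0xC3 followed by the continuation byte 0xA9 encodes U+00E9.
if printf '\303\251' | is_valid_utf8; then
    echo "0xC3 0xA9: valid UTF-8"
else
    echo "0xC3 0xA9: invalid UTF-8"
fi
```

The same helper can be pointed at a whole file, for example `is_valid_utf8 < 71.xml`, to tell an encoding problem apart from a markup problem.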
xmllint does not magically convert HTML to XHTML. Use Tidy for this
(see the W3C page for pointers).
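A minimal sketch of the Tidy route suggested above, assuming both tidy and xmllint are installed. The function name html_to_xhtml is hypothetical, and the file names are the ones from this thread. Note that -utf8 tells Tidy to treat the input as UTF-8 as well; if 71.html is in another encoding, the encoding options would need adjusting per Tidy's own documentation.

```shell
#!/bin/sh
# html_to_xhtml (hypothetical helper): convert an HTML file to well-formed
# XHTML with Tidy.
#   $1 = input HTML file, $2 = output file
html_to_xhtml() {
    # -asxml    write the output as well-formed XHTML
    # -numeric  use numeric rather than named entities
    # -utf8     read and write UTF-8
    # -quiet    suppress non-essential output
    # stderr is deliberately not discarded, so diagnostics stay visible.
    tidy -quiet -asxml -numeric -utf8 "$1" > "$2"
    status=$?
    # Tidy exits 0 on success, 1 if it only issued warnings, 2 on errors.
    [ "$status" -le 1 ]
}

# Usage, with the file names from this thread:
#   html_to_xhtml 71.html 71.xml && xmllint --noout 71.xml
```

Checking the exit status this way accepts Tidy's warnings (common for real-world HTML) while still stopping on hard errors, so the xmllint check only runs on output Tidy was able to produce.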
Daniel
--
Daniel Veillard | Red Hat Network https://rhn.redhat.com/
veillard redhat com | libxml Gnome XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/