Re: [xml] encoding question..
- From: Daniel Veillard <veillard redhat com>
- To: Manos Moschous <moschous ics forth gr>
- Cc: libXml2 <xml gnome org>
- Subject: Re: [xml] encoding question..
- Date: Mon, 20 Sep 2004 08:18:43 -0400
On Mon, Sep 20, 2004 at 01:58:38PM +0300, Manos Moschous wrote:
> Hi,
> I ran:
> $ ./testHTML.exe --encode ISO-8859-7 htmlSites/index.html
> htmlSites/index.html:12: error: Input is not proper UTF-8, indicate encoding !
> That means the document contains non-UTF-8 strings.
> index.html is the Greek version of www.google.com.gr, with
> encoding-type: ISO-8859-7
> What do I have to do to parse the HTML document normally?
When you save a page to a file, for example, you lose the HTTP headers, which
may indicate the encoding of the original.
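To illustrate the point about lost headers, here is a minimal Python sketch (not part of the original thread). It assumes the server sent a header like `Content-Type: text/html; charset=ISO-8859-7`; the header value, the sample Greek string, and the helper name `charset_from_content_type` are all hypothetical. It shows that ISO-8859-7 bytes are rejected as UTF-8, and that the charset from the header is what makes them decodable.

```python
from email.message import Message


def charset_from_content_type(header_value, default="utf-8"):
    """Return the charset parameter of a Content-Type header value.

    Hypothetical helper for illustration; falls back to `default`
    when the header carries no charset parameter.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset() or default


# Sample Greek text, encoded the way an ISO-8859-7 page would store it.
greek = "Καλημέρα"
raw = greek.encode("iso-8859-7")

# Without the header, a UTF-8 decoder rejects the bytes, which is the
# same kind of failure the parser reported.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# With the (assumed) header, the bytes decode cleanly.
charset = charset_from_content_type("text/html; charset=ISO-8859-7")
decoded = raw.decode(charset)
```

Once the page is saved to disk that header is gone, so the encoding has to be supplied some other way. In libxml2, one option is the encoding argument of `htmlReadFile`.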
Anyway, parsing Google's HTML is pointless; there is a direct XML interface.
Jumping into the code and asking a question every 5 minutes is not the proper way
to build correct code. You should look at the problem more globally and read about
the issues first. I already pointed you at the encoding page! Please refrain
from asking at every little step along the way; read the docs, and think!
Daniel
--
Daniel Veillard | Red Hat Desktop team http://redhat.com/
veillard redhat com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/