[xml] encoding problem with iso-8859-1?

From: Morus Walter <morus walter tanto-xipolis de>
To: xml gnome org
Subject: [xml] encoding problem with iso-8859-1?
Date: Wed, 20 Mar 2002 11:48:16 +0100

Hi,

when I try to parse a document encoded in iso-8859-1, I get an
error message saying that the input is not proper UTF-8, although
the encoding is declared as iso-8859-1:
(1194 ~/Drop-Box) xmllint 123103.xml
123103.xml:3: error: Input is not proper UTF-8, indicate encoding !
<title>Israel reagiert mit HÃÃ¤rte auf AnschlÃ¤ge vom Wochenende</title>
                            ^
123103.xml:3: error: Bytes: 0xC3 0xC3 0xA4 0x72
<title>Israel reagiert mit HÃÃ¤rte auf AnschlÃ¤ge vom Wochenende</title>
                            ^
The document contains a very long line (~1750 characters).
However, the problem does not seem to be connected to this (at least
not directly).
The problem disappears if
- I add a newline after the <title> tag and before the </title> tag, or
- I delete the <head> tag and everything after the title element (except
  the closing </nitf>).

So something seems to go wrong with the character conversion.
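
The bytes in the error message can be checked with a minimal Python sketch. It assumes the affected word is "Härte", as the title text suggests, and only inspects the four reported bytes; it does not reproduce the parser problem itself. The names `reported`, `expected` and `is_valid_utf8` are illustrative.

```python
# Bytes printed by xmllint in the error message: 0xC3 0xC3 0xA4 0x72
reported = bytes([0xC3, 0xC3, 0xA4, 0x72])

# In ISO-8859-1 the characters "a-umlaut" + "r" are the two bytes 0xE4 0x72.
latin1_input = "\u00e4r".encode("iso-8859-1")

# A correct ISO-8859-1 -> UTF-8 conversion of those two bytes.
expected = latin1_input.decode("iso-8859-1").encode("utf-8")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print("ISO-8859-1 input: ", latin1_input.hex(" "))
print("expected UTF-8:   ", expected.hex(" "))
print("reported bytes:   ", reported.hex(" "))
print("expected is valid UTF-8:", is_valid_utf8(expected))
print("reported is valid UTF-8:", is_valid_utf8(reported))
# The reported bytes equal the expected ones with one extra leading 0xC3.
print("reported minus first byte == expected:", reported[1:] == expected)
```

Under that assumption, the reported sequence is the correct UTF-8 encoding of "är" preceded by one stray 0xC3 lead byte. That would be consistent with a lead byte being emitted twice during conversion, though the sketch cannot confirm where in the parser this happens.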

libxml2 version is 2.4.18, libc is 2.2.4, OS is Linux, kernel 2.4.10.
I have attached the XML file.

greetings
        Morus

PS: I didn't enter the bug in Bugzilla, since I didn't see a way
to add a file there, and I think the sample file is important.

Attachment: 123103.xml
Description: Binary data

Follow-Ups:
- Re: [xml] encoding problem with iso-8859-1?
  - From: Daniel Veillard
- Re: [xml] encoding problem with iso-8859-1?
  - From: Daniel Veillard
