Re: [xml] xmllint --html problem?



Hi Daniel,

At 10:37 AM 11/9/01 -0500, Daniel Veillard wrote:
  Can you send it as an attachment? Mail tools cannot be trusted
to preserve the main part.

Sent it by separate mail, not to the list...


  What is the original encoding? I think the problem might be there:
the initial conversion fails because HTML assumes ISO-8859-1
and this may not be the case (though it could be Portuguese names and hence
I would expect that encoding ...).

That wouldn't explain the correct conversion of the same character earlier on the same line...


> I looked at the source code, but must admit I'm out of my league there. ;-

  It's more complex than that: the sequence of bytes the parser may see
at that point may already have been translated from ISO-8859-1 to UTF-8
implicitly.
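
As a small illustration of the implicit translation mentioned above, the following Python sketch shows how a single ISO-8859-1 byte turns into two bytes once it is converted to UTF-8. The sample name "José" is an assumption chosen for the example, not data from the file under discussion, and the code says nothing about how libxml2 itself performs the conversion.

```python
# Illustration only: one accented ISO-8859-1 character becomes two UTF-8 bytes.
latin1_bytes = b"Jos\xe9"                  # "José" in ISO-8859-1: 4 bytes
text = latin1_bytes.decode("iso-8859-1")   # decode to a Unicode string
utf8_bytes = text.encode("utf-8")          # re-encode as UTF-8: 5 bytes

print(latin1_bytes.hex(" "))  # 4a 6f 73 e9
print(utf8_bytes.hex(" "))    # 4a 6f 73 c3 a9
```

So a parser looking at the converted buffer sees the pair C3 A9 where the original file had the single byte E9.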

Then maybe the problem is that the error recovery always assumes there is only a single byte to skip, rather than potentially 2 or more bytes in the case of UTF-8?
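
To make that guess concrete, here is a minimal Python sketch of the two recovery strategies. It is not libxml2's recovery code; the function names `utf8_sequence_length` and `resume_position` are hypothetical and exist only for this illustration. It assumes the buffer already holds UTF-8 and that the error is reported at the lead byte of a multi-byte character.

```python
def utf8_sequence_length(lead: int) -> int:
    """Length implied by a UTF-8 lead byte; 1 for ASCII or an invalid lead."""
    if (lead & 0x80) == 0x00:
        return 1
    if (lead & 0xE0) == 0xC0:
        return 2
    if (lead & 0xF0) == 0xE0:
        return 3
    if (lead & 0xF8) == 0xF0:
        return 4
    return 1  # stray continuation byte or invalid lead: skip just this byte


def resume_position(buf: bytes, pos: int, whole_sequence: bool) -> int:
    """Return the index where parsing would resume after an error at pos."""
    if not whole_sequence:
        return pos + 1  # single-byte recovery
    end = min(pos + utf8_sequence_length(buf[pos]), len(buf))
    nxt = pos + 1
    # Skip only bytes that really are continuation bytes (10xxxxxx).
    while nxt < end and (buf[nxt] & 0xC0) == 0x80:
        nxt += 1
    return nxt


buf = "a\u00e9b".encode("utf-8")           # b"a\xc3\xa9b"
single = resume_position(buf, 1, False)    # lands on 0xA9, mid-character
whole = resume_position(buf, 1, True)      # lands on "b"
```

With single-byte recovery the parser resumes on the continuation byte 0xA9, which is itself not a valid start of a character, so a second error would follow. Skipping the whole sequence resumes cleanly at the next character.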


Elizabeth Mattijsen




