Re: [xml] HTML-parser: encoding?



In message <4 2 0 58 20011129152709 02371980 pop dijkmat nl>
          Elizabeth Mattijsen <liz dijkmat nl> wrote:

The HTML-parser of libxml2 is very nice.  But I wonder what the real goal
of that parser is (has there been a discussion about that? If so, I seem
to have missed it).

If it is there to allow you to take _any_ (dirty) HTML-file and turn it
into a valid XML DOM, then its functionality is still not complete.

Currently, if there is no encoding specification found in an HTML-file,
ISO-Latin-1 is assumed.  However, no check is performed whether all text
characters actually fall within ISO-Latin-1!
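
To illustrate what such a check might look like: strictly speaking every
byte value maps to some ISO-8859-1 code point, but the range 0x80..0x9F
holds C1 control codes that almost never occur in real text, so finding one
is a reasonable hint that the file is really in another encoding (for
example Windows-1252 or UTF-8).  The sketch below is only a heuristic under
that assumption; first_c1_byte() is a hypothetical helper, not a libxml2
function.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of libxml2.
 *
 * Scans a raw input buffer and returns the offset of the first byte in
 * the C1 control range 0x80..0x9F, or -1 if there is none.  A hit
 * suggests that the "no encoding given, so ISO-Latin-1" default is
 * probably wrong for this input.
 */
long first_c1_byte(const unsigned char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL || len == 0);

    for (i = 0; i < len; i++) {
        if (buf[i] >= 0x80 && buf[i] <= 0x9F)
            return (long)i;
    }
    return -1;
}
```

A caller could run this over the raw bytes before handing them to the
parser and, on a hit, warn the user or try a different encoding.  It will
not catch every mislabelled file, since many non-Latin-1 byte sequences
contain no bytes in that range at all.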


