The htmlCtxtReset() function contains the following line:

    ctxt->charset = XML_CHAR_ENCODING_UTF8;

However, htmlNewParserCtxt() and htmlInitParserCtxt() do not do this; they leave charset set to zero.

This means that calling htmlCtxtReset() on a context changes the behaviour of a subsequent htmlCtxtReadFile() compared to using a fresh parsing context.

Is this a bug? It certainly seems a bit awkward.


