[xml] How to parse HTML files with ampersands in URIs not encoded as "&amp;"?



Hello,

I'm pretty new to this list and, as you may have guessed, sorry for my English :)

I have an issue concerning parsing HTML files with the HTMLparser API.
The web page has tag attributes that contain URIs with ampersands
not encoded as "&amp;".
As expected, the parser (with the HTML_PARSE_RECOVER option) reports an error:
htmlParseEntityRef: expecting ';'

The resulting xmlDoc is missing many elements.

So, I would like to know if there is a way to parse such HTML files with libxml?
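
If libxml has no option for this, one workaround I can imagine is to
pre-process the buffer and escape the bare ampersands before handing it to
the parser. Below is a rough sketch in plain C. The names
escape_bare_ampersands and entity_ref_len are made up for illustration and
are not libxml functions. It treats "&name;", "&#123;" and "&#x1F;" as
entity references and rewrites every other "&" as "&amp;". It is crude: it
runs over the whole buffer, so it would also touch ampersands inside
script or style content.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: s points at '&'. Returns the length of a
 * well-formed entity reference starting there (including the ';'),
 * or 0 if the '&' does not start one. */
static size_t entity_ref_len(const char *s)
{
    size_t i = 1;
    size_t start;

    if (s[i] == '#') {
        i++;
        if (s[i] == 'x' || s[i] == 'X') {       /* hexadecimal: &#x26; */
            i++;
            start = i;
            while (isxdigit((unsigned char)s[i]))
                i++;
        } else {                                /* decimal: &#38; */
            start = i;
            while (isdigit((unsigned char)s[i]))
                i++;
        }
        if (i == start)
            return 0;                           /* no digits at all */
    } else {                                    /* named: &amp; */
        if (!isalpha((unsigned char)s[i]))
            return 0;
        while (isalnum((unsigned char)s[i]))
            i++;
    }
    return s[i] == ';' ? i + 1 : 0;
}

/* Hypothetical helper: returns a newly allocated copy of `in` in which
 * every '&' that does not start an entity reference is replaced by
 * "&amp;". The caller frees the result. Returns NULL if malloc fails. */
char *escape_bare_ampersands(const char *in)
{
    const char *s;
    char *out, *p;

    assert(in != NULL);

    /* Worst case: every input byte is '&' and grows to 5 bytes. */
    out = malloc(strlen(in) * 5 + 1);
    if (out == NULL)
        return NULL;

    p = out;
    for (s = in; *s != '\0'; ) {
        size_t n = (*s == '&') ? entity_ref_len(s) : 0;

        if (n > 0) {                /* keep existing references as-is */
            memcpy(p, s, n);
            p += n;
            s += n;
        } else if (*s == '&') {     /* bare ampersand: escape it */
            memcpy(p, "&amp;", 5);
            p += 5;
            s++;
        } else {
            *p++ = *s++;
        }
    }
    *p = '\0';
    return out;
}
```

The escaped buffer would then be passed to the HTML parser in place of the
original one. I would still prefer a parser option if one exists.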

Thanks,
Pierre

PS: I apologize in advance if I have missed an explanation in
previous posts.


