Re: [xml] Parsing tag-soup HTML

Hi Nick,

 Coming back with some kind of definition of what tag-soup parser
behaviour actually is is probably more important than digging into the
libxml2 code. I am not sure we can emulate the behaviour of web browser
parsers.

It's worth looking at the HTML5 specification:

Section 8, "The HTML Syntax", is the relevant part. It still needs some work, but it is under active development and is a good starting point for working out how to treat messy real-world HTML and, hopefully, for getting behaviour similar to that of web browsers.
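
To make the "definition of behaviour" point a bit more concrete, here is a minimal sketch in Python (not libxml2, and not the HTML5 algorithm itself). It uses the standard library's html.parser module, which tokenizes leniently but does not repair the tree. The class name EventLogger, the helper tokenize and the sample string SOUP are made up for illustration.

```python
from html.parser import HTMLParser


class EventLogger(HTMLParser):
    """Record raw start/end/text events without trying to fix nesting."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("text", data))


def tokenize(markup):
    """Return the list of events the lenient tokenizer reports for markup."""
    parser = EventLogger()
    parser.feed(markup)
    parser.close()
    return parser.events


# Hypothetical tag-soup input: two unclosed <p> elements and
# mis-nested <b>/<i> (the </b> arrives while <i> is still open).
SOUP = "<p>one<p>two <b>bold <i>both</b> italic</i>"

if __name__ == "__main__":
    for event in tokenize(SOUP):
        print(event)
```

The tokenizer reports the tags exactly as written: neither <p> is ever closed, and the </b> shows up before the </i>. Deciding what tree those events should produce is the part that needs a written definition, and that is what the tree construction rules in the HTML5 draft try to pin down.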

Best regards,


