[xml] xmllint --html --xmlout

How robust is xmllint --html --xmlout?

Is it possible to confuse it so badly it won't continue or will generate ill-formed markup? Or will it keep on trucking no matter what?

How does the HTML parser handle bogons (unrecognized elements)? Are they treated as empty or dropped or something else?
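One way to probe these questions empirically is to feed deliberately broken HTML through the command and check whether the result parses as XML. The following is only a minimal sketch: the sample markup, the `<bogon>` element, and the function name `roundtrip_is_well_formed` are made up for illustration, and it assumes xmllint is on the PATH and reads standard input when given `-` as the file name.

```python
import shutil
import subprocess
import xml.etree.ElementTree as ET

# Hypothetical sample of messy HTML: unclosed <p> tags, a misnested
# <b>/<i> pair, an unrecognized element (<bogon>), and no </html>.
MESSY_HTML = (
    b"<html><body><p>one<p>two <b><i>bold</b></i>"
    b"<bogon>text</bogon><br></body>"
)


def roundtrip_is_well_formed(html_bytes):
    """Run xmllint --html --xmlout on html_bytes and test the output.

    Returns True if the output parses as XML, False if it does not,
    and None if xmllint is not installed or produced no output.
    """
    if shutil.which("xmllint") is None:
        return None  # assumption: xmllint must be on the PATH
    proc = subprocess.run(
        ["xmllint", "--html", "--xmlout", "-"],
        input=html_bytes,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,  # parser warnings go here
    )
    if not proc.stdout.strip():
        return None  # the parser gave up without emitting a document
    try:
        ET.fromstring(proc.stdout)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    print(roundtrip_is_well_formed(MESSY_HTML))
```

Printing `proc.stdout` for the sample would also show directly what happens to the unrecognized element. A check like this only tests the inputs it is given, so it can find a counterexample but cannot prove robustness.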

How good an alternative is this to TagSoup and Tidy?

I'm working on a book about converting messy old HTML to clean XHTML, and I'm trying to decide exactly how much of each tool to recommend when.

Elliotte Rusty Harold  elharo metalab unc edu
Java I/O 2nd Edition Just Published!
