[xml] correct way to output HTML without the <?xml...



I've got a libxml2 DOM with HTML in it (note: it's just an XML DOM
with HTML in it).

I want to output it as well-formed HTML, preferably with the HTML 4.01
Transitional DTD but no XML declaration, e.g.:

  <!DOCTYPE HTML ...>
  <html>
    <head>
      <link rel="xxx" .../>
    </head>
    .
    .
    .
  </html>

I'm not sure of the right way to achieve this.

I tried this:

    htmldoc = libxml2.newDoc("1.0")
    html4_transitional = htmldoc.newDtd("HTML", "-//W3C//DTD HTML 4.01 Transitional//EN",
                                        "http://www.w3.org/TR/html4/loose.dtd")
    htmldoc.addChild(html4_transitional)
    .
    .
    .
    xml_output_buffer.htmlDocContentDumpFormatOutput(htmldoc, "utf-8", 1)

but that spits out HTML with no DOCTYPE, and the markup isn't well-formed.


Outputting with:

   xml_output_buffer.saveFormatFileTo(htmldoc, 'UTF-8', 1)

outputs the XML including the XML declaration, which I don't want
since this is well-formed HTML.
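
To make the goal concrete, here is a minimal sketch of the output being
asked for. It uses the standard library's xml.dom.minidom as a stand-in
for the libxml2 DOM (an assumption, made only so the example is
self-contained), and the helper name dump_without_xml_decl is
hypothetical. It serializes only the root element, so no XML declaration
is written, and prepends the DOCTYPE by hand:

```python
from xml.dom.minidom import parseString

# DOCTYPE for HTML 4.01 Transitional, written out by hand
# (same public and system identifiers as in the newDtd() call above).
HTML4_TRANSITIONAL = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" '
    '"http://www.w3.org/TR/html4/loose.dtd">'
)


def dump_without_xml_decl(doc):
    """Hypothetical helper: serialize only the root element.

    Serializing the element rather than the document means the
    <?xml ...?> declaration is never emitted.
    """
    return HTML4_TRANSITIONAL + "\n" + doc.documentElement.toxml()


# Stand-in document; in the real case this would be the libxml2 DOM.
doc = parseString('<html><head><link rel="xxx"/></head><body/></html>')
output = dump_without_xml_decl(doc)
print(output)
```

This is a workaround that bypasses the document-level serializer, not
necessarily the intended libxml2 way of doing it.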


Anybody know the right way to go about this?


Nic


