Re: [xml] Can I add an htmlElemDesc



At 06:11 PM 09/21/01 -0400, Daniel Veillard wrote:
Next question.  When parsing HTML can I add an htmlElemDesc?  I'd like to
be able to define new tags so that I don't get a warning such as "Tag foo
invalid".

 Hum, you can't in libxml.

html40ElementTable is static const, so it is not modifiable.
You can avoid the warning, but that warning is there for a good reason,
HTML is not designed to be extensible (some SGML'ers may flame me for
saying so but it is true for HTML as most people use it - a ubiquitous
presentation format).

Yes, I understand.  I'm adding libxml2 to swish-e, and there are two uses for
this.  One is to allow an <ignore> tag around text that should not be
indexed.  I think Inktomi does a similar thing.  The other is that swish is
sometimes used to index chunks of HTML that are marked by <tag>s so the
searches can be limited to parts of a page.  Yes, truly ugly.  It would be
better to pass the HTML content from parsing XML back to the HTML parser.

Anyway, is the only option to scan the warning message and, if it names an
OK tag, skip printing the message?
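
To make the question concrete, here is a rough sketch of the kind of filter I
have in mind.  It assumes the warning text has the form "Tag foo invalid", as
in the message quoted above; the exact wording may differ between libxml
versions.  The names allowed_tags, is_allowed_tag_warning and
filtered_error are my own, not part of libxml.  The handler has the same
shape as libxml2's xmlGenericErrorFunc, so it could presumably be installed
with xmlSetGenericErrorFunc(), but the code below is plain C and does not
link against libxml.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical list of extra tags the application accepts. */
const char *const allowed_tags[] = { "ignore", "foo", NULL };

/* Return 1 if msg looks like "Tag NAME invalid" and NAME is in
 * allowed_tags, 0 otherwise.  The comparison is case-sensitive. */
int is_allowed_tag_warning(const char *msg)
{
    char name[64];
    int end = 0;

    /* %n is only stored if the literal "invalid" was matched too. */
    if (sscanf(msg, "Tag %63s invalid%n", name, &end) != 1 || end == 0)
        return 0;

    for (size_t i = 0; allowed_tags[i] != NULL; i++) {
        if (strcmp(name, allowed_tags[i]) == 0)
            return 1;
    }
    return 0;
}

/* Error callback with the (void *ctx, const char *msg, ...) shape.
 * It formats the message first, then drops it if it only complains
 * about an allowed tag; everything else goes to stderr. */
void filtered_error(void *ctx, const char *fmt, ...)
{
    char buf[512];
    va_list ap;

    (void)ctx; /* unused in this sketch */

    va_start(ap, fmt);
    vsnprintf(buf, sizeof buf, fmt, ap);
    va_end(ap);

    if (is_allowed_tag_warning(buf))
        return;
    fputs(buf, stderr);
}
```

With this, a call such as filtered_error(NULL, "Tag %s invalid\n", "ignore")
prints nothing, while the same call for a tag not in the list still reaches
stderr.  It only hides the message; the parser still treats the tag as
unknown.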

Thanks,



Bill Moseley
mailto:moseley hank org



