Re: [xml] How is ignorableWhitespace defined?

On 03 Sep 2001 14:13:32 -0400, Daniel Veillard wrote:
On Mon, Sep 03, 2001 at 08:01:25PM +0200, Jonas Borgström wrote:

How is ignorableWhitespace in SAX defined?

  HTMLparser.c line 1749

Yes, I have read this function, and it was a bit unclear, but my question
was where you found the information on how to handle whitespace. The
function areBlanks contains several "cases" for how to handle whitespace
in different elements like "html", "b" and "em".

The section about whitespace in the HTML 4.01 specification doesn't
say much about how to handle whitespace in this case.

Is there some other document I can read?

static int areBlanks(htmlParserCtxtPtr ctxt, const xmlChar *str, int len) {
    int i;
    xmlNodePtr lastChild;

    for (i = 0;i < len;i++)
        if (!(IS_BLANK(str[i]))) return(0);

    if (CUR == 0) return(1);
    if (CUR != '<') return(0);
    if (ctxt->name == NULL)
        return(1);
    if (xmlStrEqual(ctxt->name, BAD_CAST"html"))
        return(1);
    if (xmlStrEqual(ctxt->name, BAD_CAST"head"))
        return(1);
    if (xmlStrEqual(ctxt->name, BAD_CAST"body"))
        return(1);

  your text is a child of body, hence the result.
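
For readers following along, here is a minimal, self-contained sketch of
the decision described above. It is not the libxml2 code: the names
all_blanks and is_ignorable_ws are hypothetical, plain char stands in for
xmlChar, and the final "everything else is significant" branch is an
assumption that replaces whatever further checks the real function makes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: 1 if every byte in str[0..len) is a blank. */
static int all_blanks(const char *str, size_t len) {
    for (size_t i = 0; i < len; i++) {
        char c = str[i];
        if (c != ' ' && c != '\t' && c != '\n' && c != '\r')
            return 0;
    }
    return 1;
}

/*
 * Simplified sketch of the classification discussed in this thread.
 * parent: name of the enclosing element, or NULL if there is none.
 * next:   the input character that follows the run (0 at end of input).
 * Returns 1 if the run would be reported as ignorable whitespace.
 */
static int is_ignorable_ws(const char *parent, const char *str,
                           size_t len, char next) {
    if (!all_blanks(str, len)) return 0;  /* real text is never ignorable */
    if (next == 0) return 1;              /* trailing blanks at end of input */
    if (next != '<') return 0;            /* blanks followed by more text */
    if (parent == NULL) return 1;         /* outside any element */
    if (strcmp(parent, "html") == 0 ||
        strcmp(parent, "head") == 0 ||
        strcmp(parent, "body") == 0)
        return 1;                         /* directly under a structural element */
    /* Assumption: any other parent keeps the blanks as characters. */
    return 0;
}
```

Under these assumptions, is_ignorable_ws("body", "   ", 3, '<') returns 1,
while the same run with a parent of "p" returns 0 and would be delivered
as ordinary characters.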

If I add a paragraph like this:

<span>FOO</span>        <span>BAR</span>

"./testHTML jborg.html" still gives the same result, but
"./testHTML --sax jborg.html" now generates a sax.characters()
call instead of a sax.ignorableWhitespace() call.

But this might be because the debug SAX handlers don't call
the "real" SAX handlers.

Should HTML browsers (gtkhtml2 in this case) also render
text from the sax->ignorableWhitespace callback?

  In general no, but since HTML rendering has no defined meaning
in terms of SAX, it's up to you to decide.
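
One possible policy for a renderer, sketched under stated assumptions: keep
a single collapsed space for each ignorable-whitespace run that follows
some text, and drop it otherwise. The RenderBuffer type and both handler
names are hypothetical; the callbacks only mimic the (context, text, length)
shape of SAX character callbacks and use plain char instead of xmlChar.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical render target: a small fixed-size text buffer. */
typedef struct {
    char   text[256];
    size_t used;
} RenderBuffer;

/* Append n bytes, truncating rather than overflowing the buffer. */
static void buf_append(RenderBuffer *b, const char *s, size_t n) {
    size_t room = sizeof b->text - 1 - b->used;
    if (n > room)
        n = room;
    memcpy(b->text + b->used, s, n);
    b->used += n;
    b->text[b->used] = '\0';
}

/* Ordinary text: copy it through unchanged. */
static void on_characters(void *user, const char *ch, int len) {
    if (len > 0)
        buf_append((RenderBuffer *)user, ch, (size_t)len);
}

/* Ignorable whitespace: collapse the whole run to at most one space. */
static void on_ignorable_whitespace(void *user, const char *ch, int len) {
    RenderBuffer *b = (RenderBuffer *)user;
    (void)ch;                                  /* content is all blanks anyway */
    if (len <= 0 || b->used == 0) return;      /* nothing before it: drop */
    if (b->text[b->used - 1] == ' ') return;   /* already separated */
    buf_append(b, " ", 1);
}
```

Feeding "FOO", a run of eight blanks, and "BAR" through these handlers
leaves "FOO BAR" in the buffer. Whether that is the right behaviour for a
given element is a rendering decision, not something SAX defines.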

Mozilla and Konqueror render a space between "FOO" and "BAR".

  That doesn't help me at all; if you told me *WHY* they do so, then
I might make some progress.

I would if I knew why, that is what I'm trying to find out.

/ Jonas

Jonas Borgström                  jonas codefactory se
CodeFactory AB         
Office: +46 (0)90 71 86 10       Cell: +46 (0)70 248 89 58
