Re: [xml] libxml2 add <p> tag
- From: Morus Walter <morus walter tanto-xipolis de>
- To: ml oxymium net
- Cc: xml gnome org
- Subject: Re: [xml] libxml2 add <p> tag
- Date: Mon, 7 Oct 2002 10:08:56 +0200
Hi,
>
> I've found the following problem with libxml2 2.4.16
>
> [manu xx yy]$ cat /tmp/test.html
> <gsweb name="UploadFile">Upload</gsweb>/<gsweb name="ProcessFile">Processing</gsweb>
>
> [manu xx yy]$ xmllint /tmp/test.html --html
> /tmp/test.html:1: error: Tag gsweb invalid
> <gsweb name="UploadFile">Upload</gsweb>/<gsweb name="ProcessFile">Processing</g
> ^
> /tmp/test.html:1: error: Tag gsweb invalid
> <gsweb name="UploadFile">Upload</gsweb>/<gsweb name="ProcessFile">Processing</g
> ^
> <?xml version="1.0" standalone="yes"?>
> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
> <html><body><gsweb name="UploadFile">Upload</gsweb><p>/<gsweb name="ProcessFile">Processing</gsweb></p></body></html>
> ^
> ____________________________________________________|
>
>
> I get the same thing with a file containing:
> <?xml version="1.0" standalone="yes"?>
> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
> <html><body><a name="UploadFile">Upload</a>/<a name="ProcessFile">Processing</a></body></html>
>
>
> I've found nothing about this problem in the list. Is it a fixed bug or how can I avoid this <p> addition ?
>
This is a characteristic of libxml's HTML parser.
The parser has built-in "knowledge" about allowed structures and adds
paragraphs if it finds content in contexts where it "thinks" they are
required.
It might be considered a bug that the paragraph is not inserted before
the first <a> element in your second example (the <p> insertion seems to
be triggered by PCDATA content and not by inline elements like <a>,
which is not very consistent).
OTOH tag soup parsing is always a mess...
And there is no way to guess the right behaviour for user-created elements
like gsweb.
Apart from modifying libxml, there is AFAIK no way of preventing the
<p> addition in the HTML parser.
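If you mainly need to notice when the parser has added elements, one option
is to compare start-tag counts in the original file and in the xmllint
output. The following is only a rough sketch under that assumption: it does
not use libxml2 at all, it relies on Python's standard html.parser module
(a plain tokenizer that does not repair markup), and the names TagCounter,
count_start_tags and implied_tags are made up for this illustration. The
two strings are the input and output from your first example.

```python
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts start tags by name; it tokenizes only and never restructures markup."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag names in lower case.
        self.counts[tag] = self.counts.get(tag, 0) + 1


def count_start_tags(markup):
    """Return a {tag_name: count} mapping for all start tags in markup."""
    counter = TagCounter()
    counter.feed(markup)
    counter.close()
    return counter.counts


def implied_tags(source, serialized):
    """Return {tag: extra_count} for tags that occur more often in the
    serialized parser output than in the original source (hypothetical helper)."""
    before = count_start_tags(source)
    after = count_start_tags(serialized)
    return {tag: n - before.get(tag, 0)
            for tag, n in after.items() if n > before.get(tag, 0)}


# Input and output taken from the first example in the quoted report.
source = ('<gsweb name="UploadFile">Upload</gsweb>/'
          '<gsweb name="ProcessFile">Processing</gsweb>')
serialized = ('<html><body><gsweb name="UploadFile">Upload</gsweb><p>/'
              '<gsweb name="ProcessFile">Processing</gsweb></p></body></html>')

print(implied_tags(source, serialized))
```

This only reports which elements were added (here html, body and p); it
does not stop the HTML parser from adding them.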
HTH, greetings
Morus