Re: [xml] Comments dropped when inside tag



On Tue, Dec 08, 2009 at 05:47:31PM +0200, Lia wrote:
Hi,

I have a problem with comments when using mod_proxy_html (version
3.0.1) with libxml2 (version 2.7.6).
Some comments are dropped from the HTML code while it is parsed
for URL rewriting.
From a short look at the code, I suspect the unexpected behaviour comes from
libxml, but I am not 100% sure; please forgive the off-topic post if that is
not so.

Given an HTML document of the form:

<html xmlns="http://www.w3.org/1999/xhtml ">
   <head>
       <meta name="description" content="....."/>

  here

      ..

  That piece of non-space text ends the <head> and opens the <body>.

      <!--[if lte IE 6]>
            <link href="..." rel="StyleSheet" type="text/css" media="all" />
      <![endif]-->
      <script type="text/javascript" src="..."></script>
       <!--[if lte IE 6]>
              <script type="text/javascript" src="..."></script>
       <![endif]-->
  </head>
     ...
</html>

After parsing with SAX, the first comment is dropped; only the comment that
is the last child of the <head> element is preserved.
Does anyone know whether dropping comments inside tags is a problem in
libxml?

  Except that the HTML parser seems to do its job:

paphio:~/tmp -> xmllint --html tst.html
tst.html:12: HTML parser error : Unexpected end tag : head
  </head>
         ^
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
"http://www.w3.org/TR/REC-html40/loose.dtd">
<html xmlns="http://www.w3.org/1999/xhtml ">
<head><meta name="description" content="....."></head>
<body><p>
      ..
      <!--[if lte IE 6]>
            <link href="..." rel="StyleSheet" type="text/css" media="all" />
      <![endif]-->
      <script type="text/javascript" src="..."></script><!--[if lte IE 6]>
              <script type="text/javascript" src="..."></script>
       <![endif]-->
     ...
</p></body>
</html>
paphio:~/tmp ->
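
  The output above shows both comments surviving, but moved into the
  <body>, because the stray ".." text closed the <head> early. A quick way
  to spot such input before it reaches the proxy is to scan for non-space
  text directly inside <head>. The following is only a rough sketch using
  Python's standard html.parser, not libxml2 or mod_proxy_html; the class
  name HeadTextChecker, the VOID set, and the shortened SAMPLE document are
  made up for illustration.

```python
from html.parser import HTMLParser

# Elements that never take an end tag (partial list, enough for this sketch).
VOID = {"meta", "link", "base", "br", "hr", "img", "input"}


class HeadTextChecker(HTMLParser):
    """Collect comments and flag non-space text that is a direct child of <head>."""

    def __init__(self):
        super().__init__()
        self.stack = []        # currently open elements
        self.stray_text = []   # non-space text found directly inside <head>
        self.comments = []     # every comment event reported by the tokenizer

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.stack.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Self-closed tags such as <meta .../> are never pushed.
        pass

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Pop up to and including the matching open element.
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        if self.stack and self.stack[-1] == "head" and data.strip():
            self.stray_text.append(data.strip())

    def handle_comment(self, data):
        self.comments.append(data)


# Hypothetical, shortened version of the document from the question.
SAMPLE = """<html>
  <head>
    <meta name="description" content="x"/>
    ..
    <!--[if lte IE 6]><link href="a.css" rel="StyleSheet"/><![endif]-->
    <script type="text/javascript" src="a.js"></script>
    <!--[if lte IE 6]><script src="b.js"></script><![endif]-->
  </head>
</html>"""

checker = HeadTextChecker()
checker.feed(SAMPLE)
checker.close()
print("stray text in <head>:", checker.stray_text)
print("comments seen:", len(checker.comments))
```

  Unlike libxml2's HTML parser, html.parser only tokenizes and does not
  repair the tree, so it reports the stray text without closing the <head>
  for you.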

  Oh, and the extra space at the end of your xmlns value means your
data is not XHTML!
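
  Namespace names are compared as plain strings, so a trailing space gives
  a different namespace from the XHTML one. A minimal sketch of that check,
  assuming Python's standard xml.etree.ElementTree rather than libxml2; the
  helper root_namespace and the two tiny documents are hypothetical:

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def root_namespace(xml_text):
    """Return the namespace name of the root element, or "" if it has none."""
    root = ET.fromstring(xml_text)
    tag = root.tag  # ElementTree spells qualified names as "{namespace}local"
    if tag.startswith("{"):
        return tag[1:].split("}", 1)[0]
    return ""


good = root_namespace('<html xmlns="http://www.w3.org/1999/xhtml"><head/></html>')
bad = root_namespace('<html xmlns="http://www.w3.org/1999/xhtml "><head/></html>')

print(repr(good), good == XHTML_NS)
print(repr(bad), bad == XHTML_NS)
```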

Daniel

-- 
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
daniel veillard com  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/


