[xml] Reordering of meta tags -- Bug?



Hi everybody,

I use the latest versions of libxml2 and XML::LibXML to develop a
natural language processing system under Perl. I guess that the
following problem is libxml2-related and not a bug in XML::LibXML.
Apologies if that's not actually the case.

I use a well-formed XHTML document as XML input:

    my $p = XML::LibXML->new();      
    $p->expand_entities(0);
    $p->keep_blanks(1);
    $p->pedantic_parser(1);
    my $dom = $p->parse_string($input);
    my $r = $dom->getDocumentElement;

Then I do a recursive descent starting at the root node, compute a
couple of properties for every node, and finally call

    my $output = $dom->toString();

to pass the resulting markup on to the next processing stage
of the system.
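
A minimal sketch of such a traversal, for illustration only. The helper
name walk(), the recorded properties (depth and number of child nodes)
and the sample input are assumptions made up for this example; the real
system computes different properties.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: depth-first walk that records, for every element
# node, its name, its depth and its number of child nodes.
sub walk {
    my ($node, $depth, $props) = @_;
    if ($node->nodeType == XML_ELEMENT_NODE) {
        my @children = $node->childNodes;
        push @$props, {
            name     => $node->nodeName,
            depth    => $depth,
            children => scalar @children,
        };
    }
    # Recurse into all children (text nodes simply have none).
    walk($_, $depth + 1, $props) for $node->childNodes;
    return $props;
}

# Made-up sample input, not the XHTML document discussed in this mail.
my $input = '<root><a><b/></a><c>text</c></root>';
my $dom   = XML::LibXML->new->parse_string($input);
my $props = walk($dom->getDocumentElement, 0, []);

printf "%s depth=%d children=%d\n", @{$_}{qw(name depth children)}
    for @$props;
```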

Take a minimal example like the following:

    <?xml version="1.0" encoding="iso-8859-1"?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
      "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
      <head>
        <title>foo</title>
        <meta http-equiv="Content-Type" content="text/html"></meta>
      </head>
      <body>
        Hello, World.
      </body>
    </html>

If I parse this XHTML with XML::LibXML and call toString(), the
title and meta elements come out in swapped order, and the content
attribute of the meta element has gained a charset parameter:

    ...
    <head><meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
      <title>foo</title>
    ...

As far as I understand from browsing this mailing list's archive,
a reordering of meta tags was introduced some time ago. The problem
is that I cannot modify the <meta http-equiv="..."> element in any
way. I can fetch its name and check for attributes, e.g., things like

    $elem->hasAttribute("http-equiv")

work, but

    $elem->setAttributeNS("myNS", "foo", "bar")

has no effect at all. Even removing the node has no effect:
as soon as I call toString(), the node in question magically
reappears. Is there any way to bypass this behaviour, or to actually
delete this node?
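
A cut-down script along these lines should reproduce the removal
attempt. It is a sketch under some assumptions: load_ext_dtd(0) is added
here only so that the DTD is not fetched over the network, the
pedantic_parser setting is left out, and what toString() prints at the
end may depend on the installed libxml2 version, so no expected output
is shown.

```perl
use strict;
use warnings;
use XML::LibXML;

# The minimal XHTML example from above.
my $input = <<'XHTML';
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>foo</title>
    <meta http-equiv="Content-Type" content="text/html"></meta>
  </head>
  <body>Hello, World.</body>
</html>
XHTML

my $p = XML::LibXML->new();
$p->expand_entities(0);
$p->keep_blanks(1);
$p->load_ext_dtd(0);    # assumption: do not fetch the external DTD
my $dom = $p->parse_string($input);

# Remove every <meta> element from the DOM tree.
for my $meta ($dom->getElementsByTagName('meta')) {
    $meta->parentNode->removeChild($meta);
}

# Count what is left in the tree, then serialize for inspection.
my @left      = $dom->getElementsByTagName('meta');
my $remaining = scalar @left;
print "meta elements left in the tree: $remaining\n";
print $dom->toString();
```

Comparing the count printed for the tree with the serialized output
shows whether the meta element comes from the DOM or is added only
during serialization.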

Kind regards,
        Georg
-- 
Georg Rehm uni-giessen de           http://Georg-Re.hm
                                    http://www.uni-giessen.de/germanistik/ascl/
Research Group for Applied and Computational Linguistics, University of Giessen


