[xml] How to extract text content using the PHP interface?



Consider the following simple XML and associated PHP program
to extract the string value of "file/*" and store it in a PHP
structure:

<files>
  <file>
    <name>casin.flv</name>
    <size>717536</size>
  </file>
  <file>
    <name>cat.flv</name>
    <size>725477</size>
  </file>
  <file>
    <name>tren1.flv</name>
    <size>5291492</size>
  </file>
</files>

<?php
$dom = new DOMDocument;
$dom->load( $argv[1]);
$xpath = new DOMXPath( $dom);
$nodelist = $xpath->query( '/files/file/*/text()');
$files = array();
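if ( $nodelist && $nodelist->length ) {
  // The text nodes arrive in document order: name, size, name, size, ...
  for ( $i = 0; $i < $nodelist->length; ) {
    $name = $nodelist->item($i++)->wholeText;
    $size = $nodelist->item($i++)->wholeText;
    $files[] = array( 'name' => $name, 'size' => $size);
  }
}
else {
  // query() returns an empty list, not false, when nothing matches
  echo "nothing found\n";
}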
var_dump( $files);

This achieves the desired effect, but it is not nice at all. I'd
rather not query for "file/*/text()" and then access the obscure
wholeText property. I'd prefer to query for "file/*" and then call
something like textContent(), to_literal(), or maybe serialize() or
toString(), as in Perl:

use strict;
use warnings;
use XML::LibXML;
my $file = shift or die 'File!';
my $parser = XML::LibXML->new;
my $doc = $parser->parse_file( $file);
my $xpc = XML::LibXML::XPathContext->new( $doc);
my @strings = map $_->textContent, $xpc->findnodes( '/files/file/*');
print "$_\n" for @strings;

I know Perl is better, but that's not the point. I'm just wondering
whether I'm missing some libxml-related feature in PHP.
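
One thing that may be worth trying, as a sketch rather than a
definitive answer: PHP's DOMNode has a textContent property (a
property, not a method), so it should be possible to query for the
elements themselves and read their string value directly. The inline
XML below just mirrors the sample above, and keying each entry by
nodeName via a nested relative query is my own choice for
illustration, not anything the PHP interface requires.

```php
<?php
// Sketch only: inline copy of the sample document, so the script is self-contained.
$xml = <<<XML
<files>
  <file>
    <name>casin.flv</name>
    <size>717536</size>
  </file>
  <file>
    <name>cat.flv</name>
    <size>725477</size>
  </file>
  <file>
    <name>tren1.flv</name>
    <size>5291492</size>
  </file>
</files>
XML;

$dom = new DOMDocument;
$dom->loadXML( $xml);
$xpath = new DOMXPath( $dom);

$files = array();
// Select the <file> elements, then their element children relative to each one.
foreach ( $xpath->query( '/files/file') as $file ) {
  $entry = array();
  foreach ( $xpath->query( '*', $file) as $child ) {
    // textContent is the concatenated text of the node and its descendants.
    $entry[ $child->nodeName ] = $child->textContent;
  }
  $files[] = $entry;
}
var_dump( $files);
```

This avoids both the text() step and the reliance on name and size
always arriving in strict alternation, since each entry is built per
file element.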

Michael Ludwig


