Re: [xml] preserveWhiteSpace bug



On Thu, Jul 07, 2005 at 09:45:38PM -0400, Trevor Lowing wrote:
Hello,

There seems to be a bug in libxml2 2.6.x when handling XML files with Windows
CRLF-style line breaks. The DOM parser reads the line breaks as empty text
nodes, and preserveWhiteSpace is ignored. A temporary workaround is to save
the file with Unix LF line breaks instead of Windows CRLF line breaks. The
bug has been reported several times on the PHP bug list, but the problem is
with the underlying libxml2 library. Code samples are already posted in
these bug reports.



http://bugs.php.net/bug.php?id=31873

  In XML, white space in content is significant.
The option of discarding blank nodes is a non-conforming extra option, and
   preserveWhiteSpace = false;
is, I assume, what activates that option.
Libxml2 is very cautious and tends to ignore that option when it believes
that honoring it could lead to information loss.
This option can be exercised by using xmllint --noblanks, and it seems to
work as expected for me:

paphio:~/XML -> cat tst.xml
<Programmers>
        <Programmer>
                <firstName>&#321;ukasz</firstName>
                <lastName>Budnik</lastName>
        </Programmer>
</Programmers>
paphio:~/XML -> xmllint --noblanks tst.xml
<?xml version="1.0"?>
<Programmers><Programmer><firstName>&#x141;ukasz</firstName><lastName>Budnik</lastName></Programmer></Programmers>
paphio:~/XML ->
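
  To make the idea of "blank nodes" concrete, here is a minimal sketch of
what dropping them amounts to. It is written in Python with the standard
xml.dom.minidom module, not with libxml2 or PHP, so it only illustrates the
concept and says nothing about how --noblanks is implemented. The helper
names drop_blank_text_nodes and compact are made up for this example.

```python
from xml.dom import minidom

# Same document as tst.xml above, with LF line breaks.
SAMPLE = (
    "<Programmers>\n"
    "        <Programmer>\n"
    "                <firstName>&#321;ukasz</firstName>\n"
    "                <lastName>Budnik</lastName>\n"
    "        </Programmer>\n"
    "</Programmers>\n"
)


def drop_blank_text_nodes(node):
    """Recursively remove text nodes that contain only whitespace."""
    for child in list(node.childNodes):
        if child.nodeType == child.TEXT_NODE and child.data.strip() == "":
            node.removeChild(child)
        else:
            drop_blank_text_nodes(child)


def compact(xml_text):
    """Parse xml_text and serialize the root element without blank nodes."""
    doc = minidom.parseString(xml_text)
    drop_blank_text_nodes(doc)
    return doc.documentElement.toxml()


if __name__ == "__main__":
    lf_result = compact(SAMPLE)
    # Hypothetical CRLF variant of the same file, built in memory.
    crlf_result = compact(SAMPLE.replace("\n", "\r\n"))
    print(lf_result)
    print(lf_result == crlf_result)
```

In this sketch a node counts as blank when its text is whitespace only, so
the LF and CRLF variants of the input end up with the same serialization.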

  I don't own a Windows licence, so I can't test the problem on Windows.
It works for me... Also, one should make sure that PHP opens the file in
binary mode for reading, as this has been a problem in the past.
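
  As a rough illustration of the binary-mode point, the sketch below reads
a CRLF file as raw bytes and hands those bytes to the parser unchanged. It
uses Python rather than PHP, so it shows the general idea only and not what
PHP does internally. The names CRLF_XML, read_xml_bytes and roundtrip_crlf
are hypothetical.

```python
import os
import tempfile
from xml.dom import minidom

# Small hypothetical document with Windows CRLF line breaks.
CRLF_XML = b"<Programmers>\r\n  <Programmer/>\r\n</Programmers>\r\n"


def read_xml_bytes(path):
    """Read the file in binary mode so no newline translation happens."""
    with open(path, "rb") as handle:
        return handle.read()


def roundtrip_crlf():
    """Write CRLF XML to a temp file, read it back as bytes, and parse it."""
    fd, path = tempfile.mkstemp(suffix=".xml")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(CRLF_XML)
        raw = read_xml_bytes(path)
        # The parser receives exactly the bytes that are on disk.
        doc = minidom.parseString(raw)
        return raw, doc.documentElement.tagName
    finally:
        os.remove(path)


if __name__ == "__main__":
    raw, tag = roundtrip_crlf()
    print(raw == CRLF_XML, tag)
```

Reading in binary mode leaves line-break handling to the XML parser
instead of to the I/O layer.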

http://bugs.php.net/bug.php?id=32033

  Seems to be the exact same problem, right ...
  There is nothing I can do on my side, but I will take patches from people
who own and run a Windows development environment.

Daniel

-- 
Daniel Veillard      | Red Hat Desktop team http://redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/


