Re: [xml] Problem with xmlReadFile on Windows and 0x10 characters

On 6/23/2010 16:41, James Ytterstene wrote:

I'm writing a C++ wrapper in which I read an XML file with xmlReadFile("filename", NULL, 0);
I created the file via libxml2 calls and saved it to disk.

The file I'm trying to read in my example is:
<?xml version="1.0" encoding="UTF-8"?>
<configuration name="testSaveLoadXMLFileLayout">
<parameter name="floatParam1" datatype="float">92345.1</parameter>

Now my problem:
If the file has not been touched by any Windows editor, the line ending is CR only, but if someone edits the file it is changed to CRLF (stupid Windows editors, but we must use them). If I now try to read the file back with libxml2, I get an extra node at each line containing only 0x10. Blanks on the new line seem to be cleared OK. If I change the xmlReadFile call and add the option XML_PARSE_NOBLANKS, I can read the file back OK. But when reading about that option I find many posts saying not to use it, so I'm confused here.

Hi, James.

We're using this option in Wine for our MS-like XML processing wrapper. It's used for both CR and CRLF files, because some files bundled with an application could have CRLF formatting. I have to say that there's no problem with it, and no bugs have been reported for opening/parsing that could be affected by this, AFAIK.
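
If XML_PARSE_NOBLANKS feels too broad, another route is to leave the parser options alone and skip whitespace-only text nodes while walking the tree. libxml2 offers xmlIsBlankNode() for that. The sketch below only illustrates roughly the same check on a plain string, without libxml2; is_blank_text is a hypothetical name and not part of any library.

```cpp
#include <string>

// Hypothetical helper, not a libxml2 function.
// Returns true when the text holds nothing but XML whitespace
// (space, tab, carriage return, line feed). Empty text counts as blank.
bool is_blank_text(const std::string& text) {
    for (char c : text) {
        if (c != ' ' && c != '\t' && c != '\r' && c != '\n') {
            return false;  // found real content
        }
    }
    return true;
}
```

A tree walker could call such a predicate on each text node's content and skip the nodes for which it returns true, which would drop the extra per-line nodes while keeping text such as "92345.1".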

If I remember correctly, a compliant parser shouldn't care about carriage returns and line feeds inside markup, and text/CDATA/comments should be processed as they are in the file.
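
For reference, XML 1.0 (section 2.11, End-of-Line Handling) has the parser normalize line endings on input: each CRLF pair and each lone CR is passed on as a single LF. The sketch below shows that rule on a plain string. It is only an illustration of the rule, not libxml2's implementation, and normalize_line_endings is a hypothetical name.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper illustrating XML 1.0 end-of-line normalization:
// CRLF -> LF, lone CR -> LF, everything else copied unchanged.
std::string normalize_line_endings(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '\r') {
            out.push_back('\n');
            // Swallow the LF of a CRLF pair so it is not emitted twice.
            if (i + 1 < in.size() && in[i + 1] == '\n') {
                ++i;
            }
        } else {
            out.push_back(in[i]);
        }
    }
    return out;
}
```

Under this rule a file saved with CRLF and the same file saved with LF should yield the same text content after parsing, so any whitespace-only nodes between elements would contain LF in both cases.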

When I read about libxml2 and how files should be parsed, I get the feeling that the parser should handle CRLF when reading files and always save new files with CR only. So the extra CRLF shouldn't be an issue, but I may be wrong here. Is there any general solution for parsing files so that CR/CRLF doesn't add any extra nodes?

Yes, saving is another story. In Wine, to get more Windows-like output, we have to mess with the output stream and add line feeds. libxml2 doesn't provide a way to customize this (and it shouldn't, actually, according to the standards). If you need specifically formatted output, you need to write the node dumping yourself.
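
A minimal sketch of that kind of output post-processing, assuming the document has already been serialized into a string with LF line endings: convert each bare LF to CRLF before writing it out. This is not Wine's actual code, and to_crlf is a hypothetical name.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: rewrites bare LF as CRLF for Windows-style output.
// An LF that already follows a CR is left alone, so existing CRLF pairs
// are not doubled.
std::string to_crlf(const std::string& in) {
    std::string out;
    out.reserve(in.size() + in.size() / 16);
    for (std::size_t i = 0; i < in.size(); ++i) {
        const char c = in[i];
        if (c == '\n' && (i == 0 || in[i - 1] != '\r')) {
            out += "\r\n";
        } else {
            out.push_back(c);
        }
    }
    return out;
}
```

The result should then be written in binary mode, otherwise the C runtime on Windows would translate each LF again and produce CR CR LF.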

