[xml] SAX API returns wrong length for characters containing non-ASCII


I'm trying to parse an XML document with the SAX API. The document contains some German umlauts. A short example:
<?xml version="1.0" encoding="UTF-8"?>
<X><n>hello from Köln</n><g>http://www.koeln.de</g></X>
<X><n>öhello from Köln</n><g>http://www.koeln.de</g></X>

The callback to my charactersSAXFunc tells me the string inside <n>...</n> on the first line has a length of 12, so the string I save for later use is just "hello from K".

For the second line, however, it returns the correct length of 18, so I get the complete string. The difference is that this string starts with a non-ASCII character. The same happens, by the way, with French letters.

Did I perhaps forget to tell the parser something?
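
One possible explanation, offered as an assumption rather than a confirmed diagnosis: a SAX parser is allowed to deliver the text of a single element in several characters callbacks, and the length it passes is a byte count of UTF-8 data, not a character count. That would fit the numbers above: "hello from K" is 12 bytes, and "öhello from Köln" is 18 bytes in UTF-8. Below is a minimal sketch of a callback that appends every chunk instead of keeping only the first. It does not link against libxml2; on_characters only mirrors the shape of charactersSAXFunc (with xmlChar written out as unsigned char), and TextBuf, demo_collect and the 12 + 4 byte split are hypothetical names and values chosen for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical accumulator for the text of one element. */
typedef struct {
    char  *data;   /* NUL-terminated bytes collected so far */
    size_t len;    /* number of bytes in data, excluding the NUL */
} TextBuf;

/* Same shape as a charactersSAXFunc callback. ch is NOT NUL-terminated
 * and len counts bytes, so a two-byte UTF-8 character such as "ö"
 * contributes 2 to len. */
static void on_characters(void *ctx, const unsigned char *ch, int len)
{
    TextBuf *buf = ctx;
    assert(len >= 0);

    /* Grow the buffer to hold the old text, the new chunk and a NUL. */
    char *grown = realloc(buf->data, buf->len + (size_t)len + 1);
    if (grown == NULL)
        return;  /* out of memory: keep what was collected so far */

    memcpy(grown + buf->len, ch, (size_t)len);
    buf->len += (size_t)len;
    grown[buf->len] = '\0';
    buf->data = grown;
}

/* Simulates a parser that reports "hello from Köln" in two chunks
 * (assumed split: 12 bytes of ASCII, then 4 bytes starting at "ö"). */
static TextBuf demo_collect(void)
{
    TextBuf buf = { NULL, 0 };
    const unsigned char part1[] = "hello from K";
    const unsigned char part2[] = "\xC3\xB6ln";   /* "öln" in UTF-8 */

    on_characters(&buf, part1, 12);
    on_characters(&buf, part2, 4);
    return buf;
}
```

In a real handler the buffer would presumably be reset in the start-element callback and read in the end-element callback, so that all chunks belonging to one <n>...</n> end up in one string.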

Thanks for your great effort :)
