[xml] SAX API returns wrong length for characters containing non-ASCII



Hello,

I'm trying to parse an XML document with the SAX API. The document
contains some German umlauts. A short example:
<?xml version="1.0" encoding="UTF-8"?>
<Mediathek>
<X><n>hello from Köln</n><g>http://www.koeln.de</g></X>
<X><n>öhello from Köln</n><g>http://www.koeln.de</g></X>
</Mediathek>

The call to my charactersSAXFunc callback reports a length of 12 for
the string inside <n>...</n> on the first line. So the string I save
for later use is just "hello from K".

For the second line, however, it returns the correct length of 18, so I
get the complete string. The difference is that this string starts with
a non-ASCII character. The same happens, by the way, with French letters.

Possibly I forgot to tell something to the parser?
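
One possible explanation, which I have not confirmed for this document:
SAX parsers are allowed to deliver the text of a single element in
several characters callbacks, so the 12 bytes may only be the first
chunk, with "öln" arriving in a second call. Below is a minimal sketch
of a callback that appends every chunk instead of keeping only the
first one. TextBuf, on_characters and text_reset are hypothetical names
of my own, and the xmlChar typedef is only a stand-in so the sketch
compiles without the libxml2 headers. The length is counted in bytes,
not characters, and the data is not NUL-terminated.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for libxml2's xmlChar (an unsigned char holding UTF-8 bytes). */
typedef unsigned char xmlChar;

/* Hypothetical per-element text buffer; not part of libxml2. */
typedef struct {
    char  *data;  /* NUL-terminated accumulated text, or NULL if empty */
    size_t len;   /* bytes stored, excluding the terminating NUL */
} TextBuf;

/* Same shape as charactersSAXFunc: ch is NOT NUL-terminated and
 * len is a byte count. Each call appends one chunk to the buffer. */
static void on_characters(void *ctx, const xmlChar *ch, int len)
{
    TextBuf *buf = ctx;
    assert(len >= 0);

    char *grown = realloc(buf->data, buf->len + (size_t)len + 1);
    if (grown == NULL)
        return;  /* out of memory: keep what was collected so far */

    memcpy(grown + buf->len, ch, (size_t)len);
    buf->len += (size_t)len;
    grown[buf->len] = '\0';
    buf->data = grown;
}

/* Meant to be called when an element starts, so that each element
 * begins with an empty string. */
static void text_reset(TextBuf *buf)
{
    free(buf->data);
    buf->data = NULL;
    buf->len = 0;
}
```

With this approach the complete string would only be read in the
endElement handler, after all chunks for <n>...</n> have arrived.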

Thanks for your great effort :)


