Re: [xml] XML and Unicode Determination



On Thu, Mar 02, 2006 at 12:08:36PM -0600, Browder, Tom wrote:
> When using the DOM or text reader parsers I would like to know when they
> have found the first unicode character not in the ASCII range.  Is there
> any way to do so other than to use the internal function 'UTF8Toascii'
> on every xmlChar string I see?

  Check the content you receive for the first byte with the high-order
bit set: all strings are UTF-8 internally, so any such byte belongs to a
non-ASCII character.
  The parser cannot tell you when this happens, because encoding
conversion to UTF-8 is done block by block as a prior step to parsing.
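
  A minimal sketch of that check, assuming the string is NUL-terminated
UTF-8. The helper name first_non_ascii is hypothetical and not part of
libxml2; it takes unsigned char so it compiles standalone, and since
xmlChar is a typedef for unsigned char an xmlChar string can be passed
directly.

```c
#include <stddef.h>

/* Hypothetical helper, not a libxml2 function.
 * Returns the byte offset of the first non-ASCII byte in a
 * NUL-terminated UTF-8 string, or -1 if the string is pure ASCII
 * (or NULL). In UTF-8, every byte of a non-ASCII character has the
 * high-order bit set, while ASCII bytes never do. */
static long first_non_ascii(const unsigned char *s)
{
    size_t i;

    if (s == NULL)
        return -1;
    for (i = 0; s[i] != '\0'; i++) {
        if (s[i] & 0x80)
            return (long)i;
    }
    return -1;
}
```

  It would be called on each text node or attribute value as you
receive it. Note that the offset is in bytes within that string, not a
position in the original input document.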

Daniel

-- 
Daniel Veillard      | Red Hat http://redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/


