Re: [xml] Perl module XML::LibXML not encoding UTF-8 properly [SOLUTION]
- From: Daniel Veillard <veillard redhat com>
- To: Loren Osborn <lsosborn dis-sol-inc com>
- Cc: xml gnome org
- Subject: Re: [xml] Perl module XML::LibXML not encoding UTF-8 properly [SOLUTION]
- Date: Tue, 27 Sep 2005 04:34:23 -0400
On Mon, Sep 26, 2005 at 02:18:28PM -0700, Loren Osborn wrote:
> Daniel Veillard wrote:
> > > UTF-8 makes certain assertions about how multi-byte characters are
> > > represented. While this code change doesn't check all of those
> > > assumptions, it does ensure that all the non-first bytes have their
> > > high bits set correctly. This is likely to catch similar errors, at
> > > least regarding Latin characters. If you are feeling ambitious, feel
> > > free to check for the assertion that code points are encoded in the
> > > fewest number of bytes possible. This patch is untested, so I would
> > > prefer that a developer more familiar with the libxml2 library give
> > > it a more thorough once-over.
> >
> > The problem is that you add this check in one API. I am not sure
> > it makes sense to do this on one entry point and not all the others.
> > I am also not sure it makes sense to add the checking to all tree
> > APIs; this could be extremely costly at runtime.
>
> Yes, I was expecting such a reaction, but I felt justified putting the
> check where I did because there was already a correctness check there. I
> simply refined it a bit. Whether this type of correctness check should
> be enforced on all entry points is certainly an efficiency concern
> that should be considered by libxml2's architects, but I simply wanted
> to submit a code sample to demonstrate how this could be done.
Yes, I appreciate that. There is something half-baked in that function,
so it makes sense to fix it, but on the other hand it's asymmetric :-)
I'm still uncertain about how best to do this.
Daniel
--
Daniel Veillard | Red Hat Desktop team http://redhat.com/
veillard redhat com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/
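
For readers following the thread, the two checks discussed above (that
every non-first byte of a multi-byte sequence has its high bits set to
10, and that each code point uses the fewest bytes possible) can be
sketched roughly as follows. This is a standalone illustration, not the
patch from the thread and not libxml2 code; the function name
utf8_is_well_formed is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper, not part of libxml2.
 * Returns 1 if the first len bytes of s are well-formed UTF-8, else 0. */
static int utf8_is_well_formed(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = s[i];
        size_t extra;       /* continuation bytes expected after c */
        unsigned long cp;   /* code point being decoded */
        unsigned long min;  /* smallest code point valid at this length */
        size_t j;

        if (c < 0x80)                { extra = 0; cp = c;        min = 0x0; }
        else if ((c & 0xE0) == 0xC0) { extra = 1; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { extra = 2; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { extra = 3; cp = c & 0x07; min = 0x10000; }
        else return 0;      /* stray continuation byte or invalid lead byte */

        if (extra > len - i - 1)
            return 0;       /* sequence is truncated */

        for (j = 1; j <= extra; j++) {
            unsigned char cc = s[i + j];
            if ((cc & 0xC0) != 0x80)
                return 0;   /* non-first byte lacks the 10xxxxxx pattern */
            cp = (cp << 6) | (cc & 0x3F);
        }

        if (cp < min)
            return 0;       /* overlong: not the shortest encoding */
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return 0;       /* outside Unicode range or a surrogate */

        i += extra + 1;
    }
    return 1;
}
```

A Latin-1 string passed off as UTF-8 fails this check in the common case:
a byte such as 0xE9 looks like the lead byte of a three-byte sequence,
and the bytes that follow it are not valid continuation bytes. The check
is linear in the input length, which is the per-call cost at issue when
considering whether to apply it on every tree API entry point.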