Re: [xml] Release of libxml2-2.7.1
- From: Rob Richards <rrichards ctindustries net>
- To: veillard redhat com
- Cc: xml gnome org, Colin Guthrie <gmane colin guthr ie>
- Subject: Re: [xml] Release of libxml2-2.7.1
- Date: Tue, 23 Sep 2008 11:21:31 -0400
Daniel Veillard wrote:
On Thu, Sep 04, 2008 at 06:16:15PM +0100, Colin Guthrie wrote:
Rob Richards wrote:
Colin Guthrie wrote:
Hi Daniel,
Daniel Veillard wrote:
Python serialization code was broken in 2.7.0 so here is a new release
with a cleanup of that code, even more isolation of the new buffer type
from user code and a couple of fixes:
* Portability fix:
- Borland C fix (Moritz Both)
* Bug fixes:
- python serialization wrappers
- XPath QName corner case handling and leaks (Martin)
* Improvement:
- extend the xmlSave to handle HTML documents and trees
* Cleanup:
- python serialization wrappers
I hope that one is a good one !
Not sure if this is the right avenue to report this bug but I'm
having some fairly serious regressions.
I've not tested 2.7.0, but 2.7.1 is definitely affected.
I noticed the problem in PHP parsing of XML, and have submitted a
bug report and a test case to the following bug:
https://qa.mandriva.com/show_bug.cgi?id=43486
Essentially, when using PHP's older parsing functions (which I thought
were built on expat rather than libxml2 but it seems not), escaped
entities in cdata are completely ignored, i.e. &gt; &lt; etc.
See the attachment on the above bug for a test case which requires
PHP to be installed.
Hopefully someone can shed some light on the situation.
Can you report this as a PHP bug? It looks like some really old hack
code in the PHP extension meant to mimic some specific expat
functionality. The behavior change you see, though it results from a code
change in libxml2, is really due to the hackish code in the extension
doing things it wasn't meant to be doing. You're better off using the
xmlreader extension in PHP in any case, as it's simpler, faster, more
powerful and doesn't have any legacy issues like the old xml extension.
Thanks for the info Rob.
I'll report this to the PHP people.
I'm well aware there are better PHP extensions for XML processing, but
sadly I'm maintaining some old code that I don't really want to rip
apart unless I have to!
The only thing I can think of is that libxml2 no longer asks
through a SAX callback when looking up entity references if they
are in the predefined set. This comes in essence from an old decision
by the XML working group stating that user definitions for those 5
entities could not override the default predefined ones. So I guess
that change is logical. Now what is done on top of SAX to result
in that bug, I don't really know :-\
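For illustration, here is a minimal sketch of how the five predefined entities are normally seen by an application. It uses Python's standard-library expat bindings, not libxml2 or the PHP extension, so it says nothing about libxml2 internals. The function name parse_text and the sample document are made up for this example. The parser resolves the references itself and delivers them as ordinary character data, with no entity handler installed.

```python
import xml.parsers.expat

# The five entities predefined by XML 1.0 and the characters they stand for.
PREDEFINED = {"lt": "<", "gt": ">", "amp": "&", "apos": "'", "quot": '"'}


def parse_text(document):
    """Return all character data in `document`, concatenated.

    Hypothetical helper for this example: only a character-data
    handler is registered, and no entity-related handler at all.
    """
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    # expat may deliver the text in several pieces, so collect and join.
    parser.CharacterDataHandler = chunks.append
    parser.Parse(document, True)
    return "".join(chunks)


if __name__ == "__main__":
    print(parse_text("<a>1 &lt; 2 &amp;&amp; 3 &gt; 2</a>"))
    for name, char in PREDEFINED.items():
        # Each predefined reference comes back as its literal character.
        print(name, parse_text("<a>&%s;</a>" % name) == char)
```

If the application only ever sees the replacement characters here, then code that relied on an entity lookup callback firing for &lt; or &gt; would have nothing to hook into, which is consistent with the regression described in this thread.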
The short story is that, in mimicking the old behavior from when the
extension used expat, entities are not replaced and no warnings are
desired when an entity is external and not defined.
A hack was used in the extension where wellFormed is set to 0 when the
context is created. Then when the getEntity callback is called, the
extension handles the character output itself: only the entity
reference is output, not its content. Once done, because the document is
(supposedly) not well formed, nothing else is done with the entity.
Now that the predefined entities are not passed to the callback, they
are no longer handled. Not modifying the wellFormed flag results in
the predefined entities working properly, but causes the entity to be
parsed, which in turn kicks off all callbacks, causing the content of the
entity to be pushed through all the extension's callbacks. I'm
currently looking at both working around the change while keeping
the hack in place and completely rewriting the entity
handling, but I'm not sure either of those solutions will work. So
basically the extension was using voodoo code to get the entities to
work as it wanted them to, and it has finally caught up with it.
Rob