Re: XML libs (was Re: gconf backend)



[ Cc'ing the xml gnome org list. Not that I like doing that, but it seems
  totally impossible to get the GNOME core developers to actually
  go to the libxml2-related list to discuss libxml2-related problems :-( ]

On Sun, Sep 28, 2003 at 02:07:52AM -0400, Havoc Pennington wrote:
> On Sat, 2003-09-27 at 19:51, Daniel Veillard wrote:
> >   You think in terms where you control the input and output. The
> > error is that your next big client is gonna use an Oracle back-end for
> > your XML data, and suddenly you don't control the production anymore,
> > and if you use a non-conformant parser you made a promise that you just
> > can't keep, and that kind of thing has serious long-term costs.
> 
> Slow down, I'm not advocating gmarkup. That's why I described my ideal
> XML lib, and said it would be conformant.

  Okay, sorry about that. Let's stay on track. There are ways to make
progress, I hope!

> >   This has nothing to do with web development versus data oriented
> > development. You have a spec, either you're compliant or not. It's a
> > contract. And all it costs you to comply with that contract is mostly
> > to correctly reuse a compliant library instead of trying to roll your
> > own.
> 
> But that isn't true. To be XML compliant in terms of handling the stuff
> not found in the gmarkup-like subset, you not only have to use the
> library, you have to use it properly. Or you have to let it do things
> that probably break many apps.
> 
> Say I just cut metacity over to a library that handles includes and
> dtds. Suddenly themes would probably be able to cause the WM to lock up
> by creating unexpected I/O during theme loading. There are probably
> security issues as well since themes can be untrusted. To switch to the
> library then, I need a detailed understanding of what it is going to be
> doing, and then I have to figure out how to turn off the I/O; but when I
> turn it off, metacity's theme parser isn't XML compliant anymore, as I
> understand it. The features of XML that require nonlocal or even local
> I/O seem very browser-centric and problematic for a lot of apps.

   In the new set of APIs I'm currently writing, you pass two options:
   doc = xmlReadFile(filename, NULL, XML_PARSE_NOENT | XML_PARSE_NONET);
Then the libxml2 parser will substitute entities and forbid access to
remote resources.

> >   Well libxml2 uses callbacks for errors, that's the model everybody
> > uses and I'm not sure that was ever questioned by the relatively large
> > user base. Since your model seems to impose an asynchronous processing
> > I think this will need some discussion on the mailing-list. I cannot
> > change radically to a new model without at least a bit of explanation.
> 
> Basically I want to write a function:
> 
>  MyAppDataStructure*  load_xml_file (const char *filename, GError **error);
> 
> So the question is how to do that. The problem is that functions such as
> xmlLoadACatalog() (totally random example) don't return any explanation

  Well, that is a mostly internal function. But okay, if you take
the xmlReadFile() example, it's the same.

> of the error; you can look at errno, but you don't know if the errno is
> for stat() or open() or read() or there could be a parse error or
> out-of-memory and errno is junk. So the only possible error to display

  libxml2 is designed to be able to report multiple errors when parsing
a resource, and your API style does not allow this. It's critical for
a lot of work to be able to know that you have different problems
at lines 100, 120 and 134. I understand your viewpoint and will try to
carry it to the list.
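
To make the two models concrete, here is a minimal sketch in plain C of a
callback that keeps every error, plus a function that flattens them into
the single value a load_xml_file()-style caller would get back. Nothing
here is libxml2 or GLib API; ParseError, ErrorLog, collect_error() and
summarize_errors() are hypothetical names, and the callback signature is
an assumption made for illustration only.

```c
#include <stdio.h>
#include <string.h>

#define MAX_ERRORS 16

/* One diagnostic: where it happened and what went wrong. */
typedef struct {
    int  line;
    char message[96];
} ParseError;

/* Accumulates every diagnostic reported during one parse. */
typedef struct {
    ParseError items[MAX_ERRORS];
    int        count;    /* errors stored in items[]    */
    int        dropped;  /* errors that no longer fit   */
} ErrorLog;

/* Callback-style handler: a parser would call this once per problem
 * and keep going, so several errors can be recorded in one pass. */
static void collect_error(void *user_data, int line, const char *message)
{
    ErrorLog *log = (ErrorLog *)user_data;
    if (log->count >= MAX_ERRORS) {
        log->dropped++;
        return;
    }
    ParseError *e = &log->items[log->count++];
    e->line = line;
    snprintf(e->message, sizeof e->message, "%s", message);
}

/* Flatten the log into one string for callers that want a single
 * error value. Returns the total number of errors reported. */
static int summarize_errors(const ErrorLog *log, char *buf, size_t size)
{
    size_t used = 0;
    if (size > 0)
        buf[0] = '\0';
    for (int i = 0; i < log->count && used < size; i++) {
        int n = snprintf(buf + used, size - used, "%sline %d: %s",
                         i ? "; " : "",
                         log->items[i].line, log->items[i].message);
        if (n < 0)
            break;
        used += (size_t)n;  /* past `size` on truncation, which ends the loop */
    }
    return log->count + log->dropped;
}
```

The callback side keeps the per-line detail; the summary is what a
single-error return value can carry, and it is lossy once the buffer
fills up.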

> to the user is "failed to load catalog" or something, with no further
> diagnostic. Also, sometimes on failure it looks to me like
> xmlGenericError was called and sometimes it wasn't.
> 
> What you want to display for a parse error is the line where the error
> happened and a problem description; for an I/O error you want strerror
> (errno). GError/DBusError/CORBA_environment/C++exceptions are a way to
> propagate this detailed information.
> 
> Not that I really advocate doing this for libxml2; it seems like it
> would basically double your API size by adding
> xmlLoadACatalogWithError() and so forth. I _don't_ think this is a good
> idea, for the record.

  Those are not the functions you're expected to use for this. I'm designing
a new layer of APIs for simple purposes, as I already pointed out in a
previous mail. If there is a way to get them in line, then let's do it; I
don't want a bunch of grumpy Gnome users for a couple of years until I do
some more API refactoring.
  See, you complain on d-d-l but did not subscribe to the friggin xml gnome org
list to discuss the issue. On the other hand, I'm pretty sure you expect
API details and work for gtk+ or dbus to be carried out on their respective
lists. Isn't that unfair?

> Though I am still hoping a conformant lib can be small and avoid some of
> the problematic things like doing I/O behind the app's back.

  If you want to make progress, and not blindly ditch libxml2 over
non-issues, there is a window of opportunity now. The new xmlRead-based
functions and the xmlReader could be made closer to your ideal API,
but as usual it will get there *if you help*. You can do that, or
dream that somewhere, someone will have the free time, money and energy
to build your dream conformant XML-1.0 library which will save you 600KB of
on-disk space, and maintain that code for the upcoming 10 years.
  So let's be realistic: if there are ways in which the proposed
new APIs [1] or the xmlReader [2] could be made closer to what you need,
read the damn resources and comment, preferably on the list. This will
take a third of the emails, and will get somewhere, dammit!

Daniel

[1] http://mail.gnome.org/archives/xml/2003-September/msg00146.html
[2] http://xmlsoft.org/xmlreader.html#Walking

-- 
Daniel Veillard      | Red Hat Network https://rhn.redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/


