Re: [g-a-devel] [Accessibility] Re: [Accessibility-atspi] D-Bus AT-SPI - The way forward



Hi,

Rob Taylor wrote:
> CC'ing the D-Bus mailing list as there's lots of interesting stuff here.

OK, this is about 400 topics in one email, cross-posted to 5 mailing lists ;-)

I do think the dbus list could be really helpful on a lot of this, and that in general people extending or having trouble with dbus have been much too slow to mail the list.

Can I propose that anyone who is seriously working on, or seriously having a problem with, any of these issues post one single-topic thread per issue to the dbus list?

The vast majority of these issues have been discussed before, and there is frequently even a "plan of record" and we just lack someone implementing it. A few of these issues are already solved, even, and simply asking might save someone some trouble.

I'll comment in a random way on a few of the points in here below, but I dread a thread with many topics on 5 different lists, so can I ask that further followup be single-topic, change the subject line, and drop the non-dbus lists? (assuming the followup is about dbus generally and not a11y)


OK, some quick comments on specific points.

Performance
===

D-Bus performance is already well-known and pretty well understood; I got the same results as the AtSpiDbusInvestigation page in 2004:
http://lists.freedesktop.org/pipermail/dbus/2004-November/001779.html
and updated information and suggestions related to this here:
http://lists.freedesktop.org/archives/dbus/2007-October/008822.html

The basic fact is that for a round-trip blocking method call there is a constant-factor overhead of around 3x vs. raw sockets. (ORBit is not a lot slower than raw sockets, iirc.) If you use the bus daemon, this inherently doubles, since there are double the number of "hops." Looking at the causes of this overhead, it is difficult to change fundamentally; I'm sure with effort you could get it down to 2x, and perhaps a bit better by writing a libdbus replacement with less flexibility. But you won't get to 1x.

The constant factor falls out of a number of design decisions, some of them in the protocol, and others in libdbus. This was intentional. My opinion is that these decisions were largely correct, but opinions vary. (Some of the design decisions are easy to explore changing, such as whether to validate the data; see the archive links above.)

Given that fact, I don't think there is much more to say. There is no way application design should vary based on whether 2x or 3x was measured. Unless you intend to hack on libdbus itself and want to drill down into hotspots to fix them, what you as an app developer should have in your mind is "there's a single-digit constant-factor overhead vs. raw sockets" and "avoid round trips!!!!"
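To make the "avoid round trips" advice concrete, here is a minimal Python sketch (not libdbus code). It counts blocking round trips over a plain socketpair when the same work is done as N separate calls and as one batched call. The line-based protocol, the serve() and Client names, and the upper-casing "method" are all invented for illustration; the only point is the round-trip count, since every round trip pays the constant-factor overhead again.

```python
import socket
import threading


def serve(sock):
    """Toy server: answer each request line with exactly one reply line."""
    with sock, sock.makefile("rwb") as f:
        for line in f:
            names = line.decode().split()
            reply = " ".join(name.upper() for name in names)
            f.write(reply.encode() + b"\n")
            f.flush()


class Client:
    """Toy blocking client that counts how many round trips it makes."""

    def __init__(self, sock):
        self.f = sock.makefile("rwb")
        self.round_trips = 0

    def call(self, *names):
        # One request line out, one reply line back: a single round trip,
        # no matter how many names are packed into the request.
        self.f.write(" ".join(names).encode() + b"\n")
        self.f.flush()
        self.round_trips += 1
        return self.f.readline().decode().split()


def demo(n=100):
    a, b = socket.socketpair()
    threading.Thread(target=serve, args=(b,), daemon=True).start()
    client = Client(a)
    names = [f"obj{i}" for i in range(n)]

    # Chatty API: one blocking call per object.
    one_by_one = [client.call(name)[0] for name in names]
    chatty_trips = client.round_trips

    # Batched API: the same question for all objects in one call.
    batched = client.call(*names)
    batched_trips = client.round_trips - chatty_trips

    client.f.close()
    a.close()
    return one_by_one, batched, chatty_trips, batched_trips


if __name__ == "__main__":
    _, _, chatty_trips, batched_trips = demo()
    print("chatty round trips:", chatty_trips)
    print("batched round trips:", batched_trips)
```

Both variants return the same answers; the batched one just pays the per-round-trip cost once instead of n times, which is the kind of API change that matters far more than whether the overhead is 2x or 3x.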

For AT-SPI I bet the bottom line is that you guys have got to reduce the amount of traffic and round trips, probably by changing or extending the API.

If you need something raw-socket-like, then stick to CORBA, or use D-Bus for discovery only then set up a custom socket channel, or whatever. There is no reason to drop CORBA when it is suitable. The reason for dbus is that (in my opinion) CORBA was not suitable for many desktop use-cases. That does not mean CORBA is not suitable for AT-SPI.

Richer Introspection Data
===

Re: struct names in the introspection, etc. I feel sure there are old threads on this but am too lazy to dig them up.

Other than digging those up and posting the archive links, I guess the first step would be to write up the motivation and proposal on the list. In general it seems like a reasonable idea.

Interface Repository
===

We have certainly discussed on the list simply installing XML files so static languages can access them at compile time. I'm not sure of the status of this, but it should amount to a spec patch, and we may even already have one. There are definitely old threads and it's worth doing.

A runtime IR service, I'm not sure what it would be for. The dbus approach is that each app provides its own introspection data.

This has pros and cons vs. keeping a global repository of types, but I'm skeptical that doing it *both* the central IR way and the introspection way at the same time makes sense.

IDL vs. XML
===

Remember that writing the XML files by hand is not intended. It would not be needed if we had reasonable tools.

The original vision, which I still think would be best, is that you implement the object in some language. The XML is then generated by scanning that language - pulling docs from the language's native inline docs, pulling interfaces from the language's native interfaces, etc. This extracted XML can then be installed for use by static languages, for example. Definitely this is how the GLib bindings were intended to work; we did NOT want people to have to write an XML file then generate code from it. (For the "server side" or object implementation, that is.)
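As a rough illustration of that "scan the implementation, emit the XML" idea, here is a small Python sketch. It is not how any existing binding works: the Documents class, the org.example.Documents interface name, the TYPE_CODES mapping, and the "result" argument name are assumptions made up for the example. It only handles a few basic types and does not extract inline docs.

```python
import inspect
import xml.etree.ElementTree as ET

# Assumed mapping from Python annotations to D-Bus type codes
# (a tiny illustrative subset, not a complete binding).
TYPE_CODES = {int: "i", str: "s", bool: "b", float: "d"}


class Documents:
    """Hypothetical object implementation, used only for this example."""

    def open(self, uri: str, read_only: bool) -> int:
        return 0

    def title(self, index: int) -> str:
        return ""


def introspect(cls, interface_name):
    """Build introspection XML by scanning the public methods of cls."""
    node = ET.Element("node")
    iface = ET.SubElement(node, "interface", name=interface_name)
    for name, func in inspect.getmembers(cls, inspect.isfunction):
        if name.startswith("_"):
            continue  # private helpers are not exported
        method = ET.SubElement(iface, "method", name=name)
        sig = inspect.signature(func)
        for pname, param in sig.parameters.items():
            if pname == "self":
                continue
            ET.SubElement(method, "arg", name=pname,
                          type=TYPE_CODES[param.annotation], direction="in")
        if sig.return_annotation is not inspect.Signature.empty:
            ET.SubElement(method, "arg", name="result",
                          type=TYPE_CODES[sig.return_annotation],
                          direction="out")
    return ET.tostring(node, encoding="unicode")


if __name__ == "__main__":
    print(introspect(Documents, "org.example.Documents"))
```

The generated XML is the artifact that would be installed for static languages; the author of the object never has to write it by hand.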

A CORBA-like IDL fits into this same concept. The idea is that if you wanted to hand-write your IDL, maybe a nice CORBA-style syntax would be preferred. No problem. You use libIDL to write a little tool to convert your nice syntax to the XML format.

In other words, the XML format is a lingua franca. That's part of why XML is used, because you can write quick tools and scripts in Python or Perl or whatever that manipulate the XML.


Passing Types With Objects/Structs
===

The major point here that I agree with: as Michael says, the introspection calls probably add up to a lot of round trips, especially for dynamic languages.

However, there are surely some good ways to optimize that which are backward compatible and *simple* - i.e. that do not imply that dbus has a global-across-all-processes type system or type repository, because it doesn't and imo shouldn't.

<background digression>
A principle of the dbus design is that dbus is NOT a type system, it's a marshaling system.

That was part of the whole point of dbus vs. CORBA: you are supposed to use the type system *of your programming language* or *of your component system*. dbus is a *marshaling* mechanism, not a universe of types. For dbus, "struct A { int }" and "struct B { int }" are the same thing.
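A small Python sketch of that last point, under simplifying assumptions: A and B are hypothetical application types, only int32 fields are supported, and marshal() is a little-endian toy that ignores the alignment and padding rules of the real wire format. The type names exist only in the programming language; what goes on the wire is the same signature and the same bytes.

```python
import struct
from dataclasses import dataclass, fields


@dataclass
class A:
    value: int


@dataclass
class B:
    count: int


# Illustrative subset: only int32 ("i") is handled here.
CODES = {int: "i"}


def signature(obj):
    """D-Bus-style struct signature: field type codes in parentheses."""
    return "(" + "".join(CODES[f.type] for f in fields(obj)) + ")"


def marshal(obj):
    """Toy marshaling: each field as a little-endian int32, names dropped."""
    return b"".join(struct.pack("<i", getattr(obj, f.name)) for f in fields(obj))


if __name__ == "__main__":
    for obj in (A(7), B(7)):
        print(type(obj).__name__, signature(obj), marshal(obj).hex())
```

Neither the struct name nor the field name survives marshaling, so a receiver cannot tell an A from a B, and is not supposed to need to.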

Unlike at least some of the old theories about how to use CORBA, dbus is NOT intended to be a component system; it is NOT intended to be a way to define cross-language objects or types. It *can* be used to *remote* a component. See the D-Bus FAQ for more on IPC vs. components.

The out-of-band introspection data is intended to provide optional hints for how to generate a language binding. But it's also intended that the introspection data can be ignored; you can just treat the dbus messages as raw structured data.
</background digression>

So, how would I approach the introspection round trips and/or bandwidth? I would think some combination of standard strategies, such as batch calls to introspect multiple objects at once. Another possible approach would be to have an implicit type repository *per application* - something like an extended Introspect() call that lets you specify an introspection context, and in an introspection context the app would send you each interface only once, and refer to it by reference the second and subsequent times.
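Here is a rough Python sketch of how those two ideas could combine. Nothing like this exists in dbus today: the App class, the introspect_in_context() method, the context IDs, and the ("full", ...) / ("ref", ...) reply shape are all hypothetical. One call introspects several objects (batching), and within a context each interface's XML is sent only once.

```python
class App:
    """Hypothetical service exposing a batched, context-aware Introspect."""

    def __init__(self, objects, interfaces):
        self.objects = objects        # object path -> list of interface names
        self.interfaces = interfaces  # interface name -> introspection XML
        self.contexts = {}            # context id -> interface names already sent

    def introspect_in_context(self, context_id, paths):
        sent = self.contexts.setdefault(context_id, set())
        reply = {}
        for path in paths:
            entries = []
            for name in self.objects[path]:
                if name in sent:
                    # Second and later times: send only a reference.
                    entries.append(("ref", name))
                else:
                    sent.add(name)
                    entries.append(("full", name, self.interfaces[name]))
            reply[path] = entries
        return reply


if __name__ == "__main__":
    app = App(
        objects={
            "/documents/0": ["org.example.Document"],
            "/documents/1": ["org.example.Document", "org.example.View"],
        },
        interfaces={
            "org.example.Document": "<interface name='org.example.Document'/>",
            "org.example.View": "<interface name='org.example.View'/>",
        },
    )
    reply = app.introspect_in_context("ctx-1", ["/documents/0", "/documents/1"])
    for path, entries in reply.items():
        print(path, [entry[:2] for entry in entries])
```

The repository here is implicit and per application: the caller learns nothing about types beyond what this one app told it, so there is still no global type system.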

Probably step 1 would be to profile the bottlenecks for the dynamic bindings that use introspection, with some common apps they might be introspecting. We should not add a bunch of complexity on performance speculation, only on performance data.

For the solution, I think it's important to keep the layering that the introspection data is an *optional hint* that can be used to interpret dbus messages.

Object Paths
===

> In terms of attaching objects to a connection, it'd be really nice to
> have the attach method take not only a object path, but also a
> possible function for parsing the remaining components of a path
> whose prefix matches the given object path.

If I understand this correctly, this is already allowed. You can register a handler for an entire subtree of the object path namespace. The intended usage of that is to allow you to do your own path to handler mapping, e.g. the example I usually give is that you could register to handle "/documents" and then do your own interpretation of "/documents/0", "/documents/1", etc.
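The "/documents" example can be modelled in a few lines of Python. This is only a sketch of the dispatch idea, not libdbus itself (there, as far as I know, the corresponding call is dbus_connection_register_fallback()); the Dispatcher class and the documents() handler are invented names.

```python
class Dispatcher:
    """Toy model of registering a handler for a whole object path subtree."""

    def __init__(self):
        self._subtrees = {}

    def register_fallback(self, prefix, handler):
        # The handler receives the path components below the prefix.
        self._subtrees[prefix.rstrip("/") or "/"] = handler

    def dispatch(self, path):
        parts = [p for p in path.split("/") if p]
        # Try the longest registered prefix first, then shorter ones.
        for i in range(len(parts), -1, -1):
            prefix = "/" + "/".join(parts[:i])
            handler = self._subtrees.get(prefix)
            if handler is not None:
                return handler(parts[i:])
        raise LookupError(f"no handler for {path}")


def documents(rest):
    """App-specific interpretation of everything under /documents."""
    if not rest:
        return "document list"
    return f"document #{int(rest[0])}"


if __name__ == "__main__":
    d = Dispatcher()
    d.register_fallback("/documents", documents)
    for path in ("/documents", "/documents/0", "/documents/1"):
        print(path, "->", d.dispatch(path))
```

The application never registers "/documents/0" or "/documents/1" individually; it owns the subtree and maps the remaining components however it likes.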

Object References
===

There has been past discussion, possibly worth digging up, about some standard format for a complete "IOR" - which would include a server address, optionally a bus name, and an object path. i.e. the info to allow you to create a DBusConnection and then create an object proxy.

The "shared connections" feature is intended to support this.

i.e. if you resolved this "IOR" you would have to get the DBusConnection from the server address, then create your proxy. Without shared connections, you would create a DBusConnection per proxy, which would be absurdly, horrifyingly inefficient.

Anyway, I think shared connections are the only hard part about this feature, and that part is already implemented.
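A minimal Python sketch of why shared connections make such a reference cheap to resolve. Everything here is hypothetical: there is no standard "IOR" format, and ObjectRef, Connection, open_shared(), and resolve() are made-up names standing in for the real pieces (in libdbus, as far as I know, dbus_connection_open() is the call that returns a shared connection for an address).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ObjectRef:
    """Hypothetical 'IOR': everything needed to build a proxy."""
    address: str             # server address
    bus_name: Optional[str]  # optional: only meaningful on a bus
    path: str                # object path


class Connection:
    """Stand-in for a real connection; counts how many were opened."""
    opened = 0

    def __init__(self, address):
        self.address = address
        Connection.opened += 1


_shared = {}  # address -> Connection


def open_shared(address):
    """Return the existing connection for this address, or open one."""
    conn = _shared.get(address)
    if conn is None:
        conn = _shared[address] = Connection(address)
    return conn


@dataclass
class Proxy:
    connection: Connection
    bus_name: Optional[str]
    path: str


def resolve(ref):
    """Turn a reference into a proxy, reusing the connection if possible."""
    return Proxy(open_shared(ref.address), ref.bus_name, ref.path)


if __name__ == "__main__":
    address = "unix:path=/tmp/example-bus"
    refs = [ObjectRef(address, "org.example.App", f"/documents/{i}")
            for i in range(3)]
    proxies = [resolve(ref) for ref in refs]
    print("proxies:", len(proxies), "connections opened:", Connection.opened)
```

However many references point at the same server address, they all resolve to proxies on one connection.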

I think the reason there's no "IOR" feature so far is because very few people have needed it. It's rare to want to pass an object reference that is "location independent" (not known to be on some specific bus or provided by some other specific predefined program). When you're talking to another program, and thus getting an object reference, you would normally already know what bus that other program is on.

But if someone thinks this feature through and codes it, it makes sense; as I said, the shared connections feature is already there and is intended to support it.

Binary Introspection Data
===

I don't see how this is worth the enormous pain of reimplementing all kinds of stuff. It is not even clearly better than XML; there are plenty of contexts where XML is more convenient. And there is no proven performance problem, or proof that binary would be dramatically better, or at least nobody has posted the proof where I've seen it.

I don't see how a massive deprecation and reimplementation effort, spanning quite a few projects, can be justified purely by subjective aesthetics.

In any case, I bet performance would be better addressed via the mechanisms discussed earlier - batching up the introspection data, or allowing it to be passed "by reference" when the same app gets the same interface a second time. Certainly that seems like the thing to try first.

Perspective
===

Let me say again. I know it's a lot of fun to screw with component systems and type systems and IPC systems. (Obviously I've done it myself.) However, we should not delude ourselves that this is especially *worthwhile* in most cases.

Where are apps having the most trouble, doing the most things wrong, etc.? Arcane improvements to the IPC system are not the answer.

Most of the problems are on a higher level. e.g. lack of convenience API for stuff like this:
http://svn.mugshot.org/dumbhippo/trunk/client/linux/src/hippo-dbus-helper.h

Or in GNOME, we aren't even using dbus for the baseline, simple functionality it already provides; e.g. there's still no single-instance support in gtk. Why would we be adding all kinds of new stuff to dbus, when we're still sucking at using the functionality we have?

Let's remember that DCOP was implemented in a very short period of time, and was dead simple - MUCH less complex than dbus is - and people used it heavily and successfully for lots of real functionality.

Havoc

