[Kde-accessibility] [Accessibility] Re: [Accessibility-atspi] D-Bus AT-SPI - The way forward

Tue Dec 11 15:09:02 CET 2007

CC'ing the D-Bus mailing list as there's lots of interesting stuff here.

Mark Doffman wrote:
> Hi Michael,
> 
> On Mon, 2007-12-10 at 16:45 +0000, Michael Meeks wrote:
>> Hi Mark,
>>
>> On Wed, 2007-12-05 at 16:56 +0000, Mark Doffman wrote:
>>> Available at http://live.gnome.org/GAP/AtSpiDbusInvestigation are the
>>> results of an investigation into a move of the AT-SPI interface to a
>>> D-Bus transport
>> 	It's most interesting.
>>
>>> D-Bus is undoubtedly slower at most of the common method calls, 5-6x
>>> slower when making a call that passes one int as an argument. When
>>> passing more data per call this speed difference decreases.
>> 	This is simultaneously pleasing & distressing. That D-BUS hasn't
>> apparently progressed performance-wise to the (non-optimised) state of
>> ORBit2 (effectively in deep-sleep/maintenance mode for the last 4 years)
>> is somewhat surprising. I wonder what is going on there, must be a silly
>> or two in the marshalling logic.
>>
>>>  ORBit takes a long time to pass an Object reference, making D-Bus up
>>> to 1.5x faster at these method calls.
>> 	I can believe it; CORBA object references are quite verbose -
>> particularly (as you note) when multiple transports are added: IP / Unix
>> etc.
>>
>>> Although D-Bus is the slower transport, looking at the calls made by
>>> Orca and GOK, we feel it will be possible to provide sensible caching
>>> that should mitigate this effect.
>> 	Quite - ultimately, the choice of transport is moot IMHO, though
>> clearly unifying on a single shared transport layer is a great direction
>> even if, for mindless political reasons, it has to be "not CORBA".
>>
>>> For a switchover to D-Bus a number of core libraries will need to have
>>> the transport mechanism changed: cspi, pyatspi, GAIL. There will also
>>> need to be a new Java accessibility back end. Some core D-Bus work is
>>> also needed, in the areas of interface specification, bindings and
>>> possibly optimisation.
>> 	Right; so I guess the sticking point is only Java.
>>
>> 	Wrt. core D-BUS work: one of the reasons I was actually enthusiastic
>> about a switch to D-BUS is that it marshals types on the wire: that
>> *should* allow an extremely sexy forward & backwards compatibility story
>> to be developed: that is impossible with CORBA. Unfortunately, it seems
>> that's been mostly ignored despite my attempts to communicate that:
>> generating a shared goal of that for a11y would be really useful.
>>
>> 	What do I mean about compatibility ? cf. the mess around 'Event'
>> 'EventDetails' etc. If we can have a 'struct' that simply grows as we
>> add more fields to it, and gets padded with 0's as mismatches occur: we
>> have an incredibly nice compatibility story. The stock non-answer to
>> this is "ah yes, if you hand-write all your marshallers / de-marshallers
>> - you can get that already !" ;-) which leads to point b):
> 
> Ok, I think I get what's going on here. I imagine this is more about the
> marshaller / de-marshaller code not throwing a wobbly when there is data
> appended to the message that it doesn't expect. It's certainly something
> to think about.

Absolutely, though in C it might actually make sense to have the 'old'
and 'new' method calls come out as different functions. For example, if
we wanted to extend ATK, I don't believe we would be able to just extend
the number of parameters taken by a given ATK method, due to GTK ABI
guarantees. Something like that might also make it clearer for app
developers.
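
A minimal sketch of the "struct that simply grows" idea quoted above,
assuming a receiver that knows its own field list and tolerates a
mismatch with the sender's. The field names and the helper
demarshal_struct are made up for illustration; this is not the AT-SPI
wire format or any existing D-Bus binding API.

```python
# Hypothetical field layouts for two versions of an event struct.
EVENT_FIELDS_V1 = ("type", "detail1", "detail2")
EVENT_FIELDS_V2 = EVENT_FIELDS_V1 + ("source_pid",)


def demarshal_struct(values, fields, default=0):
    """Map a received tuple onto the fields this receiver knows about.

    Trailing fields the sender did not supply are padded with `default`;
    trailing values the receiver does not know about are dropped.
    """
    known = list(values[:len(fields)])
    known += [default] * (len(fields) - len(known))
    return dict(zip(fields, known))


old_sender = ("focus", 1, 2)          # sender built against V1
new_sender = ("focus", 1, 2, 4242)    # sender built against V2

# A newer receiver sees the missing field padded with 0.
new_receiver_view = demarshal_struct(old_sender, EVENT_FIELDS_V2)
# An older receiver silently ignores the field it does not know.
old_receiver_view = demarshal_struct(new_sender, EVENT_FIELDS_V1)
```

This only works if fields are appended at the end and never reordered,
which is the constraint the proposal implies.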

>> 	The bindings must be good, and need to be generated from some sort of
>> sane & readable (preferably IDL-like) interface description. I wrote a
>> prototype one in perl long ago, not sure if it's rescue-able ;-)
> 
> I had always imagined that the canonical version of the AT-SPI API would
> rest with the D-Bus XML. It's a difficult one, this: XML certainly isn't
> something anyone wants to write interfaces in, but it's currently what
> D-Bus introspection uses. It's some extra work to maintain a version of
> the interface, along with the translation tools to an XML version.
> 
> I guess if there was such a tool already available, such as the
> modified version of your perl one :), then it wouldn't be such an
> issue.

I should add in here that I'm very interested in promoting the type
annotations that Collabora came up with as part of their Telepathy
Specification to some sort of cross-language-binding standard.
You can find some examples in the telepathy spec here:

http://darcs.collabora.co.uk/darcsweb.cgi?r=telepathy/telepathy-spec;a=tree;f=/spec
It would be good to expose these annotations in introspection to allow
dynamic languages to construct more meaningful representations of an
interface. It may be painful to add them to the current XML returned by
Introspect given that most parsing is probably broken, so more on this
below...

I actually feel that an IDL as canonical form is a bad idea, as most
uses of any introspection will be by a machine, e.g. generating
documentation as HTML or DocBook. IDL would only be useful to the
authors of an interface. Users should almost always be referencing
generated documentation for their given language binding. The difficulty
of generating documentation in various forms has always been a problem
for CORBA interfaces, AFAICT.

>> 	Anyhow, the "D/BUS thoughts" I wrote in 2005 is attached, somehow it
>> managed not to get moderated when I re-posted it to the D-BUS list some
>> year or so later; perhaps it's only of historical interest now.
>>
>> 	One last concern - was anonymous objects & the problems of type
>> introspection (round-trip-wise). Do we marshal the interface type of an
>> object with its reference ? [ bit rusty here ].
> 
> In ORBit the type gets passed with the object reference. This prodded us
> into a bit of a think on the D-Bus side. You're right, we don't want to
> go introspecting every new object we see just to get the methods
> available on it. I guess this implies some sort of interface repository
> (let's rebuild CORBA!), along with passing a type signature.

This would obviously be a useful optimisation, especially if dealing
with a lot of different objects like we do in AT-SPI. The hard question
here is where would such an interface repository go and what would it
look like? A service on the bus seems like a good idea, but what if an
application is connected to multiple buses? Maybe a session- and/or
system-local file or directory of files? But what about fast lookup?

Does an object reference include a list of interface IDs from this
interface repository? Or would the repo also register full types (a type
being a list of implemented interfaces), with the type's ID sent along
with an object reference?

Questions, questions!
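
To make the optimisation concrete, here is a rough sketch of the client
side, assuming each object reference carries some opaque type ID. The
class IntrospectionCache, the fake_introspect stand-in and the
"text-widget" ID are all hypothetical; where the IDs come from (a bus
service, a file, the application itself) is exactly the open question
above.

```python
class IntrospectionCache:
    """Cache interface descriptions by type ID instead of by object."""

    def __init__(self, introspect):
        # introspect: callable taking an object path and returning a
        # description of the interfaces implemented at that path.
        self._introspect = introspect
        self._by_type = {}
        self.round_trips = 0

    def interfaces_for(self, object_path, type_id):
        # Only pay for a round trip the first time a type ID is seen.
        if type_id not in self._by_type:
            self.round_trips += 1
            self._by_type[type_id] = self._introspect(object_path)
        return self._by_type[type_id]


def fake_introspect(object_path):
    # Stand-in for a synchronous remote introspection call.
    return ("org.example.Accessible", "org.example.Text")


cache = IntrospectionCache(fake_introspect)
for path in ("/app/1", "/app/2", "/app/3"):
    interfaces = cache.interfaces_for(path, type_id="text-widget")
```

Three objects of the same type cost one introspection call rather than
three, which is the saving a type ID on the reference would buy.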

>> 	Another query - wrt. lifecycle mechanisms: what would be proposed for
>> lifecycle tracking object peers inside providing applications ?
> 
> No proposals. ATM I'm imagining that all AT-SPI objects die with their
> applications, and not otherwise. If anyone has some examples of where
> this can't be the case we really need to know.
> 
> I'm sure we could go the Bonobo route and have a base class that was
> reference counted, but we really don't want to. 
> 
>> 	Anyhow - in general, IMHO etc. moving to D-BUS is a positive move, and
>> [ I guess ], the mercy (I hope) is that it can be done without excessive
>> disruption to the Python or cspi bindings, and no pain for atk either. I
>> guess as Novell spins up its a11y team here, we -may- be able to help
>> out with some of the work / testing - though that's unclear as yet. I'd
>> love to follow the design & impl. of the work myself anyhow.

Great, it'd be wonderful to have someone following with as much
experience as you do with the issues faced and mistakes made in Bonobo
and ORBit :)

>> 	HTH,
>>
>> 		Michael.
>>
>> email message attachment, "Attached message - D/BUS thoughts ..."
>>> -------- Forwarded Message --------
>>> From: michael meeks <michael.meeks at novell.com>
>>> Reply-To: michael.meeks at novell.com
>>> To: Havoc Pennington <hp at redhat.com>
>>> Subject: D/BUS thoughts ...
>>> Date: Mon, 09 May 2005 16:53:03 +0100
>>>
>>> Hi Havoc,
>>>
>>> 	So - at LWE I said I'd scribble down a few notes wrt. things I
>>> was hoping would get done in D/BUS; most pleased to see the recursive
>>> type support. Please do forward to the list if you think any of it is
>>> useful.
>>>
>>> 	So - this is informed by the work on ORBit2; we learned a
>>> number of interesting things there over the course of a few years.
>>> hopefully some of these are by now fixed in D/BUS etc.
>>>
>>> * Some lessons:
>>> 	+ don't create new type systems
>>> 		+ people hate to convert between representations
>>> 		+ people hated (the type-safe, powerful etc.) CORBA
>>> 		  type system; because they had to convert between
>>> 		  GArray & CORBA_sequence_Foo
> 
> D-Bus Glib and D-Bus Python marshal directly into the native type
> system. It's a good idea.

Yes, almost all language bindings have gone this path. dbus-glib
currently sucks, however, as its mapping from dbus types to glib types is
predetermined in the code. For a while now, I've been working with Jürg
Billeter and Johan Dahlin to bring the gobject-introspection project to
life. This will help to solve this issue by allowing dbus types to be
demarshalled to exactly the types expected by an existing C API.

>>> 	+ use a recursive type system
>>> 		+ all programming languages have them, for good
>>> 		  reason.
>>> 		+ they allow a nice, simple mapping to many
>>> 		  languages.
>>> 	+ a corollary of that is:
>>> 		+ don't proliferate representations
>>> 			+ you will always need a native
>>> 		 	  representation of XYZ information
>>> 			+ if you structure that representation to
>>> 			  conform to your type system - you avoid
>>> 			  creating 'yet another representation'
>>> 		+ ie. it's not a strength to represent type data in
>>> 		  IDL, and XML, and 1-per-language native parsed
>>> 		  forms. Far better to use a common representation
>>> 		  based on the type system.

This has already occurred to some degree; it needs sorting (cf. the
various binding-specific XML annotations).

>>> 		+ IPC shouldn't be _that_ difficult, or require
>>> 		  _that_ much code
>>> 		+ reducing redundant representations helps to
>>> 		  substantially reduce code complexity & ease
>>> 		  maintenance.
>>>
>>> 	Anyhow - here were some of my thoughts of several months ago
>>> when I last looked at D/BUS:
>>>
>>>     * Extensibility
>>> 	+ One thing I really like about D/BUS that CORBA was
>>> 	  missing is the extensibility allowed by the marshalling
>>> 	  of type information on the wire. ie. a D/BUS call
>>> 	  would look (in CORBA) like:
>>> 		callMethod( in string name, in sequence<Any> args)
>>> 	+ Unfortunately, CORBA relied on a very strong contract
>>> 	  between client & server. There is no need to do this
>>> 	  with D/BUS:
>>> 	** extra arguments to functions, extra members in structures
>>> 	   etc. should be silently elided / padded to 0 **
>>> 	+ Of course there was interface versioning, and perhaps that
>>> 	  is/was necessary but it never worked well.
>>>
>>>     * Anonymous object references
>>> 	+ In many applications there are no particularly obvious,
>>> 	  or sensibly unique string names associated with objects
>>> 	  we want to expose.
>>> 	** There needs to be a good, performant, standard mangling
>>> 	   for such objects. **
> 
> Object references still haven't been sorted. An object reference is
> going to be the bus path, along with an object path unique to the
> application. As mentioned before, possibly a type signature is needed
> also.

In terms of attaching objects to a connection, it'd be really nice to
have the attach method take not only an object path, but also an optional
function for parsing the remaining components of a path whose prefix
matches the given object path. This would probably be useful for AT-SPI.
I dunno if a standard mangling for unpathed objects is really necessary;
it just needs to be something efficient for a given language binding.
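
A sketch of what such an attach call could look like, with a handler
that receives the path components left over after the matched prefix.
ObjectTree, attach, lookup and the /org/example/accessible path are
invented for illustration and do not correspond to an existing binding
API.

```python
class ObjectTree:
    """Dispatch object paths to handlers registered on a path prefix."""

    def __init__(self):
        self._handlers = {}

    def attach(self, prefix, resolve):
        # resolve: callable taking the list of path components that
        # follow `prefix` and returning whatever object they name.
        self._handlers[prefix.rstrip("/")] = resolve

    def lookup(self, path):
        parts = path.strip("/").split("/")
        # Try the longest registered prefix first.
        for i in range(len(parts), 0, -1):
            prefix = "/" + "/".join(parts[:i])
            resolve = self._handlers.get(prefix)
            if resolve is not None:
                return resolve(parts[i:])
        return None


def resolve_accessible(remaining):
    # Assumption for this sketch: the trailing components are child
    # indices into the application's widget hierarchy.
    return tuple(int(component) for component in remaining)


tree = ObjectTree()
tree.attach("/org/example/accessible", resolve_accessible)
child = tree.lookup("/org/example/accessible/3/7")
```

The application never registers one handler per widget; it decodes the
tail of the path on demand, which is what makes this attractive when
there are a lot of short-lived objects.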

>>>     * Introspection - performance
>>> 	+ CORBA passes a type-id with every object reference.
>>> 	  While that looks like wasteful overhead, it allows a
>>> 	  remote client to realise that it is of an identical
>>> 	  type to a previously introspected object - meaning
>>> 	  that scripting bindings don't have to do 1 extra,
>>> 	  synchronous round-trip per method call, plus a load
>>> 	  of (XML?) parsing to be able to invoke a method on
>>> 	  the object.
>>> 	** D/BUS should do something similar, round-trips are
>>> 	   expensive. **
>>>
>>>     * Introspection - complexity
>>> 	+ As previously discussed; there is a huge benefit to
>>> 	  the existing 'getIntrospectionData' type method,
>>> 	  however - the introduction of a new XML representation
>>> 	  seems unnecessary.
>>> 		+ that is particularly true if the base types
>>> 		  can be compatibly extended during marshalling.
>>> 	+ this would avoid the need for an XML parser with
>>> 	  commensurate time & space penalty, still provide
>>> 	  equal extensibility, and reduce the representation
>>> 	  count.
>>> 	** D/BUS should use its native type system to describe
>>> 	   types instead of a foreign one **
> 
> Damn straight. I really like this. Anyone for
> getNativeIntrospectionData()?

Yep, this has been discussed before and it's something I'd be very keen
to see. As I said above, I think adding the telepathy-style type
annotations I want to see could be problematic in the current
Introspect call, so this would be a perfect moment to come up with
either an addition to the current introspection interface or a new one
that uses the dbus type system to communicate the interface. Any interface
repository service should probably use this new representation. Of
course applications would unfortunately have to support the old method,
but maybe this could eventually be phased out.
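
As a rough illustration of describing an interface in the dbus type
system itself rather than in XML, the sketch below assumes a reply
shaped like the signature (sa(sss)): an interface name plus an array of
(method name, input signature, output signature) structs. That layout,
the helper names and org.example.Text are assumptions made up here, not
a proposed or existing introspection format.

```python
# Hypothetical reply layout, written as a D-Bus type signature:
# a struct of (interface name, array of (method, in-sig, out-sig)).
INTERFACE_SIGNATURE = "(sa(sss))"


def describe_interface(name, methods):
    """Build a value shaped like INTERFACE_SIGNATURE."""
    return (name, [tuple(method) for method in methods])


def conforms(value):
    """Return True if value has the shape (str, [(str, str, str), ...])."""
    if not (isinstance(value, tuple) and len(value) == 2):
        return False
    name, methods = value
    return isinstance(name, str) and isinstance(methods, list) and all(
        isinstance(m, tuple) and len(m) == 3
        and all(isinstance(field, str) for field in m)
        for m in methods
    )


text_iface = describe_interface(
    "org.example.Text",  # made-up interface name
    [("GetText", "ii", "s"), ("GetCaretOffset", "", "i")],
)
```

A client receiving this needs no XML parser, only the demarshalling
code it already has for ordinary method replies.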

>>>     * Mapping recursive types to the native C ABI
>>> 	+ This is a simple task - and we should be doing it to
>>> 	  the GArray types - again, not a new idea.
>>> 	+ Writing that code is _surprisingly_ complex, error
>>> 	  prone, and difficult to test across the N architectures.
>>> 	+ Re-licensing the ORBit2 code to do it should be no
>>> 	  issue, it's mostly Novell/RH code.
>>> 	+ ie. you should be able to call:
>>> 		sequence<GdkRectangle> getAreas(in long index);
>>> 	  as: GArray *getAreas(glong index);
>>> 	  and get not a GArray of GValue's or some ugly /
>>> 	  cumbersome 'Any' type, but real struct GdkRectangles.
> 
> All about the glib bindings, which do need some improvement.

Yep, see what I said above on gobject-introspection.

>>>     * Flow control & blocking
>>> 	+ It is often the case that two processes exist one
>>> 	  producing events & one consuming them.
>>> 	+ this situation requires careful handling to avoid
>>> 	  the source out-producing the sink, leading to run-away
>>> 	  memory consumption / failure.
>>> 	+ this can either be performed by tiresome, complex &
>>> 	  fragile user-land flow-control; or by the simple
>>> 	  mechanism of blocking the socket to let the kernel
>>> 	  deal with the issue.
>>> 	+ unfortunately - most IPC channels are shared; if an
>>> 	  out of control asynchronous event flow blocks a
>>> 	  socket, then no - necessarily-under-control
>>> 	  synchronous IPC can get down the same channel =>
>>> 	  deadlock potential.
>>> 	+ Thus it'd be nice to have the concept of a blockable
>>> 	  event 'flow', vs. a point-to-point, reliable IPC
>>> 	  channel.
>>> 	+ A simple example is the flow of accessible events,
>>> 	  each event often causing the sink to emit multiple
>>> 	  round-trip calls to the source to fetch more
>>> 	  information.

Urk, flow control is hard. The problem with the solution you propose is
it forces an application to have a thread for processing events, and
then you still end up with a possible starvation situation when handing
off events to a mainloop.

As it stands, I've yet to see a real-world use case for pumping signals
faster than user input, though of course, we've tested this case ;). In
AT-SPI the event reception is dwarfed by synchronous calls to handle an
event, usually to the process that sent the event, so starvation isn't
an issue here.

IIRC, libdbus currently always reads data off the socket into a queue of
messages to be handled. It'd probably be good to have a method for
applications to set a maximum queue size so they can use a direct
connection and have the kernel manage their flow control. This should
really be a design decision for an application writer; I don't think we
can solve it for them in a decent way, and it's not common.
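
A small sketch of the maximum-queue-size idea, assuming a main loop that
asks the inbox whether it wants more data before reading. BoundedInbox
and its methods are hypothetical and this is not how libdbus is
structured; the deque named pending stands in for bytes the kernel is
still holding in the socket buffer.

```python
from collections import deque


class BoundedInbox:
    """Queue of received messages that stops reading when it is full."""

    def __init__(self, max_messages):
        self.max_messages = max_messages
        self._queue = deque()

    def wants_read(self):
        # While this is False the main loop should not poll the socket
        # for input, so unread data stays in the kernel buffer and the
        # sender eventually blocks.
        return len(self._queue) < self.max_messages

    def on_readable(self, read_message):
        """Pull messages while there is room; return how many were taken."""
        taken = 0
        while self.wants_read():
            message = read_message()
            if message is None:
                break
            self._queue.append(message)
            taken += 1
        return taken

    def dispatch_one(self):
        # Hand the oldest queued message to the application, if any.
        return self._queue.popleft() if self._queue else None


pending = deque(range(10))  # stand-in for data waiting in the socket


def read_message():
    return pending.popleft() if pending else None


inbox = BoundedInbox(max_messages=4)
first_batch = inbox.on_readable(read_message)
```

After the first batch, six messages are still "in the kernel", and
nothing more is read until the application dispatches something.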

I should note that negatively nicing the dbus daemon seems to have a
pretty good effect on the clumpiness when emitting lots of signals
quickly or lots of asynchronous method calls.

>>> 	I hope that helps, sorry it took so long.
>>>
>>> 	Regards,
>>>
>>> 		Michael.

Great to hear from you Michael :)

> 
> It was really good to get some more feedback,
> 
> Mark

Thanks,
Rob

-- 
Rob Taylor, Codethink Ltd. -  http://codethink.co.uk