RFC: DBUS & KDE 4

Sun Oct 3 00:59:48 BST 2004

On Thu, 2004-09-30 at 13:10 -0400, Maks Orlovich wrote:
>  Right now it's just vanilla 
> RPC w/a library that does some useful transport stuff, and w/a woefully 
> under-defined spec. 

I fully agree the spec is under-defined. For that matter, it doesn't
100% reflect current reality either. I'd like to get doc/TODO tidied up
a bit more first, then go after rewriting the spec.

I also agree that dbus is pretty vanilla RPC. The goal was to be vanilla
and unobjectionable to get wide adoption, rather than to be flashy or
innovative. Another goal is to focus tightly on specific use-cases and
not try to be a generic IPC system that's all things to all people, like
CORBA. A further goal is to be able to drop in as a replacement for
DCOP, so changes required for that are welcome.

> Well, again, it would help if the D-Bus spec specified a lot of the more 
> complex stuff involved, wrt to blocking/non-blocking call semantics 
> but consider this situation:
> 
> 1. You're a program you make an outgoing call. 
> 2. You get an incoming call
> 
> so what do you do? Well, you could handle it. But that can cause unwanted 
> reentrancy. Or you could defer the call until the outgoing call finishes. But 
> that can cause deadlock. 

FWIW this is one of the core reasons GNOME is interested in D-BUS,
because Bonobo did the uncontrolled reentrancy thing which is a Bad Idea
(tm).

The current answer (for libdbus; the protocol doesn't require this one
way or the other):

 - reentrancy is never done implicitly, i.e. when you block,
   libdbus will never run the main loop. You could write a binding
   that blocked in the main loop, but I don't suggest it. libdbus 
   itself always just blocks, without dispatching anything else.

   The point is, if I make something that looks like a method call
   it shouldn't incidentally run all kinds of unrelated callbacks.
   Deadlocks are way more reproducible and manageable bugs
   than random race conditions where, if event A happens during 
   outgoing call B, stuff blows up.

   Note that a dbus deadlock only deadlocks the involved apps, not 
   the daemon or other apps.

 - To fix deadlocks, you can do explicit reentrancy in two ways:
   - write explicitly async code using the main loop and callbacks,
     sort of like QSocketNotifier or g_io_add_watch() or whatever
   - use threads, since libdbus is designed to be thread safe

   In those two cases, you are writing the code with explicit 
   reentrancy expectations and have tools such as mutexes to 
   avoid reentrancy bugs.
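
To make the trade-off concrete, here's a toy simulation in Python. It is
not libdbus and not the D-BUS protocol: Peer, call_blocking,
call_with_callback, iterate and run_scenario are all made-up names, and
in-process queues stand in for the bus. App A calls "ping" on app B, and
B's handler calls "pong" back on A before it can answer. If A blocks
without dispatching, the pair deadlocks until the timeout fires. If A
sends the call with a callback and keeps running its own main loop, the
nested call gets served and the reply arrives.

```python
import queue
import threading
import time


class Peer:
    """Toy stand-in for one app on the bus (hypothetical, not the libdbus API)."""

    def __init__(self, name):
        self.name = name
        self.inbox = queue.Queue()   # incoming (method, reply_queue) pairs
        self.handlers = {}           # method name -> callable
        self.pending = []            # (reply_queue, callback) for async calls

    def send(self, other, method):
        """Queue a call on `other`; return the queue its reply will arrive on."""
        reply = queue.Queue(maxsize=1)
        other.inbox.put((method, reply))
        return reply

    def call_blocking(self, other, method, timeout):
        """Wait for the reply WITHOUT dispatching our own inbox."""
        try:
            return self.send(other, method).get(timeout=timeout)
        except queue.Empty:
            raise TimeoutError(f"{self.name} -> {other.name}.{method} timed out")

    def call_with_callback(self, other, method, on_reply):
        """Explicitly async call: the reply is delivered later by iterate()."""
        self.pending.append((self.send(other, method), on_reply))

    def iterate(self):
        """One main-loop pass: serve incoming calls, then deliver replies."""
        while True:
            try:
                method, reply = self.inbox.get_nowait()
            except queue.Empty:
                break
            reply.put(self.handlers[method]())
        still_waiting = []
        for reply, on_reply in self.pending:
            try:
                value = reply.get_nowait()
            except queue.Empty:
                still_waiting.append((reply, on_reply))
                continue
            on_reply(value)
        self.pending = still_waiting


def run_scenario(style, timeout=0.5):
    a, b = Peer("A"), Peer("B")
    a.handlers["pong"] = lambda: "pong"
    # B cannot answer "ping" until A has answered "pong".
    b.handlers["ping"] = lambda: "ping+" + b.call_blocking(a, "pong", timeout * 2)

    stop = threading.Event()

    def b_main_loop():
        while not stop.is_set():
            try:
                b.iterate()
            except TimeoutError:
                pass  # B's nested call gave up, so A never gets a reply
            time.sleep(0.01)

    worker = threading.Thread(target=b_main_loop)
    worker.start()
    try:
        if style == "blocking":
            # A blocks and never serves "pong": deadlock until the timeout.
            return a.call_blocking(b, "ping", timeout)
        # A sends the call and keeps running its own main loop.
        results = []
        a.call_with_callback(b, "ping", results.append)
        deadline = time.monotonic() + timeout
        while not results and time.monotonic() < deadline:
            a.iterate()
            time.sleep(0.01)
        return results[0] if results else "timeout"
    except TimeoutError:
        return "timeout"
    finally:
        stop.set()
        worker.join()
```

In this sketch run_scenario("blocking") returns "timeout" and
run_scenario("async") returns "ping+pong". Note that only A and B are
stuck in the blocking case; nothing else in the process is affected,
which is the analogue of the daemon and other apps staying alive.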

doc/TODO has some discussion of doing the "tracking a call stack"
trick Waldo described. That's a welcome direction: it would add a
third tool for fixing deadlocks, or a way to fix most of them by
default.

Finally, all dbus calls time out eventually, though the timeout can be
sort of long to ensure reliability.

Thoughts on the right approach here are very welcome.

Havoc