Patch: wake up duchainlock writers

Tue Dec 15 01:04:53 UTC 2009

On Tue, 15 Dec 2009 06:50:21 am David Nolden wrote:
> Ok I've tried out the patch a bit, and this patch indeed makes the
>  multi-cpu usage better. It somewhat reduces the 'elapsed time' of parsing
>  a project. However it also increases the 'user time', which means that all
>  in all, the effort is higher. I cannot tell exactly what the reason of
>  this is, but probably QWaitCondition is just less efficient than a waiting
>  loop.

When I said it was slower, I meant that the background parsing seemed 
slower, but I didn't measure it.  Given that you've found it's faster, that's 
most likely the case.  I didn't try to assess UI responsiveness.  The lock 
still prefers waiting readers over writers, so the UI should still be as fast 
(given that the main thread should only ever take read locks).
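
To make the reader preference concrete, here is a minimal sketch in plain 
standard C++.  It is not the actual DUChainLock code or the patch: the class 
and member names are made up for illustration, and std::condition_variable 
stands in for QWaitCondition.  Writers sleep on a condition and are woken 
once no reader is active or waiting, rather than polling in a loop.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical sketch, not the real DUChainLock: a read/write lock that
// prefers waiting readers and wakes blocked writers through a condition
// variable instead of having them poll.
class ReaderPreferringLock {
public:
    void lockForRead() {
        std::unique_lock<std::mutex> guard(m_mutex);
        ++m_waitingReaders;  // announce ourselves so writers hold back
        m_readerCond.wait(guard, [this] { return !m_writerActive; });
        --m_waitingReaders;
        ++m_activeReaders;
    }

    void unlockRead() {
        std::lock_guard<std::mutex> guard(m_mutex);
        // The last reader out wakes one writer, if no reader is queued.
        if (--m_activeReaders == 0 && m_waitingReaders == 0)
            m_writerCond.notify_one();
    }

    void lockForWrite() {
        std::unique_lock<std::mutex> guard(m_mutex);
        // A writer only proceeds when no reader is active *or waiting*.
        m_writerCond.wait(guard, [this] {
            return !m_writerActive && m_activeReaders == 0 &&
                   m_waitingReaders == 0;
        });
        m_writerActive = true;
    }

    void unlockWrite() {
        std::lock_guard<std::mutex> guard(m_mutex);
        m_writerActive = false;
        if (m_waitingReaders > 0)
            m_readerCond.notify_all();  // readers go first
        else
            m_writerCond.notify_one();  // otherwise wake the next writer
    }

private:
    std::mutex m_mutex;
    std::condition_variable m_readerCond;
    std::condition_variable m_writerCond;
    int m_activeReaders = 0;
    int m_waitingReaders = 0;
    bool m_writerActive = false;
};

// Writers bump a shared counter under the write lock while readers only
// take the read lock; returns the final counter value.
int runDemo(int writers, int readers, int rounds) {
    ReaderPreferringLock lock;
    int counter = 0;
    std::vector<std::thread> threads;
    for (int w = 0; w < writers; ++w)
        threads.emplace_back([&] {
            for (int i = 0; i < rounds; ++i) {
                lock.lockForWrite();
                ++counter;
                lock.unlockWrite();
            }
        });
    for (int r = 0; r < readers; ++r)
        threads.emplace_back([&] {
            for (int i = 0; i < rounds; ++i) {
                lock.lockForRead();
                int snapshot = counter;  // safe: no writer is active here
                assert(snapshot >= 0);
                (void)snapshot;
                lock.unlockRead();
            }
        });
    for (auto& t : threads) t.join();
    return counter;
}
```

With a scheme like this, a steady stream of readers can starve writers, which 
is the trade-off accepted for keeping the reading thread responsive.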

If the user time has increased, that should just mean we are better at 
utilising the multiple CPUs, right?  Ideally we want utilisation at 100% on 
all CPUs, which should result in much better wall clock time but higher user 
time.

> Due to the central nature of the duchain lock, I'm actually thinking of
> replacing all the mutexes in there with spin-locks, using QAtomicInt
>  instead of all the mutexes and wait conditions, to make the whole thing
>  more efficient.

What are the performance differences with multiple threads in release mode? I 
think that is the case we should be targeting, as it is what our core 
audience uses (developers usually have decent machines).

mutrace shows (with my patch + duchainify kdevplatform + 4 threads):

Mutex   Locked  Changed    Cont. tot.Time[ms] avg.Time[ms] max.Time[ms] Flags
  961  4455313  2067054   629087     1657.516        0.000        1.343 Mx.--.
11330  1270232   605531   286819      504.415        0.000        0.048 M-.--.
 4679   557735   215056   107019      181.942        0.000        0.044 Mx.--. 
22794   924326   233374    97301      241.144        0.000        0.459 M-.--.
28228   331038    99794    46838      102.163        0.000        0.338 M-.--.
  718  2070147  1478083    33657     2283.896        0.001       15.310 Mx.--.
 4669   267419   103753    27169       87.656        0.000        0.044 Mx.--.
  391   593990   410094    12154      616.791        0.001        3.983 Mx.--.
26929    98909    31910     8434       30.482        0.000        0.010 M-.--.
  979    34129    21028     8025       17.274        0.001        0.009 Mx.--.

These locks are (as best I can determine):
  961: duchain lock
11330: within KDevelop::Identifier
 4679: within Utils::BasicSetRepository
22794: within KDevelop::IndexedString
28228: within KDevelop::shouldDoDUChainReferenceCountingInternal
  718: duchain lock wait condition (presumably the reader condition?)
 4669: another Utils::BasicSetRepository
  391: duchain lock wait condition (presumably the writer condition?)
26929: QAbstractFileEngine::create(QString)
  979: within KDevelop::DUChain::self
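
One rough way to compare these locks is the contended fraction, i.e. the 
Cont. column divided by the Locked column.  The sketch below computes it for 
three rows copied from the table above; the function and struct names are 
made up for illustration.

```cpp
#include <cassert>
#include <cstdio>

// Fraction of lock attempts that found the mutex already taken
// (mutrace's "Cont." column divided by its "Locked" column).
double contendedFraction(long contended, long locked) {
    assert(locked > 0);
    return static_cast<double>(contended) / static_cast<double>(locked);
}

// One row of the mutrace summary, reduced to the columns used here.
struct MutexRow {
    int id;
    long locked;
    long contended;
};

// Prints the contended percentage for a few rows from the table above.
void printContention() {
    const MutexRow rows[] = {
        {961, 4455313, 629087},    // duchain lock
        {11330, 1270232, 286819},  // within KDevelop::Identifier
        {718, 2070147, 33657},     // duchain lock wait condition
    };
    for (const MutexRow& row : rows)
        std::printf("%5d: %.1f%% contended\n", row.id,
                    100.0 * contendedFraction(row.contended, row.locked));
}
```

The fraction alone does not say how long threads waited, so it is best read 
together with the time columns.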

So there are a number of places that may benefit from spin locks, if we 
find that they are faster than mutexes there.
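
For reference, a spin lock on a single atomic integer could look roughly like 
the sketch below.  It uses std::atomic<int> as a stand-in for QAtomicInt (I 
would expect QAtomicInt::testAndSetAcquire to play the role of the 
compare-exchange), and the class name is made up.  It is only meant to show 
the shape of the idea, not a tuned implementation.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical sketch of a spin lock on one atomic integer:
// 0 means unlocked, 1 means locked.
class SpinLock {
public:
    void lock() {
        int expected = 0;
        // Try to flip 0 -> 1; on failure reset `expected` and retry.
        while (!m_state.compare_exchange_weak(expected, 1,
                                              std::memory_order_acquire,
                                              std::memory_order_relaxed)) {
            expected = 0;
            std::this_thread::yield();  // be polite while spinning
        }
    }

    void unlock() { m_state.store(0, std::memory_order_release); }

private:
    std::atomic<int> m_state{0};
};

// Several threads increment a plain int under the spin lock; returns the
// final value, which should equal threads * rounds.
int countWithSpinLock(int threads, int rounds) {
    SpinLock lock;
    int counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back([&] {
            for (int i = 0; i < rounds; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    for (auto& th : pool) th.join();
    return counter;
}
```

Whether something like this beats a mutex depends on how short the critical 
sections are and how many threads contend, so it would need measuring per 
lock.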

Cheers,
Hamish.