Patch: wake up duchainlock writers
Hamish Rodda
rodda at kde.org
Tue Dec 15 13:58:51 UTC 2009
On Tue, 15 Dec 2009 08:47:44 pm David Nolden wrote:
> On Tuesday 15 December 2009 02:04:53, Hamish Rodda wrote:
> > When I said it was slower, I meant it seemed like the background parsing
> > was slower, but I didn't measure it. Given you've found it's faster,
> > that's most likely the case. I didn't try to determine the UI
> > responsiveness. The lock still prefers waiting readers over writers, so
> > the UI should still be as fast (given the main thread should only ever
> > use readers).
> >
> > If the user time is increased, that just means we were better at
> > utilising the multiple CPUs, right? Ideally we want utilisation at 100%
> > x all cpus, which should result in much better wall clock time but higher
> > user time.
>
> That time should count the 'overall' CPU usage, and if it's higher, it
> means that we've burnt more CPU cycles to get the same result.
Well, having parsing finish earlier is a better result, isn't it? See results
below, anyway.
> > > Due to the central nature of the duchain lock, I'm actually thinking of
> > > replacing all the mutexes in there with spin-locks, using QAtomicInt
> > > instead of all the mutexes and wait conditions, to make the whole
> > > thing more efficient.
> >
> > What are the performance differences with multiple threads in release
> > mode? I think that is what we should be targeting, as it is our core
> > audience (developers usually have decent machines).
>
> I've implemented my idea now, and it is much faster. Locking the duchain
> now approximately equals increasing one counter, and eventually waiting.
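For readers following along, the counter-based locking David describes could be sketched roughly like this. This is a minimal illustration only, using std::atomic in place of QAtomicInt so it is self-contained; the class name, state encoding, and memory orderings are my own assumptions, not the actual patch:

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical spinlock-style read-write lock built on one atomic counter.
// Encoding (assumed for this sketch): state > 0 means that many active
// readers; state == -1 means one active writer; state == 0 means free.
class SpinRWLock {
    std::atomic<int> state{0};
public:
    void lockForRead() {
        for (;;) {
            int s = state.load(std::memory_order_relaxed);
            // Only enter if no writer holds the lock (state >= 0).
            if (s >= 0 && state.compare_exchange_weak(
                    s, s + 1, std::memory_order_acquire))
                return;
            std::this_thread::yield(); // spin: the "eventually waiting" part
        }
    }
    void unlockRead() { state.fetch_sub(1, std::memory_order_release); }

    void lockForWrite() {
        for (;;) {
            int expected = 0; // writer needs the lock completely free
            if (state.compare_exchange_weak(
                    expected, -1, std::memory_order_acquire))
                return;
            std::this_thread::yield();
        }
    }
    void unlockWrite() { state.store(0, std::memory_order_release); }
};
```

Taking the lock on the fast path really is just one compare-and-swap on a counter, which is why the single-threaded numbers below barely differ; the yield loop is also where it burns CPU under contention.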
Here are my test results:
Test: clean .kdevduchain, hot disk cache, 'time duchainify kdevplatform'
Test run on a Core 2 Quad at 3.57 GHz with 4 GB RAM.
Results that didn't fit the overall pattern were re-run multiple times and the best time kept.
Spinlock, debugfull build:

Thread count    Real time    User time
1               41.14s       38.73s
2               46.97s       48.13s
4               45.54s       47.92s
8               69.37s       70.64s

Waitcondition, debugfull build:

Thread count    Real time    User time
1               40.83s       37.92s
2               45.75s       49.05s
4               46.79s       55.55s
8               47.28s       54.64s

Spinlock, release build:

Thread count    Real time    User time
1               21.35s       18.64s
2               23.85s       22.48s
4               31.63s       30.55s
8               39.74s       37.58s

Waitcondition, release build:

Thread count    Real time    User time
1               22.81s       20.31s
2               20.82s       21.39s
4               20.73s       22.75s
8               23.25s       25.87s
In conclusion,
1) Release builds are fast :) I might have to start using them...
2) The spinlock does not scale to multiple threads; as I suspected, it can't
efficiently handle high lock contention.
3) The waitcondition does scale up to number of threads == number of CPUs, but
does not yet offer a significant improvement from multithreading. User time
is only slightly worse with the waitcondition.
Last night, while developing the patch, I found a great improvement with the
waitcondition, but that was when I had accidentally allowed write locks to be
acquired while read locks were already held. That's why the patch didn't quite
perform as I found last night (where multithreaded parsing was ~30% faster in
debug mode).
Given that I still think we can decrease the amount of time spent in write
locks (by rewriting code to do calculations under read locks, then taking a
write lock only if changes are required), I would continue working with the
waitcondition lock, possibly using the spinlock when the background parser is
running only one thread.
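The read-then-upgrade pattern I have in mind might look something like the following. This is a hypothetical sketch: std::shared_mutex stands in for the duchain lock, and the data structure and function are invented purely for illustration:

```cpp
#include <algorithm>
#include <cassert>
#include <mutex>
#include <shared_mutex>
#include <vector>

// std::shared_mutex standing in for the duchain lock in this sketch.
std::shared_mutex duchainLock;
std::vector<int> declarations; // invented stand-in for duchain data

// Do the lookup under a cheap read lock; only take the write lock when a
// change is actually required.
void addIfMissing(int id) {
    {
        std::shared_lock<std::shared_mutex> read(duchainLock);
        if (std::find(declarations.begin(), declarations.end(), id)
                != declarations.end())
            return; // common case: no write lock ever taken
    } // read lock released here, before the write lock is requested

    std::unique_lock<std::shared_mutex> write(duchainLock);
    // Re-check: another thread may have inserted between the two locks.
    if (std::find(declarations.begin(), declarations.end(), id)
            == declarations.end())
        declarations.push_back(id);
}
```

The re-check after acquiring the write lock is essential, since the state can change in the window between releasing the read lock and acquiring the write lock; that is the price of not holding both at once.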
Cheers,
Hamish.