Patch: wake up duchainlock writers

Hamish Rodda rodda at kde.org
Tue Dec 15 13:58:51 UTC 2009


On Tue, 15 Dec 2009 08:47:44 pm David Nolden wrote:
> On Tuesday 15 December 2009 02:04:53, Hamish Rodda wrote:
> > When I said it was slower, I meant it seemed like the background parsing
> >  was slower, but I didn't measure it.  Given you've found it's faster,
> >  that's most likely the case.  I didn't try to determine the UI
> >  responsiveness.  The lock still prefers waiting readers over writers, so
> >  the UI should still be as fast (given the main thread should only ever
> > use readers).
> >
> > If the user time is increased, that just means we were better at
> > utilising the multiple CPUs, right?  Ideally we want utilisation at 100%
> > x all cpus, which should result in much better wall clock time but higher
> > user time.
> 
> That time should count the 'overall' CPU usage, and if it's higher, it
>  means that we've burnt more CPU cycles to get the same result.

Well, having parsing finish earlier is a better result, isn't it? See results 
below, anyway.
 
> > > Due to the central nature of the duchain lock, I'm actually thinking of
> > > replacing all the mutexes in there with spin-locks, using QAtomicInt
> > >  instead of all the mutexes and wait conditions, to make the whole
> > > thing more efficient.
> >
> > What are the performance differences with multiple threads in release
> > mode? I think that is what we should be targeting, as it is our core
> > audience (developers usually have decent machines).
> 
> I've implemented my idea now, and it is much faster. Locking the duchain
>  now approximately equals increasing one counter, and eventually waiting.
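
For reference, here's a minimal sketch of what I understand that to mean: a
single QAtomicInt holding the reader count, with -1 meaning a writer is
active. This is just my illustration of the idea, not the actual patch; the
real DUChainLock does more bookkeeping.

#include <QAtomicInt>

class SpinRwLock
{
public:
    SpinRwLock() : m_count(0) {}

    void lockForRead()
    {
        for (;;) {
            int current = m_count;
            // A writer holds the lock while the counter is -1.
            if (current >= 0 && m_count.testAndSetOrdered(current, current + 1))
                return;
            // Otherwise spin (eventually yielding) until the writer is done.
        }
    }

    void unlockForRead()
    {
        m_count.fetchAndAddOrdered(-1);
    }

    void lockForWrite()
    {
        // Acquire only when no readers or writers are active.
        while (!m_count.testAndSetOrdered(0, -1)) {
            // Spin until the counter drops to zero.
        }
    }

    void unlockForWrite()
    {
        m_count.testAndSetOrdered(-1, 0);
    }

private:
    QAtomicInt m_count;
};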

Here are my test results:
Test: clean .kdevduchain, hot disk cache, 'time duchainify kdevplatform'
Test run on a Core 2 Quad at 3.57 GHz with 4 GB RAM
Results that didn't fit the overall pattern were re-run several times, taking 
the best time

Spinlock, debugfull build:
Thread count    Real time    User time
1               41.14s       38.73s
2               46.97s       48.13s
4               45.54s       47.92s
8               69.37s       70.64s

Waitcondition, debugfull build:
Thread count    Real time    User time
1               40.83s       37.92s
2               45.75s       49.05s
4               46.79s       55.55s
8               47.28s       54.64s

Spinlock, release build:
Thread count    Real time    User time
1               21.35s       18.64s
2               23.85s       22.48s
4               31.63s       30.55s
8               39.74s       37.58s

Waitcondition, release build:
Thread count    Real time    User time
1               22.81s       20.31s
2               20.82s       21.39s
4               20.73s       22.75s
8               23.25s       25.87s

In conclusion,
1) Release builds are fast :)  I might have to start using them...
2) Spinlock does not scale to multiple threads; as I suspected, it can't 
efficiently handle high lock contention.
3) Waitcondition does scale up to thread count == CPU count, but does not yet 
offer a significant improvement from multithreading.  Its user time is only 
slightly worse than the spinlock's (see the lock sketch below).
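
For comparison, a waitcondition-based reader/writer lock looks roughly like 
the sketch below. Again, this is a simplified illustration rather than the 
real DUChainLock, but it shows the writer wake-up this patch is about (while 
keeping the preference for readers mentioned earlier):

#include <QMutex>
#include <QWaitCondition>

class WaitConditionRwLock
{
public:
    WaitConditionRwLock() : m_readers(0), m_writerActive(false) {}

    void lockForRead()
    {
        QMutexLocker locker(&m_mutex);
        while (m_writerActive)
            m_readerWait.wait(&m_mutex);
        ++m_readers;
    }

    void unlockForRead()
    {
        QMutexLocker locker(&m_mutex);
        if (--m_readers == 0)
            m_writerWait.wakeOne(); // wake up a blocked writer
    }

    void lockForWrite()
    {
        QMutexLocker locker(&m_mutex);
        while (m_writerActive || m_readers > 0)
            m_writerWait.wait(&m_mutex);
        m_writerActive = true;
    }

    void unlockForWrite()
    {
        QMutexLocker locker(&m_mutex);
        m_writerActive = false;
        m_readerWait.wakeAll(); // readers are preferred...
        m_writerWait.wakeOne(); // ...but writers must not starve
    }

private:
    QMutex m_mutex;
    QWaitCondition m_readerWait;
    QWaitCondition m_writerWait;
    int m_readers;
    bool m_writerActive;
};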

Last night, as I was developing the patch, I found a great improvement with 
waitcondition, but that was when I had accidentally allowed write locks to be 
acquired while read locks were already held.  That's why the patch doesn't 
quite perform as I found last night (where multithreaded parsing was ~30% 
faster in debug mode).
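
In terms of the sketch above, that accident amounts to the writer path only 
waiting for other writers:

    // buggy: grants the write lock while read locks are still held
    while (m_writerActive)
        m_writerWait.wait(&m_mutex);

instead of also waiting for m_readers to reach zero -- fast, but unsafe.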

Given that I still think we can decrease the amount of time spent in write 
locks (by rewriting code to do its calculations under a read lock, and then 
take a write lock only if changes are required; see the sketch below), I think 
continuing to work with the waitcondition lock is the better option, possibly 
with the spinlock being used when the background parser is only running one 
thread.
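
Something like the following pattern is what I have in mind.  The lockers are 
the usual kdevplatform DUChainReadLocker/DUChainWriteLocker; needsUpdate() 
and applyUpdate() are made-up helpers standing in for real duchain code:

bool updateIfNeeded(Declaration* decl)
{
    bool mustModify = false;
    {
        DUChainReadLocker readLock(DUChain::lock());
        mustModify = needsUpdate(decl); // pure calculation, no writes
    } // read lock released here

    if (!mustModify)
        return false;

    DUChainWriteLocker writeLock(DUChain::lock());
    // Re-check under the write lock: another thread may have done the
    // update between releasing the read lock and acquiring this one.
    if (!needsUpdate(decl))
        return false;

    applyUpdate(decl);
    return true;
}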

Cheers,
Hamish.



