changes to duchain lock: can anybody reproduce this speedup?
Kevin Funk
kfunk at kde.org
Sat Jul 16 11:02:45 UTC 2016
On Saturday, July 16, 2016 3:47:00 AM CEST Sven Brauch wrote:
> Hey,
>
> I wanted to look at the locking strategy of the DUChain lock for a
> while, but never got around to doing it. Now, at 3:30 am on a Friday, I
> finally did (what else would you do at this time) and the results are at
> least ... interesting:
>
> Replacing QThread::usleep(500) with sched_yield() makes the number of
> effectively used CPUs go up from ~1.7 to ~3.3 and the wall clock time down
> from ~170 to ~110 seconds on duchainify kdevplatform [1].
Note: sched_yield() is not cross-platform, so we can't use it directly.
There's a Qt API for this:
http://doc.qt.io/qt-5/qthread.html#yieldCurrentThread
It uses sched_yield() on Linux and similar calls on other platforms, and falls
back on usleep(0) on platforms which don't have such system calls.
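To make the two back-off strategies concrete, here is a minimal sketch in standard C++. It is not the DUChain lock code: SpinLock, Backoff and runWorkers are hypothetical names invented for illustration, and std::this_thread::sleep_for / std::this_thread::yield are used as stand-ins for QThread::usleep(500) / QThread::yieldCurrentThread().

```cpp
#include <atomic>
#include <chrono>
#include <thread>
#include <vector>

// Hypothetical back-off policies; the std calls stand in for the Qt ones.
enum class Backoff { Sleep500us, Yield };

// Toy test-and-set lock for illustration only, NOT the actual DUChain lock.
class SpinLock {
public:
    explicit SpinLock(Backoff policy) : m_policy(policy) {}

    void lock() {
        // exchange() returns the previous value: true means the lock is held.
        while (m_locked.exchange(true, std::memory_order_acquire)) {
            if (m_policy == Backoff::Sleep500us) {
                // Stand-in for QThread::usleep(500)
                std::this_thread::sleep_for(std::chrono::microseconds(500));
            } else {
                // Stand-in for QThread::yieldCurrentThread()
                std::this_thread::yield();
            }
        }
    }

    void unlock() { m_locked.store(false, std::memory_order_release); }

private:
    std::atomic<bool> m_locked{false};
    Backoff m_policy;
};

// Each of `threads` workers takes the lock `iterations` times and bumps a
// shared counter. Returns the final counter value.
long runWorkers(int threads, int iterations, Backoff policy) {
    SpinLock lock(policy);
    long counter = 0; // protected by `lock`
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

Timing runWorkers(4, N, Backoff::Sleep500us) against runWorkers(4, N, Backoff::Yield) gives a rough feel for the trade-off on a given machine; how closely that matches the real lock depends on how long the lock is actually held.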
> See:
> https://paste.kde.org/pncqjowxp
Similar results here, on a T450s with an i7-5600U.
- CPUs utilized from ~2.0 to ~3.2
- Time down from ~170s to ~140s.
Old:
 Performance counter stats for '/home/kfunk/devel/build/kf5/kdevplatform-production/util/duchainify/duchainify -t 4 -u -r .':
     337079.039482  task-clock (msec)       # 1.844 CPUs utilized
           525,572  context-switches        # 0.002 M/sec
            21,991  cpu-migrations          # 0.065 K/sec
         1,489,971  page-faults             # 0.004 M/sec
 1,035,434,120,739  cycles                  # 3.072 GHz
   <not supported>  stalled-cycles-frontend
   <not supported>  stalled-cycles-backend
   986,793,615,960  instructions            # 0.95 insns per cycle
   207,839,162,926  branches                # 616.589 M/sec
     2,903,415,833  branch-misses           # 1.40% of all branches
     182.763912691  seconds time elapsed
New:
 Performance counter stats for '/home/kfunk/devel/build/kf5/kdevplatform-production/util/duchainify/duchainify -t 4 -u -r .':
     466013.893250  task-clock (msec)       # 3.301 CPUs utilized
         1,089,108  context-switches        # 0.002 M/sec
             5,021  cpu-migrations          # 0.011 K/sec
         1,534,239  page-faults             # 0.003 M/sec
 1,358,675,441,398  cycles                  # 2.916 GHz
   <not supported>  stalled-cycles-frontend
   <not supported>  stalled-cycles-backend
 1,158,648,756,521  instructions            # 0.85 insns per cycle
   243,238,121,703  branches                # 521.955 M/sec
     2,958,481,042  branch-misses           # 1.22% of all branches
     141.188915396  seconds time elapsed
Old once more:
 Performance counter stats for '/home/kfunk/devel/build/kf5/kdevplatform-production/util/duchainify/duchainify -t 4 -u -r .':
     334592.967026  task-clock (msec)       # 1.977 CPUs utilized
           505,325  context-switches        # 0.002 M/sec
            20,507  cpu-migrations          # 0.061 K/sec
         1,635,159  page-faults             # 0.005 M/sec
 1,027,148,860,614  cycles                  # 3.070 GHz
   <not supported>  stalled-cycles-frontend
   <not supported>  stalled-cycles-backend
   978,486,494,007  instructions            # 0.95 insns per cycle
   206,053,686,643  branches                # 615.834 M/sec
     2,931,823,890  branch-misses           # 1.42% of all branches
     169.265335681  seconds time elapsed
New once more:
 Performance counter stats for '/home/kfunk/devel/build/kf5/kdevplatform-production/util/duchainify/duchainify -t 4 -u -r .':
     440491.330170  task-clock (msec)       # 3.159 CPUs utilized
         1,222,420  context-switches        # 0.003 M/sec
             5,253  cpu-migrations          # 0.012 K/sec
         1,362,019  page-faults             # 0.003 M/sec
 1,285,443,958,271  cycles                  # 2.918 GHz
   <not supported>  stalled-cycles-frontend
   <not supported>  stalled-cycles-backend
 1,117,387,936,599  instructions            # 0.87 insns per cycle
   234,351,773,751  branches                # 532.024 M/sec
     2,860,386,089  branch-misses           # 1.22% of all branches
     139.425082744  seconds time elapsed
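As a rough cross-check of the four runs above, the following sketch averages the two runs per variant. The numbers are copied from the perf output; the function names are made up for illustration, and with only two samples each this is an estimate, not a proper benchmark. It works out to a wall-clock speedup of about 1.25x, while total task-clock (CPU time) grows by about 1.35x, so the patched version finishes sooner but burns more CPU doing so.

```cpp
#include <numeric>
#include <vector>

// Arithmetic mean of a non-empty list of samples.
double mean(const std::vector<double>& samples) {
    return std::accumulate(samples.begin(), samples.end(), 0.0) / samples.size();
}

// "seconds time elapsed" from the two Old and the two New runs.
const std::vector<double> oldElapsedSec{182.763912691, 169.265335681};
const std::vector<double> newElapsedSec{141.188915396, 139.425082744};

// "task-clock (msec)" from the same runs.
const std::vector<double> oldTaskClockMs{337079.039482, 334592.967026};
const std::vector<double> newTaskClockMs{466013.893250, 440491.330170};

// How many times faster the patched version finishes (wall clock).
double wallClockSpeedup() {
    return mean(oldElapsedSec) / mean(newElapsedSec);
}

// How much more CPU time the patched version consumes in total.
double cpuTimeGrowth() {
    return mean(newTaskClockMs) / mean(oldTaskClockMs);
}
```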
> For kdev-python, the effect is somewhat less severe but still noticeable:
> https://paste.kde.org/pcirhvys9
>
> Could anyone try the attached (5-line) patch and look if you see this
> speedup as well? I feel like I might be doing something wrong.
See above. I just did a quick test run of your patch and can't judge more since I
don't have time atm. I'll give it another shot tomorrow.
Main question is: is KDevelop still functioning without noticeable lags, e.g.
during typing? All fine?
> On the other hand, I think the "sleep 500us" strategy is quite terrible,
> if only because 500us is a relatively long time and nothing else will
> happen during that wait for _any_ thread iirc.
QThread::usleep() just puts the *current* thread to sleep, though.
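A small sketch of that point, in standard C++ with std::this_thread::sleep_for standing in for QThread::usleep() (the function name is hypothetical): the calling thread sleeps while a second thread keeps incrementing a counter, and the function reports how much work got done in the meantime.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Returns how many increments a worker thread completed while the calling
// thread was asleep. sleep_for is a stand-in for QThread::usleep().
long workDoneWhileCallerSleeps(std::chrono::milliseconds nap) {
    std::atomic<long> counter{0};
    std::atomic<bool> stop{false};

    std::thread worker([&] {
        while (!stop.load(std::memory_order_relaxed)) {
            counter.fetch_add(1, std::memory_order_relaxed);
        }
    });

    // Wait until the worker is actually running before taking the snapshot.
    while (counter.load(std::memory_order_relaxed) == 0) {
        std::this_thread::yield();
    }

    const long before = counter.load(std::memory_order_relaxed);
    std::this_thread::sleep_for(nap); // only the calling thread sleeps
    const long after = counter.load(std::memory_order_relaxed);

    stop.store(true, std::memory_order_relaxed);
    worker.join();
    return after - before;
}
```

A positive return value means the worker made progress during the nap. This says nothing about threads that are themselves waiting on the lock; those each sleep on their own.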
> Good night,
> Sven
Cheers,
Kevin
> ________________
> [1] Conditions: everything built in release mode, CLEAR_DUCHAIN_DIR=1
> set before invocation, -r -u flags passed to duchainify, 4 threads, CPU
> is a 4-core (8 threads with hyperthreading) Intel i7 3720 QM
--
Kevin Funk | kfunk at kde.org | http://kfunk.org