Helping with KDE4 (valgrind)

Julian Seward julian at
Mon May 8 22:45:38 BST 2006

>   It would be great if I could have help profiling ksysguard.

Ok.  Here's what I did to get some profile data for ksysguard from the
KDE 3.4.2 supplied on SuSE 10.  I decided to profile it using Josef W's
excellent "callgrind" profiler.

I used the latest Valgrind (V) sources, partly because I happen to have
them sitting around on my machine :), but also because they are generally
faster and more robust than 3.1.1 and, unlike 3.1.1, contain callgrind as
a standard tool.  I suggest you do the same.  Easy:

  svn co svn:// valgrind
  cd valgrind
  ./configure --prefix=...
  make install

(see for details)

Since I did not know what bunch of processes ksysguard would create,
nor which are the important ones, I profiled all of them:

  valgrind --tool=callgrind --simulate-cache=yes \
           --trace-children=yes -v ksysguard

After a couple of minutes the app appeared.  I let it run for about
20 minutes (you don't need to wait that long, but I did).  There were
2 running processes taking up 100% CPU between them (remember, programs 
run much slower under V).  Then I quit the app.

The result was these 3 output files:

-rw-------   1 sewardj users  387788 2006-05-08 19:51 callgrind.out.21650
-rw-------   1 sewardj users       0 2006-05-08 19:51 callgrind.out.21652
-rw-------   1 sewardj users 4459626 2006-05-08 20:11 callgrind.out.21651

I don't know what's with the .21652 one, which is empty.
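
Empty output files like that one can be filtered out before loading
anything into a viewer.  A minimal sketch, assuming all the output files
sit in one directory; the function name nonempty_profiles is made up for
illustration and is not part of Valgrind:

```shell
# List callgrind output files in a directory, skipping zero-byte ones.
# $1: directory to search (defaults to the current directory).
# nonempty_profiles is a hypothetical helper, not a Valgrind tool.
nonempty_profiles() {
    dir=${1:-.}
    # -size +0c matches files larger than 0 bytes
    find "$dir" -maxdepth 1 -type f -name 'callgrind.out.*' -size +0c | sort
}
```

Running `nonempty_profiles .` in the directory above would list only the
.21650 and .21651 files.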

I loaded the other two into the kcachegrind GUI (kcachegrind is a 
standard part of KDE, I believe).  The .21650 one did not look
that interesting, but I did not know what I was looking at.  .21651 contains
a lot more detail.  I switched to the 'cycle estimation' view.  It seems
QEventLoop::processEvents consumes 70.15% of the estimated cycles, and
the GUI shows lots more info.  Because this is a non-developer build,
I only have function names to look at, but if you tell kcachegrind where
your sources are and have built with -g (and -O/-O2) you should get
details down to the line level.

A couple of other comments:

- If you're just starting out profiling, you could omit
  --simulate-cache=yes.  Without cache simulation you get less
  detailed numbers, but you get them sooner.

- I was surprised that the two active processes (the viewer and the
  daemon?) consumed 100% CPU on my desktop 1.7GHz P4.  If we say that
  callgrind runs programs 50x slower than native (not sure of the exact
  numbers, but it's in that ballpark) then roughly we can say the two
  processes natively would take up >= 2% CPU, which seems like a lot
  (that's >= 34MHz worth of machine cycles).  So perhaps there's
  something expensive in there.
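
The back-of-envelope estimate in the last point can be reproduced as
follows.  This is only a sketch: the 50x slowdown is the rough guess
given above, not a measured figure, and the function name
native_cpu_estimate is hypothetical.

```shell
# Estimate native CPU usage from usage observed under callgrind.
# $1: CPU % observed under callgrind
# $2: assumed slowdown factor (50 is a ballpark guess, not a measurement)
# $3: CPU clock in MHz
native_cpu_estimate() {
    awk -v obs="$1" -v slow="$2" -v mhz="$3" 'BEGIN {
        pct = obs / slow                     # native CPU share in percent
        printf "%.1f%% native CPU, about %.0f MHz\n", pct, mhz * pct / 100
    }'
}

native_cpu_estimate 100 50 1700
# prints: 2.0% native CPU, about 34 MHz
```

A different slowdown factor changes the result in proportion, so the 2%
figure is only as good as the 50x guess.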

Does that help?


