Dynamic symbol table
Karl Vogel
karl.vogel at seagha.com
Fri Jul 16 12:09:43 CEST 2004
[... resend... looks like gmane swallowed the first one, if not.. ignore
the duplicate ...]
Waldo Bastian <bastian at kde.org> wrote in
news:200407160025.46067.bastian at kde.org:
> On Thursday 15 July 2004 23:55, Karl Vogel wrote:
>> Has anybody done some detailed analysis yet as to where most time is
>> wasted at startup?! Sprinkling 'date' commands in my startkde script
>> shows that ksmserver is where I lose the most time... but that
>> encompasses a lot.. so time to dig into the ksmserver source (and
>> related commands that are started by ksmserver)
>
> You can run startkde like:
> strace -tt -o /tmp/profiledir -f -ff startkde
Oh, that's right.. strace can also follow forks.. I forgot about that.
Lol.. it would probably have been more sensible to use strace than to hack my
kernel to log open() calls.. but that worked too :-)
What I was trying to figure out is whether the loading of binaries has a
negative effect on start time.
One should be able to speed up the start of a binary by placing its
initial startup code together, so that it will most likely also be contiguous
on disk, which ultimately reduces I/O.
By (ab)using GCC's -fprofile-arcs & -fbranch-probabilities, it is possible
to put the start code in a different ELF section than the rest of the code.
The idea:
* add option to kde exe to directly stop after KApplication() setup
* compile exe with -O2 -fprofile-arcs
* start kde exe with the new stop option
* the profiled exe will dump stats to 'exename.da'
* compile exe with -O2 -fbranch-probabilities
=> gcc will now have put the startup code in section 'text.hot' and the
rest in 'text.unlikely'. So the startup code will be all together in the
exe, meaning less head seeking, less I/O & faster load. (and afterwards a
smaller RSS is needed to run the app)
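The steps above can be sketched as a small shell function. This is only a rough sketch under some assumptions that are not in the original idea: a program built from a single source file, a hypothetical function name (profile_rebuild), and a hypothetical stop option such as --stop-after-init that makes the program exit right after startup. The name of the profile data file depends on the GCC version, so the sketch does not touch it directly.

```shell
# Two-pass build: profile the startup path, then rebuild with the profile.
# profile_rebuild is a hypothetical helper name; so is the stop option
# passed as the third argument.
#   $1 = source file          (e.g. testapp.c)
#   $2 = path of executable   (e.g. ./testapp)
#   $3 = option that makes the program exit right after startup
profile_rebuild() {
    pr_src=$1
    pr_exe=$2
    pr_stop=$3

    # Pass 1: instrumented build that records arc counts at run time.
    gcc -O2 -fprofile-arcs -o "$pr_exe" "$pr_src" || return 1

    # Run only the startup path, so only startup arcs are counted.
    "$pr_exe" "$pr_stop" || return 1

    # Pass 2: rebuild, letting gcc read the recorded counts.
    gcc -O2 -fbranch-probabilities -o "$pr_exe" "$pr_src" || return 1
}
```

It could be called as: profile_rebuild testapp.c ./testapp --stop-after-init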
The -fprofile-arcs run doesn't need to be repeated for every build.. once you have the
names of the startup functions, you can generate an include file that marks
the functions with an __attribute__ ((section (".startup"))).
i.e. getting the list of the 'hot' funcs:
$ objdump -Ct testapp|awk '/F text.hot/ { print $6;}'
func0
func855
func455
main
(testapp contained func0 up to func999)
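Turning such a list into the include file could look roughly like this. It is a sketch with loud assumptions: gen_startup_header is a made-up name, and every listed function is assumed to have the signature 'void name(void)', which will not hold for a real application (the real prototypes would have to be used there). main is skipped because its signature differs.

```shell
# Reads function names on stdin (one per line) and writes declarations
# that place each function in the '.startup' section.
# Assumption: every function is 'void name(void)'.
gen_startup_header() {
    echo '/* generated file: startup functions go into section .startup */'
    while IFS= read -r gs_name; do
        # Skip empty lines.
        [ -n "$gs_name" ] || continue
        # main has a different prototype, so leave it alone.
        [ "$gs_name" = main ] && continue
        printf 'void %s(void) __attribute__ ((section (".startup")));\n' "$gs_name"
    done
}
```

Combined with the objdump line above, and with a hypothetical output file name:
objdump -Ct testapp | awk '/F text.hot/ { print $6;}' | gen_startup_header > startup_sections.h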
Using /usr/bin/time one should be able to see a decrease in the number of
minor page faults (use /usr/bin/time and not 'time', as the latter is a bash
built-in which doesn't show page faults)
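To compare two builds, the minor page fault count can be pulled out on its own. A small sketch, assuming GNU time is installed as /usr/bin/time (its %R format code is the number of minor page faults); minor_faults is a made-up helper name:

```shell
# Prints the number of minor page faults for one run of a command.
# Assumes GNU time at /usr/bin/time; other 'time' programs may not
# understand -f or -o.
minor_faults() {
    mf_out=$(mktemp) || return 1
    # -f '%R' selects minor page faults, -o writes the report to a file
    # so that it cannot get mixed up with the command's own output.
    /usr/bin/time -f '%R' -o "$mf_out" "$@" >/dev/null 2>&1
    # If the command fails, GNU time adds a status line first;
    # the count is always the last line.
    tail -n 1 "$mf_out"
    rm -f "$mf_out"
}
```

For example: minor_faults ./testapp
Running it a few times per build and comparing the numbers is probably more telling than a single run.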
But then again... this might be going overboard with the tuning :-)