KHTML-Measurement (from my talk at aKademy)

David Faure faure at kde.org
Thu Aug 26 18:22:05 CEST 2004


On Thursday 26 August 2004 18:15, Josef Weidendorfer wrote:
> On Thursday 26 August 2004 15:32, Simon Perreault wrote:
> > On Thursday August 26 2004 6:08, Josef Weidendorfer wrote:
> > > Another source for stall time (wasted time) on modern processors is
> > > branch misprediction. Work is done in a pipelined manner. And if there is
> > > a mispredicted jump target, the pipeline has to be flushed. If I remember
> > > right, an Athlon or P-III has a pipeline length of around 15, a P4 has
> > > 21, and a P4 Prescott has 30. So every mispredicted branch costs around
> > > 15 cycles wasted time on my notebook.
> >
> > I have been doing a lot of optimization of numerical software lately,
> > nothing related to KDE, but heavy optimization nevertheless. Even for
> > numerical computations, optimizing for branch prediction is at a way too
> > low level. The effort will be huge for almost no gain. And, as you say,
> > different CPUs have different algorithms, so this almost always negates any
> > gain you might have. You'd better leave that optimization to compilers that
> > feature profile-guided optimization, like Intel's. Never optimize at a
> > level below the compiler's.
> 
> You are right.
> 
> But I never suggested optimizing the given function e.g. by doing strange 
> things like inline assembler in KDE code. I simply stated the only 
> explanation I can imagine for the difference of simulation results to 
> reality.

There's a way to help branch prediction without writing assembly code, though:
see KDE_ISLIKELY and KDE_ISUNLIKELY in kdemacros.h. They wrap gcc's
__builtin_expect, which tells the compiler which outcome of a condition to expect.
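
A rough sketch of how such hints are typically used follows. The SKETCH_LIKELY and SKETCH_UNLIKELY macros and the sumNonNegative function are made-up names for illustration only. They stand in for the kdemacros.h macros so the example compiles on its own, and the real definitions may differ in detail.

```cpp
#include <cstddef>

// Hypothetical stand-ins for KDE_ISLIKELY / KDE_ISUNLIKELY, so that this
// sketch compiles without kdemacros.h. On compilers without
// __builtin_expect the hint simply disappears.
#if defined(__GNUC__)
#  define SKETCH_LIKELY(x)   __builtin_expect(!!(x), 1)
#  define SKETCH_UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#  define SKETCH_LIKELY(x)   (x)
#  define SKETCH_UNLIKELY(x) (x)
#endif

// Sums the non-negative entries of an array.
// Returns -1 if the array pointer is null (assumed to be a rare error path).
long sumNonNegative(const int *values, std::size_t count)
{
    // Error handling: hint that this branch is almost never taken.
    if (SKETCH_UNLIKELY(!values))
        return -1;

    long total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        // Assumption for this example: most entries are non-negative.
        if (SKETCH_LIKELY(values[i] >= 0))
            total += values[i];
    }
    return total;
}
```

The hint only influences how the compiler lays out the code (which path falls through); it never changes the result. A wrong hint costs speed, not correctness, so it is best kept for branches whose bias is obvious, such as error checks.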

-- 
David Faure, faure at kde.org, sponsored by Trolltech to work on KDE,
Konqueror (http://www.konqueror.org), and KOffice (http://www.koffice.org).

