stupid question: how to use malloc/new efficiently

David Leimbach leimy2k at mac.com
Sat Jan 17 20:30:59 CET 2004


On Jan 17, 2004, at 12:25 PM, Tim Jansen wrote:

> On Saturday 17 January 2004 16:26, Alexander Neundorf wrote:
>> I never thought about it until I recently read it somewhere:
>> should one always use malloc/new for buffers with sizes which are
>> multiples of 1K or one page, i.e. usually 4K or something like this?
>
> This should really be the last optimization resort, but it can have
> advantages if your buffer size is a power of 2. You need to make sure
> that not only the page size is correct, but that the beginning of the
> buffer is also aligned (thus for 4K pages the address needs to be a
> multiple of 4096). Then, if the memory is swapped out, the kernel may
> need to swap only one page in instead of two.
>
> A similar optimization is possible with the CPU caches. CPUs usually
> work with 'cache lines'. If your data structure fits into one or more
> cache lines, only these cache lines need to be fetched. As with pages,
> you need to align the data correctly.
>
> Note that both optimizations do not make sense for sequential data
> access, only for random access. On the Intel developer site you can
> find lots of documents that explain how to align your data and
> optimize cache use.

Again, this is platform-specific, but the G5 actually likes sequential
access better than random access.

From: http://developer.apple.com/performance/g5optimization.html

"Keep the Power Mac G5 Well-Fed

When optimizing, remember that Power Mac G5 computers are very hungry, 
very fast, and very sequential. This means that they consume very large 
amounts of data at one time, that they process it very quickly, and 
that nonsequential instructions and data accesses cause significant 
performance penalties. Many of the optimizations described below and in 
other Apple-supplied documentation cater to these characteristics. You 
should keep them in mind when you look for opportunities to optimize 
your software."

So an optimization for one platform may not be an optimization for
another, depending on which direction you look at it from.

This document actually has fairly good tips for optimizing in a more
generic way, like using gcc's -O3 flag. The "Optimization, Level by
Level" section is geared towards the G5, but the same basic rules apply
everywhere.


Dave


