Autovectorization, SIMD, etc.
Cyrille Berger
cberger at cberger.net
Sat Dec 23 18:13:13 CET 2006
Hi,
Thanks for the link, but I had already looked at it and discarded it, for three
reasons:
- It is a C API. It might sound childish, but when you are used to
well-designed C++ APIs, most C APIs look scary and you don't want to touch them.
- The first time I looked at it, I didn't understand how to use it...
- And most important: you call each function with a float*/char*/whatever*
array, then the function loads the data from the pointer into the mmx/sse
registers, does the operation, and stores the registers back to the result
array. That might be fine for a single small operation, but if you want to
compute the sum of pixels and store the result in the destination, you end up
doing too many loads and stores (and I don't want to trust gcc to optimize this
and remove the unnecessary loading/unloading).
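To make the third point concrete, here is a minimal scalar sketch of the two styles. The names add4, sumViaCalls and sumFused are hypothetical, and a layout of 4 unsigned 8-bit channels per pixel is assumed; none of this is the API of the library discussed above. A compiler may well inline and optimize the first style, so treat this as an illustration of the memory traffic the API shape implies, not as a measurement.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical C-style primitive: every call reads both inputs from memory
// and writes the result back to memory.
void add4(std::uint32_t* dst, const std::uint32_t* a, const std::uint32_t* b)
{
    for (int c = 0; c < 4; ++c)
        dst[c] = a[c] + b[c];
}

// Style A: one library call per pixel. The running sum lives in an array
// that is loaded and stored again on every iteration.
void sumViaCalls(std::uint32_t* sum, const std::uint8_t* src, std::size_t nPixels)
{
    assert(sum != nullptr && src != nullptr);
    for (int c = 0; c < 4; ++c)
        sum[c] = 0;
    for (std::size_t i = 0; i < nPixels; ++i) {
        std::uint32_t widened[4];
        for (int c = 0; c < 4; ++c)
            widened[c] = src[4 * i + c];   // widen 8-bit channels to 32 bits
        add4(sum, sum, widened);           // load sum, add, store sum
    }
}

// Style B: fused loop. The running sum stays in local variables (which the
// compiler can keep in registers) and is written out only once at the end.
void sumFused(std::uint32_t* sum, const std::uint8_t* src, std::size_t nPixels)
{
    assert(sum != nullptr && src != nullptr);
    std::uint32_t r = 0, g = 0, b = 0, a = 0;
    for (std::size_t i = 0; i < nPixels; ++i) {
        r += src[4 * i + 0];
        g += src[4 * i + 1];
        b += src[4 * i + 2];
        a += src[4 * i + 3];
    }
    sum[0] = r; sum[1] = g; sum[2] = b; sum[3] = a;
}
```

Both functions compute the same per-channel sums; only the number of trips through memory differs.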
Basically, my requirements were to be able to write code like this:
void sumColors(char* resarr, char* srcarr, int nPixels)
{
    Vector result;
    Packet source(srcarr);
    for(int i = 0; i < nPixels; i++)
    {
        result += source;
        source.nextPixel();
    }
    result /= nPixels;
    result.copyTo(resarr);
}
And then let the compiler build two sumColors functions for me, one with
SIMD and one without. I am close to being able to do this.
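One possible way to get there is to template sumColors on a backend that supplies the Vector and Packet types. The sketch below shows only a scalar backend; ScalarVector, ScalarPacket, ScalarBackend and the channel() accessor are hypothetical names, and it assumes 4 unsigned 8-bit channels per pixel (so it uses unsigned char* rather than char*). A SIMD backend would provide the same interface on top of vector registers.

```cpp
#include <cassert>

// Assumption (not from the post): 4 channels per pixel, 8 bits each.
const int kChannels = 4;

// Scalar stand-in for Packet: a cursor over the source pixels.
class ScalarPacket {
public:
    explicit ScalarPacket(const unsigned char* p) : m_pos(p) {}
    unsigned channel(int c) const { return m_pos[c]; }
    void nextPixel() { m_pos += kChannels; }
private:
    const unsigned char* m_pos;
};

// Scalar stand-in for Vector: one wide accumulator per channel.
class ScalarVector {
public:
    ScalarVector() { for (int c = 0; c < kChannels; ++c) m_sum[c] = 0; }
    ScalarVector& operator+=(const ScalarPacket& p) {
        for (int c = 0; c < kChannels; ++c) m_sum[c] += p.channel(c);
        return *this;
    }
    ScalarVector& operator/=(int d) {
        assert(d > 0);
        for (int c = 0; c < kChannels; ++c) m_sum[c] /= static_cast<unsigned>(d);
        return *this;
    }
    void copyTo(unsigned char* dst) const {
        for (int c = 0; c < kChannels; ++c)
            dst[c] = static_cast<unsigned char>(m_sum[c]);
    }
private:
    unsigned m_sum[kChannels];
};

// A backend bundles the two types; a SIMD backend would define its own pair.
struct ScalarBackend {
    typedef ScalarVector Vector;
    typedef ScalarPacket Packet;
};

// Written once; instantiated once per backend.
template <class Backend>
void sumColors(unsigned char* resarr, const unsigned char* srcarr, int nPixels)
{
    typename Backend::Vector result;
    typename Backend::Packet source(srcarr);
    for (int i = 0; i < nPixels; i++) {
        result += source;
        source.nextPixel();
    }
    result /= nPixels;
    result.copyTo(resarr);
}
```

Calling sumColors<ScalarBackend>(out, src, n) writes the per-channel average of n pixels into out; the loop body is the same whichever backend is plugged in.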
macstl (http://www.pixelglow.com/macstl/, it also works on non-Mac
hardware ;) ) came very close to letting me do this, but it had a few
problems as well: first, it wasn't really possible to select at runtime which
instruction set to use; second, the code above would have looked a
little more cluttered (plus another reason I have forgotten).
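Runtime selection of the instruction set could look roughly like the sketch below: build both variants, then pick one through a function pointer after asking the CPU what it supports. The names sumColorsScalar, sumColorsSimd, cpuHasSse2 and selectSumColors are hypothetical, the SIMD variant is only a placeholder that forwards to the scalar code, and 4 unsigned 8-bit channels per pixel are assumed. The detection uses the GCC/Clang builtin __builtin_cpu_supports on x86 and falls back to the scalar path elsewhere.

```cpp
#include <cassert>

typedef void (*SumColorsFn)(unsigned char*, const unsigned char*, int);

// Plain C++ variant: per-channel average of nPixels pixels.
void sumColorsScalar(unsigned char* resarr, const unsigned char* srcarr, int nPixels)
{
    assert(nPixels > 0);
    unsigned sum[4] = {0, 0, 0, 0};
    for (int i = 0; i < nPixels; ++i)
        for (int c = 0; c < 4; ++c)
            sum[c] += srcarr[4 * i + c];
    for (int c = 0; c < 4; ++c)
        resarr[c] = static_cast<unsigned char>(sum[c] / static_cast<unsigned>(nPixels));
}

// Placeholder for the SIMD build of the same function. It forwards to the
// scalar code so that this sketch stays runnable everywhere.
void sumColorsSimd(unsigned char* resarr, const unsigned char* srcarr, int nPixels)
{
    sumColorsScalar(resarr, srcarr, nPixels);
}

// Runtime CPU check. Only GCC/Clang on x86 is handled here; any other
// compiler or architecture reports "no SSE2" and gets the scalar variant.
bool cpuHasSse2()
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    return __builtin_cpu_supports("sse2") != 0;
#else
    return false;
#endif
}

// Pick the implementation once; callers keep the pointer and reuse it.
SumColorsFn selectSumColors()
{
    return cpuHasSse2() ? sumColorsSimd : sumColorsScalar;
}
```

The caller would do something like SumColorsFn fn = selectSumColors(); once at startup and then call fn(out, src, n) wherever sumColors is needed.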
(I even want the Packet class to be able to work on two pixels (or more) at a
time if there is enough room in the vector, and I want to write sumColors
only once.)
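A rough scalar sketch of that idea: make the number of pixels handled per step a template parameter, keep one accumulator lane per pixel and channel, and fold the lanes together at the end. WidePacket, WideVector, nextGroup and averageTo are hypothetical names, 4 unsigned 8-bit channels per pixel are assumed, and leftover pixels (when nPixels is not a multiple of the step) are not handled.

```cpp
#include <cassert>

// Assumption (not from the post): 4 channels per pixel, 8 bits each.
const int kChannels = 4;

// Cursor that exposes PixelsPerStep consecutive pixels at once.
template <int PixelsPerStep>
class WidePacket {
public:
    explicit WidePacket(const unsigned char* p) : m_pos(p) {}
    // Channel c of the k-th pixel in the current group.
    unsigned channel(int k, int c) const { return m_pos[k * kChannels + c]; }
    void nextGroup() { m_pos += PixelsPerStep * kChannels; }
private:
    const unsigned char* m_pos;
};

// One accumulator lane per (pixel slot, channel), like lanes in a register.
template <int PixelsPerStep>
class WideVector {
public:
    WideVector() {
        for (int l = 0; l < PixelsPerStep * kChannels; ++l) m_lane[l] = 0;
    }
    WideVector& operator+=(const WidePacket<PixelsPerStep>& p) {
        for (int k = 0; k < PixelsPerStep; ++k)
            for (int c = 0; c < kChannels; ++c)
                m_lane[k * kChannels + c] += p.channel(k, c);
        return *this;
    }
    // Fold the pixel slots together per channel, then divide by the count.
    void averageTo(unsigned char* dst, int nPixels) const {
        for (int c = 0; c < kChannels; ++c) {
            unsigned total = 0;
            for (int k = 0; k < PixelsPerStep; ++k)
                total += m_lane[k * kChannels + c];
            dst[c] = static_cast<unsigned char>(total / static_cast<unsigned>(nPixels));
        }
    }
private:
    unsigned m_lane[PixelsPerStep * kChannels];
};

// Written once; the step width is chosen at instantiation time.
template <int PixelsPerStep>
void sumColors(unsigned char* resarr, const unsigned char* srcarr, int nPixels)
{
    assert(nPixels > 0 && nPixels % PixelsPerStep == 0);  // no tail handling
    WideVector<PixelsPerStep> result;
    WidePacket<PixelsPerStep> source(srcarr);
    for (int i = 0; i < nPixels; i += PixelsPerStep) {
        result += source;
        source.nextGroup();
    }
    result.averageTo(resarr, nPixels);
}
```

sumColors<1> and sumColors<2> run the same source code and produce the same averages; only the amount of data consumed per loop iteration changes.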
--
--- Cyrille Berger ---