[Kde-graphics-devel] Descriptions of new stuff

Wed Dec 15 15:30:02 CET 2004

Okay, here are some of the methods currently in KImageEffect that I've 
rewritten for MImage. All of them have been rewritten from scratch, some have 
fixes I probably forgot to mention, and all have been changed to use somewhat 
standard C++ instead of weird constructs I initially used when porting from 
ImageMagick. This time I actually took the time to make sure I understood the 
algorithm behind the effects instead of just doing straight ports of code, so 
they seem to operate much better. Much of this code I did while on hiatus 
years ago, so it's been well tested by now.

Several methods also support MMX/3dnow. I haven't programmed asm since the 
Z80, and never gcc inline asm, so it's probably not the best MMX code, but I 
tested all of it and they are at least twice as fast as the C++ versions ;-) 
Things that could be improved are adding prefetching and no longer using
registers for all the parameters, which I only did because I'm lazy. I haven't written a CPUID 
method yet, everything is an #ifdef set in configure, but this can easily be 
changed to KDE's cpuinfo class. I gotta do something like this anyways since 
it needs to be detected at runtime.

Anyways, onto the methods!

***Smoothscaling:
Since this is done a lot it is what everyone tries to optimize. Qt uses a 
method based on NetPBM and is mostly integer. It's pretty good. The first 
thing I did was add some MMX and 3dnow inline asm. I was able to use 
doubleword operations to handle two pixel components at once and reciprocal 
3dnow multiplication for the division. It was about twice as fast as the 
original NetPBM/Qt version.

Then I got a look at Imlib2's scale. It replaces most of the integer 
multiplication and division with fast bit shifts. Fairly complicated code, 
but the normal C version runs as fast as my MMX optimized one, (twice as fast 
as Qt/NetPBM). The MMX optimized Imlib2 scale runs twice as fast as my MMX 
optimized one and 4x as fast as Qt. Definitely a good one here. 
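
To give a rough idea of the "shifts instead of multiply/divide" trick, here is a minimal sketch. It is not Imlib2's code and not what is in MImage: scaleRowFixed() is a made-up name, it handles a single 8-bit channel, and it does plain linear interpolation with 16.16 fixed-point stepping. The point is only that the one division happens once per row and the inner loop gets by with adds and shifts.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper for illustration only (not Imlib2 or MImage code).
// Resamples one row of 8-bit samples with 16.16 fixed-point stepping.
std::vector<uint8_t> scaleRowFixed(const std::vector<uint8_t> &src, int dstWidth)
{
    assert(!src.empty() && src.size() < 65536 && dstWidth > 0);
    const int srcWidth = static_cast<int>(src.size());
    std::vector<uint8_t> dst(dstWidth);

    // The only division: source step per destination sample, once per row.
    const uint32_t step = (static_cast<uint32_t>(srcWidth) << 16) / dstWidth;

    uint32_t pos = 0; // current source position, 16.16 fixed point
    for (int x = 0; x < dstWidth; ++x, pos += step) {
        const int i0 = static_cast<int>(pos >> 16);       // integer part
        const int i1 = (i0 + 1 < srcWidth) ? i0 + 1 : i0; // clamp at the right edge
        const uint32_t frac = (pos >> 8) & 0xff;          // top 8 fractional bits
        // Blend the two neighbours; the >> 8 replaces a division by 256.
        dst[x] = static_cast<uint8_t>((src[i0] * (256 - frac) + src[i1] * frac) >> 8);
    }
    return dst;
}
```

Scaling the row {0, 200} up to 4 samples gives {0, 100, 200, 200}: the second sample lands halfway between the two sources, the last one is clamped at the edge.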

I've finished porting both the C and asm Imlib2 scales to QImage. It was a 
pretty straightforward port of Imlib's scale.c. Mostly I just stuck everything 
in a namespace, moved things around to work off of Qt BGRA scanlines, and 
removed the unneeded border code, (as well as fixed the formatting ;-). Seems 
to run fine. The asm routines did not need to be modified at all.

***Blur:
Imlib also has a pretty nice, fast blur. I ported this as well. 

For photographic quality you want to use a Gaussian blur, tho. This is 
currently broken in KImageEffect. Lots of crazy stuff I did when I initially 
ported this. Many things are fixed now: improper column calculation, removal 
of upscaling/downscaling that did nothing, improper scaling of the middle of 
the gaussian filter, all sorts of crap. It actually works properly now >:) I
must have been severely brain damaged when I originally wrote this. Efficiency
improvements as well, esp when processing by column and not row.

It's not only fixed now but has a 3dnow version that's significantly faster. 
Actually, both the normal and 3dnow versions are faster. Results should be 
identical to ImageMagick 5.5.7.
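
For anyone who hasn't looked at how a Gaussian filter is built, a minimal sketch of the kernel setup follows. It assumes a simple radius/sigma interface; gaussianKernel() is a hypothetical name and this is not the exact MImage or ImageMagick code. The weights are normalized to sum to 1 so the blur doesn't change overall brightness, and since the filter is separable the same 1D kernel can be run over the rows and then over the columns.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper for illustration: builds a normalized 1D Gaussian
// kernel with 2*radius+1 taps.
std::vector<float> gaussianKernel(int radius, float sigma)
{
    assert(radius >= 0 && sigma > 0.0f);
    std::vector<float> kernel(2 * radius + 1);
    float sum = 0.0f;
    for (int i = -radius; i <= radius; ++i) {
        // Unnormalized Gaussian weight at distance i from the center.
        const float w = std::exp(-(i * i) / (2.0f * sigma * sigma));
        kernel[i + radius] = w;
        sum += w;
    }
    // Normalize so the weights add up to 1.
    for (float &w : kernel)
        w /= sum;
    return kernel;
}
```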

I suggest using the Imlib blur for quick operations, the Gaussian blur when 
dealing with photographs. Another idea I've had is to have a global setting 
for the quality of various effects. This could do neat things like always use 
a Gaussian blur on machines that support 3dnow but use either that or the 
Imlib based blur on other machines depending on the quality setting. It would 
also affect convolve matrix sizes and other stuff. Just an idea. 

***Convolve (Edge, Emboss, Charcoal, etc...):
Convolve is another method that's pretty important. A good convolve method can 
do everything but walk your dog for you, from Edging and Embossing to 
Gaussian filters. Here is a little background:

These methods are based on "pixel neighborhoods". The "neighbors" of a pixel 
affect that pixel's value. For example, if "p" is a pixel: 

000
0p0
000

Each pixel represented by "0" is used when calculating the value of "p" if 
using a 3x3 matrix. You can use larger (odd numbered) matrices as well, and
that just adds to the surrounding pixels that are calculated.
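
The neighborhood idea can be sketched in a few lines. This is only an illustration under some assumptions: convolve3x3() is a made-up name, it works on a single 8-bit channel, and it handles the image border by clamping coordinates (repeating the edge pixel), which may not be what any particular implementation does.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper for illustration: 3x3 convolve of a w*h single-channel
// image. Neighbours that fall outside the image are clamped to the edge.
std::vector<uint8_t> convolve3x3(const std::vector<uint8_t> &src, int w, int h,
                                 const float kernel[9])
{
    assert(w > 0 && h > 0 && src.size() == static_cast<std::size_t>(w) * h);
    std::vector<uint8_t> dst(src.size());
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            float acc = 0.0f;
            for (int ky = -1; ky <= 1; ++ky) {
                for (int kx = -1; kx <= 1; ++kx) {
                    const int sx = std::min(std::max(x + kx, 0), w - 1);
                    const int sy = std::min(std::max(y + ky, 0), h - 1);
                    acc += kernel[(ky + 1) * 3 + (kx + 1)] * src[sy * w + sx];
                }
            }
            // Range check: clamp to 0..255, then round to nearest.
            dst[y * w + x] =
                static_cast<uint8_t>(std::min(std::max(acc, 0.0f), 255.0f) + 0.5f);
        }
    }
    return dst;
}
```

A matrix that is all zeros with a 1 in the middle returns the image unchanged; swapping in other weights gives you edge, emboss, blur and so on.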

I've entirely rewritten this method. I'm not even sure how many bugs were 
fixed ;-) Much saner pixel neighborhood calculation, no more unreadable 
jumptable code, and faster because it only checks if the pixel neighborhood 
is outside horizontal image boundaries when on an edge. Results now pretty 
much exactly match ImageMagick, although I will probably add an option to use 
a smaller matrix that more properly matches our 8bit pixel components. This 
would avoid range checking, which is now always used.

This also now supports 3dnow for a rather sizable performance increase. Many 
people don't know about convolve but it's one of the methods that really
excites me. There are tons of algorithms for it we can implement. I'm also 
considering doing an integer-only version ala ImageMagick 4. That way we can 
do faster, lower quality convolves as well.

***HSV Contrast, (contrastHSV, simpleContrast):
A simple operation in theory, you lighten light pixels and darken dark ones. 
The thing is, you're not supposed to use a fixed amount based on brightness. 
You're supposed to use a curve. ImageMagick uses the following algorithm:

alpha*(alpha*(sin(M_PI*(brightness-alpha))+1.0)-brightness)

where alpha is 0.5+M_EPSILON. BTW, I used the wrong algorithm in KImageEffect. 
It was for HSL not HSV and brightness was calculated wrong. 

Anyways, if you calculate this for brightness values in the 0-255 range that
Qt uses then you can replace this with a 256 byte lookup table. 
Since it's a curve half the table is the same as the other half, but with 
inverse values, so it can actually be a 128 byte lookup table. This is what I 
do now.
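
Roughly, the half-size table works like the sketch below. The assumptions: ContrastTable is a made-up name, alpha is taken as exactly 0.5 (the epsilon is left out), and the table stores the signed adjustment from the formula above scaled to 0-255. How that adjustment then gets applied to the brightness is left out. Since the curve satisfies f(1-b) = -f(b), entry 255-v is just the negated entry v, so 128 signed bytes are enough.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical sketch, not the MImage code: 128-entry table of signed
// brightness adjustments, mirrored for the upper half of the range.
struct ContrastTable {
    int8_t delta[128];

    ContrastTable()
    {
        const double pi = 3.14159265358979323846;
        const double alpha = 0.5; // assumption: epsilon omitted
        for (int i = 0; i < 128; ++i) {
            const double b = i / 255.0; // brightness as 0.0..1.0
            const double d =
                alpha * (alpha * (std::sin(pi * (b - alpha)) + 1.0) - b);
            delta[i] = static_cast<int8_t>(std::lround(d * 255.0));
        }
    }

    // Adjustment for any brightness 0..255 using only the 128 stored values.
    int adjust(int v) const
    {
        assert(v >= 0 && v <= 255);
        return v < 128 ? delta[v] : -delta[255 - v];
    }
};
```

Dark values come out with a negative adjustment and light values with a positive one, which is the "darken dark, lighten light" behavior described above.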

Which brings us to the simple contrast method in KImageEffect. This is all 
wrong. Despite what I previously hoped you cannot use grayscale values to
decide what to lighten and what to darken and you can't just add increments to
RGB values. I would like to obsolete this method and replace it with the 
proper HSV contrast. It would be slower than the current simpleContrast() 
method because of the HSV calculation, but faster than contrastHSV() because 
of the lookup table, and much more correct.

***Histogram based effects (Equalize, Normalize):
This now uses faster integer histograms and maps instead of doubles and no 
longer has unneeded upscale/downscale code that did nothing. The code is now
much faster and more readable. Not sure why I originally did this in floating 
point other than that's how ImageMagick did it...
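
As a sketch of what "integer histograms and maps" means, here is a textbook integer-only equalize for a single 8-bit channel. It is an assumption-laden illustration, not the MImage code: equalize() is a hypothetical name and the real thing works on full color images.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper for illustration: histogram equalization of one 8-bit
// channel using only integer math.
void equalize(std::vector<uint8_t> &pixels)
{
    assert(!pixels.empty());

    // Integer histogram: how often each value occurs.
    uint32_t hist[256] = {0};
    for (uint8_t p : pixels)
        ++hist[p];

    // Cumulative histogram.
    uint32_t cdf[256];
    uint32_t running = 0;
    for (int i = 0; i < 256; ++i) {
        running += hist[i];
        cdf[i] = running;
    }

    // Cumulative count at the lowest value that actually occurs.
    uint32_t cdfMin = 0;
    for (int i = 0; i < 256; ++i) {
        if (hist[i]) {
            cdfMin = cdf[i];
            break;
        }
    }
    const uint32_t total = running;
    if (total == cdfMin)
        return; // flat image, nothing to stretch

    // Integer map from old value to new value.
    uint8_t map[256];
    for (int i = 0; i < 256; ++i) {
        const uint32_t c = cdf[i] > cdfMin ? cdf[i] - cdfMin : 0;
        map[i] = static_cast<uint8_t>((static_cast<uint64_t>(c) * 255) /
                                      (total - cdfMin));
    }
    for (uint8_t &p : pixels)
        p = map[p];
}
```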

***Despeckle (and Hull):
Rewritten and now takes 1/4th the memory it used to. I still have to do an
8bpp palette source version, tho. For now it promotes 8bpp images instead.

***Grayscale:
MMX SIMD support for handling 2 pixels at once and pmaddwd. 32bpp images can 
optionally be reduced to 8bpp palette. This is all you need anyways since red 
= green = blue, but you may not want to reduce the image if you're going to be 
doing more effects or scaling.
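
In plain C++ the multiply-and-add that pmaddwd does in one instruction looks something like this sketch. The weights (11, 16, 5, sum 32) are an assumption on my part for illustration, picked because they are the integer weights Qt's qGray() uses; grayWeighted() and toGray8() are hypothetical names. The second function shows the reduction to one byte per pixel, which can then index a 256-entry gray palette.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: weighted gray value of one 32-bit ARGB pixel.
// Weights 11/16/5 are an assumption (they sum to 32, so >> 5 divides).
inline uint8_t grayWeighted(uint32_t argb)
{
    const uint32_t r = (argb >> 16) & 0xff;
    const uint32_t g = (argb >> 8) & 0xff;
    const uint32_t b = argb & 0xff;
    return static_cast<uint8_t>((r * 11 + g * 16 + b * 5) >> 5);
}

// Reduce 32bpp pixels to 8bpp indices into a gray palette (index == gray).
std::vector<uint8_t> toGray8(const std::vector<uint32_t> &pixels)
{
    std::vector<uint8_t> indices;
    indices.reserve(pixels.size());
    for (uint32_t p : pixels)
        indices.push_back(grayWeighted(p));
    assert(indices.size() == pixels.size());
    return indices;
}
```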

***Invert:
MMX SIMD for this, too. Nothing special, just does the xor on 2 doublewords 
at once.
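
The same two-at-a-time idea can be sketched without asm by treating a pair of 32-bit pixels as one 64-bit word. This is only an illustration: invertPixels() is a hypothetical name, and the mask here leaves the alpha byte alone, which is my assumption and not necessarily what the MMX version does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: invert the RGB bytes of 32-bit pixels, two pixels per
// xor. Alpha is left untouched (an assumption for this sketch).
void invertPixels(uint32_t *pixels, std::size_t count)
{
    assert(pixels != nullptr || count == 0);
    const uint64_t mask2 = 0x00ffffff00ffffffULL; // same mask for both pixels
    std::size_t i = 0;
    for (; i + 2 <= count; i += 2) {
        uint64_t pair;
        std::memcpy(&pair, pixels + i, sizeof pair); // load two pixels
        pair ^= mask2;                               // invert both at once
        std::memcpy(pixels + i, &pair, sizeof pair); // store them back
    }
    if (i < count)
        pixels[i] ^= 0x00ffffffu; // odd pixel left over
}
```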