[Kde-graphics-devel] Descriptions of new stuff
Mosfet
dan.duley at verizon.net
Wed Dec 15 15:30:02 CET 2004
Okay, here are some of the methods currently in KImageEffect that I've
rewritten for MImage. All of them have been rewritten from scratch, some have
fixes I probably forgot to mention, and all have been changed to use somewhat
standard C++ instead of weird constructs I initially used when porting from
ImageMagick. This time I actually took the time to make sure I understood the
algorithm behind the effects instead of just doing straight ports of code, so
they seem to operate much better. Much of this code I did while on hiatus
years ago, so it's been well tested by now.
Several methods also support MMX/3dnow. I haven't programmed asm since the
Z80, and never gcc inline asm, so it's probably not the best MMX code, but I
tested all of it and they are at least twice as fast as the C++ versions ;-)
Things that could be improved: adding prefetching, and not putting every
parameter in a register, which I only did because I'm lazy. I haven't written
a CPUID method yet, so everything is an #ifdef set in configure, but this can
easily be changed to use KDE's cpuinfo class. I gotta do something like this
anyways since it needs to be detected at runtime.
Anyways, onto the methods!
***Smoothscaling:
Since this is done a lot it is what everyone tries to optimize. Qt uses a
method based on NetPBM and is mostly integer. It's pretty good. The first
thing I did was add some MMX and 3dnow inline asm. I was able to use
doubleword operations to handle two pixel components at once and reciprocal
3dnow multiplication for the division. It was about twice as fast as the
original NetPBM/Qt version.
Then I got a look at Imlib2's scale. It replaces most of the integer
multiplication and division with fast bit shifts. Fairly complicated code,
but the plain C version runs as fast as my MMX optimized one (twice as fast
as Qt/NetPBM). The MMX optimized Imlib2 scale runs twice as fast as my MMX
optimized one and 4x as fast as Qt. Definitely a good one here.
I've finished porting both the C and asm Imlib2 scales to QImage. It was a
pretty straightforward port of Imlib's scale.c. Mostly I just stuck everything
in a namespace, moved things around to work off of Qt BGRA scanlines, and
removed the unneeded border code, (as well as fixed the formatting ;-). Seems
to run fine. The asm routines did not need to be modified at all.
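
To illustrate the bit-shift idea mentioned above, here is a minimal sketch. It
is not Imlib2's scale.c or the MImage port. The function name scaleRowFixed is
made up for this example, and it only resamples one row of 8-bit components.
It assumes source positions are tracked in 16.16 fixed point, so the inner
loop needs no division and pulls the index and blend weight out with shifts.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: linearly resample one row of 8-bit samples.
std::vector<uint8_t> scaleRowFixed(const std::vector<uint8_t> &src, int dstW)
{
    assert(!src.empty() && src.size() < 65536 && dstW > 0);
    const int srcW = static_cast<int>(src.size());
    std::vector<uint8_t> dst(dstW);

    // One division per row (not per pixel) gives the 16.16 step size.
    const uint32_t step = dstW > 1
        ? (static_cast<uint32_t>(srcW - 1) << 16) / static_cast<uint32_t>(dstW - 1)
        : 0;

    uint32_t pos = 0;
    for (int x = 0; x < dstW; ++x, pos += step) {
        const int i = static_cast<int>(pos >> 16);   // integer source index
        const uint32_t frac = (pos >> 8) & 0xff;     // top 8 fractional bits
        const uint32_t a = src[i];
        const uint32_t b = src[i + 1 < srcW ? i + 1 : i];
        // Blend the two neighbours; ">> 8" replaces a division by 256.
        dst[x] = static_cast<uint8_t>((a * (256 - frac) + b * frac) >> 8);
    }
    return dst;
}
```

A real 32bpp scaler would do this for every component of every pixel and in
both directions, which is where SIMD starts to pay off.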
***Blur:
Imlib also has a pretty nice, fast blur. I ported this as well.
For photographic quality you want to use a Gaussian blur, tho. This is
currently broken in KImageEffect. Lots of crazy stuff I did when I initially
ported this. Many things are fixed now: improper column calculation, removal
of upscaling/downscaling that did nothing, improper scaling of the middle of
the gaussian filter, all sorts of crap. It actually works properly now >:) I
must have been severely brain damaged when I originally wrote this. Efficiency
improvements as well, esp when processing by column and not row.
It's not only fixed now but has a 3dnow version that's significantly faster.
Actually, both the normal and 3dnow versions are faster. Results should be
identical to ImageMagick 5.5.7.
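
For readers who haven't seen one, here is a rough sketch of a Gaussian blur
pass. This is not the MImage code and makes no claim to match ImageMagick's
output. The names gaussianKernel1D and blurRow are invented for the example,
and it assumes a single float channel with edge pixels repeated at the
borders. Running the same 1D kernel over rows and then over columns gives the
2D blur.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: normalized 1D Gaussian weights, 2 * radius + 1 wide.
std::vector<float> gaussianKernel1D(int radius, float sigma)
{
    assert(radius >= 0 && sigma > 0.0f);
    std::vector<float> k(2 * radius + 1);
    float sum = 0.0f;
    for (int i = -radius; i <= radius; ++i) {
        const float w = std::exp(-static_cast<float>(i * i) / (2.0f * sigma * sigma));
        k[i + radius] = w;
        sum += w;
    }
    // Normalize so the weights add up to 1 and brightness is preserved.
    for (std::size_t i = 0; i < k.size(); ++i)
        k[i] /= sum;
    return k;
}

// Hypothetical helper: blur one row, repeating the edge samples.
std::vector<float> blurRow(const std::vector<float> &row, const std::vector<float> &k)
{
    assert(k.size() % 2 == 1);
    const int n = static_cast<int>(row.size());
    const int r = static_cast<int>(k.size() / 2);
    std::vector<float> out(row.size());
    for (int x = 0; x < n; ++x) {
        float acc = 0.0f;
        for (int j = -r; j <= r; ++j) {
            const int xi = std::min(n - 1, std::max(0, x + j)); // clamp to the row
            acc += row[xi] * k[j + r];
        }
        out[x] = acc;
    }
    return out;
}
```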
I suggest using the Imlib blur for quick operations, the Gaussian blur when
dealing with photographs. Another idea I've had is to have a global setting
for the quality of various effects. This could do neat things like always use
a Gaussian blur on machines that support 3dnow but use either that or the
Imlib based blur on other machines depending on the quality setting. It would
also affect convolve matrix sizes and other stuff. Just an idea.
***Convolve (Edge, Emboss, Charcoal, etc...):
Convolve is another method that's pretty important. A good convolve method can
do everything but walk your dog for you, from Edging and Embossing to
Gaussian filters. Here is a little background:
These methods are based on "pixel neighborhoods". The "neighbors" of a pixel
affect that pixel's value. For example, if "p" is a pixel:
000
0p0
000
Each pixel represented by "0" is used when calculating the value of "p" if
using a 3x3 matrix. You can use larger (odd-sized) matrices as well, and
that just adds to the surrounding pixels that are included in the calculation.
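
The neighborhood idea above can be sketched roughly like this. It is only an
illustration, not the rewritten MImage method: the GrayImage type and the
convolve signature are made up for the example, it handles a single 8-bit
channel, and it assumes pixels past the border are replaced by the nearest
edge pixel.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical image type used only for this sketch.
struct GrayImage {
    int width;
    int height;
    std::vector<uint8_t> pixels; // row-major, width * height entries
};

// Apply a size x size kernel (size must be odd) to every pixel.
GrayImage convolve(const GrayImage &src, const std::vector<float> &kernel, int size)
{
    assert(size > 0 && size % 2 == 1);
    assert(static_cast<int>(kernel.size()) == size * size);
    assert(static_cast<int>(src.pixels.size()) == src.width * src.height);

    const int r = size / 2;
    GrayImage dst = src;
    for (int y = 0; y < src.height; ++y) {
        for (int x = 0; x < src.width; ++x) {
            float acc = 0.0f;
            for (int ky = -r; ky <= r; ++ky) {
                for (int kx = -r; kx <= r; ++kx) {
                    // Clamp neighbour coordinates to the image (assumed edge policy).
                    const int sx = std::min(src.width - 1, std::max(0, x + kx));
                    const int sy = std::min(src.height - 1, std::max(0, y + ky));
                    acc += kernel[(ky + r) * size + (kx + r)]
                         * src.pixels[sy * src.width + sx];
                }
            }
            // Range check: round and keep the result inside 0..255.
            const float rounded = std::min(255.0f, std::max(0.0f, acc + 0.5f));
            dst.pixels[y * src.width + x] = static_cast<uint8_t>(rounded);
        }
    }
    return dst;
}
```

Feeding it a kernel like 0,-1,0,-1,4,-1,0,-1,0 gives a simple edge detect, and
a kernel with a single 1 in the middle returns the image unchanged.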
I've entirely rewritten this method. I'm not even sure how many bugs were
fixed ;-) Much saner pixel neighborhood calculation, no more unreadable
jumptable code, and faster because it only checks if the pixel neighborhood
is outside horizontal image boundaries when on an edge. Results now pretty
much exactly match ImageMagick, although I will probably add an option to use
a smaller matrix that more properly matches our 8bit pixel components. This
would avoid range checking, which is now always used.
This also now supports 3dnow for a rather sizable performance increase. Many
people don't know about convolve but it's one of the methods that really
excites me. There are tons of algorithms for it we can implement. I'm also
considering doing an integer-only version a la ImageMagick 4. That way we can
do faster, lower quality convolves as well.
***HSV Contrast, (contrastHSV, simpleContrast):
A simple operation in theory: you lighten light pixels and darken dark ones.
The thing is, you're not supposed to use a fixed amount based on brightness.
You're supposed to use a curve. ImageMagick uses the following algorithm:
alpha*(alpha*(sin(M_PI*(brightness-alpha))+1.0)-brightness)
where alpha is 0.5+M_EPSILON. BTW, I used the wrong algorithm in KImageEffect.
It was for HSL not HSV and brightness was calculated wrong.
Anyways, if you calculate this for brightness values running from 0-255 like
Qt uses, rather than percentages, then you can replace this with a 256 byte
lookup table. Since the curve is symmetric, half the table is the same as the
other half but with inverse values, so it can actually be a 128 byte lookup
table. This is what I do now.
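
A minimal sketch of such a half-size table follows. It is an illustration
under assumptions, not the MImage code: it assumes the expression above is the
amount added to the brightness, it uses alpha = 0.5 exactly (dropping the small
epsilon), and the names ContrastTable, buildContrastTable, contrastDelta and
applyContrast are invented here.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical table: signed adjustment for brightness 0..127 only.
struct ContrastTable {
    int8_t delta[128];
};

ContrastTable buildContrastTable()
{
    const double pi = 3.14159265358979323846;
    const double alpha = 0.5; // assumption: the text uses 0.5 plus an epsilon
    ContrastTable t;
    for (int i = 0; i < 128; ++i) {
        const double b = i / 255.0; // brightness as a fraction of full scale
        const double d = alpha * (alpha * (std::sin(pi * (b - alpha)) + 1.0) - b);
        t.delta[i] = static_cast<int8_t>(std::lround(d * 255.0));
    }
    return t;
}

// Adjustment for any brightness 0..255; the upper half mirrors the lower
// half with the sign flipped.
int contrastDelta(const ContrastTable &t, int v)
{
    assert(v >= 0 && v <= 255);
    return v < 128 ? t.delta[v] : -t.delta[255 - v];
}

// Apply one contrast step to an 8-bit brightness value.
int applyContrast(const ContrastTable &t, int v)
{
    const int out = v + contrastDelta(t, v);
    return out < 0 ? 0 : (out > 255 ? 255 : out);
}
```

With alpha at exactly 0.5 the adjustment is zero at 0, at the midpoint and at
255, negative for dark values and positive for light ones.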
Which brings us to the simple contrast method in KImageEffect. This is all
wrong. Despite what I previously hoped, you cannot use grayscale values to
decide what to lighten and what to darken, and you cannot just add increments
to RGB values. I would like to obsolete this method and replace it with the
proper HSV contrast. It would be slower than the current simpleContrast()
method because of the HSV calculation, but faster than contrastHSV() because
of the lookup table, and much more correct.
***Histogram based effects (Equalize, Normalize):
This now uses faster integer histograms and maps instead of doubles and no
longer has unneeded upscale/downscale code that did nothing. The code is now
much faster and more readable. Not sure why I originally did this in floating
point other than that's how ImageMagick did it...
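
As a rough picture of what an integer histogram and map look like, here is a
textbook equalize on a single 8-bit channel. This is an assumption-laden
sketch, not the MImage or ImageMagick code, and the name equalizeGray is made
up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: equalize one 8-bit channel in place, integers only.
void equalizeGray(std::vector<uint8_t> &pixels)
{
    if (pixels.empty())
        return;

    // Integer histogram: how many pixels have each value.
    uint64_t hist[256] = {0};
    for (std::size_t i = 0; i < pixels.size(); ++i)
        ++hist[pixels[i]];

    // Cumulative counts, remembering the first non-zero one.
    uint64_t cum[256];
    uint64_t running = 0;
    uint64_t cdfMin = 0;
    for (int v = 0; v < 256; ++v) {
        running += hist[v];
        cum[v] = running;
        if (cdfMin == 0 && running != 0)
            cdfMin = running;
    }
    const uint64_t total = pixels.size();
    assert(running == total);
    if (cdfMin == total)
        return; // only one value present, nothing to spread out

    // Integer map from old value to new value, rounded to nearest.
    const uint64_t range = total - cdfMin;
    uint8_t map[256];
    for (int v = 0; v < 256; ++v) {
        map[v] = cum[v] <= cdfMin
            ? static_cast<uint8_t>(0)
            : static_cast<uint8_t>(((cum[v] - cdfMin) * 255 + range / 2) / range);
    }
    for (std::size_t i = 0; i < pixels.size(); ++i)
        pixels[i] = map[pixels[i]];
}
```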
***Despeckle (and Hull):
Rewritten and now takes 1/4th the memory it used to. I still have to do an
8bpp palette source version, tho. For now it promotes 8bpp images instead.
***Grayscale:
MMX SIMD support for handling 2 pixels at once using pmaddwd. 32bpp images
can optionally be reduced to 8bpp palette. This is all you need anyways since
red = green = blue, but you may not want to reduce the image if you're going
to be doing more effects or scaling.
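
In plain C++ the idea looks roughly like the sketch below. It is not the MMX
routine. It assumes the 11/16/5 integer weights that Qt's qGray() uses (the
multiply-and-add is the part pmaddwd would do in the SIMD version), and the
names grayOf, IndexedGray and toGrayIndexed are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Weighted gray value of one 0xAARRGGBB pixel; the weights sum to 32, so a
// shift by 5 replaces the division.
inline uint8_t grayOf(uint32_t argb)
{
    const uint32_t r = (argb >> 16) & 0xff;
    const uint32_t g = (argb >> 8) & 0xff;
    const uint32_t b = argb & 0xff;
    return static_cast<uint8_t>((r * 11 + g * 16 + b * 5) >> 5);
}

// Hypothetical 8bpp result: a 256 entry gray palette plus one index per pixel.
struct IndexedGray {
    std::vector<uint32_t> palette;
    std::vector<uint8_t> indices;
};

// Reduce 32bpp pixels to 8bpp palette form. Alpha is dropped in this sketch.
IndexedGray toGrayIndexed(const std::vector<uint32_t> &argb)
{
    IndexedGray out;
    out.palette.resize(256);
    for (uint32_t i = 0; i < 256; ++i)
        out.palette[i] = 0xff000000u | (i << 16) | (i << 8) | i;

    out.indices.reserve(argb.size());
    for (std::size_t i = 0; i < argb.size(); ++i)
        out.indices.push_back(grayOf(argb[i]));
    assert(out.indices.size() == argb.size());
    return out;
}
```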
***Invert:
MMX SIMD for this, too. Nothing special, just does the xor on 2 doublewords
at once.
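
A portable stand-in for the same trick, as a sketch only: this uses a 64-bit
integer instead of an MMX register to xor two 32bpp pixels per step. The name
invertPixels is made up, and it assumes the alpha byte should be left alone.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: invert the color bits of 0xAARRGGBB pixels in place.
void invertPixels(uint32_t *data, std::size_t count)
{
    assert(data != 0 || count == 0);
    const uint64_t mask2 = 0x00ffffff00ffffffULL; // color bits of two pixels
    std::size_t i = 0;
    for (; i + 1 < count; i += 2) {
        uint64_t pair;
        std::memcpy(&pair, data + i, sizeof pair); // load two pixels
        pair ^= mask2;                             // flip both at once
        std::memcpy(data + i, &pair, sizeof pair); // store them back
    }
    if (i < count)
        data[i] ^= 0x00ffffffu; // odd pixel left over
}
```

The mask is the same in both halves, so it doesn't matter which pixel ends up
in which half of the 64-bit value.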