[Kstars-devel] XYZ Rotation matrix in 3.5.x

Luciano Montanaro mikelima at cirulla.net
Thu Jun 15 10:10:01 CEST 2006


On Thursday 15 June 2006 03:03, James Bowlin wrote:
> I implemented the X-Y-Z rotation matrix for converting from celestial
> coordinates to screen coordinates in the 3.5.x branch.
>
> The good news is that it was very easy to implement.  The bad news is
> that it did not significantly speed things up.  I used the QTime code
> that Jason put in the 4.0 trunk in order to time the drawing of the Milky
> Way.  I chose to time this because I was familiar with the code and I
> already had two different versions of it in the program.  I didn't cull
> out any points so the new version had to deal with 250 more points,
> roughly an 8% increase. The new version also did line clipping but I
> arranged the display so this only occurred in a few places.
>
> Here are typical results for drawing the outline of the Milky Way:
>
> toXY() Milky Way took 7.23 ms  (old way)
>  X-Y-Z Milky Way took 6.00 ms  (new way)
>
> I put in a loop in each branch to run the code 100 times which gave me
> two extra digits of timing resolution.
>
> I did a simple timing experiment to compare the speed of a multiplication
> to the speed of doing sin() and in this experiment the multiplication was
> faster but not by much, maybe 20% or so.

Well, a 20% improvement is not something to sneeze at. Maybe it's not 
sufficient to visibly improve things, but it may be one of the needed 
optimizations.
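
For reference, a comparison like the sin()-versus-multiplication one
quoted above could be sketched roughly as follows. This is only a sketch,
not the code James ran: it uses std::chrono instead of QTime, and the
names nsPerCall, viaSin and viaMul are made up for illustration. The
volatile accumulator is there so the compiler cannot drop the loop.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>

// Hypothetical helper: average nanoseconds per call of `op`,
// measured over `iterations` runs of a tight loop.
template <typename Op>
double nsPerCall(Op op, int iterations) {
    assert(iterations > 0);
    volatile double sink = 0.0;  // keeps the loop from being optimized away
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i)
        sink = sink + op(0.001 * i);
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / iterations;
}

// The two operations being compared (illustrative only).
inline double viaSin(double a) { return std::sin(a); }
inline double viaMul(double a) { return a * 1.000001; }
```

Calling nsPerCall(viaSin, 1000000) and nsPerCall(viaMul, 1000000) gives
two numbers to compare. The loop and call overhead is included in both,
so a small difference between them may understate the real gap.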

>
> So it appears that modern processors can do simple sin() and cos()
> calculations almost as fast as multiplication.  This implies that there
> is probably not much speed boost from caching sin() and cos() values,
> at least for modern processors.
>

Are you sure the speedup is not masked by the overhead of function calls?
Maybe the toScreen function should be inlined.

Also in the toScreen function, you evaluate     

    double Width = ( width() * screen_scale );
    double Height = ( height() * screen_scale );

each time, although these values do not change between invocations.
(You could actually cache Width * 0.5 and Height * 0.5, from what I see.)

Obviously, these optimizations will not do much good if the problem is the 
function call overhead anyway.
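
To make the two suggestions concrete, here is a minimal sketch. It is not
the actual KStars code: ScreenProjector, updateSize and the members
halfWidth and halfHeight are hypothetical names, and toScreen is reduced
to the final offset step, assuming x and y are already projected offsets
from the screen centre in pixels.

```cpp
#include <cassert>

// Hypothetical stand-in for the widget state; not the real KStars class.
struct ScreenProjector {
    double halfWidth = 0.0;   // cached 0.5 * width()  * screen_scale
    double halfHeight = 0.0;  // cached 0.5 * height() * screen_scale

    // Call once when the widget is resized or the scale changes,
    // instead of recomputing the products for every point.
    void updateSize(int width, int height, double screenScale) {
        assert(width > 0 && height > 0);
        halfWidth = 0.5 * width * screenScale;
        halfHeight = 0.5 * height * screenScale;
    }

    // Defined inside the class body, so it is implicitly inline.
    // Assumes x, y are offsets from the screen centre, with y pointing up.
    void toScreen(double x, double y, double &sx, double &sy) const {
        sx = halfWidth + x;
        sy = halfHeight - y;  // screen Y grows downward
    }
};
```

With an 800x600 widget and a scale of 1.0, updateSize() stores 400 and
300 once, and each toScreen() call is then two additions with no
multiplication.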

> For completeness, I've included the new code below.  Only the first two
> routines are new.  98% of slowToScreenXYZ() came from the original
> toXY(). Maybe there is a more efficient way of doing the matrix
> calculation, making use of the SSE registers or something.  I tried
> removing the sqrt() from both branches but this didn't significantly
> alter the timing results.
>

Maybe. But vector instructions work best if you feed them with, well,
vectors, so it would probably be better to have a function that iterates
over a vector of coordinates.
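
Something along these lines is what I have in mind. It is only a sketch
under my own assumptions: Vec3, Mat3 and rotateAll are made-up names, not
existing KStars types, and whether the compiler actually emits SSE
instructions for the loop depends on the compiler and its flags.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical plain-data types for illustration.
struct Vec3 { double x, y, z; };
struct Mat3 { double m[3][3]; };

// Apply one rotation matrix to a whole array of points in a single call,
// so the call overhead is paid once and the loop body is simple enough
// for the compiler to consider vectorizing.
void rotateAll(const Mat3 &rot, const std::vector<Vec3> &in,
               std::vector<Vec3> &out) {
    assert(&in != &out);  // in-place use would overwrite inputs too early
    out.resize(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        const Vec3 &p = in[i];
        out[i].x = rot.m[0][0] * p.x + rot.m[0][1] * p.y + rot.m[0][2] * p.z;
        out[i].y = rot.m[1][0] * p.x + rot.m[1][1] * p.y + rot.m[1][2] * p.z;
        out[i].z = rot.m[2][0] * p.x + rot.m[2][1] * p.y + rot.m[2][2] * p.z;
    }
}
```

The matrix would be built once per frame and the whole Milky Way outline
passed through rotateAll() in one go, rather than calling a conversion
function per point.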


Luciano 


-- 
Łŭčīåñø Montanaro //
              \\ //
               \x/ www.cirulla.net

