[Kstars-devel] XYZ Rotation matrix in 3.5.x
Jason Harris
kstars at 30doradus.org
Thu Jun 15 17:35:55 CEST 2006
Hello,
Sorry I'm coming late to this thread. A couple of things I want to note:
+ It looks like you've implemented your own XYZ system for skypoint
("skyP->cx")...did you know that I already added XYZ members to SkyPoint in
trunk?
+ I don't think we need a function like slowToScreenXYZ() to compute the
rotation matrix. The matrix elements can be derived directly from the RA/Dec
of the Focus point:
Rotation about z-axis (i.e., rotation to bring Focus point's RA to center of
projection):
     cos(RA)   0   sin(RA)
       0       1     0
    -sin(RA)   0   cos(RA)
Rotation about x-axis (i.e., rotation to bring Focus point's Dec to center of
projection):
    1     0          0
    0   cos(Dec)  -sin(Dec)
    0   sin(Dec)   cos(Dec)
+ The code to convert from RA/Dec to Az/Alt is already in
SkyPoint::EquatorialToHorizontal(), commented out. I did some timing tests
of it just now, compared to the existing spherical trig method. To isolate
the codepath, I added a loop over the Stars which only calls
EquatorialToHorizontal() for each one, and bound this code to a key.
It also prints out the elapsed time for the loop.
In this case, I see no difference at all between the two methods! Both take
30 ms, on average. There's some extra overhead in the XYZ method, because I
redefine the rotation angles for each object, when these should only be done
once. So there are two extra SinCos() calls. Still, not a huge savings, I
think.
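
To make the "only done once" point concrete, here is a minimal sketch of
building the combined matrix a single time from the Focus RA/Dec and then
applying it to each object. It is not the SkyPoint code: Vec3, Mat3, toXYZ(),
focusRotation() and rotate() are hypothetical names, angles are plain doubles
in radians, and the x-y-z convention is an assumption. The first matrix, as
written above, leaves the middle coordinate untouched, so the sketch takes
that middle (y) axis as the polar axis.

```cpp
#include <array>
#include <cmath>

// Hypothetical types for illustration; not the KStars SkyPoint API.
struct Vec3 { double x, y, z; };
using Mat3 = std::array<std::array<double, 3>, 3>;

// Assumed convention (not stated in the thread): y points at the pole,
// z points at RA = 0, Dec = 0, and x = -cos(dec) * sin(ra).
// Angles are in radians.
Vec3 toXYZ(double ra, double dec) {
    return { -std::cos(dec) * std::sin(ra),
              std::sin(dec),
              std::cos(dec) * std::cos(ra) };
}

// Product R_dec * R_ra of the two matrices quoted above. The four
// trig evaluations happen here, once per Focus change, not per object.
Mat3 focusRotation(double focusRa, double focusDec) {
    const double sa = std::sin(focusRa),  ca = std::cos(focusRa);
    const double sd = std::sin(focusDec), cd = std::cos(focusDec);
    Mat3 m = {{
        {{  ca,       0.0,  sa      }},
        {{  sd * sa,  cd,  -sd * ca }},
        {{ -cd * sa,  sd,   cd * ca }}
    }};
    return m;
}

// Per-object work: nine multiplications and six additions, no trig.
Vec3 rotate(const Mat3 &m, const Vec3 &v) {
    return { m[0][0] * v.x + m[0][1] * v.y + m[0][2] * v.z,
             m[1][0] * v.x + m[1][1] * v.y + m[1][2] * v.z,
             m[2][0] * v.x + m[2][1] * v.y + m[2][2] * v.z };
}
```

Under that assumed convention the Focus point itself lands on (0, 0, 1),
which is a quick sanity check on the signs. A loop over the stars would call
focusRotation() once before the loop and rotate() inside it; if each object
already stores its x-y-z, toXYZ() drops out of the loop as well.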
regards,
Jason
On Thursday 15 June 2006 02:21, James Bowlin wrote:
> On Thursday 15 June 2006 02:10, Luciano Montanaro wrote:
> > Well, a 20% improvement is not something to sneeze at. Maybe it's not
> > sufficient to visibly improve things, but it may be one of the needed
> > optimizations.
>
> There is some added expense/baggage using the rotation matrix. First,
> each SkyPoint needs storage space for 3 more doubles, the x-y-z. Also,
> in the simple way I implemented it, it does not work for AltAz coordinates.
> It is possible that this could be fixed by recalculating the x-y-z
> coordinates for all SkyPoints. But these would then need to get updated
> (I think). Also, I don't know how to deal with refraction correction.
>
> If it didn't come with this extra baggage, then I would have suggested that
> we adopt it. If it had given us a factor of 2 or 3 speed improvement then
> I would have argued that the extra expense would be worth it. But as it
> stands now, I think our efforts would be better spent elsewhere. But I
> don't have strong feelings about it. If other people want this code in
> the 4.0 trunk, I would be happy to help port it.
>
> > > So it appears that modern processors can do simple sin() and cos()
> > > calculations almost as fast as multiplication. This implies that there
> > > is probably not much speed boost from caching sin() and cos() values,
> > > at least for modern processors.
> >
> > Are you sure the speedup is not masked by the overhead of function calls?
> > Maybe the toScreen function should be inlined.
>
> This is possible. It could also be masked by the time it takes to do the
> actual drawing on the screen. The times roughly doubled when I switched
> from outline mode to filled mode for drawing the Milky Way. I'll try
> inlining and see if that makes any difference.
>
> > Also in the toScreen function, you evaluate
> >
> > double Width = ( width() * screen_scale );
> > double Height = ( height() * screen_scale );
> >
> > each time, while it is a value which will not change between invocations.
> > (You could actually cache Width * 0.5 and Height * 0.5, from what I see.)
>
> Good idea. But I don't expect this change to give us a startling speed
> boost.
>
> > Obviously, these optimizations will not do much good if the problem is
> > the function call overhead anyway.
> >
> > > For completeness, I've included the new code below. Only the first two
> > > routines are new. 98% of slowToScreenXYZ() came from the original
> > > toXY(). Maybe there is a more efficient way of doing the matrix
> > > calculation, making use of the SSE registers or something. I tried
> > > removing the sqrt() from both branches but this didn't significantly
> > > alter the timing results.
> >
> > Maybe. But vector instructions work best if you feed them with, well,
> > vectors, so it would probably be better to have a function to iterate
> > over a vector of coordinates.
>
> I can give this a try by putting the celestial x-y-z values in an array and
> also put the nine rotation coefficients in an array (or an array of
> arrays). But I doubt that the compiler is being given the flags that tell
> it to use the SSE registers. Usually this sort of thing is hand-coded but
> maybe compilers have advanced so far that they are now good at this sort of
> thing. But you did remind me, I remember working on some code like this
> about 15 years ago and there was a mild speed boost when the data was put
> into arrays.
>
> Thanks for the comments Luciano. You have made me realize that I should
> probably run a few more tests: inlining toScreen(); caching the Width and
> Height; using arrays for the matrix evaluation; and also running a test
> that just calls toScreen() and toXY() repeatedly to eliminate masking due
> to the drawing functions.
>
> I had been hoping that this might be a quick "end around" the speed
> problems in the 4.0 code that would make it more responsive and thus a more
> user friendly development platform. Unless I've made a mistake (or a
> series of mistakes) it doesn't look like that is going to happen.
--
KStars: http://edu.kde.org/kstars
Community Forums: http://kstars.30doradus.org