[Kstars-devel] Testing the use of quaternions in KStars
Torsten Rahn
torsten.rahn at credativ.de
Mon Oct 2 08:10:20 CEST 2006
On Monday, 2. October 2006 06:31, James Bowlin wrote:
> On Sunday 01 October 2006 21:02, Jason Harris wrote:
Yes, I have fallen for a similar trap during development ;-)
> > toScreen(): 22 ms
> > toScreenQuaternion(): 31 ms
> > So either our spherical-trig method is almost 30% faster than the
> > quaternion method, or I've done something wrong.
I don't know about the "spherical-trig" method, so I have no idea how the
quaternion method compares to it. However, using a quaternion to calculate
the 3x3 rotation matrix has several other advantages, and that's what I do
in Marble (look into the file vectormap.cpp, e.g. line 55).
What would actually be faster with quaternions is dealing with subsequent
rotations, as multiplying two rotation quaternions involves fewer operations
than multiplying two rotation matrices.
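To illustrate the operation counts for combining two rotations, here is a
minimal sketch. The Quat and Mat3 types and the function names are
hypothetical and written for this example only; they are not the classes
used in Marble or KStars.

```cpp
#include <cmath>

// Hypothetical minimal types for illustration only.
struct Quat { double w, x, y, z; };
struct Mat3 { double m[3][3]; };

// Unit quaternion for a rotation by `angle` radians about the unit axis
// (ax, ay, az).
Quat fromAxisAngle(double ax, double ay, double az, double angle) {
    const double s = std::sin(angle / 2.0);
    return { std::cos(angle / 2.0), ax * s, ay * s, az * s };
}

// Hamilton product a*b: 16 multiplications and 12 additions/subtractions.
// The result applies rotation b first, then rotation a.
Quat multiply(const Quat& a, const Quat& b) {
    return {
        a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z,
        a.w * b.x + a.x * b.w + a.y * b.z - a.z * b.y,
        a.w * b.y - a.x * b.z + a.y * b.w + a.z * b.x,
        a.w * b.z + a.x * b.y - a.y * b.x + a.z * b.w
    };
}

// Plain 3x3 matrix product for comparison:
// 27 multiplications and 18 additions.
Mat3 multiply(const Mat3& a, const Mat3& b) {
    Mat3 out = {};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            for (int k = 0; k < 3; ++k)
                out.m[i][j] += a.m[i][k] * b.m[k][j];
    return out;
}
```

Composing two quarter turns about the z axis with multiply(q, q) gives the
quaternion of a half turn about z, at roughly half the arithmetic of the
matrix product.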
However, that is not the problem we deal with here: in our case we rotate
vectors using a rotation matrix / quaternion. In that case matrices are
faster as they involve fewer operations, and unless you use batch operations
they are even easier to convert to SSE code, as far as I have learned.
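As a rough sketch of that approach (keep the orientation as a quaternion,
convert it to a 3x3 matrix once, then rotate every vector with the matrix),
something like the following could be used. The type and function names are
hypothetical, and the conversion assumes a unit quaternion.

```cpp
#include <cmath>

// Hypothetical minimal types for illustration only.
struct Quat { double w, x, y, z; };
struct Vec3 { double x, y, z; };
struct Mat3 { double m[3][3]; };

// Build the rotation matrix of a unit quaternion. Done once per frame,
// so its cost does not depend on the number of vectors.
Mat3 toMatrix(const Quat& q) {
    Mat3 r;
    r.m[0][0] = 1.0 - 2.0 * (q.y * q.y + q.z * q.z);
    r.m[0][1] = 2.0 * (q.x * q.y - q.w * q.z);
    r.m[0][2] = 2.0 * (q.x * q.z + q.w * q.y);
    r.m[1][0] = 2.0 * (q.x * q.y + q.w * q.z);
    r.m[1][1] = 1.0 - 2.0 * (q.x * q.x + q.z * q.z);
    r.m[1][2] = 2.0 * (q.y * q.z - q.w * q.x);
    r.m[2][0] = 2.0 * (q.x * q.z - q.w * q.y);
    r.m[2][1] = 2.0 * (q.y * q.z + q.w * q.x);
    r.m[2][2] = 1.0 - 2.0 * (q.x * q.x + q.y * q.y);
    return r;
}

// Rotate one vector: 9 multiplications and 6 additions.
Vec3 rotate(const Mat3& r, const Vec3& v) {
    return {
        r.m[0][0] * v.x + r.m[0][1] * v.y + r.m[0][2] * v.z,
        r.m[1][0] * v.x + r.m[1][1] * v.y + r.m[1][2] * v.z,
        r.m[2][0] * v.x + r.m[2][1] * v.y + r.m[2][2] * v.z
    };
}
```

With the quaternion of a quarter turn about z, rotate(toMatrix(q), v) maps
the x unit vector onto the y unit vector.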
> You may recall that I implemented your suggestion of using a rotation
> matrix instead of the trig functions in the 3.5 branch. I got a slight
> speed improvement but much less than a factor of two. I don't think
> that the quaternion rotation would be any faster than straight 3-d matrix
> multiplication (please correct me if you have evidence to the contrary).
Right. However, using quaternions for the rotation representation has some
further advantages; for example, it's easier to calculate the track that is
needed if you want to display an animation which shows how the focus moves on
a straight line from one star to another.
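One common way to compute such a track is spherical linear interpolation
(slerp) between the start and end orientation. The mail above does not name
a technique, so slerp is my assumption here, and the Quat type and slerp()
function are hypothetical. Both inputs are assumed to be unit quaternions.

```cpp
#include <cmath>

// Hypothetical minimal quaternion type for illustration only.
struct Quat { double w, x, y, z; };

// Interpolate from orientation a (t = 0) to orientation b (t = 1) at
// constant angular speed along the shortest arc.
Quat slerp(const Quat& a, Quat b, double t) {
    double d = a.w * b.w + a.x * b.x + a.y * b.y + a.z * b.z;
    // q and -q describe the same rotation; flip b to take the short way.
    if (d < 0.0) {
        b = { -b.w, -b.x, -b.y, -b.z };
        d = -d;
    }
    double sa, sb;
    if (d > 0.9995) {
        // Nearly identical orientations: fall back to linear weights to
        // avoid dividing by a sine that is close to zero.
        sa = 1.0 - t;
        sb = t;
    } else {
        const double theta = std::acos(d);
        sa = std::sin((1.0 - t) * theta) / std::sin(theta);
        sb = std::sin(t * theta) / std::sin(theta);
    }
    Quat r = { sa * a.w + sb * b.w, sa * a.x + sb * b.x,
               sa * a.y + sb * b.y, sa * a.z + sb * b.z };
    // Renormalize to guard against rounding and the linear fallback.
    const double n = std::sqrt(r.w * r.w + r.x * r.x + r.y * r.y + r.z * r.z);
    return { r.w / n, r.x / n, r.y / n, r.z / n };
}
```

Calling slerp() with t stepping from 0 to 1 yields one orientation per
animation frame; halfway between "no rotation" and a quarter turn about z it
returns a rotation of 45 degrees about z.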
> I'd be interested in finding out exactly how many multiplications are
> used in the quaternion rotation.
In this particular case it's more operations. In the beginning I also
expected that it would be fewer operations, but that's not the case for the
"virtual globe problem". That expectation was one of the reasons why I
started to mess with quaternions initially. However, they do have advantages
and that's why I kept them in my code.
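To make the count concrete, here is a sketch of rotating a single vector
directly with a unit quaternion, using the cross-product form of q v q^-1.
The names are hypothetical and this is not the code from quaternion.cpp. As
written it needs 18 multiplications and 12 additions/subtractions, compared
to 9 multiplications and 6 additions for a 3x3 matrix times a vector.

```cpp
#include <cmath>

// Hypothetical minimal types for illustration only.
struct Quat { double w, x, y, z; };
struct Vec3 { double x, y, z; };

// Rotate v by the unit quaternion q = (w, u):
//   t  = 2 * (u x v)
//   v' = v + w * t + u x t
Vec3 rotate(const Quat& q, const Vec3& v) {
    // t = 2 * (u x v): 9 multiplications, 3 subtractions
    const double tx = 2.0 * (q.y * v.z - q.z * v.y);
    const double ty = 2.0 * (q.z * v.x - q.x * v.z);
    const double tz = 2.0 * (q.x * v.y - q.y * v.x);
    // v + w * t + u x t: 9 multiplications, 9 additions/subtractions
    return {
        v.x + q.w * tx + (q.y * tz - q.z * ty),
        v.y + q.w * ty + (q.z * tx - q.x * tz),
        v.z + q.w * tz + (q.x * ty - q.y * tx)
    };
}
```

Evaluating q v q^-1 naively as two full quaternion products is more
expensive still, at 32 multiplications.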
> Someone on the list made the suggestion that we write the 3-d
> rotation code in a way that would allow the compiler to use the SSE2
> instructions. I've got no idea whether gcc is smart enough to do this
> for us or if we would have to hand-code the SSE2 code in assembly (as
> I've seen done elsewhere).
Judging from my experiments I'd say that gcc does a pretty good job at
optimization if you compile with -msse. I originally used to rotate objects
using quaternions as well and created some (beware: I'm a beginner at that)
inline assembly code (as you can see in the commented-out sections of
quaternion.cpp). However, that didn't result in any noticeable speed
improvement (hey, I was happy that it wasn't actually noticeably slower,
judging from other people's experience with creating such inline assembly
code). Usually it's recommended to use the SSE intrinsics ("xmmintrin.h")
anyway, to create SSE code that is cross-platform.
The lack of speed improvement for quaternion multiplication with "vectors" is
probably due to the fact that rotating "vectors" via quaternions can't be
done very efficiently for one single vector, because the terms of each
vector component carry different signs.
If you batch-multiply many vectors with the very same rotation matrix, that
could result in quite some performance increase. However, I don't know
whether the intrinsics that I mentioned earlier already take that into
account or whether they can be used to do that.
> I'm pretty sure we would get a big speedup
> if we were able to use SSE2 efficiently. If nothing else, it would
> give us more registers to work with.
>
> The best introduction I know of to SIMD (single instruction, multiple
> data) is in The Art of Assembly Language Programming. The good news
> is that it is available free on-line here:
>
> http://webster.cs.ucr.edu/AoA/Linux/HTML/TheMMXInstructionSet.html
For the "rotate a vector with a matrix" problem this URL might be helpful:
http://www.cortstratton.org/articles/OptimizingForSSE.php
However, it deals with inline assembly (not cross-platform) and does so in
Intel notation.
Best regards,
Torsten