[Kstars-devel] Testing the use of quaternions in KStars
Torsten Rahn
torsten.rahn at credativ.de
Mon Oct 2 19:16:46 CEST 2006
Sorry for my somewhat weird reply this morning.
I only skimmed James' mail shortly before I had to leave my flat in a
hurry. So if you read my mail and didn't really understand everything
I said: no, it was not your fault. It was me assuming things from James'
mail about Jason's implementation that weren't in there.
So here are some more thoughts. This time hopefully much clearer :-)
Before I cover the quaternion issue I'd like to say that the performance of
Marble is not only due to using the combination of matrix multiplication and
Quaternions:
For the texturing it's mostly because of the organisation of the bitmaps in
tiles and due to smart linear interpolation.
For the vectors it's mostly due to:
- calculating the coordinates of each polygon's bounding rectangle beforehand
and not processing anything whose bounding rectangle doesn't "touch" the
screen coordinates. (I don't know whether KStars already organizes star
coordinates in memory in rectangular "tiles". If it doesn't, that could be an
approach to skip data that doesn't get displayed anyway.)
- While zooming into the virtual globe, the visible range shrinks from the
interval 0 <= z <= 1 to some smaller interval 1 - epsilon <= z <= 1. It should
be easy to calculate the smallest epsilon that still covers the whole screen.
So by calculating the z-value of the objects first you can decide whether it's
worth processing the data further or skipping it.
- clipping of polygons: I have my own clippainter class: It makes sure that
nodes outside the screen don't get painted. Unfortunately the class is still
a bit buggy for higher zoom levels.
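The first two culling tests above can be sketched roughly as follows. This is a minimal sketch, not Marble's actual code: it assumes an orthographic projection of a unit sphere whose radius on screen is radiusPx pixels, with the viewer looking at the z = 1 pole, and all names (Rect, touches, minVisibleZ, worthProjecting) are hypothetical.

```cpp
#include <cmath>

// Hypothetical axis-aligned rectangle in screen coordinates.
struct Rect {
    double left, top, right, bottom;
};

// True if the two rectangles overlap or touch. A polygon whose precomputed
// bounding rectangle fails this test against the screen can be skipped.
inline bool touches(const Rect &a, const Rect &b) {
    return a.left <= b.right && b.left <= a.right &&
           a.top <= b.bottom && b.top <= a.bottom;
}

// Smallest z on the unit sphere that can still land on a width x height
// screen when the globe is drawn with a radius of radiusPx pixels.
// The result equals 1 - epsilon; it is 0 when the whole front
// hemisphere may be visible.
inline double minVisibleZ(double width, double height, double radiusPx) {
    const double halfDiag = 0.5 * std::sqrt(width * width + height * height);
    const double r = halfDiag / radiusPx;  // screen corner distance in sphere units
    if (r >= 1.0)
        return 0.0;
    return std::sqrt(1.0 - r * r);
}

// Decide from the z-value alone whether a point needs further processing.
inline bool worthProjecting(double z, double zMin) {
    return z >= zMin;
}
```

The threshold only changes when the zoom level or the window size changes, so it could be computed once per frame and then compared against every object's z-value.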
For bumpmapping (yes, bumpmapping for the topographic map happens on the fly
for each frame as well ... and yes, I know it's not of interest for kstars):
- I do the most simple bumpmapping that I was able to come up with by
comparing the values of pixels that are horizontally 3 pixels away from each
other. I compensate for perspective distortion of the bumpmapping by doing
some really problem specific cheap approximations. I do that in the class
that colorizes the grayscale textured sphere on the fly and combines the
information of the grayscale map with the vector data.
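As a rough illustration of the pixel comparison described above, here is a minimal sketch for a single row of a grayscale height map. It is not the Marble implementation: the function name bumpRow, the clamping limit, and the choice to leave the last few pixels unshaded are assumptions, and the perspective compensation is left out entirely.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Compares each pixel with the pixel `offset` columns to its right and
// returns a brightness adjustment per pixel, clamped to [-limit, limit].
// Pixels closer than `offset` to the right edge keep an adjustment of 0.
std::vector<int> bumpRow(const std::vector<std::uint8_t> &row,
                         std::size_t offset = 3, int limit = 64) {
    std::vector<int> shade(row.size(), 0);
    for (std::size_t x = 0; x + offset < row.size(); ++x) {
        const int delta = static_cast<int>(row[x]) - static_cast<int>(row[x + offset]);
        shade[x] = std::max(-limit, std::min(limit, delta));
    }
    return shade;
}
```

The adjustment would then be added to the colour chosen for the pixel, so slopes facing one way come out brighter and slopes facing the other way come out darker.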
> toScreen(): 22 ms
> toScreenQuaternion(): 9 ms
Now that looks much more in accordance with my tests that compared my earlier
cosine / sine implementation with the quaternion/matrix one.
I'd even bet that on devices that use Qtopia the difference would be even
larger, as they usually don't play well with sines and cosines (or with
floating point calculations, as far as I've heard). Since I'd like Marble to
work on those Greenphones and PDAs as well, that's a good reason for me to
choose matrix multiplications over "trigonometric" calculations.
Jason, did you do those measurements with -msse ?
CFLAGS = -pipe -O2 -msse -O2 -Wall -W -D_REENTRANT $(DEFINES)
CXXFLAGS = -pipe -O2 -msse -O2 -Wall -W -D_REENTRANT $(DEFINES)
That might boost the advantage even more, as I guess this already
optimizes the code so that the matrix multiplications get executed
for all components at the same time.
I'm not sure whether gcc does it as well as possible. Someone would have to
look at
http://www.cortstratton.org/articles/OptimizingForSSE.php
I'd especially be interested in whether the "Batch Processing" suggested there
could be used to significant advantage together with the cross-platform
xmmintrin.h instead of real inline assembly code (but then again, maybe
gcc is really smart and already does that for us).
> This evening, I added experimental support for quaternions in KStars. It
> was surprisingly easy to do. [...] SkyPoint [...] (marble has a GeoPoint
> [...] SkyMap (following marble's KAtlasGlobe),
Now those are nice similarities :-)
> QPointF SkyMap::toScreenQuaternion( SkyPoint *o, double scale ) {
> QPointF p;
> Quaternion oq = o->quat();
> oq.rotateAroundAxis( m_rotAxis );
>
> p.setX( 0.5*width() - scale*oq.v[Q_X] );
> p.setY( 0.5*height() - scale*oq.v[Q_Y] );
>
> return p;
> }
Yes, that looks familiar :-))) However, as mentioned before, it _might_ make
more sense to check whether v[Q_Z] is within an interval that would get
displayed on the screen before you do possibly useless calculations of p.x
and p.y. Up to you to find out ...
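One way to read that suggestion, sketched with a plain 3x3 rotation matrix: compute only the rotated z component first and bail out before x and y are touched. This is a sketch under assumptions, not KStars or Marble code. Vec3, Point2, Mat3, toScreenCulled and zMin are hypothetical names, and it assumes that larger z means closer to the viewer.

```cpp
// Hypothetical minimal types for the sketch.
struct Vec3 { double x, y, z; };
struct Point2 { double x, y; };
struct Mat3 { double m[3][3]; };  // row-major rotation matrix

// Rotates v by rot and projects it to the screen. The z row is evaluated
// first; if the result is below zMin the function returns false and never
// computes the x and y rows.
inline bool toScreenCulled(const Mat3 &rot, const Vec3 &v, double scale,
                           double width, double height, double zMin,
                           Point2 &out) {
    const double z = rot.m[2][0] * v.x + rot.m[2][1] * v.y + rot.m[2][2] * v.z;
    if (z < zMin)
        return false;  // not on screen: skip two thirds of the work

    const double x = rot.m[0][0] * v.x + rot.m[0][1] * v.y + rot.m[0][2] * v.z;
    const double y = rot.m[1][0] * v.x + rot.m[1][1] * v.y + rot.m[1][2] * v.z;

    // Same screen mapping as in the quoted toScreenQuaternion().
    out.x = 0.5 * width - scale * x;
    out.y = 0.5 * height - scale * y;
    return true;
}
```

Whether the extra branch pays off depends on how many points fall outside the visible interval, so it would need measuring.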
I agree with your replies to Luciano. And concerning Luciano's suggestions
about parallelization, you might want to look at my quaternion class. At some
point during development I used the quaternion representation of m_rotAxis to
rotate the vector. So at that point I used:
void Quaternion::rotateAroundAxis(const Quaternion &q) {
instead of:
void Quaternion::rotateAroundAxis(const matrix &m)
The latter needs fewer operations and is easier to parallelize; the former is
harder to parallelize because of the differing signs of its components.
As you can see, there is another method which tried to accomplish exactly what
Luciano suggested:
void QuaternionSSE::rotateAroundAxis(const Quaternion &q)
(and it even worked, except that it was not faster -- maybe due to the
inherent sign issue already mentioned)
Now somebody would have to create a similar method
void QuaternionSSE::rotateAroundAxis(const matrix &m)
based on the "OptimizingForSSE" document above and maybe using XMM intrinsics
instead of inline assembly to keep it cross-platform. Everything needed
should be well prepared already ;-)
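For whoever picks this up, a batch version with XMM intrinsics might look roughly like the sketch below. It is only an illustration of the idea and makes several assumptions: the points are stored as separate x, y and z float arrays, the matrix is a row-major array of 9 floats, and rotateBatch is a hypothetical free function rather than a QuaternionSSE method. On targets without SSE it falls back to the scalar loop.

```cpp
#include <cstddef>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define SKETCH_HAVE_SSE 1
#endif

// Rotates n points by the row-major 3x3 matrix m, four points per iteration
// where SSE is available. Input and output arrays must not overlap.
void rotateBatch(const float m[9],
                 const float *x, const float *y, const float *z,
                 float *ox, float *oy, float *oz, std::size_t n) {
    std::size_t i = 0;
#ifdef SKETCH_HAVE_SSE
    // Broadcast each matrix element into all four lanes once.
    __m128 c[9];
    for (int k = 0; k < 9; ++k)
        c[k] = _mm_set1_ps(m[k]);

    for (; i + 4 <= n; i += 4) {
        const __m128 vx = _mm_loadu_ps(x + i);
        const __m128 vy = _mm_loadu_ps(y + i);
        const __m128 vz = _mm_loadu_ps(z + i);
        // Each output row is the same multiply-add pattern, with no sign
        // shuffling between lanes.
        _mm_storeu_ps(ox + i, _mm_add_ps(_mm_add_ps(_mm_mul_ps(c[0], vx), _mm_mul_ps(c[1], vy)), _mm_mul_ps(c[2], vz)));
        _mm_storeu_ps(oy + i, _mm_add_ps(_mm_add_ps(_mm_mul_ps(c[3], vx), _mm_mul_ps(c[4], vy)), _mm_mul_ps(c[5], vz)));
        _mm_storeu_ps(oz + i, _mm_add_ps(_mm_add_ps(_mm_mul_ps(c[6], vx), _mm_mul_ps(c[7], vy)), _mm_mul_ps(c[8], vz)));
    }
#endif
    // Scalar loop for the remaining points (or for all of them without SSE).
    for (; i < n; ++i) {
        ox[i] = m[0] * x[i] + m[1] * y[i] + m[2] * z[i];
        oy[i] = m[3] * x[i] + m[4] * y[i] + m[5] * z[i];
        oz[i] = m[6] * x[i] + m[7] * y[i] + m[8] * z[i];
    }
}
```

Whether this beats what gcc generates from the plain loop with -msse would have to be measured. The unaligned loads keep the sketch simple; aligned storage would allow the aligned load and store intrinsics instead.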
BTW: You'll find me (tackat) on IRC #kde-edu quite often ...
Torsten