[Kstars-devel] profiling KStars
Jason Harris
kstars at 30doradus.org
Tue May 23 20:27:19 CEST 2006
Hello,
I've begun to work on trying to speed KStars up a bit. I've been trying
to use callgrind to profile KStars, but I'm having trouble getting it to
work. When I run kstars through callgrind, it always seems to crash
when it finishes the startup sequence and draws the map (which is
exactly the part I want to profile!). If anyone is interested in trying
to help profile KStars, please reply.
Even without the profiling results, I have some ideas on how we can save
CPU cycles. One expensive procedure is that every time the skymap is
recomputed, we need to determine whether each object is on-screen, based
on its position in the sky relative to the current Focus point.
The function we have for this (SkyMap::checkVisibility()) is fairly
well-optimized, but it still needs to be run on every object in the
database, so any savings in this function should have a big payoff.
One idea is to use what's called a "hierarchical triangular mesh" (HTM).
The concept is described in detail here:
http://www.sdss.jhu.edu/htm/
but here is a basic description:
We are dividing the celestial sphere into a large number of nested
triangular sections. The toplevel division uses the Zenith, Nadir
(anti-zenith), and the 4 cardinal points on the equator as vertices to
divide the sky into 8 equilateral spherical triangles. Then, each of
these can be divided into 4 sub-triangles by connecting the midpoints of
each triangle's sides. Then each of *those* can be divided into 4
pieces the same way. You can continue this to as many levels as you
like, dividing the sphere into smaller and smaller pieces. At a given
level, any triangle can be identified by a simple integer, indicating
its location at each level in the hierarchy. For example, you could
have a triangle "+203102311201" (the leading "+/-" indicates northern
vs. southern hemisphere, then you have digits 0-3 identifying which
sub-triangle at the current level). I think I would probably use a
binary two-bit representation for each level, instead of the digits 0123
(i.e., 0=="00", 1=="01", 2=="10", 3=="11"), so the above triangle ID
would be "+100011010010110101100001", or +9,252,193 in decimal.
Ok, so how does this help us? I think it can provide a way to tell very
quickly whether a point is onscreen or not. For example, consider a
case where the four corners of the current visible section are in the
following triangles:
+100011010010110101101001
+100011010010110101111000
+100011010010110101111011
+100011010010110101110001
We must be zoomed in pretty far, because these ID strings only differ
near the end. In other words, they all share a common "parent"
triangle, namely:
+100011010010110101
Any point on the sky which is not also in that parent triangle is
instantly rejected as being offscreen. So our checkVisibility()
function can be reduced to a few bitwise comparisons of integers, which
should be very fast indeed. However, we'd still need to run
checkVisibility() on the entire object catalog every time. Maybe we
could also store objects in a hierarchical list structure that mimics
the HTM, which would allow us to skip entire sections of objects based
on their position within the tree.
I don't have all the details of exactly how this should be implemented,
but I just wanted to share things that I'm thinking about here.
Another idea I'm playing with is one that's come up before on this list:
modifying SkyPoint to store the X,Y,Z coordinates of each point,
instead of (or in addition to) the RA and Dec coordinates. This could
improve the speed of calculating each object's position on the screen.
Right now, we do spherical trigonometry on each SkyPoint and the SkyMap
FocusPoint to determine the current screen cordinates of a given object,
and these calculations are pretty expensive.
If we stored the coordinates of each object as X,Y,Z, then we only need
to do a simple multiplication to determine each object's screen
coordinates (Xs, Ys):
Xs = a1*X + b1*Y + c1*Z
Ys = a2*X + b2*Y + c2*Z
The important thing to realize here is that the numbers
(a1,b1,c1,a2,b2,c2) would only need to be determined *once* for a given
calculation of the sky. Then, we can loop through all objects and
repeat the same multiplication on their (X,Y,Z) coordinates to determine
their screen (Xs,Ys) coordinates! The a,b,c values basically represent
the coefficients of a rotation matrix, plus some other stuff to account
for the curent zoom level and the size of the SkyMap widget. I've done
some preliminary testing, and it looks like this method may be about 3
to 4 times faster than our current SkyMap::getXY() function.
Well, that's probably a lot to digest. Let me know what you think, and
if you're willing to help implement this stuff.
regards,
Jason
More information about the Kstars-devel
mailing list