[Kstars-devel] Plans for threading in KStars

Sun Nov 13 12:41:26 UTC 2011

[PS: The list rejected my mail since it was too large. I decided to
put the files up for download instead and resend to the list.]

Hi Alexey

This message is long, so here's a tl;dr:

<tl;dr>
I'm specifically trying to optimize DeepStarComponent::draw(), which
draws USNO NOMAD stars. I've given a URL to profiles under various
scenarios. The scenarios I think users will encounter most are the
simulation clock running and small pannings of the map. Profiles 1 and
2 correspond to these -- simulation clock running without panning, and
panning without the simulation clock running.

Quaternions sound like a good idea. What about Eigen? But threading
might still be beneficial for specific areas.

For those who don't know about KCacheGrind, you can use it to
visualize the profiles linked below:

Link to the profiles:
http://www.ph.utexas.edu/~asimha/KStars/
</tl;dr>

> I suppose that representation of point on sphere as two angles
> doesn't admit efficient rotation. But this is the case where
> quaternions should shine. In this case rotation is just
> multiplication by another quaternion.

So you mean quaternions work in 3D just like complex numbers work in
2D? That sounds like a good idea.
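
To make the analogy concrete, here is a minimal sketch of rotating a
point on the unit sphere with a unit quaternion, v' = q v q*, where q*
is the conjugate of q. The types Vec3 and Quat and all function names
are hypothetical, made up for illustration; they are not KStars
classes, and this is not a proposal for how SkyPoint should look.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical minimal types for illustration; not KStars classes.
struct Vec3 { double x, y, z; };
struct Quat { double w, x, y, z; };

// Hamilton product a * b.
Quat mul(const Quat &a, const Quat &b) {
    return { a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z,
             a.w * b.x + a.x * b.w + a.y * b.z - a.z * b.y,
             a.w * b.y - a.x * b.z + a.y * b.w + a.z * b.x,
             a.w * b.z + a.x * b.y - a.y * b.x + a.z * b.w };
}

// Conjugate; for a unit quaternion this is also its inverse.
Quat conj(const Quat &q) { return { q.w, -q.x, -q.y, -q.z }; }

// Unit quaternion for a rotation by `angle` radians about a unit axis.
Quat fromAxisAngle(const Vec3 &axis, double angle) {
    const double s = std::sin(angle / 2.0);
    return { std::cos(angle / 2.0), axis.x * s, axis.y * s, axis.z * s };
}

// Two angles (longitude-like, latitude-like, in radians) to a unit vector.
Vec3 fromAngles(double lon, double lat) {
    return { std::cos(lat) * std::cos(lon),
             std::cos(lat) * std::sin(lon),
             std::sin(lat) };
}

// Rotate v by the unit quaternion q: v' = q v q*.
Vec3 rotate(const Quat &q, const Vec3 &v) {
    const double n2 = q.w * q.w + q.x * q.x + q.y * q.y + q.z * q.z;
    assert(std::fabs(n2 - 1.0) < 1e-9); // q must have unit magnitude
    const Quat p = { 0.0, v.x, v.y, v.z }; // embed v as a pure quaternion
    const Quat r = mul(mul(q, p), conj(q));
    return { r.x, r.y, r.z };
}
```

Composing two rotations is then just mul(q2, q1), so a whole chain of
coordinate rotations could in principle be collapsed into one
quaternion before being applied to every star.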

Also, maybe I wasn't being clear earlier -- I'm trying to fix the
"Sagittarius freeze" problem. Zoom in sufficiently into the
Sagittarius milky way region and it just freezes because it needs a
lot of time to draw those dense stellar regions (including those
annoying patches of catalog artifacts). So in particular, my focus is
on sky map rendering at high zoom with the USNO NOMAD catalog
installed, and making the panning and simulation smoother under these
circumstances.

> There is a lot food for thoughts. What is best representation of
> point on sphere? Are quaternions right solution? If they are how
> should we deal with addition degeneracy? (We have 2 degrees of
> freedom and unit quaternions which could represent rotations have
> 3). Do we have class set up right? (I think we are not)

Quaternions with "unit magnitude" should work, right? Just like
complex numbers with unit magnitude work.

> Could you put your profiling results somewhere? I'd like to look at them.

So I profiled in 3 different ways:

1. Start, focus on a high star density region (I chose the region
   around M11 with a fair amount of zoom, so that Scutum covered 1/5
   of my screen real estate, which is still not really very high
   density, but sufficient). Start profiling, and then start the
   simulation clock.

2. Start, focus on a high star density region (I chose the region
   around M11 and the "lid" of the Sagittarius "teapot") and start
   panning the map around.

3. Start, turn off simclock, enable profiling, slew to various
   high-density regions.

The most common things, IMO, that a user would want to do at high zoom
are to pan the map around locally, and/or run the simulation clock.

There are (broadly) three areas where time in
DeepStarComponent::draw() is spent, that scale with star density:

1. StarBlockList::fillToMag() -- Hitting the disk and fetching the
   star data into memory.

2. StarObject::JITUpdate() -- Update star positions (do
   EquatorialToHorizontal etc).

3. SkyMapDraw::drawPointSource() -- Draw the point source.

This is _without_ OpenGL. I'd expect drawing to be much faster with
OpenGL [To be profiled].

So, fillToMag() is predominantly called when the map is slewed to very
different areas, and profile #3 tells us what costs the most when the
user tries to move to a new region. This is less common from a user's
POV and so this cost doesn't compound much, and is less important,
IMO.

While the SimClock is running and we are on a high-star-density field,
both drawing and recalculation of coordinates are costly. Profile #1
shows that both cost roughly the same. So this is something I expect
would improve if we used two threads. This is one of the things that
the user is most likely to do.

If we stop the SimClock (I do that, because I'm usually annoyed by how
slow things are going with the simclock) and wish to pan "locally",
the latency primarily comes from drawing. This is also something that
the user is likely to do. Looking further into the drawing code, one
finds that the projection and the drawing take roughly equal shares of
the cake. These can be done asynchronously, with one thread
"following" the other. So I would expect that this could be optimized
by threading as well.
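
A minimal sketch of what I mean by one thread "following" the other,
using C++11 std::thread and a mutex-protected queue. Everything here
is an assumption for illustration: SkyPoint2, ScreenPoint, project()
and projectAndDraw() are hypothetical stand-ins, the projection is a
placeholder, and "drawing" just appends to a vector. The real code
would presumably use Qt's threading classes and the actual projector
and painter.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical stand-in types; not the KStars classes.
struct SkyPoint2 { double ra, dec; };
struct ScreenPoint { double x, y; };

// Placeholder for the real sky-to-screen projection.
ScreenPoint project(const SkyPoint2 &p) { return { p.ra * 2.0, p.dec * 2.0 }; }

// Projects on a worker thread while the calling thread "draws" each
// point as soon as it becomes available.
std::vector<ScreenPoint> projectAndDraw(const std::vector<SkyPoint2> &stars) {
    std::queue<ScreenPoint> pending; // projected but not yet drawn
    std::mutex m;
    std::condition_variable cv;
    bool done = false;               // set once projection has finished
    std::vector<ScreenPoint> drawn;

    std::thread projector([&] {
        for (const SkyPoint2 &s : stars) {
            const ScreenPoint sp = project(s);
            {
                std::lock_guard<std::mutex> lock(m);
                pending.push(sp);
            }
            cv.notify_one();
        }
        {
            std::lock_guard<std::mutex> lock(m);
            done = true;
        }
        cv.notify_one();
    });

    // Draw stage, running in the calling thread.
    for (;;) {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [&] { return !pending.empty() || done; });
        if (pending.empty())
            break;                   // projection finished and queue drained
        const ScreenPoint sp = pending.front();
        pending.pop();
        lock.unlock();
        drawn.push_back(sp);         // stand-in for drawing the point source
    }

    projector.join();
    assert(drawn.size() == stars.size());
    return drawn;
}
```

Locking once per star is probably too fine-grained for dense fields;
handing over batches of projected points would likely amortize the
synchronization cost better. That would need profiling too.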

Of course, we should probably explore quaternions first, if that's a
good idea. What about Eigen? I don't know much about this, so I'm
copying Harry on it.

Regards
Akarsh