[Kstars-devel] Ideas for Adding a Spatial Index to KStars

Thu Jun 22 18:05:16 CEST 2006

On Thursday 22 June 2006 01:41, Jason Harris wrote:
> I understand your point...but there are reasons not to rename the 
> classes, other than inertia.  [...] "code is written once, but 
read many times" [...]

I will defer to your judgment on this issue.  I merely note that my
problems with the tab completion happen mainly when I am grepping
through the files.

> > I think it would be appropriate to change the class names one at a times
> > as we move components over to the indexing model
> > 
> Actually, I'd rather decide on a renaming scheme and do it all at once, 
> if we're going to do it.  If people really want the name change, I'd 
> suggest maybe changing "Composite" to "Group" and leaving "Component" 
> alone.  I'd like to hear what Thomas thinks though.

> [...]
> It doesn't make sense to move code to the Composites.  They are just 
> containers that pass messages to their child Components.

I think this is the only place where I am not on the same bus that you
and Thomas are on.  Perhaps my viewpoint is skewed because I've mainly
worked on the constellation lines and the Milky Way.  In both of these,
the child components hold a small subset of all the points.  There is
basically a new child component created for every line in the respective
data file that begins with "M": 179 in clines.dat, 135 in milkyway.dat.

To make best use of the spatial index and to improve the clipping of the
Milky Way, these numbers should probably go up.  Since the draw() method
is cascaded down from the composite to the component, there is an extra
(and unneeded) speed cost for each extra piece.  Because the draw() is
in the component and not the composite (where it should be) every extra
"M" in the data files results and another unneeded call to QSetPen()
or whatever.

If we don't care about the speed of the drawing then there is nothing
wrong with the current scheme.  But if we are concerned with the speed
of drawing (or why else are we going to the trouble of adding a spatial
index) then I think we should move the drawing code into the class where
it is most efficient.  In the two sets of classes I've worked on, it
is more efficient for the drawing code to be in the composite class.

The key idea is that in order to boost the speed of drawing we need to
break the strict (or naive) object-oriented model and "vectorize" the
inner drawing loop.

Perhaps the only two pieces I've worked on, constellation lines and the
Milky Way, are the exceptions to the rule and are the only cases where
this improvement needs to be made.

As I said in the post that started this thread:

  The plan for making things draw as fast as possible is very simple:
  move all unneeded code and control flow out of the innermost drawing
  loops.

In general, in order to making drawing fast and efficient in any object
oriented language, the object-oriented model must be broken.  the code
that actually does the drawing is always in the next level up (at least)
from the classes that contain the data so you can "vectorize" the inner
drawing loop.

Let's focus in on the MilkyWay and whatever solution we come up with
for that should also apply to the other PointList classes.  The number
of what are currently component objects for all of these PointList
classes should probably greatly increase in order to make best use of
the spatial index.

The current situation (which I helped create but I modeled my changes on
the exiting code in ConstellationLines*) is:

MilkyWayComponent contains the data and the draw code.  Each MWComponent
contains a "PointList pointList" and a "QHash<int> skip" which is used
for used for skipping points in outline mode so we can use the same set
of data for both outlines and filled polygons.  MilkyWayComposite only
holds the code for reading the data file.

The problem with this is that some decisions that should only made once
per draw cycle, such as outline vs. filled mode or anti-aliased vs. integer
draw are now made at the start of every draw() call which means 135 times
instead of one.  Likewise, setPen and setBrush are called 135 times and
even a new QColor is created 135 times.  

I see two solutions.  They both work for me.

Solution A) Move the draw() routine to the MilkyWayComposite class.
This is the solution that requires the least change to the current code
base which is why I suggested it.  The MilkyWayComponent will then become
a simple data container holding the PointList and QHash<int> but no code.
The composite must be a friend of the component so it can access the data
directly.  All MilkyWayComponent instances are in a single list in the
MilkWayComposite.

Solution B) Leave the draw() code in the MilkyWayComponent but create a
third MilkyWay class (maybe MilkyWayPolygon) which acts as the simple data
container (containing the PointList and QHash<int>).  MilkyWayComponent
becomes a friend of MilkyWayPolygon so it can access the data directly.
We have one instance of MilkyWayComposite.  It still contains the code to
read the data file and nothing else.  We have one instance of
MilkWayComponent which contains a list of 135 (or more) MilkyWayPolygons.

Perhaps you have an alternative solution that would then help me understand
our implementation of the c/c model better.

I think all of the current PointList classes suffer from a similar defect
and can be fixed the same way that MilkyWay gets fixed.  The problem is
that inner-loop "vectorization" is currently being achieved by making each
pointList fairly long so the extra calls to setPen(), etc. are amortized
over a large set of points and barely noticed.  We are using the data to
optimize the code.   I have no problems with that per se.  It is a simple
and easy way to achieve optimization. 

But the introduction of the spatial index adds new design constraints and
it now becomes desirable to break those large pointLists up into smaller
pieces.  I don't know what the optimum size is but my guess is that we will
want pointList objects broken up into pieces so the can each fit into just
one or two regions.  

As far as the spatial index is concerned, we win big when we make things
smaller.  The first win is that any given piece ends up in fewer regions
and thus gets into the list of visible regions fewer times.  The second
win is that with smaller pieces most of the things in the visible region
list are actually mostly visible.  With larger pieces, we draw them more
often *and* more of the drawing (or potential drawing) is for stuff that
is not really visible.  Smaller is win, win.  Larger is lose, lose.

If we tried to breakup the pointList objects with our current code, we
would suffer a significant speed penalty because the draw() code is in
the same class that contains the data.   This is a classic problem with
C++ and the classic solution is to use the "friend" keyword to move the
drawing code at least one level up from the classes that contain the data.

-- 
Peace, James