[Kstars-devel] Some HTM progress

Fri May 26 20:16:31 CEST 2006

On Friday 26 May 2006 06:20 am, Jason Harris wrote:
> > Levels Triangles Resolution   Indexes/sec  Time for HIP catalog
> > ------ --------- ------------ -----------  --------------------
> >   14     500 Meg  30 arc-secs  20K          6 seconds
> >   20    2000 Gig .01 arc-secs  17K          7 seconds
>
> Interesting numbers, but I don't understand the "Triangles" column.

That column is the number of triangles needed to cover the sphere.

> I guess there's no point going deper than 8 levels.  Probably 4 or 5
> levels would be fine.

I made a mistake in the table of levels.  The first level should be level
0 with 8 triangles, so I was off by one in my table.  I agree with you
that the deepest we would need to go would be around level 4.

Here is a corrected version:

Level   triangles stars/triangle
-----   --------- --------------
   0       8         16k
   1      32          4k
   2     128          1k
   3     512        250
   4       2k        64
   5       8k        16
   6       32k        4
   9       128k       1
   8       512k       1/4
   9      2,048k      1/16

> > I was also able to get a sample program working that returns a list of
> > triangles that cover a "rectangle" specified by 4 input points.
>
> Excellent.  Another (perhaps faster) option is to use the HTM primitive
> that they call a "Constraint"; i.e., the intersection of a plane with
> the sphere.  This gives a circular cross-section, which we will choose
> to be superscribed on the rectangle of the skymap widget.  For this, all
> we need is the position of the Focus point, and the angular size of the
> Skymap's diagonal.  What do you think?

This is a good question.  We will certainly want the ability to get all
of the stars within a circle for finding things close to the cursor.
My understanding is that the rectangle is implemented by intersecting
4 constraints.  Since it is already part of their library, I suggest
we experiment with both ways (rectangle, circle) and see which one is
fastest.  We could also estimate the speeds if you could tell us how
long it takes to clip one star.  

> Hmm, even that seems like it would be slow.  In the current code, the
> stars list is sorted by magnitude (bright to faint), and we simply stop
> looping through the stars when we reach the (zoom-dependent) faint
> limit.  We can do the same with the multiple lists, can't we?  Since
> they'll be populated by reading the hip*.dat files (which are sorted by
> magnitude, each triangle's star list will be naturally sorted by
> magnitude.  We can just loop over each until the faint limit is reached.

I've thought about it and I reached this same conclusion.  Having a preset
clipping threshold and sorted lists in each triangle is the best way to go.

> > If we populate X different levels then if we have to move an object,
> > it will need to be moved X times, once for each level that is
> > populated.
>
> Really?  The way I was imagining it, only the "last" level (i.e., the
> smallest triangles) would actually contain lists of objects.  "Parent"
> triangles would simply refer to the lists of their children.  Like this:
>
> class Triangle {
> public:
>    TriangleBase( int _id );
>    ~TriangleBase();
>    int id() const { return m_ID; }
>    bool isLastLevel();
>
>    virtual QList<StarObject*> starList();
>    void addStar( StarObject *s );
> };
>
> Triangle::starList() would return the Triangle's list of stars, if it is
> on the last level.  If it is not, then starList() will return the
> starList()'s of all last-level triangles contained within the Triangle.
> isLastLevel() should be a trivial matter of parsing the ID number, as
> will determining the last-level triangles that this Triangle contains.

The problem with this is that we lose the magnitude sort order.  If the
lists are all stored at level (say) 5, then there is no need to deal with
triangles at a higher level (like 3).  Might as well just deal with the
level 5 triangles.  It could be that a level 5 HTM would actually slow
things down a bit when we are viewing the entire sky because we have to
go through thousands of small lists, most of which ,don't contain any
stars bright enough to go on the map.

My idea of populating the triangles at multiple levels would get around
this problem.  Looking at the numbers above, I think we would probably
only need to populate at most two levels: one around levels 4 or 5 and
another around levels 2 or 3.

Again I think we should experiment first to find out if this is a valid
concern or not.  The experiments can be carried out by only populating
one level at a time.  I think we should first try just a single level.
After we do some speed tests, we can decide whether or not it makes
sense to populate more than one level.

Certainly let's start with doing the simplest thing first.  But while
we are still in the talking stage, I think it is good to keep in mind
possible "escape hatches" like this.  My feeling is that if we can get
a factor of 2 speedup when fully zoomed out, it would probably be
worth it.

> ... I agree, let's concentrate on the fixed stars for now.  

Good, that makes things easier for now.  The difficulties with moving
objects goes away for now.  If we decide to implement multiple levels
for the fixed stars this doesn't mean we need to do the same thing
for things that move.  In fact, I have no idea if moving objects should
use an HTM at all.  If the positions are updated at roughly the same
rate that the screen is updated then the HTM may be of no help and
might actually slow things down a bit with useless work.  On the other
hand, if the HTM indexing is so fast that it beats manually clipping
each moving object then I would be very impressed.

On my 1.6 GHz P4, their code claims they can do 100,000 level 5 indexes
per second.  Is there an easy way to time how long it takes to clip
points with the current methods?

> By the way, something we should consider while thinking about all of
> this is the new class QGraphicsView that is going to be introduced with
> Qt 4.2.  

Thanks for the heads up.  I followed the link but I'm not going to think
about it anymore.  I will concentrate on HTM for the nonce.

Here are some mundanities that might be a bit too early to decide on but
this is how my mind works.

1) Should we add the triangle info to the HIP file format?  I am thinking
   yes because it will save 5 or 6 seconds of 100% CPU usage at startup.

2) In the final implementation, how do we store the triangles for rapid
   access?  In general, we will be using the HTM algorithms to give us 
   a list of triangles, we then want to access the star data contained
   in those triangles.  In Perl or Ruby I would just store them in a hash
   keyed on the triangle ID string (S032012, etc).  I know that is a
   QHash in QT-4, but I was thinking that we might want to store pointers
   to the triangles in an array, (or maybe even an array of triangles).
   The HTM routines seem to be able to return triangle ID's as numbers
   as well as as strings.  These numbers could be used to index an array
   which would be the smallest and fastest way to access the triangles.

   It is only in proof-reading this that I've realized my array idea is
   an alternative to the quad-tree storage you were suggesting in the
   code snippet above.  Since we will know the size beforehand (and it
   is at most in the low thousands) I really think an array is the right
   tool for this job.  What do you think?

3) The up-to-date compiler seems to be barfing on the STL vector class
   which is brought into the code via include directives that look like:

   #include <vector>

   If this was a "normal" C style include like <vector.h>, I would just
   copy over the gcc-2.95 vector header file(s) into the local include
   directory and change the angle brackets to double quotes.  Any hints,
   suggestions, or dire warnings before I try something like this?

   BTW:  The older compiler is using the standard libraries from the new
   compiler, the major difference is these header files.  That is why I
   am hopeful that using the older headers may allow the code to work
   under the new compiler.  A better solution would be to fix the code
   so it works with the new headers.

I did find some GPL declarations in four of the top level example apps in
the htm/app/ directory.  I really don't know if they are meant to also 
apply to the code under htm/src or if they are just for the code in those
top level source files.

What to do next?  Here is my current plan, but I am open to suggestions.

1) Try to get the code to compile with the current compiler.  This would
   open things up so that others could play and would make life much
   easier in general.  It will have to been done eventually anyway.
   If we ever hear back from the authors, they may already have figured
   out the fixes required so I am not going to spend huge amounts of
   time on this.

2) Write a Perl wrapper around their libs.  I suppose I could probably do
   this with the older compiler but I would be happier if I could get step
   (1) above working first.  This might seem like a sidetrack but I think
   it will make it much, much easier to play around with HTM.  If I get
   really stuck, I will drop it.

3) Do some experiments with HTM.  Read in the HIP files and experiment
   with populating different levels, etc.  If we put our minds to it,
   I am confident we can write C code that is as fast as Perl.  :-)
   (I am less confident about C++ :-).

-- 
Peace, James