[Kstars-devel] Some HTM progress

Fri May 26 23:24:44 CEST 2006

Hi,

James Bowlin wrote:
> On Friday 26 May 2006 06:20 am, Jason Harris wrote:
>>> Levels Triangles Resolution   Indexes/sec  Time for HIP catalog
>>> ------ --------- ------------ -----------  --------------------
>>>   14     500 Meg  30 arc-secs  20K          6 seconds
>>>   20    2000 Gig .01 arc-secs  17K          7 seconds
>> Interesting numbers, but I don't understand the "Triangles" column.
> 
> That column is the number of triangles needed to cover the sphere.
> 
So, 2 trillion triangles at level 20?  I guess we won't go that far then :)

> This is a good question.  We will certainly want the ability to get all
> of the stars within a circle for finding things close to the cursor.
> My understanding is that the rectangle is implemented by intersecting
> 4 constraints.  Since it is already part of their library, I suggest
> we experiment with both ways (rectangle, circle) and see which one is
> fastest.  We could also estimate the speeds if you could tell us how
> long it takes to clip one star.  
> 
If the Rectangular filter is the combination of 4 Constraints, then it 
will probably be much slower than a single Constraint.  But I agree, we 
should experiment to see what works best.

>> Triangle::starList() would return the Triangle's list of stars, if it is
>> on the last level.  If it is not, then starList() will return the
>> starList()'s of all last-level triangles contained within the Triangle.
>> isLastLevel() should be a trivial matter of parsing the ID number, as
>> will determining the last-level triangles that this Triangle contains.
> 
> The problem with this is that we lose the magnitude sort order.  If the
> lists are all stored at level (say) 5, then there is no need to deal with
> triangles at a higher level (like 3).  Might as well just deal with the
> level 5 triangles.  It could be that a level 5 HTM would actually slow
> things down a bit when we are viewing the entire sky because we have to
> go through thousands of small lists, most of which ,don't contain any
> stars bright enough to go on the map.
> 
A question regarding the HTM function that returns the list of triangles 
that satisfies a given constraint: Does it return higher-level 
triangles, or only the bottom level ones?  If it returns higher-level 
triangles, does it *also* return the smaller triangles that a higher 
triangle contains, or are they assumed to be covered by the presence of 
their parent triangle?

I think it should be possible to store the actual QLists in the 
bottom-level triangles only, and in higher-level triangles, refer to the 
combined lists of all of the bottom triangles that it contains.  It 
would look something like this:

QList<StarObject*> Triangle::starList()
{
     if ( isLastLevel() ) return m_StarList;

     QList<StarObject*> returnList;
     foreach( Triangle *t, containedTriangles() )
         returnList << t->starList();

     return returnList;
}

So, we build the list for a given higher-level triangle by concatenating 
the lists of its children.  Note that my function "containedTriangles()" 
should only return the four triangles immediately below this one, 
because this function is recursive, so it will automatically build up 
the list.  This version of the function doesn't preserve the magnitude 
sorting, though.  It does, however preserve memory by not permanently 
storing redundant lists.  Again, not sure what will turn out to be the 
best option, but just something to think about.

> My idea of populating the triangles at multiple levels would get around
> this problem.  Looking at the numbers above, I think we would probably
> only need to populate at most two levels: one around levels 4 or 5 and
> another around levels 2 or 3.
> 
Could very well be.  We'll have to find the sweet spot in all of this.

> Again I think we should experiment first to find out if this is a valid
> concern or not.  The experiments can be carried out by only populating
> one level at a time.  I think we should first try just a single level.
> After we do some speed tests, we can decide whether or not it makes
> sense to populate more than one level.
> 
Agreed.

> On my 1.6 GHz P4, their code claims they can do 100,000 level 5 indexes
> per second.  Is there an easy way to time how long it takes to clip
> points with the current methods?
> 
I'm not sure of exact numbers, but it's certainly less than 1 second to 
clip, project and draw the entire sky currently.  So that involves 
clipping almost 200,000 objects, and projecting/drawing whatever it's 
decided is "on screen".  IIUC, it doesn't need to index objects every 
time it clips, right?  Do you have numbers for the "slicing" operation's 
speed?

> Here are some mundanities that might be a bit too early to decide on but
> this is how my mind works.
> 
> 1) Should we add the triangle info to the HIP file format?  I am thinking
>    yes because it will save 5 or 6 seconds of 100% CPU usage at startup.
> 
Yes, this should definitely be done.

> 2) In the final implementation, how do we store the triangles for rapid
>    access?  In general, we will be using the HTM algorithms to give us 
>    a list of triangles, we then want to access the star data contained
>    in those triangles.  In Perl or Ruby I would just store them in a hash
>    keyed on the triangle ID string (S032012, etc).  I know that is a
>    QHash in QT-4, but I was thinking that we might want to store pointers
>    to the triangles in an array, (or maybe even an array of triangles).
>    The HTM routines seem to be able to return triangle ID's as numbers
>    as well as as strings.  These numbers could be used to index an array
>    which would be the smallest and fastest way to access the triangles.
> 
>    It is only in proof-reading this that I've realized my array idea is
>    an alternative to the quad-tree storage you were suggesting in the
>    code snippet above.  Since we will know the size beforehand (and it
>    is at most in the low thousands) I really think an array is the right
>    tool for this job.  What do you think?
> 
I agree, the array is the way to go.  If you go back to my original HTM 
email, I talked about a way to convert the standard HTM index 
("S032012") to a signed integer, which may make the array indexing easier.

> 3) The up-to-date compiler seems to be barfing on the STL vector class
>    which is brought into the code via include directives that look like:
> 
>    #include <vector>
> 
>    If this was a "normal" C style include like <vector.h>, I would just
>    copy over the gcc-2.95 vector header file(s) into the local include
>    directory and change the angle brackets to double quotes.  Any hints,
>    suggestions, or dire warnings before I try something like this?
> 
>    BTW:  The older compiler is using the standard libraries from the new
>    compiler, the major difference is these header files.  That is why I
>    am hopeful that using the older headers may allow the code to work
>    under the new compiler.  A better solution would be to fix the code
>    so it works with the new headers.
> 
I don't have an answer for this at the moment.  I've been in email 
contact with the HTM maintainer; maybe we should email him about this issue.

> I did find some GPL declarations in four of the top level example apps in
> the htm/app/ directory.  I really don't know if they are meant to also 
> apply to the code under htm/src or if they are just for the code in those
> top level source files.
>    
We should also ask him about licensing.  I'm almost positive it's not 
going to be a problem, but let's make sure.

> 
> What to do next?  Here is my current plan, but I am open to suggestions.
> 
> 1) Try to get the code to compile with the current compiler.  This would
>    open things up so that others could play and would make life much
>    easier in general.  It will have to been done eventually anyway.
>    If we ever hear back from the authors, they may already have figured
>    out the fixes required so I am not going to spend huge amounts of
>    time on this.
> 
> 2) Write a Perl wrapper around their libs.  I suppose I could probably do
>    this with the older compiler but I would be happier if I could get step
>    (1) above working first.  This might seem like a sidetrack but I think
>    it will make it much, much easier to play around with HTM.  If I get
>    really stuck, I will drop it.
> 
> 3) Do some experiments with HTM.  Read in the HIP files and experiment
>    with populating different levels, etc.  If we put our minds to it,
>    I am confident we can write C code that is as fast as Perl.  :-)
>    (I am less confident about C++ :-).
> 
> 

Sounds great.  Thanks again for taking this on!

Jason