question about kstars catalogs

Wed Jun 8 07:27:35 BST 2022

Hi Vincent

Thank you for your contributions to Siril, it's an excellent piece of
software and fills a crucial need in the open-source software ecosystem.

> Hello, I'm a developer of Siril, I'm looking at options to integrate a
> photometric star catalog for offline operation; we currently use NOMAD
> and APASS from the VizieR web service.

I'm not sure exactly what your use-case here is, but please be warned that
NOMAD was generated through algorithms running on images without human
curation, so at least in the version used in KStars there are several
artifacts, especially around Milky Way patches, diffraction spikes of
bright stars, and cores of galaxies. Perhaps there is a better version
released, but I haven't kept up.

> I'm looking for more information about the kstars catalog binary format,
> I was hoping you could give me pointers.

The KStars star catalog binary format is documented here:
https://invent.kde.org/education/kstars/-/blob/master/kstars/skycomponents/stars.dox
(for some reason I can't find the doxygen-generated HTML on the web)

I believe that the same file also has details of how the Hierarchical
Triangular Mesh index is structured. If you search for "Hierarchical
Triangular Mesh" on the internet, you will find some papers, but a useful
page is the one from the Sloan Digital Sky Survey (SDSS):
http://www.skyserver.org/htm/

These days HEALPix is more popular. It has a more complicated mesh
structure, but people tend to prefer it because it avoids compute-intensive
trigonometric functions. I believe KStars implements both HTMesh (for stars
and deep-sky objects) and HEALPix (for progressive surveys). The HTMesh
implementation in KStars, as a C++ library, lies here:
https://invent.kde.org/education/kstars/-/tree/master/kstars/htmesh
I believe the above is a standalone library. There is also a Perl wrapper
around it:
https://invent.kde.org/education/kstars/-/tree/master/kstars/data/tools/HTMesh-0.01

When we built this system back in 2008, the approach we took was an
optimization-first approach. This resulted in many non-portable decisions,
like using a custom-designed binary file format, restricting to 16- and
32-byte data structures, and C-style memory allocations. If I were to
redesign this system today, the first thing I'd try is to use an existing
low-footprint, fast, serverless database. The binary file is essentially a
key-value database with the trixel as the key, mapping to an array whose
entries are sorted by magnitude within each trixel. A separate index file
is provided for lookup by Henry Draper catalog number.
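
To make the "key-value database indexed by trixel" idea concrete, here is a
minimal sketch of how it might look in SQLite. The table name, column names
and sample rows are all made up for illustration; this is not an existing
KStars or Siril schema.

```python
import sqlite3

def build_catalog(rows):
    """Create an in-memory catalog from (trixel, ra_deg, dec_deg, mag) tuples.

    The schema is hypothetical, chosen only to mirror the description:
    trixel is the key, and stars are ordered by magnitude within a trixel.
    """
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE stars (trixel INTEGER, ra REAL, dec REAL, mag REAL)")
    # Composite index: rows of one trixel are found together, ordered by mag.
    db.execute("CREATE INDEX stars_by_trixel ON stars (trixel, mag)")
    db.executemany("INSERT INTO stars VALUES (?, ?, ?, ?)", rows)
    db.commit()
    return db

def stars_in_trixel(db, trixel, mag_limit):
    """Return (ra, dec, mag) for one trixel, brightest first, up to mag_limit."""
    cur = db.execute(
        "SELECT ra, dec, mag FROM stars "
        "WHERE trixel = ? AND mag <= ? ORDER BY mag",
        (trixel, mag_limit),
    )
    return cur.fetchall()

# Made-up sample data: three stars in trixel 7, one in trixel 8.
demo = build_catalog([
    (7, 10.0, 20.0, 9.5),
    (7, 10.1, 20.1, 4.2),
    (8, 50.0, -5.0, 6.0),
    (7, 10.2, 19.9, 12.8),
])
```

With an index on (trixel, mag), a magnitude-limited read of one trixel is a
single range scan, which is roughly what the custom binary layout achieves
by hand.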

There are several other subtleties in the system, including replicating
some high-proper-motion stars in multiple trixels that they may appear in
over the course of 10000 years. Most of these are explained in the doxygen
file.
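
As a rough illustration of the duplication idea (not the procedure KStars
actually used, which is the one described in the doxygen file), one could
sample a star's position along its proper-motion track and collect the
trixels it passes through. The sketch assumes uniform motion along a great
circle, proper motions in mas/yr with the RA component already multiplied by
cos(Dec), and a caller-supplied index_fn(ra_deg, dec_deg) that returns a
trixel number. The time range and step are parameters because I am not
stating here how the 10000 years are anchored.

```python
import math

MAS_TO_RAD = math.radians(1.0 / 3.6e6)  # one milliarcsecond in radians

def position_at(ra_deg, dec_deg, pm_ra_cosdec, pm_dec, years):
    """Move a star along a great circle; proper motions are in mas/yr."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    p = (math.cos(dec) * math.cos(ra), math.cos(dec) * math.sin(ra), math.sin(dec))
    east = (-math.sin(ra), math.cos(ra), 0.0)
    north = (-math.sin(dec) * math.cos(ra), -math.sin(dec) * math.sin(ra), math.cos(dec))
    mu = math.hypot(pm_ra_cosdec, pm_dec) * MAS_TO_RAD  # total motion, rad/yr
    if mu == 0.0:
        return (ra_deg % 360.0, dec_deg)
    # Unit vector of the direction of motion, tangent to the sphere at p.
    d = tuple((pm_ra_cosdec * e + pm_dec * n) * MAS_TO_RAD / mu
              for e, n in zip(east, north))
    ang = mu * years
    q = tuple(pi * math.cos(ang) + di * math.sin(ang) for pi, di in zip(p, d))
    new_ra = math.degrees(math.atan2(q[1], q[0])) % 360.0
    new_dec = math.degrees(math.asin(max(-1.0, min(1.0, q[2]))))
    return (new_ra, new_dec)

def trixels_visited(index_fn, ra_deg, dec_deg, pm_ra_cosdec, pm_dec,
                    start_year, end_year, step_years):
    """Collect the trixel of every sampled position between the two epochs."""
    steps = int(round((end_year - start_year) / step_years))
    visited = set()
    for i in range(steps + 1):
        t = start_year + i * step_years
        visited.add(index_fn(*position_at(ra_deg, dec_deg, pm_ra_cosdec, pm_dec, t)))
    return visited
```

A star whose returned set has more than one element would be the kind that
needs an entry in each of those trixels. The step has to be small compared
with the time the star takes to cross a trixel, otherwise a trixel can be
skipped.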

I believe I have the raw text data from the USNO on my hard drive, if it is
easier to start from there rather than from the binary file. But obviously,
this does not have an HTM index associated with it.

The code in kstars/data/tools unfortunately does a weird rigmarole of
loading this data into a MySQL database, doing some intermediate processing
there, and then writing out the binary file from the MySQL. At that time,
it made sense to load the data into MySQL, but today, SQLite would probably
be a better choice.

Unfortunately, this has made debugging or extending the catalog hard.
Please be aware that we have some issues reported on our GitLab; I believe
they mostly concern Tycho-2 stars. Hipparcos and Tycho-2 (I believe?) stars
have been removed from the NOMAD catalog binary file supplied in KStars so
as to avoid duplication, and they are shipped in separate binary files
(there are four files in total: namedstars.dat, unnamedstars.dat,
deepstars.dat and the NOMAD-1e8.dat; the first two I believe are
Hipparcos, the third Tycho-2, and the last USNO NOMAD, although I could be
wrong).

I am going to propose that if you are going to take the effort to re-work
this pipeline, we might as well collaborate and do it in such a manner that
would be beneficial to both projects. This way, our users need to install
the catalog only once. I am very open to porting away from the binary file
format, although I may not be able to commit the time to do the work today.

> I've been looking at the code and binary format as shown in
> nomadbinfiletester.c, I think it would not be difficult to use this to
> create a simple search of stars in NOMAD, not using a database but
> simply the file organized in "trixels".

Yes, this should be possible. In fact our binary file format already has
this solution. The header contains the 64-bit offsets into the file that
correspond to each trixel, and at each of those offsets, one finds a small
header followed by the stars organized by magnitude, brightest first.
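
To illustrate the general idea of an offset table followed by per-trixel
blocks, here is a toy sketch. The byte layout is invented for this example
and is NOT the KStars format; for the real field layout please refer to
stars.dox and nomadbinfiletester.c.

```python
import io
import struct

# Hypothetical layout, little-endian, no padding (not the KStars format):
HEADER = struct.Struct("<I")    # file header: number of trixels
OFFSET = struct.Struct("<Q")    # offset table: one 64-bit offset per trixel
COUNT = struct.Struct("<I")     # per-trixel header: number of stars
STAR = struct.Struct("<ddd")    # one star: ra_deg, dec_deg, mag

def write_catalog(trixels):
    """trixels[i] is a list of (ra, dec, mag) tuples for trixel number i."""
    buf = io.BytesIO()
    buf.write(HEADER.pack(len(trixels)))
    table_pos = buf.tell()
    buf.write(b"\0" * (OFFSET.size * len(trixels)))  # placeholder table
    offsets = []
    for stars in trixels:
        offsets.append(buf.tell())
        buf.write(COUNT.pack(len(stars)))
        for star in sorted(stars, key=lambda s: s[2]):  # brightest first
            buf.write(STAR.pack(*star))
    buf.seek(table_pos)  # go back and fill in the real offsets
    for off in offsets:
        buf.write(OFFSET.pack(off))
    return buf.getvalue()

def read_trixel(data, trixel, mag_limit):
    """Read one trixel's stars, stopping at the first star fainter than mag_limit."""
    (n_trixels,) = HEADER.unpack_from(data, 0)
    if not 0 <= trixel < n_trixels:
        raise IndexError("trixel out of range")
    (offset,) = OFFSET.unpack_from(data, HEADER.size + trixel * OFFSET.size)
    (count,) = COUNT.unpack_from(data, offset)
    pos = offset + COUNT.size
    result = []
    for _ in range(count):
        ra, dec, mag = STAR.unpack_from(data, pos)
        if mag > mag_limit:
            break  # sorted by magnitude, so everything after is fainter
        result.append((ra, dec, mag))
        pos += STAR.size
    return result
```

Because each block is sorted brightest first, a magnitude-limited query can
stop reading as soon as it meets a star that is too faint.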

>
>
> I didn't find how the trixels were distributed in the sky, is there a
> code you can point me to that converts RA,DEC to a trixel number?
> In Siril we only need to get a list of stars within a radius or a
> square around a target's coordinates, with a filter on magnitude too.

See the SDSS documentation for a visualization of how trixels are
distributed in the sky:
http://www.skyserver.org/htm/
An important parameter is the "level" of the HTM, which is the depth of the
recursive subdivision: each additional level splits every triangle into
four smaller ones. Each binary file catalog specifies in its header the
level of HTM that it's indexed on.

The trixel mesh operates, as you can expect, in J2000 coordinates.

As for finding a trixel corresponding to an (RA, Dec): you are looking for
HTMesh::index
https://invent.kde.org/education/kstars/-/blob/master/kstars/htmesh/HTMesh.cpp#L72
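
If it helps to see the idea without reading the C++ first, below is a small
sketch of an HTM-style lookup: start from the eight triangles of an
octahedron and descend "level" times into the child triangle that contains
the point. The vertex ordering and numbering (root ids 8 to 15, child id =
4 * parent + k) follow the published HTM scheme as I remember it. I have not
checked the result against HTMesh::index, which may number trixels
differently (for instance with an offset), so treat the ids as illustrative.

```python
import math

V = [(0, 0, 1), (1, 0, 0), (0, 1, 0), (-1, 0, 0), (0, -1, 0), (0, 0, -1)]
# Root triangles S0..S3, N0..N3 as indices into V; they get ids 8..15.
ROOTS = [(1, 5, 2), (2, 5, 3), (3, 5, 4), (4, 5, 1),
         (1, 0, 4), (4, 0, 3), (3, 0, 2), (2, 0, 1)]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def midpoint(a, b):
    """Midpoint of an edge, pushed back onto the unit sphere."""
    s = (a[0] + b[0], a[1] + b[1], a[2] + b[2])
    n = math.sqrt(dot(s, s))
    return (s[0] / n, s[1] / n, s[2] / n)

def score(tri, p):
    """Non-negative when p lies inside the spherical triangle tri."""
    a, b, c = tri
    return min(dot(cross(a, b), p), dot(cross(b, c), p), dot(cross(c, a), p))

def unit_vector(ra_deg, dec_deg):
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    return (math.cos(dec) * math.cos(ra), math.cos(dec) * math.sin(ra), math.sin(dec))

def htm_id(ra_deg, dec_deg, level):
    """Return the id of the trixel containing (RA, Dec) at the given level."""
    p = unit_vector(ra_deg, dec_deg)
    roots = [tuple(V[i] for i in r) for r in ROOTS]
    # Picking the best score (rather than the first non-negative one) keeps
    # the lookup well defined for points that fall on a trixel boundary.
    k = max(range(8), key=lambda i: score(roots[i], p))
    tri, ident = roots[k], 8 + k
    for _ in range(level):
        v0, v1, v2 = tri
        w0, w1, w2 = midpoint(v1, v2), midpoint(v0, v2), midpoint(v0, v1)
        children = [(v0, w2, w1), (v1, w0, w2), (v2, w1, w0), (w0, w1, w2)]
        k = max(range(4), key=lambda i: score(children[i], p))
        tri, ident = children[k], ident * 4 + k
    return ident
```

With this numbering, a level-L mesh has 8 * 4**L trixels with ids from
8 * 4**L up to 16 * 4**L - 1, and dropping the last base-4 digit of an id
gives the parent trixel.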

For the latter two use cases, finding the trixels covered by a circular or
rectangular region, you are looking for the various overloads of
HTMesh::intersect:
https://invent.kde.org/education/kstars/-/blob/master/kstars/htmesh/HTMesh.cpp#L104
I believe you can then iterate over the returned trixels through some
iterator structure.
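
One thing to keep in mind is that the trixels returned for a region
generally also contain stars that lie outside it, since a trixel can overlap
the region only partially. So after gathering the candidates you would still
filter by exact distance and magnitude. A minimal sketch, assuming the
candidates are (ra_deg, dec_deg, mag) tuples already read from the
intersecting trixels (the function names are mine, not from KStars):

```python
import math

def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    # Clamp to guard against rounding slightly above 1 for antipodal points.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))

def cone_filter(candidates, ra0, dec0, radius_deg, mag_limit):
    """Keep candidates within radius_deg of (ra0, dec0) and no fainter than mag_limit."""
    return [
        star for star in candidates
        if star[2] <= mag_limit
        and angular_separation_deg(ra0, dec0, star[0], star[1]) <= radius_deg
    ]
```

The magnitude test comes first in the condition because it is cheaper than
the trigonometry.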

You may also find the wrapper in SkyMesh.cpp worth looking at, although it
has several KStars-specific patterns:
https://invent.kde.org/education/kstars/-/blob/master/kstars/skycomponents/skymesh.cpp

For an example use of the mesh, see:
https://invent.kde.org/education/kstars/-/blob/master/kstars/skycomponents/deepstarcomponent.cpp#L283
(I believe aperture is a wrapper around intersect)

The intersect call is on line 283, which is followed by getting the
iterator on the next line, and the iteration loop is seen on line 310.

The deep-sky object implementation also uses HTMesh, but the data is
instead stored in a SQLite database:
https://invent.kde.org/education/kstars/-/blob/master/kstars/skycomponents/catalogscomponent.cpp#L92

>

> Another thing I was wondering is how to convert object names to
> coordinates (coordinates to name could be useful too), I guess you have
> a list in kstars, but would it easy to extract from whichever catalog
> contains them?
>

In KStars, each class of objects is managed differently when it comes to
name.

I believe all stars that have names are always held in memory and are never
loaded dynamically. For this, there is a supplemental binary file called
starnames.dat that contains the names of the stars. Each star's data in the
binary file may contain some catalog numbers (notably Henry Draper). The
reverse
mapping is stored and distributed as Henry-Draper.idx (apparently, this is
broken right now; I filed a number of issues on invent.kde.org related to
this recently).

For deep-sky objects, we store them in the SQLite database. The object
names are dynamically queried against the database via the
CatalogsComponent::findByName() wrapper, and the results are loaded into
memory (not sure of the details) perhaps upon centering the object, simply
because we load the trixel's contents.

For all other objects (e.g. planets, comets, ...), the corresponding
subclass of SkyComponent supplies the names (look for something like
objectNames).

Finally, if a name is not known to KStars, we submit a query to CDS Sesame
(which is an API front-end to SIMBAD and VizieR, and ostensibly NED
although it's never worked for me), parse the result, store it in the
SQLite deep-sky object database and then render it on screen for the user.
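
The "look locally first, ask the resolver only on a miss, then remember the
answer" pattern could be sketched as follows. This is not the KStars code:
the table layout is invented, and remote_resolver is a hypothetical callable
standing in for the Sesame query (it takes a name and returns (ra_deg,
dec_deg) or None), so that the sketch makes no claims about the Sesame API
itself.

```python
import sqlite3

def make_name_db():
    """Create an in-memory name table; the schema is made up for this sketch."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE objects (name TEXT PRIMARY KEY, ra REAL, dec REAL)")
    return db

def resolve(db, name, remote_resolver):
    """Return (ra_deg, dec_deg) for name, or None if nobody knows it.

    remote_resolver is a hypothetical stand-in for an online name service;
    it is called only when the local table has no entry.
    """
    key = name.strip().upper()
    row = db.execute(
        "SELECT ra, dec FROM objects WHERE name = ?", (key,)
    ).fetchone()
    if row is not None:
        return row
    coords = remote_resolver(name)
    if coords is None:
        return None
    # Cache the answer so the next lookup works offline.
    db.execute("INSERT INTO objects VALUES (?, ?, ?)", (key, coords[0], coords[1]))
    db.commit()
    return (coords[0], coords[1])
```

For offline operation the table could simply be pre-populated, with the
resolver reduced to a function that always returns None.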

So you could pick our catalogs workflow created by Valentin Boettcher to
generate SQLite tables with information from VizieR:
https://protagon.space/catalogs/
I am unable to locate the repo which contains the code, and I'm not even
sure if it is publicly accessible. Perhaps Jasem / Valentin can comment on
that.

Depending on how you intend to proceed, we may be interested in finding
ways to modernize our implementation of star catalogs in KStars by
leveraging any portable implementations you plan to make.

Regards
Akarsh

>
> Thanks a lot for your work and your help!
>
> Vincent
>