[Kstars-devel] Additional Catalogs

Sat Apr 3 20:28:15 CEST 2010

Hello!

Thank you! Now everything seems quite clear.

Indeed, it sounds like a really strong option for deep sky catalogs! I'm
sorry I didn't think of it earlier, as it really covers all the
requirements.
I had heard of SQLite before, but it slipped my mind at the time.

I have started reading about using SQLite with C++, and I'm thinking of
designing the database relationship diagram (even though it's quite
intuitive, everything should be clear at this stage). Working with it
shouldn't be hard, as I've been working with relational databases for
quite some time now.

> Note that star designations are also a problem -- Ideally, we should
> be able to search by HD, SAO, GSC and HIP catalog numbers. We already
> handle HD catalog numbers in KStars, but handling other catalog
> identifications might require us to organize the binary data files for
> USNO NOMAD differently. So, I see this as a difficult problem.

Whatever we implement for the DSO catalog, I believe that the star
catalog will certainly remain as binary files, and I think the major
optimisation we can implement is the one using the STXXL library.

If we implement more catalogs, I believe that working directly on
the files and limiting the amount of RAM used would be a practical
solution.
STXXL seems to have very well optimized asynchronous file reads and
writes. At a minimum, we can use STXXL to explicitly overlap I/O
operations with computation, using the time between the start and
the end of a read operation for other processing.
For example, the list of stars could be an STXXL vector container [1]
built directly from a file [2].
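
To make the overlap idea concrete, here is a minimal sketch that uses only
the C++ standard library (std::async), not the STXXL API. The helper names
readChunk and sumBytesOverlapped are hypothetical, and summing bytes merely
stands in for real per-star processing: while one chunk is being processed,
the read of the next chunk is already in flight.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <functional>
#include <future>
#include <string>
#include <vector>

// Hypothetical helper: read up to maxBytes from the stream.
// Returns an empty vector once the end of the file is reached.
std::vector<char> readChunk(std::ifstream& in, std::size_t maxBytes) {
    std::vector<char> buf(maxBytes);
    in.read(buf.data(), static_cast<std::streamsize>(maxBytes));
    buf.resize(static_cast<std::size_t>(in.gcount()));
    return buf;
}

// Hypothetical example: sum all bytes of a file, overlapping the read of
// the next chunk with the computation on the current one.
std::uint64_t sumBytesOverlapped(const std::string& path, std::size_t chunkBytes) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return 0;

    std::uint64_t total = 0;
    std::vector<char> current = readChunk(in, chunkBytes);
    while (!current.empty()) {
        // Start reading the next chunk in a background thread.
        std::future<std::vector<char>> next =
            std::async(std::launch::async, readChunk, std::ref(in), chunkBytes);

        // Process the current chunk while that read is in progress.
        // The stream itself is only touched by the background task here.
        for (char c : current) total += static_cast<unsigned char>(c);

        // Wait for the read to finish and move on to the next chunk.
        current = next.get();
    }
    return total;
}
```

Only two chunks are held in memory at any time, so RAM use stays bounded by
the chunk size regardless of the size of the file.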

There are methods for searching and sorting the file directly [3] [4],
and we could use these to do some basic searching (if more
optimisation is needed, I think we'll further organize / sort the data
files under some other criterion too).
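
As an illustration of searching a sorted data file directly, here is a
minimal sketch in plain standard C++ (not STXXL). It assumes a file of
fixed-size records already sorted by catalog number; the StarRecord layout
and the function names are hypothetical and are not the actual KStars file
format. Each probe reads a single record, so memory use does not grow with
the size of the catalog.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical fixed-size record; not the real KStars binary layout.
struct StarRecord {
    std::uint32_t catalogNumber; // assumed sort key, e.g. an HD or SAO number
    float ra;
    float dec;
    float magnitude;
};

// Demo helper: write records (already sorted by catalogNumber) to a file.
bool writeCatalog(const std::string& path, const std::vector<StarRecord>& stars) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    out.write(reinterpret_cast<const char*>(stars.data()),
              static_cast<std::streamsize>(stars.size() * sizeof(StarRecord)));
    return static_cast<bool>(out);
}

// Binary search on the file itself: seek to the middle record, read it,
// and halve the search range until the number is found or the range is empty.
bool findByCatalogNumber(const std::string& path, std::uint32_t wanted,
                         StarRecord& result) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;

    const std::streamoff recordSize = static_cast<std::streamoff>(sizeof(StarRecord));
    in.seekg(0, std::ios::end);
    const std::streamoff bytes = in.tellg();
    std::streamoff lo = 0;
    std::streamoff hi = bytes / recordSize; // half-open range [lo, hi)

    while (lo < hi) {
        const std::streamoff mid = lo + (hi - lo) / 2;
        StarRecord rec{};
        in.seekg(mid * recordSize);
        if (!in.read(reinterpret_cast<char*>(&rec), sizeof rec)) return false;

        if (rec.catalogNumber == wanted) {
            result = rec;
            return true;
        }
        if (rec.catalogNumber < wanted) lo = mid + 1;
        else hi = mid;
    }
    return false;
}
```

Searching by a second identifier (say SAO in a file sorted by HD) would need
either a second file sorted by that identifier or a separate index, which is
the kind of reorganization mentioned above.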

As you've written,
1) We should be able to search by catalog numbers: not using a
database leaves us only with binary files. Hence, this makes STXXL a
must-use library for this part, at least. There are some papers about
its efficiency, for example [5].
2) We could make use of the current implementation as much as possible,
adapting / refining it with the STXXL containers where we
can.

I'll be back with some more details.

Victor

[1] http://algo2.iti.kit.edu/dementiev/stxxl/trunk/containers_2test__vector_8cpp-example.html
[2] http://algo2.iti.kit.edu/dementiev/stxxl/trunk/classvector.html#69bc1c231dbdc143da52c5ee98cd1de3
[3] http://algo2.iti.kit.edu/dementiev/stxxl/trunk/algo_2sort__file_8cpp-example.html
[4] http://algo2.iti.kit.edu/dementiev/stxxl/trunk/algo_2test__scan_8cpp-example.html
[5] http://www.docstoc.com/docs/26901503/Processing-Huge-Graphs-with-STXXL