Replacing file-system by database in KStars

Wed Jan 29 19:27:42 UTC 2014

On Tuesday 28 Jan 2014 08:22:08 Vijay Dhameliya wrote:
> Hi guys,
> 
> Currently, when KStars is launched, it reads the data for each kind of
> SkyObject from its respective file in the loaddata() methods. I have
> tracked down all the classes where we load data by reading a file.
> 
> I researched the topic a bit and found that loading data from a database
> is always a much better option than loading it from a file.
> 
> If we replace the file system with QSql, the pros are:
> 
> 1) We will not have to ship so many files with KStars
> 2) Loading from a database is quicker than loading from a file
> 3) The code for the load methods will be reduced in size
> 
> Cons:
> 1) I will have to move all the data from the files into the database
> using temporary methods
> 
> So I am planning to start coding the replacement of the file system with
> a database on my local branch.
> 
> Can you please give your views and suggestions on this? I am sure that
> they will be very helpful to me. :)

There are two areas you need to give serious thought to:

1) What database software?  If it is a server such as MySQL, be warned that
you will get a lot of people complaining about needing to run an entire
database server just to use KStars (see the complaints we regularly have/had
over in PIM and Amarok), or demanding that you use their preferred server
instead.  If it is an embedded database such as SQLite, will it actually give
you the performance improvements you're looking for?

2) What are the sources of the data files and how often are they updated?  If
these are data files updated regularly by an external provider, and perhaps
distributed through KGetHotNewStuff, then you will still need code to load the
new data and merge it into the old data in the database.  That means no real
reduction in the code required, plus a whole new level of complexity from the
data sync.  You also need to think about the initial download size of the
database file and how that will affect distros, which can currently split the
data files up into separate packages.  And you need to check licensing: some
of the files we use have to be downloaded by the users because we are not
allowed to distribute them, so again you would still need import code for
those.
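
To illustrate the import code that would still be needed, here is a minimal
sketch of merging an updated data file into an existing table.  It is written
in Python with the standard sqlite3 module purely for brevity; KStars itself
would presumably go through QSql in C++.  The "objects" table, its columns,
the pipe-separated line format and the sample values are all hypothetical
placeholders, not the real KStars data layout.

```python
import sqlite3


def parse_line(line):
    # Hypothetical format: name|ra|dec|mag (not the real KStars file layout).
    name, ra, dec, mag = line.strip().split("|")
    return name, float(ra), float(dec), float(mag)


def merge_catalog(conn, rows):
    """Insert new objects and overwrite existing ones, keyed on name."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS objects ("
        " name TEXT PRIMARY KEY, ra REAL, dec REAL, mag REAL)"
    )
    with conn:  # commit the whole batch as one transaction
        conn.executemany(
            "INSERT OR REPLACE INTO objects (name, ra, dec, mag)"
            " VALUES (?, ?, ?, ?)",
            rows,
        )


conn = sqlite3.connect(":memory:")

# Placeholder data: the initially shipped file, then a later update that
# changes one existing object and adds a new one.
shipped = ["obj-a|10.5|41.25|3.5", "obj-b|83.75|-5.5|4.0"]
update = ["obj-b|83.75|-5.5|4.25", "obj-c|56.75|24.0|1.5"]

merge_catalog(conn, [parse_line(line) for line in shipped])
merge_catalog(conn, [parse_line(line) for line in update])

merged = conn.execute("SELECT name, mag FROM objects ORDER BY name").fetchall()
print(merged)
```

Even in this toy form the file parser has not gone away; the database adds a
merge step on top of it, which is the extra complexity to weigh up.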

The option of having a single database from which you load just the data you
need for the view is good in theory, but the background management of that
data often ends up not being worth the gains, and often the gains fail to
materialise.  It could be worth a limited experiment, though, to investigate
and benchmark the potential gains.  If you do, pick the one big important file
used everywhere, load it into one table, and benchmark any gains or losses.
Don't assume that it will actually be faster just because a textbook or
website says so; you need to prove it.
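
Such an experiment could be sketched roughly as follows.  This is only an
illustration in Python with the standard sqlite3 module, not KStars code: the
generated sample file, its pipe-separated format and the "objects" table are
assumptions made up for the sketch, and a real test would use the actual data
file and the actual C++ loading code.

```python
import os
import sqlite3
import tempfile
import time


def make_sample(path, n):
    # Write n synthetic records in a hypothetical name|ra|dec|mag format.
    with open(path, "w") as f:
        for i in range(n):
            f.write(f"obj{i}|{i % 360}|{i % 180 - 90}|{i % 15}\n")


def load_from_text(path):
    # Parse the whole text file into a list of tuples.
    rows = []
    with open(path) as f:
        for line in f:
            name, ra, dec, mag = line.rstrip("\n").split("|")
            rows.append((name, float(ra), float(dec), float(mag)))
    return rows


def build_db(db_path, rows):
    # One-off conversion of the parsed rows into a single table.
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE objects (name TEXT, ra REAL, dec REAL, mag REAL)")
    with conn:
        conn.executemany("INSERT INTO objects VALUES (?, ?, ?, ?)", rows)
    conn.close()


def load_from_db(db_path):
    # Open the database and read the whole table back.
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute("SELECT name, ra, dec, mag FROM objects").fetchall()
    finally:
        conn.close()


def timed(fn, *args, repeats=5):
    # Return the best wall-clock time over several runs, plus the last result.
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn(*args)
        best = min(best, time.perf_counter() - start)
    return best, result


with tempfile.TemporaryDirectory() as tmp:
    txt_path = os.path.join(tmp, "sample.dat")
    db_path = os.path.join(tmp, "sample.db")
    make_sample(txt_path, 20000)
    text_time, text_rows = timed(load_from_text, txt_path)
    build_db(db_path, text_rows)
    db_time, db_rows = timed(load_from_db, db_path)

print(f"text file: {text_time:.4f}s  sqlite: {db_time:.4f}s  rows: {len(db_rows)}")
```

The numbers will depend on the machine, the disk cache and the data, so the
comparison needs to be run on the real file before drawing any conclusion in
either direction.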

I'm not sure who the current maintainer is, but it is best to check with them
before doing anything too radical :-)  Akarsh Simha would be a good person to
get an opinion from, as would the Marble team, who regularly deal with large
data files.

John.