[gcompris-devel] Search engine

Bruno Coudoin bruno.coudoin at gcompris.net
Sat Nov 2 17:29:27 UTC 2013


On 02/11/2013 12:13, Nicolas Adenis-Lamarre wrote:
> Hi,
>
> Thanks for your feedback.
> However, I don't understand why you prefer a g_strstr_len implementation
> as a first step.
> The g_strstr_len implementation is far from perfect, and I noticed no
> advantage compared to the sqlite database.
> The phrase "first implementation" in your mail is odd to me, and I have
> not really worked out what it means yet.
Hi,

I mean first implementation because I know it is missing the 'like' 
feature of the search.

>
> Functionally:
> unlike the sqlite one, it doesn't analyse words, just letter sequences;
> for example, "train" matches "entrainement" (French).
This may be good or bad. For example, a search in French for 'dessin' 
will also match 'dessiner'.
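
To make the difference concrete, here is a minimal sketch in plain C. It is 
not GCompris code: the function names are hypothetical, and it uses strstr 
from the C standard library instead of g_strstr_len (which additionally 
takes a length bound) so that it compiles without GLib. The second function 
is only an assumption about what a "word-aware" match could look like.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Plain letter-sequence test: true if needle occurs anywhere in text. */
bool match_substring(const char *text, const char *needle)
{
    return strstr(text, needle) != NULL;
}

/* Word-aware test (hypothetical): true only if needle occurs at the
 * start of a word, i.e. at the start of text or after a non-alphanumeric
 * character. Works on bytes, so it ignores UTF-8 accented letters. */
bool match_word_prefix(const char *text, const char *needle)
{
    const char *p = text;

    while ((p = strstr(p, needle)) != NULL) {
        if (p == text || !isalnum((unsigned char)p[-1]))
            return true;
        p++;  /* keep looking after this occurrence */
    }
    return false;
}
```

With these, match_substring("entrainement", "train") is true while 
match_word_prefix("entrainement", "train") is false, and both accept 
'dessin' for 'dessiner'.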

> The sqlite implementation requires no extra prerequisite (sqlite is already
> there); it includes a word analyser library made specifically for word
> searches (exactly what we need).
> Personally I prefer letting libraries do their job over reinventing the
> wheel (OK, it is very simple in this case, so this can be discussed).
In this case we have sqlite but we don't have the data in sqlite. Or 
rather, in sqlite we have only the English data. This means changing the 
way we use our sqlite database.

>
> Technically:
> The implementation is not complicated.
> The fact that the data is already in memory is something that could/should
> be removed in the future.
I don't see the point here. On the contrary, we could also decide in the 
future to get rid of sqlite.
> It works because gcompris has few activities, but it makes gcompris
> startup O(n) and memory occupation O(n), where n
> is the number of activities. This is not a problem because the number of
> activities is 144 at the moment. With 2000 activities, it would make things
> more complicated.
Hmm, by the time we have 2000 activities all computers will have 1TB of 
memory. On average we probably hold 500 bytes of information on each 
activity, which is about 72KB for the current 144 activities and about 
1MB for 2000. No reason to optimize this.

> Gcompris uses the boards sqlite table only as a backup, not as a database:
> it is loaded into memory at startup, and then gcompris implements functions
> to look into that memory, rather than querying the database on demand
> (saving memory and cpu).
Like I said, the memory used is so small that there is no need to optimize 
this. Also, memory access is much faster: querying sqlite each time you 
enter an activity to display the next one would add a display lag.
> Basically,
> what I mean is not that the memory loading should be changed, because the
> number of activities in gcompris increases slowly, and everything works
> nicely and will keep working nicely for several more years,
> but the g_strstr_len implementation would add a requirement to keep the
> data in memory, so as not to break the search feature if we ever want to
> change the memory management to make gcompris use less memory.
Like I said, even in the future having all the texts in memory will not 
be an issue.
>
> The sqlite database is a bit less than 20MB. However, I was not precise:
> once compressed in a package, it's about 500KB.
Currently the sqlite database is created on the fly on the user's PC 
and takes only 200KB.

In your proposal, all the strings for all the languages would go into this 
database, but they are already in the translation databases (.mo files). 
Thus you are creating a copy of this data, which brings maintenance issues.

Have you searched for a 'like' algorithm that could be used in GCompris?
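
As one possible starting point, and assuming 'like' here means approximate 
matching (this thread does not define it), edit distance is a common 
candidate. The sketch below is plain C with a hypothetical function name, 
not something taken from GCompris or GLib. It compares bytes, so an 
accented UTF-8 letter counts as more than one character.

```c
#include <stdlib.h>
#include <string.h>

/* Levenshtein distance between a and b, computed byte-wise with a
 * single row of the dynamic-programming table.
 * Returns -1 if memory allocation fails. */
int edit_distance(const char *a, const char *b)
{
    size_t la = strlen(a), lb = strlen(b);
    int *row = malloc((lb + 1) * sizeof *row);

    if (row == NULL)
        return -1;

    /* Distance from the empty prefix of a to each prefix of b. */
    for (size_t j = 0; j <= lb; j++)
        row[j] = (int)j;

    for (size_t i = 1; i <= la; i++) {
        int diag = row[0];          /* previous row, column j - 1 */
        row[0] = (int)i;
        for (size_t j = 1; j <= lb; j++) {
            int up = row[j];        /* previous row, column j */
            int best = diag + (a[i - 1] == b[j - 1] ? 0 : 1);

            if (up + 1 < best)
                best = up + 1;          /* deletion */
            if (row[j - 1] + 1 < best)
                best = row[j - 1] + 1;  /* insertion */
            row[j] = best;
            diag = up;
        }
    }

    int result = row[lb];
    free(row);
    return result;
}
```

A search could then accept an activity word when the distance to the typed 
word is below a small threshold; the threshold is a tuning choice and not 
something settled in this thread.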

Bruno.




