Scrap baloo?

Christoph Cullmann cullmann at absint.com
Fri Oct 7 06:09:09 UTC 2016


Hi,

>> >>> At the same time, LMDB needs to be replaced, and fast. I'm building a
>> >>> new KVDB as a university project (it should be able to do 256GB
>> >>> indexes on 32bit machines), and if that doesn't work out there's
>> >>> Sophia (http://sophia.systems/). I'll be evaluating both as a
>> >>> replacement for LMDB.
>> >> 
>> >> Do we really want to maintain our own DB system?
>> >> IMHO that will never work out, all DB systems around need more
>> >> maintenance power than we have.
>> > 
>> > This is something I'm not sure about. The DB will be built anyway, my
>> > graduation depends on it :D And if I'm going to do something I will do
>> > it well, so it'll be simple and clean.
>> 
>> I don't doubt that you are capable of writing clean and working code.
>> 
>> The only problem is: there is a big difference between an academic
>> implementation and a production-ready one. Any existing key-value database
>> that is usable for general consumption is a multi-man-year effort. Even if
>> you start today, that is a solution we can use in some years, if at all.
>> 
>> Actually, most of the work is in handling all the different environments
>> and corner cases, which more or less can only be done by getting feedback
>> over several years, and I doubt we want to incubate a new DB in baloo as
>> a playground on our users' production machines.
>> 
>> > If it doesn't work out, there's always Sophia to fall back on.
>> 
>> Sophia is again designed to be used in server environments, just from their
>> start page:
>> 
>> "For server environment, which requires lowest latency access (both read and
>> write), predictable behaviour, optimized storage schema and transaction
>> guarantees."
>> 
>> This means that, like lmdb, it is most likely not really usable (at least
>> google doesn't turn up anything saying it is) for nfs (or other network)
>> home mounts, which are very common on large scale installations.
>> 
>> (sophia doesn't come off too well in the opinion of the lmdb author, either:
>> https://www.mail-archive.com/cyrus-devel@lists.andrew.cmu.edu/msg03653.html
>> )
>> >>> Vishesh also wanted to separate out the engine and make it public API
>> >>> (apparently other projects want to make use of it as a general data
>> >>> storage library), and the engine offers fulltext search capabilities
>> >>> and other fancy logical operators that make it particularly
>> >>> attractive. My plan is to move towards that, and eventually also not
>> >>> only index files but also other kinds of objects - contacts, or
>> >>> people, for example.
>> >>> 
>> >>> I don't want to move back into the "semantic desktop" idea at all, but
>> >>> I do want some sort of infrastructure that allows for an "action on
>> >>> object" metaphor - file objects can be opened with an application,
>> >>> people objects can be sent mails, and so on.
>> >>> 
>> >>> Hope this makes sense.
>> >> 
>> >> I still don't see how that should work out; atm, IMHO, the facts are:
>> >> 
>> >> 1) baloo is not maintained
>> > 
>> > It will be, now.
>> > 
>> >> 2) lmdb will e.g. never work for us on NFS homes, and the code needs a
>> >> major overhaul to handle errors (which you confirm)
>> > 
>> > LMDB goes away, either way.
>> > 
>> >> 3) you said you have "some time" left to maintain it, but you now
>> >> propose, in addition to maintaining Baloo, to write a DB system from
>> >> scratch; I don't really see that working
>> > 
>> > I have a personal interest, an academic interest, and now a
>> > KDE-related interest in the KVDB. It *will* work, because I'm the kind
>> > of guy who puts a lot of time and effort (maybe even
>> > disproportionately so) into things that genuinely interest me. My
>> > challenge will be to make the codebase so that after I'm done with
>> > this (say in about 5 years or so) it'll be comprehensible to the next
>> > maintainer.
>> 
>> As stated above, I don't doubt that you are capable, earnest and hard
>> working. But I don't see that we should prototype & develop a database;
>> the work on top of that alone (what baloo does atm) will take months to
>> get right.
>> >> 4) tracker, on the other hand, is maintained and in use, and we can
>> >> share the index data with GNOME and others
>> >> 
>> >> I really doubt that doing the work to remove lmdb, replace it with an
>> >> "own one" and then starting to fix all other issues (like indexer
>> >> running amok, broken file extractors, ...) will work out if we don't
>> >> clone some more people.
>> >> 
>> >> But that is only my opinion.
>> > 
>> > *Sigh*
>> > 
>> > I don't want to take the easy way out here. Half the fun in KDE is
>> > doing crazy things and seeing your baby work. That's the entire
>> > motivation for being here.
>> > 
>> > And right now I'm volunteering to do this.
> 
> Just chiming in here since I got a little worried when reading there are some
> foggy plans to 'roll our own' KVDB...
> 
>> I appreciate that; I would only like to avoid having, once more, an
>> indexer/search that starts from scratch and is left unmaintained.
> 
>> We had strigi based stuff, we had nepomuk and now we have baloo, all more or
>> less from scratch, and all ended up unmaintained and underdocumented (strigi
>> actually had more docs, as far as I remember ;=).
> 
> Yes.
> 
> Please, pretty please, don't reinvent the wheel here again, and please don't
> consider an academic research project as a production-ready replacement for
> a database backend. This is (sad) history repeating indeed.
> 
> There are alternatives for the DB at least, which work, are maintained (by
> more than one person) and where using them won't put another burden on us KDE
> developers who are lacking manpower in all different areas already.
> 
>> Actually, I don't insist on a tracker based solution, but I would like to
>> have one that doesn't end up in "KDE reinvents the wheel" once more if
>> there are perhaps alternatives available.
> 
> Like Christoph, I don't care about a specific solution, but going the NIH route
> sounds, by far, like the worst option. I'm not questioning Boudhayan's
> credibility to work out a great "draft" implementation of a KVDB for academic
> research... But the *major* selling point of a database implementation is a
> track record of being rock-stable in different environments over a long
> period of time. There's no way you can guarantee this for a one-man academic
> research project.
> 
> Please reconsider the options.
Any thoughts on that?

Greetings
Christoph

-- 
----------------------------- Dr.-Ing. Christoph Cullmann ---------
AbsInt Angewandte Informatik GmbH      Email: cullmann at AbsInt.com
Science Park 1                         Tel:   +49-681-38360-22
66123 Saarbrücken                      Fax:   +49-681-38360-20
GERMANY                                WWW:   http://www.AbsInt.com
--------------------------------------------------------------------
Geschäftsführung: Dr.-Ing. Christian Ferdinand
Eingetragen im Handelsregister des Amtsgerichts Saarbrücken, HRB 11234


More information about the Kde-frameworks-devel mailing list