Scrap Baloo Thread Feedback

Boudhayan Gupta bgupta at kde.org
Mon Oct 17 03:15:00 UTC 2016


Hi,

Unfortunately I've been hit by multiple pretty severe health scares in the
last month, and have no idea when I'm going to be at 100% again.

For the time being I'd rather not hold up any development, so don't hold
back anything on my account.

-- Boudhayan

On 16 October 2016 at 17:46, Christoph Cullmann <cullmann at absint.com> wrote:

> Hi,
>
> (evil top posting)
>
> given the silence, I assume any interest in Baloo has stopped once more?
> Or are there any plans for how to fix the current situation?
>
> Greetings
> Christoph
>
> ----- On 7 Oct 2016 at 20:08, cullmann <cullmann at absint.com> wrote:
>
> > Hi,
> >
> >> Hey
> >>
> >> On Fri, Oct 7, 2016 at 6:34 PM, Christoph Cullmann <cullmann at absint.com> wrote:
> >>> Hi,
> >>>
> >>>> On Fri, Oct 7, 2016 at 5:58 PM, Christoph Cullmann <cullmann at absint.com> wrote:
> >>>>>
> >>>
> >>> 1) No handling of DB errors besides asserting
> >>> 2) No handling of errors in the extractors (e.g. see the fixes I did;
> >>>    all extractors will need more of that)
> >>> 3) No handling of NFS/large inodes/inconsistencies => crash
> >>>
> >>> In the end, in my opinion, you can rewrite close to all parts dealing
> >>> with the DB or anything else internal. If anything ever gets
> >>> inconsistent, ATM you are doomed forever, unless by luck my new startup
> >>> code deletes the index; then you live again until it is reindexed.
> >>>
> >>>>
> >>> I am not sure. I am all for removing our own indexing completely and
> >>> using another indexer like Tracker, exactly to avoid the excursion into
> >>> the DB world and the question of how to handle it safely with close to
> >>> zero manpower.
> >>>
> >>
> >> It's avoiding the problem and hoping for the best, without any
> >> experiments.
> > That is not true.
> >
> > I did experiments, and search works with Tracker, but yes, one problem is
> > tagging, which ATM doesn't work. Nor do I say that it is a ready solution
> > now, just a possibility to avoid having to maintain low-level code with at
> > most one person (which is how it looks ATM).
> >
> > And I don't propose to go down that road now, but ATM I see nobody doing
> > any other experiments.
> >
> > Besides, Tracker has been constantly maintained and used for much longer
> > than 5 years:
> >
> > https://github.com/GNOME/tracker/graphs/contributors
> >
> >>
> >>>
> >>> => It is good that we agree, but I find it very astonishing that we use
> >>> Baloo in its current state, more or less mandatorily, on all those
> >>> systems where it will fail by design.
> >>>
> >>> (and it fails if you read the bugs)
> >>>
> >>
> >> There is a certain amount of failure, but it's not "by-design". But
> >> maybe I'm not seeing things clearly.
> > You yourself stated that neither the 32-bit issues nor NFS nor > 32-bit
> > inodes have any error handling. That seems to have been known even during
> > design, and still we now have this as a framework used by default by any
> > Plasma installation, including on systems with exactly those properties,
> > without error checking.
> >
> >>
> >>>>
> >>>>>>
> >>>>>> How about taking requirements such as resource consumption, ease of
> >>>>>> integration, and search speed into consideration? Come on guys.
> >>>>>> We're engineers over here.
> >>
> >>>>> What is the argument here? If you take a look at bugs.kde.org, you
> >>>>> see that people are complaining about all of that with Baloo. I see
> >>>>> no evidence anywhere that e.g. Baloo is "superior" to what GNOME uses
> >>>>> or any other solution (perhaps besides Nepomuk, ok...).
> >>
> >> What tests have been done to obtain the evidence?
> > What tests have been done to obtain the inverse evidence? I only hear the
> > complaint about not taking requirements like resource consumption or speed
> > into account, but there is ATM zero evidence that e.g. Tracker is slower.
> >
> > And yes, there are open "it hogs 100% memory or time" bugs, though you can
> > hardly reproduce them, as people are somehow scared to pack up their home
> > directory and send it to us. Not that a lot of those bugs got touched at
> > all in Bugzilla.
> >
> >>
> >>>
> >>>>
> >>>> Yup, you have. It's awesome. I no longer have the motivation to work
> >>>> on Baloo.
> >>> Thanks, but that makes me very sad, btw.
> >>> Baloo came up to replace Nepomuk, which was dead because it had too many
> >>> bugs and all maintainers left.
> >>> Now we have Baloo, which has many bugs, some even by design, and the
> >>> maintainer left, too.
> >>>
> >>
> >> Actually, Nepomuk was not dead. I was maintaining it. I killed it
> >> because it had too many structural problems.
> >>
> >> This is how the open source world works. People work on projects and
> >> when it no longer scratches their itch (I no longer use Baloo), they
> >> lose interest. This is "supposed" to be a hobby.
> > It is OK to see it as a hobby.
> >
> > But I am a bit unnerved that one proposes this as the generic index
> > solution for our desktop, which should be stable if nothing else, while
> > knowing that it has severe limitations that are not handled (see above).
> > I would have assumed that at least the known "can't work here" cases are
> > handled in a graceful way.
> >
> > And given already one of the first things main.cpp of baloo_file does is:
> >
> >    // HACK: Until we start using lmdb with robust mutex support. We're just going to remove
> >    //       the lock manually in the baloo_file process.
> >    QFile::remove(path + "/index-lock");
> >
> > that doesn't leave high hopes, sorry.
> >
> > And the typical error check is:
> >
> > void MTimeDB::put(quint32 mtime, quint64 docId)
> > {
> >    Q_ASSERT(mtime > 0);
> >    Q_ASSERT(docId > 0);
> >
> >    MDB_val key;
> >    key.mv_size = sizeof(quint32);
> >    key.mv_data = static_cast<void*>(&mtime);
> >
> >    MDB_val val;
> >    val.mv_size = sizeof(quint64);
> >    val.mv_data = static_cast<void*>(&docId);
> >
> >    int rc = mdb_put(m_txn, m_dbi, &key, &val, 0);
> >    Q_ASSERT_X(rc == 0, "MTimeDB::put", mdb_strerror(rc));
> > }
> >
> > without any way to pass an error to the outside, and without any error
> > handling code on the outside, as if no non-fatal error could ever occur.
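
For illustration, a minimal sketch of what passing the error outwards
instead of asserting could look like. This is not Baloo code: DbResult,
PutFn and putMTime are hypothetical names, and the backend callback is only
a stand-in for mdb_put so that the example compiles without LMDB.

```cpp
#include <cstdint>
#include <functional>
#include <string>

// Hypothetical result type: the backend's return code plus a message.
struct DbResult {
    int code;             // 0 means success, as with the rc checked above
    std::string message;  // human-readable description on failure
    bool ok() const { return code == 0; }
};

// Stand-in for the storage call (mdb_put in the quoted code): takes key and
// value, returns 0 on success or a nonzero error code.
using PutFn = std::function<int(std::uint32_t, std::uint64_t)>;

// Variant of the quoted put() that reports failures to the caller.
DbResult putMTime(const PutFn& backendPut, std::uint32_t mtime, std::uint64_t docId)
{
    // Invalid input becomes an error value instead of a Q_ASSERT.
    if (mtime == 0 || docId == 0) {
        return {-1, "putMTime: mtime and docId must be non-zero"};
    }
    const int rc = backendPut(mtime, docId);
    if (rc != 0) {
        return {rc, "putMTime: backend write failed with code " + std::to_string(rc)};
    }
    return {0, ""};
}
```

A caller can then decide what a failed write means (skip the file, retry,
mark the index for rebuild) rather than the process aborting in debug
builds and silently continuing in release builds.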
> >
> >>
> >>>
> >>>> (This is why they run on a separate process)
> >>> That doesn't help, it just OOMs your system => dead. It needs resource
> >>> restrictions, which are tricky to get right.
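
As a rough sketch of one possible resource restriction (assuming Linux and
POSIX setrlimit; this is not what Baloo does, and limitAddressSpace and
childAllocationRefused are hypothetical helper names): a child process
standing in for an extractor gets a capped address space, so an oversized
allocation fails inside the child instead of exhausting the machine.

```cpp
#include <sys/mman.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cstddef>

// Caps the address space (soft limit only) of the calling process.
bool limitAddressSpace(rlim_t maxBytes)
{
    struct rlimit lim;
    if (getrlimit(RLIMIT_AS, &lim) != 0) {
        return false;
    }
    // The soft limit may not exceed the hard limit, so clamp to it.
    if (lim.rlim_max != RLIM_INFINITY && maxBytes > lim.rlim_max) {
        maxBytes = lim.rlim_max;
    }
    lim.rlim_cur = maxBytes;
    return setrlimit(RLIMIT_AS, &lim) == 0;
}

// Forks a child, caps its address space to capBytes and lets it try to map
// wantBytes of anonymous memory. Returns true if the mapping was refused in
// the child while the parent kept running.
bool childAllocationRefused(rlim_t capBytes, std::size_t wantBytes)
{
    const pid_t pid = fork();
    if (pid < 0) {
        return false;
    }
    if (pid == 0) {
        if (!limitAddressSpace(capBytes)) {
            _exit(2);
        }
        void* p = mmap(nullptr, wantBytes, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        _exit(p == MAP_FAILED ? 0 : 1);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) {
        return false;
    }
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

The tricky part mentioned above remains: picking a cap that is high enough
for legitimate large files but low enough to protect the system is a policy
question this sketch does not answer.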
> >>>
> >>
> >> You're right. It needs a better thought out solution. A separate
> >> process is the bare minimum.
> >>
> >> Btw, have you looked if Tracker actually does any of this?
> > It has process separation and it handles crashes well enough to not screw
> > up client process queries. And it has maintained extractors or miners,
> > unlike us. But for sure, it has bugs and crashes and all those things, but
> > it is maintained and has had a constant stream of fixes for a longer time
> > than Baloo and all its predecessors together.
> >
> >>
> >>>> My hostility was because the proposal ignores key points such as -
> >>>>
> >>>> * Indexing Speed
> >>>> * Search speed
> >>>> * Database size
> >>> => If you look at the bugs, people complain that we are inferior. I do
> >>> not see that the proposal ignores it; I just do not see how to compare,
> >>> given there are no hard facts that we are faster than e.g. Tracker in any
> >>> way.
> >>>
> >>
> >> Data can be gathered about it. Not all data is publicly available.
> > That would make any decision easier to take.
> >
> >>
> >>>> * Ease of use with our existing components
> >>> My proposal did not change the interface at all; it has zero impact on
> >>> "ease of use".
> >>>
> >>>> * Ease of fixing problems in the code
> >>> My estimate would be: rewrite close to everything. Even the basic 64-bit
> >>> int id won't work with 64-bit inodes, each DB call must be touched to
> >>> check for errors, and at each place one will need to check for potential
> >>> inconsistencies and exit gracefully...
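
To illustrate the inode problem, the sketch below assumes that the 64-bit
id is built from a 32-bit device number and a 32-bit inode number. That
layout is an assumption made for this example; the thread itself only
states that the id cannot hold 64-bit inodes. packTruncating and
packChecked are hypothetical names.

```cpp
#include <cstdint>

// Assumed layout: high 32 bits = device number, low 32 bits = inode number.
// Anything above 32 bits is silently cut off.
std::uint64_t packTruncating(std::uint64_t device, std::uint64_t inode)
{
    const std::uint64_t dev32 = static_cast<std::uint32_t>(device);
    const std::uint64_t ino32 = static_cast<std::uint32_t>(inode);
    return (dev32 << 32) | ino32;
}

// Checked variant: refuses values that do not fit, so the caller can skip
// the file or report an error instead of storing a colliding id.
bool packChecked(std::uint64_t device, std::uint64_t inode, std::uint64_t& id)
{
    const std::uint64_t max32 = 0xFFFFFFFFu;
    if (device > max32 || inode > max32) {
        return false;
    }
    id = (device << 32) | inode;
    return true;
}
```

With the truncating version, inode 5 and inode 2^32 + 5 on the same device
map to the same id, which is one way an index can become inconsistent
without any error being raised.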
> >>>
> >>
> >> I don't follow why everything needs to be re-written? Am I missing
> >> something or do we just need to check for more errors and use a higher
> >> integer id? This certainly doesn't seem super trivial, but it sounds
> >> like less work than implementing a shim on top of Tracker.
> > If you look at your own code, you will see that there is no error handling
> > at all besides asserts (see above).
> >
> > There is not even the concept of passing an error out to higher levels.
> >
> > Perhaps I am wrong, as there is also only a little documentation, but if
> > you start to add error handling at the DB calls, you can start to rewrite
> > all internal layers.
> >
> > Besides, I don't see any documentation of the DB format, but I could have
> > missed that (at least it is neither in the git repository nor at
> > https://community.kde.org/Baloo).
> >
> >>
> >> I could be wrong.
> > So could I ;=)
> >
> >>
> >>>>
> >>>> Baloo has certain speed requirements if it is to be used with krunner,
> >>>> and we want instant feedback. This was an integral requirement.
> >>> I doubt e.g. Tracker has different requirements, as it is used in similar
> >>> places by GNOME.
> >>>
> >>> But all that aside, do you have a proposal for how to fix the current
> >>> situation?
> >>> Are you willing to invest some work to fix the current issues, or do you
> >>> have an idea of what would be a good way to tackle them?
> >>>
> >>
> >> I probably will not work on Baloo any more.
> >>
> >> I'll have to investigate the problems a bit more. From the cursory
> >> look of this thread, it doesn't seem that the problems are that dire.
> >> But I may not be reading into it correctly.
> > What would be highly appreciated is a bit of documentation on what the
> > different pieces do and the like, even if you have no time to code.
> >
> > Greetings
> > Christoph
> >
> > --
> > ----------------------------- Dr.-Ing. Christoph Cullmann ---------
> > AbsInt Angewandte Informatik GmbH      Email: cullmann at AbsInt.com
> > Science Park 1                         Tel:   +49-681-38360-22
> > 66123 Saarbrücken                      Fax:   +49-681-38360-20
> > GERMANY                                WWW:   http://www.AbsInt.com
> > --------------------------------------------------------------------
> > Geschäftsführung: Dr.-Ing. Christian Ferdinand
> > Eingetragen im Handelsregister des Amtsgerichts Saarbrücken, HRB 11234

More information about the Kde-frameworks-devel mailing list