Test execution for all of PIM

David Faure faure at kde.org
Sun Feb 24 21:54:58 GMT 2019


Hi Ben,

At this point I don't think a timer-for-quitting would help, because the 
one bug I was able to reproduce locally led to a deadlock instead
[deep within mariadb, details attached], so the event loop wouldn't be running 
anymore -- and a timer would be useless.
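
To illustrate the difference, here is a minimal sketch of a watchdog that
does not depend on the event loop. It is not Akonadi code; Python is used only
for brevity, and the names start_watchdog, timeout_from_env and
KNUT_WATCHDOG_SECS are made up for this example. The idea is that a timer
running in its own thread can still fire when the main thread is blocked on a
lock, which an event-loop timer cannot do.

```python
import os
import threading


def timeout_from_env(env, name="KNUT_WATCHDOG_SECS", default=600.0):
    """Read the timeout in seconds from a (hypothetical) environment variable."""
    try:
        value = float(env.get(name, default))
    except ValueError:
        return default
    # Reject zero, negative, NaN and infinite values.
    return value if 0 < value < float("inf") else default


def start_watchdog(timeout_secs, on_timeout=None):
    """Call on_timeout from a separate thread after timeout_secs.

    The timer thread does not rely on the main thread's event loop,
    so it is not stopped by a main thread that is waiting on a lock.
    """
    if on_timeout is None:
        # Hard exit: skips cleanup handlers that might themselves block.
        on_timeout = lambda: os._exit(1)
    timer = threading.Timer(timeout_secs, on_timeout)
    timer.daemon = True  # never keeps a normally finishing process alive
    timer.start()
    return timer
```

A process would call start_watchdog(timeout_from_env(os.environ)) at startup
and call cancel() on the returned timer during a clean shutdown.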

In any case, https://phabricator.kde.org/D18888 fixes / works around that 
problem (by removing a data race).

So I'd like to see if this fixes the CI issue, or if there are more problems.
Can you re-enable test execution for akonadi (or all of PIM) and ping me if 
you see the need to kill stuck processes again? If possible I'd like to debug 
those stuck processes before they get killed...

I should be able to respond within a few hours this week, in case this 
happens.

Cheers,
David.

On Saturday 9 February 2019 21:07:37 CET David Faure wrote:
> On Tuesday 5 February 2019 19:14:28 CET Ben Cooksley wrote:
> > On Wed, Feb 6, 2019 at 2:49 AM Daniel Vrátil <me at dvratil.cz> wrote:
> > > On Sunday, February 3, 2019 6:37:49 PM CET David Faure wrote:
> > > > On Friday 4 January 2019 20:26:53 CET Ben Cooksley wrote:
> > > > > Once again, akonadi_knut_resource had failed to exit as it should.
> > > > 
> > > > I just had an idea about this. How about I make the knut resource
> > > > commit suicide, 30 minutes after starting? We never need it for that
> > > > long anyway.
> > > 
> > > I'm pretty sure you can go lower than 30 minutes (even 10 minutes is
> > > generous). Ideally make it configurable through an env variable.
> > > 
> > > The question is how to make sure the test is marked as failed when the
> > > resource gets stuck.
> > 
> > In all the cases I've seen, the test executable itself has already
> > exited, which means the test pass/fail criterion has already been
> > determined, so it won't be possible to make it a hard failure, I'm
> > afraid.
> > 
> > One thing I have observed though is that sometimes it isn't just the
> > knut_resource which is sticking around - in some cases it's the whole
> > akonadi_control / akonadiserver / knut_resource / mysqld combo.
> > Thoughts?
> > 
> > > > If I implement that, do you agree to re-enabling CI for kdepim?
> > > > It smells a bit like a workaround, but better than no CI.
> > 
> > Having it exit after a reasonable timeout period would solve the
> > problem, yes.
> 
> I'm currently looking at this from a slightly different angle.
> 
> Very often when I run akonadi tests locally (like mysql-tagsynctest, but not
> only), I get a warning "Resource synchronization timed out for
> akonadi_knut_resource_0"
> (after 30s of inactivity, during akonaditest setup), the test fails, and
> mysqld keeps running (not the resource though).
> 
> So far I've drilled down to a CollectionCreateJob (from CollectionSync)
> that starts but never gets a response from akonadiserver, not sure why yet.
> 
> Looking at past test failures for akonadi I see that this also happened on
> CI:
> https://build.kde.org/job/Applications/job/akonadi/job/kf5-qt5%20SUSEQt5.10/32/testReport/junit/projectroot.autotests/libs/akonadi_mysql_transactiontest/
> 
> So let's see if things are better once I fix this bug.
> Dan, any hints would be greatly appreciated :-)


-- 
David Faure, faure at kde.org, http://www.davidfaure.fr
Working on KDE Frameworks 5
-------------- next part --------------
An embedded message was scrubbed...
From: David Faure <faure at kde.org>
Subject: Re: Test execution for all of PIM
Date: Sat, 09 Feb 2019 22:59:31 +0100
Size: 8798
URL: <http://mail.kde.org/pipermail/kde-pim/attachments/20190224/172c9e94/attachment.mht>