Test execution for all of PIM

Ben Cooksley bcooksley at kde.org
Mon Feb 25 09:18:43 GMT 2019


On Mon, Feb 25, 2019 at 10:54 AM David Faure <faure at kde.org> wrote:
>
> Hi Ben,

Hi David,

>
> at this point I don't think that a timer-for-quitting would help, because the
> one bug I was able to reproduce locally was leading rather to a deadlock
> [deep within mariadb, details attached], so the event loop wouldn't be running
> anymore -- and a timer would be useless.
>
> In any case, https://phabricator.kde.org/D18888 fixes / works around that
> problem (by removing a data race).

Interesting. That would certainly align with what we've seen.
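
As an aside, a deadlocked main thread is the case where an in-process timer
can't fire, whereas a watchdog running on its own thread still can. A minimal
sketch in plain C++ follows; it is not Akonadi code, and the `Watchdog` class
and its callback are hypothetical names used only for illustration. In a real
process the callback would presumably do something like std::_Exit(1).

```cpp
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical helper: runs onExpire on a separate thread if the object is
// still alive once the timeout has elapsed. Because it does not rely on the
// main thread or an event loop, it still fires when the main thread is
// blocked (for example, stuck in a database call).
class Watchdog
{
public:
    Watchdog(std::chrono::milliseconds timeout, std::function<void()> onExpire)
        : m_thread([this, timeout, onExpire = std::move(onExpire)] {
              std::unique_lock<std::mutex> lock(m_mutex);
              // wait_for returns false if the timeout elapsed while
              // m_cancelled was still false.
              if (!m_cv.wait_for(lock, timeout, [this] { return m_cancelled; })) {
                  onExpire();
              }
          })
    {
    }

    // Destroying the watchdog before the deadline cancels it.
    ~Watchdog()
    {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_cancelled = true;
        }
        m_cv.notify_one();
        m_thread.join();
    }

private:
    // Declared before m_thread so they are initialised before the thread starts.
    std::mutex m_mutex;
    std::condition_variable m_cv;
    bool m_cancelled = false;
    std::thread m_thread;
};
```

Usage would be along the lines of creating `Watchdog w(std::chrono::minutes(10),
[] { std::_Exit(1); });` (with <cstdlib> included) at startup and letting it go
out of scope on a clean shutdown.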

>
> So I'd like to see if this fixes the CI issue, or if there are more problems.
> Can you re-enable test execution for akonadi (or all of PIM) and ping me if
> you again see the need to kill stuck processes? If possible I'd like to debug
> those stuck processes before they get killed...

I've now re-enabled it. Historically the hang has only shown up every couple
of runs, so we may need to wait a bit before it reappears.
If it does, I'll let you know. We can safely leave a Linux builder stuck, as
those are dynamically provisioned. If it blocks Windows or FreeBSD, though,
we'll have to unstick that builder, as there are only a couple of those.

>
> I should be available within a few hours at most this week, in case this
> happens.
>
> Cheers,
> David.

Thanks,
Ben

>
> On Saturday 9 February 2019 21:07:37 CET David Faure wrote:
> > On Tuesday 5 February 2019 19:14:28 CET Ben Cooksley wrote:
> > > On Wed, Feb 6, 2019 at 2:49 AM Daniel Vrátil <me at dvratil.cz> wrote:
> > > > On Sunday, February 3, 2019 6:37:49 PM CET David Faure wrote:
> > > > > On Friday 4 January 2019 20:26:53 CET Ben Cooksley wrote:
> > > > > > Once again, akonadi_knut_resource had failed to exit as it should.
> > > > >
> > > > > I just had an idea about this. How about I make the knut resource
> > > > > commit suicide, 30 minutes after starting? We never need it for that
> > > > > long anyway.
> > > >
> > > > I'm pretty sure you can go lower than 30 minutes (even 10 minutes is
> > > > generous). Ideally make it configurable through an env variable.
> > > >
> > > > The question is how to make sure the test is failed when the resource
> > > > gets stuck.
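
On the configurable timeout: a rough sketch of reading it from the
environment, assuming a plain number of seconds. The function name is made up
for illustration, and any variable name passed to it (say
KNUT_QUIT_TIMEOUT_SECS) is equally hypothetical; nothing like it exists in the
knut resource as far as this thread shows.

```cpp
#include <cerrno>
#include <chrono>
#include <cstdlib>

// Hypothetical helper: returns the timeout stored in the environment variable
// varName (a positive whole number of seconds), or fallback if the variable
// is unset, empty, not a number, or not positive.
std::chrono::seconds timeoutFromEnv(const char *varName, std::chrono::seconds fallback)
{
    const char *raw = std::getenv(varName);
    if (raw == nullptr || *raw == '\0') {
        return fallback;
    }

    char *end = nullptr;
    errno = 0;
    const long value = std::strtol(raw, &end, 10);

    // Reject out-of-range values, trailing garbage and non-positive numbers.
    if (errno != 0 || end == raw || *end != '\0' || value <= 0) {
        return fallback;
    }
    return std::chrono::seconds(value);
}
```

The fallback would be whatever default gets agreed on, e.g. the 10 minutes
mentioned above: `timeoutFromEnv("KNUT_QUIT_TIMEOUT_SECS", std::chrono::minutes(10))`.
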
> > >
> > > In all the cases I've seen, the test executable itself has already
> > > exited, which means the test pass/fail criterion has already been
> > > determined, so it won't be possible to make it a hard failure, I'm
> > > afraid.
> > >
> > > One thing I have observed though is that sometimes it isn't just the
> > > knut_resource which is sticking around - in some cases it's the whole
> > > akonadi_control / akonadiserver / knut_resource / mysqld combo.
> > > Thoughts?
> > >
> > > > > If I implement that, do you agree to re-enabling CI for kdepim?
> > > > > It smells a bit like a workaround, but better than no CI.
> > >
> > > Having it exit after a reasonable timeout period would solve the problem,
> > > yes.
> >
> > I'm currently looking at this from a slightly different angle.
> >
> > Very often when I run akonadi tests locally (like mysql-tagsynctest, but not
> > only), I get a warning "Resource synchronization timed out for
> > akonadi_knut_resource_0"
> > (after 30s of inactivity, during akonaditest setup), the test fails, and
> > mysqld keeps running (not the resource though).
> >
> > So far I've drilled down to a CollectionCreateJob (from CollectionSync)
> > that starts but never gets a response from akonadiserver, not sure why yet.
> >
> > Looking at past test failures for akonadi I see that this also happened on
> > CI:
> > https://build.kde.org/job/Applications/job/akonadi/job/kf5-qt5%20SUSEQt5.10/32/testReport/junit/projectroot.autotests/libs/akonadi_mysql_transactiontest/
> >
> > So let's see if things are better once I fix this bug.
> > Dan, any hints would be greatly appreciated :-)
>
>
> --
> David Faure, faure at kde.org, http://www.davidfaure.fr
> Working on KDE Frameworks 5
>
>
>
> ---------- Forwarded message ----------
> From: David Faure <faure at kde.org>
> To: "Daniel Vrátil" <me at dvratil.cz>
> Cc: KDE PIM <kde-pim at kde.org>
> Bcc:
> Date: Sat, 09 Feb 2019 22:59:31 +0100
> Subject: Re: Test execution for all of PIM
> On Saturday 9 February 2019 21:07:37 CET David Faure wrote:
> > So far I've drilled down to a CollectionCreateJob (from CollectionSync)
> > that starts but never gets a response from akonadiserver, not sure why yet.
>
> akonadiserver is stuck with two threads in a MySQL query.
> Race condition in libmariadb / mysqld ??
> This hole is deeper than I thought...
>
> Thread 10
> #0  0x00007fe2c2d4c08b in poll () from /lib64/libc.so.6
> #1  0x00007fe2b967c849 in poll (__timeout=-1, __nfds=1, __fds=0x7fe28fffd8c0) at /usr/include/bits/poll2.h:46
> #2  pvio_socket_wait_io_or_timeout (pvio=pvio at entry=0x7fe288008930, is_read=is_read at entry=1 '\001', timeout=timeout at entry=-1) at /usr/src/debug/mariadb-connector-c-3.0.3-lp150.1.2.x86_64/plugins/pvio/pvio_socket.c:499
> #3  0x00007fe2b967cbfa in pvio_socket_read (pvio=0x7fe288008930, buffer=0x7fe2880089a0 "\a", length=16384) at /usr/src/debug/mariadb-connector-c-3.0.3-lp150.1.2.x86_64/plugins/pvio/pvio_socket.c:300
> #4  0x00007fe2b968942f in ma_pvio_read (pvio=pvio at entry=0x7fe288008930, buffer=0x7fe2880089a0 "\a", length=length at entry=16384) at /usr/src/debug/mariadb-connector-c-3.0.3-lp150.1.2.x86_64/libmariadb/ma_pvio.c:251
> #5  0x00007fe2b9689593 in ma_pvio_cache_read (pvio=0x7fe288008930, buffer=buffer at entry=0x7fe28800c9d0 "\036", length=length at entry=4) at /usr/src/debug/mariadb-connector-c-3.0.3-lp150.1.2.x86_64/libmariadb/ma_pvio.c:293
> #6  0x00007fe2b9680216 in ma_real_read (net=0x7fe2880060e0, complen=complen at entry=0x7fe28fffda18) at /usr/src/debug/mariadb-connector-c-3.0.3-lp150.1.2.x86_64/libmariadb/ma_net.c:373
> #7  0x00007fe2b9680c7d in ma_net_read (net=net at entry=0x7fe2880060e0) at /usr/src/debug/mariadb-connector-c-3.0.3-lp150.1.2.x86_64/libmariadb/ma_net.c:427
> #8  0x00007fe2b9685731 in ma_net_safe_read (mysql=mysql at entry=0x7fe2880060e0) at /usr/src/debug/mariadb-connector-c-3.0.3-lp150.1.2.x86_64/libmariadb/mariadb_lib.c:204
> #9  0x00007fe2b9688740 in mthd_my_read_query_result (mysql=0x7fe2880060e0) at /usr/src/debug/mariadb-connector-c-3.0.3-lp150.1.2.x86_64/libmariadb/mariadb_lib.c:2036
> #10 0x00007fe2b968f1d7 in stmt_read_execute_response (stmt=stmt at entry=0x7fe28803fe30) at /usr/src/debug/mariadb-connector-c-3.0.3-lp150.1.2.x86_64/libmariadb/mariadb_stmt.c:1798
> #11 0x00007fe2b968fe80 in mysql_stmt_execute (stmt=0x7fe28803fe30) at /usr/src/debug/mariadb-connector-c-3.0.3-lp150.1.2.x86_64/libmariadb/mariadb_stmt.c:1991
> #12 0x00007fe2b9902849 in QMYSQLResult::exec (this=0x7fe28800ef00) at /d/qt/5/kde/qtbase/src/plugins/sqldrivers/mysql/qsql_mysql.cpp:1098
> #13 0x00007fe2c454923a in QSqlQuery::exec (this=this at entry=0x7fe28fffe200) at /d/qt/5/kde/qtbase/src/sql/kernel/qsqlquery.cpp:1012
> #14 0x000000000056aaeb in Akonadi::Server::QueryBuilder::exec (this=this at entry=0x7fe28fffe190) at /d/kde/src/5/kde/pim/akonadi/src/server/storage/querybuilder.cpp:409
> #15 0x00000000004d8f0d in Akonadi::Server::MimeType::insert (this=this at entry=0x7fe28fffe2a0, insertId=insertId at entry=0x0) at /d/kde/build/5/kde/pim/akonadi/src/server/entities.cpp:3789
> #16 0x00000000004eb8f7 in Akonadi::Server::MimeType::retrieveByNameOrCreate (name="inode/directory") at /d/kde/build/5/kde/pim/akonadi/src/server/entities.cpp:3696
> #17 0x0000000000500983 in Akonadi::Server::DataStore::appendMimeTypeForCollection (this=this at entry=0x7fe288005730, collectionId=3, mimeTypes=QStringList<QString> (size = 1) = {...}) at /d/kde/src/5/kde/pim/akonadi/src/server/storage/datastore.cpp:967
> #18 0x00000000005004b1 in Akonadi::Server::DataStore::appendCollection (this=this at entry=0x7fe288005730, collection=..., mimeTypes=QStringList<QString> (size = 1) = {...}, attributes=QMap<QByteArray, QByteArray> (size = 0)) at /d/kde/src/5/kde/pim/akonadi/src/server/storage/datastore.cpp:779
> #19 0x0000000000443f06 in Akonadi::Server::Create::parseStream (this=0x7fe28802f800) at /d/kde/src/5/kde/pim/akonadi/src/server/handler/create.cpp:124
>
> Thread 11 is in Akonadi::Server::MimeType::retrieveByNameOrCreate, waiting for the QMutex held by thread 10 in that same method; that's fine.
>
> Thread 12
> #0  0x00007fe2c2d4c08b in poll () from /lib64/libc.so.6
> #1  0x00007fe2b967c849 in poll (__timeout=-1, __nfds=1, __fds=0x7fe28effb890) at /usr/include/bits/poll2.h:46
> #2  pvio_socket_wait_io_or_timeout (pvio=pvio at entry=0x7fe284008df0, is_read=is_read at entry=1 '\001', timeout=timeout at entry=-1) at /usr/src/debug/mariadb-connector-c-3.0.3-lp150.1.2.x86_64/plugins/pvio/pvio_socket.c:499
> #3  0x00007fe2b967cbfa in pvio_socket_read (pvio=0x7fe284008df0, buffer=0x7fe284008e60 "\a", length=16384) at /usr/src/debug/mariadb-connector-c-3.0.3-lp150.1.2.x86_64/plugins/pvio/pvio_socket.c:300
> #4  0x00007fe2b968942f in ma_pvio_read (pvio=pvio at entry=0x7fe284008df0, buffer=0x7fe284008e60 "\a", length=length at entry=16384) at /usr/src/debug/mariadb-connector-c-3.0.3-lp150.1.2.x86_64/libmariadb/ma_pvio.c:251
> #5  0x00007fe2b9689593 in ma_pvio_cache_read (pvio=0x7fe284008df0, buffer=buffer at entry=0x7fe28400ce90 " ", length=length at entry=4) at /usr/src/debug/mariadb-connector-c-3.0.3-lp150.1.2.x86_64/libmariadb/ma_pvio.c:293
> #6  0x00007fe2b9680216 in ma_real_read (net=0x7fe2840065a0, complen=complen at entry=0x7fe28effb9e8) at /usr/src/debug/mariadb-connector-c-3.0.3-lp150.1.2.x86_64/libmariadb/ma_net.c:373
> #7  0x00007fe2b9680c7d in ma_net_read (net=net at entry=0x7fe2840065a0) at /usr/src/debug/mariadb-connector-c-3.0.3-lp150.1.2.x86_64/libmariadb/ma_net.c:427
> #8  0x00007fe2b9685731 in ma_net_safe_read (mysql=mysql at entry=0x7fe2840065a0) at /usr/src/debug/mariadb-connector-c-3.0.3-lp150.1.2.x86_64/libmariadb/mariadb_lib.c:204
> #9  0x00007fe2b9688740 in mthd_my_read_query_result (mysql=0x7fe2840065a0) at /usr/src/debug/mariadb-connector-c-3.0.3-lp150.1.2.x86_64/libmariadb/mariadb_lib.c:2036
> #10 0x00007fe2b968f1d7 in stmt_read_execute_response (stmt=stmt at entry=0x7fe28403cf50) at /usr/src/debug/mariadb-connector-c-3.0.3-lp150.1.2.x86_64/libmariadb/mariadb_stmt.c:1798
> #11 0x00007fe2b968fe80 in mysql_stmt_execute (stmt=0x7fe28403cf50) at /usr/src/debug/mariadb-connector-c-3.0.3-lp150.1.2.x86_64/libmariadb/mariadb_stmt.c:1991
> #12 0x00007fe2b9902849 in QMYSQLResult::exec (this=0x7fe28402f3d0) at /d/qt/5/kde/qtbase/src/plugins/sqldrivers/mysql/qsql_mysql.cpp:1098
> #13 0x00007fe2c454923a in QSqlQuery::exec (this=this at entry=0x7fe28effc1e0) at /d/qt/5/kde/qtbase/src/sql/kernel/qsqlquery.cpp:1012
> #14 0x000000000056aaeb in Akonadi::Server::QueryBuilder::exec (this=this at entry=0x7fe28effc170) at /d/kde/src/5/kde/pim/akonadi/src/server/storage/querybuilder.cpp:409
> #15 0x00000000004b708f in Akonadi::Server::Entity::addToRelationImpl (tableName="CollectionMimeTypeRelation", leftColumn="Collection_id", rightColumn="MimeType_id", leftId=leftId at entry=5, rightId=rightId at entry=1) at /d/kde/src/5/kde/pim/akonadi/src/server/storage/entity.cpp:135
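
Both threads above sit in poll() with __timeout=-1, i.e. they wait for the
server's reply with no upper bound. Purely to illustrate the difference a
bounded wait makes, here is a small POSIX sketch. It is not how libmariadb is
implemented and not a proposed fix; `readWithTimeout`, `ReadStatus` and
`ReadResult` are made-up names.

```cpp
#include <cerrno>
#include <cstddef>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

enum class ReadStatus { Ok, TimedOut, Error };

struct ReadResult {
    ReadStatus status;
    ssize_t bytes; // bytes read when status is Ok (0 means end of file)
};

// Hypothetical helper: waits at most timeoutMs milliseconds for fd to become
// readable, then reads up to length bytes. With timeoutMs == -1 it would wait
// indefinitely, which is the situation shown in the backtraces.
ReadResult readWithTimeout(int fd, void *buffer, std::size_t length, int timeoutMs)
{
    pollfd pfd{};
    pfd.fd = fd;
    pfd.events = POLLIN;

    int rc;
    do {
        // Note: retrying after EINTR restarts the full timeout; good enough
        // for a sketch, not precise.
        rc = ::poll(&pfd, 1, timeoutMs);
    } while (rc < 0 && errno == EINTR);

    if (rc == 0) {
        return {ReadStatus::TimedOut, 0};
    }
    if (rc < 0) {
        return {ReadStatus::Error, -1};
    }

    const ssize_t n = ::read(fd, buffer, length);
    if (n < 0) {
        return {ReadStatus::Error, -1};
    }
    return {ReadStatus::Ok, n};
}
```

A caller that gets TimedOut back can at least log the stall or give up,
instead of blocking for good while holding a lock that other threads need.
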
>
> Strangely enough, https://phabricator.kde.org/D18888 fixes / works around this.
>
> --
> David Faure, faure at kde.org, http://www.davidfaure.fr
> Working on KDE Frameworks 5


