[CRITICAL] KIO Test "threadtest" can enter into infinite loop

David Faure faure at kde.org
Sat Nov 7 10:23:06 UTC 2015


On Saturday 07 November 2015 10:36:18 David Faure wrote:
> On Saturday 07 November 2015 11:17:31 Ben Cooksley wrote:
> > Hi all,
> > 
> > It appears the test running with the binary name of "threadtest" in
> > kio has a grave bug which can lead to it entering into an infinite
> > loop.
> > 
> > This was consuming virtually the entire resources of one builder with
> > old hung processes, and the whole core of another builder -
> > drastically limiting the capabilities of the CI system (even though
> > KIO was not being built at the time).
> > 
> > Can someone please investigate? Manual intervention (with kill -9) is
> > needed to remove these hung processes.
> 
> It of course works fine on my own machine.
> 
> And I just tried running it on LinuxNode2, in ~/builds/kio/stable-kf5-qt5/build/autotests
> (after sourcing ~/kio.env which I just generated), and it ran fine (multiple times).
> 
> Any suggestion on how / where to hit the issue?

OK, the hang happens in the kf5-qt5 build rather than in stable-kf5-qt5.

One of the KIO-using threads seems stuck in QProcess... smells like a Qt bug?
Or is there a reason why starting a new process or creating a new pipe would
sometimes fail (some ulimit)?

Context: this test starts 20 threads at once, each of which launches the
kioslave binary with QProcess::startDetached. It works most of the time, but
sometimes hangs with the backtrace below.

(gdb) bt   
#0  0x00007ffff4bdd49d in read () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007ffff6f27637 in read () from /usr/lib/x86_64-linux-gnu/libasan.so.1
#2  0x00007ffff565a2ed in qt_safe_read (fd=43, data=0x7fffddf14cc0, maxlen=1) at ../../include/QtCore/5.5.1/QtCore/private/../../../../../src/corelib/kernel/qcore_unix_p.h:265
#3  0x00007ffff565e2b4 in QProcessPrivate::startDetached (program=..., arguments=..., workingDirectory=..., pid=0x0) at io/qprocess_unix.cpp:1246
#4  0x00007ffff560142d in QProcess::startDetached (program=..., arguments=...) at io/qprocess.cpp:2461
#5  0x00007ffff652f1f5 in KIO::Slave::createSlave (protocol=..., url=..., error=@0x7fffddf15260: 214848, error_text=...) at /home/jenkins/builds/kio/kf5-qt5/src/core/slave.cpp:499
#6  0x00007ffff658ec6d in KIO::ProtoQueue::createSlave (this=0x6110000db8c0, protocol=..., job=0x60300002e330, url=...) at /home/jenkins/builds/kio/kf5-qt5/src/core/scheduler.cpp:529
#7  0x00007ffff658fc60 in KIO::ProtoQueue::startAJob (this=0x6110000db8c0) at /home/jenkins/builds/kio/kf5-qt5/src/core/scheduler.cpp:616
#8  0x00007ffff6599711 in KIO::ProtoQueue::qt_static_metacall (_o=0x6110000db8c0, _c=QMetaObject::InvokeMetaMethod, _id=0, _a=0x7fffddf154b0)
    at /home/jenkins/builds/kio/kf5-qt5/build/src/core/moc_scheduler_p.cpp:250
#9  0x00007ffff571273e in QMetaObject::activate (sender=0x6110000db918, signalOffset=3, local_signal_index=0, argv=0x0) at kernel/qobject.cpp:3713
#10 0x00007ffff5711f2a in QMetaObject::activate (sender=0x6110000db918, m=0x7ffff5a45420 <QTimer::staticMetaObject>, local_signal_index=0, argv=0x0) at kernel/qobject.cpp:3578
#11 0x00007ffff57b6b49 in QTimer::timeout (this=0x6110000db918) at .moc/moc_qtimer.cpp:197
#12 0x00007ffff571e420 in QTimer::timerEvent (this=0x6110000db918, e=0x7fffddf158c0) at kernel/qtimer.cpp:247
#13 0x00007ffff570b77b in QObject::event (this=0x6110000db918, e=0x7fffddf158c0) at kernel/qobject.cpp:1220
#14 0x00007ffff56d0782 in QCoreApplicationPrivate::notify_helper (this=0x60f00000ef50, receiver=0x6110000db918, event=0x7fffddf158c0) at kernel/qcoreapplication.cpp:1093
#15 0x00007ffff56d0409 in QCoreApplication::notify (this=0x7fffffff7570, receiver=0x6110000db918, event=0x7fffddf158c0) at kernel/qcoreapplication.cpp:1038
#16 0x00007ffff56d02f1 in QCoreApplication::notifyInternal (this=0x7fffffff7570, receiver=0x6110000db918, event=0x7fffddf158c0) at kernel/qcoreapplication.cpp:965
#17 0x00007ffff56d42a5 in QCoreApplication::sendEvent (receiver=0x6110000db918, event=0x7fffddf158c0) at ../../include/QtCore/../../src/corelib/kernel/qcoreapplication.h:224
#18 0x00007ffff574cac1 in QTimerInfoList::activateTimers (this=0x60f00008e6c0) at kernel/qtimerinfo_unix.cpp:637
#19 0x00007ffff574dfc0 in timerSourceDispatch (source=0x60f00008e660) at kernel/qeventdispatcher_glib.cpp:177
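
Frames #2 to #4 show the calling thread blocked in a one-byte read on a pipe inside startDetached. That looks like the usual "started pipe" handshake between a parent and a forked child. Below is a minimal sketch of that handshake, assuming this is roughly what qprocess_unix.cpp does at that point; it is not Qt's code, and spawnWithHandshake is a hypothetical name. Both pipe ends are close-on-exec, so a successful exec gives the parent EOF, and a failed exec is reported with one byte. If the child stalls between fork and exec, the parent's read never returns.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper (not part of Qt or KIO).
// Returns true if the child managed to exec `program`.
bool spawnWithHandshake(const char *program, char *const argv[])
{
    int startedPipe[2];
    if (pipe(startedPipe) != 0)
        return false;
    // Close-on-exec: a successful exec closes the child's write end,
    // which the parent observes as EOF on the read end.
    fcntl(startedPipe[0], F_SETFD, FD_CLOEXEC);
    fcntl(startedPipe[1], F_SETFD, FD_CLOEXEC);

    const pid_t child = fork();
    if (child < 0) {
        close(startedPipe[0]);
        close(startedPipe[1]);
        return false;
    }
    if (child == 0) {
        close(startedPipe[0]);
        execv(program, argv);
        // Only reached when exec failed: tell the parent with one byte.
        const char failed = 1;
        const ssize_t ignored = write(startedPipe[1], &failed, 1);
        (void)ignored;
        _exit(127);
    }

    close(startedPipe[1]);
    char reply = 0;
    ssize_t n;
    do {
        // Blocks until EOF (exec worked) or the failure byte arrives.
        n = read(startedPipe[0], &reply, 1);
    } while (n == -1 && errno == EINTR);
    close(startedPipe[0]);

    // Simplification: reap the child so the sketch leaves no zombie.
    // A real detached launcher would avoid waiting, e.g. by forking twice.
    int status = 0;
    waitpid(child, &status, 0);
    return n == 0;
}
```

Calling it with "/bin/sh" and the arguments {"/bin/sh", "-c", "exit 0", nullptr} should return true, while a path that does not exist should return false.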

Config: Using QtTest library 5.5.1, Qt 5.5.1 (x86_64-little_endian-lp64 shared (dynamic) debug build; by GCC 4.9.2)

After many tries I got it to hang under strace -f as well.
Here's the log: http://www.davidfaure.fr/2015/strace.output.bz2
pipe2() succeeds 40 times, so that's not the problem...

Process 7389 (which works) does set_robust_list, close(43), execve(kioslave).
Process 7391 (which hangs) does set_robust_list, close(43) ... and nothing else.
Overall I see 60 calls to clone(), 40 calls to pipe2(), but only 16 calls to execve(kioslave).
Help?
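
The per-process comparison above can be automated. Here is a minimal sketch, assuming the log was written with `strace -f -o FILE`, so that each line starts with a numeric pid followed by the syscall. The linked log may use a different layout (for example `[pid N]` prefixes), which this does not parse. summarize and SyscallSummary are hypothetical names. The sketch counts calls per syscall name and records the last syscall logged for each pid, which is how a process that stops after close() without reaching execve() would show up.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

struct SyscallSummary {
    std::map<std::string, int> counts;    // syscall name -> number of calls seen
    std::map<int, std::string> lastCall;  // pid -> last syscall logged for that pid
};

// Assumed input format: "PID syscall(args...) = result", one call per line.
SyscallSummary summarize(std::istream &log)
{
    SyscallSummary summary;
    std::string line;
    while (std::getline(log, line)) {
        std::istringstream fields(line);
        int pid = 0;
        std::string call;
        if (!(fields >> pid >> call))
            continue;  // line does not start with "PID token"
        const std::string::size_type paren = call.find('(');
        if (paren == std::string::npos || paren == 0)
            continue;  // signal, exit or "<... resumed>" line, not a call
        const std::string name = call.substr(0, paren);
        ++summary.counts[name];
        summary.lastCall[pid] = name;
    }
    return summary;
}
```

Feeding it the log through a std::ifstream and printing counts["clone"], counts["pipe2"] and counts["execve"] would give the totals quoted above, and any pid whose lastCall is "close" is a candidate for the hang.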

-- 
David Faure, faure at kde.org, http://www.davidfaure.fr
Working on KDE Frameworks 5


