DVCS crashes and KJob (bug 172309)

Evgeniy Ivanov powerfox at kde.ru
Sun Oct 12 20:09:01 UTC 2008


Andreas Pakulat wrote:
> On 11.10.08 17:15:46, Evgeniy Ivanov wrote:
>> Hi all,
>> I've found the following thing in KJob and I think it is causing most
>> part of our DVCS crashes:
>> http://api.kde.org/4.x-api/kdelibs-apidocs/kdecore/html/kjob_8cpp-source.html#l00186
>> Look, start() is implemented in DVCSjob and it can emit result() (or
>> connect KProcess to slots emitting it). But it can emit result() before
>> loop will be execed and there will be no signals to stop it (deadlock or
>> crash depending on deferred deletion).
>> I have committed DVCS tests (vcs/dvcs/tests) which prove something
>> really wrong with deferred deletion, see dvcsjobTest.cpp:
>> First I have a deadlock in first job->exec(). I debugged and saw DVCSjob
>> emitted result() before loop is execed. QProcess::waitForStarted()
>> before starting a process fixed it.
> 
> Interesting that this doesn't seem to be a problem with the svn plugin. It
> also has code that calls emitResult inside start() and AFAIK its also uses
> with KJob::exec(). However it could be that svn is just more relaxed in the
> case of wrong inputs, for example svn info just checks wether the given
> location is valid (i.e. KUrl::isValid) and if its not it might still fail
> further down the road...

QProcess::error() signal is delivered very fast (faster, then finished)
The problem is with error() signal or when I set failed without running.

> 
>> Then I added another exec for the same job. Depending on
>> "job->setAutoDelete(false)" from the dvcsjobTest.cpp, tests got crash or
>> deadlock.
>> I think the reason is really in KJob::exec(). IMHO we should connect
>> result() to KJob::takeResult(){resultIsGot = true;} and do loop.exec()
>> only if resultIsGot == false. #qt confirms it.
> 
> I think you're right. In theory the eventloop itself could actually check
> that but that would need to wait until Qt4.5 is out. However I don't think
> we need a new slot, simply put the flag-setting into emitResult, that one
> is always called anyway. So the attached patch should work, can you please
> test that?

Did. It looks like your patch (see below, what I added) solves the
problem with tests.

+    if( !d->isFinished ) {
+        loop.exec();
         d->isFinished = false;    // the Job can be used for
multiple 		//processes (not the same time), so we should restore initial
state.
+    }


But it's a heisenbug: we removed the problem in tests, but it's still in
the code. I spoke with Thiago in irc, he doesn't think such problem with
KJob can occur, but tests confirm we're true.
It will be cool if someone try to run a test from vcs/dvcs/tests to
confirm it fails without KJob's patch.
I tried helgrind... But my hardware didn't like it... I don't have any
idea what's wrong now and can't reproduce it in tests. Will continue
next weekend. If you have any tips, what can be done you're welcome to
share :)

> 
>> But I don't understand how deleteLater kills an application. gdb shows
>> we stay in loop.exec() from KJob, so we do not return to the main loop
>> to activate deletion, right?
> 
> If you want to know the answer to that I suggest to look through Qt's event
> loop code. I'm not sure wether the delete-event is scheduled only for the
> main event loop, it might be that KJob's event loop also executed all
> deferred-delete's.

I guess I will *have* to do that =(
But docs say deferred deletion should be executed in the loop, where we
used deleteLater() and I believe trolls have tests for it.


-- 
Cheers, Evgeniy.
Key fingerprint: F316 B5A1 F6D2 054F CD18 B74A 9540 0ABB 1FE5 67A3





More information about the KDevelop-devel mailing list