Review Request: Fix crash due to deleteLater being sent to the wrong object sometimes

Thu Dec 20 06:31:14 GMT 2012

> On Dec. 11, 2012, 7:23 a.m., Frank Reininghaus wrote:
> > Thanks for the patch!
> > 
> > > If updateItemStates was called from slotThreadFinished,
> > > m_updateItemStatesThread would be different when slotThreadFinished was
> > > over. deleteLater would be called on the *new* m_updateItemStatesThread,
> > > instead of the just-finished one, causing it to be deleted before
> > > it had finished running, and a crash.
> > 
> > Are you sure? VersionControlObserver::updateItemStates() connects the finished() signal and the deleteLater() slot of the *same* thread, so I don't see how a thread can be deleted before it's finished.
> > 
> > About the "} else if(!m_updateItemStatesThread){" changes: could you elaborate on why this could fix a crash? Moreover, could it be a problem that your new implementation may return an empty list 'actions' if lockPlugin() fails?
> > 
> > Sorry if my questions are stupid! I have little experience with the version control stuff ;-)
> 
> Simeon Bird wrote:
>     I'm not sure, but I have a 2-sigma confidence :) 
>     
>     This is how I figured it out: 
>     
>     I was getting this crash occasionally, so I tried removing the unlock/relock pair. 
>     This was slower, of course, so then I tried making the lock a QReadWriteLock. 
>     Rather to my surprise, I then got the crash all the time! And in the logs there was the message:
>     
>     QThread: Destroyed while thread is still running
>     
>     Which means that a delete event has to be passed to the QThread *before* the finished() signal is emitted. 
>     
>     Here is what I think happens: 
>     
>     updateItemStates() connects m_updateItemStatesThread's finished() to slotThreadFinished() and m_updateItemStatesThread->deleteLater() (in that order). So when the event is fired, Qt calls slotThreadFinished(), and then once it has finished, calls m_updateItemStatesThread->deleteLater().
>     
>     BUT m_updateItemStatesThread is changed in slotThreadFinished, and thus does not point to the same object that emitted finished()!
>     
>     Most of the time we have m_updateItemStatesThread=0 (which just leaks), but sometimes we call updateItemStates() from slotThreadFinished(), which assigns m_updateItemStatesThread to a new QThread, which then gets deleted too early.
>     
>     
>     As to the } else if(!m_updateItemStatesThread){ bit:
>     
>     The code tries to get the plugin lock if m_updateItemStatesThread != 0. But getting the plugin lock may fail (because m_updateItemStatesThread is holding onto it too long). If it does fail to get the lock, however, 
>     the code used the plugin anyway, which is not safe, because it may be being changed by m_updateItemStatesThread::setData at the time. 
>     I think returning an empty action list is fine, because it will just mean that some file will not have the "git add" menu items.
>     I think in practice that the plugin lock can always be obtained though, so it is a bit academic.
>     
>     You want I should put all this in the commit messages? :)
> 
> Frank Reininghaus wrote:
>     First of all, thanks for your work! I really appreciate it. The only reason why I ask so many questions is that I want to make sure that we don't push something that just hides the real bug (making it harder to find and fix it later on - I've seen such things happen) rather than fixing it.
>     
>     > Here is what I think happens: 
>     > 
>     > updateItemStates() connects m_updateItemStatesThread's finished() to slotThreadFinished() and m_updateItemStatesThread->deleteLater() (in that order). So when the event is fired, 
>     > Qt calls slotThreadFinished(), and then once it has finished, calls m_updateItemStatesThread->deleteLater().
>     >
>     > BUT m_updateItemStatesThread is changed in slotThreadFinished, and thus does not point to the same object that emitted finished()!
>     > 
>     > Most of the time we have m_updateItemStatesThread=0 (which just leaks), but sometimes we call updateItemStates() from slotThreadFinished(), which assigns m_updateItemStatesThread 
>     > to a new QThread, which then gets deleted too early.
>     
>     Just to make sure that I really understand this: You are saying that we fist have an UpdateItemStatesThread "thread1". Its finished() signal is connected to its own deleteLater() slot and to VersionControlObserver::slotThreadFinished(). When the latter is called, a new UpdateItemStatesThread "thread2" might be created in VersionControlObserver::updateItemStates(). This function connects "thread2"'s finished() signal to "thread2"'s deleteLater() slot. I don't see a problem here - "thread1"'s finished() signal is still connected to its own deleteLater() slot, not to the one of "thread2".
>     
>     If I understand your statement
>     
>     > BUT m_updateItemStatesThread is changed in slotThreadFinished, and thus does not point to the same object that emitted finished()!
>     
>     correctly, then you are saying that now "thread1"'s finished() signal invokes "thread2"'s deleteLater() slot? AFAIK, this is not possible. The connection between "thread1"'s signal and its own slot is not changed just because a member of the VersionControlObserver changes.
>     
>     > As to the } else if(!m_updateItemStatesThread){ bit:
>     >
>     > The code tries to get the plugin lock if m_updateItemStatesThread != 0. But getting the plugin lock may fail (because m_updateItemStatesThread is holding onto it too long). 
>     > If it does fail to get the lock, however, 
>     > the code used the plugin anyway, which is not safe, because it may be being changed by m_updateItemStatesThread::setData at the time. 
>     
>     But UpdateItemStatesThread::setData() is only called by the VersionControlObserver, so this can't really happen at the same time as a call to VersionControlObserver::actions() because both are supposed to run in the main thread, right? Moreover, UpdateItemStatesThread::setData() only changes the m_plugin member of the thread (by assigning a new value to the pointer), so I don't really see why we have to protect accesses to the m_plugin member of the VersionControlObserver at all, but maybe I'm missing something.
>     
>     > I think in practice that the plugin lock can always be obtained though, so it is a bit academic.
>     
>     As I said, I don't see why the lock is needed at all, maybe we should ask Peter about it.
>     
>     > You want I should put all this in the commit messages? :)
>     
>     Not necessarily, but I want to understand why a change makes sense before it is pushed :-)
>
> 
> Simeon Bird wrote:
>     > First of all, thanks for your work! I really appreciate it. The only reason why I ask so many questions is that I want to make sure that 
>     > we don't push something that just hides the real bug (making it harder to find and fix it later on - I've seen such things happen) rather > than fixing it.
>     
>     You're very welcome! Sorry for the late reply, real life intervened. And please do ask the questions, because I am new to this, and especially because in this case you are right. 
>     
>     > correctly, then you are saying that now "thread1"'s finished() signal invokes "thread2"'s deleteLater() slot? AFAIK, this is not possible. > The connection between "thread1"'s signal and its own slot is not changed just because a member of the VersionControlObserver changes.
>     
>     That is exactly what I was saying. Also, you're right, signals and slots do indeed work like that (I checked with a test program), and the event really should not be delivered to the wrong item. Furthermore, I can still get the crash, just less often with all these patches. 
>     So I think at the moment that moving the deleteLater around doesn't fix it, but it changes whatever race is happening just enough to make it less likely to occur. 
>     
>     Also there was a deadlock bug in the posted code (I forgot to change the order the mutexes are taken in setData() ), 
>     which was causing the loading to hang sometimes, hiding the crash. 
>     
>     So to summarize:
>     
>     I have this crash bug. It persists. 
>     
>     I'm getting this message:
>     
>     QThread: Destroyed while thread is still running
>     
>     from gdb, which should be completely impossible. The QThread is only deleted from once the finished() signal is emitted. 
>     QObject auto-deletion can't be happening here because the calling object is manifestly still there, blocking in setData (right?).
>     
>     > But UpdateItemStatesThread::setData() is only called by the VersionControlObserver, so this can't really happen at the same time as a call 
>     > to VersionControlObserver::actions() because both are supposed to run in the main thread, right? Moreover, 
>     > UpdateItemStatesThread::setData() only changes the m_plugin member of the thread (by assigning a new value to the pointer), so I don't 
>     > really see why we have to protect accesses to the m_plugin member of the VersionControlObserver at all, but maybe I'm missing something.
>     
>     Ah, I see what you mean. I thought when reading the code (but I didn't check this) that plugin->beginRetrieval() could internally have some state which could affect the output of plugin->actions(), and that was why the mutex was there. I don't think the git plugin has any such thing (and for that endRetrieval is a no-op). 
>     I also tried disabling the locking, and it didn't seem to make much difference, so it's up to you.

Wait, m_plugin is a KVersionControlPlugin, which is a QObject, and we're using it from multiple threads. 
Is that the real cause of the crash here?

- Simeon

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://git.reviewboard.kde.org/r/107656/#review23301
-----------------------------------------------------------

On Dec. 10, 2012, 4:27 a.m., Simeon Bird wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://git.reviewboard.kde.org/r/107656/
> -----------------------------------------------------------
> 
> (Updated Dec. 10, 2012, 4:27 a.m.)
> 
> 
> Review request for Dolphin and Frank Reininghaus.
> 
> 
> Description
> -------
> 
> Three commits here:
> 
> commit bd89a3ebe4f546e2032676c4b445befd9606e877
> 
>     Fix a potential crash if the plugin lock could not be taken
>     in VersionControlObserver.
> 
> (the "} else if(!m_updateItemStatesThread){" changes: I didn't see this, and I think it is unlikely, but best to be safe.)
> 
> 
> commit b1473eb1247e9168805ac92964cab11a10d6a0e2
> 
>     Make the UpdateItemStatesThread use a QReadWriteLock for protection
>     instead of a QMutex, now the previous crash is fixed and we can.
>     
>     Makes listings a bit faster.
> 
> commit 8f17e50800a7769521e496ab324d608ec0e36df4
> 
>     VersionControlObserver:
>     
>     If updateItemStates was called from slotThreadFinished,
>     m_updateItemStatesThread would be different when slotThreadFinished was
>     over. deleteLater would be called on the *new* m_updateItemStatesThread,
>     instead of the just-finished one, causing it to be deleted before
>     it had finished running, and a crash.
>     
>     Instead call deleteLater from slotThreadFinished directly.
>     
>     BUG: 302264
>     FIXED-IN: 4.10
> 
> This should really fix bug 302264 this time. 
> 
> 
> This addresses bug 302264.
>     http://bugs.kde.org/show_bug.cgi?id=302264
> 
> 
> Diffs
> -----
> 
>   dolphin/src/views/versioncontrol/updateitemstatesthread.h f0f91d7 
>   dolphin/src/views/versioncontrol/updateitemstatesthread.cpp e07d72c 
>   dolphin/src/views/versioncontrol/versioncontrolobserver.cpp 42e00de 
> 
> Diff: http://git.reviewboard.kde.org/r/107656/diff/
> 
> 
> Testing
> -------
> 
> Bug is fixed! 
> 
> It was very easy to reproduce (for me) once the window for it was made wider by using QReadWriteLock in UpdateItemStatesThread, and is no longer there.
> 
> 
> Thanks,
> 
> Simeon Bird
> 
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.kde.org/mailman/private/kfm-devel/attachments/20121220/9577454f/attachment.htm>