Review Request 121134: make KGlobal reference counting threadsafe
René J.V. Bertin
rjvbertin at gmail.com
Sun Nov 16 17:19:28 GMT 2014
> On Nov. 16, 2014, 3:58 p.m., Milian Wolff wrote:
> > as I said before: Did you check whether that is actually the case? Did you add some assertions to see what other thread is calling this code?
> >
> > I'm not sure whether this is supposed to be threadsafe or not. If it must be threadsafe, you'll also need to make s_allowQuit threadsafe (QAtomicBool).
>
> René J.V. Bertin wrote:
> I have indeed looked at this aspect, but not exhaustively and haven't found evidence of cross-thread reference counting. Yet.
> It's unclear what you mean with "that" (the thing actually the case). It "that" refers to the quitting issue, I can of course never prove that it is due to a reference counting race condition because the issue isn't reproducible reliably. I can only invalidate the idea that it is due to such a race condition, whenever it happens again with properly protected reference counting.
>
> I think we have to leave it up to the kdelibs maintainers/authors to decide whether or not this has to be thread safe. The doxygen documentation for KJob does not mention anything about thread safety. It does however note that *KJob and its subclasses is meant to be used in a fire-and-forget way. It's deleting itself when it has finished using deleteLater() so the job instance disappears after the next event loop run.* . It is not said that a KJob cannot use a KJob itself, which could lead to concurrent reference (de)counting, nor whether a KJob instance is guaranteed to be deleted in the same thread it was created, and I think that a *concurrent processing* class that does not explicitly mention it does thread unsafe things can be assumed to be thread safe.
>
> There's no QAtomicBool, btw.
>
> Milian Wolff wrote:
> As I told you before in other patches you send in: Please make sure you find the actual reason before changing other code in the hope of it improving things. If there is no hard evidence that your bug manifests itself because of a race in KGlobal::ref/deref, then don't change it. Adding an assertion to the code that its only accessed from the mainthread is done easily and should be hit reliably (contrary to a race condition or lock contention that Thomas proposes).
>
> And also note that a KJob is not neccessarily implemented using threads. You can also use e.g. QIODevice or external processes internally to get the desired behavior.
KJobs not using threads to implement concurrency will not cause race conditions in the reference counting; either they are in separate processes or they live in the same thread.
Let me repeat once more what I also wrote in reply to Thomas below: a feature (like reference counting) of a framework that is frequently used in multithreaded applications can be expected to be thread safe. That's why my original description states that I consider the fact that it isn't an important bug regardless of whether it's the cause of my issue. Regrettably I am a scientist trained to give background anytime I propose something, I keep forgetting that there are contexts where that actually turns out counterproductive :(
You suggest to use an assert or other detection of doing reference counting off the main thread, and doing so imply that reference counting should apparently happen only on that thread. There is no evidence that that is any more valid an assumption than that it should be thread safe. But I can do it ... only it won't tell us if multiple threads try to access the counter at the same time.
NB: my initial proposition used about the cheapest way to get interlocked counting I could think of, to introduce as little side-effects as possible for applications that are not concerned by possible race conditions. I don't believe there can be any other side-effect than additional cycles (though by using QAtomicInt we have to assume that it is implemented correctly).
- René J.V.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/121134/#review70441
-----------------------------------------------------------
On Nov. 16, 2014, 4:53 p.m., René J.V. Bertin wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://git.reviewboard.kde.org/r/121134/
> -----------------------------------------------------------
>
> (Updated Nov. 16, 2014, 4:53 p.m.)
>
>
> Review request for KDE Software on Mac OS X and kdelibs.
>
>
> Repository: kdelibs
>
>
> Description
> -------
>
> I have been experiencing unexpected exits from KDevelop that were not due to any kind of error in the KDevelop code; it was as if someone told the application to exit normally. This happens mostly on OS X, but also sometimes on Linux.
> I finally traced this to `KGlobal::deref` reaching a zero reference count and invoking `QCoreApplication::quit` when called from one of KDevelop's KJob dtors. There does not appear to be a reference counting mismatch in the code, so the issue might be due to a race condition in KGlobal::ref/deref.
>
> This patch introduces thread-safety to KGlobal's reference counting by turning the simple global `static int s_refCount` into a `static QAtomicInt s_refCount`. I consider this an important bug fix regardless of whether it corrects the issue I have with KDevelop.
>
>
> Diffs
> -----
>
> kdecore/kernel/kglobal.cpp cf003a4
>
> Diff: https://git.reviewboard.kde.org/r/121134/diff/
>
>
> Testing
> -------
>
> On OS X 10.6.8 only for now.
>
>
> Thanks,
>
> René J.V. Bertin
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.kde.org/pipermail/kde-core-devel/attachments/20141116/93c572c0/attachment.htm>
More information about the kde-core-devel
mailing list