<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">Hello Dmitry,<br>
<br>
The part I have problem here is the deletion of the tile inside
the deref() call. See the following case:<br>
<br>
1. There is a tile with a single owner, the owner thread protected
it with a ref() call.<br>
The reference counter is 1.<br>
<br>
2. A second thread gets a pointer to the tile to use it. Right
after that, the task scheduler pauses this thread.<br>
<br>
3. The first thread decides to release the tile. Now it calls the
deref(), the reference counter drops to 0 and the thread deletes
the tile.<br>
<br>
4. The second thread starts running again and tries to ref() the
already deleted tile.<br>
<br>
I agree this is very unlikely but I think it is theoretically
possible. If there are thousands of ref() and deref() calls per
second it can happen. The tile data object is on the heap so
perhaps it is accessible after its destruction without a segfault.
So even if this happens, perhaps it remains hidden. The null data
pointer means the tile was swapped out so perhaps the program
can't notice if it's using a deleted tile. I don't really know
what happens on the heap if you delete an object twice.<br>
<br>
My solution was the using of shared pointers. But perhaps you're
right and it has too much overhead and this is not a real bug
because other mechanisms in the program protects it to happen.<br>
<br>
To be sure about it, I think the following solution may helps: let
the initial value of the reference counter be 1 (I think it
already becomes one right after its creation). Then if the ref()
call encounters a zero reference counter it means the tile was
deleted so it can drop an error message. It is not a real overhead
since it only checks the reply value of the Qt ref() call. Then we
can see if this bug happens or not.<br>
<br>
Fazek<br>
<br>
<br>
<br>
2016-03-22 07:54 keltezéssel, Dmitry Kazakov írta:<br>
</div>
<blockquote
cite="mid:CAEkBSfULE8ByjrUKp3p1Nv4rTAb5xH+65RJYf-v0n1kwGtgzJA@mail.gmail.com"
type="cite">
<pre wrap="">Hi, Fazekas!
I know it's dangerous. In my opinion Krita is still very unstable so I can
</pre>
<blockquote type="cite">
<pre wrap="">totally imagine this bug happens sometimes. There are just too many crashes.
</pre>
</blockquote>
<pre wrap="">
</pre>
<blockquote type="cite">
<pre wrap="">For the ref() and deref() calls, I still think it's not a safe way to
allocate and delete things. Because there is an unprotected time gap
between accessing and protecting the tile. You can build extra protection
around this, but multi threading is very tricky sometimes. I know these
calls must be highly regulated by the lockings, but I'm also sure the
deref() call is not protected - it cant be - by the lock I tried to avoid
in my second commit.
</pre>
</blockquote>
<pre wrap="">
Well, this code is here for almost five years now and there was not a
single backtrace pointing to this place. Could you please provide *at least
theoretical* sequence of multithreaded calls that could cause a problem
here? Most of the acquire() and ref() calls happen when at least one object
(mostly (*this) of a KisTile) still owns at least one ref+userCount to the
KisTileData. Or (*this) of KisMementoItem holds the ref counter. I know
that this is more a C-way of coding, but take care, that code is called
million times per second when working with 10k images. So calling ref(),
deref() on every function call is not a go. Memory barrier is not a cheap
operation, so we should avoid it if possible.
I made some tests to see how the different threads are calling these
</pre>
<blockquote type="cite">
<pre wrap="">functions and it happens very often in an asynchronous way.
</pre>
</blockquote>
<pre wrap="">
Yes, this part is called from multiple threads all the time.
</pre>
<blockquote type="cite">
<pre wrap="">Perhaps I can make a test to force this bug for you, but it's not easy,
since it would happen only if the threads are synchronized exactly right.
And my brain hurts if I'm trying to imagine what happening in the different
threads simultanously. So for now, I used the fact that a solution without
time gaps is surely better.
</pre>
</blockquote>
<pre wrap="">
I cannot understand what "gaps" you mean. Please elaborate a bit, then we
will be able to find a solution if needed. Doing a test might be a bit
difficult, but if you decide to do it, you can use
KisTiledDataManagerTest::stressTest() for a base.
</pre>
<blockquote type="cite">
<pre wrap="">I noticed the test directory, but I don't know how to include the
unittests for building so I left the test part untouched. But I'd like to
make the test codes too, even for my csv plugin so how can I build the
unittests actually?
</pre>
</blockquote>
<pre wrap="">
You should pass -DBUILD_TESTING=ON to a cmake command.
</pre>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<br>
<pre wrap="">_______________________________________________
Krita mailing list
<a class="moz-txt-link-abbreviated" href="mailto:kimageshop@kde.org">kimageshop@kde.org</a>
<a class="moz-txt-link-freetext" href="https://mail.kde.org/mailman/listinfo/kimageshop">https://mail.kde.org/mailman/listinfo/kimageshop</a>
</pre>
</blockquote>
<br>
</body>
</html>