KDirWatch bug and the analysis. Help is welcome!

Mark markg85 at gmail.com
Fri Aug 2 14:49:18 BST 2013


On Fri, Aug 2, 2013 at 3:41 PM, David Faure <faure at kde.org> wrote:
> On Friday 02 August 2013 15:07:19 Mark wrote:
>> Just for my view on this. What's the difference between stat() and
>> kio::stat()?
>
> Err, you know the answer is in the docs, right?
> stat() = unix-only, local files only, sync
> KIO::stat() = portable, network-transparent, async
>
>> >> Or follow your initial approach and not let KDirWatch do these stat
>> >> calls for new files and just re-list.
>> >
>> > You're confusing two layers. KDirLister[Cache] is the one doing the re-
>> > listing. But that's one layer above KDirWatch (which has other users than
>> > KDirLister).
>>
>> I actually wrote it down poorly. KDirWatch is sending a signal on
>> which KDirLister responds and re-lists the whole folder.
>
> Right. Which means the re-listing is unrelated to what KDirWatch has to do
> before emitting the signal.
>
>> Funny thing is that KDirWatch already has the proper API for this. It
>> just doesn't seem to be used.
>>
>> It has the signals:
>> created(QString)
>> deleted(QString)
>> dirty(QString)
>
> I know that.
>
>> where dirty is a bit abused i guess ;)
>
> Yes and no. As I said, it's all we know when using other backends than
> inotify.
>
>> > It used to be. Then it moved down to kdecore because, well, it's not only
>> > useful in relation to KIO (the network transparency technology).
>> > Just like QFileSystemWatcher is also in QtCore.
>>
>> So where is it going to end up in KF5?
>
> KCoreAddons. (Doesn't really matter for this discussion, does it?)
>
> Well, unless you want to work on QFileSystemWatcher, which would be even
> better in the long run. But you would start from even further there, it
> doesn't even have created/deleted signals (although I thought I saw someone
> starting to work on that a year ago or so... clearly that wasn't completed).
>
>> + a little idea. While monitoring the inotify events i noticed that
>> you can have quite a bit of stat overhead just by updating a file in a
>> way that dolphin is doing it. The .directory. It is probably the
>> safest way with minimal risk of data loss.
>> So now i was thinking, if i try to improve this code to actually emit
>> the create and delete signals as well then in this case you will end
>> up with some needless signals. Lets take my initial example, these are
>> send from inotify:
>>
>> (CREATE signal) /home/mark/massive_folder/.directory.lock
>> (MODIFY signal) /home/mark/massive_folder/.directory.lock
>> (CREATE signal) /home/mark/massive_folder/.directoryc13357.new
>> (MODIFY signal) /home/mark/massive_folder/.directoryc13357.new
>> (DELETE signal) /home/mark/massive_folder/.directoryc13357.new
>> (CREATE signal) /home/mark/massive_folder/.directory
>> (DELETE signal) /home/mark/massive_folder/.directory.lock
>
> Right. Which shows that inotify has very low-level notifications, while
> ideally we'd like KDirWatch to only say created(".directory") as a result of
> all this...
>
>> in a fixed KDirWatch that would become:
>>
>> (CREATED signal) /home/mark/massive_folder/.directory.lock
>> (DIRTY signal) /home/mark/massive_folder/.directory.lock
>> (CREATED signal) /home/mark/massive_folder/.directoryc13357.new
>> (DIRTY signal) /home/mark/massive_folder/.directoryc13357.new
>> (DELETED signal) /home/mark/massive_folder/.directoryc13357.new
>> (CREATED signal) /home/mark/massive_folder/.directory
>> (DELETED signal) /home/mark/massive_folder/.directory.lock
>
> We don't want that. See below.
>
>> That's just in a few miliseconds while the end result is a dirty file:
>> /home/mark/massive_folder/.directory
>>
>> The temporary file and the lock file both get deleted and where not
>> there in the first place so there is no need to send any signal about
>> those files.
>>
>> That makes me wonder if we could prevent some signals to be send by
>> delaying the signal sending for - lets say - 100ms. If during that
>> time another inotify event comes by that makes it pointless to send a
>> created signal (for example an inotify that tells us the file is
>> deleted again, like the lock file) then we simply don't send any
>> signal for that file. This would be somewhat the same like the
>> batching i implemented for listDir. It would be sending
>> created/deleted/dirty signals if needed for every 100ms as long as
>> inotify events stream in.
>
> That's already a bit like what KDirWatch slotRescan is doing - it's called
> after a timer, in order to compress some events.
>
> Note that this is also why there are additional stat() calls. If I remember
> the KDirWatch code correctly, we basically interpret the inotify signals as
> "check this path in 500ms" and 500ms later we use stat() to check if the file
> is still there, and if anything about it changed, etc. So I think this is
> already there to some extent, please look into the signals that are actually
> emitted by KDirWatch.
>
> I agree that we don't want a one-for-one mapping with the inotify events, but
> that's not what we do anyway.
>
>> This will obviously not prevent every case. If a file got added or
>> removed just after the 100ms trigger then you'd end up with the same
>> signals as in the above example. But it has the potential to prevent
>> signals from being send for small file changes.
>
> This isn't about a recurring timer, but a single-shot timer after the first
> event. So as long as the overall operation takes less 100ms the compression
> will happen.
>
> --
> David Faure, faure at kde.org, http://www.davidfaure.fr
> Working on KDE, in particular KDE Frameworks 5
>

Just responding after all of the above.

Thank you very much for all your input, you've helped me tremendously here :)
Next up, i'il make a patch (with the unittests still working!) and
report back or with a followup discussion if i encounter unforeseen
issues (not unlikely in this area).




More information about the kde-core-devel mailing list