KDirWatch bug and the analysis. Help is welcome!

David Faure faure at kde.org
Fri Aug 2 14:41:31 BST 2013


On Friday 02 August 2013 15:07:19 Mark wrote:
> Just for my view on this. What's the difference between stat() and
> kio::stat()?

Err, you know the answer is in the docs, right?
stat() = unix-only, local files only, sync
KIO::stat() = portable, network-transparent, async

> >> Or follow your initial approach and not let KDirWatch do these stat
> >> calls for new files and just re-list.
> > 
> > You're confusing two layers. KDirLister[Cache] is the one doing the re-
> > listing. But that's one layer above KDirWatch (which has other users than
> > KDirLister).
> 
> I actually wrote it down poorly. KDirWatch is sending a signal on
> which KDirLister responds and re-lists the whole folder.

Right. Which means the re-listing is unrelated to what KDirWatch has to do 
before emitting the signal.

> Funny thing is that KDirWatch already has the proper API for this. It
> just doesn't seem to be used.
> 
> It has the signals:
> created(QString)
> deleted(QString)
> dirty(QString)

I know that.

> where dirty is a bit abused i guess ;)

Yes and no. As I said, it's all we know when using other backends than 
inotify.

> > It used to be. Then it moved down to kdecore because, well, it's not only
> > useful in relation to KIO (the network transparency technology).
> > Just like QFileSystemWatcher is also in QtCore.
> 
> So where is it going to end up in KF5?

KCoreAddons. (Doesn't really matter for this discussion, does it?)

Well, unless you want to work on QFileSystemWatcher, which would be even 
better in the long run. But you would start from even further there, it 
doesn't even have created/deleted signals (although I thought I saw someone 
starting to work on that a year ago or so... clearly that wasn't completed).

> + a little idea. While monitoring the inotify events i noticed that
> you can have quite a bit of stat overhead just by updating a file in a
> way that dolphin is doing it. The .directory. It is probably the
> safest way with minimal risk of data loss.
> So now i was thinking, if i try to improve this code to actually emit
> the create and delete signals as well then in this case you will end
> up with some needless signals. Lets take my initial example, these are
> send from inotify:
> 
> (CREATE signal) /home/mark/massive_folder/.directory.lock
> (MODIFY signal) /home/mark/massive_folder/.directory.lock
> (CREATE signal) /home/mark/massive_folder/.directoryc13357.new
> (MODIFY signal) /home/mark/massive_folder/.directoryc13357.new
> (DELETE signal) /home/mark/massive_folder/.directoryc13357.new
> (CREATE signal) /home/mark/massive_folder/.directory
> (DELETE signal) /home/mark/massive_folder/.directory.lock

Right. Which shows that inotify has very low-level notifications, while 
ideally we'd like KDirWatch to only say created(".directory") as a result of 
all this...

> in a fixed KDirWatch that would become:
> 
> (CREATED signal) /home/mark/massive_folder/.directory.lock
> (DIRTY signal) /home/mark/massive_folder/.directory.lock
> (CREATED signal) /home/mark/massive_folder/.directoryc13357.new
> (DIRTY signal) /home/mark/massive_folder/.directoryc13357.new
> (DELETED signal) /home/mark/massive_folder/.directoryc13357.new
> (CREATED signal) /home/mark/massive_folder/.directory
> (DELETED signal) /home/mark/massive_folder/.directory.lock

We don't want that. See below.

> That's just in a few miliseconds while the end result is a dirty file:
> /home/mark/massive_folder/.directory
> 
> The temporary file and the lock file both get deleted and where not
> there in the first place so there is no need to send any signal about
> those files.
> 
> That makes me wonder if we could prevent some signals to be send by
> delaying the signal sending for - lets say - 100ms. If during that
> time another inotify event comes by that makes it pointless to send a
> created signal (for example an inotify that tells us the file is
> deleted again, like the lock file) then we simply don't send any
> signal for that file. This would be somewhat the same like the
> batching i implemented for listDir. It would be sending
> created/deleted/dirty signals if needed for every 100ms as long as
> inotify events stream in.

That's already a bit like what KDirWatch slotRescan is doing - it's called 
after a timer, in order to compress some events.

Note that this is also why there are additional stat() calls. If I remember 
the KDirWatch code correctly, we basically interpret the inotify signals as 
"check this path in 500ms" and 500ms later we use stat() to check if the file 
is still there, and if anything about it changed, etc. So I think this is 
already there to some extent, please look into the signals that are actually 
emitted by KDirWatch.

I agree that we don't want a one-for-one mapping with the inotify events, but 
that's not what we do anyway.

> This will obviously not prevent every case. If a file got added or
> removed just after the 100ms trigger then you'd end up with the same
> signals as in the above example. But it has the potential to prevent
> signals from being send for small file changes.

This isn't about a recurring timer, but a single-shot timer after the first 
event. So as long as the overall operation takes less 100ms the compression 
will happen.

-- 
David Faure, faure at kde.org, http://www.davidfaure.fr
Working on KDE, in particular KDE Frameworks 5





More information about the kde-core-devel mailing list