Review Request 111462: Prevent a flood of file dirty signals from causing tons of stat calls in KDirLister

Dawit Alemayehu adawit at kde.org
Sun Jul 14 19:11:32 BST 2013



> On July 9, 2013, 1:38 p.m., David Faure wrote:
> > To save 2 stat() per second, you want the user to not be notified anymore that his download is going well and the local file is growing? I don't think this is a good compromise at all. How is 2 stat calls per second "a flood" ?
> > 
> > (I'm thinking of the use case of downloading one very big file; copying tons of small files is a different issue)
> 
> Dawit Alemayehu wrote:
>     To me the number stats, which btw are 4 and not 2 because of the additional destination directory stat, simply seem to be an overkill. If you copy just 4 files that is 16 stats every second until these downloads complete. How is this any different in terms of impact to users with their home partition on nfs/smb mounts? In fact this code is exactly in the same path as the stat problem I fixed recently with https://git.reviewboard.kde.org/r/110834/. Granted this current problem only applies if the destination a local file.
>     
>     Anyhow, I understand the implication of my patch and have said as much when I posted it. It would mean the user would not see the size increase by simply looking at the size column in Konqueror/Dolphin details view. Having said that, would it really matter to the user if the update happened every second instead of every half second? That by itself would cut down the # of stats by half. And if I could figure out the source of the continuous destination directory stat and stop that, we would have reduced the number of stats by 3/4 without really impacting the user's ability to check on the progress of file downloads.

Well. I found out the source that continually stats the destination directory. It is KDirWatch. Basically, here is what happens on my machine (Linux box):

1. KDirWatch calls KDirWatchPrivate::inotifyEventReceived when it receives an event from inotify.
2. KDirWatchPrivate::inotifyEventReceived determines the event type (modification of a file), appends the changed filename to the m_pendingFileChanges list and starts the rescan timer (500 msec).
3. The rescan timer fires and invokes KDirWatchPrivate::slotRescan.
4. KDirWatchPrivate::slotRescan marks all entries with m_mode set to "INotifyMode" (inotify) or QFSWatchMode as dirty and then iterates through these entries to deal with any changes. Inside of this iterator, it calls KDirWatchPrivate::scanEntry to find out what changed about a given entry. 
5. KDirWatchPrivate::scanEntry uses an entries m_mode settings to determine what to do. In case of inotify it does the following:

  if (e->m_mode == FAMMode || e->m_mode == INotifyMode) {
    // we know nothing has changed, no need to stat
    if(!e->dirty) return NoChange;
    e->dirty = false;
  }

That is a problem because all the entries were already dirtied in step #4. Even though inotify informed us explicitly that a file we are downloading that changed, we have marked the destination directory as dirty. Hence, scanEntry will end up doing a stat call on it over and over again until the file(s) we are downloading are done. The question I have here is why? Is this intentional? Or are we doing something wrong in either slotRescan or scanEntry?


- Dawit


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://git.reviewboard.kde.org/r/111462/#review35788
-----------------------------------------------------------


On July 9, 2013, 1:05 p.m., Dawit Alemayehu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://git.reviewboard.kde.org/r/111462/
> -----------------------------------------------------------
> 
> (Updated July 9, 2013, 1:05 p.m.)
> 
> 
> Review request for kdelibs and David Faure.
> 
> 
> Description
> -------
> 
> The attached patch changes the logic that attempts to deal with a flood of the dirty signals received by KDirLister when existing files are change. The current code simply puts the file in an update pending list and starts a delayed timer (500 ms) to process the request. As a result, every half a second or so the pending update request is processed. This would have been fine were it not for the fact that the processing results in a call to KFileItem::refersh which does a stat the local file. Hence, a stat is done on the file being downloaded every 500 ms. Additionally, some other code is also doing a stat every half second on the download destination directory as well.
> 
> This patch attempts to prevent the flood of stat calls by restarting the update timer if we get another dirty signal on a file that is already in the update pending list. IOW, we only do the last stat call. The downside of doing that is the information about the file being download will be stale until the download finishes. Unfortunately I still have not been able to figure out what is doing the stat on the destination directory itself and hence do not have a solution for that yet. Any ideas?
> 
> 
> Diffs
> -----
> 
>   kio/kio/kdirlister.cpp aad6893 
> 
> Diff: http://git.reviewboard.kde.org/r/111462/diff/
> 
> 
> Testing
> -------
> 
> - strace an instance of Konqueror or Dolphin.
> - copy a file from a remote location (ftp,sftp, smb) to your local file system.
> - check if there is a flood of stat calls on both the file being copied and its destination directory.
> 
> 
> Thanks,
> 
> Dawit Alemayehu
> 
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.kde.org/pipermail/kde-core-devel/attachments/20130714/505ee0ad/attachment.htm>


More information about the kde-core-devel mailing list