[Nepomuk] Review Request: Reduce number of watches created by nepomukfilewatch

Simeon Bird bladud at gmail.com
Tue Aug 28 05:42:22 UTC 2012


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://git.reviewboard.kde.org/r/106086/
-----------------------------------------------------------

(Updated Aug. 28, 2012, 5:42 a.m.)


Review request for Nepomuk, Vishesh Handa and Sebastian Trueg.


Changes
-------

Yes, we have a bug, it's watching all folders, in the filter list or not. 
Introduced by aae24be923c0868725952e3df35b11bc76c7b4ce in June, I think.

The attached fixes the bug, by not watching folders in the filter list. 

The bonus this time is to delete an unneeded private variable, watchHiddenFolders.

Could we also add a string to the index folders kcm explaining this? Perhaps something like: 
"Files which are not indexed can still have metadata, but files which are on the filter list cannot." ?


Description
-------

Current master nepomukfilewatcher installs watches on all sub-folders of a watched folder.

This is problematic:

 - It means we have to walk the entire directory tree, even for not-indexed folders.
This is quite a lot of work if you happen to have a large complex directory structure
mounted over a network in your $HOME (as I do)

- It means we get inotify watches for directories which are on the filter list; eg, on this computer
$HOME/build/nepomuk-core/services/filewatch/CMakeFiles/nepomukfilewatch.dir/__/fileindexer
is watched, causing the filewatcher to go nuts every time I build something.

- It means we install many more watches than we need to, vastly increasing the probability  
of hitting the inotify limit.

This code instead walks the tree until it finds a folder we don't want to index and then STOPS. 
I couldn't find a way to avoid walking the whole tree with QDirIterator and QDir::Subdirectories, 
so I use QDirIterator without subdirectories, then create a new QDirIterator for each subfolder to index.

I can see two objections to this change: 

1) If someone moves a file into an ignored directory, they will now presumably lose their metadata. 
This is true, in my opinion not a big problem; the default configuration is to watch
$HOME minus temporary build directories. If people are moving files into temporary 
directories they should probably lose the metadata, and if people manually add directories
to the ignore list they probably have a good reason and expect nepomuk to ignore them. 

2) I changed filterWatch from always returning true to returning true if we want to watch the
file and false otherwise. I couldn't work out the reason for it always returning true before, 
so whatever it was, I've probably broken it. 

Bonus fixes:

 - Properly pass the return value of addWatch up the tree, so that if we run out of watches, 
we stop trying to add more.

- Check for inotify on kernels that have a two-number version string, like 3.0

- To find the length of event->name, qstrlen was used. If an event is returned 
for a file outside a watched directory, event->name will not be allocated, and qstrlen 
may read beyond the end of allocated memory, causing chaos, anarchy and confusion. 
Use qstrnlen instead.

Thanks, let me know what you think.


Diffs (updated)
-----

  services/filewatch/kinotify.h ab12d66 
  services/filewatch/kinotify.cpp 4a744d4 
  services/filewatch/nepomukfilewatch.cpp 9fd5d9c 

Diff: http://git.reviewboard.kde.org/r/106086/diff/


Testing
-------

Compiled, run, used for a couple of days, checked which files were actually watched, timed the filewatch service's startup.


Thanks,

Simeon Bird

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.kde.org/pipermail/nepomuk/attachments/20120828/0650a769/attachment-0001.html>


More information about the Nepomuk mailing list