[Nepomuk] Review Request: Reduce number of watches created by nepomukfilewatch

Simeon Bird bladud at gmail.com
Tue Aug 21 03:59:52 UTC 2012



> On Aug. 20, 2012, 9:10 a.m., Sebastian Trueg wrote:
> > This is an absolute no-go for several reasons:
> > 1. Even non-indexed files can have annotations like tags or comments or download locations, etc.
> > 2. The move-into-unwatched dir problem you already mention.
> > 3. Nepomuk's index-folder system is more complex. It allows indexing sub-folders of non-indexed folders.
> > 
> > Sadly there is currently no way around installing this many watches. It needs to be fixed in the kernel.
> > 
> > The qstrnlen and kernel version fixes are good though.
> 
> Simeon Bird wrote:
>     Well - is there some way to compromise a bit here? 
>     
>     Sorry to be argumentative, I imagine you have to explain this a lot. 
>     
>     The problem I'm trying to solve is not so much the number of watches, 
>     as to provide finer control over which directories are watched.
>     
>     There are (I think) good reasons for not wanting to watch certain specific directories,
>     much the same reasons for which there is already an option to not watch removable drives.
>     
>     In my case ~/sshfs was (basically) a removable network drive, which just happened 
>     to be mounted under $HOME. CMakeFiles is similarly a temporary directory, whose contents can change rapidly.
>     
>     I know, because I made these directories, that they are just temporary storage mounted in an unusual way,
>     so they don't need to be watched, but I need a way to tell nepomuk this.
>     Currently nepomuk watches $HOME but not hidden folders (I think), which is already a heuristic;
>     what I want to do is add a way to tune this heuristic by hand.
>     
>     I actually always (mis-)understood (from long before I read the code) that "index these folders" meant 
>     "these folders are the ones nepomuk cares about" which is why I tried to re-use the index list 
>     for this feature. However, from what you say this is not a good choice. 
>     Perhaps one could add another list of "do not watch these folders, 
>     they are really removable drives" to the kcm? 
>     
>     Or (better) perhaps re-use the filter list? The default for that contains 
>     mostly temporary build files already.
>     
>     So then we would have:
>     - Folders on the index list are watched and indexed.
>     - All other folders in $HOME not on the filter list are watched.
>     - Folders on the filter/do-not-watch list and their subfolders are neither watched nor indexed.
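
The three rules above can be sketched as a small classification function. This is only an illustration in plain C++: WatchPolicy, isUnder and classify are hypothetical names, not anything in nepomukfilewatch, and letting the filter list win over the index list is an assumption (the proposal does not say which takes precedence, and it does not cover indexed sub-folders of non-indexed folders). The restriction to $HOME is left out for brevity.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical names for illustration; none of these exist in nepomukfilewatch.
enum class WatchPolicy { WatchAndIndex, WatchOnly, Ignore };

// True if `path` equals `root` or lies below it (no trailing slashes expected).
static bool isUnder(const std::string& path, const std::string& root) {
    if (path.compare(0, root.size(), root) != 0) return false;
    return path.size() == root.size() || path[root.size()] == '/';
}

// Assumption: the filter/do-not-watch list takes precedence over the index list.
WatchPolicy classify(const std::string& path,
                     const std::vector<std::string>& indexList,
                     const std::vector<std::string>& filterList) {
    assert(!path.empty());
    for (const auto& filtered : filterList)
        if (isUnder(path, filtered)) return WatchPolicy::Ignore;
    for (const auto& indexed : indexList)
        if (isUnder(path, indexed)) return WatchPolicy::WatchAndIndex;
    return WatchPolicy::WatchOnly;  // watched, but not indexed
}
```

With "/home/u/sshfs" on the filter list, everything below it would be ignored, while a sibling such as "/home/u/sshfs2" would still be watched, because the prefix test only matches at a path separator.
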
>     
>     What do you think? Could you consider accepting something like that? 
>     I agree that fixing the problem in the kernel is the right way of doing things, 
>     but until that happens we're stuck with heuristics and best guess behaviour, so 
>     we might as well try to make the best guesses as good as we can.
>     
>     Thanks,
>     Simeon

PS: qstrnlen and kernel versions pushed and cherry-picked to 4.9, thanks.
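
For reference, the kernel-version fix amounts to accepting release strings with only two numbers, like "3.0". Below is a minimal sketch of that idea, not the pushed code: KernelVersion, parseKernelVersion and kernelHasInotify are hypothetical names, and the 2.6.13 threshold is simply the release in which inotify was merged into mainline Linux; the threshold the real check uses may differ.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper types for illustration only.
struct KernelVersion { int major; int minor; int patch; };

// Accepts "3.0", "3.0-rc1" and "2.6.32-5-amd64" alike.
// Returns false if fewer than two numbers could be read.
bool parseKernelVersion(const char* release, KernelVersion& out) {
    assert(release != nullptr);
    int major = 0, minor = 0, patch = 0;
    const int fields = std::sscanf(release, "%d.%d.%d", &major, &minor, &patch);
    if (fields < 2) return false;
    out = KernelVersion{major, minor, fields >= 3 ? patch : 0};
    return true;
}

// Assumption: 2.6.13 (the mainline release that merged inotify) is the cutoff.
bool kernelHasInotify(const KernelVersion& v) {
    if (v.major != 2) return v.major > 2;
    if (v.minor != 6) return v.minor > 6;
    return v.patch >= 13;
}
```

The point of the sketch is that a missing third number is treated as zero instead of being treated as a parse failure.
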


- Simeon


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://git.reviewboard.kde.org/r/106086/#review17745
-----------------------------------------------------------


On Aug. 19, 2012, 5:50 p.m., Simeon Bird wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://git.reviewboard.kde.org/r/106086/
> -----------------------------------------------------------
> 
> (Updated Aug. 19, 2012, 5:50 p.m.)
> 
> 
> Review request for Nepomuk, Vishesh Handa and Sebastian Trueg.
> 
> 
> Description
> -------
> 
> Current master nepomukfilewatcher installs watches on all sub-folders of a watched folder.
> 
> This is problematic:
> 
> - It means we have to walk the entire directory tree, even for non-indexed folders.
> This is quite a lot of work if you happen to have a large, complex directory structure
> mounted over a network in your $HOME (as I do).
> 
> - It means we get inotify watches for directories which are on the filter list; e.g., on this computer
> $HOME/build/nepomuk-core/services/filewatch/CMakeFiles/nepomukfilewatch.dir/__/fileindexer
> is watched, causing the filewatcher to go nuts every time I build something.
> 
> - It means we install many more watches than we need to, vastly increasing the probability  
> of hitting the inotify limit.
> 
> This code instead walks the tree until it finds a folder we don't want to index and then STOPS. 
> I couldn't find a way to avoid walking the whole tree with QDirIterator and QDir::Subdirectories, 
> so I use QDirIterator without subdirectories, then create a new QDirIterator for each subfolder to index.
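
The pruned walk described above can be sketched as follows. To keep the example self-contained it uses std::filesystem instead of QDirIterator, so it is an analogue of the approach and not the patch itself. watchTree, ShouldWatch and AddWatch are hypothetical names: the first callback stands in for the filter check, the second for installing a single inotify watch and reporting whether it succeeded.

```cpp
#include <cassert>
#include <filesystem>
#include <functional>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical callbacks: shouldWatch plays the role of the filter check,
// addWatch installs one watch and returns false when no watches are left.
using ShouldWatch = std::function<bool(const fs::path&)>;
using AddWatch = std::function<bool(const fs::path&)>;

// Walks `dir` one level at a time and only descends into folders we want.
// Returns false as soon as addWatch fails, so the caller stops adding more.
bool watchTree(const fs::path& dir, const ShouldWatch& shouldWatch,
               const AddWatch& addWatch) {
    assert(shouldWatch && addWatch);
    if (!shouldWatch(dir)) return true;   // pruned: nothing below is visited
    if (!addWatch(dir)) return false;     // out of watches: give up
    std::error_code ec;
    for (fs::directory_iterator it(dir, ec), end; !ec && it != end; it.increment(ec)) {
        std::error_code typeEc;
        // Skip files, and skip symlinks to avoid walking in circles.
        if (it->is_symlink(typeEc) || !it->is_directory(typeEc)) continue;
        if (!watchTree(it->path(), shouldWatch, addWatch)) return false;
    }
    return true;
}
```

Because each level gets its own non-recursive iterator, an unwanted folder costs one check and its contents are never listed, which is what matters for a large tree mounted over the network.
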
> 
> I can see two objections to this change: 
> 
> 1) If someone moves a file into an ignored directory, they will now presumably lose their metadata. 
> This is true, but in my opinion not a big problem; the default configuration is to watch
> $HOME minus temporary build directories. If people are moving files into temporary 
> directories they should probably lose the metadata, and if people manually add directories
> to the ignore list they probably have a good reason and expect nepomuk to ignore them. 
> 
> 2) I changed filterWatch from always returning true to returning true if we want to watch the
> file and false otherwise. I couldn't work out the reason for it always returning true before, 
> so whatever it was, I've probably broken it. 
> 
> Bonus fixes:
> 
> - Properly pass the return value of addWatch up the tree, so that if we run out of watches,
> we stop trying to add more.
> 
> - Check for inotify on kernels that have a two-number version string, like 3.0
> 
> - To find the length of event->name, qstrlen was used. If an event is returned 
> for a file outside a watched directory, event->name will not be allocated, and qstrlen 
> may read beyond the end of allocated memory, causing chaos, anarchy and confusion. 
> Use qstrnlen instead.
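
To illustrate the last fix: a bounded length check never reads more than a given number of bytes, whether or not a terminating NUL is present. The sketch below uses only the standard library instead of Qt's qstrnlen; boundedLength is a hypothetical name. Using the inotify event's len field as the bound is an assumption about how the fix is applied (len counts the bytes of name including padding NULs, and is 0 when the event carries no name).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a bounded strlen such as qstrnlen.
// Returns the string length at `s`, reading at most `maxLen` bytes.
std::size_t boundedLength(const char* s, std::size_t maxLen) {
    assert(s != nullptr);
    const void* nul = std::memchr(s, '\0', maxLen);
    if (nul == nullptr) return maxLen;  // no terminator within the bound
    return static_cast<std::size_t>(static_cast<const char*>(nul) - s);
}
```

With a bound of 0 the function returns 0 without reading any bytes, which covers the case of an event that has no name at all.
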
> 
> Thanks, let me know what you think.
> 
> 
> Diffs
> -----
> 
>   services/filewatch/kinotify.h ab12d66 
>   services/filewatch/kinotify.cpp 509abff 
>   services/filewatch/nepomukfilewatch.cpp 9fd5d9c 
> 
> Diff: http://git.reviewboard.kde.org/r/106086/diff/
> 
> 
> Testing
> -------
> 
> Compiled, ran, used for a couple of days, checked which files were actually watched, timed the filewatch service's startup.
> 
> 
> Thanks,
> 
> Simeon Bird
> 
>
