[Nepomuk] Fwd: Patches and issues with search/strigi

Daniel Winter dw at danielwinter.de
Fri Jan 1 17:35:53 CET 2010


Sorry, didn't know that the list has been moved.


---------- Forwarded message ----------
From: Daniel Winter <dw at danielwinter.de>
Date: 2010/1/1
Subject: Patches and issues with search/strigi
To: nepomuk-kde <nepomuk-kde at semanticdesktop.org>


Hello everyone,

found some time today to at least try out the "new Nepomuk" (trunk).

Found some issues and fixed two/three of them.

Filewatch.patch: (against filewatch service)

The nfo::fileName wasn't updated when a file was moved. (The search
ioslave uses it to get the filename.) The patch fixes this.

dolphinsearch.patch: (against dolphin)

Two issues fixed by this patch:
- Date - Today was an "is equal" query, but it works on timestamps, so
it gives no results most of the time. Changed it to "is equal or
greater". In theory it will also hit files from the future, but they
shouldn't exist anyway.

(There are more similar but harder-to-fix issues. For example, a search
for files equal to or less than 1.1.2010 will not find files from
today, because it compares against the start of the current day, and
files from today are not equal to or less than that.)

- With the patch it searches in NIE::lastModified instead of
NAO::lastModified. I think that is what the user cares about (the
modified date of the file, not when it was tagged or something). This
way one actually gets the expected results. (Or do you really think
NAO is correct?)
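
To make the date issue above concrete, here is a minimal sketch in
Python (not the actual Dolphin/Nepomuk code; all function names are
hypothetical). It assumes the stored value is a full timestamp and the
user picks a calendar day, and treats the day as a half-open interval
from its start to the start of the next day.

```python
from datetime import date, datetime, time, timedelta


def day_bounds(day):
    """Return the half-open interval [start, end) covering one calendar day."""
    start = datetime.combine(day, time.min)
    return start, start + timedelta(days=1)


def matches_equal_naive(modified, day):
    # Roughly what an "is equal" query does: only a timestamp of exactly
    # midnight matches, so almost nothing is found.
    return modified == datetime.combine(day, time.min)


def matches_on(modified, day):
    # "Modified on this day": inside [start of day, start of next day).
    start, end = day_bounds(day)
    return start <= modified < end


def matches_on_or_before(modified, day):
    # "Equal or less than this day": compare against the END of the day,
    # not its start, so files from the day itself are included.
    _, end = day_bounds(day)
    return modified < end
```

With an upper bound like this, "today" would also stop matching files
from the future, which the plain "is equal or greater" change does not
do.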

Can I commit those?


There are some more issues, which I don't have the time or knowledge
(atm) to fix:


- strigiservice  monitoring:

It doesn't catch changes in files. It recurses over the directories
looking for a changed modified time. The problem is: the directory
modified time only changes when one renames, creates or deletes a file
in the directory, not when a file itself gets changed (like its
modification date or content). I think catching this is important.
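
As an illustration only (Python, not the strigiservice code), a scan
that compares each file's own modification time would catch such
changes. The `indexed_mtimes` dict is a hypothetical stand-in for
whatever the index stores per file; the obvious cost is one stat call
per file instead of one per directory.

```python
import os


def find_changed_files(root, indexed_mtimes):
    """Return paths under root whose mtime differs from the indexed value.

    indexed_mtimes is a hypothetical {path: mtime} mapping; files missing
    from it are reported as changed (i.e. new).
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.stat(path).st_mtime
            except OSError:
                continue  # file vanished while scanning
            if indexed_mtimes.get(path) != mtime:
                changed.append(path)
    return sorted(changed)
```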

There is no obvious solution to this. Inotify would be one, but there
is the limit on handles. I think it is the right way anyway. Most users
will not add all directories to strigi/nepomuk and therefore never hit
the limit. Nepomuk could count the number of directories the user has
in the indexed dirs and check whether enough inotify handles are
configured in the system. (It should leave at least 30% of them to
other applications.) If there are enough (respecting the reserve for
the rest) it should use them.

The KDE filewatch thing (the one the filewatch service uses) could be
a fallback for systems with not enough inotify handles or no inotify
at all.
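
A rough sketch of that check, in Python and under some assumptions:
one inotify watch per indexed directory, the Linux per-user limit read
from /proc/sys/fs/inotify/max_user_watches, and the 30% reserve from
above. The function names are made up for illustration, and the count
ignores watches already held by other applications.

```python
import os

RESERVED_FRACTION = 0.30  # share of watches left to other applications


def read_max_user_watches(path="/proc/sys/fs/inotify/max_user_watches"):
    """Return the per-user inotify watch limit, or None if unavailable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None  # no inotify (or not Linux): caller should fall back


def count_directories(roots):
    """Count all directories (roots included) under the indexed folders."""
    total = 0
    for root in roots:
        for _dirpath, _dirnames, _filenames in os.walk(root):
            total += 1
    return total


def should_use_inotify(dir_count, max_watches, reserved=RESERVED_FRACTION):
    """True if watching dir_count directories stays within the budget."""
    if max_watches is None:
        return False
    return dir_count <= max_watches * (1.0 - reserved)
```

If `should_use_inotify(count_directories(dirs), read_max_user_watches())`
is false, the service would use the fallback watcher instead.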

- Search result caching/auto updating:

There seems to be a regression there. You have to wait a long time
or restart Nepomuk to find newly tagged files, for example.

1. Search for files tagged with "x"
2. Tag another file with "x"
3. Search for files tagged with "x" again

The file will not show up for quite some time (or until a restart of
Nepomuk). It used to get updated in realtime in 4.3.

- Nepomuk IO-Slave URL/forwarding/file handling:

Not sure what exactly is happening or if it is intended to work this
way. From my point of view (as a user of it) it is a regression.

Opening search results from Dolphin opens the nepomuk:// URI or
something. You get some really weird filename shown in the
applications. One could live with that. The problem is: non-KDE
applications get some tmp file, and it doesn't look like writing
to this file is possible. (KIO asks if I want to upload the file, and
if I say yes it tells me that writing to Nepomuk is not supported.)

I believe the really slow preview of files (for example images) in
Dolphin when there are a lot of them is related to this. Every single
preview results in a request to Nepomuk, which opens and closes a
connection to soprano on every request (this seems to be a general
issue?).

Also, somehow the caching of Nepomuk::ResourceManager doesn't really
work? It opens a new Soprano connection to virtuoso on every single
mouse-over on the results of a search in Dolphin.

- Dolphin search (or the search io slave) should have a (default)
limit of results.


I am just reporting these. I most likely will not find the time to
look further into fixing them. I can try a patch or provide more
details if needed, though.

Daniel
-------------- next part --------------
A non-text attachment was scrubbed...
Name: filewatch.patch
Type: application/octet-stream
Size: 1244 bytes
Desc: not available
Url : http://mail.kde.org/pipermail/nepomuk/attachments/20100101/f21d00ce/attachment.dll 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dolphinsearch.patch
Type: application/octet-stream
Size: 3296 bytes
Desc: not available
Url : http://mail.kde.org/pipermail/nepomuk/attachments/20100101/f21d00ce/attachment-0001.dll 

