Desktop search in KDE -- Baloo configuration, tweaking, functionality, docs?

Mark Rousell mark.rousell at signal100.com
Tue Apr 14 01:27:46 BST 2020


I posted a version of the following on the openSUSE mailing list but
didn't get any responses (other than a suggestion to ask here). I was
asking in the context of openSUSE Tumbleweed but I suspect that the
issues apply to any distribution running KDE. I am not wedded to
openSUSE, so my question is really about KDE and Baloo.

I know that KDE can do what I need; it's a matter of tweaking and UI
integration.



I'm new to KDE (and openSUSE) and I'm trying to get my head around
the file content search capabilities of KDE and Baloo. Is anyone
familiar with Baloo in KDE (on openSUSE or any other Linux distribution
that uses KDE)?

In particular at the moment, my requirements/expectations are:
(1) Dynamic indexing of new file content and changed file content. I'd
expect this to be almost instant on a freshly installed system with only
a few files.
(2) Search of arbitrary file types from the GUI.

On point (1):
I've successfully set Baloo to index file contents and it can do so when
forced (which I've been able to confirm using the "balooshow" command).
However, Baloo doesn't seem to want to dynamically index new or changed
files. For example, I create a new text file with some unique content,
try to search the file contents from the KDE GUI, but the contents are
not found until I issue a "balooctl check" command to force re-indexing.
As far as I can see, Baloo is supposed to use a file system watcher to
see file changes and re-index new/changed files dynamically but it
doesn't seem to be doing it. If it is just waiting, how long do I have
to wait? And can I increase its responsiveness?
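
To put a number on the delay in point (1), something like the following
rough Python sketch could be used. It writes a file containing a unique
word and polls "baloosearch" until the file is listed. The helper names
(wait_until_indexed, probe), the probe file name and the timings are all
made up for illustration, and it assumes that "baloosearch" prints one
matching path per line.

```python
import shutil
import subprocess
import time
import uuid
from pathlib import Path


def baloosearch(term):
    """Return the output of `baloosearch <term>`, or "" if the tool is not installed."""
    if shutil.which("baloosearch") is None:
        return ""
    done = subprocess.run(["baloosearch", term], capture_output=True, text=True)
    return done.stdout


def wait_until_indexed(path, term, timeout=120.0, interval=2.0, search=baloosearch):
    """Poll until the file name of `path` shows up in the results for `term`.

    Returns the seconds waited, or None if `timeout` passes first.
    """
    start = time.monotonic()
    while True:
        if Path(path).name in search(term):
            return time.monotonic() - start
        if time.monotonic() - start >= timeout:
            return None
        time.sleep(interval)


def probe(directory=Path.home() / "Desktop"):
    """Write a unique word into a probe file and time how long Baloo takes to find it."""
    # Short alphanumeric token; it appears only in the content, not the file name,
    # so a hit means the content (not just the name) was indexed.
    token = "probe" + uuid.uuid4().hex[:8]
    target = Path(directory) / "baloo-probe.txt"  # hypothetical file name
    target.write_text(f"{token}\n")
    return target, wait_until_indexed(target, token)
```

Calling probe() returns the probe file's path and either the number of
seconds until it became searchable or None if it never did within the
timeout. If it returns None but the file is found after a manual
"balooctl check", that would match the behaviour I described above.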

On point (2):
Both KRunner and the search box on the KDE start menu appear to filter
the file types for which they return results, but I can't see how to
alter this. For example, I can create a text file (file.txt) and an HTML
file (file.htm) in ~/Desktop, both containing the text "jabberwocky". If I
then use "balooctl check" to make sure that they are both indexed, I can
confirm that their contents have both been successfully indexed by using
"balooshow -x file.txt" or "balooshow -x file.htm". These two commands
correctly show the indexed words in the two files. If I use the
"baloosearch jabberwocky" command it will correctly return the two
files: "file.txt" and "file.htm". BUT, typing "jabberwocky" into either
the KDE start menu's search box or KRunner will only list the text file,
not the HTML file. In other words, it seems that the GUI is filtering
the file types for which it interrogates the index. How can I force it
to search for all file types unless I explicitly choose to filter them?
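
One way to probe point (2) from the command line would be to group the
"baloosearch" results by file extension, with and without a type filter,
and compare that against what the GUI lists. The Python sketch below is
only an illustration: the function names are made up, it assumes
"baloosearch" prints one absolute path per line, and it assumes that
"baloosearch" accepts a "-t" option to restrict results by type (worth
checking against "baloosearch --help" first).

```python
import shutil
import subprocess
from pathlib import PurePosixPath


def search_command(term, type_name=None):
    """Build the baloosearch command line, optionally restricted to one type."""
    cmd = ["baloosearch"]
    if type_name is not None:
        cmd += ["-t", type_name]  # assumed option; verify with `baloosearch --help`
    return cmd + [term]


def hits_by_suffix(output):
    """Group result paths by file extension, ignoring non-path lines."""
    groups = {}
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("/"):
            groups.setdefault(PurePosixPath(line).suffix, []).append(line)
    return groups


def search(term, type_name=None):
    """Run baloosearch and return its hits grouped by extension ({} if not installed)."""
    if shutil.which("baloosearch") is None:
        return {}
    done = subprocess.run(search_command(term, type_name), capture_output=True, text=True)
    return hits_by_suffix(done.stdout)
```

Comparing search("jabberwocky") with, say, search("jabberwocky",
"Document") might show whether the HTML file drops out once a type is
specified. "Document" here is a guess at a type name, not something I
have confirmed.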

I've not yet tested beyond text and HTML file types.

If anyone can help me with this or point me to somewhere else to ask,
I'd be really grateful.

The above was on openSUSE Tumbleweed but I can move to any other Linux
distribution if need be.

For what it's worth, my use case for this is as follows: I have a large
collection (about 1.5 million files) of various document file types,
including text, HTML, MHTML, PDF, XPS, DOC, DOCX, ODT, and various
others, as well as some image file types that have embedded metadata
(e.g. description text or tags). I currently use a different desktop
search system to index and search these files and metadata and I'd like
to migrate to Baloo under KDE. In principle, I know that Baloo can do
all this but the details (e.g. dynamic index update) and KDE GUI
integration don't seem to be quite as transparent or smooth as I'd have
expected.
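
With a collection this size, one thing that may be worth ruling out is
the kernel's inotify watch limit. The sketch below rests on an
assumption I have not verified: that Baloo's file system watcher uses
Linux inotify with roughly one watch per directory. Under that
assumption it counts the directories under a folder and compares the
total with /proc/sys/fs/inotify/max_user_watches. The function names are
made up for illustration.

```python
import os
from pathlib import Path

# Standard Linux location of the per-user inotify watch limit.
INOTIFY_LIMIT = Path("/proc/sys/fs/inotify/max_user_watches")


def count_directories(root):
    """Count `root` itself plus every directory below it (symlinks are not followed)."""
    return sum(1 for _ in os.walk(root))


def inotify_watch_limit(limit_file=INOTIFY_LIMIT):
    """Return the watch limit as an int, or None if it cannot be read."""
    try:
        return int(Path(limit_file).read_text().strip())
    except (OSError, ValueError):
        return None


def watches_fit(root, limit_file=INOTIFY_LIMIT):
    """True if one watch per directory under `root` fits in the limit, None if unknown."""
    limit = inotify_watch_limit(limit_file)
    if limit is None:
        return None
    return count_directories(root) <= limit
```

If watches_fit() on the document folder came back False, that might be
one explanation for changes going unnoticed, although it would not
explain the behaviour on a fresh system with only a few files.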

The desktop search system I currently use is Windows Search on Windows
10 and Windows Server. It is extremely fast in both indexing (virtually
instant for new/changed files with no noticeable effect on system speed)
and search. Also the GUI integration is outstanding (available from the
start menu and all file manager windows, and accessible from apps via
API). However, I'd rather move from Windows to Linux. As I mentioned, I
know that Baloo is capable of file content indexing/searching (and I
think metadata too) but the actual implementation in KDE (at least as
implemented on openSUSE Tumbleweed) does seem to be lacking in
smoothness, for want of a better word. So I am wondering how I can tweak
it as per my two main points above.



-- 
Mark Rousell






