[KPhotoAlbum] Quick search proof of concept patch
Shawn Willden
shawn-kimdaba at willden.org
Sun May 13 07:51:45 BST 2007
A lot of ideas related to my "quick search" notion were kicked around, but I
felt like I needed to simplify it some to test out some ideas. Attached is
the not-so-quick-but-quite-dirty result.
I'd like to get feedback and performance results (the code is instrumented and
prints some timing data to stdout).
One of my biggest concerns for the idea was performance: would it be possible
to make it fast enough for interactive search-as-you-type usage? I decided
to focus on that first. Besides, I had some ideas for a nifty approach to
searching that would be fun to play with ;-)
This patch implements the basic idea with the following constraints:
1. It only works for XML DB.
2. It doesn't really manage the UI correctly; some stuff acts funny.
3. It is slow in the "home" view, though performance seems good to me in the
thumbnail view.
4. It doesn't support quoting for searching for keywords with embedded spaces.
5. It doesn't support "OR" -- all search terms are ANDed together.
6. It does integrate with "drilldown" searches and date bar searches, by the
simple expedient of applying the quick search first and then testing the
remaining images against the drilldown and date bar criteria.
7. It doesn't search on date or EXIF fields, only 'tags'. It does, however,
search all tags, including supercategories, case insensitively.
8. It searches only by tag "prefix". That is, "foo" will
match "foo", "foobar" and "footh", but not "myfoo".
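
To make the matching rules in items 5 through 8 concrete, here is a small Python sketch of the semantics only. It is not taken from the patch (which is C++), and the names `matches` and `filter_images` are hypothetical. It also assumes a term must be a prefix of the whole tag, not of a word inside the tag.

```python
def matches(query, tags):
    """Return True if every whitespace-separated term in `query` is a
    case-insensitive prefix of at least one tag (terms are ANDed)."""
    terms = query.lower().split()
    lowered = [tag.lower() for tag in tags]
    return all(
        any(tag.startswith(term) for tag in lowered)
        for term in terms
    )


def filter_images(images, query, other_criteria):
    """Apply the quick search first, then test the survivors against the
    remaining (drilldown / date bar) criteria, as described in item 6.

    `images` maps a hypothetical image id to its list of tags;
    `other_criteria` is any predicate on an image id.
    """
    quick = [img for img, tags in images.items() if matches(query, tags)]
    return [img for img in quick if other_criteria(img)]
```

With this, `matches("foo", ["Foobar"])` is true and `matches("foo", ["myfoo"])` is false, and a query like "i j" only matches images that have both a tag starting with "i" and a tag starting with "j".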
In addition to all of that, the code is somewhat ugly, doesn't follow
KPhotoAlbum style standards and generally needs a lot of work.
That said, my goal was to test the performance of my in-memory approach, and
I'm quite happy with the results, at least with my database on my system, and
I think it will also do well even on larger databases and slower machines.
In my case (Core Duo 2.2 GHz, 12.6K images, 1109 unique tags) building the
ternary search tree used for the lookups takes about 700 ms, and the
worst-case searches take under 40 ms. Most searches take under 2 ms.
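
For anyone who hasn't run into ternary search trees, here is a minimal Python sketch of the general technique: each node holds one character plus low/equal/high children, and a prefix lookup walks down to the node for the last character of the prefix and then collects everything below it. This is only an illustration under my own assumptions, not the code in the patch; the class and method names, and the choice to store a set of image ids at each tag's final node, are hypothetical.

```python
class Node:
    __slots__ = ("ch", "lo", "eq", "hi", "ids")

    def __init__(self, ch):
        self.ch = ch
        self.lo = self.eq = self.hi = None
        self.ids = set()  # ids of images carrying the tag that ends here


class TernarySearchTree:
    def __init__(self):
        self.root = None

    def insert(self, tag, image_id):
        key = tag.lower()  # index lowercased for case-insensitive lookup
        if key:
            self.root = self._insert(self.root, key, 0, image_id)

    def _insert(self, node, key, i, image_id):
        ch = key[i]
        if node is None:
            node = Node(ch)
        if ch < node.ch:
            node.lo = self._insert(node.lo, key, i, image_id)
        elif ch > node.ch:
            node.hi = self._insert(node.hi, key, i, image_id)
        elif i + 1 < len(key):
            node.eq = self._insert(node.eq, key, i + 1, image_id)
        else:
            node.ids.add(image_id)
        return node

    def prefix_search(self, prefix):
        """Return ids of all images with a tag starting with `prefix`."""
        key = prefix.lower()
        if not key:
            return set()
        node, i = self.root, 0
        while node is not None:
            ch = key[i]
            if ch < node.ch:
                node = node.lo
            elif ch > node.ch:
                node = node.hi
            else:
                i += 1
                if i == len(key):
                    break  # node matches the last prefix character
                node = node.eq
        if node is None:
            return set()
        result = set(node.ids)  # tags exactly equal to the prefix
        stack = [node.eq]       # longer tags live under the eq child
        while stack:
            n = stack.pop()
            if n is not None:
                result |= n.ids
                stack.extend((n.lo, n.eq, n.hi))
        return result


def search(tree, query):
    """AND the terms together by intersecting their per-term result sets."""
    sets = [tree.prefix_search(term) for term in query.split()]
    return set.intersection(*sets) if sets else set()
```

In this sketch the tree is built once by calling `insert(tag, image_id)` for every tag on every image, and each keystroke then costs one `search` call. Single-letter terms are the expensive case because they collect a large subtree and produce large sets to intersect.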
Please give it a try, especially those of you with big image databases and
lots of tags. To stress it a little, pick some single-letter searches that
have large result sets. "i" matches "Image" from the media type category (it
probably doesn't make sense to index that one), so that's a big result set.
Then put them together. For example, I tried "i j c m e". Those other
letters are the first letters of my kids' names, so they have large result
sets.
Oh, and do all this from the thumbnail view, with your full database showing.
The search tree works just as well in the "home" view, but the process used
to recalculate the tags per category is slow.
Thanks,
Shawn.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: quicksearch_v_0_1.patch
Type: text/x-diff
Size: 29464 bytes
Desc: not available
URL: <http://mail.kde.org/pipermail/kphotoalbum/attachments/20070513/45cf4f58/attachment.patch>