Removing doubled search results from the grepview plugin

George Syron mr.syron at googlemail.com
Fri Jul 9 12:09:34 UTC 2010


>
> Btw, this seems like it is a deeper problem, because project->files() is
> not a unique list, and I think that also causes the "Show uses" feature to
> show some results more than once.
>

I think it is a real problem with the IndexedString repo. I haven't yet
fully understood how it works (especially the ProjectBaseItem part), but
it seems to me that the string hashing has problems with longer strings like
(semi-)absolute file paths.
If the hashing is the problem, I would suggest a hybrid structure of a tree
and hash maps. To make it clear:
1) A string is broken into 2-char chunks, each of which can be interpreted as
a uint16_t.
2) Each uint16_t is the key at one level of the tree, where each level is a
multi-hash-map from that uint16_t to a list of hash maps for the next level.
So to find e.g. the string "foobar", the lookup would effectively become:
map['fo']->map['ob']->map['ar']->[[data]]
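
A minimal sketch of that structure, to make the lookup chain concrete. This
is not KDevelop code: ChunkTrie, Node and chunkAt are hypothetical names, the
payload is a plain int standing in for whatever data would be stored, and a
single hash map per level with a value list at the end replaces the
multi-hash-map described above. Padding odd-length strings with a zero byte
is also an assumption. It needs C++14 for std::make_unique.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical illustration only; none of these names exist in KDevelop.
class ChunkTrie {
    struct Node {
        // One hash map per tree level, keyed by a 2-char chunk.
        std::unordered_map<std::uint16_t, std::unique_ptr<Node>> children;
        // Data stored for keys that end exactly at this node.
        std::vector<int> values;
    };

    // Pack the two chars starting at index i into one 16-bit key.
    // Assumption: an odd-length string is padded with a zero byte, so keys
    // containing embedded NUL characters could collide.
    static std::uint16_t chunkAt(const std::string& s, std::size_t i) {
        assert(i < s.size());
        const unsigned char hi = static_cast<unsigned char>(s[i]);
        const unsigned char lo =
            (i + 1 < s.size()) ? static_cast<unsigned char>(s[i + 1]) : 0;
        return static_cast<std::uint16_t>((hi << 8) | lo);
    }

    Node root_;

public:
    // Store value under key; repeated inserts for one key accumulate.
    void insert(const std::string& key, int value) {
        Node* node = &root_;
        for (std::size_t i = 0; i < key.size(); i += 2) {
            std::unique_ptr<Node>& child = node->children[chunkAt(key, i)];
            if (!child) {
                child = std::make_unique<Node>();
            }
            node = child.get();
        }
        node->values.push_back(value);
    }

    // Return the values stored under exactly key, or nullptr if none.
    const std::vector<int>* find(const std::string& key) const {
        const Node* node = &root_;
        for (std::size_t i = 0; i < key.size(); i += 2) {
            const auto it = node->children.find(chunkAt(key, i));
            if (it == node->children.end()) {
                return nullptr;
            }
            node = it->second.get();
        }
        return node->values.empty() ? nullptr : &node->values;
    }
};
```

With this sketch, find("foobar") walks 'fo', 'ob', 'ar' and returns the list
stored there, so a lookup takes one hash probe per two characters of the key.
Whether that would actually beat the existing hashing for long file paths is
something that would have to be measured.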

I've used that structure for indexing the old "beyond unreal" wiki, and it
was really fast (300,000 words; a query took less than a second on my old
2 GHz machine, and it was not even optimized).

It's just an idea, and I could be totally wrong here, so please forgive me
if I am.

-- Syron

