<div class="gmail_quote"><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">Btw, this seems like it is a deeper problem, because project->files() is not a unique list, and I think that also causes the "Show uses" feature to show some results more than once.<br>
</blockquote></div><br>I think it is a real problem with the IndexedString repo. I haven't yet fully understood how it works (especially the ProjectBaseItem part), but it seems to me that the string hashing has problems with longer strings like (semi-)absolute file paths.<br>
If the hashing is the problem, I would suggest a hybrid structure of a tree and hash maps. To make it clear:<br>1) A string is broken into 2-char chunks, each of which can be interpreted as a uint16_t.<br>2) Each uint represents a level within the tree, where each level is a multi-hash-map of that uint to a list of another level of hash maps.<br>
So to find, e.g., the string "foobar", the lookup would effectively become:<br>map['fo']->map['ob']->map['ar']->[[data]]<br><br>I've used that structure for indexing the old "beyond unreal" wiki, and it was really fast (300,000 words; a query took less than a second on my old 2 GHz machine, and it was not even optimized).<br>
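<br>A minimal C++ sketch of the structure described above, under some assumptions: all names (ChunkTrie, Node, chunkAt) are hypothetical and not part of KDevelop or IndexedString, the payload is a plain int, each level uses an ordinary hash map rather than a multi-hash-map, and an odd-length string has its last chunk padded with a zero byte (so keys containing embedded NUL bytes are not supported).<br>

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical illustration only; none of these names are KDevelop API.
class ChunkTrie {
public:
    // Store `value` under `key`, creating tree levels as needed.
    void insert(const std::string& key, int value) {
        Node* node = &root_;
        for (std::size_t i = 0; i < key.size(); i += 2) {
            std::unique_ptr<Node>& child = node->children[chunkAt(key, i)];
            if (!child) child = std::make_unique<Node>();
            node = child.get();
        }
        node->values.push_back(value);
    }

    // Return the values stored under exactly `key`, or nullptr if there are none.
    const std::vector<int>* find(const std::string& key) const {
        const Node* node = &root_;
        for (std::size_t i = 0; i < key.size(); i += 2) {
            auto it = node->children.find(chunkAt(key, i));
            if (it == node->children.end()) return nullptr;
            node = it->second.get();
        }
        return node->values.empty() ? nullptr : &node->values;
    }

private:
    // One level of the tree: a hash map from a 2-char chunk to the next level,
    // plus the data attached to strings that end at this level.
    struct Node {
        std::unordered_map<std::uint16_t, std::unique_ptr<Node>> children;
        std::vector<int> values;
    };

    // Pack the two chars starting at index i into one 16-bit key.
    // Assumption: a trailing single char is padded with a zero byte.
    static std::uint16_t chunkAt(const std::string& s, std::size_t i) {
        const auto hi = static_cast<unsigned char>(s[i]);
        const auto lo = i + 1 < s.size() ? static_cast<unsigned char>(s[i + 1]) : 0;
        return static_cast<std::uint16_t>((hi << 8) | lo);
    }

    Node root_;
};
```

With this sketch, calling insert("foobar", 1) walks or creates the levels for 'fo', 'ob' and 'ar' and stores the value at the last one; find("foobar") follows the same three hash lookups, while find("foob") returns nullptr because nothing was stored at that level. A lookup therefore costs one hash-map access per two characters, and strings sharing a prefix (such as paths in the same directory) share the upper levels.<br>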
<br>It's just an idea, and I could be totally wrong here, so please forgive me if I am.<br><br>-- Syron<br><br>