fuzzy-matching in quickopen...

Waqar Ahmed waqar.17a at gmail.com
Sun Sep 25 22:13:58 BST 2022


You seem to think that an in-sequence match should always be preferred.
That's not how it works _by design_, and the examples you gave are not very
good ones. If you search for the given filenames with patterns like "ese",
don't expect good results; improve your searches instead so that the tool
can help you better.

- Matches at the beginning are preferred because people usually search
that way, and because all other fuzzy filter implementations
that I came across did the same thing. This is not changing, sorry.
- In-sequence matches are preferred once the pattern is >= 4 characters long.

This is called a fuzzy filter for a reason. It's not the exact matching
you want it to be.
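
To make the two preferences listed above concrete, here is a rough sketch.
It is not the code in kfts_fuzzy_match.h: the function name toyScore and every
constant in it are made up for illustration. It does a greedy left-to-right
subsequence match, adds a bonus when the match begins at the first character,
and adds a sequential bonus only when the pattern has at least 4 characters.

```cpp
#include <cctype>
#include <string>

// Hypothetical toy scorer, NOT the real kfts_fuzzy_match code.
// Returns -1 when pattern is not a subsequence of str.
int toyScore(const std::string &pattern, const std::string &str)
{
    const int matchPoints = 10;      // assumed: points per matched character
    const int startBonus = 30;       // assumed: match begins at str[0]
    const int sequentialBonus = 15;  // assumed: two adjacent matched characters
    const std::size_t minSeqLen = 4; // sequential bonus only for patterns >= 4

    auto lower = [](char c) {
        return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    };

    int score = 0;
    std::size_t p = 0;                    // next pattern character to match
    std::size_t prev = std::string::npos; // index of the previous match in str
    for (std::size_t i = 0; i < str.size() && p < pattern.size(); ++i) {
        if (lower(str[i]) != lower(pattern[p]))
            continue;
        score += matchPoints;
        if (p == 0 && i == 0)
            score += startBonus;
        if (pattern.size() >= minSeqLen && prev != std::string::npos && i == prev + 1)
            score += sequentialBonus;
        prev = i;
        ++p;
    }
    return p == pattern.size() ? score : -1;
}
```

With these made-up numbers, toyScore("tes", "tests.qrc") is 60 and
toyScore("tes", "KateSearchCommand.cpp") is 30: the 3-letter pattern gets no
sequential bonus, so the match at the beginning wins.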

I have created an MR which should prefer open files over non-open ones. If
you can try that, it would be great.

Thanks.



On Mon, Sep 26, 2022, 1:49 AM Alexander Neundorf <neundorf at kde.org> wrote:

> Hi,
>
> On Saturday, 24 September 2022 00:06:34 CEST Waqar Ahmed wrote:
> > I am against adding the old way, but if it's optional, ok sure as long as
> > it is disabled by default.
> >
> > Your approach is completely incorrect though and the only reason I will say
> > ok to the patch is because Christoph already said ok. We can and should
> > improve the algorithm instead rather than just bringing back the old way on
> > the first complaint.
>
> Here are 3 examples (in the kate source tree) where the calculated score is
> IMO not good:
>
> I want to switch to "KateSearchCommand.cpp", which is already open.
> filter "ese":
> KateSearchCommand.cpp gets a score of 113
> MultilineStartEndOfLineMatch.txt gets a higher score of 116, even though it
> does not contain the string "ese", but only the "eS" and "E" with 4
> characters in between.
> I think a string which contains the filter exactly should get a higher score
> than a string which "just" contains the characters.
>
>
> filter "tes":
> KateSearchCommand.cpp gets a score of 118 and comes in place 23, i.e.
> not visible without scrolling.
> tests.qrc gets a higher score of 159, probably because it starts with
> "tes", but it is not open yet. There are about 20 files which start with
> "test", and none of them are open.
> I often leave out the start of the filename, because it is often the same for
> many files in a project (e.g. "kate" in kate, or "q" in Qt, or "algo" in some
> other project), so I start typing with something in the middle of the
> filename.
> So I'd suggest that the "is open" bonus should be bigger than the "starts
> with" bonus.
>
> Different example: I want to switch to "kfts_fuzzy_match.h"
> filter "fts":
> kfts_fuzzy_match.h gets a score of 100
> filetree_model_test.cpp gets a higher score of 120. Again, I'd suggest that a
> string which contains the filter string exactly should get a higher score than
> a string which "just" contains the characters.
>
> The following gives IMO better results:
>
> bonus for "already open" = 15
>
> if (matched) {
>    int sequentialBonus = 25;
>    int separatorBonus = 10; // bonus if match occurs after a separator
>    int camelBonus = 10; // bonus if match is uppercase and prev is lower
>    int firstLetterBonus = 10; // bonus if the first letter is matched
>    int leadingLetterPenalty = 0; // penalty applied for every letter in str before the first match
>    int maxLeadingLetterPenalty = 0; // maximum penalty for leading letters
>    int unmatchedLetterPenalty = -1; // penalty for every letter that doesn't matter
>    int nonBeginSequenceBonus = 20;
>
>
> I'm not sure I understand the following. Doesn't this mean that a long
> filename gets a big bonus?
>             // extra points if file exists in project root
>             // This gives priority to the files at the root
>             // of the project over others. This is important
>             // because otherwise getting to root files may
>             // not be that easy
>             if (!matchPath) {
>                 score += (sm->idxToFilePath(sourceRow) == name) * name.size();
>
>
> Alex
>
>
>
>
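
On the root-file bonus quoted above: in C++ the comparison yields a bool, which
converts to 0 or 1 before the multiplication, so the expression adds
name.size() only when the two strings are equal and adds nothing otherwise.
A minimal sketch, assuming idxToFilePath returns the path relative to the
project root (that is an assumption made here for illustration, and
rootFileBonus is a hypothetical stand-in, not a function from the kate source):

```cpp
#include <string>

// Hypothetical stand-in for the quoted expression
//   score += (sm->idxToFilePath(sourceRow) == name) * name.size();
// relPath is ASSUMED to be the file's path relative to the project root,
// so it equals the bare file name only for files that sit in the root.
int rootFileBonus(const std::string &relPath, const std::string &name)
{
    // The bool converts to 0 or 1, so the product is either 0 or name.size().
    return static_cast<int>((relPath == name) * name.size());
}
```

Under that assumption, rootFileBonus("CMakeLists.txt", "CMakeLists.txt") is 14
and rootFileBonus("src/main.cpp", "main.cpp") is 0. So among root files a longer
name does get a larger bonus, while files outside the root get none.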