Find uses/rename is being painfully slow

David Nolden zwabel at googlemail.com
Tue Mar 1 21:53:39 UTC 2011


2011/3/1 Dmitry Risenberg <dmitry.risenberg at gmail.com>:
> 2011/3/1 David Nolden <david.nolden.kdevelop at art-master.de>:
>> 2011/3/1 Dmitry Risenberg <dmitry.risenberg at gmail.com>:
>>>>> I've tried adding the document for parsing twice - simple parsing with
>>>>> normal priority and full parsing with low priority. This way renaming
>>>>> speeds up a lot.
>>>> I cannot see how this could speed up the renaming. Before uses can be
>>>> searched, the _full_ parsed result is required anyway.
>>>
>>> The full parsing is performed in background when there are no other
>>> parsing jobs, so that the user won't have to wait for it when he calls
>>> for a rename.
>>
>> But the user has to wait until the rename is finished anyway. Also, he
>> wants to see a preview of the items that will be renamed, so where's
>> the gain?
>
> Currently all files are reparsed with AllDeclarationsContextsAndUses
> when the renaming is triggered. But this parsing can be done on
> startup, after simple parsing has finished, so when the renaming is
> triggered, only the changed files have to be reparsed while the rest
> of the files are already up to date - this is much faster than
> reparsing everything.

Ah ok. But I am against it. It builds a _huge_ duchain store, which
would explode on really large projects like kdelibs/kdebase, or even
worse the linux kernel, and would hog the whole CPU forever. Also, the
information is not very valuable, because a reparse is required anyway
as soon as a header that affects the files has changed.

>>
>>>>> However, there is still a problem - some of the files are reparsed too
>>>>> often (on every startup, even if they are not changed), and this kills
>>>>> most of the gained benefits. I'm trying to debug this issue, but it is
>>>>> complicated, because there are no unit tests for background parsing
>>>>> and duchain. It looks like the call to 'featuresSatisfied' at
>>>>> preprocessjob.cpp:199 is failing, but I can't track down why.
>>>>>
>>>>> So the question is - how can I debug duchain-related code and see what
>>>>> is stored in duchain - which file, with what features, when it was
>>>>> parsed, etc.? And is there any special way to debug updating-related
>>>>> issues?
>>>>
>>>> We'll have to live with the fact that we need to re-parse nearly
>>>> always. For example, it's enough if you open one central header-file,
>>>> insert one character and remove it again (the revision is now
>>>> changed), now you start your rename, parse everything, and when you
>>>> re-start kdevelop you will have to re-parse everything again, because
>>>> the document revision is different.
>>>
>>> I am aware of that, but I mean opening/closing without changing
>>> anything - some open files still get reparsed, and mostly the same
>>> each time. I want to fix that, because it may become costly if full
>>> parsing is performed.
>>
>> I don't exactly remember, but maybe we assign different revisions to
>> open documents than to closed documents (depends on kate). We probably
>> should always force the revision to zero for documents without a
>> difference to disk.
>
> Looks like I found the cause of the problem - it is preprocessjob.cpp:199:
>
> if(updatingEnvironmentFile->featuresSatisfied(parentJob()->minimumFeatures())
> && updatingEnvironmentFile->featuresSatisfied(parentJob()->slaveMinimumFeatures()))
>
> The bad case is when slaveMinimumFeatures() is greater than
> minimumFeatures() - only minimumFeatures() eventually gets stored in
> the duchain, so the next time the second condition fails again,
> causing a reparse. I see two ways of fixing that: either remove the
> second check (it seems inappropriate here anyway, because it is
> checked in sourceNeeded()) or do minimumFeatures |=
> slaveMinimumFeatures, probably excluding the ForceUpdate set of flags.

The check does more than you think it does. It makes sure that _all_
recursive imports have the required features given through
slaveMinimumFeatures, and it also checks whether any of the recursive
imports have a global mismatched feature-requirement attached through
ParseJob::staticMinimumFeatures. ParseJob::staticMinimumFeatures is
basically how the uses-collector tells the parse jobs for which
top-contexts uses are required.

A possible reason for your observation is that some files, for which
you built uses in the previous run of kdevelop, were updated without
uses on the next startup, and are now re-updated with uses.

There may also be some other strange problems hidden here, though.

Btw. I have implemented, tested and pushed the "grep" pre-filtering
now, and especially when the searched string is not found often, the
processing is indeed orders of magnitude faster than before.

Greetings, David



