Review Request 128480: fix clang plugin highlighting screwup

Milian Wolff mail at milianw.de
Wed Jul 20 19:57:18 UTC 2016



> On July 20, 2016, 9:55 a.m., David Nolden wrote:
> > Copy of my e-mail: While reviewing the clang duchain code, I found some severe problems which are probably the main reason for these glitches to appear.
> > 
> > The core of the problem is that the code assumes that for "unmodified" files, it doesn't need to do any revision mapping, which is wrong. When you save a file in revision 10, the revision will still be 10 after saving, and the mapping still needs to be performed in the same way.
> > 
> > Some of these things were done on-demand with the foreground lock in the old cpp plugin, but that isn't possible straightforwardly any more, because the "unsaved files" need to be prepared _before_ parsing now.
> > 
> > So in principle, what needs to be done is:
> > * Before parsing, either in foreground or with the foreground lock, the text of _all_ open documents needs to be read, and the exact corresponding document revisions need to be locked and stored in the parse session. When the extracted text is used for parsing, then this can be used for glitch-free highlighting and navigation.
> 
> Sven Brauch wrote:
>     Thanks David, I think I agree with your conclusion. The root of all evil is the "!isModified()" shortcut (I already mentioned it as odd above but it slowly becomes clear to me just how much trouble it can cause). Should we just remove that ...? It seems potentially slow when you have hundreds of files open.
>     
>     While we're at it, David, what do you think about storing the revision references in the top context instead of the change tracker? I still think we sometimes have the case that this happens: parse job runs -> finishes -> new job starts, resets tracker and discards revision -> highlighter is invoked for old job. We somehow have to avoid this.
> 
> David Nolden wrote:
>     Some cache would be needed to avoid the overhead of re-reading all the files. Maybe a global QVector<UnsavedFile> should be managed and kept up-to-date in the foreground, and just copied away with each parse-job. Due to the copy-on-write property of Qt containers, this should be very efficient. Each UnsavedFile should then also have its own RevisionReference, to check whether it's up-to-date etc.
>     
>     In principle, RevisionLockerAndClearer is safe w.r.t. it being deleted from within a background thread, so it probably would make sense to attach such a locker to each top-context. In practice though, I'm not sure whether this could actually cause the glitches, because, I think, mapping even works with revisions that are _not_ locked. Nevertheless, this case needs to be considered of course.
> 
> Sven Brauch wrote:
>     Thanks for your advice, I'll have a look.
>     
>     > I think, mapping even works with revisions that are not locked
>     
>     Pretty sure that is not the case, in the document change tracker there is this code:
>     
>         if((fromRevision == -1 || holdingRevision(fromRevision)) && (toRevision == -1 || holdingRevision(toRevision))) { ... }
>     
>     It explicitly checks whether the revision is locked and does nothing otherwise.
> 
> Milian Wolff wrote:
>     Removing the `!isModified()` shortcut is a very bad idea, performance-wise. Please let us find the culprit and fix that instead of removing legitimate features that are implemented buggily.
>     
>     As I wrote in my email in response to David's: Can you please try locking the revision in the UnsavedFile? Maybe that's all that's missing!
> 
> Sven Brauch wrote:
>     But David is right, the isModified() is just wrong -- it does not mean _anything_ whether the file is modified or not :/
>     Just imagine I press save after each keypress, isModified() will always be false but otherwise the exact same thing needs to happen.
> 
> David Nolden wrote:
>     @Milian: I didn't see any email.
>     
>     Anyway, this probably _is_ the real culprit and you cannot work around it. We just have to implement a solution which is efficient performance-wise. I think my previous proposal with a global QVector<UnsavedFile> could actually be more efficient than what's implemented now, because it could cache the extraction of the contents.

Oooh! Now I understand what you mean, _that_ `isModified()`! I really should have studied the code again. I was assuming you were talking about `isUpdateRequired()` in the `::run()` method. Sure, if removing `isModified()` helps (and now I also understand how it would help), feel free to remove it.
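
To spell out for myself why the flag tells us nothing, here is a toy model of the "save after each keypress" case Sven describes. It is only a sketch: `ToyDocument`, `ParsedState` and the two helper functions are made-up names for illustration, not KTextEditor or KDevelop API, and the only thing taken from this thread is David's point that saving does not change the revision.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for an editor document (not the KTextEditor API).
struct ToyDocument {
    std::string text;
    std::int64_t revision = 0; // bumped on every edit, untouched by saving
    bool modified = false;     // true only while there are unsaved changes

    void edit(const std::string& newText) {
        text = newText;
        ++revision;
        modified = true;
    }

    void save() { modified = false; } // the revision stays as it is
};

// What a parse run remembers about the text it was given.
struct ParsedState {
    std::int64_t revision;
};

ParsedState parse(const ToyDocument& doc) { return {doc.revision}; }

// The shortcut under discussion: only care about documents with unsaved changes.
bool shortcutSaysMappingNeeded(const ToyDocument& doc) { return doc.modified; }

// What actually matters: did the text change since the revision the parser saw?
bool mappingNeeded(const ToyDocument& doc, const ParsedState& parsed) {
    assert(parsed.revision <= doc.revision); // a parse result cannot be newer than the document
    return parsed.revision != doc.revision;
}
```

Edit, save, parse, edit again, save again: `shortcutSaysMappingNeeded()` reports false although the parse result is one revision behind, while `mappingNeeded()` reports true.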

And I agree with David that introducing a global cache could help, but so far I haven't seen this taking a considerable amount of time either, so I don't think it's worth doing right now. I.e. just remove it and profile a normal duchainify run. If it shows up in less than 0.1% of the samples, then it shouldn't be too bad. Memory-wise it's a bigger impact, esp. if you have hundreds of files open.

That last part is, btw., why I added that shortcut. We often have tons of docs open, but we only work on a few of them, so only those have unsaved changes. Feel free to remove the check for now as a hotfix (if it really fixes the issue). But thinking ahead, is there no way we could keep the `isModified()` check in place and lock the associated revision and use that one? Actually, if we had a global cache as suggested by David, we could in principle update it in a thread-safe manner from the foreground and then use it as needed in the background thread when we hand over to clang, together with the revisions from that version, right?
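
Roughly what I have in mind for such a cache, as a sketch only. It uses plain standard-library types instead of `QVector` and real revision locks so that it stands on its own, and `CachedUnsavedFile` and `UnsavedFileCache` are hypothetical names, not existing classes. The foreground replaces the list whenever a document changes, and a parse job grabs an immutable snapshot that pairs each text with the revision it was read at.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical cache entry: the text plus the revision it was read at.
struct CachedUnsavedFile {
    std::string path;
    std::string contents;
    std::int64_t revision;
};

// Hypothetical cache; the real thing would also have to hold a revision lock per entry.
class UnsavedFileCache {
public:
    using Snapshot = std::shared_ptr<const std::vector<CachedUnsavedFile>>;

    // Foreground: store the text of a document together with its revision.
    void update(const std::string& path, std::string contents, std::int64_t revision) {
        assert(revision >= 0); // -1 means "no revision" in the tracker, never cache that
        std::lock_guard<std::mutex> lock(m_mutex);
        // Copy the current list so that snapshots already handed out stay untouched.
        auto next = std::make_shared<std::vector<CachedUnsavedFile>>(*m_files);
        bool found = false;
        for (auto& file : *next) {
            if (file.path == path) {
                file.contents = std::move(contents);
                file.revision = revision;
                found = true;
                break;
            }
        }
        if (!found) {
            next->push_back({path, std::move(contents), revision});
        }
        m_files = std::move(next);
    }

    // Background: cheap, just copies a shared pointer under the mutex.
    Snapshot snapshot() const {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_files;
    }

private:
    mutable std::mutex m_mutex;
    Snapshot m_files = std::make_shared<std::vector<CachedUnsavedFile>>();
};
```

A job that took its snapshot at revision 1 keeps seeing the revision 1 text even after the foreground has stored revision 2, so text and revision cannot drift apart. The price in this sketch is one copy of the list per update, which is also where the memory concern with hundreds of open files would show up.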


- Milian


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/128480/#review97654
-----------------------------------------------------------


On July 19, 2016, 9:49 p.m., Sven Brauch wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://git.reviewboard.kde.org/r/128480/
> -----------------------------------------------------------
> 
> (Updated July 19, 2016, 9:49 p.m.)
> 
> 
> Review request for KDevelop, Kevin Funk and Milian Wolff.
> 
> 
> Repository: kdevelop
> 
> 
> Description
> -------
> 
> Terrible patch, I know. Sorry. It cost a lot of nerves and time though and I want somebody to look at / try out whether this is a working fix in principle.
> 
> In the end, the issue(s) are simple:
> 
> 1. we need to hold the foreground lock between reading the modification revision and reading the contents. Otherwise they can mismatch.
> 
> 2. in buildDUChain(), there was
> 
>     envFile->setModificationRevision(ModificationRevision::revisionForFile(context->url()));
> 
> which is wrong. It sets the modification revision of the parsing environment file to the revision the file has _right now_. The contents, however, were read before in the constructor, and several milliseconds might have passed until this line is reached, with no locks held to prevent anything.
> Instead, we have to set the revision to the one which was stored when the tracker was last reset (which happened right after reading the contents, coupled to the contents by the foreground lock).
> 
> Actually I'm not sure if the foreground lock is required in this case, because clang for some reason reads the contents in the constructor, which afaik runs in the main thread anyways.
> 
> 
> Diffs
> -----
> 
>   languages/clang/clangparsejob.cpp 8375eb5 
>   languages/clang/duchain/clanghelpers.cpp cea8cd9 
> 
> Diff: https://git.reviewboard.kde.org/r/128480/diff/
> 
> 
> Testing
> -------
> 
> Very hard to reproduce. Apply this patch to kdevplatform:
> 
>     diff --git a/language/backgroundparser/documentchangetracker.cpp b/language/backgroundparser/documentchangetracker.cpp
>     index 294bc14..6f56250 100644
>     --- a/language/backgroundparser/documentchangetracker.cpp
>     +++ b/language/backgroundparser/documentchangetracker.cpp
>     @@ -237,6 +237,11 @@ KDevelop::RangeInRevision DocumentChangeTracker::transformBetweenRevisions(KDeve
>     {
>         VERIFY_FOREGROUND_LOCKED
> 
>     +    if ( !((fromRevision == -1 || holdingRevision(fromRevision)) && (toRevision == -1 || holdingRevision(toRevision)) ) ) {
>     +        qWarning() << "invalid transform: from" << fromRevision << "to" << toRevision
>     +                   << "but not both revisions held: [from/to]:" << holdingRevision(fromRevision) << holdingRevision(toRevision);
>     +    };
>     +
>         if((fromRevision == -1 || holdingRevision(fromRevision)) && (toRevision == -1 || holdingRevision(toRevision)))
>         {
>             m_moving->transformCursor(range.start.line, range.start.column, KTextEditor::MovingCursor::MoveOnInsert, fromRevision, toRevision);
>     @@ -348,6 +353,8 @@ void DocumentChangeTracker::unlockRevision(qint64 revision)
>             m_moving->unlockRevision(revision);
>             m_revisionLocks.erase(it);
>         }
>     +
>     +    qDebug() << "** clearing revision" << revision;
>     }
> 
>     qint64 RevisionLockerAndClearer::revision() const
>     
> You get _lots_ of those warnings iff you manage to trigger the issue. The best bet seems to be to open a relatively large cpp file, and keep removing text and re-adding it with Ctrl+Z with varying wait times in between (500ms-parse-delay-ish). Do that for a minute or two and it will trigger eventually.
> 
> You still get a few of the warning messages sometimes even with the patch applied (but far more without). I think they are from the problem reporter plugin, and I'm not sure if the plugin does something wrong or the parse job.
> 
> 
> Thanks,
> 
> Sven Brauch
> 
>
