Patch: TODO highlighting in comments

Sat Nov 13 19:11:08 UTC 2010

2010/11/13 Milian Wolff <mail at milianw.de>:
> On Saturday 13 November 2010 18:09:18 David Nolden wrote:
>> I think it was too early to apply this on master. There are some
>> severe efficiency problems. First of all, there were some very heavy
>> QString/QByteArray operations on every single comment encountered
>> during parsing, which is evil, since we already before spent a lot of
>> time processing comments. I have somewhat fixed these efficiency
>> problems by adding an _efficient_ pre-check for the todo keywords
>> (based on fast IndexedString token comparison).
>
> I'll benchmark it, or do you have some callgrinds around to show us where the
> bottleneck actually exists? And I clearly disagree with the "should not have
> been merged". Noone, esp. not you, would have tested it and hence it would
> just have bitrotten. This way we could try it out (it worked fine for me), and
> you could find these problems.

The efficiency problems in todo-extraction shouldn't exist anymore,
after all I fixed them. It didn't need profiling, only looking at the
code, plus some simply math: Many many comments -> for each comment
many many QString operations -> Damn slow. Ideally there shouldn't be
_any_ QString operations during parsing (if we do it only for rare
cases, like "there is a TODO in the comment", then it shouldn't
matter). Unfortunately we have some QString/QByteArray operations in
CommentFormatter::formatComment. At my last profiling those took
around 5% of the parsing time, for some very trivial operations. Since
the operations in extractToDos are  more complex (including the called
sub-functions), one can expect the old todo-extraction to slow down
the parsing by approximately 10%.

>> The other big problem is (as Dmitry already noted) the way the
>> problems are collected from the top-contexts when the scope goes over
>> the whole project. There doesn't seem to be a proper updating scheme
>> (see BackgroundParser::parseJobFinished for a possible entry point
>> into updating), and the whole TopDUContexts of the whole project are
>> processed whenever the items are collected.
>>
>> As first, it has to be made sure that the problem-reporter doesn't do
>> any significant work at all when the problem-report view is not
>> visible. Then, it needs to be made sure that, when one parse-job has
>> finished, only the problems from the corresponding TopDUContext are
>> updated, and all other problems remain untouched.
>
> Doesn't sound that hard to fix.
>
>> For the slightly farrer future, it would be better to not use the
>> duchain directly, for example to build the TODO lists, but to transfer
>> the TODOs of the whole project into a separate ItemRepository, where
>> they can be stored and accessed persistently and efficiently (this
>> repository would need to be partially updated only after each
>> parseJobFinished).
>
> If this *really* is that big a problem, then we could do this, sure. It would
> only store IndexedString + array of todo ranges, right? A simple Mutex to make
> it threadsafe and done.

Yep. I just remember KDevelop3 where it was the problem-reporter that
made the whole application slow due to _exactly_ such problems, this
really has to be crafted carefully and efficiently. After all the
effort scales linearly with the size of the project.

Greetings, David