D18551: clang: Create preamble only on second parse
Aaron Puchert
noreply at phabricator.kde.org
Sun Feb 2 18:23:18 GMT 2020
aaronpuchert added a comment.
To keep this short I didn't reply to the parts I completely agree with.
In D18551#604249 <https://phabricator.kde.org/D18551#604249>, @mwolff wrote:
> > Now I actually have `/tmp` as tmpfs, and so these 400 MB don't come for free. This is perfectly fine when I'm actually working on these files, but so far I have only opened KDevelop.
>
> I see. The problem really is that the preamble files shouldn't be written to /tmp then - it would be perfectly fine to move them to a temp dir on the disk I think - it would still be faster than having to redo the work.
There are arguments for both in-memory and on-disk, it depends on how much memory there is. I have another more more generously sized machine where the files are perfectly fine in RAM.
I think `/tmp` is correct by the FHS <https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch03s18.html> and `/var/tmp` wouldn't be appropriate <https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch05s15.html>. I could easily put this on disk if I wanted to. I just wanted to make the argument that resource usage is a concern, and that there is some benefit in limiting ourselves to the preambles that we need.
>>>>> Also note how we just reparsed the file solely for code completion purposes!
>>>>
>>>> You mean the preamble? Since code completion is usually invoked after some initial changes, we have to reparse anyway. What you said would only be true if we invoke code completion without changing anything first, which seems rather weird.
>>>
>>> I mean: We trigger a reparse when we open a document, to get hold of the clang TU to do code completion. If this *does not* create a preamble, then the following call to query code completion items - *after* editing - will create the preamble. The preamble only includes the headers of the file, not the contents of the `.cpp`.
>>
>> Well, then why don't we just skip the initial reparse, and only reparse after the first edit?
>
> Yes, I agree that we should try this out. Note that this will require changes in `clang/codecompletion/model.cpp` at least, potentially also in other places. I.e. right now this code relies on the fact that we trigger a reparse after opening a document. This then allows us to write code like this there:
>
> auto top = DUChainUtils::standardContextForUrl(m_url);
> if (!top) {
> qCWarning(KDEV_CLANG) << "No context found for" << m_url;
> return;
> }
>
> ParseSessionData::Ptr sessionData(ClangIntegration::DUChainUtils::findParseSessionData(top->url(), m_index->translationUnitForUrl(top->url())));
>
> if (!sessionData) {
> // TODO: trigger reparse and re-request code completion
> qCWarning(KDEV_CLANG) << "No parse session / AST attached to context for url" << m_url;
> return;
> }
>
>
> If we only attach the AST after the first edit, it won't be enough. We will also have to trigger a reparse and force attaching the AST/parse session data when code completion is explicitly requested via `ctrl + space` without editing.
How bad would it be if we don't do that and only provide KTextEditor-completions before the first edit? I think it's pretty rare that code completion is immediately requested, typically one starts with a line break, or by removing code, before new code is written.
>>> You usually edit the `.cpp` file and then you want to get code completion results immediately.
>>
>> We're talking about a one-time delay, all further edits will have a preamble.
>
> Yes, but the patch proposed here in its current form just disables the preamble always, no? So I think we agree that this isn't good enough?
It shouldn't, unless I'm misunderstanding our usage of `libclang`. Without the flag <https://clang.llvm.org/doxygen/group__CINDEX__TRANSLATION__UNIT.html#ggab1e4965c1ebe8e41d71e90203a723fe9aa27840716b90a9346019c9f0914f9ab8>, the preamble is created not on the first parse, but on the first reparse of a file. Meaning there is no preamble when I just open a file, but as soon as I edit and the IDE triggers a reparse, the preamble is created and used for subsequent reparsing.
Let's do some math to clarify this. Let
- //t_all// be the time needed to parse a source without preamble,
- //t_source// be the time needed to parse a source with preamble, and
- //t_preamble// be the time needed to serialize the preamble.
Obviously //t_source < t_all//. Then the times look as follows:
| | Time for opening | Time after first reparse | Time after //n//-th reparse |
| Old | //t_all + t_preamble// | //t_all + t_source + t_preamble// | //t_all + n · t_source + t_preamble// |
| New | //t_all// | //2 · t_all + t_preamble// | //2 · t_all + (n-1) · t_source + t_preamble// |
| New - Old | //-t_preamble// | //t_all - t_source// | //t_all - t_source// |
|
Now let //m_read// be the number of documents that are only being read and //m_write// the number of documents being read and written. Then the cost difference is //m_write · (t_all - t_source) - m_read · t_preamble//, which is negative iff //m_write / m_read < t_preamble / (t_all - t_source)//. The right-hand side is a constant that depends on the nature of the project and the hardware one is working on:
- //t_all// and //t_source// include reading from disk or page cache and doing some heavy CPU work, the difference between them is the parsing of the header files,
- //t_preamble// includes a bit of CPU work and writing to a file (which might be on disk).
The left-hand side depends on usage patterns. I use KDevelop a lot to read and browse code without doing any changes, so this would very likely benefit me. But I might be alone in that.
REPOSITORY
R32 KDevelop
REVISION DETAIL
https://phabricator.kde.org/D18551
To: aaronpuchert, #kdevelop, mwolff, brauch, rjvbb
Cc: rjvbb, kdevelop-devel, hmitonneau, christiant, glebaccon, domson, antismap, iodelay, alexeymin, geetamc, Pilzschaf, akshaydeo, surgenight, arrowd
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.kde.org/pipermail/kdevelop-devel/attachments/20200202/a24f73f5/attachment.html>
More information about the KDevelop-devel
mailing list