KDevelop problems suitable for a Ph.D. thesis
Denis Steckelmacher
steckdenis at yahoo.fr
Wed Sep 4 08:23:48 BST 2024
Hello,
Post-doc and former KDevelop contributor here (kdev-qmljs about 10 years
ago). To answer your question, we would first need to know which faculty
you want to get your PhD from (Faculty of Sciences it seems?), and how
much theoretical versus implementation work would be expected from you
to get your PhD.
I would personally recommend finding a topic that involves
moderately-difficult fundamental research (not too easy, not too
difficult), because this is easy to publish. Ideally, that research
would be foreseen to lead to an easy implementation in KDevelop. So no
research that would translate to a full rewrite of KDevelop: you would
never have time to do that, and your patch would never be accepted
upstream. Something that fits in a plugin would be best.
Current KDevelop developers are best placed to identify areas that need
research, but here is my humble contribution for a possible research topic:
Coding assistants like GitHub Copilot are all the rage for now, but as
an AI researcher myself, I don't see these tools, as they are now,
still existing in 2 to 5 years. The generated code is over-trusted by
the developers, leading to security issues or poor performance. As such,
some companies are starting to implement policies forbidding their
developers from using these tools. There is also the question of running
these generative models, training them, and having them generate code
that looks too much like code under a license you do not want (for
instance, you write MIT code, and Copilot spits out a full GPL-licensed
function).
I believe that AI should remain in useful but less visible areas such as
energy transport, manufacturing optimization, medical diagnosis,
preventive maintenance, etc.
However, Copilot revealed a dire need among developers for
some automation and assistance while they code. I don't believe that
IDEs should generate entire functions (they should be in well-maintained
and reusable libraries), but some design patterns cannot be put in
libraries, and guidance from the IDE would be useful. Boilerplate
generated by the IDE would not come from an expensive AI model, but from
a human-vetted and developed, well-maintained library of code snippets
(by the way, KDevelop already supports snippets).
I see two possible research areas related to this proposal:
1. User intent identification, related to user interface design. The
IDE could detect that the user typed a class name that probably
should be a singleton (like "class DataStoreSingleton", but more
complex detection could be possible), and propose to generate the
boilerplate code in the code-completion popup. Entries in "Tools",
or buttons in the toolbar, or keyboard shortcuts, could also be
used. This point involves research on user-interface design, but
also code understanding, natural language processing (class names
usually look like English), and maybe simple forms of Machine
Learning and statistics (it would be awesome for the method to
automatically adapt to the user, the project being worked on, or the
company using KDevelop).
2. Writing advanced templates that do more than put some text in
the editor with placeholders for the user to fill. These templates
could look at the existing DUChain (Igor: I don't know how much you
have already worked with KDevelop, but this is roughly KDevelop's
language-agnostic AST) and guess variable names, functions to call,
stuff to put in constructors, what class is the Visitor you want to
use, etc. This requires research on existing design patterns in
various programming languages, along with compiling and decompiling
technologies (the latter to go from the DUChain to higher-level
stuff, for instance to look at the methods of "PrettyPrinter" and
guess that it is a Visitor). Looking at the names of things will
probably be needed here too to get good results, so this point
reuses the Natural Language Processing and Feature Engineering stuff
of the first point.
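To make the two points a bit more concrete, here is a minimal Python
sketch of the kind of name-based heuristics I have in mind. It is not
KDevelop code and does not touch the DUChain: the suffix table, the
"visit" prefix rule, the threshold and the C++ template text are all
assumptions made up for illustration. A real plugin would read class
and method names from the DUChain and would probably be written in C++.

```python
import re
from string import Template

# Hypothetical table: last word of a class name -> design pattern.
PATTERN_SUFFIXES = {
    "Singleton": "singleton",
    "Visitor": "visitor",
    "Factory": "factory",
}

# Hypothetical human-vetted snippet for a C++ singleton.
SINGLETON_TEMPLATE = Template(
    "class $name {\n"
    "public:\n"
    "    static $name &instance() { static $name self; return self; }\n"
    "private:\n"
    "    $name() = default;\n"
    "};\n"
)


def split_identifier(name):
    """Split a CamelCase or snake_case identifier into its words."""
    return re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)


def guess_pattern_from_name(class_name):
    """Point 1: guess the intended pattern from the class name alone."""
    words = split_identifier(class_name)
    if not words:
        return None
    return PATTERN_SUFFIXES.get(words[-1])


def looks_like_visitor(method_names, min_visit_methods=2):
    """Point 2: guess that a class is a Visitor from its method names.

    The rule (at least `min_visit_methods` methods whose first word is
    "visit") is an arbitrary assumption, not a validated heuristic.
    """
    count = 0
    for method in method_names:
        words = split_identifier(method)
        if words and words[0].lower() == "visit":
            count += 1
    return count >= min_visit_methods


def propose_boilerplate(class_name):
    """Return boilerplate to offer in the completion popup, or None."""
    if guess_pattern_from_name(class_name) == "singleton":
        return SINGLETON_TEMPLATE.substitute(name=class_name)
    return None


if __name__ == "__main__":
    print(propose_boilerplate("DataStoreSingleton"))
    print(looks_like_visitor(["visitNode", "visitClass", "indent"]))
```

The interesting research starts where this sketch stops: replacing the
hard-coded table and threshold with something learned from the user or
the project, and deciding how and when to show the proposal.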
Good luck with your search for a topic! I know that this is very hard to
do, but the better the topic, the easier and more fun the PhD!
Denis Steckelmacher
On 3/09/24 14:18, Igor Kushnir wrote:
> Hi everyone!
>
> I am urgently looking for a computer science Ph.D. thesis theme. I'd
> like to solve some important and challenging KDevelop problem(s),
> write scientific papers and a thesis about them. Any ideas?
>
> I have considered integrating Language Server Protocol (LSP). But
> based on
> https://commits.kde.org/kdevelop?path=kdevplatform/language/duchain/Mainpage.dox
> ,
> https://microsoft.github.io/language-server-protocol/overviews/lsp/overview/
> and
> https://microsoft.github.io/language-server-protocol/specifications/lsp/3.17/specification/
> , I conclude that proper integration into existing KDevelop
> language/duchain framework is impossible. For one thing, an LSP server
> reply does not group/structure declarations in nested scopes, which is
> necessary to fill duchain context hierarchy. LSP can be integrated
> separately from duchain, similarly to but more thoroughly than Kate's
> LSP plugin, which could be a very good starting point. However, I
> expect such an undertaking to consist almost exclusively of
> plugin/library/protocol integration work rather than interesting new
> algorithms or algorithm improvements/adaptations worthy of scientific
> papers.
>
> The following two merge requests represent my KDevelop work that comes
> closest to what I am looking for:
> https://invent.kde.org/kdevelop/kdevelop/-/merge_requests/224 and
> https://invent.kde.org/kdevelop/kdevelop/-/merge_requests/118 . But
> both are fairly small and mostly complete. Not much need or room for
> improvement is left there.
>
> Thank you,
> Igor