Review Request: Add support for "language features"

Andreas Pakulat apaku at gmx.de
Mon Jul 9 21:58:36 UTC 2012



> On July 4, 2012, 8:02 a.m., Andreas Pakulat wrote:
> > Hmm, what does the stringlist contain? I don't like string's to transport type-like information and features of a language do rather sound like a flags-type. I'm also wondering what this thing does that needs it in the buildsystem manager and not in the language support itself? An implementation for both sides would probably help understanding the intent.
> > 
> > Oh and even if its decided to go into the project interface I'd say it should be in the base interface, not the buildsystem one so project managers which have no buildsystem support can still supply this feature.
> 
> Aleix Pol Gonzalez wrote:
>     I think he needs it to tell the (c/c++) language support whether it should be analyzed with a C or a C++ parser. The build system support can tell which one should be used.
>     
>     On the other hand, I agree that string is too generic.
> 
> Andreas Pakulat wrote:
>     Hmm, what information does the buildsystem have that the c++ support does not? I mean currently cmake just sees a bunch of C or C++ files. So this would require some special buildsystem plugin to work in the first place. I dislike that very very much, its the wrong approach IMHO. The parser mode for the C++ parser is completely orthogonal to the buildsystem being used. So the parsing mode shouldn't be something decided based on the buildsystem, but rather based on the actual file.
>     
>     Can C vs. C++ be determined for a file automatically? If not, I'd say a project config page supplied by the cpp-plugin that lets you choose between C and C++ parsing mode is completely sufficient. If there is sufficient demand for having more finegrained control over the type of parsing per folder or file, then the config page needs to get some support for wildcard matching or the like.
>     
>     If there is indeed a way to find this out for most of the files, then the c++ plugin should do exactly that.
> 
> Aleix Pol Gonzalez wrote:
>     well, in the system side, it's cmake who decides if gcc or g++ should be used, no? So it would make sense if we did likewise.
>     
>     Am I missing anything?
> 
> Alexandre Courbot wrote:
>     As Aleix pointed out, the build manager is ultimately the only entity who knows how a file should be compiled, by which compiler, and with which options. Thus it is the only one who can tell with precision whether a .h file should be interpreted as C or C++, whether a .c file should be parsed as C99 or ANSI C, etc. Language support is not enough.
>     
>     As for passing this information as a string, this is because language features are a contract between build manager and language support. The information passed is of type "this file is C99, so don't use the C++ features here". Since we are working at the platform level, concepts like C99 do not exist and thus flags would not be suitable either.
> 
> Andreas Pakulat wrote:
>     While the buildsystem may be able to know the compiler switches, kdevelop's plugins won't. Thats partially due to not being able to execute all the steps that are declared via the buildsystem - running external processes, writing files etc. are problematic to execute just for parsing and it can be that certain compiler switches are only used after such actions have been performed (un)successfully.
>     
>     In addition to find this level of information from the buildsystem requires already quite some work for CMake and even more for any other managers we have. I won't ever add support for that to the custom buildsystem manager for example.
>     
>     So, while such a fully automatic solution might be really nice to have, I think its ultimately not going to be fully working or only for the combination of 1 buildsystem manager and 1 language plugin. Hence My suggestion to instead support this via a config page with a default and some wildcard-matching for individual files. This is not automatic but should suffice for most usescases without causing too much extra work.
>     
>     Regarding the String I didn't see any arguments from you that would rule out having an enum or some other abstract type. Might even make sense to supply that from the language support, i.e. ask the language support which features it can support on the given file and how they match to the compiler and compiler flags, i.e. ANSI-C - enable when gcc is used with flag X, C99 enable with gcc and -c99 flag, C++ enable with g++, C++0x enable with g++ and flag Y. Heck, I'd put the decision into another API on ILanguageSupport which taks a list of compiler args, filename and compiler binary name and let the language support return an enum value...
> 
> Aleix Pol Gonzalez wrote:
>     Alexandre, maybe you can try to figure out if it's C it in the language support side?
>     
>     Also we'd have the same problem we have with the .h files, this should be decided by the language support (aka it's included from c++ or c). In case of the .c/.cpp files, you can try to discriminate by extension, for the moment?
>     
>     It can look like a hack, but it's what we'd to in the buildsystem side anyway. Also it would mean depending on a newer kdevplatform as well.
> 
> Alexandre Courbot wrote:
>     Andreas, the first part of your answer sounds like you think I plan to run "make -n" in order to extract compilation options. This is not the case. The idea is that the build manager, be it CMake or anything else, can pass extra information about a file that language support can not get otherwise. This information could be obtained by a dry-run of make, but I don't think this is a good idea neither - I had plenty of problems with the automatic include paths resolver because of the points you mentioned and do not want to go that way. But without going that far, there is still some information that would be valuable to obtain from the build manager.
>     
>     I agree with Aleix that in the case of C/C++ guessing from the file extension will work most of the time with the current parser. But this is still a hack. For .h files as Aleix pointed the best thing is to infer the language depending on which files included it, but once again this does not work if you open that file directly or if it is not included by any visible source file, and we end up with guesses and approximations again.
>     
>     Persisting with approximations and hacks instead of doing things so they can scale right will backdraft at us at some point in the future. This is already the case with define() and includeDirectories() which should have no place in a generic build manager interface and should be part of a more general solution. The present patch is a proposal for a more generic solution.
>     
>     Note that every build manager is not forced to implement the new method at all - those which don't will still continue to work as they did so far, and leave language support without hints as to how to parse the file - just as it is today. This change is not intrusive at all, and does not add any dependency (we already have links from language support to the build manager through define() and includeDirectories()). It is just a new interface that allows more collaboration to take place between language support and build manager. Thus the amount of resistance towards it takes me by surprise.
>     
>     Regarding the string as a vector for this information, you simply cannot use enums as they would be language-specific. For C you could have { ANSI, C99, CPP/CPP11, AllowXX, AllowYY }, and some of these terms could also be combined. For Python { Python2, Python3 } which would be exclusive. A string or list of strings is simple and efficient and I don't see the need to take things further - think of the user-agent used by web browsers for instance. It's really that simple - the build manager gives hints about how the file should be parsed, and language supports sees how much of it it can understand and tunes the parser accordingly. That's it.
> 
> Andreas Pakulat wrote:
>     Well, if the cmake code does a try-compile loop to find out wether the compiler understands certain features and then sets compiler flags accordingly and changes from C99 to ANSI C or whatever. Then the cmake manager needs to do the same thing (and I don't think try-compile is actually implemented right now) to find out the type of C dialect. If this is not done, then you'll never reach 100% coverage so why put in the effort in the first place (which has to be maintained and adapted to newer versions of the buildsystem, something thats a real issue with the few developers we have).
>     
>     Frankly, I don't see a problem with includeDirectories and define's, they're simply specialized versions of a too generic compileFlags() function which wouldn't make any sense with Python for example anyway. Also note your python examples towards the end don't fit, since python projects often don't have a buildsystem in the first place and hence simply use the projectmanager interface.
>     
>     Having an interface thats not used anywhere is simply not useful, if there is only 1 user, its still not very useful since its bound to bitrot over time. In fact, in 2 years or so, some new contributor may look at the c++ support and the cmake manager and notice that C++ uses something which cmake doesn't implement and simply remove it again since he sees no point.
>     
>     As I said before I don't think this is really needed, however if it goes in it needs to use a type-safe API and that means no strings in the API. As I outlined its very much possible for the language support to supply information how to match compiler flags+file to a certain enum and hence let the language support decide how to interpret the information. If we can think of more things that could influence the decision, pass them along as well.
>     
>     Oh and for header files you're doomed anyway, cmake at least cannot tell you wether a header is to be considered C or C++, it doesn't know itself either it only knows the source files. However the DUChain should be able to tell you for a header file which C++ files use it and hopefully this doesn't even need a reparse with full data.
> 
> Alexandre Courbot wrote:
>     Well, I am not going to fight over it if nobody else sees the point of this change. I thought it was an elegant way to solve the C/C++ discrimination problem (besides what can be done through extensions checking) by involving the build system.
>     
>     This patch seemed lonely but I also have adapted the C/C++ parser to accept C code and adapt accordingly. This was supposed to be on top of this change, so I wanted to check the API change first. And kdev-kernel was also using that, always reporting "C" as there is no C++ file in the kernel project.
>     
>     Unless there is more debate in favor of this change in the next few days I will abandon this review request and go ahead with the rest.
> 
> Andreas Pakulat wrote:
>     Oh, I do see the point of having the buildsystem telling the parser how a file is going to be compiled. I just don't see this ever implemented so that it works out-of-the-box in all cases - in particular not for any of the currently included buildsystem plugins. And API thats not used/implemented is just dead code.
>     
>     Anyway, the only technical objection I have against the patch is the use of strings, as I said its doable without too much effort (just a bit more indirection) to have type-safety with this and still get the same features.
> 
> Alexandre Courbot wrote:
>     Ah well, I don't expect this simple interface to solve the whole issue - just to offer opportunity to improve it. My point has never been that it would work out-of-the-box, but some build managers and language supports could take advantage of it (the idea was to selfishly use this for kdev-kernel in particular).
>     
>     But I think we can actually reach a satisfactory point where this feature would have users, especially if we suppress the use of strings as you suggested. If I got it correctly, you are suggesting that the interface returns a dummy class, which would be dynamically casted by the language support into something more meaningful. I.e. the build manager and language support would be contract-tied by some subclass. If we do so, then it totally makes sense for C support to move the defines and includepaths into this class. That way, instead of having these language-specific features into the generic classes, where they definitely don't belong, they would be encapsulated into the language features, and the language features would then have users (and I could snap these parser features into it at the same time). The instance returned by the language features for a C/C++ file would thus have the following methods:
>     
>     - defines() (taking over IBuildSystemManager::defines())
>     - includePaths() (taking over IBuildSystemManager::includeDirectories())
>     - languageFeatures() (returning a bit-field of the features the C parser should enable/disable)
>     
>     Then the whole thing would need to be named differently, "parser features" comes to mind as this is only used by the parser. How does that look? All in all, it should make everything more sound.
> 
> Andreas Pakulat wrote:
>     Hmm, I'm not sure we're talking about the same thing, what I had in mind was something like:
>     
>     QSet<ILanguageFeature> ILanguageSupport::determineLanguageFeatures( const KUrl& file, const QStringList& commandline, const QMap<QString,QString>& environment);
>     
>     And the buildsystemmanager calls that for each file after it determined the commandline during parsing and caches the result. It can be re-computed when necessary as the files change. The parser can then query this based on the file when it needs it...
>     
>     If we notice that additional things are necessary in the future to determine the language features we could add another overload which takes more parameters. Alternatively we could extend the buildsystem manager to simply expose the information:
>     
>     QStringList IBuildSystemManager::buildCommandForFile( const KUrl& file );
>     QMap<QString,QString> IBuildSystemManager::buildEnvironmentForFile( const KUrl& file );
>     
>     The latter would allow to drop the ILanguageFeature interface from the public API and let the language support have a simple enum for the different features internally.
> 
> Milian Wolff wrote:
>     Andreas, how would you figure out whether your ILanguageFeature is actually something like "enable C99 mode"? dynamic_cast<Cpp::C99LanguageFeature>(set.at(...))?
>     
>     But I agree that by making the buildCommand + build environment publically accessible to the parser, we could read it and enable/disable language-specific features on demand. That does sound like a scalable way to address this issue, no? The question of course remains how easy this is to implement... For the QMake manager e.g. I'd have no idea how to actually return a build command... The environment could be returned though, which might also include something like e.g. QMAKE_CXX_FLAGS but then the cpp plugin would need to handle that... And how one sets the compiler in QMake - I have no idea - apparently like this: http://stackoverflow.com/questions/4500720/qmake-extra-compilers-problems-with-dependencies-between-header-files ... So no idea how that should be handled in this new API...
>     
>     All in all I agree that this all seems to be too complicated. I'd rather have a simple configuration interface as proposed by Andreas, where a user could e.g. say "foo/*.c + foo/*.h" => this is C-code, while "bar/*.cpp" and "bar/*.h" is c++ code. Maybe this could be done in such a way that a project manager like kdev-kernel could automatically supplement information such that "path-to-kernel-project/*" is handled as C code. Of course that would be CPP-language-plugin specific, but thats enough. I have not yet heard of any language plugin that would require this in a generic way anyways, i.e. Sven does not plan to support both Python 2.x and Python 3.x.

Well as Alexandre said its not just C vs. C++, there are certain features for each language that oe might want to have enabled or disabled. Though most of the time I guess having the IDE parser accept a bit more code than the compiler is ok as long as language itself is correct.

The python 2 vs. 3 example cannot be handled automatically anyway, since there is no buildsystem only the interpreter which is used to run the script and the intention of the author which version he wants to use. So for Python it wouldalso need to be a project config.


- Andreas


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://git.reviewboard.kde.org/r/105299/#review15362
-----------------------------------------------------------


On June 20, 2012, 1:43 a.m., Alexandre Courbot wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://git.reviewboard.kde.org/r/105299/
> -----------------------------------------------------------
> 
> (Updated June 20, 2012, 1:43 a.m.)
> 
> 
> Review request for KDevelop.
> 
> 
> Description
> -------
> 
> Add support for "language features"
> 
> Sometimes the same language can run using different variants - the most
> obvious example is the C++ language support, which may also support C
> and other variants and behave differently according to the type of file.
> 
> This patch adds a new method to IBuildSystemManager allowing it to get a
> list of features to pass to ILanguageSupport::createParseJob as an
> additional argument. ILanguageSupport can then adapt the behavior of its
> parser according to the features the build manager says the parsed file
> uses.
> 
> Corresponding support for kdevelop (since this patch breaks API compatibility) is there: https://git.reviewboard.kde.org/r/105300/
> 
> 
> Diffs
> -----
> 
>   language/backgroundparser/backgroundparser.cpp 417a8e4b7f38acfa959959895f186c11e3a76f93 
>   language/backgroundparser/tests/testlanguagesupport.h ed3864c9e8da8eed97d3d91500eec6c623fae41e 
>   language/backgroundparser/tests/testlanguagesupport.cpp 3f88894d728610ebd433bff46936f38dcd2138be 
>   language/interfaces/ilanguagesupport.h 22cedf09656aaf80275dd3a14d3752003fe9a912 
>   project/interfaces/ibuildsystemmanager.h c0813d8f781b0be29829b9278f191299af823b68 
>   project/interfaces/ibuildsystemmanager.cpp 74af0e76f8c8bc9276d79ff54be4d3d41927c298 
> 
> Diff: http://git.reviewboard.kde.org/r/105299/diff/
> 
> 
> Testing
> -------
> 
> Compiled kdevplatform & updated kdevelop, checked things were working.
> 
> 
> Thanks,
> 
> Alexandre Courbot
> 
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.kde.org/pipermail/kdevelop-devel/attachments/20120709/b5759d30/attachment.html>


More information about the KDevelop-devel mailing list