C++ Parsers for KDevelop (kinda long)

Richard Dale Richard_Dale at tipitina.demon.co.uk
Fri Feb 25 04:33:31 GMT 2000


I think an important consideration in evaluating the best parser for KDevelop is
how easy it would be to add grammars for additional languages - such as Java or
Perl. Is the OpenC++ parser code easy to adapt to parse other languages?

I'm not an expert in C++, but are there any major problems with the current
parser? Code under development often doesn't quite parse anyway, so to me an
IDE which is 'over fussy' could be counter-productive.
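
To illustrate what I mean by a parser that isn't 'over fussy', here is a rough
C++ sketch. It is not taken from KDevelop or from any of the parsers discussed
below; the function name findClassNames and its rules are my own assumptions.
It only looks for a name after "class" or "struct" and skips everything it
doesn't recognise, so unbalanced braces or half-typed statements can't make it
fail. It knows nothing about comments, strings or macros, and a declaration
like "friend class X;" is reported as if it were a class.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: collect the names that follow "class" or "struct".
// Anything the scanner does not understand is skipped rather than reported
// as an error, so incomplete code still yields a usable class list.
std::vector<std::string> findClassNames(const std::string& src) {
    std::vector<std::string> names;
    auto isIdent = [](char c) {
        return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
    };
    auto isSpace = [](char c) {
        return std::isspace(static_cast<unsigned char>(c)) != 0;
    };

    bool expectName = false;  // true right after "class" or "struct"
    std::size_t i = 0;
    while (i < src.size()) {
        if (!isIdent(src[i])) {
            // Whitespace keeps the expectation alive; punctuation cancels it.
            if (!isSpace(src[i])) expectName = false;
            ++i;
            continue;
        }
        // Read one identifier-like word.
        std::size_t start = i;
        while (i < src.size() && isIdent(src[i])) ++i;
        std::string word = src.substr(start, i - start);

        if (expectName) {
            expectName = false;
            // Accept the name only if it is followed by '{', ':', ';' or the
            // end of the input. This rejects "template <class T>".
            std::size_t j = i;
            while (j < src.size() && isSpace(src[j])) ++j;
            if (j == src.size() || src[j] == '{' || src[j] == ':' || src[j] == ';')
                names.push_back(word);
        } else if (word == "class" || word == "struct") {
            expectName = true;
        }
    }
    return names;
}
```

Given a buffer that declares "class Foo : public Bar {" with a broken method
body, followed by "struct Baz {" that is never closed, the function still
returns Foo and Baz.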

-- Richard

On Thu, 24 Feb 2000, you wrote:
> I was browsing the archives of the mailing list, and about a week ago
> there was a discussion of C++ parsers and what KDevelop should be using.
> Since this is an area I have a *lot* of expertise in, I figured I'd
> subscribe and offer some useful info.
> 
> First off, I've literally looked at every single C++ parser out there. In
> the projects I needed a C++ parser for (2 development environments),
> I had just about the same requirements KDevelop has. I hunted down every
> single C++ parser available, be it part of some larger compiler or
> standalone.
> I also evaluated writing my own in various ways (parsing is kinda my
> forte; I know my way around antlr, yacc, btyacc, etc.).
> 
> Before I go over the various parsers available, let me say this: I'm not
> listing them all, but I've evaluated them all. If you come across one and
> want to know anything about it (e.g. how hard it is to integrate into a
> class browser), feel free to ask.
> 
> So with that out of the way, let me go over the parsers, their pros, cons,
> etc.
> 
> The C++ Parser in OpenC++ (do a web search on openc++):
> 
> Pros: Probably the most complete of any parser. It can easily handle the
> STL now. It requires preprocessed text, but it took me about 3 days to
> hack enough of a preprocessor into it so that I could parse
> non-preprocessed text and not have to see the system classes.
> You can get *any* info you need out of it: types of variables in
> functions, etc.
> It's written in C++, and very, very well written. If you want an example of
> how to hand-code a recursive descent parser, this is a great one.
> It's also the fastest. When I removed the last pass (writing the info back
> out) and made it not have to deal with system include files, it took ~60
> milliseconds to parse and understand a 30k file.
> OpenC++ builds as a library, and the compiler is a very small driver
> linked with the library, so it can already be linked easily with
> other programs.
> 
> Cons: Uses a garbage collector, Boehm's to be exact. On the upside, it can
> be turned off with a simple define, but then you'd have to delete in the
> right places.
> This could be looked at as a pro, but I was on BeOS, and BeOS doesn't
> support Boehm's (I ported it eventually).
> The format of the parse tree is hard to use.
> On the upside, OpenC++ was built to extract info about a C++
> file, and has walkers and whatnot.
> You are better off overriding the various functions that were used to call
> metaobjects, and making them do your dirty work.
> 
> The C++ parser in CTAGS:
> 
> Pros: As fast as OpenC++.
> Doesn't need preprocessed text
> Supports more than just C++
> Provides a lot of the info you need
> 
> Cons:
> Not as easy to integrate
> Not all that well written (IMHO. It's hard to follow the flow).
> 
> The C++ parser in G++:
> 
> Pros: Sorry, there really are none. They want to rewrite it as well. 
> Cons: Completely full of hacks, not all that quick, etc.
> 
> The C++ parser in Doc++/Doxygen (DOC++ is the original, Doxygen made
> improvements on it):
> 
> Pros: Written in flex. I kid you not. It's a full C++ parser, written in
> flex. It's very easy to extend, and very well written.
> Pretty fast, about 40% as fast as OpenC++.
> Doesn't care if you give it preprocessed text or not.
> 
> Cons:
> Uses its own string/vector/etc. implementation. It took about a day of
> straight hacking to convert it to use the STL.
> I had many problems with memory leaks.
> You need to rip out the stuff that handles parsing the doc++ comments, but
> this takes a few minutes at the most.
> 
> 
> C-Browser yacc grammar (hard to find):
> Pros: Written in Yacc, fast.
> Cons: Doesn't handle much
> 
> The Empathy C++ Parser (the PCCTS based one):
> Pros: Pretty complete, includes preprocessor.
> Cons: My brain exploded when I tried to understand how to actually
> integrate it or modify it. And I know PCCTS and ANTLR inside and out.
> 
> 
> There are about 10 others (mainly part of compilers), not really useful. 
> (This includes Roskind's yacc grammar)
> 
> All told, your best approach is probably to take OpenC++'s parser, and
> make it use something besides lispish PTrees.
> 
> I've implemented class browsers and auto-completion, or attempted to, with
> each of these parsers.
> So feel free to ask if you have any questions.
> HTH,
> --Dan
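
Dan's closing suggestion, replacing the Lisp-style parse tree with something
else, might look roughly like the sketch below. The Cell type is only a
stand-in I made up for a cons-cell tree; it is not OpenC++'s actual tree class,
and ClassInfo, makeAtom, cons and toClassInfo are likewise hypothetical names.
The idea is to walk the list once and copy what a class browser needs into a
plain typed record.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a Lisp-style tree node: either an atom (text set,
// no children) or a pair of car/cdr pointers forming a linked list.
struct Cell {
    std::string atom;
    std::shared_ptr<Cell> car;
    std::shared_ptr<Cell> cdr;
};
using CellPtr = std::shared_ptr<Cell>;

CellPtr makeAtom(const std::string& text) {
    auto c = std::make_shared<Cell>();
    c->atom = text;
    return c;
}

CellPtr cons(CellPtr head, CellPtr tail) {
    auto c = std::make_shared<Cell>();
    c->car = head;
    c->cdr = tail;
    return c;
}

// The typed record a class browser would rather work with.
struct ClassInfo {
    std::string name;
    std::vector<std::string> members;
};

// Converts a list shaped like (class Name member1 member2 ...) into a
// ClassInfo. Returns false if the list does not have that shape.
bool toClassInfo(const CellPtr& list, ClassInfo& out) {
    if (!list || !list->car || list->car->atom != "class") return false;
    CellPtr rest = list->cdr;
    if (!rest || !rest->car) return false;

    out.name = rest->car->atom;
    out.members.clear();
    for (CellPtr p = rest->cdr; p; p = p->cdr) {
        if (p->car) out.members.push_back(p->car->atom);
    }
    return true;
}
```

Feeding it the list (class Widget show hide) fills in the name Widget and the
members show and hide; a list that starts with anything other than "class" is
rejected.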



