probably the best C++ parser...

Mon Jul 4 16:11:43 UTC 2005

i figured i'd weigh in since this is interesting to me...

> IMO, the optimal solution would be to use gcc parser running in background.
> So, as you type, the parser will consume the tokens you've typed, and
> provide exact information about type/members and so on.

last time i checked, gccxml sucks as a "reverse/reengineering parser". it
generates too much data to be even remotely efficient. unless the authors
have changed how the program defines "translation unit", it's still
performing regular c preprocessing, so every #include'd header gets expanded
into the output (which is what really causes the bloat).

> Another C++ parser I know about is Synopsis
> (http://synopsis.sourceforge.net/), which is reported to even parse Boost.
>
> At this point, I'm not quite sure if hacking gcc or improving Synopsis is
> the best way to go.

synopsis is certainly an option. you should also look at srcml 
(http://www.sdml.info/projects/srcml).

the problem with all of these is that they're non-incremental. you run them. 
they generate data. you change the code and then you have to completely 
re-run the program and regenerate the data.

i think what you'd really want in an ide is an incremental, interactive
parser. of course, its output ast also has to be highly interactive, to
allow not only analysis but also transformations (e.g., refactoring,
injected code, etc). just think about it like the CodeDom from that .NET
thing.
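
to make "incremental" a bit more concrete, here's a rough sketch in c++. it
assumes the editor already hands you the buffer split into top-level chunks
(one declaration each), and it only re-does work for chunks whose text
changed since the last update. all the names here (IncrementalIndex,
count_tokens) are made up for illustration, and counting tokens is just a
stand-in for a real parse. it's not how any of the tools above work.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for the expensive parse step: a real tool would
// build an AST here; this just counts whitespace-separated tokens.
inline std::size_t count_tokens(const std::string& text) {
    std::istringstream in(text);
    std::string tok;
    std::size_t n = 0;
    while (in >> tok) ++n;
    return n;
}

// Hypothetical incremental index: caches the result for each chunk of
// source text and re-parses only chunks it has not seen in the last update.
class IncrementalIndex {
public:
    // Takes the current buffer as a list of chunks.
    // Returns how many chunks actually had to be re-parsed.
    std::size_t update(const std::vector<std::string>& chunks) {
        std::unordered_map<std::string, std::size_t> next;
        std::size_t reparsed = 0;
        for (const auto& chunk : chunks) {
            if (next.count(chunk)) continue;      // duplicate within this buffer
            auto hit = cache_.find(chunk);
            if (hit != cache_.end()) {
                next.emplace(chunk, hit->second); // unchanged: reuse old result
            } else {
                next.emplace(chunk, count_tokens(chunk)); // new or edited
                ++reparsed;
            }
        }
        cache_ = std::move(next);                 // forget deleted chunks
        return reparsed;
    }

    // Cached token count for a chunk, or 0 if it is not in the buffer.
    std::size_t tokens(const std::string& chunk) const {
        auto hit = cache_.find(chunk);
        return hit == cache_.end() ? 0 : hit->second;
    }

private:
    std::unordered_map<std::string, std::size_t> cache_;
};
```

the point is just that editing one declaration should cost one re-parse, not
a full re-run over the whole file. a real reengineering parser would also
have to track dependencies between chunks (a changed typedef invalidates its
users), which this sketch ignores.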

it might be worth sitting down and actually thinking about what you want to 
get out of the background parser before suggesting existing projects that can 
be integrated or hacked up to accommodate your immediate needs. reengineering
parsers have very different requirements from a traditional compiler... they
also have very different architectures.

andrew sutton
asutton at cs.kent.edu