c++ code completion status report

Tue Jan 8 13:17:04 UTC 2002

On Monday 07 January 2002 9:11 pm, Thomas Schilling wrote:
> > But on the other hand, I like the idea of using the gcc 3.0.x preprocessor,
> > tokenizer and bison grammar code as a starting point in 'advancedcpp'. Then
> > gideon would become a sort of 'Visual gcc'. It isn't that your grammar would
> > be any better or worse than the gcc one, but actually using the same code
> > from gcc for class browsing and code completion etc. would be new in an IDE I
> > believe.
I had a look at the sources in gcc last night, and have to say I got 'fear of 
large grammars' from looking at the C++ grammar in gcc/cp/parse.y :). The 
lexer in lex.c/lex.h (it doesn't use flex) looked as though it might be more 
easily adaptable. I haven't looked at the preprocessor code yet.

> Yup! Sounds good but I found out something bad:
> One part of the CC parser would be evaluating expressions -
> _incomplete_ expressions. Consider this example:
>  a+b. // cc needed
> would be parsed by BISON/YACC bottom-up, so if any error occurs
> (as it always does in code that needs completion) the root element
> (since we build up a kind of expression tree) cannot be assigned.
> A way to avoid this would be a top-down parser. But errors are
> generally a severe problem. The parser could skip all non-decl.
> statements that are not in the current line but it's hard (if possible
> at all) to let BISON parsers start at any other rule than the start rule.
> I think I need to try to write a top-down parser with back-tracking.
> Does anyone have a better idea? (if you have understood my problem ;-)
One way of forcing an entry point into a bison grammar might be to use a
backdoor into lex. You could have a function which puts lex into a special
start state where it first emits a synthetic token like 'CODE_COMPLETION_START',
which never appears in real source, and after that the lexer behaves normally.
Bison would still start from the start rule, but in the example below it
would skip parsing 'normal_start_to_rule' and dive straight into
'rest_of_rule':

my_grammar_rule:
	normal_start_to_rule rest_of_rule
	| CODE_COMPLETION_START rest_of_rule
	;
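
As a rough sketch of the lexer side of that idea, in C++: the token codes and
the CompletionLexer class are made up for illustration (a real parser would
take its token codes from the header bison generates), and the "real" lexer is
just passed in as a callable.

```cpp
#include <functional>
#include <utility>

// Hypothetical token codes; a real project would use the bison-generated ones.
enum Token { END_OF_INPUT = 0, IDENTIFIER = 258, CODE_COMPLETION_START = 259 };

// Wraps an ordinary lexer function and can inject one synthetic token
// ahead of the first real one.
class CompletionLexer {
public:
    explicit CompletionLexer(std::function<int()> realLexer)
        : realLexer_(std::move(realLexer)) {}

    // The "backdoor": request CODE_COMPLETION_START on the next call.
    void startAtCompletionRule() { pendingMarker_ = true; }

    // What the parser would call in place of yylex().
    int nextToken() {
        if (pendingMarker_) {
            pendingMarker_ = false;        // emit the marker exactly once
            return CODE_COMPLETION_START;
        }
        return realLexer_();               // afterwards behave normally
    }

private:
    std::function<int()> realLexer_;
    bool pendingMarker_ = false;
};
```

The parser then sees CODE_COMPLETION_START as its first token and can only
reduce through the second alternative of 'my_grammar_rule'.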

Another way would be to pad the partial expression with extra tokens (for
instance a dummy member name and a closing ';'), so that it is always complete
and grammatically correct before it is passed on to bison.
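
A minimal sketch of what such padding could look like, in C++. The function
name and the placeholder identifier are my own inventions, and it ignores
string literals and comments, so treat it as an illustration only.

```cpp
#include <cstddef>
#include <string>

// Hypothetical member name appended after a dangling '.' or '->'.
const std::string kPlaceholder = "completion_placeholder";

// Returns a copy of `partial` padded so that a statement grammar could
// accept it: dangling member access gets a name, open parentheses are
// closed, and a ';' is appended.
std::string padPartialExpression(const std::string& partial) {
    std::string padded = partial;

    // Drop trailing blanks so the checks below see the last real character.
    while (!padded.empty() && (padded.back() == ' ' || padded.back() == '\t'))
        padded.pop_back();

    auto endsWith = [&padded](const std::string& suffix) {
        return padded.size() >= suffix.size() &&
               padded.compare(padded.size() - suffix.size(),
                              suffix.size(), suffix) == 0;
    };
    if (endsWith(".") || endsWith("->"))
        padded += kPlaceholder;

    // Count parentheses that are still open (literals/comments not handled).
    int depth = 0;
    for (char c : padded) {
        if (c == '(') ++depth;
        else if (c == ')' && depth > 0) --depth;
    }
    padded.append(static_cast<std::size_t>(depth), ')');

    padded += ';';
    return padded;
}
```

With this, 'a+b.' would reach the parser as 'a+b.completion_placeholder;'.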

But I think it would be best to use bison to parse up to the previous complete
statement, then do something simple with regular expressions to pick out
the code completion variable in the current statement. In the example above,
as long as b wasn't declared in the same statement as the expression, it
would work ok. You don't need to parse 'a+b.' well enough to evaluate it,
just well enough to find the name of the identifier b. Then you just need to
look for a type specifier or a cast before the code completion variable, using
regular expressions and QStrings.

+ b. ==> look for a type declaration on a previous statement
(mytype) b ==> use mytype as type
mytype b ==> use mytype as type
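
The three cases above could be sketched roughly like this. I've used
std::regex and std::string here instead of Qt's regular expressions and
QStrings only to keep the example self-contained; the function names are made
up, and the patterns are deliberately naive (no pointers, templates or
qualified names, and 'return b' would be mistaken for a declaration).

```cpp
#include <regex>
#include <string>

// Name of the identifier directly before a trailing '.' or '->',
// or an empty string if the statement does not end that way.
std::string completionVariable(const std::string& statement) {
    static const std::regex tail(R"(([A-Za-z_]\w*)\s*(\.|->)\s*$)");
    std::smatch m;
    if (std::regex_search(statement, m, tail))
        return m[1].str();
    return std::string();
}

// Type named in a cast "(mytype) var" or a declaration "mytype var",
// or an empty string if neither form is found in `code`.
std::string guessType(const std::string& code, const std::string& var) {
    const std::string ident = R"(([A-Za-z_]\w*))";
    const std::regex cast(R"(\(\s*)" + ident + R"(\s*\)\s*)" + var + R"(\b)");
    const std::regex decl(R"(\b)" + ident + R"(\s+)" + var + R"(\b)");
    std::smatch m;
    // Try the cast form first, then fall back to a plain declaration.
    if (std::regex_search(code, m, cast) || std::regex_search(code, m, decl))
        return m[1].str();
    return std::string();
}
```

So for 'a+b.' you would call completionVariable() on the current statement to
get 'b', then run guessType() over the current statement and, if that finds
nothing, over the previous statements.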

-- Richard