c++ code completion status report

Tue Jan 8 16:22:03 UTC 2002

Hi Richard,

> I had a look at the sources in gcc last night, and have to say I got 'fear of
> large grammars' from looking at the C++ grammar in gcc/cp/parse.y :). The
> lexer in lex.c/lex.h (it doesn't use flex) looked as though it might be more
> easily adaptable. I haven't looked at the preprocessor code yet.

Horrifying, isn't it? But fortunately we only need to parse declarations
and only a few statements.

> One way of forcing an entry point into a bison grammar might be to use a
> backdoor into lex. You could have a function which puts lex into a certain
> start state where it emits a special symbol like 'CODE_COMPLETION_START'
> which didn't really exist, and then after that the lexer behaves normally.

Sounds good. I'll consider it.
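
To make the idea concrete, here is a minimal sketch of such a backdoor. It is
a toy hand-written lexer, not gcc's lex.c, and the names Lexer, Tok, Token and
startCodeCompletion() are made up for illustration. Only the marker name
CODE_COMPLETION_START comes from the suggestion above. A grammar could then
have a rule that begins with that marker and serves as the extra entry point.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical token kinds. CODE_COMPLETION_START never appears in real
// source text; it is only ever injected through the backdoor below.
enum class Tok { CODE_COMPLETION_START, IDENT, PUNCT, END };

struct Token {
    Tok kind;
    std::string text;
};

// Toy lexer: identifiers and single-character punctuation only.
class Lexer {
public:
    explicit Lexer(std::string src) : src_(std::move(src)) {}

    // The backdoor: the next call to next() returns the marker token,
    // after which the lexer behaves normally.
    void startCodeCompletion() { pendingMarker_ = true; }

    Token next() {
        if (pendingMarker_) {
            pendingMarker_ = false;
            return {Tok::CODE_COMPLETION_START, ""};
        }
        while (pos_ < src_.size() && std::isspace(uc(pos_))) ++pos_;
        if (pos_ >= src_.size()) return {Tok::END, ""};
        if (std::isalpha(uc(pos_)) || src_[pos_] == '_') {
            std::size_t start = pos_;
            while (pos_ < src_.size() &&
                   (std::isalnum(uc(pos_)) || src_[pos_] == '_')) ++pos_;
            return {Tok::IDENT, src_.substr(start, pos_ - start)};
        }
        return {Tok::PUNCT, std::string(1, src_[pos_++])};
    }

private:
    // <cctype> functions need an unsigned char value.
    unsigned char uc(std::size_t i) const {
        return static_cast<unsigned char>(src_[i]);
    }

    std::string src_;
    std::size_t pos_ = 0;
    bool pendingMarker_ = false;
};
```

Calling startCodeCompletion() before the first next() on "a+b." yields the
marker first and then the ordinary tokens a, +, b and '.'.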

I tested my top-down parser on a buggy expression, and it worked without
moaning. But now I need backtracking, so I need to think twice before
continuing work.

> Another way would be to add extra stuff to the partial expression, so it was
> always complete and grammatically correct before passing it on to bison.

Hm, but how do we find out what to add so that it causes no errors?

> But I think it would be best to use bison to parse up to the previous complete
> statement, then just do something simple with regular expressions to pick out
> the code completion variable in the current statement.

Actually we don't need the previous statement (it can even stop the
parser if it's buggy). We only have to heed it if it's a declaration.

> In the example above,
> as long as b wasn't declared in the same statement as the expression, it
> would work ok. You don't need to parse 'a+b.' well enough to evaluate it,
> just well enough to find the name of identifier b. You just need to look for
> a type specifier or a cast before the code completion variable with regular
> expression and QStrings.

Yes, I also had some thoughts about 'reverse parsing', but that is quite
difficult too. It may also be better to have a more powerful parser, so
that it stays useful for later extensions.

> + b. ==> look for a type declaration on a previous statement
> (mytype) b ==> use mytype as type
> mytype b ==> use mytype as type

mytype (b) ==> use mytype as type
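
A rough sketch of these lookups, assuming std::regex and std::string as
stand-ins for the regular expressions and QStrings mentioned above. The
function names completionVariable() and guessType() are invented for this
example. It is deliberately naive: a keyword in front of the variable, as in
'return b', would be mistaken for a type.

```cpp
#include <regex>
#include <string>

// Returns the identifier directly before a trailing '.' or "->",
// e.g. "b" for "a+b.". Returns an empty string if there is none.
std::string completionVariable(const std::string& expr) {
    static const std::regex re(R"(([A-Za-z_]\w*)\s*(?:\.|->)\s*$)");
    std::smatch m;
    return std::regex_search(expr, m, re) ? m[1].str() : std::string();
}

// Tries the three textual patterns from the list above, in this order:
//   (mytype) b    mytype (b)    mytype b
// 'var' must be a plain identifier, since it is pasted into the pattern.
// Returns an empty string if no pattern matches.
std::string guessType(const std::string& code, const std::string& var) {
    const std::string id = R"(([A-Za-z_]\w*))";
    const std::regex patterns[] = {
        std::regex(R"(\(\s*)" + id + R"(\s*\)\s*)" + var + R"(\b)"),
        std::regex(R"(\b)" + id + R"(\s*\(\s*)" + var + R"(\s*\))"),
        std::regex(R"(\b)" + id + R"(\s+)" + var + R"(\b)"),
    };
    for (const std::regex& re : patterns) {
        std::smatch m;
        if (std::regex_search(code, m, re)) return m[1].str();
    }
    return std::string();
}
```

For 'a+b.' alone guessType() finds nothing, which corresponds to the first
case in the list: the declaration has to be looked up in earlier statements.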

And how about this:
class A { ... }; class B { ... }; class C { ... };
C operator+(A,B);
A a; B b;
... (a+b). // we should list C's members
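
To show why this case needs more than text matching, here is a minimal
sketch of the lookup the completer would have to do. SymbolTable,
OperatorKey and typeOfBinary() are hypothetical names, and the table only
handles exact type matches for free binary operators: no implicit
conversions, no member operators, no templates.

```cpp
#include <map>
#include <string>
#include <tuple>

// (operator, left operand type, right operand type)
using OperatorKey = std::tuple<std::string, std::string, std::string>;

struct SymbolTable {
    std::map<std::string, std::string> variables;  // variable name -> type name
    std::map<OperatorKey, std::string> operators;  // signature -> return type
};

// Returns the result type of 'lhs op rhs', or an empty string if either
// variable or a matching operator declaration is unknown.
std::string typeOfBinary(const SymbolTable& st, const std::string& lhs,
                         const std::string& op, const std::string& rhs) {
    auto l = st.variables.find(lhs);
    auto r = st.variables.find(rhs);
    if (l == st.variables.end() || r == st.variables.end()) return std::string();
    auto it = st.operators.find(OperatorKey{op, l->second, r->second});
    return it == st.operators.end() ? std::string() : it->second;
}
```

With a declared as A, b as B and 'C operator+(A,B)' registered,
typeOfBinary(st, "a", "+", "b") gives "C", so C's members would be listed.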

Darn C++, uh? ... ;)

BTW: "a+" could also get code completion if a's type is a class that has
'+' overloaded (so it's like argument hinting)