more code completion fun!

Daniel Berlin dberlin at redhat.com
Fri Mar 9 07:34:08 GMT 2001


To reply to both emails at once:

Roland Krause <rokrau at yahoo.com> writes:

> I've got a question, this may be a stupid one.
> 
Not at all.

> This is all to parse the file the user is currently working in, right?
> 
No, it's to parse the files needed to do code completion for the file
the user is working in, which is a larger set.

> So the goal is to find out what the user is trying to do. Then what
> happens if the current file is full of nonsense e.g. 
> I start a method definition:
> 
> namespace mynamespace {
> 
> myclass::mymethod (int myarg1, int myarg2) {
>    myarg2=myarg1-3
> 
> 
> and i forget to close the damn function scope and maybe also forget the
> semicolon in the function or make other errors.
> 
> Then i go on with next method
> 
> myclass::mymethod2 (int myarg1, int myarg2) {
>    myarg2=myarg1;
> 
>   
> }
> 
> 
> Will this all then work in myclass::mymethod2 at all? 
With an ANTLR-based parser, it's pretty trivial to make this work.
One of ANTLR's greatest strengths is that, unlike YACC, error recovery
is very easy to implement well. (The other one usually cited is
that it generates code that looks almost exactly like what a human
would write, which makes it very easy to debug.)
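
To make the recovery idea concrete, here is a minimal hand-written
sketch. It is not ANTLR-generated code and not anyone's actual parser.
It assumes the input is already split into tokens, and the names
isIdent, methodHead, findMethods and Method are made up for the
illustration. The resync rule is also an assumption: a method
definition cannot start inside another body, so on seeing
"Class :: method ( ... ) {" inside a body we conclude the previous '}'
was forgotten and carry on from there.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

const std::size_t kNone = static_cast<std::size_t>(-1);

// True if the token starts like a C++ identifier.
bool isIdent(const std::string& s) {
    return !s.empty() &&
           (std::isalpha(static_cast<unsigned char>(s[0])) || s[0] == '_');
}

// Returns the index of the '{' that opens the body if tokens[i..] look like
// "Class :: method ( ... ) {", or kNone otherwise. Requiring the '{' keeps
// qualified calls such as "Foo::bar(1);" from being mistaken for definitions.
std::size_t methodHead(const std::vector<std::string>& t, std::size_t i) {
    if (i + 3 >= t.size() || !isIdent(t[i]) || t[i + 1] != "::" ||
        !isIdent(t[i + 2]) || t[i + 3] != "(")
        return kNone;
    int parens = 0;
    for (std::size_t j = i + 3; j < t.size(); ++j) {
        if (t[j] == "(") {
            ++parens;
        } else if (t[j] == ")" && --parens == 0) {
            return (j + 1 < t.size() && t[j + 1] == "{") ? j + 1 : kNone;
        }
    }
    return kNone;
}

struct Method {
    std::string name;
    bool closed;  // false if the body's closing '}' was never seen
};

std::vector<Method> findMethods(const std::vector<std::string>& t) {
    std::vector<Method> found;
    int depth = 0;  // brace depth inside the current body, 0 = not in a body
    for (std::size_t i = 0; i < t.size(); ++i) {
        const std::size_t open = methodHead(t, i);
        if (open != kNone) {
            // Recovery: if depth > 0 here, the previous body never closed.
            // Leave it marked as unclosed and resynchronize on this head.
            found.push_back({t[i] + "::" + t[i + 2], false});
            depth = 1;
            i = open;  // skip the parameter list and the opening '{'
        } else if (depth > 0 && t[i] == "{") {
            ++depth;
        } else if (depth > 0 && t[i] == "}") {
            assert(!found.empty());
            if (--depth == 0) found.back().closed = true;
        }
    }
    return found;
}
```

Fed the tokens of the broken example above, findMethods reports both
myclass::mymethod (flagged as unclosed) and myclass::mymethod2, so
completion can still work inside the second method.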
> Will mymethod1
> also be available for code completion already? 

No, because you have no mymethod1. mymethod would be. :)



> 
> Roland
> 
> --- christopher j bottaro <cjb at cs.utexas.edu> wrote:
> > phew, after attempting it with regular expressions, then moving on to
> > lex and yacc, i've finally gotten somewhere after ditching all that
> > and writing an LL(k) grammar for use with ANTLR 2.0.  so far the
> > grammar can parse method definitions, including parameter and local
> > variable declarations.  the grammar recognizes scopes and remembers
> > the scope where each var was defined.  for example...
> > 
> > virtual const char* SomeClass::AnotherClass::AMethod(int& x, char y)
> > {
> > 
> > 	// scope 1a
> > 	{
> > 		QString var;
> > 	}
> > 
> > 	// scope 1b
> > 	{
> > 		QPoint var;
> > 		
> > 		// scope 2a
> > 		{
> > 			QRect var;
> > 		}
> > 	}
> > 
> > }
> > 
> > the method will be parsed, including its return type, qualified name,
> > and parameters.  then codeblocks will be parsed, including their begin
> > and end lines, and all variable declarations enclosed in them will be
> > parsed.
> > 
> > after the parsing, it shouldn't be too hard to determine the line
> > number that the cursor is currently on, determine what scope that
> > falls in, then search the QDict of that scope for whatever variable
> > you want completion for, get its type, look up its type in KDevelop's
> > classstore, then populate a listbox full of members.
> > 
> > so far i've just tested it by building a parser outta the grammar and
> > running it in a console program on kwview.cpp which is 70k.  it parses
> > and prints out the results to cout in less than a second on a p2 600.

And better, it's easy to debug.
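
The scope lookup described in the quoted text can be sketched roughly
as follows. This is only an illustration under assumptions: std::map
stands in for the QDict, the Scope struct and the lookupType and
exampleMethodScope names are invented here, and the line numbers given
to the AMethod example are made up. It also ignores whether a
declaration comes before or after the cursor within its scope.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

struct Scope {
    int beginLine;
    int endLine;
    std::map<std::string, std::string> vars;  // variable name -> type name
    std::vector<Scope> children;              // nested code blocks
};

// Returns the type of `name` as visible from `line`, or "" if it is not
// visible there. Inner scopes are tried first, so an inner declaration
// shadows an outer one with the same name.
std::string lookupType(const Scope& s, int line, const std::string& name) {
    assert(s.beginLine <= s.endLine);
    if (line < s.beginLine || line > s.endLine) return "";
    for (const Scope& child : s.children) {
        const std::string inner = lookupType(child, line, name);
        if (!inner.empty()) return inner;
    }
    const auto it = s.vars.find(name);
    return it == s.vars.end() ? "" : it->second;
}

// The AMethod example from the quoted text, with hypothetical line numbers.
Scope exampleMethodScope() {
    Scope s1a{5, 7, {{"var", "QString"}}, {}};
    Scope s2a{14, 16, {{"var", "QRect"}}, {}};
    Scope s1b{10, 17, {{"var", "QPoint"}}, {s2a}};
    return Scope{1, 19, {{"x", "int&"}, {"y", "char"}}, {s1a, s1b}};
}
```

With the cursor on line 15, looking up "var" yields QRect; on line 11
it yields QPoint. The resulting type name is what would then be looked
up in the class store to fill the listbox.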

> > 
> > i suppose the real challenge is resolving function calls and pointer
> > derefs?  anyone have any suggestions?
Ignore them.

Don't try to play all the stupid disambiguation games necessary. Just
parse a superset of the language.
Most of it becomes irrelevant for code completion.
For instance, disambiguating function calls is pointless:
if it wasn't a valid function call, then when we go to look up the
function to display its possible arguments, we won't find it, and we
won't present anything to the user.
If it was a valid function call, we'll find something and present it
to the user.
Now, if you were trying to build an AST that you could generate code
from, it would be very relevant. However, we only care about the text,
and as long as we have the right parts of the text associated with the
right things, everything else should fall into place.
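
A minimal sketch of that "look it up and see" approach, under
assumptions: SymbolTable is a plain std::map standing in for the real
class store, and calleeBefore and callTips are names invented for this
example. Nothing here decides whether "name (" really is a call; a
failed lookup simply produces no suggestions.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the class store: function name -> signatures.
using SymbolTable = std::map<std::string, std::vector<std::string>>;

// Returns the identifier directly before the '(' at index `paren`,
// skipping whitespace, or "" if there is none.
std::string calleeBefore(const std::string& text, std::size_t paren) {
    assert(paren < text.size() && text[paren] == '(');
    std::size_t end = paren;
    while (end > 0 && std::isspace(static_cast<unsigned char>(text[end - 1])))
        --end;
    std::size_t begin = end;
    while (begin > 0 &&
           (std::isalnum(static_cast<unsigned char>(text[begin - 1])) ||
            text[begin - 1] == '_'))
        --begin;
    return text.substr(begin, end - begin);
}

// Signatures to show for the '(' at `paren`. An empty result means
// "show nothing", whether the text was a keyword, a cast, or a typo.
std::vector<std::string> callTips(const SymbolTable& known,
                                  const std::string& text,
                                  std::size_t paren) {
    const auto it = known.find(calleeBefore(text, paren));
    return it == known.end() ? std::vector<std::string>() : it->second;
}
```

For example, with a table that only knows "resize", the text
"w.resize (" produces its signatures, while "if (" or "(x)" produce an
empty list and nothing is presented.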

> > 
> > also, when i compile my grammar, i get like over a page of
> > non-determinisms, but they don't seem to matter at all...  here are
> > some of the ones that bother me...
> >

What is your lookahead set to? 


> > primary_expr
> > 	:	(scope)? IDENTIFIER
> > 	|	INTEGER (DOT INTEGER)?  // non-determinism here
> > 	|	STRING_LIT
> > 	|	CHAR_LIT
> > 	|	OP expression CP
> > 	;
> > 
> > // this rule is fine...
> > postfix_expr
> > 	:	primary_expr	(	(OP (arg_expr_list)? CP)
> > 				|	(OSB expression CSB)
> > 				)?
> > 				(	((DOT|PTR) postfix_expr)
> > 				)?
> > 	;
> > 
> > // this whole thing apparently causes a lot of non-determinisms
> > unary_expr
> > 	:	(unary_op|INCR|DECR)* postfix_expr
> > 	|	"sizeof" OP returntype CP
> > 	|	"sizeof" postfix_expr
> > 	;
> > 
> > cast_expr
> > 	:	(OP returntype CP)? unary_expr  // non determinism here
> > 	;
> > 

What are they non-deterministic on? (I.e., paste the output.)

Remember, ANTLR is LL(k), but it uses linear approximate lookahead,
which means some things look non-deterministic to it that really aren't
non-deterministic for full LL(k), for a given k.
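
To illustrate the difference, here is a small sketch. It is not ANTLR's
actual analysis code, and the type and function names are invented.
Each alternative is described by the set of k-token sequences that can
start it. Full LL(k) compares whole sequences; the linear approximation
only keeps, for each depth, the set of tokens that can appear at that
depth, and reports a conflict when those sets overlap at every depth.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

using Seq = std::vector<std::string>;  // one lookahead sequence of k tokens
using Alt = std::set<Seq>;             // all such sequences for one alternative

// Full LL(k): a conflict needs an identical k-token sequence in both.
bool fullLLkConflict(const Alt& a, const Alt& b) {
    for (const Seq& s : a)
        if (b.count(s)) return true;
    return false;
}

// Tokens that can appear at depth d (0-based) in any sequence of `alt`.
std::set<std::string> tokensAtDepth(const Alt& alt, std::size_t d) {
    std::set<std::string> out;
    for (const Seq& s : alt) {
        assert(d < s.size());
        out.insert(s[d]);
    }
    return out;
}

// Linear approximation: a conflict is reported when the per-depth token
// sets overlap at every depth, even if no whole sequence is shared.
bool linearApproxConflict(const Alt& a, const Alt& b, std::size_t k) {
    for (std::size_t d = 0; d < k; ++d) {
        const std::set<std::string> ta = tokensAtDepth(a, d);
        const std::set<std::string> tb = tokensAtDepth(b, d);
        bool overlap = false;
        for (const std::string& tok : ta)
            if (tb.count(tok)) overlap = true;
        if (!overlap) return false;
    }
    return true;
}
```

Take a hypothetical rule with alternatives (A B | C D) and A D, at
k = 2. The sequence sets {A B, C D} and {A D} share nothing, so full
LL(2) sees no conflict. The approximation sees {A, C} against {A} at
depth 1 and {B, D} against {D} at depth 2, both overlapping, so it
reports a non-determinism that isn't really there.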



> > anyways, antlr is great, maybe the class parser should be rewritten
> > using it.  it's cool how rules can have return types and take
> > arguments and throw exceptions and such.  it's a lot more pleasurable
> > writing an LL(k) grammar also.  god, i wish i hadn't dropped my
> > automata theory class.
> > 
> > tell me what you all think...potentially useful?

See above about ANTLR.

I used to work on the C++ generator for ANTLR to improve the speed of
the generated code.

But I've always been a big fan.


> > 
> > christopher
> > 
> > -
> > to unsubscribe from this list send an email to
> > kdevelop-request at kdevelop.org with the following body:
> > unsubscribe »your-email-address«
> 
> 
