C++ code completion status report

Fri Jan 4 19:40:05 UTC 2002

Ah, looks like we have lots of experts around here ;-)

---
From: Richard Dale
> [...] If translating to english is a problem, [...]

No, English is no problem. Definitely.

The reason why we didn't have our discussion
in public was that we hoped to be faster this way
(I assume). But maybe it was all just convenience ...

> I can't find any bison grammar in cppsupport. I see two flex grammars with
> minor differences:
>
> tokenizer-cc.l  tokenizer.l

No, the grammar has not been made public yet. These are the
lexers used by the ClassParser by Jonas Nordin, which
was written in pure C++ and could only read classes.

On my local machine I now have an (almost) complete
lexer (i.e., these two are obsolete).
The complete grammar also sits on my local machine.
It's the one that Stroustrup wrote in his book on C++.
It only describes a superset of the actual language and hence
is not sufficient on its own. There is also another grammar file at
ftp://ftp.iecc.com/pub/file/c++grammar.tar.gz.

> If you only want these tokens to be parsed in a 'more detailed' parse phase
> why not have one flex grammar and use the 'REJECT' macro to reject these
> tokens if doing the standard 'less detailed' parse for the class tree, as
> opposed to code completion?

I think this won't suffice.

Ok, I'll show you my idea of the CC parser:
(The UI I will keep for another day :)

First, here's an example the parser must be able to handle:
|  struct S1 { int x, y; };
|  struct S2 { int a, b; };
|  class C { public:
|    S1 a(int i);
|    S2 a(double d);
|  };
| /* ... */
| int j, k;   C c;
| j = c.a(2). // list members of S1
| k = c.a(2.0). // list members of S2

So the parser has to be able to evaluate an arbitrary
expression with respect to its result type to determine what
to do/show. My idea was to build up a complete parse tree
(doesn't matter if through DOM or any other interface - I'm
working on one that is as small as possible). In that tree we
then look for the node whose type we have to determine (this
can be a complete sub-expression) and decide what to do.
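
To make the idea concrete, here is a rough sketch of evaluating the
result type of a node such as c.a(2) and listing the members to offer.
Everything in it is an assumption for illustration only: the Expr,
Method, TypeInfo and SymbolTable types, the exact-match overload rule
and the single-parameter restriction are all made up, and none of it is
the actual parser or class store.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical symbol information; not the real class store.
struct Method {
    std::string name;
    std::string paramType;   // one parameter, matched exactly (simplification)
    std::string returnType;
};

struct TypeInfo {
    std::vector<std::string> fields;
    std::vector<Method> methods;
};

using SymbolTable = std::map<std::string, TypeInfo>;
using Variables = std::map<std::string, std::string>;  // variable -> type name

// Parse tree node: a literal, a variable, or a call "object.name(arg)".
struct Expr {
    enum Kind { IntLiteral, DoubleLiteral, Variable, MemberCall } kind;
    std::string name;              // variable or method name
    std::shared_ptr<Expr> object;  // MemberCall only
    std::shared_ptr<Expr> arg;     // MemberCall only
};

std::shared_ptr<Expr> makeExpr(Expr::Kind kind, const std::string& name = "") {
    return std::make_shared<Expr>(Expr{kind, name, nullptr, nullptr});
}

// Returns the type name of the expression, or "" if it cannot be resolved.
std::string resultType(const Expr& e, const SymbolTable& types,
                       const Variables& vars) {
    switch (e.kind) {
    case Expr::IntLiteral:
        return "int";
    case Expr::DoubleLiteral:
        return "double";
    case Expr::Variable: {
        Variables::const_iterator it = vars.find(e.name);
        if (it == vars.end()) return "";
        return it->second;
    }
    case Expr::MemberCall: {
        assert(e.object && e.arg);
        SymbolTable::const_iterator cls =
            types.find(resultType(*e.object, types, vars));
        if (cls == types.end()) return "";
        const std::string argType = resultType(*e.arg, types, vars);
        // Pick the overload whose parameter type matches the argument type.
        for (const Method& m : cls->second.methods)
            if (m.name == e.name && m.paramType == argType) return m.returnType;
        return "";
    }
    }
    return "";
}

// Members to list after "expr." (only data members in this sketch).
std::vector<std::string> completionsAfterDot(const Expr& e,
                                             const SymbolTable& types,
                                             const Variables& vars) {
    SymbolTable::const_iterator it = types.find(resultType(e, types, vars));
    if (it == types.end()) return std::vector<std::string>();
    return it->second.fields;
}

// The S1 / S2 / C declarations from the example above.
SymbolTable exampleTypes() {
    SymbolTable t;
    t["S1"].fields.push_back("x");
    t["S1"].fields.push_back("y");
    t["S2"].fields.push_back("a");
    t["S2"].fields.push_back("b");
    t["C"].methods.push_back(Method{"a", "int", "S1"});
    t["C"].methods.push_back(Method{"a", "double", "S2"});
    return t;
}
```

With a variable table that maps c to C, the node for c.a(2) resolves to
S1 and the node for c.a(2.0) resolves to S2. A real implementation would
need proper overload resolution with implicit conversions, which this
sketch deliberately leaves out.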

The best method (I think, at the moment) would be to parse
the current source file up to the cursor and examine the node
created last. We have to consider all locally and globally
defined variables and types (maybe even labels) to resolve
that node's type.

Since this parser examines only the current file, it's pure
memory access and thus should be very fast. A simple
optimization would be to skip all blocks (compound statements)
that close before the block the cursor is in (but only code
blocks, not declarations).
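
A minimal sketch of how the skippable blocks could be found, assuming a
plain brace scan over the text in front of the cursor. The function name
closedBlocksBefore and the Range type are hypothetical. The scan does
not tell code blocks from declarations and does not know about braces in
strings or comments, so a real version would have to work on tokens and
filter out class/struct bodies.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Offsets of the opening and the closing brace of one block (inclusive).
using Range = std::pair<std::size_t, std::size_t>;

// Returns the outermost brace blocks that are already closed before
// `cursor`. Blocks that are still open at the cursor are not reported,
// and blocks nested inside a reported block are dropped.
std::vector<Range> closedBlocksBefore(const std::string& source,
                                      std::size_t cursor) {
    assert(cursor <= source.size());
    std::vector<std::size_t> open;  // positions of currently open '{'
    std::vector<Range> closed;
    for (std::size_t i = 0; i < cursor; ++i) {
        if (source[i] == '{') {
            open.push_back(i);
        } else if (source[i] == '}' && !open.empty()) {
            const std::size_t start = open.back();
            open.pop_back();
            // Blocks recorded after `start` are nested in this one.
            while (!closed.empty() && closed.back().first > start)
                closed.pop_back();
            closed.push_back(Range(start, i));
        }
    }
    return closed;
}
```

For text where the cursor sits inside the body of a second function, the
result is the whole body of the first function plus any inner block of
the second function that has already been closed. The enclosing body
itself stays, since it is still open at the cursor.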

From: Victor Röder
> > store was last written to, and only parse those?
> When you want to do code completion you need the source information of the
> used library/-ies (e.g. Qt). And this information has to be *preparsed*
> and stored so that cc is fast. Another advantage of a persistent class
> store is the "high-speed" project loading.

I vote FOR a persistent CS, because I don't know how much
memory lots of libraries would cost. And to wrap up this
XML or DOM or whatever discussion: It may be possible
to use XML or DOM as the communication layer between
the front-end (tree of libraries) and the back-end (any sort
of database or file). Otherwise I don't know what to use it
for. Ain't this what XML wants to be - a framework for
structured information interchange?
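
Just to illustrate what "preparsed and stored" could mean in the
simplest case, here is a sketch that writes class names with their
members to a stream and reads them back. The ClassStore type, the
saveStore/loadStore names and the one-line-per-class text format are
assumptions made up for this example. This is not the existing class
store format and has nothing to do with XML or DOM.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical preparsed data: class name -> member names.
using ClassStore = std::map<std::string, std::vector<std::string>>;

// Writes one line per class: "ClassName member1 member2 ...".
// Names must not contain spaces, since spaces separate the fields.
void saveStore(const ClassStore& store, std::ostream& out) {
    for (const auto& entry : store) {
        assert(entry.first.find(' ') == std::string::npos);
        out << entry.first;
        for (const std::string& member : entry.second) out << ' ' << member;
        out << '\n';
    }
}

// Reads the format written by saveStore; empty lines are skipped.
ClassStore loadStore(std::istream& in) {
    ClassStore store;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string className;
        if (!(fields >> className)) continue;
        std::vector<std::string>& members = store[className];
        std::string member;
        while (fields >> member) members.push_back(member);
    }
    return store;
}
```

Any std::ostream / std::istream works here, so the same two functions
can be used with a file stream for the on-disk store or a string stream
for testing.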

From: Richard Dale
> The current class store is doing the job for multiple languages ok as far as I
> can see, because C++ is a 'superset' of everything else. C++ has multiple
> inheritance, class variables, instance variables etc etc.
AFAIK the class store doesn't support 'using' directives and 'using' declarations.
Furthermore, I don't yet know how best to handle preprocessing directives.

From: Andrea Aime
> Ah, just as a sidenote, has someone considered
> ANTLR? www.antlr.org. This parser builder has the
> built-in ability to create the AST of the parsed sources...
> seems to be an interesting alternative to flex/yacc...

I think we should now settle on one parser. Also, this seems
to be Java, ain't it?

Ok, that's enough from me for now.
Let's hear your comments. ;-)

- Thomas