c++ code completion status report

Sun Jan 6 17:46:02 UTC 2002

On Saturday 05 January 2002 10:10 pm, Thomas Schilling wrote:
> > Yes, it does. Read http://www.acm.org/crossroads/xrds7-5/bison.html It
> > properly shows that Bison is arcane. I'm currently working with
> > flex/bison at university, and from a theoretical point of view, bison
> > uses a parsing algorithm that is pretty fast and yet powerful enough for
> > most languages (LALR(1)).
>
> Hm, interesting article.
Yes, a good read. I downloaded the 'accent' parser generator and tried it out. 
In order to get the bison grammar for gcc working with accent I had to change 
the %token statements to single line like this:

%token FOO, BAR, BAZ;

Where the tokens are separated by commas, and the list is terminated with a 
semi-colon.

I had to remove all instances of %right, %left and %nonassoc from the preamble 
before the main grammar. In the grammar itself, the only thing I needed to 
change was to remove all '%prec' directives. I assume you can add accent 
annotations to the rules themselves to replace this functionality - I haven't 
read the docs enough yet.

Also accent doesn't have a built in 'error' rule like bison.

> > However - it is not very intuitive, as you have to avoid
> > left-recursion and other shift/reduce and reduce/reduce-errors.
>
> Hm, well it's not that bad, I think. Currently i've only 168 shift/reduce
> conflicts and only ~40 reduce/reduce conflicts. I don't know if this is
> much but i suppose so. (it's due to "expr: expr '+' expr" and this for
> every operator but it works, although i use both left and right recursion).
The reduce/reduce conflicts are bad in the sense that you can't use the 
grammar until you've gotten rid of them. Most of the shift/reduce conflicts 
probably won't matter, but it isn't regarded as 'good form' to leave too many 
of them. Left recursion is more efficient than right recursion in bison, but 
I don't think that matters much here.

> > Algorithms
> > like GLR can handle those easily, but have a slightly higher complexity,
> > which doesn't really matter according to the author of the document cited
> > above.
>
> Sure performance doesn't matter, not really. The problem is that we have
> do minimize parser calls. I'd suggest to do it only when special
> characters were typed (such as '.' or '->') and waiting (max.) half a
> second for
> the completion shouldn't be too much trouble.
> Nevertheless I'm currently working on a concept to reuse information
> from earlier parser calls.
Should the user have to type a function key after typing '->' or '.'  to 
invoke the code completion widget? I wouldn't want it invoked every time I 
just typed those two characters in the editor.

>
> > Another possibility would be to take the same bison files as the GCC
> > project. Don't know if they are any usable, though.
I think the gcc grammar should be usable as a starting point, and someone else 
has already done all the work to remove parser conflicts. A version with all 
the embedded code removed and just the grammar rules comes as a 'stress test' 
with the perl yapp parser generator. That's the one I've been using to try 
out accent.

On Saturday 05 January 2002 10:18 pm, Eray Ozkural (exa) wrote:
> Nice article. Though LR(k) parsers are pretty common place (OTOH it would
> be inexact to call such an algorithm non-deterministic), and it's not that
> difficult to write your own LR(k) algorithm! It's just a hell lot of
> book-keeping to write a bottom-up parser. I think I'd written an LR(1)
> parser, hmm, you can express much of the code with recursion which makes it
> look very intuitive.  The old dragon book tries to do everything with for
> loops and stacks. Heh.
Yes, I don't think LR(k) is suitable for implementing prolog style 
non-deterministic parsers. Accent is effectively a combination of LL(1) and 
LR(1), but still deterministic. 

It should be so quick to parse a complete source file with bison when a user 
wants to do code completion (eg 0.5s), that my original suggestion about 
LL(k) and backtracking doesn't make much sense relative to all the extra work 
involved. 

-- Richard