kdev-pg: lookahead implementation
Jakob Petsovits
jpetso at gmx.at
Mon Feb 6 12:52:04 UTC 2006
Hi Roberto, list,
We're going to have a "question item" that is used to do lookahead.
That will make rules like
( ?(declaration) declaration | expression ) SEMICOLON
-> statement ;;
possible. Here we only parse the declaration when we know for sure that it's
going to be one; otherwise we parse the expression rule. A question item can
only hold terminals and symbols (no option items, no closures, i.e.
multiplication items, and so on), which keeps the whole thing simple.
On the implementation side, I still have not grasped the exact difference
between lookahead and backtracking. If I employ a technique like the one in my
java_lookahead helper class, where the class does regular parsing with the
only difference being that the tokens are not consumed, is that backtracking
or lookahead?
Also, would such an implementation make sense for the question item (after
all, it does lookahead-parsing with LL(1) characteristics too, and can do
lookahead-in-lookahead) or do we want to go for a more complicated solution?
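For illustration, the "parse normally, but don't consume the tokens" approach
described above could be sketched roughly as follows. This is only a minimal
sketch, not kdev-pg's generated code: TokenStream, parseDeclaration,
lookaheadIsDeclaration and the toy "declaration" rule (two identifiers in a
row) are all hypothetical names made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical token kinds, for illustration only.
enum TokenKind { IDENTIFIER, ASSIGN, NUMBER, SEMICOLON, END_OF_INPUT };

// Hypothetical token stream with a movable read position.
class TokenStream {
public:
    explicit TokenStream(std::vector<TokenKind> tokens)
        : tokens_(std::move(tokens)) {}

    // Kind of the k-th token ahead; k = 1 is the current token.
    TokenKind LA(std::size_t k = 1) const {
        assert(k >= 1);
        const std::size_t i = index_ + k - 1;
        return i < tokens_.size() ? tokens_[i] : END_OF_INPUT;
    }

    void consume() { if (index_ < tokens_.size()) ++index_; }
    std::size_t index() const { return index_; }
    void rewind(std::size_t index) { index_ = index; }

private:
    std::vector<TokenKind> tokens_;
    std::size_t index_ = 0;
};

// Toy rule (an assumption for this sketch): declaration -> IDENTIFIER IDENTIFIER
bool parseDeclaration(TokenStream& ts) {
    if (ts.LA() != IDENTIFIER) return false;
    ts.consume();
    if (ts.LA() != IDENTIFIER) return false;
    ts.consume();
    return true;
}

// Runs the regular rule, then always restores the token position,
// so the caller sees the stream exactly as it was before the call.
bool lookaheadIsDeclaration(TokenStream& ts) {
    const std::size_t saved = ts.index();
    const bool matched = parseDeclaration(ts);
    ts.rewind(saved);
    return matched;
}
```

A statement rule would then call lookaheadIsDeclaration() first and pick
either the declaration or the expression branch based on the result, with the
stream position unchanged in both cases.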
What should we do with semantic actions (code blocks) while doing lookahead?
After all, at least my Java grammar relies on some status variables (mainly
ltCounter, but also tripleDotOccurred) to be in a correct state.
If we skip semantic checks (code conditions) within the lookahead itself, we
get incorrect lookahead results, and if we skip code blocks, the semantic
checks become incorrect. On the other hand, if we allow code blocks, they
could mess up the parser structures if they are used for anything other than
parser state.
I'd like to put all user-specified instance variables (and the constructor)
into a common superclass which is then subclassed by both the parser class and
the lookahead helper class. I think it should be possible to copy those
variables to the lookahead class with the assignment operator and keep the
ones in the parser class untouched. Of course, that only works if the
variables are plain values, not pointers, or if the user also provides an
operator=() method. What do you think about that?
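A rough sketch of that superclass idea, under the assumption that the user
state consists of plain values only. ParserState, LookaheadHelper, Parser and
sawLessThan are hypothetical names invented for the example; ltCounter and
tripleDotOccurred are the status variables mentioned earlier.

```cpp
#include <cassert>

// Hypothetical common superclass holding the user-specified state.
struct ParserState {
    int ltCounter = 0;
    bool tripleDotOccurred = false;
};

class LookaheadHelper : public ParserState {
public:
    // Copy the state from the parser via ParserState's assignment operator.
    // With pointer members this would need a user-provided operator=().
    explicit LookaheadHelper(const ParserState& parserState) {
        static_cast<ParserState&>(*this) = parserState;
    }

    // Stand-in for a code block that runs during lookahead.
    void sawLessThan() { ++ltCounter; }
};

class Parser : public ParserState {
public:
    // Runs two "code blocks" on a copy of the state and reports whether
    // the copy advanced independently of the parser's own state.
    bool lookahead() {
        LookaheadHelper helper(*this);          // snapshot of the state
        assert(helper.ltCounter == ltCounter);  // copy starts out identical
        helper.sawLessThan();
        helper.sawLessThan();
        return helper.ltCounter == ltCounter + 2;
    }
};
```

The code blocks executed during lookahead only ever touch the helper's copy,
so the parser's own ltCounter is the same before and after the call.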
Finally, I'd like to extend the syntax to "?[int]( items )" so that
"?[2]( RBRACE )" would do a LA(2).kind == RBRACE check and
"?( LBRACE )" means "?[1]( LBRACE )".
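What such a check might boil down to in the generated parser, as a minimal
sketch. The Token and TokenStream types and the question() helper are
assumptions made for the example, not existing kdev-pg names; only the
LA(k).kind comparison is taken from the proposal above.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical token kinds and token type, for illustration only.
enum TokenKind { LBRACE, RBRACE, IDENTIFIER, END_OF_INPUT };
struct Token { TokenKind kind; };

class TokenStream {
public:
    explicit TokenStream(std::vector<Token> tokens)
        : tokens_(std::move(tokens)) {}

    // The k-th token ahead; k = 1 is the current token.
    // Past the end of the input, an END_OF_INPUT token is returned.
    Token LA(std::size_t k = 1) const {
        assert(k >= 1);
        const std::size_t i = index_ + k - 1;
        return i < tokens_.size() ? tokens_[i] : Token{END_OF_INPUT};
    }

private:
    std::vector<Token> tokens_;
    std::size_t index_ = 0;
};

// "?[k]( KIND )" as a plain predicate; "?( KIND )" is the k = 1 default.
bool question(const TokenStream& ts, TokenKind kind, std::size_t k = 1) {
    return ts.LA(k).kind == kind;
}
```

With this, "?[2]( RBRACE )" corresponds to question(ts, RBRACE, 2) and
"?( LBRACE )" to question(ts, LBRACE).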
So much for lookahead, expect my thoughts and questions on other topics soon,
Jakob