New parser branch (Was: Dumping the source DOM?)
Richard Dale
Richard_Dale at tipitina.demon.co.uk
Wed Jul 13 14:41:06 UTC 2005
On Wednesday 13 July 2005 13:55, Roberto Raggi wrote:
> On Wednesday 13 July 2005 13:43, Sylvain Joyeux wrote:
> > Do you think it would be possible, though, to use gcc-xml (or whatever
> > "static" parser) to parse external dependencies and build persistent
> > datastores ? It would workaround the problems you have to parse C++ in
> > real-time (which is far from being simple) when advanced functionalities
> > are not needed.
>
> it's not about parsing. We already have it. It is about store symbols.
> Think about it. You have your C++ source file parsed.. and now? well now
> you have to store the result of the parser in a suitable form for code
> completion and class browsing(and quick lookup).. I hope you're not
> thinking to use XML for that. The first version of my parser was using XML
> as intermediate representation(3 years ago).. and was stupid and slow. So I
> wrote Catalog and CodeModel. What we should do is to improve Catalog and
> CodeModel and add things like templates, operators, local scope, etc.
Ashley Winters is suggesting using an XML translation unit dump for the next
version of the Smoke bindings library. He is also suggesting doing the
runtime introspection via XPath, on XML files stored as text inside .so
libraries, one per class. Hmm, he has done some sizing and performance
testing and it didn't come out too badly.
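To make the XPath idea concrete, here is a rough sketch of looking up methods in a per-class XML description. The element and attribute names (Class, Method, Arg, access) are invented for illustration and are not the format of any real translation unit dump or of Smoke; Python's standard xml.etree.ElementTree is used only because it supports a small XPath subset.

```python
import xml.etree.ElementTree as ET

# Hypothetical per-class description; the schema is an assumption made up
# for this sketch, not the output of any real tool.
CLASS_XML = """
<Class name="QWidget">
  <Method name="show" access="public" returns="void"/>
  <Method name="resize" access="public" returns="void">
    <Arg type="int" name="w"/>
    <Arg type="int" name="h"/>
  </Method>
  <Method name="event" access="protected" returns="bool"/>
</Class>
"""


def public_methods(class_xml):
    """Return the names of all public methods, in document order."""
    root = ET.fromstring(class_xml.strip())
    return [m.get("name") for m in root.findall("./Method[@access='public']")]


def arg_types(class_xml, method_name):
    """Return the argument types of the named method."""
    root = ET.fromstring(class_xml.strip())
    path = f"./Method[@name='{method_name}']/Arg"
    return [a.get("type") for a in root.findall(path)]
```

Each lookup here reparses the text, which is the cost the sizing and performance testing would have to account for; a real implementation would presumably cache the parsed tree per class.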
You lose any comments with a translation unit dump, and they are needed for
both bindings and IDEs. Also you don't know which include file was associated
with which class, which is needed if you are going to generate .cpp code for
a language binding.
We had a discussion about this sort of thing at the Kiev conference. I've
started work on using the bison grammar that is part of Ruby for the next
KDevelop 3.3 class browser instead of regular expressions. The only
definition of the Ruby grammar is the bison grammar, and it wouldn't be very
easy to go straight to using Roberto's parser generator. So I will do the
bison version first, and then the new LL(1) grammar for KDE 4/Qt 4.
I had these questions to ask Roberto about how the top down recursive
descent parser compared with a bottom up Ruby one:
- Speed?
Roberto has measured his parser and it is faster than bison.
- Ease of use?
It has nearly all the features of bison except associativity and precedence
hints in the grammar. You can't use left recursion, and so for Ruby a small
number of grammar rules about lists of method arguments would need to be
changed.
- Error recovery?
Apparently easier with the new LL(1) generator, although bison seems fine to
me and I will just have to add a few more 'error' rules to skip to the next
valid token. But top down parsers have a 'better idea' of what they're
currently doing than bottom up ones.
- Language independent?
The parser generator and refactoring engine will be language independent,
and I would like to use Ruby as a test case to ensure they are. Roberto is
keen on Java and I think he will ensure there is nothing C++ specific that
won't work with a Java parser. It would be nice to have an access type of
'package' as well as the usual 'private', 'protected' and 'public' in the
language independent parts.
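On the left recursion point under 'Ease of use' above, here is a minimal sketch of the kind of rule change involved. A bison-style rule such as args : arg | args ',' arg is left recursive; a top down parser that transcribed it directly would call args() from args() without consuming a token and never terminate. The usual rewrite is args : arg (',' arg)*, which becomes a loop. The token set and the names ArgListParser and parse_args are invented for this sketch and have nothing to do with Roberto's generator or the real Ruby grammar.

```python
import re

# Tokens: identifiers plus the punctuation used in an argument list.
TOKEN_RE = re.compile(r"\s*([A-Za-z_]\w*|[(),])")
PUNCT = {"(", ")", ","}


def tokenize(text):
    """Split text into identifier and punctuation tokens."""
    tokens, pos = [], 0
    text = text.strip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class ArgListParser:
    """Hand-written recursive descent parser for: args : arg (',' arg)*"""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        """Return the next token without consuming it, or None at the end."""
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def arg(self):
        """arg : IDENTIFIER"""
        tok = self.peek()
        if tok is None or tok in PUNCT:
            raise SyntaxError(f"expected an argument, got {tok!r}")
        self.pos += 1
        return tok

    def args(self):
        """The left-recursive rule rewritten as iteration."""
        result = [self.arg()]
        while self.peek() == ",":
            self.pos += 1  # consume ','
            result.append(self.arg())
        return result


def parse_args(text):
    """Parse a comma-separated argument list and reject trailing tokens."""
    parser = ArgListParser(tokenize(text))
    result = parser.args()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected trailing token {parser.peek()!r}")
    return result
```

For example, parse_args("a, b, c") gives the three names in order, while parse_args("a, , c") raises SyntaxError at the second comma. One token of lookahead (the peek() call) is enough to decide whether to continue the list, which is what makes the rewritten rule LL(1).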
-- Richard