What is going on in language part land

Sat Aug 5 13:39:32 UTC 2006

On Saturday 05 August 2006 07:01, Jakob Petsovits wrote:
> On Friday, 4. August 2006 19:00, Adam Treat wrote:
> > Hi All,
> >
> > I just wanted to lay out a summary of discussions that are going on with
> > the parsers.  Roberto was recently in the #kdevelop IRC channel and a
> > number of us have been talking via email about how to proceed with the
> > various parsers.
> >
> > The current situation:
> > 1. kdevelop-pg generated C# and Java parsers that do not have a
> > codemodel.
> > 2. Hand written C++ parser that does have a codemodel, but one
> > that is lacking.
> >
> > Because of the difficulty in coding a codemodel for every language parser
> > by hand, the thought was to see if kdevelop-pg could be amended to
> > generate them.
>
> For reference, I'm currently working on a generator (about halfway done)
> that produces codemodels like the current C++ one. I decided to get an
> exact replication of the C++ codemodel, because I've got no idea what the
> improved one should look like. But once it's there, we can easily change it
> to fit our new needs. (Like, referencing the AST instead of storing stuff
> by itself.)
>
> > Roberto has discussed this in a number of ways.  I think the summary is
> > that a codemodel is used in addition to the AST for three reasons:
> >
> > 1.  Performance and memory usage.  The AST can be resource hungry and
> > memory intensive.
> > 2.  The AST does not contain *scope* and *type* information.  The
> > codemodel does.
> > 3.  The codemodel's API makes more sense to developers and can be easier
> > to use and manipulate.
> >
> > I made a suggestion that perhaps storing the AST *wouldn't* be such a
> > huge burden in terms of memory.  If this is so, then perhaps it makes
> > sense to put aside #1 and see what we can do about #2 and #3.
> >
> > #2. is the real bear to me.  I don't know what would be involved with
> > modifying kdevelop-pg to include scope and type information.  I also
> > don't know how it would affect the DUChain that Hamish has been working
> > on.
>
> I guess it would be possible to modify kdevelop-pg for including scope and
> type information, but I would like a more detailed definition of what that
> information essentially is.
>
> For scopes, I could imagine that it should be possible to access the parent
> scope from any (deeply nested) AST member further below. Maybe with the
> scope AST items containing an additional compulsory "name" field and a list
> of child scopes. Would that be it?
>
> For type information, I have no idea what's needed in addition to what's
> already in the AST. Well, a toString() method maybe, and an equality
> operator. What would you define as "type information"?
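
To make the scope and type questions above concrete, here is a minimal C++
sketch. Everything in it is an assumption for illustration: ScopeNode,
TypeInfo and their members are hypothetical names, not anything kdevelop-pg
generates today. It shows a scope item with a compulsory name, a link to its
parent scope and a list of child scopes, plus a type descriptor offering only
toString() and an equality operator.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical scope item; not part of kdevelop-pg's generated AST.
struct ScopeNode {
    std::string name;                                  // compulsory scope name
    ScopeNode *parent = nullptr;                       // enclosing scope, null at the root
    std::vector<std::unique_ptr<ScopeNode>> children;  // nested scopes, owned here

    explicit ScopeNode(std::string n) : name(std::move(n)) {}

    // Create a nested scope and link it back to this one.
    ScopeNode *addChild(const std::string &childName) {
        children.push_back(std::make_unique<ScopeNode>(childName));
        children.back()->parent = this;
        return children.back().get();
    }

    // Walk the parent links to build a name such as "Outer::Inner".
    std::string qualifiedName() const {
        if (!parent)
            return name;
        return parent->qualifiedName() + "::" + name;
    }
};

// Hypothetical type descriptor with the two operations mentioned above.
struct TypeInfo {
    std::string baseName;          // e.g. "int"
    std::size_t pointerDepth = 0;  // number of '*' following the base name

    std::string toString() const {
        return baseName + std::string(pointerDepth, '*');
    }

    bool operator==(const TypeInfo &other) const {
        return baseName == other.baseName && pointerDepth == other.pointerDepth;
    }
};
```

Whether this is enough depends on what the codemodel's users actually ask
for; the sketch only covers parent access from a nested scope and basic type
comparison.
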
>
> > #3. is also a bit of a mystery.  Perhaps we can write some convenience
> > functions that would abstract the esoteric parts of the AST, but still
> > use the AST as the datastore, rather than copying that information into
> > another structure like we do with the codemodel.
>
> Agreed.
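
A rough C++ sketch of the convenience-function idea, under loud assumptions:
AstNode here is a made-up, simplified node type (real kdevelop-pg nodes look
different), and ClassView and identifierOf are hypothetical names. The point
is only that the wrapper borrows the AST and answers codemodel-style
questions from it, without copying the data into a second structure.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified AST node used only for this illustration.
struct AstNode {
    enum Kind { ClassDeclaration, MethodDeclaration, Identifier };
    Kind kind;
    std::string text;               // token text, used by Identifier nodes
    std::vector<AstNode> children;  // child nodes in source order
};

// Return the text of the first Identifier child, or an empty string.
inline std::string identifierOf(const AstNode &node) {
    for (const AstNode &child : node.children)
        if (child.kind == AstNode::Identifier)
            return child.text;
    return std::string();
}

// Read-only view over a class declaration node.
class ClassView {
public:
    explicit ClassView(const AstNode &node) : m_node(&node) {}

    std::string name() const { return identifierOf(*m_node); }

    // Collect the names of all direct method declarations.
    std::vector<std::string> methodNames() const {
        std::vector<std::string> names;
        for (const AstNode &child : m_node->children)
            if (child.kind == AstNode::MethodDeclaration)
                names.push_back(identifierOf(child));
        return names;
    }

private:
    const AstNode *m_node;  // borrowed; the AST remains the only data store
};
```

The view is only valid while the AST it points into is alive, which is the
price of not copying.
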
>
> > Anyway, if we _can_ solve these problems then I think we should.  Hand
> > coding a codemodel for each language part just increases the amount of
> > work for an already beleaguered group of maintainers.
>
> Even the current codemodel is a big pile of code monkey work.
> The way it looks now, it seems that one codemodel definition file (for my
> new codemodel generator) with a little more than 300 lines can nearly
> exactly generate the existing C++ codemodel with 3 files of 700, 900 and 80
> LOC (approximately). That seems like an improvement even if we didn't
> change all the codemodel stuff.
>
> > Another thing that I want to keep an eye on is Roberto's suggestion that
> > we should think about writing a C++ grammar file for kdevelop-pg.  If all
> > of the parsers, including C++, could be using the same generator, well
> > that'd be a real boon.
> >
> > However, we have no volunteers for this and it would likely be a
> > difficult task.  Roberto seems to think that kdevelop-pg is in a state
> > that could handle it though.  It is good to keep in mind.
>
> Hm, ...let's see:
> * We have a pre-processor and a lexer, neither of which needs to be
>   replaced
> * We have a parser that uses just the same paradigms and solutions
>   that kdevelop-pg also uses (er, ...why is that? ;)
> * The parser is complete, works, and just needs to be transcribed from
>   manually-written C++ to its kdevelop-pg representation.
>
> I mean, it can't be _that_ hard, right?
> Seems like it's important enough to try it out.
> (Should I do it soon? What about completing my SoC project first?)
>

I think you should complete SoC first, since that's on a deadline, and your 
successful completion of that project has more consequences involved in it. 
Most of them are political in nature, but the money you get out of completing 
the deal can't be too bad. :)

> The question is rather:
> "Do we want kdevelop-pg to produce camel-cased code
> instead of c_style_underlines?"
> Otherwise the parser will look a lot more ugly than before ;)
>

If we can provide the option, that would be best.
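
Such an option could be as small as a switch in the code that emits
identifiers. A minimal sketch, assuming a hypothetical NamingStyle setting
and formatIdentifier helper (neither exists in kdevelop-pg as far as I know):

```cpp
#include <cctype>
#include <string>

// Hypothetical generator option selecting the identifier style.
enum class NamingStyle { Underscores, CamelCase };

// Convert a c_style_underlines name to camelCase when asked to;
// otherwise return the name unchanged.
std::string formatIdentifier(const std::string &name, NamingStyle style) {
    if (style == NamingStyle::Underscores)
        return name;

    std::string result;
    bool upperNext = false;
    for (char c : name) {
        if (c == '_') {
            // Drop the underscore; capitalize the next letter unless the
            // underscore was leading, so the result starts in lower case.
            upperNext = !result.empty();
            continue;
        }
        if (upperNext) {
            result += static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
            upperNext = false;
        } else {
            result += c;
        }
    }
    return result;
}
```

With this, a rule named parse_compilation_unit would come out as
parseCompilationUnit in camel-case mode and stay as it is otherwise.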

Thanks
--
Matt