KDev codemodel/refactoring API

Fri Aug 4 15:07:29 UTC 2006

On Friday 04 August 2006 10:33 am, Roberto Raggi wrote:
> Hi!
>
> On Friday 04 August 2006 15:58, Adam Treat wrote:
> > Roberto, how much of an error would it be to just store the actual AST
> > with line information and type information rather than the codemodel?  I
> > know it would probably be a hit to memory and it is not as nice to
> > interact with, but it has two MARKED advantages that I can see:
> >
> > 1.  Our parsers already generate it and we don't need to worry about
> > integrating different languages AST.  They are generated from the parsers
> > with perhaps a slight modification to make the AST objects inherit
> > KDevCodeModelItem.  Problem solved.
>
> I tried, but it's too big :( We have to find a way to simplify it, or we
> have to find a better way to represent it (we can't keep it in memory).. I
> think eclipse uses a persistent AST (but I'm not 100% sure)
> http://dev.eclipse.org/viewcvs/index.cgi/org.eclipse.jdt.core/dom/org/eclipse/jdt/core/dom/
>
> > 2.  All information that is generated by the parser is kept around.  We
> > don't have any binding process except adding on type information.  We
> > don't have to worry about what the codemodel should look like.
>
> would be really nice.. we just have to find a nice and efficient way to
> store these informations.
>
> > The biggest problem is the hit to memory, but we already keep around the
> > AST for the current file for a limited time.  I wonder how much of a
> > memory difference it would make.
>
> *a lot* the AST is about 10x bigger than the original source code.

Heh, well, check this out:

I've been working on optimizing startup time in KDevelop 4 for large projects.

So, as a test I'm opening every single C++ file in trunk/KDE/kdevelop on 
project startup.  That is ~1230 files if you include all of CMakeLib.

That means the C++ parser is getting a good workout.  It generates a codemodel 
AND an AST for every translation unit.  All ~1230 files.

What's more, I'm storing them in memory!  Mostly as a fluke, but also to see 
how big of a memory hit it is.

Guess what... it takes between 1 and 2 minutes to do all this on my system.  
It also takes about 32% of my memory according to 'top'.  I only have 512MB.

I'd say that is pretty good since we're storing all the codemodels AND ASTs.
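
As a rough sanity check on those numbers: 32% of 512MB is about 164MB, spread
over ~1230 files.  The sketch below works that out per file.  It treats the
whole figure from 'top' as parser data, which overstates things since the rest
of the process is counted too, so read the result as an upper bound.  The
function name is made up for illustration.

```cpp
// Back-of-envelope upper bound on memory held per parsed file, in KiB.
// Inputs are the rough figures quoted in this message, not measurements
// of the codemodel or AST alone.
constexpr double perFileKiB(double totalMiB, double fractionUsed, int files)
{
    return totalMiB * fractionUsed * 1024.0 / files;
}

// 512MB of RAM, 32% in use, ~1230 files: roughly 136 KiB per file.
constexpr double kEstimateKiB = perFileKiB(512.0, 0.32, 1230);
```

So even with both the codemodel and the AST kept around, that is on the order
of a hundred-odd KB per translation unit at most.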

> > > For instance, we use something very similar to kdev-pg-replacement in
> > > Trolltech's qt3to4 (the Qt4 porting tool). Search for "TextReplacement"
> > > in http://websvn.kde.org/trunk/qt-copy/tools/porting/src/
> >
> > I wonder if your C++ parser in kdevelop has forked very badly from the
> > one you use for qt3to4, qt-jambi generator and the LSB standard
> > generator... ?
>
> qt3to4 is not mine :-) but it's true qt3to4 is using an old version of the
> kdevelop C++ parser. I did that version in a KDE meeting and then we
> decided to use it in qt3to4.. the one in KDevelop is a lot faster and nicer
> :-) so I always try to have the *best* version in KDevelop :-) in Jambi
> we're using the 1st version of the current KDevelop4 parser (the one we had
> in branches/work/kdevelop4-parser) and my STL-ish C++ preprocessor.. I'm
> not using the current version in kdevelop because i want to be sure I own
> the code (you know the copyright is pretty important :-).. I hope to change
> this in future, but that means I have to change the license of the C++
> engine to MIT, but I don't know how to do it. Because I can change the
> license _only_ for the code I own (== the initial version of the new
> kdevelop parser).. But I'm positive we have changes in the current
> repository that are not mine :-) so maybe i will release a _new_ special
> edition of my C++ parser under MIT, but this version will have _only_ my
> changes. but i don't know, it doesn't sound like a good idea. hints?

I'm willing to relicense my minuscule contributions under an MIT license.  The 
number of people who've contributed to kdevelop/languages/cpp/parser I can 
count on my fingers.  Would everyone else be willing to release their stuff 
under an MIT-like license?  If so, and if Trolltech is willing, we can change 
the license.

> > Either way, we need to figure out what we are going to do about the
> > codemodel's for Java and C# as well as what we're going to do with the
> > codemodel/binder in C++.
>
> I know.. I like Jakob's idea to generate the code model.. and you know he
> is really cool :-) so I'm sure we will have something pretty soon!!

I'd really like it if we could just use the AST and add type information plus 
line/col information as needed.  I'll do some more testing to verify all of 
this, but if we're only using that much memory for every file in kdevelop... 
I think it'd be acceptable.
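
To make the idea concrete, here is a minimal sketch of an AST node that also
acts as a code model item and carries line/col and type information.  All of
the names below are hypothetical stand-ins: the real KDevCodeModelItem
interface and the parser's AST classes are not shown in this thread, so this
only illustrates the shape of the approach.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the code model base class.
struct CodeModelItemSketch {
    virtual ~CodeModelItemSketch() = default;
    virtual std::string name() const = 0;
};

// Hypothetical AST node that doubles as a code model item.
struct AstNodeSketch : CodeModelItemSketch {
    std::string kind;    // e.g. "class" or "function"
    std::string ident;   // declared name
    int line;            // position information kept from the parser
    int column;
    std::string type;    // filled in by a later type pass; empty if unknown
    std::vector<std::unique_ptr<AstNodeSketch>> children;

    AstNodeSketch(std::string k, std::string id, int l, int c)
        : kind(std::move(k)), ident(std::move(id)), line(l), column(c) {}

    std::string name() const override { return ident; }
};

// Count a node and all of its descendants, e.g. to estimate tree size.
std::size_t countNodes(const AstNodeSketch& node)
{
    std::size_t total = 1;
    for (const auto& child : node.children)
        total += countNodes(*child);
    return total;
}
```

With something like this there is no separate binding step that builds a
second tree; the type pass would just fill in the `type` field on the nodes
the parser already produced.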

Adam