New parser branch (Was: Dumping the source DOM?)
Vladimir Prus
ghost at cs.msu.su
Wed Jul 13 15:03:06 UTC 2005
Roberto Raggi wrote:
> #include <my-cool-header.h>
>
> int main()
> {
> my_cool_function("ciao\n");
> }
>
> because the IDE doesn't know where the file my-cool-header.h is. KDevelop
> works just fine in this case. gccxml will fail. THIS IS NOT ACCEPTABLE!
Since any reasonable project manager will allow you to specify include
paths, what's the problem?
>
>
>> One possible approach I had in mind was to make parser restartable. First
>> you run g++ parser on the code till the first token it cannot parse. As you
>
> wrong!
>
> 1) gcc takes about the 50% of your CPU. it is a bit too much for a
> background parser(== you can't type in KDevelop and compile your project
> with the "real" gcc at the same time)
.....
> 3) import the source code of a project will take almost the same time to
> compile it. So you have to wait about 1 hour before load the KDevelop
> project, 2 hours for kdelibs/kdebase, and so on..
To begin with, your numbers are wrong. Here's the timing output from gcc on a
certain file:
 cfg construction    :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 cfg cleanup         :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.50 ( 1%) wall
 trivially dead code :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 life analysis       :   0.11 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 life info update    :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 preprocessing       :   0.71 ( 1%) usr   0.26 ( 6%) sys   1.00 ( 1%) wall
 lexical analysis    :   0.22 ( 0%) usr   0.44 (10%) sys   2.00 ( 2%) wall
 parser              :   3.83 ( 4%) usr   0.75 (18%) sys   5.00 ( 5%) wall
 name lookup         :   1.74 ( 2%) usr   2.17 (51%) sys   3.00 ( 3%) wall
 expand              :   0.52 ( 1%) usr   0.02 ( 0%) sys   1.00 ( 1%) wall
 varconst            :   0.17 ( 0%) usr   0.02 ( 0%) sys   0.50 ( 1%) wall
 integration         :   0.13 ( 0%) usr   0.01 ( 0%) sys   0.00 ( 0%) wall
 jump                :   0.07 ( 0%) usr   0.01 ( 0%) sys   0.00 ( 0%) wall
 flow analysis       :   0.02 ( 0%) usr   0.01 ( 0%) sys   0.00 ( 0%) wall
 mode switching      :   0.11 ( 0%) usr   0.01 ( 0%) sys   0.00 ( 0%) wall
 local alloc         :   0.22 ( 0%) usr   0.03 ( 1%) sys   0.50 ( 1%) wall
 global alloc        :   0.56 ( 1%) usr   0.01 ( 0%) sys   0.00 ( 0%) wall
 flow 2              :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 shorten branches    :   0.10 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 reg stack           :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 final               :   0.25 ( 0%) usr   0.04 ( 1%) sys   0.50 ( 1%) wall
 symout              :  77.09 (89%) usr   0.45 (11%) sys  77.00 (85%) wall
 rest of compilation :   0.32 ( 0%) usr   0.04 ( 1%) sys   0.00 ( 0%) wall
It spends 85% of all the time doing what? Basically, first outputting code
for all the template functions there are, and then outputting debug info for
the myriad of template classes from Boost. The actual parsing takes a mere 15%.
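To make the arithmetic checkable, here is a rough sketch that sums the nonzero wall-clock entries from the report above and computes the share of selected phases. The names Phase, report, total_wall and share_of are made up for illustration, and the total is simply the sum of the listed rows, which is an assumption since the report's own total line is not shown here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One row of the timing report: phase name and wall-clock seconds.
struct Phase {
    std::string name;
    double wall;
};

// Nonzero wall-clock entries copied from the report above (hypothetical helper).
std::vector<Phase> report() {
    return {
        {"cfg cleanup", 0.50},      {"preprocessing", 1.00},
        {"lexical analysis", 2.00}, {"parser", 5.00},
        {"name lookup", 3.00},      {"expand", 1.00},
        {"varconst", 0.50},         {"local alloc", 0.50},
        {"final", 0.50},            {"symout", 77.00},
    };
}

// Sum of the wall-clock column over the listed rows.
double total_wall(const std::vector<Phase>& phases) {
    double total = 0.0;
    for (const Phase& p : phases) total += p.wall;
    return total;
}

// Fraction of the summed wall time spent in the named phases.
double share_of(const std::vector<Phase>& phases,
                const std::vector<std::string>& names) {
    const double total = total_wall(phases);
    assert(total > 0.0);
    double picked = 0.0;
    for (const Phase& p : phases)
        for (const std::string& n : names)
            if (p.name == n) picked += p.wall;
    return picked / total;
}
```

With these rows, share_of(report(), {"symout"}) comes out at roughly 0.85, while preprocessing, lexical analysis, parser and name lookup together account for about 0.12 of the summed wall time, which is in the same ballpark as the rough 15% figure.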
> 2) it will not help the code completion.. in fact you will not be able to
> perform any code completion. Because, the code is *UNFINISHED* (please
> read it again IT IS UNFINISHED).. gcc will not produce any abstract syntax
> tree, you will not populate the code model, and you will not have the code
> completion
I'm afraid you are wrong again. The gcc parser is just a recursive descent
one, and each function returns a value of type 'tree' -- which is just your
AST. In general, it's not even possible to parse C++ without maintaining
correct symbol tables at *parse time*. Consider this:
    template<class T1>
    struct Outer {
        template<class T2> void foo();
        void bar();
    };

    int main()
    {
        Outer<int> v;
        v.foo<int>();
    }
It's only possible to parse the call to 'foo' correctly if you know the type
of 'v' and can look into 'v's scope to determine that 'foo' is a function
template, and not something else.
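As a small illustration of the "something else" case, here is a sketch where the same token sequence after the dot means two different things depending on what lookup finds for 'foo'. The types WithTemplate and WithMember and the two helper functions are made up for this example.

```cpp
#include <cassert>

// Here 'foo' is a member function template, so '<' opens a template argument list.
struct WithTemplate {
    template<int N> int foo(int x) { return x + N; }
};

// Here 'foo' is a plain data member, so '<' is the less-than operator.
struct WithMember {
    int foo = 5;
};

// Parsed as a call: foo<0> applied to the argument 1.
int call_template() {
    WithTemplate a;
    return a.foo<0>(1);
}

// Same tokens after the dot, but parsed as (b.foo < 0) > (1).
bool compare_member() {
    WithMember b;
    return b.foo<0>(1);
}
```

The parser can only pick between the two readings by looking 'foo' up in the class of the object expression while it is parsing, which is exactly why the symbol tables have to be correct at that point. A compiler may warn about the chained comparison in the second function; that is expected here.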
The fact that gccxml will not produce a parse tree unless you feed a
complete translation unit to it does not mean that the gcc parser does not
build an AST.
>> type more tokens you feed them to the parser. If you go to the beginning
>> of the file and start typing there, you rewind parser state and start
>> parsing again.
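For concreteness, the rewind idea quoted above could look roughly like the following. This is a toy sketch and not gcc code: the "parser" state is reduced to a brace nesting depth, and the names ParserState and RestartableParser are invented for illustration. The point is only the mechanism: keep a snapshot of the state before each token, and on an edit go back to the snapshot at the edit position and feed tokens again from there.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy parser state: current brace nesting depth and an error flag.
struct ParserState {
    int depth = 0;
    bool ok = true;
};

class RestartableParser {
public:
    // Consume one token, saving a snapshot of the state before it.
    void feed(const std::string& tok) {
        snapshots_.push_back(state_);
        tokens_.push_back(tok);
        step(tok);
    }

    // Restore the state as it was before token `index`.
    // Tokens from `index` onward are dropped and must be fed again.
    void rewind(std::size_t index) {
        assert(index <= tokens_.size());
        if (index == tokens_.size()) return;  // nothing to undo
        state_ = snapshots_[index];
        snapshots_.resize(index);
        tokens_.resize(index);
    }

    const ParserState& state() const { return state_; }
    std::size_t consumed() const { return tokens_.size(); }

private:
    // The only "grammar" in this toy: track matching braces.
    void step(const std::string& tok) {
        if (tok == "{") {
            ++state_.depth;
        } else if (tok == "}") {
            if (state_.depth == 0) state_.ok = false;
            else --state_.depth;
        }
    }

    ParserState state_;
    std::vector<ParserState> snapshots_;
    std::vector<std::string> tokens_;
};
```

A real parser state is of course far larger than one integer, so a snapshot per token is an assumption made to keep the sketch short; coarser checkpoints, say one per top-level declaration, would follow the same pattern.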
>
> I think I will stop here. This thread starts to be annoying. I'm sorry
> Vladimir, but I don't think you know what you're talking about. Anyway,
> good luck with your project. Maybe it is me that I don't see your point
> and maybe you're right and gccxml and parse *only* valid source code is
> the right solution for KDevelop.
I'll stop here too, and will hope we won't get yet another parser that can
parse only the easy 90% of C++.
- Volodya