Small (un-representative) benchmark on sqlite with blobs
Roberto Raggi
roberto.raggi at gmail.com
Mon Jun 25 08:49:37 UTC 2007
Hi,
On 25 Jun 2007, at 02:40, Andreas Pakulat wrote:
> Hi,
>
> David, Kris and myself had a (short) discussion about how to persist
> duchain data. Especially with multi-projects in mind we might get
> quite some data.
My problem with SQL is its textual representation. In general the
engines are pretty fast, but unfortunately you have to parse and
generate SQL statements for every single operation. Think about the
resulting SQL statement when you ask for all the symbols in the global
namespace ;-) That takes quite a bit of resources, especially in a
real-time environment. KDevelop is not a compiler; you don't need to
store a lot of information in its persistent storage. I think you can
use an approach similar to the per-project PCS file I did for
KDevelop 3. The project's PCS file is a dump of the Code Model;
KDevelop uses it to speed up project loading. My feeling is that you
just need a better file format for the PCS file. What do you need to
store in the PCS file? For sure:
* The macro definitions
* The type table
* The name table
* The file table
* The symbol table (variables, functions, classes, uses, ...)
* The scope chain
You can encode the type table as a vector of unique type ids. In
general the type table is very compact. You have a fixed set of
primitive types (fewer than 20), the function signatures (I think in Qt
we have about 2000 different signatures), the class definitions, and
then a sequence of arrays, references, pointers, and pointers to members.
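A minimal sketch of such a type table, assuming each type is a small record that refers to other types by id only. The names TypeKind, TypeEntry and TypeTable are made up for illustration (they are not KDevelop code), and function signatures with several parameters are left out to keep it short:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

// Hypothetical type kinds; a real table would have more.
enum class TypeKind : std::uint8_t { Primitive, Pointer, Reference, Array, Class };

using TypeId = std::uint32_t;
constexpr TypeId kNoType = 0xFFFFFFFFu;  // "no operand type"

// An entry refers to other types by id, never by pointer,
// so the vector of entries can be dumped to a file directly.
struct TypeEntry {
    TypeKind kind;
    std::uint32_t payload;  // primitive code, class name id, array size, ...
    TypeId element;         // pointee / referee / element type, or kNoType
};

class TypeTable {
public:
    // Returns the id of an equal existing entry, or appends a new one.
    TypeId intern(TypeKind kind, std::uint32_t payload, TypeId element = kNoType) {
        assert(element == kNoType || element < entries_.size());
        const Key key{kind, payload, element};
        const auto it = index_.find(key);
        if (it != index_.end()) return it->second;
        const TypeId id = static_cast<TypeId>(entries_.size());
        entries_.push_back(TypeEntry{kind, payload, element});
        index_.emplace(key, id);
        return id;
    }

    const TypeEntry& at(TypeId id) const { return entries_.at(id); }
    std::size_t size() const { return entries_.size(); }

private:
    using Key = std::tuple<TypeKind, std::uint32_t, TypeId>;
    std::vector<TypeEntry> entries_;  // the id of a type is its index here
    std::map<Key, TypeId> index_;     // only needed while building the table
};
```

Interning "int" and then "int*" twice yields two entries, not three, which is why the table stays small even for a large project.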
The symbol table is just a sequence of symbols, so you can store the
array of symbols. Same thing for the name and file tables.
The scope chain is more interesting. A scope in C++ has stack-like
access, and for each symbol you need to know the original scope of
the symbol and the shadowed symbols. You can pretty much encode the
scope with a pointer to the previous scope, an array of the symbols
introduced in the scope, and an array of buckets (linked lists of
symbols stored in reverse order). You find your bucket using the name
id's hash value. The first symbol in the bucket with name `id' is the
visible definition of `id'. The other symbols in the bucket with name
`id' and the symbols in the previous scope with name `id' are
shadowed symbols. Hmm, two arrays and a pointer to the previous
scope. You can store that ;-)
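The scope layout described above can be sketched roughly like this. It is only an illustration under some assumptions: the names Scope, Symbol and Found are hypothetical, the "hash" is simply the name id modulo the bucket count, and the bucket lists are threaded through the symbol array by index so that no pointers need to be stored:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using NameId = std::uint32_t;
constexpr std::uint32_t kNone = 0xFFFFFFFFu;  // end of a bucket list

struct Symbol {
    NameId name;
    std::uint32_t nextInBucket;  // next (older) symbol in the same bucket, or kNone
};

class Scope;
struct Found { const Scope* scope; std::uint32_t index; };

class Scope {
public:
    explicit Scope(const Scope* previous, std::size_t bucketCount = 16)
        : previous_(previous), buckets_(bucketCount, kNone) {
        assert(bucketCount > 0);
    }

    // A new symbol goes to the front of its bucket, so each bucket
    // lists its symbols in reverse declaration order.
    std::uint32_t declare(NameId name) {
        const std::size_t b = name % buckets_.size();
        const std::uint32_t index = static_cast<std::uint32_t>(symbols_.size());
        symbols_.push_back(Symbol{name, buckets_[b]});
        buckets_[b] = index;
        return index;
    }

    // Visible definition of `name` in this scope only, or kNone.
    std::uint32_t findLocal(NameId name) const {
        for (std::uint32_t i = buckets_[name % buckets_.size()]; i != kNone;
             i = symbols_[i].nextInBucket)
            if (symbols_[i].name == name) return i;
        return kNone;
    }

    // All definitions of `name` along the scope chain: the first element
    // is the visible one, the remaining elements are shadowed.
    std::vector<Found> lookupAll(NameId name) const {
        std::vector<Found> out;
        for (const Scope* s = this; s != nullptr; s = s->previous_)
            for (std::uint32_t i = s->buckets_[name % s->buckets_.size()];
                 i != kNone; i = s->symbols_[i].nextInBucket)
                if (s->symbols_[i].name == name) out.push_back(Found{s, i});
        return out;
    }

private:
    const Scope* previous_;               // enclosing scope, or nullptr
    std::vector<Symbol> symbols_;         // symbols introduced in this scope
    std::vector<std::uint32_t> buckets_;  // head index of each bucket list
};
```

Because a bucket is walked from its newest entry, the first match is always the most recent declaration, and everything found after it (in the same bucket or in an enclosing scope) is shadowed.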
Now, the whole thing here is in the order of thousands of elements.
You don't need an SQL engine for that ;-) The data structures used
are trivial (structs, arrays, and pointers); you just dump this stuff
to a file by replacing pointers with ids (like we do in kdev3's PCS
files) and you are in pretty good shape for KDevelop4 :-)
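To illustrate the "replace pointers with ids" step, here is a rough sketch. It does not reproduce the actual kdev3 PCS format; the record ScopeNode, the little-endian 32-bit encoding and the function names are assumptions made for the example:

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <memory>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical in-memory record: scopes linked by raw pointers.
struct ScopeNode {
    ScopeNode* previous = nullptr;
    std::uint32_t firstSymbol = 0;  // index into a symbol array
    std::uint32_t symbolCount = 0;
};
using ScopeNodes = std::vector<std::unique_ptr<ScopeNode>>;

constexpr std::uint32_t kNoId = 0xFFFFFFFFu;  // stands for a null pointer on disk

// Fixed little-endian encoding, so the file does not depend on the host.
inline void writeU32(std::ostream& out, std::uint32_t v) {
    const char b[4] = {char(v & 0xFF), char((v >> 8) & 0xFF),
                       char((v >> 16) & 0xFF), char((v >> 24) & 0xFF)};
    out.write(b, 4);
}

inline std::uint32_t readU32(std::istream& in) {
    unsigned char b[4] = {0, 0, 0, 0};
    if (!in.read(reinterpret_cast<char*>(b), 4))
        throw std::runtime_error("truncated data");
    return std::uint32_t(b[0]) | (std::uint32_t(b[1]) << 8) |
           (std::uint32_t(b[2]) << 16) | (std::uint32_t(b[3]) << 24);
}

// Writing: every pointer becomes the index of its target in `nodes`.
inline void dump(const ScopeNodes& nodes, std::ostream& out) {
    assert(nodes.size() < kNoId);
    std::unordered_map<const ScopeNode*, std::uint32_t> ids;
    for (std::uint32_t i = 0; i < nodes.size(); ++i) ids[nodes[i].get()] = i;
    writeU32(out, static_cast<std::uint32_t>(nodes.size()));
    for (const auto& n : nodes) {
        writeU32(out, n->previous ? ids.at(n->previous) : kNoId);
        writeU32(out, n->firstSymbol);
        writeU32(out, n->symbolCount);
    }
}

// Reading: create all nodes first, then turn the ids back into pointers.
inline ScopeNodes load(std::istream& in) {
    const std::uint32_t count = readU32(in);
    ScopeNodes nodes;
    std::vector<std::uint32_t> previousIds;
    for (std::uint32_t i = 0; i < count; ++i) {
        nodes.push_back(std::make_unique<ScopeNode>());
        previousIds.push_back(readU32(in));
        nodes.back()->firstSymbol = readU32(in);
        nodes.back()->symbolCount = readU32(in);
    }
    for (std::uint32_t i = 0; i < count; ++i) {
        if (previousIds[i] == kNoId) continue;
        if (previousIds[i] >= count) throw std::runtime_error("bad scope id");
        nodes[i]->previous = nodes[previousIds[i]].get();
    }
    return nodes;
}

// Convenience for trying it out: dump into an in-memory buffer.
inline std::string dumpToString(const ScopeNodes& nodes) {
    std::ostringstream out;
    dump(nodes, out);
    return out.str();
}
```

Loading is two passes on purpose: a scope may refer to one that appears later in the file, so the pointers are only fixed up once every node exists.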
One last thing. You want to be able to load and unload PCS files (and
maybe parts of PCS files) on demand, so you can write a little cache of
PCS files ;-) in case you want to load 4-5 projects.
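Such a cache could look roughly like the sketch below. It assumes a least-recently-used eviction policy, which is just one possible choice, and the names PcsData and PcsCache are hypothetical; the loader callback stands in for whatever actually reads a PCS file:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Placeholder for the loaded tables of one project.
struct PcsData {
    std::string path;
};

class PcsCache {
public:
    using Loader = std::function<std::shared_ptr<PcsData>(const std::string&)>;

    PcsCache(std::size_t capacity, Loader loader)
        : capacity_(capacity), loader_(std::move(loader)) {
        assert(capacity_ > 0);
    }

    // Returns the data for `path`, loading it on first use and unloading
    // the least recently used file when the cache is over capacity.
    std::shared_ptr<PcsData> get(const std::string& path) {
        const auto it = index_.find(path);
        if (it != index_.end()) {
            lru_.splice(lru_.begin(), lru_, it->second);  // mark as most recent
            return it->second->second;
        }
        std::shared_ptr<PcsData> data = loader_(path);
        lru_.emplace_front(path, data);
        index_[path] = lru_.begin();
        if (lru_.size() > capacity_) {
            index_.erase(lru_.back().first);
            lru_.pop_back();
        }
        return data;
    }

    bool isLoaded(const std::string& path) const { return index_.count(path) != 0; }

private:
    using Entry = std::pair<std::string, std::shared_ptr<PcsData>>;
    std::size_t capacity_;
    Loader loader_;
    std::list<Entry> lru_;  // front = most recently used
    std::unordered_map<std::string, std::list<Entry>::iterator> index_;
};
```

Handing out shared_ptr means a file that is evicted while some part of the IDE still uses it stays alive until that user is done with it.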
ciao robe