Dependencies of parse-jobs

Mon Feb 13 13:05:48 UTC 2012

On Monday 13 February 2012 09:54:02 David Nolden wrote:
> 2012/2/12 Sven Brauch <svenbrauch at googlemail.com>:
> > Hi there,
> > 
> > PovAddict and I were tracking a bug (in the python plugin) a few days
> > ago which turned out to be caused by the order in which documents are
> > parsed to be wrong (as in, not like I wanted it to be, heh).
> > Currently, I'm doing this: If document A is parsed and an "import"
> > (the same as #include in C++) is encountered, a parse job is created for
> > the imported document using
> >    DUChain::self()->updateContextForUrl(...)
> > with a priority that has a lower value than the current parse job (and
> > thus a higher priority (who invented this?)). Then the current
> 
> This one was my fault, I followed the linux priority logic, which
> later showed to suck hard. :-)
> 
> > document is marked as "needs update", and after the parse job
> > finished, it's registered for a re-parse again (that's done inside
> > parsejob::run) with a lower priority than it had before (I'll use the
> > word "priority" as everybody expects it to be used from now on):
> >  
> >  KDevelop::ICore::self()->languageController()->backgroundParser()->addDocument(...)
> > Now, that roughly works, but not quite; sometimes, the
> > documents are not parsed exactly in the order their priorities suggest.
> > I guess that is to be expected -- but what else is there except
> > priorities that can be used to "sort" parse jobs?
> > 
> > Here's the TL;DR version: What's the "good" way of forcing document A
> > to be re-parsed as soon as there's a top-context for B available which
> > satisfies the required $minimum_features? How does C++ do this? I
> > looked into the code, but couldn't quite find anything related to that
> > (it's a huge project...).
> > 
> > Any help would be appreciated. :)
> > 
> > Greetings,
> > Sven
> 
> In C++, included documents may depend on other documents included
> earlier, and thus they must be parsed right in-place.
> 
> You could do the same, by simply creating a new parse-job in-place and
> starting it recursively. You just need to care about preventing
> infinite recursion and multiple running parse-jobs for the same
> document. The question is what would be the correct way to deal with
> such recursion.
> 
> In a language where such recursion is allowed, the correct solution
> would probably be a multi-pass parsing: When you encounter an include,
> parse it in a "simplified" mode (just creating declarations for the
> symbol-table, but without building uses and/or types etc.), and later
> re-parse it in full mode after the "simplified" mode has been finished
> for all includes.

I've done this for PHP, and it's simply a workaround with its own set of bugs. 
The way Sven does it (or tries to do it) is the only right way. I.e.: Don't 
even try the "simplified" mode, imo. For PHP I want to get rid of the 
"simplified" mode and use the "proper" multipass as well.

So, if the priorities alone won't help, then I assume it's a kind of race 
condition. See, as long as you have one parse thread, the priority queue 
should be all that is required to keep things in order. Now with more threads, 
the dependency could be started in Thread A and the re-parse in Thread B, 
which then still fails to find a context for the dependency (it's still being 
parsed in Thread A)...
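
To illustrate the single-thread case, here is a minimal standalone sketch in 
plain C++ (not KDevelop code). Job, LowerValueFirst, JobQueue and 
drainSingleThreaded are hypothetical names made up for this example; it 
assumes the convention from this thread that a lower value means a higher 
priority. With a single consumer the jobs come out strictly in priority 
order, so the dependency is always done before the re-parse starts:

```cpp
#include <queue>
#include <string>
#include <vector>

// Hypothetical stand-in for a parse job: a priority value plus a document URL.
struct Job {
    int priority;      // lower value = parsed earlier (convention from this thread)
    std::string url;
};

// Comparator that makes std::priority_queue return the lowest value first.
struct LowerValueFirst {
    bool operator()(const Job& a, const Job& b) const {
        return a.priority > b.priority;
    }
};

using JobQueue = std::priority_queue<Job, std::vector<Job>, LowerValueFirst>;

// Simulates one parse thread: jobs are taken one at a time, each one
// finishing before the next is started.
std::vector<std::string> drainSingleThreaded(JobQueue queue) {
    std::vector<std::string> order;
    while (!queue.empty()) {
        order.push_back(queue.top().url);
        queue.pop();
    }
    return order;
}
```

With two or more consumers popping from the same queue, the second job can 
start while the first is still running, which is exactly the situation 
described above.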

I'm unsure how to handle this properly. A quick'n'dirty version would be to 
figure out "is the dependency still parsing?" and, if so, wait on the 
UrlParseLock until the dependency has finished...

A cleaner solution would be to use the addDocument() callback function. It is 
also faster, since fewer threads need to wait, and less error-prone: the 
above can easily deadlock on recursive includes ;-) I.e. write some kind of 
class that maps file -> dependencies and adds the latter to the background 
parser. It then also listens to the callback and creates a new job once all 
dependencies have finished parsing...
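
Roughly, such a class could look like the sketch below. This is plain 
standard C++ and not the KDevelop API: DependencyTracker, its addDocument() 
and parseFinished() methods and the schedule callback are all hypothetical 
names, with schedule standing in for handing a document to the background 
parser. It is not thread-safe as written (a real version would need a mutex 
around its state), and cyclic imports would need extra handling, since two 
documents waiting on each other would never become ready.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: remembers which dependencies a document is still
// waiting for and schedules the document once all of them have finished.
class DependencyTracker {
public:
    using Schedule = std::function<void(const std::string&)>;

    // `schedule` is an assumed hook that queues a document for parsing.
    explicit DependencyTracker(Schedule schedule)
        : m_schedule(std::move(schedule)) {
        assert(m_schedule && "a schedule callback is required");
    }

    // Register `doc` as waiting for `deps`; each dependency is scheduled
    // only once, even if several documents wait for it.
    void addDocument(const std::string& doc, const std::set<std::string>& deps) {
        if (deps.empty()) {
            m_schedule(doc);
            return;
        }
        m_pending[doc] = deps;
        for (const auto& dep : deps) {
            if (m_scheduled.insert(dep).second) {
                m_schedule(dep);
            }
        }
    }

    // Meant to be called from the "parse finished" callback of a dependency.
    void parseFinished(const std::string& finished) {
        m_scheduled.erase(finished);
        std::vector<std::string> ready;
        for (auto it = m_pending.begin(); it != m_pending.end();) {
            it->second.erase(finished);
            if (it->second.empty()) {
                ready.push_back(it->first);
                it = m_pending.erase(it);
            } else {
                ++it;
            }
        }
        // Schedule after the loop so the map is not touched while iterating.
        for (const auto& doc : ready) {
            m_schedule(doc);
        }
    }

private:
    Schedule m_schedule;
    std::map<std::string, std::set<std::string>> m_pending;  // doc -> open deps
    std::set<std::string> m_scheduled;                       // deps in flight
};
```

The point of this design is that no parse thread ever blocks: the re-parse 
of a document is only created after the last of its dependencies has 
reported back.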

What do you think?

Bye

-- 
Milian Wolff
mail at milianw.de
http://milianw.de