KDevelop 5 too slow?

Mon May 29 10:43:42 BST 2017

On Monday, May 29, 2017 11:34:54 AM CEST René J.V. Bertin wrote:
> On Monday May 29 2017 10:52:00 Milian Wolff wrote:
> 
> Hi,
> 
> >> I'm not very familiar reading this kind of call graph, but if you follow
> >> the graph down clang_getFileLocation shows up which accounts for almost
> >> 24% of the processing time on my system. That could correspond to disk
> >> access, no?
> >
> >No, CPU profilers do not account for disk access. I'm not familiar with
> 
> My bad, I didn't mean to claim that it does any analysis of disk access
> itself. As far as I understand this sort of thing that would only exclude
> the time the CPU spends waiting for I/O operations. Would a CPU profiler
> exclude the CPU cycles spent on I/O just because it's I/O?

No. The kernel switches out a process that is waiting on something, i.e.
sleeping or doing I/O. A switched-out process does not spend any CPU
cycles; those are attributed to whatever other process gets switched in
(which may also be a pseudo-"idle" process).
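
To make that concrete, here is a minimal Python sketch (my own illustration,
not taken from any profiler). The helper names are made up, and time.sleep()
merely stands in for a blocking I/O wait. It compares wall-clock time with
the CPU time charged to the process:

```python
import time

def measure(work):
    """Return (wall_seconds, cpu_seconds) spent running work()."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()  # CPU time charged to this process
    work()
    return (time.perf_counter() - wall_start,
            time.process_time() - cpu_start)

def sleep_briefly():
    # Stand-in for waiting on I/O: the kernel switches the process out.
    time.sleep(0.2)

def spin_briefly():
    # Busy loop: the process stays on the CPU for the whole interval.
    deadline = time.perf_counter() + 0.2
    while time.perf_counter() < deadline:
        pass

if __name__ == "__main__":
    for name, work in (("sleep", sleep_briefly), ("spin", spin_briefly)):
        wall, cpu = measure(work)
        print(f"{name}: wall={wall:.3f}s cpu={cpu:.3f}s")
```

The sleeping case should show roughly 0.2s of wall time and next to no CPU
time, which is why a CPU profiler never sees it.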

> IOW,
> clang_getFileLocation() is doing something that takes 24% of all spent CPU
> cycles, and given the name of the function that is related to I/O in some
> way.

What, no?! This is a purely CPU-bound mapping function. Given a file offset, 
it builds a line/column cursor.
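
As a rough sketch of what such a mapping involves (this is NOT clang's
implementation, just an illustration with hypothetical helper names), one can
precompute the offset at which every line starts and then binary-search it:

```python
import bisect

def build_line_starts(text):
    """Offsets at which each line begins; line 1 starts at offset 0."""
    starts = [0]
    for i, ch in enumerate(text):
        if ch == "\n":
            starts.append(i + 1)
    return starts

def offset_to_line_column(line_starts, offset):
    """Map a 0-based offset to a 1-based (line, column) pair."""
    line_index = bisect.bisect_right(line_starts, offset) - 1
    return line_index + 1, offset - line_starts[line_index] + 1

if __name__ == "__main__":
    source = "int main()\n{\n    return 0;\n}\n"
    starts = build_line_starts(source)
    # "return" begins at offset 17, i.e. line 3, column 5.
    print(offset_to_line_column(starts, source.index("return")))  # (3, 5)
```

Nothing in there touches the disk; it is all lookups in memory.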

> >Perf or VTune are just as capable.
> 
> But most Linux systems don't come with the required kexts installed, do
> they?

What are you talking about?!

> >Also, to make sure: Did you compile KDev*
> >and everything else you are profiling in RelWithDebInfo mode? If not, then
> >this profile output is completely useless.
> 
> I build with "-O3 -g" if that's what you're asking, everything KF5 and Qt5
> itself. Specifically, I use Debuntu's approach of building with a custom,
> non-predefined BUILD_TYPE and then set the desired compiler options in
> CFLAGS and CXXFLAGS.

Sounds good.

> >There's N + 1. N for background parsing and one for parsing the active
> >document, if needed.
> 
> So I ought to do this kind of profiling with only a single document open?
> That should actually be more representative of what happens when you make
> changes to the current document.

I don't know what exactly you are trying to profile. If you want to figure out 
why the main thread gets blocked every now and then, you'll need a profiler 
that gives you per-thread granularity and then look at what the main thread is 
doing. This means doing both on-CPU and off-CPU profiling. With VTune that is 
trivial and should show you what's going on. I bet Instruments has similar 
capabilities. Learn the tool and report back.
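
For what it's worth, the distinction can be sketched in a few lines of Python.
This is a toy under stated assumptions, not how VTune or Instruments work: the
"parser" and "ui" threads and the shared lock are hypothetical stand-ins, and
time.thread_time() needs a platform that supports per-thread CPU clocks.

```python
import threading
import time

def run_and_record(name, work, results):
    """Run work() and record this thread's wall time and CPU time."""
    wall_start = time.perf_counter()
    cpu_start = time.thread_time()  # CPU time of the calling thread only
    work()
    results[name] = (time.perf_counter() - wall_start,
                     time.thread_time() - cpu_start)

def profile_threads():
    lock = threading.Lock()
    results = {}

    def parser():
        # Hypothetical background job: holds the lock while burning CPU.
        with lock:
            deadline = time.perf_counter() + 0.3
            while time.perf_counter() < deadline:
                pass

    def ui():
        # Hypothetical "main thread": needs the lock, so it mostly waits.
        time.sleep(0.05)  # let the parser grab the lock first
        with lock:
            pass

    threads = [threading.Thread(target=run_and_record, args=(n, w, results))
               for n, w in (("parser", parser), ("ui", ui))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

if __name__ == "__main__":
    for name, (wall, cpu) in profile_threads().items():
        print(f"{name}: wall={wall:.3f}s on-CPU={cpu:.3f}s "
              f"off-CPU={wall - cpu:.3f}s")
```

The "ui" thread is blocked for most of its lifetime yet uses almost no CPU, so
only the off-CPU column reveals that it was stuck and for how long.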

> FWIW, it's still a bit confusing. The current document is also parsed in the
> background, so what the option in the settings controls is actually the
> number of threads to use for parsing "background documents"?

The N in the N + 1 above.

> >> if KDevelop is slower running from one than as/from a regular install
> >> the 1st explanation one thinks of is "something related to the bundling".
> >
> >Before making such a claim, back it up with hard numbers. Profiling and
> 
> Eh? I didn't claim anything that needs backing up with hard numbers, unless
> you mean the number of people who actually decide to check if that 1st
> explanation is the right one? :)

I mean it's a waste of time to muse about what could possibly happen. Measure
it and see what's happening instead.

> > Can you use a different Instruments profiler configuration to find wait
> > time in the code? That would allow us to get a more accurate image of
> > what's going on. Also, make sure to profile the same thing.
> 
> There's a whole bunch of preconfigured "Instruments", I'll have to see if I
> can figure out which would be the appropriate one for wait time.
> 
> > I.e. I suggest you clear the duchain cache before starting KDevelop with
> > the profiler.
> 
> I can't clear the cache while KDevelop is running, can I? That means I'd be
> measuring KDevelop's start-up too, meaning loads more data or less
> fine-grained sampling. I'm also not convinced that measuring the cost of
> rebuilding the entire cache is what we're interested in here; that
> shouldn't be what causes the reported slowness, right? As to measuring the
> same thing: I make sure to force the parsing step 2 or 3 times before
> sampling the operation. I don't expect it makes a difference if I do that
> 2x or 3x but it'd probably be better indeed to standardise.

Just make sure you are measuring something reliably and reproducibly. The
profile you showed simply tells us that most of the time is spent parsing the
file, which isn't surprising if you parse a file...
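
A minimal sketch of what I mean by reproducible, in Python: fix the number of
warm-up runs, repeat the timed run several times, and look at the spread. The
fake_parse() function is a hypothetical stand-in for "force a reparse of the
document", and the default counts are arbitrary assumptions.

```python
import statistics
import time

def benchmark(operation, warmups=2, repeats=5):
    """Run operation() a fixed number of times; return the timed samples."""
    for _ in range(warmups):
        operation()  # untimed, so caches are in the same state every run
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        operation()
        samples.append(time.perf_counter() - start)
    return samples

def fake_parse():
    # Hypothetical stand-in for the operation being profiled.
    sum(i * i for i in range(50_000))

if __name__ == "__main__":
    samples = benchmark(fake_parse)
    print(f"median={statistics.median(samples):.4f}s "
          f"spread={max(samples) - min(samples):.4f}s")
```

If the spread is large compared to the median, the numbers are not yet worth
comparing between setups.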

Bye

-- 
Milian Wolff
mail at milianw.de
http://milianw.de