topcontexts

Milian Wolff mail at milianw.de
Mon Apr 9 15:48:46 UTC 2018


On Monday, April 9, 2018 3:54:50 PM CEST René J.V. Bertin wrote:
> On Monday April 09 2018 13:23:12 Milian Wolff wrote:
> >Without a profile, this is very far fetched to me. mmapping a small file is
> >super fast,
> 
> I'd hope so, but it cannot be faster than opening the file and reading
> its contents. That's my point: there are filesystems where file
> creation/opening performance drops drastically when the number of files in
> a directory grows. FAT (FAT16?) was an infamous example where you could
> easily wait 20-30s to open or create a file above a couple hundred files
> (don't ask me how I know this...). Who cares about FAT you might say, but
> know that HFS is basically just about as old underneath all its modern
> features and facelifts - and it'll be unavoidable on Mac for at least a few
> more years (and will remain so on HDDs as long as the new APFS is
> SSD-only).

I quite frankly don't care. Sounds like HFS is a broken file system then.

> >if you compress the dir and do other stuff to it, then you are on
> >your own and have to live with the caveats of these approaches.
> 
> I'm not doing anything to the topcontexts directories, other than throwing
> them away regularly. I do that before applying HFS compression to the rest
> of the cache, because with those files present I get at most 20% CPU usage
> in my multi-threaded compressor utility.

Then don't compress them. You wouldn't compress a folder with compressed 
binary data either, like images, videos, music, ... or do you?

> >> the only way I can think of not to use those topcontexts files is to
> >> empty
> >
> >and write-protect the topcontexts directories.
> >
> >Which would of course completely break KDevelop.
> 
> Actually, no. Not at all in practice (or else only in features I never use). The
> code seems designed to handle write failures with no more than a warning (1
> per context index thingy), and from what I understand the writing to disk
> is only necessary to persist the information. I can't even say that a
> subsequent reparse is unacceptably slower because of information that has
> to be regenerated instead of simply read from disk. (In fact, I've already
> considered implementing a "trash cache on exit" option which would probably
> be a good enough solution to all my gripes with the duchain cache.)
> 
> I searched the tests, didn't find one to check/benchmark the topcontexts
> feature, but did notice there's a global switch that turns off persisting
> the data to disk.

Well, you use KDevelop in a completely unsupported manner: You don't want to 
cache anything. And you seem to suffer from HFS issues with compression, which 
none of us use.

> >support. No, for performance-related reasons (the current approach, while
> >old, is *exactly* what the DUChain code built around it needs - no wonder
> >it beats the other generic solutions).
> 
> I wouldn't argue that if we were talking about a handful of items/files. But
> we're dealing with at least "thousands" of files here (e.g. 17 MB of files
> that each range from a few KB to about 30 KB)

Yes, one per encountered file.

> so while there's none of the overhead
> that comes with those "other generic solutions" there might be performance
> penalties that come from the filesystem or even the use of so many mmapped
> files. I'd want to see a quantitative comparison, preferably from code I
> can run myself.
> 
> I'm making progress with the LMDB backend I mentioned earlier. I'll put up a
> WIP diff on phab when I think it works.

LMDB is afaik unsafe to use on NFS. Was this fixed? And does it work on 
Windows nowadays?
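
For the quantitative comparison asked for above, here is a minimal sketch of 
a micro-benchmark anyone can run on their own filesystem. It is not KDevelop 
code and does not reproduce the DUChain access pattern. The file count, the 
file size and all function names are assumptions made for illustration. The 
files are read right after being written, so this measures the warm-cache 
case only.

```python
import mmap
import os
import tempfile
import time


def make_files(directory, count, size):
    """Create `count` files of `size` bytes each and return their paths."""
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"{i}.bin")
        with open(path, "wb") as f:
            f.write(os.urandom(size))
        paths.append(path)
    return paths


def read_plain(paths):
    """Open each file, read it fully, and return the total bytes read."""
    total = 0
    for path in paths:
        with open(path, "rb") as f:
            total += len(f.read())
    return total


def read_mmap(paths):
    """Open and mmap each file, copy its contents, return the total bytes."""
    total = 0
    for path in paths:
        with open(path, "rb") as f:
            # Length 0 maps the whole file; the file must not be empty.
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
                total += len(m[:])
    return total


def timed(fn, paths):
    """Run fn(paths) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(paths)
    return result, time.perf_counter() - start


def run(count=2000, size=8 * 1024):
    """Benchmark both strategies on `count` files in a single directory."""
    with tempfile.TemporaryDirectory() as directory:
        paths = make_files(directory, count, size)
        plain_total, plain_s = timed(read_plain, paths)
        mmap_total, mmap_s = timed(read_mmap, paths)
    return plain_total, plain_s, mmap_total, mmap_s


if __name__ == "__main__":
    plain_total, plain_s, mmap_total, mmap_s = run()
    print(f"open+read: {plain_total} bytes in {plain_s:.4f}s")
    print(f"mmap:      {mmap_total} bytes in {mmap_s:.4f}s")
```

Setting the TMPDIR environment variable to a directory on the volume in 
question (HFS, NFS, ...) makes the temporary directory land there. Raising 
`count` shows whether per-file cost grows with the number of files in one 
directory.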

-- 
Milian Wolff
mail at milianw.de
http://milianw.de




More information about the KDevelop-devel mailing list