How and what does (plasma theme) caching operate? / KSharedDataCache issue?
    Thomas Lübking 
    thomas.luebking at gmail.com
       
    Mon Sep 17 22:22:22 BST 2012
    
    
  
On 17.09.2012 at 01:41, Michael Pyne <mpyne at kde.org> wrote:
> 1) KSharedDataCache is supposed to be a convenient way to put data into
> shared memory. No disk access is actually ever made beyond using
> posix_fallocate (if available)
That actually seems to be it - I still had /var on reiserfs (which was the
suggested FS for many small files back then), but according to Google
posix_fallocate() is more or less a no-op only on Ext4, XFS and Btrfs.
I moved the cache to an Ext4 partition and the long delay and heavy I/O
are gone (especially for follow-up PIDs, and even after removing the
file) - it behaves much more as expected now.
(I can also rule out fragmentation of the old /var partition.)
Whether there's overhead compared to loading a compressed file into RAM
I'll now have to measure properly.
The measure I used yesterday was "how many breaths do I take until the
frame is initially generated".
Has anybody ever measured posix_fallocate overhead on the likely more  
frequently used ext3?
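One rough way to answer that is to time the call directly on the
filesystem in question. The following is only a minimal sketch: the
function name time_fallocate and the probe file are made up for
illustration, and it ignores any writeback the kernel defers until after
the call returns. According to the posix_fallocate(3) man page, glibc
emulates the call by writing into every block when the filesystem has no
native support, which would match the heavy I/O seen on reiserfs.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Creates `path`, preallocates `size` bytes with posix_fallocate() and
 * returns the elapsed wall-clock seconds, or -1.0 on error.
 * The probe file is removed again afterwards. */
double time_fallocate(const char *path, off_t size)
{
    assert(size > 0);

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1.0;

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    int err = posix_fallocate(fd, 0, size); /* returns an errno value, not -1 */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    close(fd);
    unlink(path);

    if (err != 0)
        return -1.0;
    return (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Called with a path on the partition under test and the size of the cache
file, this gives a number that can be compared across ext3, Ext4 and
reiserfs.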
I'll make a note to revert the #245173 patch and move /var back to
reiserfs, to make sure this is the only issue with that FS type. Whatever
it turns out to be, it means reiserfs is not a good candidate for .kcache,
so either /var now finally moves somewhere else or .kcache has to.
  ========== Semi OT talk ======================
> 2) One of the intended use cases are for caching values that take a
I do not question the reason for a binary cache.
> most papers I've read (e.g. Maged Michael)
slfdma?
> Er, even *more* crashes than the current code.
All bleachbit's fault. Period. ;-)
> Using quadratic chaining to verify that an entry is actually not present  
> can require up to 7 (IIRC) lookups into the index entry table, and
But that would happen in RAM or even in the CPU cache - I was talking
about things MUCH slower than that.
> collisions in the key hash must be double-checked by looking at the
> actual data page itself (where the full name is recorded).
Unless an extremely stupid hash algorithm like
"int hash(const char *c) { return 4; }" is in use, I doubt this is a
problem in this context. A badly chosen key pair would of course slow
things down repeatedly, so if there is no such warning yet, there should
probably be one whenever there's a collision: "please use key2_1 instead
of key2 because the key2 hash collides with the key1 hash".
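To make the cost being discussed concrete, here is a minimal sketch of an
index with a bounded probe sequence where a matching hash still has to be
confirmed against the full key in the data page. It is not the
KSharedDataCache code: the struct layout, the table size, the triangular
probe step, the FNV-1a stand-in hash and all function names are
assumptions for illustration. Only the limit of 7 probes is taken from
the quoted mail, and that one is "IIRC" as well.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TABLE_SIZE 64  /* hypothetical index size */
#define MAX_PROBES 7   /* "up to 7 (IIRC)" in the quoted mail */
#define KEY_MAX    32

typedef uint32_t (*hash_fn)(const char *);

struct index_entry { int used; uint32_t hash; int page; };
struct page        { char key[KEY_MAX]; int value; };

struct cache {
    struct index_entry index[TABLE_SIZE];
    struct page        pages[TABLE_SIZE];
    int                next_page;
    hash_fn            hash;
    unsigned           page_reads; /* lookups that had to touch a data page */
};

/* FNV-1a, 32 bit: a reasonable stand-in hash. */
uint32_t fnv1a(const char *s)
{
    uint32_t h = 2166136261u;
    for (; *s; s++) { h ^= (unsigned char)*s; h *= 16777619u; }
    return h;
}

/* The "extremely stupid" hash from the mail: every key collides. */
uint32_t stupid_hash(const char *s) { (void)s; return 4; }

void cache_init(struct cache *c, hash_fn hash)
{
    memset(c, 0, sizeof *c);
    c->hash = hash;
}

/* i-th probe position: home slot plus the i-th triangular number. */
static unsigned probe_slot(uint32_t h, unsigned i)
{
    return (h + (i + i * i) / 2) % TABLE_SIZE;
}

/* Returns 1 and stores the value if `key` is cached, 0 otherwise.
 * Proving absence always walks all MAX_PROBES index slots. */
int cache_find(struct cache *c, const char *key, int *value)
{
    uint32_t h = c->hash(key);
    for (unsigned i = 0; i < MAX_PROBES; i++) {
        const struct index_entry *e = &c->index[probe_slot(h, i)];
        if (!e->used || e->hash != h)
            continue;            /* settled from the index alone */
        c->page_reads++;         /* same hash: check the full key */
        if (strcmp(c->pages[e->page].key, key) == 0) {
            *value = c->pages[e->page].value;
            return 1;
        }
    }
    return 0;
}

/* Inserts a key that is not cached yet. Returns 0 on success, -1 if the
 * key is too long or every probe position is already taken. */
int cache_insert(struct cache *c, const char *key, int value)
{
    assert(c->hash != NULL);
    if (strlen(key) >= KEY_MAX)
        return -1;
    uint32_t h = c->hash(key);
    for (unsigned i = 0; i < MAX_PROBES; i++) {
        struct index_entry *e = &c->index[probe_slot(h, i)];
        if (e->used)
            continue;
        e->used = 1;
        e->hash = h;
        e->page = c->next_page++;
        strcpy(c->pages[e->page].key, key);
        c->pages[e->page].value = value;
        return 0;
    }
    return -1;
}
```

With stupid_hash, finding the second of two inserted keys costs two page
reads and the eighth insert fails outright. With fnv1a, keys that merely
share a slot are told apart by the stored hash, so the page is only read
for the entry that actually matches.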
> Thiago Macieira (in the context of the 0-copy patches) has recommended  
> perhaps using a data structure where entries can be added but not  
> individually removed (and therefore don't require defragmenting and can  
> be packed as tightly
As long as the file does not need to be read entirely, on-disk space usage
is hardly an issue at that scale (apart from raising the risk of on-disk
fragmentation, of course).
A tight "grows only" structure might however be interesting when storage
size is limited and especially when the cache is updated rarely or never
(ie. there are mostly real appends).
One could then simply follow the FAT approach (violating a dozen patents,
for sure): just point the index at the appended data, keep track of the
junk size, and whenever that gets too large or the entire cache approaches
its limit, actively defrag it (ie. rebuild it, dropping the junk - the
less charming part would be to detach the clients iff one wanted to reuse
the inodes to prevent disk fragmentation).
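The append-and-rebuild idea can be sketched in a few lines. This is an
in-memory toy under stated assumptions, not a proposal for the on-disk
format: the sizes, the struct layout, the "junk exceeds half the
capacity" threshold and all names are hypothetical, and detaching clients
is not modelled at all.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LOG_CAP  256  /* hypothetical size limit of the cache, in bytes */
#define MAX_KEYS 16
#define KEY_MAX  16

struct slot { int used; char key[KEY_MAX]; size_t off, len; };

struct log_cache {
    unsigned char data[LOG_CAP]; /* records are only ever appended here */
    size_t end;                  /* first unused byte */
    size_t junk;                 /* bytes owned by superseded records */
    struct slot index[MAX_KEYS];
};

static struct slot *find_slot(struct log_cache *c, const char *key)
{
    for (int i = 0; i < MAX_KEYS; i++)
        if (c->index[i].used && strcmp(c->index[i].key, key) == 0)
            return &c->index[i];
    return NULL;
}

/* "Defrag": rebuild the data area from the live records only. */
static void log_compact(struct log_cache *c)
{
    unsigned char tmp[LOG_CAP];
    size_t end = 0;
    for (int i = 0; i < MAX_KEYS; i++) {
        struct slot *s = &c->index[i];
        if (!s->used)
            continue;
        memcpy(tmp + end, c->data + s->off, s->len);
        s->off = end;
        end += s->len;
    }
    memcpy(c->data, tmp, end);
    c->end = end;
    c->junk = 0;
}

/* Appends a record; an update only redirects the index and grows `junk`.
 * Returns 0 on success, -1 if the data cannot fit or the index is full. */
int log_put(struct log_cache *c, const char *key, const void *buf, size_t len)
{
    assert(c->junk <= c->end && c->end <= LOG_CAP);
    if (strlen(key) >= KEY_MAX)
        return -1;

    struct slot *s = find_slot(c, key);
    size_t live = c->end - c->junk - (s ? s->len : 0);
    if (len > LOG_CAP - live)
        return -1;               /* would not fit even after a rebuild */

    if (!s) {                    /* new key: claim a free index slot */
        for (int i = 0; i < MAX_KEYS && !s; i++)
            if (!c->index[i].used)
                s = &c->index[i];
        if (!s)
            return -1;
        s->used = 1;
        strcpy(s->key, key);
        s->len = 0;
    }
    c->junk += s->len;           /* the old payload becomes junk */
    s->len = 0;

    if (c->junk > LOG_CAP / 2 || c->end + len > LOG_CAP)
        log_compact(c);

    memcpy(c->data + c->end, buf, len);
    s->off = c->end;
    s->len = len;
    c->end += len;
    return 0;
}

/* Returns the current payload for `key` and its length, or NULL. */
const void *log_get(struct log_cache *c, const char *key, size_t *len)
{
    struct slot *s = find_slot(c, key);
    if (!s)
        return NULL;
    *len = s->len;
    return c->data + s->off;
}
```

The structure must start out zeroed. Updates stay cheap because they
never move existing data; the price is the occasional full rebuild, which
in a shared file is exactly where the clients would have to be detached.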
Regarding the cache file in question:
Up to 3:C430 there seems to be an index, followed by an "FF" barrier until
5:0050, followed by sparse data until 14:50E0, and then the file is "00"
all the way up to 505:0050 (which is likely an outcome of the fallocate).
Ie. there is really MUCH dead space, probably because the cache size is
set to the theme's maximum demand. That's why I suggested splitting it up
(if it were necessary to reduce the file size for performance reasons), so
that lots of unused zero space would not bloat the cache for no reason.
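The same observation can be made without a hex editor by looking for the
last non-zero byte of the file. A minimal sketch follows; the function
name used_prefix is made up, and it only measures the zero tail, not the
sparse regions in between.

```c
#include <assert.h>
#include <stdio.h>

/* Returns the number of bytes up to and including the last non-zero byte
 * of the file (0 if it is empty or all zeros), or -1 if it cannot be
 * read. The total file size is stored in *size. */
long used_prefix(const char *path, long *size)
{
    assert(size != NULL);

    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;

    unsigned char buf[4096];
    long pos = 0, used = 0;
    size_t n;
    while ((n = fread(buf, 1, sizeof buf, f)) > 0) {
        for (size_t i = 0; i < n; i++)
            if (buf[i] != 0)
                used = pos + (long)i + 1;
        pos += (long)n;
    }

    int failed = ferror(f);
    fclose(f);
    if (failed)
        return -1;

    *size = pos;
    return used;
}
```

The difference between the stored size and the return value is the
preallocated tail that has never been written.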
Cheers and sorry for the noise,
Thomas
    
    