[PATCH] Request review for KSharedDataCache
Michael Pyne
mpyne at kde.org
Sat Apr 10 05:54:51 BST 2010
Hi all,
Most of you are probably familiar with crashes that have occurred due to
KPixmapCache. I've tried to make it better, and although it's actually better
designed than I first thought, some of the design choices and the fact that
it's needlessly specific to pixmaps led me to come up with a different
implementation of a shared-memory cache, that can support arbitrary data
types, hopefully with more performance and much more stability. I've attached
a compressed patch for your review.
I've worked on this off-and-on probably since about last fall. The basic
overview is that KSharedDataCache creates a file-backed shared memory segment
that can be used to quickly share data between multiple processes. To
coordinate access to the shared data, a pthread_mutex_t is used (I used
pthreads instead of Qt since pthreads natively supports inter-process mutexes
instead of just inter-thread. It would be possible to do a Qt-native spinlock
type of mutex without too much trouble I think).
The cache itself has to be split up somehow. KPixmapCache chose to use two
separate files, one for the index and one for the data, with the index using a
binary tree with offsets into the data file. This made removing entries kind
of a chore, so I went with a hash table arrangement. In my case I also
combined the index and data into a single file (the extra space used for the
index is not counted toward the cache size).
The non-cache portion (the index) is split into the cache header, the index
table (giving information on each cache entry such as size, hash code, pages
used), and the page table (giving information on each page of the cache and
whether it's allocated or not).
Because the memory is fixed-size (more-or-less), there is no usage of buckets
with chaining. Instead a consecutive set of pages is used to store new
entries, so it's still possible to get fragmentation, but it's much easier to
recover from.
Some of the design principles:
* It should be possible to leave the cache in a sane state without having to
do any special un-initialization. This means that I don't have anything like
an "attached client count" or anything similar. To detach from the cache the
process just unmaps the memory segment.
* Accessing memory should be done in as safe a manner as possible since errors
like unaligned memory access can crash a process on lots of architectures.
* When in doubt, kill the cache. I try to simply erase the cache and start
over if ever a problem is developed. This relies on the fact that it's
possible to unlink a file even if it's already mapped into shared memory
without causing a crash. It works for me, but if that's not true on some POSIX
systems out there I'd appreciate a heads up. ;)
* Arbitrary data should be allowed. For this I just store QByteArrays. (I've
included KImageCache, which is a simple wrapper that converts to/from QImage).
* It should be possible to remove cache entries without needing to generate a
whole new cache file. It should be possible to discard the cache without any
special magic. (Using hashing enables this).
* The cache may live a long time, so there should be a way of automatically
cleaning out very old entries if necessary. The way I implement this is, if a
hash collision occurs, the current load factor is examined. If the cache is
more than 50% full, then the use count of the currently existing entry is
divided by 2. The probability goes up as the load factor goes up. If the use
count dropped to 0 then the existing entry is replaced by the new one,
otherwise a quadratic probing procedure is continued. This way old entries get
booted eventually if they are interfering with new entries in a full cache,
but that doesn't happen if the cache is nearly empty anyways.
* There should be a fallback if shared memory doesn't work. (My only fallback
is to still use mmap, but with an anonymous private mapping instead of shared
memory).
* For troubleshooting the cache usage I've added an itemModel() method to the
cache, but I don't think it's good to have for the public API so I'll probably
remove it in the final version unless it's needed for a usecase someone can
think of.
The other major part of this patch is that I've converted the kdelibs users of
KPixmapCache to use KImageCache or KSharedDataCache instead. There's also
users in kdebase/runtime that I have patches for. I haven't looked elsewhere
yet.
Anyways, this is getting kind of long-winded. Although there's a couple of
issues I can think of that I forgot (references to KSharedPixmapCache, stray
references to icons, .h comments and API docs need improved, CMake checks for
pthreads, probably could use msync() after defragmenting or adding items,
etc.) I think it's at least in good enough shape to get some eyeballs on it
other than mine.
Please let me know what you think. Assuming this is good for KDE SC 4.5 I
intend to deprecate KPixmapCache (or at least strongly warn against it in its
API docs).
If you do try it and it freezes your session somehow, you can manually remove
the files by deleting /var/tmp/kdecache-$USER/*.kcache for most systems
($(kde4-config --path cache)/*.kcache in general).
I like mailing lists better for whatever reason but if it's helpful I can put
this on ReviewBoard instead.
Regards,
- Michael Pyne
-------------- next part --------------
A non-text attachment was scrubbed...
Name: kdelibs-kshareddatacache.patch.bz2
Type: application/x-bzip
Size: 18217 bytes
Desc: not available
URL: <http://mail.kde.org/pipermail/kde-core-devel/attachments/20100410/6cbccf76/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: This is a digitally signed message part.
URL: <http://mail.kde.org/pipermail/kde-core-devel/attachments/20100410/6cbccf76/attachment.sig>
More information about the kde-core-devel
mailing list