Proposal to introduce a KDE::stringCache
Dr. Juergen Pfennig
info at j-pfennig.de
Thu Feb 26 18:17:01 CET 2004
Hello List,
I am looking for discussion, but I am not (yet) a subscriber of this list. If
you find that this is architectural stuff, I could re-send it to kde-core-devel.
## The story:
This morning kde-core-devel had some discussion about memory consumption by
kfm when reading very large folders (there was actually an out-of-memory
problem). The thread-starter said that each file entry uses 600 bytes.
How can this be? Answer: an entry uses lots of QStrings and each QString
has 2 (sometimes 3) blocks of memory allocated. Such strings are: FileName,
Group, Owner (maybe more). Each of the mentioned strings uses maybe 50 bytes,
so 3 x 50 = 150 bytes per entry. Here we can help.
## Similar things:
When a KDE app starts it is likely that a kdDebug or kdWarning causes the
kdebugrc file to be loaded. That file often contains tens of entries like:
[7102]
InfoOutput=4
Each "InfoOutput" ends up as an allocated QString and is kept loaded as long as
the app runs. This can easily allocate a couple of kBytes.
## Proposal:
In the above examples each QString was created separately - in other words
even if they contain the same data like "root" or "InfoOutput" this data is
not shared. Each QString has its own copy. The information is redundant, we
increased the entropy without need.
For such situations we should create a little cache that tracks maybe the last
64 QStrings that were created. If we try to create a new QString having data
that is still known by this cache, the cached QString-data should be
referenced (QString works internally with references). This saves one memory
block for each match. Example:
atom.m_str = KDE::stringCache(source_unicode)
or even better:
atom.m_str = KDE::stringCache(source_latin, length)
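To make the idea concrete, here is a minimal sketch of such a cache in plain C++. It is only an illustration under assumptions: `SharedString` (a `std::shared_ptr` to a `std::u16string`) stands in for QString's internally shared data, and the names `StringCache`, `intern` and `hits` are hypothetical, not existing KDE or Qt API. For simplicity it uses 64 hash-indexed slots instead of strictly tracking the last 64 strings created.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>

// Stand-in for QString's shared data (assumption): copying the pointer
// shares one buffer instead of allocating a new one.
using SharedString = std::shared_ptr<const std::u16string>;

// Hypothetical cache with 64 slots, indexed by the hash of the text.
class StringCache {
public:
    static constexpr std::size_t kSlots = 64;

    // Returns shared data for `text`, reusing a cached buffer on a match.
    SharedString intern(const std::u16string& text) {
        const std::size_t slot = std::hash<std::u16string>{}(text) % kSlots;
        SharedString& entry = slots_[slot];
        if (entry && *entry == text) {
            ++hits_;
            return entry;  // match: no new memory block for the data
        }
        // Miss: allocate once and overwrite whatever occupied the slot.
        entry = std::make_shared<std::u16string>(text);
        assert(entry != nullptr);
        return entry;
    }

    // Number of lookups that were answered from the cache.
    std::size_t hits() const { return hits_; }

private:
    std::array<SharedString, kSlots> slots_{};
    std::size_t hits_ = 0;
};
```

Interning "root" twice returns the same buffer both times, so only the first call allocates. An evicted entry stays valid for whoever still holds a reference to it, which mirrors how reference-counted string data behaves.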
The implementation could use a fast hash algorithm to locate the cached string
data. Often QStrings are created from latin1 (most KIO-Slaves do this) - the
hash should be made such that it can be calculated from latin1 or from
Unicode (use the lower 7 bits only). The cache might also store the latin1
data (implement carefully - don't call latin1 of the QString). As a result
the cache could not only save memory but also speed up string creation (no
need to convert latin1 -> Unicode, one malloc less).
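A hash that gives the same value for latin1 and for Unicode input could look roughly like the sketch below. The function names (`hashLatin1`, `hashUtf16`, `sameText`) are hypothetical, and the mixing step is borrowed from the FNV-1a hash purely as an example of a fast algorithm; only the lower 7 bits of each character are fed into it. The comparison helper relies on the fact that a latin1 byte value equals its Unicode code point, so a cache hit can be confirmed without converting anything.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// One FNV-1a style mixing step over the lower 7 bits of a character.
inline std::uint32_t mixUnit(std::uint32_t hash, unsigned unit) {
    return (hash ^ (unit & 0x7Fu)) * 16777619u;
}

// Hash computed directly from latin1 bytes.
std::uint32_t hashLatin1(const char* data, std::size_t length) {
    std::uint32_t hash = 2166136261u;
    for (std::size_t i = 0; i < length; ++i)
        hash = mixUnit(hash, static_cast<unsigned char>(data[i]));
    return hash;
}

// Hash computed from UTF-16 code units; equal to hashLatin1 for equal text.
std::uint32_t hashUtf16(const char16_t* data, std::size_t length) {
    std::uint32_t hash = 2166136261u;
    for (std::size_t i = 0; i < length; ++i)
        hash = mixUnit(hash, static_cast<unsigned>(data[i]));
    return hash;
}

// Compares latin1 input against cached UTF-16 data without converting.
bool sameText(const char* latin1, std::size_t length,
              const std::u16string& utf16) {
    if (utf16.size() != length)
        return false;
    for (std::size_t i = 0; i < length; ++i) {
        if (static_cast<unsigned char>(latin1[i]) != utf16[i])
            return false;
    }
    return true;
}
```

Because only 7 bits per character enter the hash, characters that differ only in higher bits collide, so the full comparison in `sameText` is still needed before a cached entry is reused.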
In situations that use repeated local data, like parsing configuration files or
directories (KIO-Slaves), such a cache could not only save lots of memory but
would also give a speed improvement.
OK, who can now prove that I am an idiot?
Yours Jürgen