KFileItem (Re: Jenkins build became unstable: kdelibs_frameworks_qt5 #982)

Frank Reininghaus frank78ac at googlemail.com
Fri Aug 23 10:57:15 UTC 2013


Hi,

2013/8/22 David Faure:
> On Thursday 08 August 2013 13:17:18 Frank Reininghaus wrote:
>> Hi David,
>>
>> 2013/8/7 David Faure:
>> > On Tuesday 06 August 2013 20:53:05 Frank Reininghaus wrote:
>> >> OK, I see now that it uses pointers to be able to modify the actual
>> >> KFileItems in KDirListerCache (if it would just keep KFileItems and
>> >> modify these, I guess that they would just detach from the ones inside
>> >> the cache, leaving those unchanged).
>> >
>> > Yes.
>> >
>> >
>> > I suppose a solution would be to use a QLinkedList in KDirListerCache.
>> > This
>> > would basically cancel the Q_MOVABLE_TYPE for that particular list (since
>> > it would need to allocate nodes), but every other KFileItemList out there
>> > would still benefit.
>>
>> But it would also slow down things when KDirListerCache has to convert
>> its internal data to KFileItemLists when emitting signals. Hm, I'll
>> try to find some time to think about it and see if there is a good
>> solution. I was going to have a close look at KDirListerCache anyway
>> because I have the impression that the remaining sources of O(N^2)
>> behavior in Dolphin when adding/removing many items to a directory in
>> weird ways are in this class.
>
> Yeah, conversions would not be great. But I don't see where you think
> conversions would happen. The KFileItemList emitted by "itemsAdded" is created
> item by item anyway, in addNewItem(). It's not like we can take the list that
> is used as storage in KDirLister and emit that, since we only want to emit the
> *new* items, not all of them (everything is async and incremental).

OK, you're probably right. I had assumed that copying the items from
one KFileItemList to another KFileItemList is cheaper than iterating
through a linked list, but that is probably not the case if the
KFileItemList only keeps pointers to the actual items anyway. I'll try
to find some time to verify that.
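
Just to make explicit what I had in mind with the "conversion" (purely
a sketch with made-up names, not code from KDirListerCache): if the
cache kept its items in a QLinkedList<KFileItem>, turning them into a
KFileItemList would be a node-by-node traversal plus one shallow copy
per item, since KFileItem is implicitly shared:

#include <QLinkedList>
#include <kfileitem.h>   // declares both KFileItem and KFileItemList

static KFileItemList toKFileItemList(const QLinkedList<KFileItem> &storage)
{
    KFileItemList result;
    QLinkedList<KFileItem>::const_iterator it = storage.constBegin();
    for (; it != storage.constEnd(); ++it) {
        // append() only copies a d-pointer; the cost is mainly the
        // pointer-chasing traversal of the linked list itself.
        result.append(*it);
    }
    return result;
}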

>> >> It looks like a solution for this problem is more complicated than I
>> >> thought, so maybe I'll just revert the commit to make Jenkins happy
>> >> again. However, I still think that making KFileItem a Q_MOVABLE_TYPE
>> >> might be a desirable long-term goal because it saves quite a bit of
>> >> memory (32 bytes per KFileItem on my system).
>> >
>> > 32 *bytes* ? Are you sure? I think you meant 32 bits?
>>
>> Yes, I am sure, see below.
>>
>> > In fact I'm surprised. sizeof(KFileItem) = sizeof(void*), right? So QList
>> > should already lay out the kfileitems next to each other in memory, the
>> > only issue is that insertion makes copies, but I don't see where a memory
>> > saving happens. I thought QList only wasted memory for types bigger than
>> > a pointer (in which case it had to allocate nodes) ?
>>
>> AFAIK, QList only puts items of type T next to each other in memory if
>> sizeof(T) <= sizeof(void*) *and* T is a Q_MOVABLE_TYPE.
>
> Seems right.
>
>> If that is not
>> the case, QList<T> only stores T* pointers next to each other in one
>> block of memory.
>
> Yes, which means taking the size of one pointer more, i.e. 8 bytes per item.
> Not 32.

But the memory allocator needs a considerable amount of memory for its
own bookkeeping; see below.
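
(For what it's worth, the test I describe further below boils down to
something like the following stand-in sketch -- not the actual
KFileItem patch, just a pointer-sized, implicitly shared value class
that is declared movable, so that QList<Item> stores the items
in-place:

#include <QList>
#include <QSharedData>
#include <QString>

class ItemData : public QSharedData
{
public:
    QString name;                          // the actual payload lives here
};

class Item                                 // stand-in for KFileItem
{
public:
    Item() {}                              // "null" item: d stays 0, no allocation
    explicit Item(const QString &n) : d(new ItemData) { d->name = n; }
    bool isNull() const { return !d.constData(); }
    QString name() const { return d.constData() ? d->name : QString(); }
private:
    QSharedDataPointer<ItemData> d;        // sizeof(Item) == sizeof(void*)
};
Q_DECLARE_TYPEINFO(Item, Q_MOVABLE_TYPE);  // QList may now memcpy Items and
                                           // store them directly in its buffer

int main()
{
    QList<Item> list;
    for (int i = 0; i < 1000000; ++i)
        list.append(Item());               // 1 million null items, stored in-place
    return 0;
}

With the Q_DECLARE_TYPEINFO line the million null items live directly
in the QList buffer (~8 MB); without it, QList falls back to storing
Item* pointers and does one "new Item" per entry, which is where the
allocator overhead discussed below comes in.)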

>> I guess the reason for this design decision is that
>> this permits QList<T> to share the same core code (which uses memcpy
>> to move around items in memory), no matter what T is.
>
> It's mainly an optimization: as you saw, when items are small enough,
> replacing pointers with the item itself is a saving, not only in memory usage,
> but also in speed since the data is much closer together (=less memory pages
> to load). But if the data is bigger than a pointer, then it doesn't fit in
> what was originally sized to contain a pointer.
>
> For putting large items in contiguous memory, there's QVector.
>
>> Also my experimental findings in
>> https://git.reviewboard.kde.org/r/111789/ support that. According to
>> massif/massif-visualizer, a simple program that creates a
>> KFileItemList with 1 million Q_MOVABLE null KFileItems needs about 8
>> MB (is to be expected because that's what you need for 1 million
>> pointers on my 64-bit system). However, without Q_MOVABLE_TYPE, i.e.,
>> with the current master or frameworks branch, it needs 16 MB because
>> the list only stores KFileItem* pointers in the 8 MB block, and the
>> memory for the items themselves is allocated with 1 million calls of
>> "new KFileItem" -> 8 MB more.
>>
>> However, this is only the net memory usage. In reality, the memory
>> allocator also needs some memory for its own bookkeeping, and it thus
>> adds a bit of overhead to any memory allocation with new or malloc.
>> With KSysGuard, I found that the difference in memory consumption for
>> my test program with/without Q_MOVABLE_TYPE is a bit more than 30.5
>> MiB, and if you take into account that 1 MiB = 1024*1024 bytes, and my
>> test used 1 million = 10^6 KFileItems, this looks pretty much like
>> every "new KFileItem" occupies 32 bytes. And this is a bit too much
>> for my taste ;-)
>
> It's not that I don't trust you or these tools, but I would like to see an
> explanation of the "32 bytes" from the code rather than from high-level
> measurements. I can understand "losing" 8 bytes per item, but not 32.

OK, I'll try to explain it with code. If I'm not mistaken, the memory
for the individual items in a QList (each "new KFileItem") ends up
being allocated by glibc's malloc, whose source can be found here:

http://code.woboq.org/userspace/glibc/malloc/malloc.c.html

A comment in that file [1] already says that the minimum allocated
size on a system with 8-byte pointers is 24/32 bytes (see below for an
explanation of why it is not one fixed value). The file defines
MIN_CHUNK_SIZE [2] as

offsetof(struct malloc_chunk, fd_nextsize)

where the struct malloc_chunk is defined as [3]

struct malloc_chunk {

    INTERNAL_SIZE_T prev_size; /* Size of previous chunk (if free). */
    INTERNAL_SIZE_T size; /* Size in bytes, including overhead. */

    struct malloc_chunk* fd; /* double links -- used only if free. */
    struct malloc_chunk* bk;

    /* Only used for large blocks: pointer to next larger size. */
    struct malloc_chunk* fd_nextsize; /* double links -- used only if free. */
    struct malloc_chunk* bk_nextsize;
};

INTERNAL_SIZE_T is a size_t by default (8 bytes on a system with
8-byte pointers). The comments say that it can be redefined to a
4-byte int to reduce the amount of overhead (hence the 24/32-byte
range for the minimum allocated size mentioned above).

On a 64-bit system, this means that MIN_CHUNK_SIZE is 32 bytes, unless
the defaults are changed.
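
If you want to reproduce that number without reading through all of
malloc.c, the following standalone snippet re-declares the struct
locally, with INTERNAL_SIZE_T spelled out as size_t (so it is only a
model of the default configuration, not the real glibc code):

#include <cstddef>   // offsetof, std::size_t
#include <cstdio>

// Local model of glibc's malloc_chunk with the default
// INTERNAL_SIZE_T == size_t, only used to reproduce the offsetof
// arithmetic; the real definition lives in malloc.c [3].
struct malloc_chunk {
    std::size_t prev_size;
    std::size_t size;
    struct malloc_chunk *fd;
    struct malloc_chunk *bk;
    struct malloc_chunk *fd_nextsize;
    struct malloc_chunk *bk_nextsize;
};

int main()
{
    // 2 * sizeof(size_t) + 2 * sizeof(void *) = 32 on a 64-bit system
    std::printf("MIN_CHUNK_SIZE = %zu\n",
                offsetof(struct malloc_chunk, fd_nextsize));
    return 0;
}

On my system this prints 32.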

Finally, another comment [4] says that "The smallest size we can
malloc is an aligned minimal chunk" and defines

#define MINSIZE  \
    (unsigned long)(((MIN_CHUNK_SIZE+MALLOC_ALIGN_MASK) & ~MALLOC_ALIGN_MASK))

MALLOC_ALIGN_MASK is 0xf on a 64-bit system with the default
INTERNAL_SIZE_T, if I read the code correctly; it makes sure that
malloc chunks have an alignment of 16 bytes (which is fulfilled anyway
because MIN_CHUNK_SIZE is already 32 bytes).
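
Plugging the default 64-bit values into the MINSIZE definition quoted
above gives the same number (again just the arithmetic, not glibc
code):

#include <cstdio>

int main()
{
    // Default 64-bit values: MIN_CHUNK_SIZE = 32 (see above),
    // MALLOC_ALIGNMENT = 2 * sizeof(size_t) = 16, hence a mask of 0xf.
    const unsigned long MIN_CHUNK_SIZE    = 32;
    const unsigned long MALLOC_ALIGN_MASK = 16 - 1;

    // Mirrors the MINSIZE definition quoted above [4]: round
    // MIN_CHUNK_SIZE up to the next multiple of 16.
    const unsigned long MINSIZE =
        (MIN_CHUNK_SIZE + MALLOC_ALIGN_MASK) & ~MALLOC_ALIGN_MASK;

    std::printf("MINSIZE = %lu\n", MINSIZE);   // prints 32
    return 0;
}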

If I have not made any mistakes in my analysis, this proves that "The
smallest size we can malloc" is 32 bytes on my system (and those of
many other users).
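
A quick cross-check from the user side (glibc-specific, of course;
other allocators will report other numbers): malloc_usable_size()
returns the payload that fits into the chunk that was actually handed
out, i.e. the 32-byte minimum chunk minus the 8-byte size field:

#include <cstdio>
#include <cstdlib>
#include <malloc.h>   // malloc_usable_size() (glibc-specific)

int main()
{
    // An 8-byte request -- the size of one KFileItem, i.e. one d-pointer.
    void *p = std::malloc(8);

    // On a 64-bit glibc this prints 24: the smallest chunk is 32 bytes,
    // of which 8 bytes of chunk header remain in use while the block is
    // allocated.
    std::printf("usable size = %zu\n", malloc_usable_size(p));

    std::free(p);
    return 0;
}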

I hope that you are convinced now ;-)

Cheers,
Frank

[1] http://code.woboq.org/userspace/glibc/malloc/malloc.c.html#108
[2] http://code.woboq.org/userspace/glibc/malloc/malloc.c.html#1232
[3] http://code.woboq.org/userspace/glibc/malloc/malloc.c.html#malloc_chunk
[4] http://code.woboq.org/userspace/glibc/malloc/malloc.c.html#1234

