KDirModelV2, KDirListerV2 and UDSEntryV2 suggestions

Mark markg85 at gmail.com
Tue Feb 5 10:05:35 UTC 2013


On Tue, Feb 5, 2013 at 10:19 AM, David Faure <faure+bluesystems at kde.org> wrote:
> On Tuesday 05 February 2013 09:01:21 Mark wrote:
>> The thing i'm puzzling over most right now is how i can optimize
>> UDSEntry. Internally it's a hash and that is very visible in profiling.
>> Also in KFileItem one part that i find a little strange is this line:
>> http://quickgit.kde.org/?p=kdelibs.git&a=blob&h=6667a90ee9e1d57488bb7e085167
>> 658f2fb9f172&hb=533b48c610319f3ad67e6f5f0cbb65028b009b8f&f=kio%2Fkio%2Fkfile
>> item.cpp (line 290). That line is causing a chain of performance penalties.
>> Which is very odd because i'm testing this benchmark with 500.000
>> files, not directories. It should not even end up in that if.
>
> You're reading the if() wrong.
> When used via KDirLister, KFileItem is constructed with a base URL and a
> UDSEntry. The base URL is the url of the directory (so urlIsDirectory is true)
> and the UDSEntry contains the filename (from the kioslave). So
> m_url.addPath(m_strName) is done, in order to construct the full URL to the
> file.

Ah, okay. That's not obvious from the code. Thank you for clarifying that one.
>
> The thing is, KDirListerCache keeps all KFileItems in cache, for faster
> directory browsing (of already-visited dirs). So if you want to reduce memory
> usage, implement a LRU mechanism in KDirListerCache, to throw out the oldest
> unused dirs. Would help in real life -- not really in your testcase though
> (one huge directory).

My intention is to make it fast enough to not even need a cache.
Though i'm guessing that goal won't be reached since caching will
still be useful for slower media.
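
For reference, the LRU idea could be sketched roughly like this. It is plain standard C++, not the actual KDirListerCache code: DirLruCache, Listing and the capacity parameter are hypothetical names, and the sketch ignores the fact that directories still held by a lister must not be evicted.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for the cached items of one directory.
using Listing = std::vector<std::string>;

class DirLruCache {
public:
    explicit DirLruCache(std::size_t maxDirs) : m_maxDirs(maxDirs) {}

    // Store (or refresh) a listing; evict the least recently used
    // directory if the cache grows past its capacity.
    void insert(const std::string &dirUrl, Listing items) {
        auto it = m_index.find(dirUrl);
        if (it != m_index.end()) {
            m_order.erase(it->second);
            m_index.erase(it);
        }
        m_order.emplace_front(dirUrl, std::move(items));
        m_index[dirUrl] = m_order.begin();
        if (m_order.size() > m_maxDirs) {
            m_index.erase(m_order.back().first);
            m_order.pop_back();
        }
    }

    // Return the cached listing, or nullptr on a miss.
    // A hit moves the directory to the front (most recently used).
    const Listing *find(const std::string &dirUrl) {
        auto it = m_index.find(dirUrl);
        if (it == m_index.end())
            return nullptr;
        m_order.splice(m_order.begin(), m_order, it->second);
        return &it->second->second;
    }

    std::size_t size() const { return m_order.size(); }

private:
    using Entry = std::pair<std::string, Listing>;
    std::size_t m_maxDirs;
    std::list<Entry> m_order; // front = most recently used
    std::unordered_map<std::string, std::list<Entry>::iterator> m_index;
};
```

With a capacity of 2, listing a, then b, then looking up a, then listing c would throw out b. As said, that does nothing for one huge directory.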
>
>> Or am i reading massif wrong..? Massif shows me that line for KUrl
>> data consumption.. One of the highest memory consumptions.
>
> Well, the kfileitems are kept around, and each kfileitem has a KUrl in it, which
> is kept too. I'm surprised that this would be the main use of memory though.
> Well, it's the biggest field in KFileItem, indeed.
>
> We could of course construct this KUrl on demand (so that the "directory" part
> of it is shared amongst all KFileItems, via QString's implicit sharing)...
> This would shift the balance towards "more CPU, less memory", so one would
> have to check the performance impact of such a change.

Just wondering - since this will likely be KF5 material when patched -
will this be any better with QUrl in Qt5? Or is QUrl just as "heavy"
as KUrl?
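
To make the "construct on demand" idea concrete, here is a minimal sketch of how i understand it. It is not the real KFileItem: LazyUrlItem is a hypothetical class, std::string stands in for QString/KUrl, a std::shared_ptr stands in for QString's implicit sharing, and no percent-encoding of the file name is done.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical item: stores only the file name and shares the directory
// URL with every other item that came from the same listing.
class LazyUrlItem {
public:
    LazyUrlItem(std::shared_ptr<const std::string> dirUrl, std::string name)
        : m_dirUrl(std::move(dirUrl)), m_name(std::move(name)) {}

    // Build the full URL only when it is asked for
    // (more CPU per call, less memory per item).
    std::string url() const {
        std::string full = *m_dirUrl;
        if (full.empty() || full.back() != '/')
            full += '/';
        full += m_name;
        return full;
    }

    const std::string &name() const { return m_name; }

private:
    std::shared_ptr<const std::string> m_dirUrl; // one copy per directory
    std::string m_name;                          // one copy per item
};
```

Each item then costs one pointer plus the file name instead of a complete URL, at the price of a string concatenation on every url() call. Whether that trade-off pays off would have to be measured.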

Also, let's discuss the memory usage a bit since that really shocks me.
I have a folder with 500.000 files (generated). All 0 bytes. The
filename is like this:
a000000.txt - a500000.txt
with the path:
/home/mark/massive_files/

Now if we do a very rough calculation, that means one complete full url
looks like this:
file:///home/mark/a000000.txt
That line is 29 characters, so let's say 29 bytes as well. Let's say we
need a bit more than that for bookkeeping in QString, perhaps some
other unexpected stuff, so let's make it 48 bytes (just to be generous).
If we multiply that by 500.000 we get:
48 * 500.000 = 24000000 bytes (22.9 MiB)

Now let's be very generous and say that we need 5x the data to store
all other fields and bookkeeping.
22.9 * 5 = ~115 MiB (rounded)

This is uncompressed and seems very generous in terms of space. That
would be the memory usage i would expect for an unoptimized KDirModel
and friends. But as you can see from the screenshot, it's using a lot
more memory than this. What i'd like to know is if my logic from above
is what could (and should) be expected or am i missing something that
is eating up a lot more memory?
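
The rough calculation above can be checked with a few helper functions. This is only a sketch: the function names are made up for illustration, and the per-item byte counts are my guesses from above, not measured values. One thing it does take into account is that QString stores UTF-16, so every character costs 2 bytes rather than 1, and that the real URL also contains the massive_files/ directory.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Total bytes for itemCount items that each cost bytesPerItem.
constexpr std::size_t totalBytes(std::size_t itemCount, std::size_t bytesPerItem) {
    return itemCount * bytesPerItem;
}

// Convert bytes to MiB (1 MiB = 1024 * 1024 bytes).
constexpr double toMiB(std::size_t bytes) {
    return static_cast<double>(bytes) / (1024.0 * 1024.0);
}

// Character payload of an ASCII URL when stored as UTF-16
// (2 bytes per character), ignoring any per-string overhead.
inline std::size_t utf16PayloadBytes(const std::string &asciiUrl) {
    return asciiUrl.size() * 2;
}
```

totalBytes(500000, 48) gives 24000000 bytes, which toMiB() turns into roughly 22.9 MiB. For the full URL file:///home/mark/massive_files/a000000.txt (43 characters), utf16PayloadBytes() gives 86 bytes, so the 48 bytes per URL was not as generous as i thought, although that alone probably does not account for the whole difference.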


More information about the Kde-frameworks-devel mailing list