KIO::ListJob::entries emits twice for folders with 100+ entries. Why?

David Faure faure at kde.org
Fri Nov 23 16:33:28 GMT 2012


On Friday 23 November 2012 16:26:59 Mark wrote:
> 1. SlaveBase::listEntry is really using wrong names and types to get
> something done. For instance, d->totalSize is a KIO::filesize_t type
> and is being used to measure a list count. It's in reality no
> "filesize" at all! That just adds to the confusion.

I understand your confusion. This actually documented, though:

    /**
     * Call this in get and copy, to give the total size
     * of the file
     * Call in listDir too, when you know the total number of items.
     */
    void totalSize( KIO::filesize_t _bytes );

Well, it was 2000 or so, someone thought reusing a method was a good idea :)
 
> 2. this is where things get really interesting! The actual
> file_unix.cpp listdir function seems wrong to me. [1]
> The function FileProtocol::listDir is basically having two distinct
> directory listings.
> 1 for the case where details are 0 (if)
> 1 for the case where the details are not 0 (else)

Right. And I'm to blame for the details==0 code.

Hmm, now I see the problem already: I forgot to call totalSize() in that code 
path. For speed reasons in that code, I didn't want to iterate multiple times, 
so I emit the stuff as it comes out of readdir...

> Both cases use a different "tactic". If the details are 0 then a
> folder is read, file by file, and a listEntry(entry, false); is done
> for each file. So here you do have possible IO time.
> Now if the details are not 0 then the tactic is different. The
> folder is read in it's entirety inside a list (line 397 of [1]). Till
> that moment it hasn't done any calls to listEntry(...) yet! Rather
> iterates through the list again and builds up the UDSEntry stuff on
> line 430 (also [1]).

Yes (because it wants to stat() each file, to provide full details).

> While the step for details = 0 is faster (visually) it does include IO
> time and is therefore probably slower in time compared to details > 0.

No, that is definitely wrong. They both start with the same I/O (readdir), and 
then for details > 0 you get a lot more I/O (stat for each file).

> The details > 0 is only using CPU time and is probably faster thus
> that's probably why i'm not seeing a second signal in the "details >
> 0" case.

Isn't it rather that the buffering in SlaveBase gets confused because it 
doesn't know the total number of items to come (the famous "totalSize" int), 
so it emits the items more often than in the details>0 case, where the number 
of items is known?

> My preference is restructuring FileProtocol::listDir to one of the two
> cases that it's doing now and adjust FileProtocol::createUDSEntry to
> handle details = 0 properly.

Moving code to createUDSEntry wouldn't give us much, since the whole point is 
"no stat if details==0", so basically it would be an if at the beginning of 
the method -- no behavior change overall.

If you want to let SlaveBase know about the total number of items to come, 
then you need to create a QList of UDSEntry in the details==0 code path, add 
the entries to the list, and then emit them. I don't like this, it uses more 
CPU and memory, all this just to give a total to SlaveBase.

I would prefer a fix that makes SlaveBase handle better the case where the 
total number of items isn't known.

Or I would just leave it as it is now, because honestly I don't see a big 
problem with entries() being emitted in two chunks, it's perfectly valid.

> What's your opinion here?
> Note: These issues that i find never stay on the surface do they..
> This one is diving very deep in KIO, the shortcuts one was diving very
> deep in KDE internals and even Qt..

Well, this is 2012. The easy bugs were fixed long ago, only the interesting 
ones remain :-)

-- 
David Faure, faure at kde.org, http://www.davidfaure.fr
Working on KDE, in particular KDE Frameworks 5





More information about the kde-core-devel mailing list