KIO directory listing - CPU slows down SSD

Aaron J. Seigo aseigo at kde.org
Wed May 28 22:21:56 UTC 2014


On Wednesday, May 28, 2014 21:12:43 Mark Gaiser wrote:
> You've written that with the assumption of backwards compatibility.
> It's a nice idea, but why should we even try to remain backwards
> compatible?

The question should be inverted: why *should* we break compatibility?

Least change always wins, because least change means:

* least effort
* retained knowledge (people already know the existing code)
* a lower chance of introducing new bugs

The question is never "why shouldn't we change things" but "why should we 
change things"...

> It's all internal to SlaveBase so we can change whatever
> we want if that suits a more optimized design.

If it is more optimized, yes ... which is why I did a bunch of measurements on 
the seek-back-and-set method to see whether it was slow. (It isn't.)

> Consider this patch (lives for a month):
> http://p.sc2.nl/pgfto3npy
> 
> I don't see any harm in introducing a second path for slaves to send
> their entries to the client. This new path from the patch above
> completely eliminates the UDSEntry object need from a SlaveBase point
> of view. It's then up to the slaves themselves to make use of it.

Yes, this much makes sense. It allows us to preserve the wire format while 
making a significant optimization...

Preserving the wire format is particularly desirable as the format is not 
formally described anywhere ("the code is the documentation"), so I would 
consider it highly fragile.
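
For the sake of discussion, this is roughly how I picture the slave-facing 
side of such a path; all of the names below are made up and the field layout 
is only a stand-in, not the real wire format:

#include <QByteArray>
#include <QDataStream>
#include <QIODevice>
#include <QString>

// Sketch only: illustrative names and field layout, not the real KIO API
// or wire format.
class EntryStream
{
public:
    EntryStream() : m_stream(&m_bytes, QIODevice::WriteOnly) {}

    // one call per directory entry (field count known upfront in this
    // variant), then one call per field; no UDSEntry object anywhere
    void startEntry(quint32 fieldCount) { m_stream << fieldCount; }
    void addField(quint32 id, const QString &value) { m_stream << id << value; }
    void addField(quint32 id, qint64 value) { m_stream << id << value; }

    // the accumulated bytes are what eventually gets shipped to the client
    QByteArray bytes() const { return m_bytes; }

private:
    QByteArray m_bytes;   // declared before m_stream so it is constructed first
    QDataStream m_stream;
};

The fields go straight into the QDataStream over the QByteArray that gets 
shipped, so nothing per-entry has to be allocated, filled and torn down on 
the slave side.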

> With some work the current listEntries(const UDSEntryList &list) could
> even be adjusted to the streaming based mechanism (aka, inject a
> "startEntry" to the stream right before it does "stream << entry"
> which would make all slaves use the new path.

Do you mean listEntry? listEntries already takes a batch of UDSEntry objects, 
so not much would be gained there. Modifying listEntry, however, to use the 
streaming method would be a win.

> Thinking about this again, i won't even need a "endEntry" call as long
> as each new entry starts with a "startEntry".

The last entry can be closed in the listEntries(const QByteArray &) method, 
which means endEntry() is unnecessary even with the field count.
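
Roughly like this (again, every name here is invented; it is just to show the 
shape of it, with the field count not known upfront):

#include <QByteArray>
#include <QDataStream>
#include <QIODevice>
#include <QString>

// Sketch only: startEntry() closes the previous entry and writes a
// placeholder count; taking the batch closes the last entry. No endEntry().
class EntryBatch
{
public:
    EntryBatch() : m_stream(&m_bytes, QIODevice::WriteOnly) {}

    void startEntry()
    {
        closeOpenEntry();                    // previous entry gets its real count
        m_countPos = m_stream.device()->pos();
        m_stream << quint32(0);              // placeholder field count
        m_fieldCount = 0;
    }

    void addField(quint32 id, const QString &value)
    {
        m_stream << id << value;
        ++m_fieldCount;
    }

    QByteArray take()
    {
        closeOpenEntry();                    // the *last* entry is closed here,
        return m_bytes;                      // i.e. when the batch is handed over
    }

private:
    void closeOpenEntry()
    {
        if (m_countPos < 0) {
            return;
        }
        const qint64 end = m_stream.device()->pos();
        m_stream.device()->seek(m_countPos); // seek back ...
        m_stream << quint32(m_fieldCount);   // ... set the real count ...
        m_stream.device()->seek(end);        // ... and carry on where we left off
        m_countPos = -1;
    }

    QByteArray m_bytes;
    QDataStream m_stream;
    qint64 m_countPos = -1;
    quint32 m_fieldCount = 0;
};

No endEntry() anywhere, and the reading side always knows exactly how many 
fields to expect per entry.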

> I can then use that to
> detect the start of a new entry.

Not a great idea, imho. It means checking EVERY field read. It also means 
having to read the next field as if it *were* "startEntry" and then, if it 
isn't, re-reading it as a "normal" field. That means double deserialization 
... but even worse, if a field happens to start with the same bit sequence 
... ouch. No, the field count is a good idea.

> also removes the need to know how many fields are present in a given
> entry and conveniently also removes the need to move back in a
> datastream :)

Moving back in the datastream is stupidly cheap. That was the entire point of 
the benchmarking I did :) There's no reason to try to avoid it ...
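
Something of this shape is enough to see it (a reconstruction for 
illustration, not the actual benchmark code):

#include <QByteArray>
#include <QDataStream>
#include <QDebug>
#include <QElapsedTimer>
#include <QIODevice>
#include <QString>

// How expensive is the per-entry seek-back-and-set? All of the seeking stays
// inside an in-memory buffer.
int main()
{
    QByteArray data;
    QDataStream stream(&data, QIODevice::WriteOnly);

    QElapsedTimer timer;
    timer.start();

    for (int entry = 0; entry < 500000; ++entry) {
        const qint64 countPos = stream.device()->pos();
        stream << quint32(0);                                 // placeholder count
        stream << quint32(1) << QStringLiteral("some name");  // field 1
        stream << quint32(2) << qint64(123456);               // field 2
        const qint64 end = stream.device()->pos();
        stream.device()->seek(countPos);                      // seek back ...
        stream << quint32(2);                                 // ... set the real count ...
        stream.device()->seek(end);                           // ... and continue
    }

    qDebug() << "500k entries with seek-back-and-set:" << timer.elapsed() << "ms";
    return 0;
}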
 
> Something else that comes in mind here is the current batching. Right
> now each batch is 200 entries max. But if we know the actual byte size
> (which we do when filling a stream) then we might as well change the
> batching to not exceed X number of kb.

Yep.
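
Something along these lines, perhaps (the threshold value and the names are 
placeholders, not actual KIO code):

#include <QByteArray>
#include <functional>

// Sketch only: flush by accumulated byte size instead of a fixed 200-entry
// count.
constexpr int kMaxBatchBytes = 256 * 1024;

void maybeFlush(QByteArray &batch, const std::function<void(const QByteArray &)> &send)
{
    // the batch goes out as soon as it crosses the size threshold,
    // no matter how many entries it happens to contain
    if (batch.size() >= kMaxBatchBytes) {
        send(batch);
        batch.clear();
    }
}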

> That might be more efficient
> batching. We obviously keep the timer. Send all entries currently
> collected once 300ms has passed.

Agreed ... in fact, for the *first* buffer I'd make the timer even shorter to 
give an initial impression of speed. It's a complete hack and wouldn't really 
be any faster overall (if anything, the contrary), but I bet it would be 
noticeable in the GUI.

(Perhaps even a back-off algorithm: the first time-out is 50ms; if a buffer 
gets sent due to a timeout, back off to 100ms; repeat until you hit 300ms. 
Easy to implement, and it should hopefully get the first items to the client 
quicker.)
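
A rough sketch of that idea (everything below is invented for illustration; 
the real thing would only back off when a buffer actually went out because of 
the timeout):

#include <QTimer>
#include <functional>

// Sketch only: a short first timeout so the GUI gets something to show
// quickly, then step the interval up to the normal 300 ms.
class BatchFlushTimer
{
public:
    explicit BatchFlushTimer(std::function<void()> flush)
        : m_flush(std::move(flush))
    {
        m_timer.setInterval(50);                   // first timeout: 50 ms
        QObject::connect(&m_timer, &QTimer::timeout, [this]() {
            m_flush();                             // send whatever has been collected
            // back off: 50 -> 100 -> 200 -> 300 ms, then stay there
            m_timer.setInterval(qMin(m_timer.interval() * 2, 300));
        });
        m_timer.start();
    }

private:
    std::function<void()> m_flush;
    QTimer m_timer;
};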

> > soooo.. i think performance wise this should be just fine, seeing as one
> > would probably be sending buffers on the order of some 100s of KB rather
> > than 100s of MB ;)
> 
> I can always increase the benchmark from 500.000 files to even more if
> needed ^_-

Heh ... but you'd never send 500k files at a time :) This is all *per buffer 
sent*, so if you send 400MB of data but do so in 250k chunks, the extra 
seeking still won't show up in the callgraph.

On the other hand, hopefully you can make this code fast enough that you'll be 
tempted to raise it to 500k files to REALLY push it. ;)


.. you'll also find some perhaps useful cleanups and optimizations in the 
aseigo/cleanups branch in kio. Feel free to cherry-pick.

-- 
Aaron J. Seigo