KIO directory listing - CPU slows down SSD

Fri Jun 6 10:42:50 UTC 2014

On Fri, Jun 6, 2014 at 11:10 AM, Aaron J. Seigo <aseigo at kde.org> wrote:
> On Monday, June 2, 2014 20:54:11 Mark Gaiser wrote:
>> On Mon, Jun 2, 2014 at 6:42 PM, Aaron J. Seigo <aseigo at kde.org> wrote:
>> > On Thursday, May 29, 2014 16:32:28 Mark Gaiser wrote:
>> >> On Thu, May 29, 2014 at 12:21 AM, Aaron J. Seigo <ase
>
>> But don't you just move logic to the slave that way?
>
> yes and no.
>
> sorting and grouping are easily abstracted. sort/group currently happens
> client side and doesn't care where the data comes from. so while moving the
> logic to the slave, it should be achieved in a slave-neutral fashion.
>
>> And lose flexibility in the apps using the slave (like dolphin)?
>
> if they need to sort / group in some way not supported by KIO they can request
> the results unsorted and then you have exactly the same situation we have now.
> there would be no loss of flexibility.
>
> and really: how often do new sort / group methods appear in KDE file listings?
> very, very rarely. new methods could also be moved to the ioslave side
> later on where they would be usable by all clients.
>
>> Oh and
>> complicating kio "a bit" to pass a sorting and/or grouping style.
>
> yes, it would make the protocol more complex.
>
>> Also, for a slave to give you the n items in a sorted way requires the
>> slave to fetch _all_ items to do the sorting.
>
> not all slaves, no. some protocols offer server-side functionality.
>
> the file slave (among others) would indeed need to fetch all items, yes .. but
> that is *precisely* what happens currently on the client side anyways. so
> whatever mechanisms are employed client-side could be done in the slave, minus
> the IPC overhead and with whatever benefits can be gained from using lower-
> level API in the slave.
>
>> All that you will save is IPC traffic. It might
>
> not necessarily; common use cases like "sort by name" may end up significantly
> faster if the slave can quickly gather file names and process them into order
> before stat'ing.
>
>> not even be faster. Take a slow ftp for example. Without "slave side
>> sorting" you would get your first results after 300ms, guaranteed.
>
> slaves dealing with slow data sources could do incremental updates. done well
> the worst case should be equal to the current typical case.
>
> i used the word "stream" and perhaps that was not the best word to use. it's
> more like synchronized models where population is informed by
>
> * sorting preference (inc "none")
> * grouping preference (inc "none")
> * offset (if any)
> * hard and soft limits (if any)[1]
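
To make those four inputs concrete, here is a minimal Python sketch of what such a request could look like. Everything in it is hypothetical (ListingRequest, select_window and the field names are not KIO API), grouping is carried but not applied, and since the thread does not spell out what a soft limit means, that field is stored but unused.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ListingRequest:
    """Hypothetical request a client could send to a slave (not a KIO type)."""
    sort_by: Optional[str] = None     # e.g. "name"; None means unsorted
    group_by: Optional[str] = None    # e.g. "type"; None means ungrouped
    offset: int = 0                   # index of the first entry wanted
    soft_limit: Optional[int] = None  # semantics assumed undefined; unused here
    hard_limit: Optional[int] = None  # never return more entries than this


def select_window(entries, request):
    """Apply sort, offset and hard limit to a list of entry dicts."""
    if request.sort_by is not None:
        entries = sorted(entries, key=lambda e: e[request.sort_by])
    end = None if request.hard_limit is None else request.offset + request.hard_limit
    return entries[request.offset:end]
```

A request with sort_by=None and no limits hands back the entries in arrival order, which is the "same situation we have now" case.
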
>
>> With server side sorting you will have to wait however long the slave
>> takes to fetch _all_ items. Then you can do a (partial) sort and send
>> it to the client.
>
> if you want perfect sorting (e.g. no re-layouts) then this is true in the
> current case too: the client has to wait for ALL results to be able to sort
> items correctly.
>
> so let's assume that incremental sort is allowable and the streamed data can
> be updated by the slave. in that case one can view it as an incrementally
> updating window onto a dataset.
>
> example: 1000 items to be sorted by name with a normal distribution across all
> values in the names. the client is interested in at least the first 100 items.
> the slave fetches the first 100 entries, sorts them, and they are immediately
> streamed to the client.
>
> the slave fetches the next 100 entries and adds them to the sorted set. 10
> items (statistically that should be about the rate) are no longer in the first
> 100 items in the set; those 10 items are now streamed across with their index
> and updated indexes are sent for items at the previous indexes 89-99.
>
> rinse / repeat until listing is complete.
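
A rough Python sketch of that incremental window, just to make the bookkeeping concrete. Nothing here is KIO API: stream_window, apply_updates and the (index, name) update format are made up for illustration, and the sketch simply resends every window position whose content changed rather than sending moves.

```python
import bisect


def stream_window(batches, window=100):
    """Yield (index, name) updates so a client can mirror the first
    `window` names in sorted order while batches are still arriving.

    An update means "the entry at position `index` is now `name`".
    """
    sorted_names = []  # everything fetched so far, kept in order
    sent = []          # what the client currently holds for the window
    for batch in batches:
        for name in batch:
            bisect.insort(sorted_names, name)
        current = sorted_names[:window]
        for i, name in enumerate(current):
            if i >= len(sent) or sent[i] != name:
                yield (i, name)
        sent = current


def apply_updates(view, updates):
    """Client side: apply updates to a list that mirrors the window."""
    for i, name in updates:
        if i == len(view):
            view.append(name)
        else:
            view[i] = name
    return view
```

Once a batch no longer changes the window, nothing is sent for it at all, which is where the IPC savings would come from.
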
>
> now if sorting is done by name, this could result in even more speed-ups: only
> the items in the current window of results being sent would need to be stat'd.
> the rest of the items could be stat'd lazily (or when the window shifts)
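
For a local directory the "names first, stat later" idea could look roughly like this in Python. list_sorted_window is a made-up helper, and the plain string sort stands in for whatever locale-aware or natural sorting a real file manager would use.

```python
import os


def list_sorted_window(path, offset=0, count=100):
    """Sort by name using names only, then stat just the requested window."""
    names = sorted(os.listdir(path))  # listing names needs no per-entry stat
    window = names[offset:offset + count]
    return [(name, os.lstat(os.path.join(path, name))) for name in window]
```

Entries outside the window are never stat'd here; a real implementation would stat them lazily or when the window shifts.
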
>
> obviously, for a slave listing youtube results that would be entirely
> unnecessary: simply query the youtube API for the exact result set desired.
>
>> > KIO listing is all-or-nothing batch oriented; a stream-based approach that
>> > supports seeking through listings that are pre-sorted/grouped in the slave
>> > process would be moderately gorgeous. it would prevent more IPC than
>> > necessary and allow the slave to use any&all service-specific features
>> > for pre-sort/group of entries.
>>
>> So you would save the stream in the slave side?
>
> that would depend partially on the slave (a youtube video slave wouldn't need
> to do this; the file slave might). the client is also going to have a model in
> its process with the data that is populated, so there would always be data
> client-side.
>
>> But how would you
>> change the sorting if you just have a stream? Parse all data, sort it
>> and put it in a stream again?
>
> hard to say without some experimentation. my first thought would be to divide
> this into two cases: completed listing and listing in progress.
>
> in the case of a completed listing it would happen entirely client-side.
>
> in the case of a listing in progress, perhaps immediately re-sort/group the
> data already on the client side (for fast response) and then incrementally re-
> populate from the slave which may elect to hold on to all results until
> listing is completed (for quick re-sort/group if needed)
>
> a file manager might approach the streaming this way:
>
> * if the directory has fewer than N entries, stream entries until the end (can
> be done by simply requesting from the stream until N entries arrive, and only
> go beyond that number if required). for MOST directories this would result in
> a fast listing where items don't move around in the UI and allow changes in
> sort/group to happen client-side
>
> * if the directory has more than N entries, then seek around in the stream as
> needed and if a re-sort / re-group is requested then process what has been
> received already and wait for more responses. this is an edge case, however,
> as MOST directories should fall under the N cap
>
> a file manager for mobile devices would have a smaller N and a desktop file
> manager would have a bigger N.
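
The N cap decision could be sketched like this in Python. plan_listing and the two mode names are invented for illustration, and reading one entry past N to detect a big directory is my assumption, not something stated above.

```python
from itertools import islice


def plan_listing(entry_stream, n):
    """Read at most n + 1 entries and decide how the client should proceed.

    Returns ("complete", entries) when the listing fits under the cap, so
    sorting and grouping can stay client-side, or ("windowed", entries read
    so far) when the client should seek around in the stream instead.
    """
    head = list(islice(entry_stream, n + 1))
    if len(head) <= n:
        return "complete", head
    return "windowed", head
```

A mobile file manager would call this with a small n, a desktop one with a larger n.
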
>
>> > that's easy to do with QML. we have numerous examples of this in plasma
>> > active, in fact. the real trick is ensuring that your entries come to you
>> > pre-arranged so you don't show them moving around to the user.
>>
>> If you have a view - where you don't scroll - then you're right. Then
>> it's possible.
>> If you have a view where entries flow in _while_ you are scrolling..
>> well.. then you're screwed. The current listview just stutters when
>> you do a trick like that (and yes, my pc is fast enough). I'm guessing
>> it's doing a lot of re calculations and repaints when entries flow in
>> and you move through them..?
>
> or that the fetch / processing is happening in the same thread that is
> populating the QML. one of the common tricks for QML apps doing fluid image
> galleries from slow data sources is to fetch the image data in a separate
> thread and supply it as it comes in to the QML. then things stay very smooth.

We're drifting in awesome ideas now :)

Okay, one more case where I really wonder how it would work with the above ideas:
how would you do slave-side grouping and sorting?

To elaborate on that.
I want to be able to group my files (by type) and then have the
ability to sort each individual group. Dolphin has this partly. It can
group and sort, but all groups are sorted in the same way (e.g. by
name). In accretion I have it exactly how I want it: I can sort each
group individually, but that requires a bit of bookkeeping. But how do
you envision a sort like that on the slave side?
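
To be concrete about what I mean by per-group sorting, here is a small Python sketch. It is not accretion's actual code; group_and_sort and the entry fields are made up for illustration.

```python
from collections import defaultdict


def group_and_sort(entries, group_key, sort_keys, default_sort="name"):
    """Group entries, then sort each group by its own field.

    sort_keys maps a group name to the field its entries are sorted by;
    groups not listed there fall back to default_sort.
    """
    groups = defaultdict(list)
    for entry in entries:
        groups[entry[group_key]].append(entry)
    result = {}
    for group in sorted(groups):
        field = sort_keys.get(group, default_sort)
        result[group] = sorted(groups[group], key=lambda e: e[field])
    return result
```

With sort_keys = {"image": "size"} the image group is ordered by size while every other group stays ordered by name. The per-group mapping is the extra state a slave would need to be told about.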

I even want to make that kind of grouping the default in accretion (or
at least for me as a user) simply because I find that a very
convenient way of displaying files.