KIO experimental work report, RFC

Andreas Hartmetz ahartmetz at gmail.com
Fri Nov 21 00:52:24 GMT 2008


Hi all,

Long time no post from me, eh...
During my SoC project, and as a continuation of it, I did some work in KIO
that didn't make it into 4.2. Networking has its vagaries, and the work needs
a large amount of testing and bugfixing that I didn't really put the necessary
time into. The changes I'm going to describe not only expose bugs but also
create them, in the sense that things that were maybe not nice before are
flat-out broken with them. Most importantly, KIO users can exhibit stalls or
errors (no idea where the errors come from ATM) if the number of ioslaves is
effectively limited.

- HTTP pipelining for Konqueror, using a class called PipelineScheduler that
  is a KIO::SimpleJob and manages a list of jobs to be pipelined. After much
  debugging of the required functionality in the HTTP ioslave and in the
  PipelineScheduler class itself... it turns out that some servers are broken
  as hell. static.flickr.com is the worst example: it silently drops TCP
  packets if there is more than one HTTP request header in a packet.
  Probably a crap load balancer.
  Also (and this is somewhat surprising), a speed advantage only seems to
  exist on high-latency connections, so at least it's useful for mobile
  devices. And yes, this is also true when using several pipelined
  connections to a server.
  I'm saying this because I'd like to get one of the free N810s, see :)
  More seriously, I don't know whether pipelining is just not as useful today
  or whether my implementation is suboptimal. There are not that many
  tunables, though.
  I've asked the Mozilla guys and they said they have dropped pipelining
  because, paraphrasing here, it fills their bugtracker. Oh well...

- Optional hard limits on the number of jobs in KIO. Currently a job can be
  - scheduled: it won't cause more ioslaves to be created than the
    per-application, per-protocol limit allows. It is started only when
    there are no unscheduled jobs waiting.
  - unscheduled: the job *will get a slave* (a new one if necessary) and run
    at the next opportunity. This is the default behavior if a job is not
    explicitly scheduled using KIO::Scheduler::scheduleJob().

  What's good is that there are two priorities. KHTML uses this to schedule
  e.g. important stylesheets for immediate transfer. What's bad is that
  forking lots of ioslaves is not free and can cause slight hangs. Beyond
  roughly ten ioslaves, adding more does not usually seem to improve network
  performance either.

  To fix the problem of an unlimited number of slaves (in participating apps,
  which means Konqueror ATM), I created KIO::Scheduler::prioritySchedule() for
  high-priority scheduling with limited slave creation. I've also made the
  number of slaves per protocol *and* per host tunable, overriding the values
  in .protocol files. Note that per-host limits are completely new.
  My modified khtml::Loader uses prioritySchedule(), and I've also created a
  simple KControl module for the two tunables, plus an "enable pipelining"
  checkbox.
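
To make the two limits and the two priorities concrete, here is a minimal
sketch in plain C++. It is not the actual KIO code: the Job struct, the
LimitedScheduler class and its startReadyJobs()/finished() methods are
hypothetical names made up for illustration, and only prioritySchedule()
borrows its name from the function described above. The sketch assumes that
high-priority jobs are tried first but still have to respect both limits, and
that the per-host limit is keyed on the host name alone.

```cpp
#include <cassert>
#include <deque>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a KIO job; only the fields the limits need.
struct Job {
    int id;
    std::string protocol;
    std::string host;
};

// Sketch of a scheduler with hard per-protocol and per-host limits.
class LimitedScheduler {
public:
    LimitedScheduler(int maxPerProtocol, int maxPerHost)
        : m_maxPerProtocol(maxPerProtocol), m_maxPerHost(maxPerHost) {}

    // Normal priority: queued behind other normal-priority jobs.
    void schedule(const Job &job) { m_normal.push_back(job); }

    // High priority: considered before all normal-priority jobs,
    // but still subject to both limits.
    void prioritySchedule(const Job &job) { m_high.push_back(job); }

    // Start every queued job that fits within both limits; return them
    // in the order they were started.
    std::vector<Job> startReadyJobs() {
        std::vector<Job> started;
        startFrom(m_high, started);
        startFrom(m_normal, started);
        return started;
    }

    // Called when a job is done; frees its slot in both counters.
    void finished(const Job &job) {
        assert(m_runningPerProtocol[job.protocol] > 0);
        assert(m_runningPerHost[job.host] > 0);
        --m_runningPerProtocol[job.protocol];
        --m_runningPerHost[job.host];
    }

private:
    void startFrom(std::deque<Job> &queue, std::vector<Job> &started) {
        for (auto it = queue.begin(); it != queue.end();) {
            if (m_runningPerProtocol[it->protocol] < m_maxPerProtocol
                && m_runningPerHost[it->host] < m_maxPerHost) {
                ++m_runningPerProtocol[it->protocol];
                ++m_runningPerHost[it->host];
                started.push_back(*it);
                it = queue.erase(it);
            } else {
                ++it; // blocked by a limit; the job stays queued
            }
        }
    }

    int m_maxPerProtocol;
    int m_maxPerHost;
    std::deque<Job> m_high;
    std::deque<Job> m_normal;
    std::map<std::string, int> m_runningPerProtocol;
    std::map<std::string, int> m_runningPerHost;
};
```

The point of the design is that no call ever creates a slot beyond the
limits; a job that does not fit simply waits until finished() frees one.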

Note that real per-user, per-remote-host connection limits are recommended by
the HTTP spec, but almost no one implements them. So they are not really
necessary, and implementing them would bring even more potential for coding
bugs and other problems. It would mostly be an interesting programming
exercise.

None of this stuff works reliably enough for wide release. Pipelining
especially is barely usable due to the many ways servers can screw it up...
FWIW, dot.kde.org works great while our friendly competition is far too 
dependent on flickr's services :)
I'm hopeful about connection limits but I'd definitely like to hear more 
opinions, ideas and general input (see subject line!).
A sensible course of action could be to merge connection limits into trunk
after 4.2 branching and be ready to revert them if too many problems turn up
or no benefit is seen. Pipelining can be merged as soon as it works well
enough to make sense for somebody; it's disabled by default.
If an Opera developer wants to tell me their secret of pipelining problem
avoidance, just drop me a mail :P [One part I've already noticed: Opera 9.5
only uses pipelining if connections per host are limited to fewer than ~3.]
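
For readers who have not looked at pipelining on the wire, here is a minimal
sketch in plain C++ of what it involves. It is not the PipelineScheduler code:
buildPipelinedRequests() and PipelineTracker are hypothetical names made up
for illustration. The sketch relies on one fact from the HTTP/1.1 spec,
namely that a server must send pipelined responses in the same order as the
requests, and it assumes that requests left unanswered when a connection
breaks are simply collected for re-sending.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <vector>

// Concatenate several GET requests so that they can be written to one
// connection without waiting for the individual responses.
std::string buildPipelinedRequests(const std::string &host,
                                   const std::vector<std::string> &paths)
{
    assert(!host.empty()); // HTTP/1.1 requires a Host header
    std::string out;
    for (const std::string &path : paths) {
        out += "GET " + path + " HTTP/1.1\r\n";
        out += "Host: " + host + "\r\n";
        out += "\r\n";
    }
    return out;
}

// Tracks which request each incoming response belongs to.
class PipelineTracker {
public:
    void requestSent(const std::string &path) { m_outstanding.push_back(path); }

    // Responses arrive in request order, so the next complete response
    // answers the oldest outstanding request. Returns an empty string if
    // nothing is outstanding.
    std::string responseReceived() {
        if (m_outstanding.empty())
            return std::string();
        std::string path = m_outstanding.front();
        m_outstanding.pop_front();
        return path;
    }

    // After a broken connection: hand back everything that never got a
    // response, so the caller can send it again.
    std::vector<std::string> takeUnanswered() {
        std::vector<std::string> rest(m_outstanding.begin(), m_outstanding.end());
        m_outstanding.clear();
        return rest;
    }

private:
    std::deque<std::string> m_outstanding;
};
```

When several small requests are written back to back like this, more than
one request header can end up in a single TCP packet, which is the situation
the broken servers mentioned above choke on.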

Patches will come in the next 24 hours, when I'm at home and feel like doing 
the necessary legwork.

Cheers,
Andreas




More information about the kde-core-devel mailing list