Review Request: Fix for connecting to FTP sites through HTTP proxy

Andreas Hartmetz ahartmetz at gmail.com
Thu Mar 10 10:44:58 GMT 2011


On Thursday 10 March 2011 08:14:04 Dawit A wrote:
> On Wed, Mar 9, 2011 at 6:32 PM, Andreas Hartmetz <ahartmetz at gmail.com> wrote:
> > On Wednesday 09 March 2011 23:43:23 Dawit A wrote:
> [snipped]
> 
> >> Yes, it is, but not for those reasons. That entry was not put there
> >> simply because we had an HTTP proxy implementation, but because FTP
> >> requests can be routed through an HTTP proxy server just like HTTP and
> >> HTTPS requests. IOW, what do you do when the user specifies an HTTP
> >> proxy server as the FTP proxy? It is legal and valid to do that. What
> >> is happening in KDE now is that this functionality is not supported
> >> for some unknown reason.
> > 
> > Do you mean CONNECT? CONNECT is implemented in TcpSlaveBase via Qt
> > sockets.
> 
> Nope. I am not saying FTP over HTTP is like HTTPS. I was only saying
> that just like you can obviously send HTTP requests through HTTP
> proxies, you can do the same with FTP requests. Yes, you can send FTP
> requests through an HTTP proxy server. The request line will look
> something like: GET ftp://ftp.kde.org/ HTTP/1.1. The proxy server will
> then speak to the FTP server on the client's behalf and return the
> response as HTML.
> 
> > CONNECT is entirely unlike real HTTP proxying (see below), it's very much
> > like SOCKS - it's application-level port forwarding and IIRC not even
> > standardized in a real Internet standard.
> > The HTTP ioslave only handles actual HTTP proxying, where you talk HTTP
> > to the proxy server and the proxy returns the data as if it was the
> > originating server.
> 
> First I am aware of how the proxy stuff works because largely I am
> responsible for its creation. ;) Second and more seriously, SOCKS v5
> is an Internet standard defined through RFC 1928. Though I have not
> read it, I am sure it
> 
Yeah, SOCKS is standardized, but CONNECT isn't AFAIK.
I'll repeat here how it works, for the three other persons reading this :)
CONNECT is a pseudo-HTTP request that you send to an HTTP proxy to make it 
forward a port, SOCKS-like.
After the CONNECT command the proxy just forwards data to (and from) the 
remote server until the connection is closed; it's not really HTTP anymore.
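
To make that concrete, here is a minimal Python sketch of what a CONNECT 
exchange could look like on the wire. The function names and the example host 
and port are made up for illustration; this is not code from KIO or Qt, just 
the shape of the request and of the check a client would do before treating 
the socket as a raw tunnel.

```python
def connect_request(host: str, port: int) -> bytes:
    """Build a CONNECT request asking the proxy to forward to host:port."""
    target = f"{host}:{port}"
    lines = [
        f"CONNECT {target} HTTP/1.1",  # request line names host:port, not a URL
        f"Host: {target}",
        "",                            # blank line ends the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def tunnel_established(status_line: bytes) -> bool:
    """Return True if the proxy answered with a 2xx status line."""
    parts = status_line.decode("ascii", "replace").split(None, 2)
    if len(parts) < 2 or not parts[0].startswith("HTTP/"):
        return False
    code = parts[1]
    return len(code) == 3 and code.isdigit() and code[0] == "2"
```

Once `tunnel_established` is true, everything written to the socket is passed 
through to the remote server unchanged, which is why the rest of the 
conversation is no longer HTTP.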

> > I don't see how the HTTP ioslave could help the FTP ioslave somehow
> > unless there is some scheme in which the client talks HTTP to the proxy,
> > which talks FTP to an FTP server, while the client still says it's FTP
> > to the user / sets up the whole thing as described if the user requests
> > FTP and an HTTP proxy is set.
> 
> See my explanation above. HTTP ioslave does not help the FTP ioslave.
> The ftp request is sent to and handled by the HTTP ioslave. The FTP
> ioslave does not come into the equation for FTP over HTTP proxy cases
> at all. See the documentation on FTP support in your favorite open
> source http proxy server, e.g. Squid.
> 
Hm? That sounds exactly like the scenario I described. Note that I didn't 
mention ioslaves, just the protocols. Client talks HTTP to proxy, proxy talks 
FTP to FTP server.
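
As a rough illustration of that scheme, the following Python sketch builds the 
kind of request the client would send to the proxy for an ftp:// URL. The 
function name is hypothetical and this is only a sketch of the request format 
quoted above (GET with the full URL in the request line), not what kio_http 
actually emits.

```python
from urllib.parse import urlsplit


def proxied_get_request(url: str) -> bytes:
    """Build a GET request for an HTTP proxy, with the full URL in the request line."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "ftp"):
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("URL has no host")
    # Host header repeats the authority of the target URL.
    host = parts.hostname if parts.port is None else f"{parts.hostname}:{parts.port}"
    lines = [
        f"GET {url} HTTP/1.1",  # absolute URL, scheme included
        f"Host: {host}",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")
```

The client side is plain HTTP from start to finish; only the scheme in the 
request line tells the proxy that it has to speak FTP on the other side.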

> > If such a scheme exists a one-off hack somewhere would be appropriate.
> 
> That is exactly the wrong way to go about it for several reasons:
> 
> #1. Assumption is the mother of all f??? ups, especially when it comes
> to core libraries like KIO. There is no way you can guarantee that FTP
> over HTTP proxy is the only thing that would require such
> functionality to make a one off exception. That is why it was
> implemented the way it is in the first place. If another protocol
> requires a similar change which would be easier to do ? Change code to
> add yet another one-off hack or simply add a "ProxiedBy=" entry into
> the ioslave's protocol file ?
> 
Well okay, it makes no sense to change something that already works / only 
needs some fixing. However I don't know of another such constellation in which 
an application protocol is transported over another application protocol.

> #2. Why change something that does not break or mess up anything in
> the first place ? You do understand that removing that parameter from
> the protocol file only takes away the optimization that was added to
> speed up the discovery of whether a request with one protocol (ftp)
> has the potential to be handled by an ioslave of different protocol
> (http). Its removal does not change the fact that the scheduler will
> still send the ftp request to the http ioslave. That decision is made
> by KProtocolManager::slaveProtocol. The property entry in the protocol
> file was meant to avoid calling the latter, slower function unless
> it is necessary to do so, alas the changes in
> KParts::BrowserRun::scanFile.
> 
> #3. If a user only has access to the Internet through a HTTP proxy,
> how would you suggest they access and download stuff from an ftp site
> ? I am sure you are not advocating that we implement HTTP proxy
> handling in the FTP ioslave.
> 
I was thinking of using CONNECT to talk real FTP. That doesn't seem to be the 
canonical way to do it, though...

> >> Since there is no input for specifying SOCKS proxy in the proxy
> >> configuration dialog, then I assume that the user is supposed to enter
> >> a non standard url, "socks://<host>:[<port>]", for specifying SOCKS
> >> proxy, correct ? If so, then ensuring FTP over SOCKS is not treated as
> >> FTP over HTTP is a simple one line fix in
> >> KProtocolManager::slaveProtocol.
> >> 
> >> BTW, as it stands right now the ftp protocol does not support FTP over
> >> SOCKS.
> > 
> > SOCKS proxy support should be transparent to ioslaves.
> 
> If that were the case, then FTP over SOCKS should be working right now
> and the HTTP protocol should not be setting the SOCKS address in
> HTTPProtocol::resetSessionSettings. You remove that and HTTP over
> SOCKS would not work either. Even then SOCKS support for HTTP will
> only work for those users that somehow figured out where and how to
> specify the SOCKS proxy server address.
> 
> > FTP has the unique problem that it needs two ports to work, but
> > TCPSlaveBase only allows one network connection per ioslave. I think
> > this should be solved by using a KTcpSocket for the data connection, if
> > necessary with manual tweaks to use the right proxy settings, in the FTP
> > ioslave code.
> 
> That is probably why the FTP protocol inherits from SlaveBase
> instead of TCPSlaveBase and uses KSocketFactory to handle its own
> connections.
> 
That might be a problem in the SOCKS case. I haven't touched KSocketFactory to 
make it use KDE's proxy settings.

> > > proxiedBy should probably be per proxy protocol now.
> >> 
> >> Sorry, but that would only serve to complicate proxy handling much
> >> more than the big mess it already is.
> > 
> > It's not that messy: network-level proxies are handled by TcpSlaveBase,
> > application protocol-level proxying (of which there is only HTTP) is
> > implemented in protocol ioslaves.
> 
> Well let us see...
> 
> * The fact that some types of proxies are handled by Qt's networking
> code, e.g. CONNECT & SOCKS, has already caused some issues. For the
> former case, some people want to spoof the user-agent string sent to
> their proxy server because it filters connections based on it.
> However, since we use Qt's networking code that is not possible and
> the trolls have no desire of fixing that.
> 
They'll probably take merge requests for that. This shouldn't be hard.

> * FTP over anything else except an FTP proxy is completely broken.
> 
Yeah, I believe that. You explained why it's broken over HTTP, and it's broken 
over SOCKS because KSocketFactory doesn't use the proxy settings.
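
The routing rule discussed in this thread can be summed up in a toy Python 
model. Everything here is an assumption made for illustration: the 
`PROXIED_BY` table stands in for the "ProxiedBy=" entries in protocol files, 
`handler_protocol` is a hypothetical name, and the logic is a simplification 
of the behaviour described above, not the real KProtocolManager::slaveProtocol.

```python
from typing import Optional
from urllib.parse import urlsplit

# Assumed stand-in for "ProxiedBy=" entries: request scheme -> proxy scheme
# whose ioslave takes over the request.
PROXIED_BY = {"ftp": "http"}


def handler_protocol(request_scheme: str, proxy_url: Optional[str]) -> str:
    """Return the scheme of the ioslave that would handle the request."""
    if not proxy_url:
        return request_scheme          # no proxy: the protocol handles itself
    proxy_scheme = urlsplit(proxy_url).scheme
    if proxy_scheme == PROXIED_BY.get(request_scheme):
        return proxy_scheme            # e.g. FTP request via an HTTP proxy
    # Network-level proxies such as socks:// do not change the handler.
    return request_scheme
```

With this model, `handler_protocol("ftp", "http://proxy.example:3128")` gives 
"http", while a socks:// proxy URL leaves the request with the FTP ioslave, 
which then has to route its own sockets through the proxy.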

> * Unless you are one of those users that like figuring out things,
> HTTP over SOCKS might as well be broken. kio_http uses
> QNetworkProxy::setApplicationProxy to set proxy information instead of
> setting the proxy on a per-socket basis.
> 
> * Because all of the proxy related classes in Qt were designed with
> QNetworkAccessManager in mind, neither HttpCachingProxy nor
> FtpCachingProxy will work on a socket level. That means a mix and
> match of proxy setups in HTTP ioslave.
> 
HttpCachingProxy is when you talk HTTP to an HTTP proxy, and if an ioslave 
talks HTTP it will be the HTTP ioslave, so it's in our control.
As discussed above, this is or should be the case for both HTTP and FTP over 
HTTP proxy.
So we're not using HttpCachingProxy anyway in ioslaves.
A "non-caching" HTTP proxy is an HTTP proxy with CONNECT.
I've actually looked this up in the QNetworkProxy documentation, so unlike 
most of my other drivel it's based on facts :)

> Actually that last reason and the fact that applications are allowed
> to alter proxy setups through KIO's metadata system is what prevented
> me from actually attempting to implement my own QNetworkProxyFactory,
> hooking it up to use KProtocolManager and setting it using its static
> setApplicationProxyFactory function. That way none of the ioslaves
> have to deal with proxy management. They simply have to make sure they
> connect to the appropriate authentication signals from the socket and
> listen to those. Oh well...
> 
> > After writing the above questions and answers I think proxiedBy should
> > stay dead.
> 
> That would be wrong. See item #2 from my response above. Anyhow,
> I will probably get around to fixing FTP over SOCKS as well since
> the authentication it requires is not as bad as that of the HTTP
> protocol.
> 
Yeah, I was wrong.

> Regards,
> Dawit A.




More information about the kde-core-devel mailing list