Using KIO to retrieve HTTP Headers [GSoC student help request]

Aish Raj Dahal dahalaishraj at gmail.com
Fri Jul 13 17:44:11 BST 2012


On Fri, Jul 13, 2012 at 8:18 PM, Dawit A <adawit at kde.org> wrote:
>
>
> On Fri, Jul 13, 2012 at 2:41 AM, Dawit A <adawit at kde.org> wrote:
>>
>>
>>
>> On Thu, Jul 12, 2012 at 11:16 PM, Aish Raj Dahal <dahalaishraj at gmail.com>
>> wrote:
>>>
>>> On Thu, Jul 12, 2012 at 9:09 PM, David Faure <faure at kde.org> wrote:
>>> > On Wednesday 11 July 2012 11:53:51 Aish Raj Dahal wrote:
>>> >> 1) Case One : When mimetype signal emitted by KIO::TransferJob is used
>>> >>
>>> >> In order to clarify more, let me take an example file
>>> >>
>>> >> https://github.com/ardahal/kio-learner/blob/ard-dev/metalinkHttp/metalinkHttp.cpp
>>> >> The given file uses the mimetype signal (at line 44) to get the
>>> >> headers as soon as the mimetype is emitted. The catch is, since we do
>>> >> not want the redirected HTTP headers but instead want the original HTTP
>>> >> headers, setRedirectionHandlingEnabled has been set to false. This
>>> >> program, when run, does not emit the mimetype signal at all, and as a
>>> >> result the qDebugs at lines 51 and 52 are never executed. This
>>> >> behavior is seen not only for URLs which redirect (like
>>> >> http://www.example.com) but also for URLs which have no redirection
>>> >> (like http://www.google.com.np).
>>> >
>>> > This is the part that makes no sense to me ;-)
>>> >
>>> > redirectionHandlingEnabled is a KIO::SimpleJob setting, the slave has no
>>> > idea about that setting. If there's no redirection, then none of the code
>>> > in simplejob that checks for redirectionHandlingEnabled actually runs.
>>> > So it can't possibly make any difference for a URL without redirection.
>>> >
>>> > I think your testcase is a bit wrong: http://www.google.com.np redirects.
>>> > I can see it in the konqueror debug output:
>>> >
>>> >  KonqRun::slotRedirection: KUrl("http://www.google.com.np") ->
>>> > KUrl("http://www.google.com.np/")
>>> >
>>> > So if you want to test a URL that doesn't redirect, add the trailing slash
>>> > upfront.
>>> >
>>> > If you can confirm this, then we'll be down to: no http headers emitted
>>> > when a redirection happens, which would be a kio_http issue. Dawit?
>>> >
>>>
>>> Thanks a lot for the heads up about the test case :-)
>>>
>>> It does indeed run well as expected with
>>> KUrl("http://www.google.com.np/") as the test URL. However for those
>>> URLs that do have redirection, no headers were emitted.
>>>
>>> Once again, thanks a lot.
>>
>>
>> I will try and clarify some things as much as I can:
>>
>> #1. Without some changes in kio_http, you will never see redirection
>> headers received from HTTP server. This can probably be addressed by
>> delaying the redirection request until after the HTTP headers have been set.
>> However, the last time I attempted to fix this, it caused a regression. See
>> bug#150904.
>>
>>
>> #2. When a redirection is requested, kio_http will never emit mimeType
>> signal because it is not yet known. This should be very obvious because a
>> redirection request is the server telling us the actual location of the
>> content we just requested. As such connecting to KIO's mimeType signal for
>> such circumstances is of no use.
>>
>> #3. If you do setRedirectionHandlingEnabled(false) in order to handle
>> redirections yourself, instead of KIO, then you have to connect to KIO's
>> redirect signal and retrieve the redirect URL. IOW, you have to do the same
>> thing you are doing in your "output" function from the slot connected to the
>> redirection signals.
>>
>> However, I suspect what you want to do is get any and all headers
>> including those that have to do with redirection requests. If so, then we
>> have to find a way for kio_http to set the HTTP headers before sending the
>> redirection request without causing a regression.
>
>
> Actually I take back what I said in #1 and the last paragraph. I just did
> some testing and you can indeed retrieve the redirection headers by simply
> connecting to KIO::TransferJob's redirection signal if you disable the
> internal handling of redirections. All you have to do is check for the
> "HTTP-Headers" meta-data in the slot you connected to the
> redirection signal.
>
> Also, although there are two redirection signals, "redirection" and
> "permanentRedirection", you need only connect to the "redirection" unless
> you want to keep track of permanent redirections.
>


Thank you very much for looking into the issue.

Following your suggestion, I connected KIO::TransferJob's redirection
signal to a slot and queried the "HTTP-Headers" metadata there, but I've
run into an issue.
Before I get to the issue, here is the pastebin of what I'm doing to
test it: http://paste.kde.org/517220/42196913/ .
The issue is that although I am able to verify the redirection URL (if
the site was redirecting, of course), querying the KIO::Job for the
"HTTP-Headers" metadata key left me with a QString("") as the result.
I've tried several URLs and had the same issue (which is quite
strange). I've also tried both setting and unsetting the
PropagateHttpHeader metadata key, which again had no effect on the
result.
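
For context, here is a minimal sketch of how the headers could be
inspected once the query returns something. It is only an illustration:
parseHttpHeaders is a hypothetical helper, not part of KIO, and it assumes
that the "HTTP-Headers" value is the raw response header block with one
header per line. It uses plain std::string instead of QString so that it
stands alone.

```cpp
#include <cctype>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper (not part of KIO): splits a raw HTTP header block
// into a field-name -> value map. Names are lower-cased because HTTP
// header names are case-insensitive. If a header repeats, the last
// occurrence wins.
std::map<std::string, std::string> parseHttpHeaders(const std::string& raw)
{
    std::map<std::string, std::string> headers;
    std::istringstream stream(raw);
    std::string line;
    while (std::getline(stream, line)) {
        // Tolerate CRLF line endings.
        if (!line.empty() && line.back() == '\r')
            line.pop_back();

        // Lines without a "name:" prefix (status line, blank line) are skipped.
        const std::string::size_type colon = line.find(':');
        if (colon == std::string::npos || colon == 0)
            continue;

        std::string name = line.substr(0, colon);
        for (char& c : name)
            c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));

        // Drop the whitespace that follows the colon.
        const std::string::size_type start =
            line.find_first_not_of(" \t", colon + 1);
        headers[name] =
            (start == std::string::npos) ? std::string() : line.substr(start);
    }
    return headers;
}
```

With the empty result described above, the returned map is simply empty,
so a lookup such as parseHttpHeaders(value).count("location") gives 0.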

I hope you'll look into this matter and provide your valuable guidance.

Thanks once again.

Regards,
Aish Raj Dahal



