Using KIO to retrieve HTTP Headers [GSoC student help request]
Aish Raj Dahal
dahalaishraj at gmail.com
Sat Jul 14 02:00:44 BST 2012
On Sat, Jul 14, 2012 at 12:34 AM, Dawit A <adawit at kde.org> wrote:
> Ahh... My fault. I should not have taken back what I said in point #1. It is
> indeed the case that the "HTTP-Headers" meta-data is NOT set when the
> redirection signal is emitted by kio_http. However, that is not a problem.
> Since internal redirection handling is disabled, you get a result signal
> right after the redirection signal. That is where you need to look for the
> headers. See the changes in attached files.
>
>
> On Fri, Jul 13, 2012 at 12:44 PM, Aish Raj Dahal <dahalaishraj at gmail.com>
> wrote:
>>
>> On Fri, Jul 13, 2012 at 8:18 PM, Dawit A <adawit at kde.org> wrote:
>> >
>> >
>> > On Fri, Jul 13, 2012 at 2:41 AM, Dawit A <adawit at kde.org> wrote:
>> >>
>> >>
>> >>
>> >> On Thu, Jul 12, 2012 at 11:16 PM, Aish Raj Dahal
>> >> <dahalaishraj at gmail.com>
>> >> wrote:
>> >>>
>> >>> On Thu, Jul 12, 2012 at 9:09 PM, David Faure <faure at kde.org> wrote:
>> >>> > On Wednesday 11 July 2012 11:53:51 Aish Raj Dahal wrote:
>> >>> >> 1) Case One : When mimetype signal emitted by KIO::TransferJob is
>> >>> >> used
>> >>> >>
>> >>> >> In order to clarify more, let me take an example file
>> >>> >>
>> >>> >>
>> >>> >> https://github.com/ardahal/kio-learner/blob/ard-dev/metalinkHttp/metalinkHttp.cpp
>> >>> >>
>> >>> >> The given file uses the mimetype signal (at line 44) to get the
>> >>> >> headers as soon as the mimetype is emitted. The catch is, since we
>> >>> >> do not want the redirected HTTP headers but instead want the
>> >>> >> original HTTP headers, setRedirectionHandlingEnabled has been set
>> >>> >> to false. This program, when run, does not emit the mimetype signal
>> >>> >> at all, and as a result the qDebugs at lines 51 and 52 are never
>> >>> >> executed. This behavior is seen not only for URLs which redirect
>> >>> >> (like http://www.example.com ) but also for URLs which have no
>> >>> >> redirection (like http://www.google.com.np ).
>> >>> >
>> >>> > This is the part that makes no sense to me ;-)
>> >>> >
>> >>> > redirectionHandlingEnabled is a KIO::SimpleJob setting; the slave has
>> >>> > no idea about that setting. If there's no redirection, then none of
>> >>> > the code in simplejob that checks for redirectionHandlingEnabled
>> >>> > actually runs. So it can't possibly make any difference for a URL
>> >>> > without redirection.
>> >>> >
>> >>> > I think your testcase is a bit wrong: http://www.google.com.np
>> >>> > redirects. I can see it in the konqueror debug output:
>> >>> >
>> >>> > KonqRun::slotRedirection: KUrl("http://www.google.com.np") ->
>> >>> > KUrl("http://www.google.com.np/")
>> >>> >
>> >>> > So if you want to test a URL that doesn't redirect, add the trailing
>> >>> > slash upfront.
>> >>> >
>> >>> > If you can confirm this, then we'll be down to: no http headers
>> >>> > emitted when a redirection happens, which would be a kio_http issue.
>> >>> > Dawit?
>> >>> >
>> >>>
>> >>> Thanks a lot for the heads up about the test case :-)
>> >>>
>> >>> It does indeed run well as expected with
>> >>> KUrl("http://www.google.com.np/") as the test URL. However for those
>> >>> URLs that do have redirection, no headers were emitted.
>> >>>
>> >>> Once again, thanks a lot.
>> >>
>> >>
>> >> I will try and clarify some things as much as I can:
>> >>
>> >> #1. Without some changes in kio_http, you will never see redirection
>> >> headers received from the HTTP server. This can probably be addressed
>> >> by delaying the redirection request until after the HTTP headers have
>> >> been set. However, the last time I attempted to fix this, it caused a
>> >> regression. See bug#150904.
>> >>
>> >>
>> >> #2. When a redirection is requested, kio_http will never emit the
>> >> mimeType signal because the mimetype is not yet known. This should be
>> >> very obvious because a redirection request is the server telling us the
>> >> actual location of the content we just requested. As such, connecting
>> >> to KIO's mimeType signal for such circumstances is of no use.
>> >>
>> >> #3. If you do setRedirectionHandlingEnabled(false) in order to handle
>> >> redirections yourself, instead of KIO, then you have to connect to
>> >> KIO's redirection signal and retrieve the redirect URL. IOW, you have
>> >> to do the same thing you are doing in your "output" function from the
>> >> slot connected to the redirection signals.
>> >>
>> >> However, I suspect what you want to do is get any and all headers
>> >> including those that have to do with redirection requests. If so, then
>> >> we have to find a way for kio_http to set the HTTP headers before
>> >> sending the redirection request without causing a regression.
>> >
>> >
>> > Actually I take back what I said in #1 and the last paragraph. I just
>> > did some testing and you can indeed retrieve the redirection headers by
>> > simply connecting to KIO::TransferJob's redirection signal if you
>> > disable the internal handling of redirections. All you have to do is
>> > check for the "HTTP-Headers" meta-data in the slot you connected to the
>> > redirection signal.
>> >
>> > Also, although there are two redirection signals, "redirection" and
>> > "permanentRedirection", you need only connect to the "redirection"
>> > signal unless you want to keep track of permanent redirections.
>> >
>>
>>
>> Thank you very much for looking into the issue.
>>
>> Now, as you've suggested using KIO::TransferJob's redirection signal
>> to connect to a slot and then query for "HTTP-Headers" metadata, I've
>> faced an issue.
>> Before I get to the issue, here is the pastebin of what I'm doing to
>> test it: http://paste.kde.org/517220/42196913/ .
>> The issue is that although I am able to verify the redirection URL (if
>> the site was redirecting, of course), querying the KIO::Job for the
>> "HTTP-Headers" metadata key left me with a QString("") as the result.
>> I've tried testing several URLs and had the same issue (which is quite
>> strange). I've also tried setting as well as unsetting the
>> PropagateHttpHeader metadata key, which again had no effect on the
>> result.
>>
>> I hope you'll look into this matter and provide your valuable guidance.
>>
>> Thanks once again.
>>
>> Regards,
>> Aish Raj Dahal
>
>
I'm successfully able to retrieve the headers now.
Thanks a lot. :-)
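
In case it helps anyone reading the archive: once the "HTTP-Headers"
meta-data has been read in the slot connected to the result signal, it
still has to be split into individual headers. Below is a rough sketch of
that step in plain C++. It is not the code from the attached files. It
assumes the meta-data value is the raw response header block with one
header per line, and parseHttpHeaders and trim are hypothetical helper
names of my own, not KIO API.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper (not KIO API): strips leading and trailing ASCII
// whitespace, which also removes a trailing '\r' left over from CRLF lines.
static std::string trim(const std::string &s)
{
    std::size_t begin = 0;
    std::size_t end = s.size();
    while (begin < end && std::isspace(static_cast<unsigned char>(s[begin])))
        ++begin;
    while (end > begin && std::isspace(static_cast<unsigned char>(s[end - 1])))
        --end;
    return s.substr(begin, end - begin);
}

// Hypothetical helper (not KIO API): splits a raw header block into
// (lower-cased header name -> value) pairs. A multimap is used because a
// header such as Link may appear more than once.
// Assumption: 'raw' is what job->queryMetaData("HTTP-Headers") returned in
// the result slot, converted to std::string, one header per line.
std::multimap<std::string, std::string> parseHttpHeaders(const std::string &raw)
{
    std::multimap<std::string, std::string> headers;
    std::istringstream in(raw);
    std::string line;
    while (std::getline(in, line)) {
        const std::size_t colon = line.find(':');
        if (colon == std::string::npos)
            continue; // status line or blank line, nothing to store
        std::string name = trim(line.substr(0, colon));
        if (name.empty())
            continue;
        for (char &c : name)
            c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
        headers.emplace(name, trim(line.substr(colon + 1)));
    }
    return headers;
}
```

Header names are lower-cased so that a lookup such as
headers.equal_range("link") works regardless of how the server spelled
the name.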
Cheers,
Aish