[Kget] the content fetcher plugin proposal

Lukas Appelhans l.appelhans at gmx.de
Mon Mar 31 21:54:33 CEST 2008


On Sunday, 30 March 2008 20:54:29, Ningyu Shi wrote:
> Hi everybody,
>     I've just finished the proposal. Any comments for a last-minute update?
Looks cool to me :) Hope you get accepted :)

Lukas
> --------------------------
> =Project Details=
>
> ==The Kross Framework==
>
> In order to fetch specific content from a web page or website, we need a
> design that lets the user extract the content URLs from the page and
> return them to KGet for downloading. Given how complex analyzing web
> pages can be, a user script system should be a good choice.
> Kross is a modular scripting framework that embeds scripting
> interpreters like Python, Ruby and JavaScript transparently into
> native applications. In this project, Kross will be used as a bridge
> between the script developer and the KGet core: it passes the URL to
> the user script and passes the analysis result back to KGet.
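
To make the "user script" idea concrete, here is a minimal sketch of the analysis half of such a script in plain Python. It only shows the extraction step and does not touch Kross at all; the names LinkCollector and extract_content_urls, and the default extension list, are assumptions made up for illustration and are not part of the proposal or of KGet.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects href/src values whose path ends in a wanted extension."""

    def __init__(self, base_url, extensions):
        super().__init__()
        self.base_url = base_url
        self.extensions = tuple(ext.lower() for ext in extensions)
        self.urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                if value.lower().endswith(self.extensions):
                    # Resolve relative links against the page URL.
                    self.urls.append(urljoin(self.base_url, value))


def extract_content_urls(page_url, html, extensions=(".flv", ".mp3")):
    """Hypothetical script entry point: page HTML in, content URLs out."""
    collector = LinkCollector(page_url, extensions)
    collector.feed(html)
    return collector.urls
```

A real script for a specific site would likely need site-specific logic instead of a plain extension match, which is the reason for leaving this part to user scripts.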
>
> ==The Transfer-Plugin==
>
> KGet handles different kinds of download sessions via a transfer-plugin
> system. Once a plugin is selected based on the URL provided by the
> user, it is used to handle that download session. In this
> project, a transfer plugin that handles content fetching will be
> implemented. The plugin will determine which user script to use based
> on a user-supplied function inside each script, then run the chosen
> script to extract the content URLs and add them to the download list.
> All of this will be done through the Kross framework on a separate
> thread, so that a user script doing something complicated does not
> block the whole program.
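
The selection step described above could look roughly like the following sketch. It uses Python's threading module as a stand-in for the separate thread (the actual plugin would be C++ on top of Kross), and UserScript, can_handle, choose_script and start_fetch are hypothetical names chosen for this example only.

```python
import threading


class UserScript:
    """Hypothetical stand-in for a loaded user script."""

    def __init__(self, name, can_handle, run):
        self.name = name
        self.can_handle = can_handle  # user-supplied: url -> bool
        self.run = run                # user-supplied: (url, add_download) -> None


def choose_script(scripts, url):
    """Return the first script whose own check accepts the URL, else None."""
    for script in scripts:
        if script.can_handle(url):
            return script
    return None


def start_fetch(scripts, url, add_download):
    """Run the matching script on a worker thread so the caller is not blocked."""
    script = choose_script(scripts, url)
    if script is None:
        return None
    worker = threading.Thread(
        target=script.run, args=(url, add_download), daemon=True
    )
    worker.start()
    return worker
```

Taking the first match is just one possible policy; letting each script return a score and picking the highest would work too.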
>
> ==Post-treatment==
>
> After the user script has extracted the content URLs, it should
> add them to the download queue by calling an addDownload(url)
> function. A class will wrap the KGet::AddTransfer() call to provide
> this function and to support user options such as downloading all
> URLs to a certain directory or letting the user choose case by case.
> We propose this "call function" scheme because simply returning the
> URL list to KGet has several problems. A download session may contain
> tens of URLs, and waiting for the script to finish would be
> unacceptable. If we had another thread waiting for the script to
> update the list variable, locking/synchronization issues would arise
> and add complexity to the solution. Letting the script call the
> addDownload() function gives the developer more freedom to do the
> job easily.
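
A rough sketch of the wrapper idea, again in plain Python: the script calls addDownload(url) once per URL it finds, so each download can be handed over before the script finishes. DownloadAdder, default_dir, ask_user and pending are assumptions for illustration, and the thread-safe queue here only stands in for the real call into KGet, which is not shown.

```python
import queue


class DownloadAdder:
    """Hypothetical wrapper exposing addDownload(url) to a user script."""

    def __init__(self, default_dir=None, ask_user=None):
        if default_dir is None and ask_user is None:
            raise ValueError("need a default directory or an ask_user callback")
        self.default_dir = default_dir  # option: put every URL in one directory
        self.ask_user = ask_user        # option: url -> directory, or None to skip
        self.pending = queue.Queue()    # thread-safe hand-off to the download side

    def addDownload(self, url):
        # Decide the target directory for this URL according to the user option.
        if self.default_dir is not None:
            target = self.default_dir
        else:
            target = self.ask_user(url)
        if target is not None:
            self.pending.put((url, target))
```

Because every call hands over exactly one URL, the script never has to share a growing list with another thread, which is the synchronization problem mentioned above.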
>
> =Timeline=
>
> Phase 1, May 26 to June 29
> Coding up the plugin structure and making a dummy example work.
>
> Phase 2, June 30 to July 14
> Implementing a YouTube/Google Video flash movie downloader within the
> framework as an example.
>
> Phase 3, July 15 to August 3
> Implementing the GUI stuff and a simple script management system.
>
> Phase 4, Remaining time
> Debugging, testing and polishing the project. Writing documentation
> and maybe more example scripts if time permits.
>
> =Personal Info & Experience=
> I'm a graduate student majoring in ECE. My research is on electronic
> device simulation. I'm familiar with C++ and have done several
> nontrivial projects, one of which is a device simulator that makes
> heavy use of OO. I have read some tutorials on Qt and gone through
> some toy examples, which makes me feel quite comfortable with it.
> I've written a tiny multi-threaded content fetcher that uses wget to
> do the downloading. That project is written in Python with an
> analyzing thread, several downloading threads and a GUI thread that
> updates the status using PyQt. That project has certain weak points.
> Wget can't download files concurrently and allows only one thread per
> download session, and running many wget processes is quite expensive.
> Python's multi-threading library is not efficient enough, so
> sometimes the analyzing thread slows down the whole program. However,
> these problems are well solved in KGet. KIO handles the download
> session, so we can have multiple concurrent downloads. We have
> QThread to handle threading efficiently. Most importantly, KGet has a
> much better user interface and is easy to extend.
>
> I can work on this project 10-12 hours per week, given that I still
> have to do my research during the summer. I'm pretty comfortable
> working with a mentor through email and IM/IRC chats. I'm a native
> Chinese speaker and quite fluent in English, so communication should
> not be a problem.
>
> Among the various downloader projects for Linux, I find KGet to be
> the most promising: it has a well-designed object model and a decent
> codebase. KGet is under active development, so I can work with the
> developer team and the community to get this job done.
> ---------------------------------------
> Thanks



