recursive archive reading

Jos van den Oever jvdoever at gmail.com
Mon Apr 10 09:41:03 BST 2006


On Sunday 09 April 2006 23:30, David Faure wrote:
> On Thursday 06 April 2006 14:22, Jos van den Oever wrote:
> > When thinking about a client to read the files after indexing, I came
> > across the class QAbstractFileEngine in Qt 4.1. This class has
> > functionality similar to that in kio.
>
> Not really since KIO's async and QAbstractFileEngine is sync, but ok :)
That's an important difference, yes, and one that is especially obvious when 
browsing a directory with a large archive in it.

> This sounds good. This solution for nested archives allows direct seeks
> (unlike the old idea of ioslave chaining), which is good for performance.
> The path above is much like what kio_tar (the kioslave used for tar:/,
> zip:/ and ar:/) does (except that it doesn't support archive nesting at the
> moment).
Actually backward seeks are only partially supported. QAbstractFileEngine 
defines an isSequential() which must be implemented as {return true;} for the 
archives. This means forward seeking is ok, but backward seeking is not fully 
possible. You can make sure that you can go back a predefined amount by 
calling mark(readLimit).
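
To illustrate the mark(readLimit) idea, here is a minimal sketch in standard
C++ (no Qt). The class name MarkableReader, its byte-wise read() and its
reset() are assumptions made for illustration, loosely modelled on Java-style
mark/reset; this is not the code from the 'streams' folder.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical forward-only reader. Bytes consumed after mark() are kept in
// a buffer so that reset() can replay them, but only up to readLimit bytes.
class MarkableReader {
public:
    explicit MarkableReader(std::istream& in) : in_(in) {}

    // Remember the current position and allow going back readLimit bytes.
    void mark(std::size_t readLimit) {
        buffer_.erase(0, pos_);   // drop bytes that were already replayed
        pos_ = 0;
        limit_ = readLimit;
        marked_ = true;
    }

    // Go back to the marked position; false if the mark is no longer valid.
    bool reset() {
        if (!marked_) return false;
        pos_ = 0;
        return true;
    }

    // Read one byte, or return -1 at the end of the stream.
    int read() {
        if (pos_ < buffer_.size()) {
            return static_cast<unsigned char>(buffer_[pos_++]);
        }
        const int c = in_.get();
        if (c == std::istream::traits_type::eof()) return -1;
        if (marked_) {
            if (buffer_.size() < limit_) {
                buffer_.push_back(static_cast<char>(c));
                ++pos_;
            } else {
                // Read past the limit: the mark is lost for good.
                marked_ = false;
                buffer_.clear();
                pos_ = 0;
            }
        }
        return c;
    }

private:
    std::istream& in_;
    std::string buffer_;
    std::size_t pos_ = 0;
    std::size_t limit_ = 0;
    bool marked_ = false;
};

// Usage: rewind within the limit, then lose the mark by reading too far.
void markResetExample() {
    std::istringstream source("abcdef");
    MarkableReader reader(source);
    assert(reader.read() == 'a');
    reader.mark(2);
    assert(reader.read() == 'b');
    assert(reader.read() == 'c');
    assert(reader.reset());        // still within the 2-byte limit
    assert(reader.read() == 'b');
    assert(reader.read() == 'c');
    assert(reader.read() == 'd');  // third byte after mark(): mark is dropped
    assert(!reader.reset());
}
```

The sketch only needs a forward-reading source, which is why the same idea
keeps working when the source is itself a stream out of another archive.
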
The question of URL-to-file mapping is a detail that the usability guys may 
work out. I'm partial to using simple addresses and letting the computer 
figure out which file is actually there on access. It also makes the URLs less 
ugly. This is a sensitive topic that I'm not keen on discussing though, because 
it can be changed easily anyway.
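
One way the "simple address" idea could work is sketched below: walk the
address from left to right and stop at the first prefix that is a regular
file; whatever follows is the path inside the archive. The function name
splitArchivePath and the isRegularFile callback are hypothetical. A real
implementation would back the callback with something like stat().

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical helper: returns {file on disk, path inside that file}.
// The inner path is empty when the whole address is a plain file, and both
// parts are empty when no prefix of the address is a regular file.
std::pair<std::string, std::string> splitArchivePath(
    const std::string& address,
    const std::function<bool(const std::string&)>& isRegularFile)
{
    std::string::size_type pos = 0;
    while (true) {
        // Next '/' after the current one; npos means "use the whole address".
        pos = address.find('/', pos + 1);
        const std::string prefix = address.substr(0, pos);
        if (isRegularFile(prefix)) {
            const std::string inner =
                (pos == std::string::npos) ? "" : address.substr(pos + 1);
            return std::make_pair(prefix, inner);
        }
        if (pos == std::string::npos) break;
    }
    return std::make_pair(std::string(), std::string());
}

// Usage with a fake file system in which only /tmp/code.tar.gz is a file.
void splitArchivePathExample() {
    auto onlyTarball = [](const std::string& p) {
        return p == "/tmp/code.tar.gz";
    };
    const auto parts =
        splitArchivePath("/tmp/code.tar.gz/src/main.cpp", onlyTarball);
    assert(parts.first == "/tmp/code.tar.gz");
    assert(parts.second == "src/main.cpp");
}
```

For a nested archive the inner path would simply contain another archive
name, and the archive reader would repeat the same step on its own entries.
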

> Well kio_tar uses tar:/tmp/code.tar.gz/src/main.cpp, but it's the same idea
> of course. I guess it makes sense to keep using a special protocol for it,
> because in some cases applications want /tmp/code.tar.gz to be a file, and
> in some other cases (e.g. the directory tree in konqueror),
> /tmp/code.tar.gz is seen like a directory that can be listed. So we could
> keep the idea of "an optional redirection from file:///tmp/code.tar.gz to
> tar:///tmp/code.tar.gz" as already implemented.
In fact, QAbstractFileEngine allows an entry to be a file _and_ a directory. 
This is the only way I can get the current QFileDialog to browse files and I 
think it makes sense and is easy. The context menu would be the sum of the 
menus for a read-only directory and a read-only file. So both isFile() and 
isDirectory() are true. The remaining issue is then: what to do on 
left-click? This is something for the usability group to fight over and a 
discussion on that issue should not impede adding this functionality.
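
As a rough sketch of the "file _and_ directory" idea in plain C++: the type
of an entry is a set of bits rather than a single value, and the context menu
is the union of the two read-only menus. The Entry struct, the bit values and
the menu action names are all made up for illustration and are not Qt API.

```cpp
#include <cassert>
#include <set>
#include <string>

// Type bits for an entry; the values are arbitrary.
enum EntryType : unsigned {
    FileType      = 1u << 0,
    DirectoryType = 1u << 1
};

struct Entry {
    std::string name;
    unsigned type;  // any combination of EntryType bits
    bool isFile() const { return (type & FileType) != 0; }
    bool isDirectory() const { return (type & DirectoryType) != 0; }
};

// Union of the (hypothetical) read-only file and read-only directory menus.
std::set<std::string> contextMenu(const Entry& entry) {
    std::set<std::string> actions;
    if (entry.isFile()) {
        actions.insert("Open");
        actions.insert("Copy");
    }
    if (entry.isDirectory()) {
        actions.insert("Browse");
        actions.insert("Copy");
    }
    return actions;
}

// Usage: an archive reports both types and gets both menus merged.
void archiveEntryExample() {
    const Entry archive{"code.tar.gz", FileType | DirectoryType};
    assert(archive.isFile() && archive.isDirectory());
    assert(contextMenu(archive).size() == 3);  // Open, Copy, Browse
}
```
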

> Oh, this doesn't use KArchive but reimplements the support for Zip etc.?
> How about porting it to KArchive? Well, that's almost like a rewrite except
> for the qt-file-engine part of it. KArchive already has the notion of
> "input stream" since it can read from a QIODevice.
Yes, tar and zip support in this implementation are new code. I did look into 
using the kio code, but this code is not useable for streams. A QIODevice is 
a mix of a stream/sequential and a random access device. The KArchive 
implementation uses it as a random access device (seek()). This is not 
possible with a stream since reverse seeking is limited by the buffer size. 
For access to files in a nested archive you can rely only on streaming 
functionality.

I'd be in favor of keeping two implementations: one with read/write 
functionality that relies on block access and one with read-only 
functionality that relies on streaming access. Of course some code might be 
shared. Since the code in the folder 'streams' will also go into clucene, it 
cannot have Qt dependencies (except for the unit tests). This is only a minor 
limitation, however.

For the random-access implementation, the current KArchive code could be used.

> But in fact a real solution, for being able to read from archives in all
> apps and not just when opening the archive in konqueror first (to get the
> tar: or zip: protocol), would be the QFSFileEngine solution indeed, if the
> issue of "is foo.tgz a file or a directory" is solved somehow. But I don't
> know how it should be solved. In KFileDialog, when clicking on foo.tar.gz,
> how would we know if the user means "I want to select this archive as a
> whole", or "I want to enter this archive as a directory in order to select
> a file inside it". In directory tree views there is no problem (regular
> files don't appear anyway), but in iconviews it's more tricky.

As a reminder, the initial reason for writing the recursive streams was the 
need to index nested files. This requires two things: quick access to the 
file content and a way of addressing the file upon a hit in a query. For 
indexing, the streaming implementation is ideal, and I got it to work with 
QAbstractFileEngine relatively easily.

The question remains whether KDE4 should use something similar to 
QAbstractFileEngine and how this would fit with the current KIO 
implementations. I can imagine that one could write an async wrapper around 
QAbstractFileEngine.

Cheers,
Jos




More information about the kde-core-devel mailing list