mimetypes for zipped files

Marc Mutz Marc.Mutz at uni-bielefeld.de
Sun Apr 7 00:31:24 BST 2002


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Saturday 06 April 2002 23:32, aleXXX wrote:
<snip>
> > Exactly. Maybe using a Content-Transfer-Encoding-like approach might
> > help? BTW: On ietf-822 there's currently a discussion about this (in the
> > context of mail, that is). Either we'll get a gzip CTE or a "compression"
> > parameter for the CTE header field.
<snip>
> Can you please explain a bit more, I dont understand.

MIME realm:

RFC 2045 defines the Content-Transfer-Encoding (CTE) and the corresponding 
CTE header field for internet messages. The CTE indicates how the data is 
encoded. Currently only base64, quoted-printable and identity (as 7bit, 
8bit, binary) are defined, but there is discussion about defining a "gzip" 
CTE. The problem is that you still need to specify the classical CTE (in 
most cases, base64), and that would then be implicit in the "gzip" CTE. So, 
actually
Content-Transfer-Encoding: gzip
would mean: first un-base64 it, then gunzip it.
The solution for this problem would be to introduce a "compression" 
parameter to the CTE header field, e.g.
C.-T.-E.: base64; compression=gzip
instead of the above. The problem here is that this would change the syntax 
of the CTE header field, since it currently doesn't allow parameters.
My proposal, to which there is no reply yet, would look like this:
C.-T.-E.: gzip+base64
This keeps the original syntax (since the CTE value is defined to be a 
"token" and '+' does not separate tokens), and solves the first problem. 
Only, it's ugly.
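
To make the decoding order concrete, here is a rough Python sketch of how a 
receiver might handle such a combined token. The "gzip+base64" value is only 
the proposal above, not anything standardized, and the names DECODERS and 
decode_cte are made up for this illustration.

```python
import base64
import gzip

# Hypothetical decoder table for a '+'-joined CTE token.
# RFC 2045 itself only defines 7bit, 8bit, binary, quoted-printable
# and base64; "gzip" is an assumption taken from the proposal.
DECODERS = {
    "base64": base64.b64decode,
    "gzip": gzip.decompress,
    "7bit": lambda data: data,
    "8bit": lambda data: data,
    "binary": lambda data: data,
}


def decode_cte(cte: str, body: bytes) -> bytes:
    """Undo the encodings named in a '+'-joined CTE token.

    "gzip+base64" is read as: the data was gzipped, then base64-encoded,
    so decoding runs right to left (un-base64 first, then gunzip).
    """
    for name in reversed(cte.strip().lower().split("+")):
        body = DECODERS[name](body)
    return body


# Round trip: compress, then base64-encode, then decode with the token.
original = b"Hello, kde-core-devel!\n" * 4
wire = base64.b64encode(gzip.compress(original))
assert decode_cte("gzip+base64", wire) == original
```

Because the value is still a single token, a parser that only knows the 
existing syntax can at least reject it cleanly as an unknown CTE.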

KDE realm:

The basic flaw of using mimetypes for identifying file types is that they 
recognize only the "outer" file type. If you have a tgz, you get 
application/x-gzip, because that's what it is.
Theoretically, you could define a new mimetype application/x-tgz (and that 
has already been done - Ugh.), which says: "This is a gzip compressed tar 
archive."
But it's not. It's a tar file that happens to be compressed. Whether it's 
compressed with gzip or bzip or zip or gzip -2 or gzip -9 doesn't matter to 
an application. It's still the same type of file. It's just _encoded_ 
differently. Just as we don't have text/plain-utf-8 and 
text/plain-iso-8859-1, but "text/plain; charset=utf-8", we should have
"application/x-tar; compression=gzip". At least, that's what I was planning 
for the registration of the KOffice mimetypes[1].
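
As an illustration of the parameter form, the following Python sketch reads 
a "compression" parameter the same way a charset parameter is read. The 
"compression" parameter is hypothetical (it is not a registered Content-Type 
parameter), and split_type is a helper name invented here.

```python
from email.message import Message


def split_type(content_type: str):
    """Return (media type, compression) from a Content-Type value.

    "compression" is a hypothetical parameter, handled here exactly
    like any other Content-Type parameter such as "charset".
    """
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_type(), msg.get_param("compression")


media_type, compression = split_type("application/x-tar; compression=gzip")
print(media_type, compression)

# The existing analogue: one type, with the charset as a parameter.
text = Message()
text["Content-Type"] = "text/plain; charset=utf-8"
print(text.get_content_type(), text.get_content_charset())
```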

But that doesn't work b/c of backward compat. issues. Imagine a mailer that 
doesn't know the new parameter trying to display a gzip encoded text/plain 
to the user.

So it has to go into the CTE. And that is what is missing from KDE's mimetype 
handling: a CTE-like parameter that says: OK, this is a tar file, but it's 
compressed/encrypted/signed with foo.
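
For comparison, and only as an illustration from outside KDE: Python's 
standard mimetypes module already reports the type and the encoding as two 
separate values when guessing from a file name, which is roughly the split 
described above. The file names below are invented examples.

```python
import mimetypes

# Separate instance seeded from the module's built-in tables.
db = mimetypes.MimeTypes()

for name in ("backup.tar.gz", "backup.tgz", "index.html.bz2", "index.html"):
    # guess_type() returns a (type, encoding) pair; encoding is None
    # when the file is not compressed.
    mime_type, encoding = db.guess_type(name)
    print(f"{name}: type={mime_type}, encoding={encoding}")
```

A handler could then be picked from the type alone, with the encoding 
deciding which decompression filter to put in front of it.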

> I have it working here, konqy recognizes application/troff-gzip and loads
> the corresponding troff part, which unzips the file correctly.

And what do you do with bzipped/zipped/lha'ed/arc'ed/compress(1)ed 
html/xml/kword/manpages/jpeg/png/gif/xpm/whatever files? Define
text/html-gzip
text/html-bzip
text/html-zip
text/html-lha
text/html-arc
text/html-compress
text/xml-gzip
text/xml-bzip
text/xml-zip
text/xml-lha
text/xml-arc
text/xml-compress
application/x-kword-gzip
application/x-kword-bzip
application/x-kword-zip
application/x-kword-lha
application/x-kword-arc
application/x-kword-compress
application/x-man-gzip
application/x-man-bzip
application/x-man-zip
application/x-man-lha
application/x-man-arc
application/x-man-compress
image/png-gzip
image/png-bzip
image/png-zip
image/png-lha
image/png-arc
image/png-compress
image/jpeg-gzip
image/jpeg-bzip
image/jpeg-zip
image/jpeg-lha
image/jpeg-arc
image/jpeg-compress
image/gif-gzip
image/gif-bzip
image/gif-zip
image/gif-lha
image/gif-arc
image/gif-compress
image/x-xpm-gzip
image/x-xpm-bzip
image/x-xpm-zip
image/x-xpm-lha
image/x-xpm-arc
image/x-xpm-compress
? ;-)
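
The arithmetic behind that list, as a small Python sketch (the two lists 
simply restate the names used above):

```python
from itertools import product

types = ["text/html", "text/xml", "application/x-kword", "application/x-man",
         "image/png", "image/jpeg", "image/gif", "image/x-xpm"]
compressions = ["gzip", "bzip", "zip", "lha", "arc", "compress"]

# One mimetype per (type, compression) pair, as in the list above.
combined = [f"{t}-{c}" for t, c in product(types, compressions)]
print(len(combined))                   # 48 names, one per pair

# Keeping type and compression apart needs one name per item instead.
print(len(types) + len(compressions))  # 14 names
```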

Marc

[1] Unfortunately, last time I looked (a few months ago), the KOffice DTDs 
were not properly namespaced with something like 
xmlns:kword="http://www.kde.org/DTD/2001/koffice/kword-1.1.1.dtd",
so registering a mimetype with IANA against a moving and undocumented (in the 
above sense) DTD would just result in the world laughing at us, so I gave up 
on it for the moment. Seeing the vast number of obsolete (since at least 
1996) "x-" mimetypes being added to kdelibs/mimetypes, I also get the 
impression that the KDE project isn't interested in registering its 
mimetypes with IANA under e.g. the vnd.kde.* tree.
E.g. KWord's mimetype should have been application/x-vnd.kde.kword+xml from 
the moment it was first used, and after registration with IANA and 
_before_ the first stable release, application/vnd.kde.kword+xml. See 
RFC 2048 and RFC 3023 for more.

- -- 
Marc Mutz <mutz at kde.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE8r4VM3oWD+L2/6DgRAsMcAKDf0x/Xa7owgbnN5BHokR3UC3cBbwCgzf9N
lm/i05wRRIacem8wQaHid4M=
=e6u/
-----END PGP SIGNATURE-----




