mimetype guessing is fooled by extension

Allan Sandfeld Jensen kde at carewolf.com
Sun Jul 25 12:39:31 BST 2004


On Sunday 25 July 2004 13:14, Allan Sandfeld Jensen wrote:
> On Wednesday 21 July 2004 16:25, Luciano Montanaro wrote:
> > I created a very big file to test the file plugins (I noticed there were
> > problems earlier this year), and I have found that, at least, the c++ and
> > diff file plugin are tricked in a tight loop by it. I think this kind of
> > plugins should bail out on files of unreasonable length, however, another
> > issue is that the file was wrongly identified as a c++ file, while it
> > does not even qualify as a text file (I don't think '\0' is a valid
> > character in a text file).
> >
> > "file prova.cpp" correctly says the file is a "data" file.
> > Can't the mime identification be made smarter, using the file extension
> > as an additional hint instead of the only way to identify the file?
>
> Yes, by setting X-KDE-PatternAccuracy to <100.
> Notice that if you open the properties for the file, it will detect the
> content-mimetype more accurately.
>
> I will take a look at the issue.
>
Oops. One major problem. The magic (content) detection code can correctly
detect diff, C and C++ files. Diff will work fine with
X-KDE-PatternAccuracy set as suggested above, but C and C++ are detected as
"text/x-c" and "text/x-c++", which do not exist as mimetypes in KDE (it has
"text/x-csrc" and "text/x-chdr"). What is worse, the magic code
_cannot_ tell headers from source, so we end up in a
situation where a combination of patterns and magic is needed to do proper
detection. There is currently no way to do that.
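
To make the diff case concrete, here is a rough Python sketch of how a
pattern accuracy below 100 could let content override the extension. It is
not KDE's code: the tables, the function names and the toy content checks
are all assumptions made up for illustration.

```python
import fnmatch

# Illustrative table only (an assumption, not the real KDE database):
# (glob, mimetype, pattern accuracy in percent)
PATTERNS = [
    ("*.diff", "text/x-diff", 50),
    ("*.patch", "text/x-diff", 50),
]


def guess_by_pattern(filename):
    """Return (mimetype, accuracy) for the first matching glob, else (None, 0)."""
    for glob, mimetype, accuracy in PATTERNS:
        if fnmatch.fnmatch(filename, glob):
            return mimetype, accuracy
    return None, 0


def guess_by_content(data):
    """Toy stand-in for magic detection; real magic rules are far richer."""
    if b"\0" in data:
        return "application/octet-stream"
    if data.startswith(b"--- ") or data.startswith(b"diff "):
        return "text/x-diff"
    return "text/plain"


def guess_mimetype(filename, data):
    by_pattern, accuracy = guess_by_pattern(filename)
    if by_pattern is not None and accuracy >= 100:
        return by_pattern  # pattern fully trusted, content is ignored
    by_content = guess_by_content(data)
    if by_pattern is None:
        return by_content
    if by_content == "text/plain":
        return by_pattern  # content is generic text, the extension refines it
    return by_content  # content is more specific or contradicts the extension
```

With these made-up tables, a file named like a diff but full of '\0' bytes
comes out as "application/octet-stream" instead of "text/x-diff".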

A partial fix would be to add "text/x-c" and "text/x-c++" as valid mimetypes
and let the "text/x-chdr"-style mimetypes inherit from them. It would mean,
though, that a thorough mimetype detection (with magic) would lead to less
accurate results than a fast mimetype detection (with patterns only).
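
As a sketch of what combining the two could look like once such inheritance
exists: keep the pattern result whenever it is the magic result or inherits
from it, and otherwise fall back to magic. This is hypothetical Python, not
an existing KDE API; the PARENT table and the "text/x-c++src" and
"text/x-c++hdr" names are assumptions for the example.

```python
# Hypothetical inheritance table: child mimetype -> parent mimetype.
PARENT = {
    "text/x-csrc": "text/x-c",
    "text/x-chdr": "text/x-c",
    "text/x-c++src": "text/x-c++",
    "text/x-c++hdr": "text/x-c++",
}


def inherits(mimetype, ancestor):
    """True if mimetype equals ancestor or has it somewhere up its parent chain."""
    while mimetype is not None:
        if mimetype == ancestor:
            return True
        mimetype = PARENT.get(mimetype)
    return False


def combine(pattern_type, magic_type):
    """Prefer the pattern result when it refines what magic found."""
    if pattern_type is not None and inherits(pattern_type, magic_type):
        return pattern_type  # e.g. magic says C, pattern says C header
    return magic_type  # pattern missing or contradicted by content
```

Magic alone can only ever return the parent type here, which is the loss of
accuracy mentioned above; the pattern supplies the header/source distinction.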

`Allan



