mimetype for files with a wrong extension?

Boudewijn Rempt boud at valdyas.org
Tue Apr 10 12:42:05 BST 2012


On Tuesday 10 April 2012, David Faure wrote:
> On Monday 09 April 2012 12:41:52 Boudewijn Rempt wrote:
> > I'm trying to figure out why kmimetype returns image/jpeg for a png file
> > that was renamed to have a jpg extension (which apparently happens a lot
> > for real users).
> 
> Yes, we trust the extension, because this allows the user to be in control, for 
> the case where magic-detection is incorrect.
> 

I see... For image files, though, the magic tends to be pretty robust, and users are not always fully competent...

> If you want to detect misnamed files for mimetypes where magic can be trusted 
> (JPEG/PNG are indeed in that case), you can call findByContent and compare with 
> findByUrl, and warn the user if they differ (or just use the magic result).
> 

Okay, I've done that for Calligra now.
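
For the archive, the idea looks roughly like the sketch below. This is a standalone illustration, not the Calligra code and not the KMimeType API: sniffMagic, mimeFromExtension, isMisnamed and effectiveMime are hypothetical stand-ins for the findByContent/findByUrl comparison, and only the PNG and JPEG signatures are checked.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical stand-in for findByContent(): inspects only the leading bytes.
// Returns an empty string when no known signature matches.
std::string sniffMagic(const std::string &data)
{
    static const std::string png("\x89PNG\r\n\x1a\n", 8); // 8-byte PNG signature
    static const std::string jpeg("\xff\xd8\xff", 3);     // JPEG SOI marker plus next marker prefix
    if (data.compare(0, png.size(), png) == 0)
        return "image/png";
    if (data.compare(0, jpeg.size(), jpeg) == 0)
        return "image/jpeg";
    return std::string();
}

// Hypothetical stand-in for the filename-only part of findByUrl().
std::string mimeFromExtension(const std::string &fileName)
{
    const std::string::size_type dot = fileName.rfind('.');
    if (dot == std::string::npos)
        return std::string();
    std::string ext = fileName.substr(dot + 1);
    std::transform(ext.begin(), ext.end(), ext.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    if (ext == "jpg" || ext == "jpeg")
        return "image/jpeg";
    if (ext == "png")
        return "image/png";
    return std::string();
}

// True when both checks give an answer and the answers disagree,
// i.e. the case where the user could be warned.
bool isMisnamed(const std::string &fileName, const std::string &data)
{
    const std::string byContent = sniffMagic(data);
    const std::string byName = mimeFromExtension(fileName);
    return !byContent.empty() && !byName.empty() && byContent != byName;
}

// Prefer the magic result where a signature matched, else fall back to the name.
std::string effectiveMime(const std::string &fileName, const std::string &data)
{
    const std::string byContent = sniffMagic(data);
    return byContent.empty() ? mimeFromExtension(fileName) : byContent;
}
```

With this, a PNG saved as photo.jpg is reported as misnamed and opened as image/png, while a file with no recognised signature still falls back to its extension.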


> > In the dox for KMimeType::findByUrl it's said that if only the filename is
> > used to determine the mimetype, accuracy is set to 80 -- but it's not, it's
> > set to 100.
> > 
> > diff --git a/kdecore/services/kmimetype.cpp b/kdecore/services/kmimetype.cpp
> > index 955bf62..74f371d 100644
> > --- a/kdecore/services/kmimetype.cpp
> > +++ b/kdecore/services/kmimetype.cpp
> > @@ -211,6 +211,7 @@ KMimeType::Ptr KMimeType::findByUrlHelper( const KUrl& _url, mode_t mode,
> >                      kWarning() << "Glob file refers to" << selectedMime << "but this mimetype does not exist!";
> >                      mimeList.clear();
> >                  } else {
> > +                    accuracy = 80;
> >                      return mime;
> >                  }
> >              }
> 
> Commit that if you want [after running kmimetypetest] -- but I removed the 
> whole concept of the "accuracy" number in KF5.

Okay :-)

> 
> > When doing this I can at least see that only the filename was used. But I'm
> > wondering why checking the magic numbers for jpg, png and so on from the
> > content isn't given a higher accuracy, since if I do a findByContent, I get
> > an accuracy of 50 for my file, but the magic numbers for image files are
> > pretty much completely reliable.
> 
> shared-mime-info says <magic priority="50"> indeed, which is the default 
> value. Higher values are only used to order magic things relative to each 
> other. This shows another reason why I want to get rid of these numbers in the 
> public API.

Hm, yes, I see. Thanks for the explanation!
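
As I understand it, then, the priority only decides which magic rule is tried first; it says nothing about how reliable a match is. A toy sketch of that reading follows. MagicRule and matchByMagic are hypothetical names, and this is neither the shared-mime-info nor the kdelibs implementation.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical, simplified magic rule: a byte prefix that must match at offset 0.
struct MagicRule {
    int priority;       // used only to order rules relative to each other
    std::string prefix; // bytes expected at the start of the data
    std::string mime;   // MIME type reported when the prefix matches
};

// Try rules from highest to lowest priority and return the first match.
// The priority of the winning rule is never returned: it is an ordering
// key, not a confidence score.
std::string matchByMagic(std::vector<MagicRule> rules, const std::string &data)
{
    std::stable_sort(rules.begin(), rules.end(),
                     [](const MagicRule &a, const MagicRule &b) {
                         return a.priority > b.priority;
                     });
    for (const MagicRule &rule : rules) {
        if (data.compare(0, rule.prefix.size(), rule.prefix) == 0)
            return rule.mime;
    }
    return std::string();
}
```

A rule at the default priority of 50 matches just as firmly as one at 80; the number only matters when two rules could both match the same bytes.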

-- 
Boudewijn Rempt
http://www.valdyas.org, http://www.krita.org, http://www.boudewijnrempt.nl




More information about the kde-core-devel mailing list