Should fast mime detection use a stat call?

David Faure faure at kde.org
Mon Jun 17 15:13:19 BST 2013


On Monday 17 June 2013 09:54:02 Mark wrote:
> -- sending to kfm-devel as well --
> 
> Hi,
> 
> By now some of you have likely seen my review request [1] that greatly
> speeds up fast mime detection.

Sure, let's discuss the same thing in reviewboard, and on two different 
mailing-lists.... Can we pick one mode of communication and one channel, for 
any given issue?

> Issue 1: We overwrite the user-specified "is_local_file" ([2], line 321)
> If you look at the code from line 321, you see that we reset the
> user-provided "is_local_file" if we manually detect that
> "url.isLocalFile()" equals true. Now I wonder if that's the
> intended behavior, since we are resetting a user-provided value. A
> result of this (for local files) is that findFromMode is called from
> line 192 (same file as [2]). That function executes a stat call on
> line 92 (again, same file). So the question is: do we want this
> when the user explicitly asks us for fast results by setting
> "fast_mode" to true and "is_local_file" to false, even though
> the files might be local ones?

This is a very old optimization from the time when isLocalFile() was slow, 
due to calling gethostname. So is_local_file=true meant "you can skip calling 
isLocalFile", while false meant "unknown", hence the call to isLocalFile().
I cleaned that up for KF5.
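
To illustrate those old semantics, here is a minimal sketch. It is not the
KMimeType code: looksLocal() is a hypothetical stand-in for the formerly
expensive isLocalFile() check, and the checks counter exists only to show
when that check is skipped.

```cpp
#include <string>

// Hypothetical stand-in for the (formerly slow) url.isLocalFile() check.
bool looksLocal(const std::string &url)
{
    return url.compare(0, 7, "file://") == 0;
}

// isLocalHint == true  : caller vouches for it, skip the check.
// isLocalHint == false : "unknown", so the check has to run.
// checks is incremented each time the check actually runs.
bool effectiveIsLocal(const std::string &url, bool isLocalHint, int &checks)
{
    if (isLocalHint) {
        return true;
    }
    ++checks;
    return looksLocal(url);
}
```

With this reading, passing false never forces "remote"; it only means the
callee has to find out for itself.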

> Issue 2: Folders without a trailing slash are not detected as folders.
> OK, this issue is imho as it should be. If you provide something
> like "file:///home/youruser/somefolder", then it is impossible to
> detect "somefolder" as a folder without doing a stat call.

Correct. So keep the call, except when the mode_t argument contains S_IFDIR or 
S_IFREG already (as answered on reviewboard).
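
A minimal sketch of that rule, assuming plain POSIX stat(). The function name
resolveFileType and the statCalls counter are hypothetical and only there for
illustration; this is not the actual patch.

```cpp
#include <string>
#include <sys/stat.h>
#include <sys/types.h>

// Returns the file-type bits (S_IFDIR, S_IFREG, ...) for a local path.
// stat() is called only when the caller-supplied mode does not already
// say "directory" or "regular file". Returns 0 if the type is unknown.
mode_t resolveFileType(const std::string &path, mode_t knownMode, int &statCalls)
{
    const mode_t type = knownMode & S_IFMT;
    if (type == S_IFDIR || type == S_IFREG) {
        return type; // the caller already knows: no syscall needed
    }
    struct stat buf;
    ++statCalls; // count real stat() calls so the saving is visible
    if (::stat(path.c_str(), &buf) != 0) {
        return 0; // missing file, permission error, ...
    }
    return buf.st_mode & S_IFMT;
}
```

A caller that passes S_IFDIR or S_IFREG pays nothing; everyone else keeps the
current behaviour.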

> I guess
> that's why the above "Issue 1" exists in the first place since adding
> a stat call will "fix" it. Fast mime detection with a trailing slash
> "file:///home/youruser/somefolder/" for folders proved to be quite
> accurate and very fast in my testing.

Well, sure, if you do all the work before calling the method, the method has 
nothing to do anymore. But that's cheating, isn't it? In practice, all the 
URLs known by KIO, KDirLister, and so on, don't have a trailing slash.

> It's just that the user has to
> provide that trailing slash, something which the file chooser dialog
> is not doing, so I suppose more apps don't do that either. My question
> now is: can I expect user applications to provide a trailing slash for
> folders, thus preventing the stat call and being ~13x faster than the
> current fast mime detection? Or should I expect the user to be stupid and
> neglect the trailing slash, which means that I have to do a stat call,
> which drops the speedup to "just" ~5x faster? Doing the first means
> fixing up apps that don't provide a trailing slash for folders.

This isn't only about users (but also about the large amount of existing 
code), and it's definitely not about stupid. The app developers will call 
*KMimeType* stupid if it can't figure out that /home is a directory.

Fast means "do not open files to read their contents".
It doesn't mean "no stat call at all".
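
As a rough sketch of what "fast" means in that sense: the function below may
stat() the path but never opens it. The name fastMimeGuess and the two-entry
extension table are made up for illustration; the real lookup uses the shared
mime database.

```cpp
#include <string>
#include <sys/stat.h>

// Fast guess: a stat() call is allowed, reading file contents is not.
std::string fastMimeGuess(const std::string &path)
{
    struct stat buf;
    if (::stat(path.c_str(), &buf) == 0 && S_ISDIR(buf.st_mode)) {
        return "inode/directory"; // no trailing slash required
    }
    // Otherwise fall back to the extension of the last path component.
    const std::string::size_type dot = path.rfind('.');
    const std::string::size_type slash = path.rfind('/');
    if (dot != std::string::npos
        && (slash == std::string::npos || dot > slash)) {
        const std::string ext = path.substr(dot + 1);
        if (ext == "txt") return "text/plain";
        if (ext == "png") return "image/png";
    }
    return "application/octet-stream"; // unknown without looking inside
}
```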

> What's your opinion on this? My intention is to have fast mime lookup
> be really fast and not do any stat calls at all, which is the patch
> in [1]. Stat calls even seem like a waste, because you are very likely
> to have just done a stat call to get the file list.

Sure, so let's ensure that the mode_t is passed all the way down from 
KFileItem (which knows the mode) to findByUrl (which can take the mode).

That particular bit has changed "for the worse" (if we can call it that) in 
Qt5/KF5, due to the lack of portability of mode_t, but let's fix one issue at 
a time.

> However, that
> stat data isn't sent to KMimeType, so another way would be to require a
> "KFileItem", which knows more about a file, instead of a KUrl (QUrl in
> KDE Frameworks). But then again, that requires app-side changes...

No, just ensure that the calls to KMimeType that you're looking at, get the 
mode_t from the KFileItem. I.e. don't change KMimeType itself (for that 
particular issue), change the callers to provide more information, and get 
more speed out of it. This way, other callers won't break.
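
Roughly like this, as a sketch only. ListedItem stands in for a KFileItem-like
object and lookupIsDir for a findByUrl-style lookup with an optional mode;
both names, and the g_statCalls counter, are hypothetical.

```cpp
#include <string>
#include <sys/stat.h>
#include <sys/types.h>

// Stand-in for an item produced by a directory lister, which has
// already stat'ed the entry and therefore knows its mode.
struct ListedItem {
    std::string path;
    mode_t mode; // 0 means "not known"
};

static int g_statCalls = 0; // illustration only

// Lookup with an optional mode: existing callers that pass nothing
// keep working and simply pay for the stat() call.
bool lookupIsDir(const std::string &path, mode_t mode = 0)
{
    if ((mode & S_IFMT) != 0) {
        return S_ISDIR(mode); // type already known, no syscall
    }
    struct stat buf;
    ++g_statCalls;
    return ::stat(path.c_str(), &buf) == 0 && S_ISDIR(buf.st_mode);
}

// The caller-side change: forward what the item already knows.
bool itemIsDir(const ListedItem &item)
{
    return lookupIsDir(item.path, item.mode);
}
```

The lookup's signature and behaviour stay the same; only the call sites that
have a mode at hand change.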

-- 
David Faure, faure at kde.org, http://www.davidfaure.fr
Working on KDE, in particular KDE Frameworks 5




