Should fast mime detection use a stat call?

Mark markg85 at gmail.com
Mon Jun 17 14:13:36 BST 2013


On Mon, Jun 17, 2013 at 4:13 PM, David Faure <faure at kde.org> wrote:
> Le lundi 17 juin 2013 09:54:02 Mark a écrit :
>> -- sending to kfm-devel as well --
>>
>> Hi,
>>
>> By now some of you have likely seen my reviewrequest [1] that greatly
>> speeds up fast mime detection.
>
> Sure, let's discuss the same thing in reviewboard, and on two different
> mailing-lists.... Can we pick one mode of communication and one channel, for
> any given issue?

Sorry for the spam. The intention was:
- Keep the reviewrequest for the code review.
- send a mail to this list + kfm (because of dolphin)

Sorry if that's a bit too much...
>
>> Issue 1: We overwrite the user specified "is_local_file" [2] line 321
>> If you look at the code from line 321 you see that we reset the user
>> provided "is_local_file" if we manually detected that
>> "url.isLocalFile()" equels to true. Now i wonder if that's the
>> intended behavior since we are resetting a user provided value. A
>> result of this (for local files) is that findFromMode is called from
>> line 192 (same file as [2]). That function is executing a stat call on
>> line 92 (again same file). So the question is: Should we want this
>> when the user explicitly asks us for fast results by setting
>> "fast_mode" to true and setting "is_local_file" to false. Even though
>> the files might be local ones?
>
> This is a very old optimization from the time where isLocalFile() was slow,
> due to calling gethostname. So is_local_file=true meant "you can skip calling
> isLocalFile", while false meant unknown, hence the call to isLocalFile().
> I cleaned that up for KF5.

Ahh, that clears things up. Thank you for the explanation.
>
>> Issue 2: Folders without a trailing slash are not detected as folders.
>> Oke, this issue is imho as it should be. If you provide something
>> like: "file:///home/youruser/somefolder" then it is impossible to
>> detect "somefolder" as a folder without doing a stat call.
>
> Correct. So keep the call, except when the mode_t argument contains S_IFDIR or
> S_IFREG already (as answered on reviewboard).
Thank you :)
>
>> I guess
>> that's why the above "Issue 1" exists in the first place since adding
>> a stat call will "fix" it. Fast mime detection with a trailing slash
>> "file:///home/youruser/somefolder/" for folders proved to be quite
>> accurate and very fast in my testing.
>
> Well, sure, if you do all the work before calling the method, the method has
> nothing to do anymore. But that's cheating, isn't it? In practice, all the
> URLs known by KIO, KDirLister, and so on, don't have a trailing slash.

yes/no :) The data is already there on the client requesting the
mimetype. It's just never used and send to KMimeType. Probable because
KMimeType has no function to do such a thing.
>
>> It's just that the user has to
>> provide that trailing slash. Something which the file chooser dialog
>> is not doing so i suppose more apps don't do that. My question now is:
>> can i expect the user applications to provide a trailing slash for
>> folders thus prevent the stat call and be ~13x faster then the current
>> fast mime detection? Or should i expect the user to be stupid and
>> neglect the trailing slash which means that i have to do a stat call
>> which drops the speedup to "just" ~5x faster? Doing the first means
>> fixing up apps that's don't provide a trailing slash for folders.
>
> This isn't only about users (but also about the large amount of existing
> code), and it's definitely not about stupid. The app developers will call
> *KMimeType* stupid if it can't figure out that /home is a directory.

True..
>
> Fast means "do not open files to read their contents".
> It doesn't mean "no stat call at all".

Right, that's basically an answer to the title of this mail. I wasn't
sure if fast should do one or the other.
>
>> What's your opinion on this? My intention is to have fast mime lookup
>> be really fast and don't do any stat calls at all which is the patch
>> in [1]. Stat calls even seem like a waste because you are very likely
>> to have just done a stat call just to get the file list.
>
> Sure, so let's ensure that the mode_t is passed all the way down from
> KFileItem (which knows the mode) to findByUrl (which can take the mode).

That would indeed be the solution to go for (Frank, you're reading
this, right?). Just setting that one mode_t thing will likely speed up
the existing mime detection quite a bit. I don't know if dolphin or
any app is actually setting that flag, but i bet they don't.
>
> That particular bit has changed "for the worse" (if we can call it that) in
> Qt5/KF5, due to the lack of portability of mode_t, but let's fix one issue at
> a time.
>
>> However, that
>> stat data isn't send to KMimeType so another way would be to require a
>> "KFileItem" which knows more about a file instead of a KUrl (QUrl in
>> kde frameworks). But then again, that requires app side changes...
>
> No, just ensure that the calls to KMimeType that you're looking at, get the
> mode_t from the KFileItem. I.e. don't change KMimeType itself (for that
> particular issue), change the callers to provide more information, and get
> more speed out of it. This way, other callers won't break.

Will do so.

@Frank, do you happen to know what dolphin does here? Is it passing
this mode property?

For the rest of this patch. I - sadly - didn't think about looking in
the frameworks branch to see if this would still be valid for that.
Which it isn't due to QMimeType (awesome work btw, david! Really good
to see that stuff in Qt!) so that means this patch is only usable for
the KDE 4.x cycle. The patch is a bit invasive and requires quite
thorough testing of apps that use mime types. So it won't be accepted
for 4.11 which is a long term KDE release. Is it worth it to simplify
the patch for some easy speed improvements or should i just close it?
Is there even going to be another release after 4.11?

I'm obviously going to benchmark QMimeType, just out of interest :).
>
> --
> David Faure, faure at kde.org, http://www.davidfaure.fr
> Working on KDE, in particular KDE Frameworks 5
>




More information about the kfm-devel mailing list