Making KFileMetaData a framework

Tue Aug 12 18:37:16 UTC 2014

On Wednesday 06 August 2014 16:09:19 Milian Wolff wrote:
> On Wednesday 06 August 2014 16:05:27 Vishesh Handa wrote:
> > On Tuesday, August 05, 2014 11:19:16 PM Kevin Ottens wrote:
> > > Hello,
> > > 
> > > On Tuesday 05 August 2014 18:36:24 Vishesh Handa wrote:
> > > > I would appreciate it if everyone could review the code once before it
> > > > gets
> > > > into frameworks.
> > > 
> > > metainfo.yaml should have "release: false" until it's part of a KF
> > > release.
> > > Also it should be integration and not functional (relies on plugins).
> > 
> > Fixed
> > 
> > > The "test" folder should be named "tests" (see directory structure
> > > policy).
> > 
> > Fixed
> > 
> > > Public headers should use <> style include. Also it seems you're not
> > > following the k convention but the namespace convention for your
> > > classes, then the includes in public headers should be namespaced as
> > > well (e.g. <kfilemetadata/foo.h>).

For the record: not fixed yet (just checked).

> > > I'm not sure why ExtractorPluginManager is exported. A function in a
> > > namespace would be enough, there's no point in instantiating a manager
> > > by hand from the client code perspective, at best looks like leaking an
> > > implementation detail.
> > 
> > We could just expose a function which loads all the Extractors, but then
> > the client side would need to iterate over all of them and figure out
> > which ones match the mimetype they are targeting. I was hoping to avoid
> > that.
> > 
> > Currently, ExtractorPluginManager just provides a
> > fetchExtractors(mimetype) function. This cannot just be made a function as
> > one doesn't want to load all the plugins each time. Hmm, or maybe we just
> > save all the plugins in a static variable.
> 
> Nah, imo having the manager load the plugins is just fine. Otherwise you'd
> leak the plugins etc. pp. Imo, the manager is OK.

Right, good point. Harder to get right in a multithreaded context too. My 
comment there can be safely ignored. :-)

I don't like the "Manager" name in public API though, it's generally a sign of 
fuzzy responsibility (or opens the door to greatly dilute the responsibility 
of the class).

What about ExtractorCollection?

> > > Similarly the ExtractorPlugin naming is odd in the public API as it
> > > states it is necessarily a plugin (implementation detail). I'd rename it
> > > ExtractorInterface, I'd drop the suffix altogether or I'd keep
> > > ExtractorPlugin for plugin implementors while they'd be wrapped in
> > > Extractor instances on the client code side (I think that's actually my
> > > favorite solution).
> > 
> > Just to clarify -
> > * ExtractorPlugin - All plugins would derive from this. We could also call
> > it ExtractorInterface (I do like this name better).
> > * We have an Extractor class which is just a wrapper on top of the
> > ExtractorInterface which does ... ?
> 
> No, it's either Extractor or ExtractorInterface afaics.

Well, probably badly worded but I proposed two options which are kind of 
orthogonal.

Basically:
a) Rename ExtractorPlugin to ExtractorInterface;
b) Separate between the plugin side API and the client side API. In which case 
you'd have Extractor used by client code, wrapping subclasses of 
ExtractorPlugin (or ExtractorInterface, the name matters less in that case) 
inherited by plugin implementors.

You question was "which does ... ?", in first instance Extractor would 
probably relay 1:1 methods from ExtractorPlugin. The value of such a thing 
would be more "over time". Such a class would probably have no virtual method 
so you can extend its interface at will, while the one carrying pure virtual 
is basically frozen (can't add virtuals). If in the future you'd need more API 
it could be added right away on Extractor, and you could provide a 
ExtractorPluginV2 inheriting from ExtractorPlugin.

That'd mean plugin implementors would need to know which interface to target 
while client code wouldn't need to know, it is more future proof that way.

> > > AFAICT there's no reason for ExtractorPlugin to inherit from QObject at
> > > that point. Same thing for ExtractorPluginManager.
> > 
> > Fixed ExtractorPluginManager
> > 
> > ExtractorPlugin - This now requires every plugin to derive from both
> > ExtractorPlugin and QObject. This might be less friendly.
> 
> See below, making them QObjects with proper signals would make it easier to
> run them in a background thread.
> 
> > > The whole API is synchronous which we probably don't want. It'd be
> > > better to have an async API (much better to build up sync on top of
> > > async than the contrary).
> > 
> > Hmm, how would we do async? I thought people could just run it in another
> > thread / process if they want.

That opens the door of allowing people to make mistakes.

> > The only thing I can think of changing is that the plugins return the
> > result later via a signal instead of doing all the work in one function.
> 
> This would be good since it makes it trivial to use it then in a background
> Thread - essentially you'd put a worker object into some thread/task which
> then emits a signal with the data when its ready. This could be the Manager
> (which then automatically finds the correct extractor for the given
> mimetype), or a certain Extractor, if you know which one should be used.
> 
> Though note that the above does not mean that a call to extract would be
> async. It would still be sync. But you could use QFile and its readyRead
> signals internally to process stuff asynchronously and then emit the data
> once the end of the file is reached. If the signals are there (see above)
> this could be done without changing the API and thus could be done later.

Yep. Make the interface "async ready" so to speak. Don't let client code 
micromanage that themselves and get it wrong.

Note it probably means also introducing examples and tests for both cases. ;-)

> > > What about the thread safety? At a glance I would say
> > > ExtractorPluginManager is not
> 
> This should be investigated. Note that it would be fine to load the plugins
> in the ctor (which by definition is never thread safe).

Good point.

> Just the usage later to match a given data set/file against the extractors
> and to get some result out of that should be made thread safe.

Definitely. I wonder if that could even be made reentrant... with the presence 
of ExtractionResult it might even be possible.

But yeah, definitely the main points to investigate IMO as between async API 
and thread safety the API can be impacted greatly.

Regards.
-- 
Kévin Ottens, http://ervin.ipsquad.net

KDAB - proud supporter of KDE, http://www.kdab.com

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 198 bytes
Desc: This is a digitally signed message part.
URL: <http://mail.kde.org/pipermail/kde-frameworks-devel/attachments/20140812/be00ec98/attachment.sig>