[Kde-pim] akonadinext update: entity processing pipelines in resources

Christian Mollekopf chrigi_1 at fastmail.fm
Wed Dec 17 14:06:35 GMT 2014


On Wednesday 17 December 2014 14.49:53 Milian Wolff wrote:
> On Wednesday 17 December 2014 11:39:10 Aaron J. Seigo wrote:
> > hey :)
> > 
> > how's that for a subject line full of jargon? ;)
> > 
> > in the akonadinext repo, synchronizer processes are approaching some sort
> > of early-stage completeness. they currently:
> > 
> > * accept connections from clients
> > * inform clients when the revision in the store changes[1]
> > * take commands from clients and respond with a task completion message[2]
> > * load Resource plugins to deal with storing data and synchronizing with
> > the source
> > * manage processing pipelines for entities
> > 
> > it's that last part that i'm writing about here, actually. a pipeline is
> > zero or more processing units (soon to be plugins) that (currently) sit in
> > a chain and do some processing on entities[3] whenever they are created,
> > modified or deleted. we will be using this to populate indexes, trigger
> > full-text indexing, apply client-side filters, do spam/scam detection,
> > etc.
> > 
> > anything that can be / should be done to a given entity when it appears,
> > changes or is removed will happen in these pipelines.
> > 
> > things left to do:
> > 
> > * make PipelineFilter pluggable so that it is easy for people to add new
> > filters (including ones we don't ship ourselves)
> > * generate a configuration scheme which Pipeline can use to populate
> > pipelines at runtime according to the user's wishes
> > * write a few PipelineFilter plugins that do some actually useful things
> > that we can use in testing
> > 
> > currently, pipelines are just a simple one-after-the-other processing
> > affair. It is already set up for asynchronous processing, however.
> > Eventually I would like to allow filters to note that they can be
> > parallelized, should be run in a separate thread, ??? ... mostly so that
> > we can increase throughput.
> 
> This, imo, will kill user configuration. You do not want to burden the user
> with a GUI where they can define dependencies and so on.
> 
> Also, I cannot think of any common use case of mail filtering that could be
> parallelized for a single mail:
> 
> a) first, "move" filters are checked, such as spam filters, which either
> discard the mail or put it into a subfolder, or mailing list filters, which
> put the mails into a folder for the given list. when any filter matches
> here, the chain is stopped, so this cannot be parallelized
> b) once the final place is found, and it is a folder that we want to have
> indexed, we feed the mail over to e.g. baloo. again, not something you can
> do in parallel, as you don't want to index spam mails and also want to know
> the final place of the mail.
> 
> What other _common_ use case are you thinking of that would benefit from
> the additional design overhead?
> 

"filter" in this context are not what we currently have as client-side 
filtering. It's rather a "processor" if you will, that get's processed as new 
or modified entites are processed.

See also:
https://community.kde.org/KDE_PIM/Akonadi_Next/Terminology

Filters could be used for:
* indexing
* detecting spam
* client-side filtering (which is what you meant, I think)
* ...
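
To make that concrete, a filter in this sense is roughly something like 
the following. This is just an illustrative sketch in plain C++ with 
made-up names (Entity, EntityFilter, IndexFilter, SpamFilter); it is not 
the actual akonadinext interface, which uses Qt types and will eventually 
run asynchronously.

    #include <map>
    #include <string>

    // Stand-in for a stored entity (e.g. a mail) and its properties.
    struct Entity {
        std::string id;
        std::map<std::string, std::string> properties;
    };

    // A "filter"/"processor": a step applied to every new or modified
    // entity before clients get to see it.
    class EntityFilter {
    public:
        virtual ~EntityFilter() = default;
        virtual void process(Entity &entity) = 0;
    };

    // Example: hand the entity over to an indexer.
    class IndexFilter : public EntityFilter {
    public:
        void process(Entity &entity) override {
            // feed entity.properties to the index here
        }
    };

    // Example: tag an entity as spam so clients can act on it later.
    class SpamFilter : public EntityFilter {
    public:
        void process(Entity &entity) override {
            entity.properties["spam"] = looksLikeSpam(entity) ? "true" : "false";
        }
    private:
        bool looksLikeSpam(const Entity &) const { return false; } // placeholder
    };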

It basically allows us to plug in pieces of functionality that are guaranteed 
to be processed before an entity officially enters the system.
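
That guarantee is really the point: the pipeline runs every filter on the 
entity first, and only then commits it to the store and bumps the revision 
that clients are notified about. Continuing the sketch above (synchronous 
here, while the real pipeline is set up for asynchronous processing):

    #include <memory>
    #include <vector>

    class Pipeline {
    public:
        void append(std::unique_ptr<EntityFilter> filter) {
            m_filters.push_back(std::move(filter));
        }

        // Called for every new or modified entity. Only after all filters
        // have run does the entity "officially enter the system", i.e. get
        // committed to the store and announced to clients.
        void newEntity(Entity &entity) {
            for (auto &filter : m_filters) {
                filter->process(entity);
            }
            commitToStore(entity);   // placeholder for the actual storage write
            notifyRevisionChanged(); // placeholder for the client notification
        }

    private:
        void commitToStore(const Entity &) {}
        void notifyRevisionChanged() {}
        std::vector<std::unique_ptr<EntityFilter>> m_filters;
    };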

So most of these filters will be fixed by the configuration shipped with the 
resource, not something the user can adjust. Some filters may be optional or 
react to user configuration, such as client-side filtering or optional 
full-text indexing.
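
Continuing the sketch, a resource would then assemble its pipeline from 
the filters it ships plus whatever the user has enabled; the configuration 
keys below are made up:

    // Hypothetical optional filters, stubbed out for the sketch.
    class FullTextIndexFilter : public EntityFilter {
    public:
        void process(Entity &) override {}
    };
    class ClientSideFilter : public EntityFilter {
    public:
        void process(Entity &) override {}
    };

    Pipeline buildPipeline(const std::map<std::string, bool> &config)
    {
        Pipeline pipeline;

        // Fixed filters, shipped with the resource, not user-adjustable.
        pipeline.append(std::make_unique<IndexFilter>());
        pipeline.append(std::make_unique<SpamFilter>());

        // Optional filters that react to user configuration.
        if (config.count("fulltextIndexing") && config.at("fulltextIndexing"))
            pipeline.append(std::make_unique<FullTextIndexFilter>());
        if (config.count("clientSideFiltering") && config.at("clientSideFiltering"))
            pipeline.append(std::make_unique<ClientSideFilter>());

        return pipeline;
    }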

Cheers,
Christian

_______________________________________________
KDE PIM mailing list kde-pim at kde.org
https://mail.kde.org/mailman/listinfo/kde-pim
KDE PIM home page at http://pim.kde.org/


