Why using Nepomuk as a contact store is probably not a good idea

George Goldberg grundleborg at googlemail.com
Wed Jul 20 22:04:55 CEST 2011


On 20 July 2011 20:40, Paolo Capriotti <p.capriotti at gmail.com> wrote:
> Hey everyone,
> as a new contributor to the project, I probably shouldn't be starting
> off on such a controversial topic, but well...
>
> I know the subject has been discussed over and over, but I feel there
> are some points that haven't been touched, and I'd like to offer my
> take on the issue, and hopefully convince you that there are
> alternatives worth considering.
>
> The issue with using Nepomuk as a contact store is that it is just not
> the right tool for the job. Let me explain.
>
> The idea of RDF and related technologies is to have a unified format
> to express any sort of knowledge, with no limitations in scope and
> structure, so that an automated tool can reason about it, process it,
> and extract information.
>
> The basic assumption that makes such an abstraction worthwhile is that
> the data maintained in RDF *can* actually be processed by tools that
> are agnostic to the particular type of data that is stored there.
>
> That is not the case for contacts. The way we can use contacts data
> stored in an RDF store is by writing *specific* contact-related APIs.
> That completely negates the purpose of using a generic data store in
> the first place.

I think you need to go chat with the Nepomuk developers to better
understand the aims of that project. It is not quite as simple as just
what RDF was intended to be about.

>
> That said, one might argue that even if Nepomuk isn't exactly designed
> to solve this particular problem, its generality allows it to be
> adapted for the purpose. Let me explain why this is not a good idea:
>
> 1) extra complexity

Perhaps, but that is entirely subjective.

> 2) unavoidable hard dependency on Nepomuk, which is highly undesirable
> for many people (me included, as a user)

Well, this is the way KDE is going. If you don't like it, I suggest
you go to kde-core-devel and suggest that the direction of KDE as a
whole is completely changed.

> 3) need to perform back and forth synchronization between the data
> store and telepathy (and possibly future data sources, if we want to
> add them)

And this is a problem because? Note that as I explain further down,
Telepathy data is only synced to Nepomuk; there is no writeback.

> 4) a sort of impedance mismatch: RDF is designed as a format to express factual
> truths, usually relatively static and not frequently subject to
> change; using it for data as dynamic as a live contact list is very
> questionable

Sounds like you aren't that familiar with the goals of Nepomuk, even
if you know far more than me about what the designers of RDF intended
it for. Please go talk to the Nepomuk devs more about what they are
trying to achieve.

> 5) concurrency issues (think of clients running on different machines)

Please elaborate. I don't understand what you are implying here.

> 6) all the usual consistency problems of using an intermediate layer
> as cache, but without the freedom of manipulating it any way you want,
> because other applications can access it

But with all the advantages of making information available to all the
other KDE applications which might want it...

> 7) tons of potentially unsolvable performance problems, usually
> related to point 4

Such as? Please don't just spout the same old "nepomuk is shit" stuff.
Substantiate these kinds of claims if you want them to be taken
seriously.

> 8) all the use cases I can think of can be covered anyway: if
> there is a need to have contact data available in nepomuk for
> "semantic" applications to consume, nothing would prevent us from
> exporting it even if we're using another data layer internally. I'm
> talking about a simple one-way synchronization mechanism that doesn't
> need to be real-time or efficient. However, I doubt there would be any
> rational reasons for an application to prefer an RDF API to the domain
> specific API that we would provide.

So, this seems to indicate a misunderstanding of what KDE-Telepathy is
trying to achieve with Nepomuk integration. At this stage, there are
*no* plans to have any kind of writeback via nepomuk to Telepathy. If
you want to write data, use TpQt4. If you just want to consume it, use
Nepomuk.
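
To make the read/write split concrete, here is a minimal sketch of a
one-way sync with no writeback. It does not use the real TpQt4 or
Nepomuk APIs: WritableSource, ReadOnlyMirror and Contact are
hypothetical names that only stand in for the roles described above
(one side accepts writes, the other is a read-only mirror kept up to
date by change notifications).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Contact:
    contact_id: str
    alias: str
    presence: str


class WritableSource:
    """Hypothetical stand-in for the side applications write to (TpQt4's role)."""

    def __init__(self) -> None:
        self._contacts: Dict[str, Contact] = {}
        self._listeners: List[Callable[[Contact], None]] = []

    def subscribe(self, listener: Callable[[Contact], None]) -> None:
        self._listeners.append(listener)

    def set_contact(self, contact: Contact) -> None:
        # Every modification enters here, then fans out to the mirrors.
        self._contacts[contact.contact_id] = contact
        for listener in self._listeners:
            listener(contact)


class ReadOnlyMirror:
    """Hypothetical stand-in for the store consumers read from (Nepomuk's role)."""

    def __init__(self, source: WritableSource) -> None:
        self._store: Dict[str, Contact] = {}
        source.subscribe(self._on_contact_changed)

    def _on_contact_changed(self, contact: Contact) -> None:
        self._store[contact.contact_id] = contact

    def get(self, contact_id: str) -> Optional[Contact]:
        # No public setter: data flows source -> mirror only.
        return self._store.get(contact_id)


source = WritableSource()
mirror = ReadOnlyMirror(source)
source.set_contact(Contact("alice@example.org", "Alice", "available"))
source.set_contact(Contact("alice@example.org", "Alice", "away"))
print(mirror.get("alice@example.org").presence)
```

Because the mirror exposes no write path, a consumer cannot push a
change back toward the source; anything that needs to modify a contact
has to go through the writable side.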

>
> That said, I know that each one of those points is potentially
> addressable. I'm not saying that this approach isn't possible. I'm
> saying that all those pain points combined create trouble. Trouble
> will come in the form of slow development pace, weird and hard to
> track bugs, and performance issues.

I strongly disagree. Pretty much every point there is invalid, and
whatever valid problems remain are far outweighed by the benefits of
Nepomuk integration.

> So here's what I propose as architecture:
>
> - Storage-agnostic API at the top level. Candidates: QtContact
> (http://doc.qt.nokia.com/qtmobility-1.2/contacts.html), or roll our
> own
> - Contacts store. Candidates: libfolks
> (http://telepathy.freedesktop.org/wiki/Folks), or roll our own.
> - TelepathyQt4 at the bottom.
>
> Advantages of this approach (assuming we go for QtContact/libfolks and
> not roll our own solutions):
>
> 1) code reuse

You want to NIH Nepomuk (which is what the rest of KDE is working to
support), and you justify it with the code reuse argument?!

> 2) lots of work already done, tested, and used in production in other projects

I find that pretty offensive given I've been working on Telepathy in
KDE for 4.5 years now in its various incarnations.

> 3) no lock-in within any of the API: each component of the chain is
> (in theory) replaceable.

This is a valid point, but we decided that it wasn't worth it -
Nepomuk is here to stay as part of KDE, so it's not worth the extra
work having *another* abstraction layer.
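
For reference, the kind of abstraction layer under discussion could be
sketched roughly as below. This is only an illustration of the
"replaceable component" idea, not code from QtContact, libfolks or
KDE-Telepathy; ContactBackend, InMemoryBackend and ContactListApi are
hypothetical names.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class ContactBackend(ABC):
    """Hypothetical interface that any contacts store would implement."""

    @abstractmethod
    def aliases(self) -> List[str]:
        ...


class InMemoryBackend(ContactBackend):
    """Illustrative backend; a real one might wrap libfolks or Nepomuk."""

    def __init__(self, contacts: Dict[str, str]) -> None:
        self._contacts = dict(contacts)  # contact id -> alias

    def aliases(self) -> List[str]:
        return sorted(self._contacts.values())


class ContactListApi:
    """Top-level API: callers depend on this class, never on the backend."""

    def __init__(self, backend: ContactBackend) -> None:
        self._backend = backend

    def display_names(self) -> List[str]:
        return self._backend.aliases()
```

Swapping the store then means passing a different ContactBackend to
ContactListApi; the cost is the extra interface that has to be designed
and maintained, which is the trade-off being argued here.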

> 4) no synchronization issues, or any of the other issues previously mentioned

Addressed previously.

> 5) no user-visible or "politically-loaded" dependency

Not interested in getting into an argument about this. Just look at
things like Plasma Active, KDE PIM etc. and tell me that we're going
out on a limb here.

>
> Well, that's all for now. Hopefully this will spark some interesting
> discussion. I've tried to keep each individual point brief, but I'm
> happy to delve into those that are unclear or unconvincing.

In summary, I think that some of your concerns stem from a
misunderstanding of the way KDE Telepathy and Nepomuk interact: it is
read-only from the applications' point of view. They should be using
TpQt4 if they want to modify stuff (at least at this stage, although
we haven't considered any kind of write-back from Nepomuk yet). Also,
I think you need to read a bit more about how Nepomuk is being used in
KDE; whether or not this fits with the original intended goals of RDF
is not really relevant. Finally, please remember that there are a lot
of advantages to us integrating with Nepomuk that you have completely
omitted to acknowledge: see my various blog posts over the years, and
e.g. Martin's Summer of Code project, and the other work going on in
KDE PIM at the moment.


--
George

