Why using Nepomuk as a contact store is probably not a good idea

Paolo Capriotti p.capriotti at gmail.com
Wed Jul 20 21:40:43 CEST 2011


Hey everyone,
as a new contributor to the project, I probably shouldn't be starting
off on such a controversial topic, but well...

I know the subject has been discussed over and over, but I feel there
are some points that haven't been touched, and I'd like to offer my
take on the issue, and hopefully convince you that there are
alternatives worth considering.

The issue with using Nepomuk as a contact store is that it is just not
the right tool for the job. Let me explain.

The idea of RDF and related technologies is to have a unified format
to express any sort of knowledge, with no limitations in scope and
structure, so that an automated tool can reason about it, process it,
and extract information.

The basic assumption that makes such an abstraction worthwhile is that
the data maintained in RDF *can* actually be processed by tools that
are agnostic to the particular type of data that is stored there.

That is not the case for contacts. The way we can use contacts data
stored in an RDF store is by writing *specific* contact-related APIs.
That completely negates the purpose of using a generic data store in
the first place.

That said, one might argue that even if Nepomuk isn't exactly designed
to solve this particular problem, its generality allows it to be
adapted for the purpose. Let me explain why this is not a good idea:

1) extra complexity
2) unavoidable hard dependency on Nepomuk, which is highly undesirable
for many people (me included, as a user)
3) need to perform back and forth synchronization between the data
store and Telepathy (and possibly future data sources, if we want to
add them)
4) a sort of impedance mismatch: RDF is designed as a format to
express factual truths, usually relatively static and not frequently
subject to change; using it for data as dynamic as a live contact list
is very questionable
5) concurrency issues (think of clients running on different machines)
6) all the usual consistency problems of using an intermediate layer
as cache, but without the freedom of manipulating it any way you want,
because other applications can access it
7) tons of potentially unsolvable performance problems, usually
related to point 4
8) all the use cases I can think of can be covered anyway: if
there is a need to have contact data available in Nepomuk for
"semantic" applications to consume, nothing would prevent us from
exporting it even if we're using another data layer internally. I'm
talking about a simple one-way synchronization mechanism that doesn't
need to be real-time or efficient. However, I doubt there would be any
rational reason for an application to prefer an RDF API to the
domain-specific API that we would provide.
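
To make point 8 concrete, here is a minimal sketch of such a one-way
export, written in Python only for brevity. The `Contact` record, the
`export_contacts` function, and the "ex:" predicate names are all made
up for the illustration; they are not Nepomuk's or Telepathy's actual
vocabulary or API. It only shows that the export can be a plain dump of
the internal store into triples, with nothing flowing back.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

# A triple is (subject, predicate, object), as in RDF.
Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class Contact:
    """Hypothetical record held by the internal (non-RDF) contact store."""
    contact_id: str
    display_name: str
    account: str


def export_contacts(contacts: Iterable[Contact]) -> List[Triple]:
    """Turn contacts into triples for a semantic store to consume.

    The data flows one way only: nothing is ever read back from the
    triples, so the internal store remains the single source of truth.
    The "ex:" predicates are placeholders, not a real ontology.
    """
    triples: List[Triple] = []
    for c in contacts:
        subject = f"contact:{c.contact_id}"
        triples.append((subject, "ex:displayName", c.display_name))
        triples.append((subject, "ex:imAccount", c.account))
    return triples


if __name__ == "__main__":
    snapshot = [Contact("1", "Alice", "alice@example.org")]
    for triple in export_contacts(snapshot):
        print(triple)
```

Since the export never reads anything back, it could run periodically
or on demand without any of the consistency concerns listed above.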

That said, I know that each one of those points is potentially
addressable. I'm not saying that this approach isn't possible. I'm
saying that all those pain points combined create trouble. Trouble
will come in the form of slow development pace, weird and hard-to-track
bugs, and performance issues.

So here's what I propose as an architecture:

- Storage-agnostic API at the top level. Candidates: QtContact
(http://doc.qt.nokia.com/qtmobility-1.2/contacts.html), or roll our
own.
- Contacts store. Candidates: libfolks
(http://telepathy.freedesktop.org/wiki/Folks), or roll our own.
- TelepathyQt4 at the bottom.
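
The layering above could look roughly like the following sketch
(Python for brevity; the real stack would be C++/Qt). Every name in it,
`RosterSource`, `ContactStore`, `MergingStore`, `FakeRoster`, and
`ContactApi`, is hypothetical. They stand in for the roles of
TelepathyQt4, libfolks, and QtContact without reproducing any of their
actual APIs. The sketch only shows the top layer depending on an
interface rather than on a particular store.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class RosterSource(ABC):
    """Bottom layer: where contacts come from (TelepathyQt4's role)."""

    @abstractmethod
    def fetch_roster(self) -> Dict[str, str]:
        """Return a mapping of contact id to display name."""


class ContactStore(ABC):
    """Middle layer: aggregates contacts (the role libfolks would play)."""

    @abstractmethod
    def all_contacts(self) -> Dict[str, str]:
        """Return every known contact as id -> display name."""


class FakeRoster(RosterSource):
    """Stand-in source with fixed data, for illustration only."""

    def fetch_roster(self) -> Dict[str, str]:
        return {"alice@example.org": "Alice", "bob@example.org": "Bob"}


class MergingStore(ContactStore):
    """Merges the rosters of several sources into one contact list."""

    def __init__(self, sources: List[RosterSource]) -> None:
        self._sources = sources

    def all_contacts(self) -> Dict[str, str]:
        merged: Dict[str, str] = {}
        for source in self._sources:
            merged.update(source.fetch_roster())
        return merged


class ContactApi:
    """Top layer: the storage-agnostic API applications would use."""

    def __init__(self, store: ContactStore) -> None:
        self._store = store  # any ContactStore implementation will do

    def display_names(self) -> List[str]:
        return sorted(self._store.all_contacts().values())


if __name__ == "__main__":
    api = ContactApi(MergingStore([FakeRoster()]))
    print(api.display_names())
```

Swapping the store or the source means passing a different object to
the constructor; `ContactApi` itself does not change, so each layer
stays replaceable.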

Advantages of this approach (assuming we go for QtContact/libfolks and
do not roll our own solutions):

1) code reuse
2) lots of work already done, tested, and used in production in other projects
3) no lock-in to any of the APIs: each component of the chain is
(in theory) replaceable.
4) no synchronization issues, or any of the other issues previously mentioned
5) no user-visible or "politically-loaded" dependency

Well, that's all for now. Hopefully this will spark some interesting
discussion. I've tried to keep each individual point brief, but I'm
happy to delve into those that are unclear or unconvincing.

Sorry for the long mail. :)

BR,
Paolo

