creating a content system
Aaron J. Seigo
aseigo at kde.org
Wed Aug 10 02:16:58 CEST 2005
hi..
so we have kat which has lots of code.
we have tenor which has lots of design.
what we need is a content system; something that can provide a back end to
things like:
kfind
a content manager app for kde4
media applications (think of all the context stuff in amarok)
content-centric applications (kpdf, kword, etc)
there are four layers to be considered, from bottom to top:
0. storage
1. API
2. population
3. user interface
i think where kat shines right now is that it is addressing #2. the version in
svn right now seems a lot better than the previously released versions
i've tried. that's good. i've said right from the start that #2 is a valid
project in and of itself, really, and something that all of these types of
systems need. i'd like to see us start our collaboration with the population
mechanism.
there are many problems with the current population mechanism in kat. these
include things like:
- catalogs don't have individual stop folders (at least not that i can find)
- it searches hidden folders by default
- it apparently doesn't take into consideration FD.o conventions such as
thumbnail directories (correct me if i'm wrong on that one?)
- it only works on local files?
- it relies on a lot of helper apps; i wonder at the overhead of that
- i'm not sure how things like scheduling work, though i suspect
it could be better
- which leads me to: it needs documentation. i will not support such a
complex system that does not have extensive documentation for its design. API
docu is not enough, though it is VERY nice to see extensive API docu
available.
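
to make the first few points in that list concrete, here is a rough sketch of a crawler that takes per-catalog stop folders, skips hidden folders unless asked, and never descends into a thumbnail directory. it is python purely for illustration and is not kat code: `crawl`, `stop_folders`, `include_hidden` and `THUMBNAIL_DIRS` are made-up names, and treating `.thumbnails` as the thumbnail directory is an assumption about the FD.o convention. like the mechanism it comments on, it only handles local files.

```python
import os

# assumption: directory name used for generated thumbnails
THUMBNAIL_DIRS = {".thumbnails"}


def crawl(root, stop_folders=(), include_hidden=False):
    """Yield paths of files under root that a catalog would index."""
    stops = {os.path.abspath(p) for p in stop_folders}
    for dirpath, dirnames, filenames in os.walk(root):
        kept = []
        for name in dirnames:
            full = os.path.abspath(os.path.join(dirpath, name))
            if full in stops:
                continue  # per-catalog stop folder
            if name in THUMBNAIL_DIRS:
                continue  # generated thumbnails are never content
            if name.startswith(".") and not include_hidden:
                continue  # hidden folders are opt-in, not the default
            kept.append(name)
        # prune in place so os.walk does not descend into skipped folders
        dirnames[:] = kept
        for name in filenames:
            if name.startswith(".") and not include_hidden:
                continue
            yield os.path.join(dirpath, name)
```

each catalog would call `crawl()` with its own `stop_folders`, so excluding a folder from one catalog does not affect the others.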
it doesn't take over the CPU quite like previous versions did for me, however,
and that's a nice stride forward.
this leaves us with the other pieces:
0. storage
sqlite is not a good solution here, IMHO, because:
- it's too slow for doing anything resembling an interesting query
- it's not network aware
the schema should be context centric, not content centric.
- my original schema proposal, which seems to have been swept aside in the
last tenor update into playground, divided the database into two sets:
- contextual linkage
- content indexing
i think we'd be best served by keeping these two separate, since each
requires slightly different semantics when it comes to processing and
subsequent searching
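
a minimal in-memory sketch of that split, assuming the content side is a plain inverted index and the context side is a set of labelled links between resources. this is not the tenor schema and says nothing about the real storage backend; `ContentIndex`, `ContextLinks`, the relation names and the URIs are all hypothetical.

```python
from collections import defaultdict


class ContentIndex:
    """Content indexing: maps each word to the resources containing it."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, resource, text):
        for word in text.lower().split():
            self._postings[word].add(resource)

    def search(self, query):
        """Return the resources that contain every word in the query."""
        words = query.lower().split()
        if not words:
            return set()
        result = set(self._postings.get(words[0], set()))
        for word in words[1:]:
            result &= self._postings.get(word, set())
        return result


class ContextLinks:
    """Contextual linkage: labelled links from one resource to another."""

    def __init__(self):
        self._links = defaultdict(set)

    def link(self, source, relation, target):
        self._links[source].add((relation, target))

    def related(self, source, relation=None):
        """Return link targets of source, optionally for one relation only."""
        return {
            target
            for rel, target in self._links.get(source, set())
            if relation is None or rel == relation
        }


index = ContentIndex()
links = ContextLinks()
index.add("file:/docs/report.pdf", "quarterly sales report")
links.link("file:/docs/report.pdf", "attached_to", "mail:1234")
```

the two sides are written to and queried differently: the index is rebuilt from file contents, while links are recorded as things happen (a file arrives as an attachment, a song gets played), which is one reason to keep them apart.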
1. API
the Kat API needs work. the Tenor API as it was shaping up was really far more
interesting. searching is far, far more than "look for this blob of text" and
API design is a bit of an art. Scott is quite good at this (cf. taglib). with a
dual context/content storage facility, it should be quite possible to design
a search and navigation api that maps both to kat's idea of searching and
tenor's idea of contextualization. i'd take Kat as a prototype consumer of
search and create the Tenor API in a manner which services it.
this means the Kat author(s) need to clearly state their goals for search.
i've seen various terms bandied about, e.g. computational linguistics, which
need to be well scoped for this part of the project.
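
as a very rough illustration of a search call that covers both text matching and contextual constraints, here is a sketch in python. it is neither the Kat API nor the Tenor API: `Resource`, `Query`, `search` and the relation names are invented for the example, and the linear scan stands in for whatever the storage layer would really do.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Resource:
    uri: str
    text: str = ""
    # set of (relation, target uri) pairs describing this resource's context
    links: set = field(default_factory=set)


@dataclass(frozen=True)
class Query:
    text: str = ""                   # words that must all appear in the content
    relation: Optional[str] = None   # optional context constraint
    target: Optional[str] = None     # optional target of that relation


def search(resources, query):
    """Return uris matching the text terms AND the context constraint."""
    words = query.text.lower().split()
    hits = []
    for res in resources:
        have = set(res.text.lower().split())
        if not all(word in have for word in words):
            continue
        if query.relation is not None:
            matches = any(
                rel == query.relation
                and (query.target is None or tgt == query.target)
                for rel, tgt in res.links
            )
            if not matches:
                continue
        hits.append(res.uri)
    return hits


catalog = [
    Resource("file:/a.pdf", "sales report", {("attached_to", "mail:1")}),
    Resource("file:/b.pdf", "sales forecast"),
]
```

a plain text search is then `search(catalog, Query(text="sales"))`, and a contextual one adds a constraint, e.g. `Query(text="sales", relation="attached_to")`. the point is only that one query object can carry both kinds of criteria.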
3. user interface
this can wait. 0-2 need to be done first.
--
Aaron J. Seigo
GPG Fingerprint: 8B8B 2209 0C6F 7C47 B1EA EE75 D6B7 2EB1 A7F1 DB43
Full time KDE developer sponsored by Trolltech (http://www.trolltech.com)