creating a content system

Manuel Amador rudd-o at amautacorp.com
Wed Aug 10 23:04:05 CEST 2005


> moreover, consider when an email is deleted or an address book entry is 
> modified. should the indexer re-index that file? consider if that email is in 
> an mbox file rather than maildir. this is very, very inefficient compared to 
> the application simply saying "ok, this email is now gone."

Yes, it is inefficient, but in the absence of a standard mechanism to do
this, it's at least worth considering.  Even if there were a standard
mechanism, you wouldn't have much success getting it incorporated into,
say, Pine.  So you still need to provide a fallback method for this
corner case (and lots of other corner cases).
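By way of illustration, here is a minimal sketch of such a fallback:
re-index a file only when its mtime is newer than what was recorded at
the last pass.  (This is not Kat's actual code; the lastIndexed cache is
hypothetical.)

#include <sys/stat.h>
#include <ctime>
#include <map>
#include <string>

// Hypothetical cache: file path -> mtime recorded at the last indexing pass.
static std::map<std::string, time_t> lastIndexed;

bool needsReindex(const std::string &path)
{
    struct stat st;
    if (stat(path.c_str(), &st) != 0)
        return false;                      // file vanished: handle deletion elsewhere

    std::map<std::string, time_t>::iterator it = lastIndexed.find(path);
    if (it == lastIndexed.end() || st.st_mtime > it->second) {
        lastIndexed[path] = st.st_mtime;   // remember the new timestamp
        return true;                       // new or changed: re-index it
    }
    return false;                          // untouched since the last pass
}

It's brute force compared to the application simply telling you the
email is gone, but it covers Pine's mboxes and all the other corner
cases that will never send notifications.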


> > >  - it only works on local files?
> >
> > it works on every media you can mount. we would like to extend it to NFS
> > and other protocols as well
> 
> as zack said, the idea here is to let the indexing happen on the NFS server 
> and then bridge between those indexes.

Yep, otherwise you'd completely clog the NFS servers wherever your
indexer is deployed.


> i'd suggest using Poppler, the new xpdf rendering lib. i think most things can 
> be dragged in via libraries.

No no, please don't drag things in as libraries.  That brings dependency
hell into the mix.  Either do things with popen(3), or use .... damn,
the name of that thing escapes me... it's a dynamic-linking helper that
lets you add soft library dependencies to your application.  Basically
you can declare any dependency you want, and at run time it tries to
load each library and tells you which ones did not load.  Sorta like a
proxy library.
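To show what I mean by soft dependencies, here's a tiny sketch using
plain dlopen(3).  It is not the tool I'm thinking of, and the library
name and symbol are made up:

#include <dlfcn.h>
#include <cstdio>

int main()
{
    // Try to load an optional PDF text extractor; failure just disables the feature.
    void *handle = dlopen("libpdfextract.so", RTLD_LAZY);
    if (!handle) {
        std::fprintf(stderr, "PDF support unavailable: %s\n", dlerror());
        return 0;                          // keep indexing the other formats
    }

    // Look up the extraction entry point by name.
    typedef const char *(*extract_fn)(const char *path);
    extract_fn extract = (extract_fn) dlsym(handle, "pdf_extract_text");
    if (extract)
        std::printf("%s\n", extract("/tmp/sample.pdf"));

    dlclose(handle);
    return 0;
}

The point is that a missing library degrades one file format instead of
making the whole indexer uninstallable.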

>  no need for source code tree duping or branches. 
> for html it will be interesting to look at tapping kdom. now, this won't work 
> in every case, but i think it can work a lot more than it currently is. for 
> simple formats like RTF, using an external app also seems a bit gratuitous. 

Just a quick tip: I solved these things with regexps.  HTML parsing is
much faster and more accurate that way.  In theory, using KDom may seem
like the correct route.  In practice, KDom would need extra intelligence
to parse broken HTML files, something that's far simpler to handle with
regular expressions.
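Roughly like this (illustrative only, using std::regex rather than my
exact expressions):

#include <iostream>
#include <regex>
#include <string>

std::string stripTags(const std::string &html)
{
    // Drop anything between '<' and '>', broken markup included,
    // then collapse runs of whitespace so the index sees clean words.
    std::string text = std::regex_replace(html, std::regex("<[^>]*>"), " ");
    return std::regex_replace(text, std::regex("\\s+"), " ");
}

int main()
{
    // Unclosed <b> tag: a strict DOM parser would have to repair this first.
    std::cout << stripTags("<p>Hello <b>world</p>") << std::endl;
}

No tree building, no error recovery, and the words still end up in the
index.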

> 
> but this isn't a design issue, it's an implementation issue. and i completely 
> understand why it is as it is right now: it's pragmatic and quick to get 
> going. fortunately, implementation issues are orders of magnitude easier to 
> address than design issues ;) and we can work on improving the fulltext index 
> plugins over time. i just wouldn't want us to consider them done because they 
> happen to work =)
> 
> > >  - i'm not sure how things like scheduling work, though i'm of the
> > > suspicion it could be better
> >
> > The current scheduler sucks :-D
> > Our team mate Praveen Kandikuppa is working on its replacement based on
> > real load control.
> > This is a part of development where we would like to receive help.

What method are you using to detect system load and adjust to it?
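For instance, one simple option (just an illustration, not necessarily
what Praveen has in mind) is to poll the load average with getloadavg(3)
and back off while the machine is busy; the threshold and sleep interval
below are arbitrary:

#include <cstdio>
#include <cstdlib>
#include <unistd.h>

// Return true while the 1-minute load average exceeds the threshold.
bool systemIsBusy(double threshold = 1.0)
{
    double load[1];
    if (getloadavg(load, 1) != 1)
        return false;                      // can't read the load: assume idle
    return load[0] > threshold;
}

int main()
{
    while (systemIsBusy()) {
        std::printf("system busy, backing off...\n");
        sleep(5);                          // wait before checking again
    }
    std::printf("load is low, indexing can proceed\n");
}

Combining something like that with nice(2) on the indexer process would
be another easy win.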

> 
> ah.. can Praveen start a thread on this list discussing his start on this?
> 
-- 
Manuel Amador                   <rudd-o at amautacorp.com>
http://www.amautacorp.com/            +593 (4) 220-7010

