creating a content system

Manuel Amador rudd-o at amautacorp.com
Thu Aug 11 23:23:46 CEST 2005


On Thu, 2005-08-11 at 07:59 -0600, Aaron J. Seigo wrote:
> On Wednesday 10 August 2005 03:19, Manuel Amador wrote:
> > this is kind of hairy, but plain impossible with per-user daemons.
> 
> it's certainly possible with per-user daemons on the local system.

Oh, well, on the local system it is possible.  But what if the local
daemon is relaying a query to an NFS remote search server?  Is the
remote search server actually running?  Remember: this is per-user...
there is no sysvinit script starting that daemon.

>  they just 
> authenticate against a non-local, non-per-user daemon remotely. the "remote 
> daemon" could simply be an SQL server that responds with data which the local 
> search system then interprets.

I don't think the right answer is to place everything in a remote SQL
server and let the SQL server do the security isolation.  I toyed with
the idea, but since this is a problem with domain-specific
requirements, it seemed better to let search daemons work in a
peer-to-peer fashion, one daemon per system, relaying queries among
themselves when appropriate.

> 
> here are the steps necessary:
> 
> 0. storage
> 	use a proper database server and you get network access to the data
> 1. population
> 	this happens locally and gets plopped into a database
> 		given a good choice for #0 this database can be multi-user or user-specific

yes, it could be multi-user, but given the state of database daemon
integration in current operating systems (I'm talking mostly about
Linux distributions), there is no way for this to work seamlessly.

By seamless, here is what I'm proposing:

0. storage
	use a local store that is most appropriate for the queries you
	are going to build.  An SQL store is definitely not appropriate
	for full-text search and the like.  Do NOT expose the store via
	the network.  Store it in /var/whatever.
1. population
	a system daemon indexes all available data: removable media,
	physically local mounted media.  At index time, plop as much as
	you can into the most query-efficient format.  Plop
	user-domain data separately if you choose.
2. querying
	applications contact the local daemon via a standard IPC
	mechanism, using a query language suited to the purpose.
	The daemon queries the data, relaying queries to NFS servers as
	appropriate via the same IPC mechanism, and relaying results
	back.  Daemons filter what users can see based on access(2),
	or using domain-specific algorithms.
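To illustrate the filtering half of the querying step: a minimal
sketch of how a daemon could drop hits the requesting user isn't
allowed to read, using os.access() (Python's wrapper around
access(2)).  The function name is mine, and note the caveat in the
docstring about whose credentials the check uses:

```python
import os

def filter_results(paths, mode=os.R_OK):
    """Return only the hits the requesting user may actually read.

    Caveat: os.access() checks against the *real* uid of this
    process, so a daemon running as root would first have to switch
    to the requesting user's credentials (or use faccessat() with
    AT_EACCESS).  This sketch assumes the check already runs with
    the user's own uid.
    """
    return [p for p in paths if os.access(p, mode)]
```

A hit whose file has since vanished, or which the user cannot read,
simply falls out of the result list before it crosses the IPC
boundary.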

It's trivial to know which servers to relay queries to: /etc/mtab.
It's trivial to know which clients to accept queries from:
/etc/exports.  Added security?  Bolt on SSH, like the NX guys did.
Works perfectly.
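A sketch of that discovery step, parsing mtab- and exports-formatted
text (taken as strings here so the logic stands alone; the helper
names are mine):

```python
def nfs_servers(mtab_text):
    """Extract the NFS server hostnames from mtab-format text.

    An mtab line looks like:
      host:/export  /mnt/point  nfs  rw,hard  0 0
    """
    servers = set()
    for line in mtab_text.splitlines():
        fields = line.split()
        # fstype is the third field; "host:/path" marks a remote mount
        if len(fields) >= 3 and fields[2].startswith("nfs") and ":" in fields[0]:
            servers.add(fields[0].split(":", 1)[0])
    return servers

def allowed_clients(exports_text):
    """Extract the client specs from exports-format text.

    An exports line looks like:
      /srv/data  192.168.1.0/24(rw,sync)
    """
    clients = set()
    for line in exports_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        # everything after the export path is "client(options)"
        for spec in line.split()[1:]:
            clients.add(spec.split("(", 1)[0])
    return clients
```

The daemon would feed these the contents of /etc/mtab and
/etc/exports respectively: relay outgoing queries to every host in
nfs_servers(), and accept incoming ones only from allowed_clients().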
	
Advantages:

a) one indexing, one pass, one store = efficiency
b) seamless deployment
c) network awareness
d) transparent to the user
e) zero admin overhead

Disadvantages:

a) index data is all in one big blob, without regard
   to security.  Minimize this by forking and setuid'ing
   to the owner of the file or data you are indexing.
b) the search daemon is vulnerable.  It should run as an
   unprivileged account.
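The fork-and-setuid mitigation in (a) looks roughly like this.  The
function name and index_fn callback are placeholders, and the real
setuid() only succeeds when the daemon starts as root; run as a
normal user it can only "drop" to its own uid:

```python
import os

def index_as_owner(path, index_fn):
    """Index `path` in a forked child that drops privileges to the
    file's owner, so the indexer can never read data that the
    owner itself could not.
    """
    st = os.stat(path)
    pid = os.fork()
    if pid == 0:                      # child: shed privileges, then index
        os.setgid(st.st_gid)          # drop group first, then user
        os.setuid(st.st_uid)
        index_fn(path)
        os._exit(0)
    _, status = os.waitpid(pid, 0)    # parent: reap the child
    return os.waitstatus_to_exitcode(status) == 0
```

Forking per file is expensive; a production indexer would batch all
of one user's files into a single privilege-dropped worker, but the
isolation boundary is the same.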

Now, as you can see, this covers both the single-user and the corporate
deployment use cases.

Doing this with a per-user SQL database would be nearly impossible.
In your case, requiring admins/users to manually configure database
clusters and users, in addition to their normal chores, would be a
showstopper.

Naturally, for a first iteration of Tenor, or Kat, or whatever, it's
good that we have a proof of concept and working technologies to
address the most common use case (the single user).  But in my
opinion, the considerations above should be taken into account so
that later iterations of your toolset serve most or all use cases.


> 2. index retrieval
> 	this part should be obvious by now ;)
> 
-- 
Manuel Amador                   <rudd-o at amautacorp.com>
http://www.amautacorp.com/            +593 (4) 220-7010

