rdbms file system

Tue Mar 25 03:51:45 GMT 2003

On Monday 24 March 2003 10:26 pm, Michael S. Mikowski wrote:
> Is anyone investigating an extensible rdbms file system for kde or linux in
> general? Here's what I mean:
>
> * rdbms       = relational database management system .
> * file system = data is broken into 'bulk' and metadata components
> * extensible  = metadata can be defined per application through means of a
>                 template.

Here is a more concrete example of the rdbms file system concept -- referred to
here as MetaFS.  Again, please let me know if there is interest, or if this
is redundant with existing frameworks.  As noted before, I have a "reference"
implementation of this type of framework which can be reviewed today.

==== <example> =====
There is a fundamental problem in sharing data between apps.  Consider the 
email / datebook / address suite as an example: 

Today, when an app outside of the suite wants a contact, the user is often 
forced to open up the address book to get it. In the case of KDE,
kaddressbook might be opened for you. Or, in some cases (e.g. kpilot), the 
data file is parsed in the background (one would suspect there is a 
kaddressbook api which has been accessed in this case). 

But these all seem like hacks.  In all cases, the full addressbook must be 
parsed before any action can be taken on it. Because it is so commonly 
required, wouldn't it be better if that data were available through some kind 
of daemon? The MetaFS could provide this data at all times to all apps, 
searchable and randomly accessible. Ok, you say, we have LDAP, so that solves
part of the problem. 

But what about other file types?  Let's say, for example, emails.  Isn't it
silly to have an email client which presents all these messages in a 'file 
tree' format, yet one can't interact with them like normal files? In an 
abstract way, emails look like 'normal' files: metadata + bulk data (The 
metadata is the header fields like "to, from, cc," and perhaps the body text; 
the bulk data would include attachments). 

When you are done reading an email, why shouldn't you be able to put it in 
your file system? If you have extensible metadata, you can. And any other app 
could search and randomly access individual emails as required at any time. 
Imagine opening attachments in koffice directly from your email tree. Or 
interrogating a file in your home directory to see who you received it from,
and on what date.
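
To make the "who did I receive this file from" question concrete, here is a
minimal sketch using an in-memory SQLite database as a stand-in for the MetaFS
daemon. The table layout, the field names (received_from, received_on), and
the store() and lookup() helpers are all hypothetical; they are not taken from
the reference implementation.

```python
import sqlite3

# Stand-in for the MetaFS database. The schema is an assumption for
# illustration: one row per stored object, plus key/value metadata rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE object (id INTEGER PRIMARY KEY, bulk_path TEXT NOT NULL);
    CREATE TABLE metadata (
        object_id INTEGER NOT NULL REFERENCES object(id),
        field     TEXT NOT NULL,
        value     TEXT NOT NULL
    );
""")


def store(bulk_path, **fields):
    """Record where the bulk data lives and attach its metadata fields."""
    cur = conn.execute("INSERT INTO object (bulk_path) VALUES (?)", (bulk_path,))
    conn.executemany(
        "INSERT INTO metadata VALUES (?, ?, ?)",
        [(cur.lastrowid, name, value) for name, value in fields.items()],
    )
    return cur.lastrowid


def lookup(bulk_path, field):
    """Answer 'what is <field> for this file?' without opening the file."""
    row = conn.execute(
        """SELECT m.value FROM metadata m
           JOIN object o ON o.id = m.object_id
           WHERE o.bulk_path = ? AND m.field = ?""",
        (bulk_path, field),
    ).fetchone()
    return row[0] if row else None


# An attachment saved out of an email keeps the email's header metadata.
store("/home/user/report.sxw",
      received_from="alice@example.org", received_on="2003-03-24")
print(lookup("/home/user/report.sxw", "received_from"))
```

Any application that can reach the database could ask the same question; none
of them needs to know how the mail client stores its folders.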

Other examples might include a scanning application that stores the scan
settings with the file. Or a digital photography app which includes the
photographer's name and picture date.

Of course, one wouldn't want to get too crazy with the metadata.  As proposed, 
there would be an extensible listing of permissible metadata across the apps, 
and the MetaFS would ensure its clients provide only valid metadata 
parameters and values. Think of all the dot file kludges we use today to
support such metadata -- these data are inconsistent (across apps) and
very difficult to search.  This approach would ensure metadata fields mean 
the same thing across all apps!
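
The validation step could look roughly like the sketch below. It assumes a
shared registry that maps each permitted field name to a check on its value;
REGISTRY, register_field() and validate() are hypothetical names chosen for
illustration, and the two sample fields come from the photography example
above.

```python
import re

# Hypothetical shared registry: field name -> function that accepts or
# rejects a value. Every client application sees the same definitions.
REGISTRY = {
    "photographer": lambda v: isinstance(v, str) and v.strip() != "",
    "picture_date": lambda v: isinstance(v, str)
    and re.fullmatch(r"\d{4}-\d{2}-\d{2}", v) is not None,
}


def register_field(name, validator):
    """Extend the list of permissible fields; redefinition is refused."""
    if name in REGISTRY:
        raise ValueError(f"field already defined: {name}")
    REGISTRY[name] = validator


def validate(metadata):
    """Return a list of problems; an empty list means the metadata is valid."""
    problems = []
    for name, value in metadata.items():
        check = REGISTRY.get(name)
        if check is None:
            problems.append(f"unknown field: {name}")
        elif not check(value):
            problems.append(f"invalid value for {name}: {value!r}")
    return problems


# A scanning application adds its own field through the same mechanism.
register_field("scan_dpi", lambda v: isinstance(v, int) and v > 0)
```

Refusing to redefine an existing field is one simple way to keep a field's
meaning stable once several applications depend on it.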

The MetaFS concept should be much more efficient than current solutions. By
splitting metadata apart from the bulk data, portions of the metadata can be
held memory resident while all bulk data remains on disk. Also, since we
are sharing the MetaFS across multiple applications, there need be only one
image of this data memory resident at any given time, instead of multiple
images held by multiple apps. Since the database is always running, there is
no startup or parsing overhead. And, of course, one can use SQL-like calls to
answer all sorts of questions "natively" on the data, with random access, from
any application which uses the MetaFS.
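
The metadata / bulk split can be illustrated with a toy store, sketched below
under simplifying assumptions: a plain dictionary stands in for the memory
resident metadata, ordinary files in a temporary directory stand in for the
bulk data, and the MetaStore class and its method names are hypothetical. The
bulk_reads counter exists only to show that a search never touches the disk.

```python
import os
import tempfile


class MetaStore:
    """Toy store: metadata stays in memory, bulk data stays on disk."""

    def __init__(self, root):
        self.root = root
        self.index = {}      # name -> metadata dict, memory resident
        self.bulk_reads = 0  # number of times bulk data was read from disk

    def put(self, name, bulk, **metadata):
        """Write the bulk bytes to disk and keep only metadata in memory."""
        with open(os.path.join(self.root, name), "wb") as f:
            f.write(bulk)
        self.index[name] = metadata

    def search(self, **criteria):
        """Match on metadata only; the bulk files are never opened."""
        return sorted(
            name
            for name, meta in self.index.items()
            if all(meta.get(k) == v for k, v in criteria.items())
        )

    def read_bulk(self, name):
        """Fetch the bulk data for one object, on demand."""
        self.bulk_reads += 1
        with open(os.path.join(self.root, name), "rb") as f:
            return f.read()


with tempfile.TemporaryDirectory() as root:
    store = MetaStore(root)
    store.put("msg1", b"...large attachment...", sender="alice", kind="email")
    store.put("msg2", b"...another body...", sender="bob", kind="email")

    hits = store.search(sender="alice")  # answered from memory alone
    reads_after_search = store.bulk_reads
    payload = store.read_bulk(hits[0])   # disk is touched only here
```

A real daemon would answer search() with SQL rather than a dictionary scan,
but the division of labour is the same: queries run against the small
resident part, and bulk data is read only for the objects actually wanted.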

==== </example> ====

The MetaFS concept is a continuation of the Unix philosophy where 'everything 
is a file.' It just requires a more flexible file system to accommodate it.

Here is a mission statement: 
==========================================================================
We have a group of applications among which we would like to share data. We
would like to randomly access file data and metadata in a fast and consistent
manner. We must be able to search by and use metadata using sql or sql-like
calls. The metadata is not typically supported by the file system. Access to
the data should always be available through an open API provided by a daemon.
The metadata fields must be restricted to a known (but extensible) list which
defines the meanings of fields across applications. The daemon would validate
metadata prior to storing data. Versioning of data would be supported.
==========================================================================