Fwd: follow up from our talk / response to your paper
Scott Wheeler
wheeler at kde.org
Thu Oct 21 02:55:00 CEST 2004
Hmm, tried sending this half an hour ago and it never went through, trying
again.
-----------------------------------------------------------
Hi folks -- I'm now back from my vacation in Chile. I ended up doing one talk
on this stuff at one of the universities there, and as it turned out Ricardo
Baeza (author of Modern Information Retrieval) is a professor at another
university in Santiago (Universidad de Chile) and invited me out for coffee
while I was there.
He recommended one of his papers to me and asked that I respond to it. I've
put a copy of it here:
http://developer.kde.org/~wheeler/files/baeza-1.pdf
The stuff below is my response to the paper and our conversation.
I'm just starting to get back into working on KLink and whatnot. I'd like to
set up a mailing list so that I don't have to dig out the list of CC's in the
future. The list would be public, but I don't have any plans on announcing
it as I'd like to keep it fairly quiet at the moment. Is there anyone
currently getting this that would prefer to not be added to the list?
Cheers,
-Scott
---------- Forwarded Message ----------
Subject: follow up from our talk / response to your paper
Date: Tuesday 19 October 2004 13:25
From: Scott Wheeler <wheeler at kde.org>
To: rbaeza at dcc.uchile.cl
Hi Ricardo --
Ok -- so I finally am getting settled back into my normal life and got around
to reading your paper. As you probably expected, there are many points which
I find interesting and probably worth some commentary.
First, I'll start with what's in fact clear and common in the things that
we've been looking at. We're both looking at ways to move the desktop
towards a search centric interface -- and what some of the steps are in
making that meaningful. We're both also focusing on relationships between
sets of attributes and "data" (though you're working more on blurring those
lines). The other thing that really resonated with me was the point, repeated
a few times, that the current method of organizing information based entirely
on arbitrary bits of a user's memory is completely broken.
I've repeated something very similar in my recent talks.
I guess there are a number of observations that seem relevant -- some
academic, some practical -- and I'm probably freely mixing concepts from
different parts of the paper. Hopefully I'll manage with enough clarity to
be useful.
=== Approaches ============================================
I think the first notable difference is that we're starting from different
places. One of the things that you're explicitly building on is the idea of "what if
we could throw everything away [...]". I think a lot of the ideas there
hinge on getting things right at the lower layers.
From my side I've worked backwards from the interface. I've taken the
perspective of "What do I want to see in the interface?" and worked backwards
from that question and towards building the necessary bits of technology to
make something like that interface possible.
I've also focused on the idea of relationships of information, because at
least for the set of interface ideas that I'm working with that's the most
crucial element. Of course from an implementation standpoint the system that
I'm working on maps more naturally to an AVS than an HFS, but because I can't
work from the perspective of being able to throw everything away, I'm working
at a higher level and looking at what can be bolted onto the current HFS
structures that will make contextual navigation and search more natural.
=== Domains ===============================================
Because I'm working with a web analogy -- represented as a graph -- there's
also the notion of domains, called a NodeGroup in the current API, that
represents a similar construct.
However, one of the things that I've been thinking about there -- and this
applies to domains as well -- is whether that's best left as an emergent
property of a graph of connected information rather than an explicit grouping. I'm not
even sure if that's possible or practical in the set of applications that I'm
looking at, but insofar as I've been attempting to use the WWW as an analogy
to the infrastructure that I'm creating, I'm tempted to say that domains are
almost ready to be deprecated.
I think more of what we find on the web is dynamic groupings based on
contextual linkage that tend to define much looser "domains". The
notion of a grouping of information is still useful in the abstract, but I'm
not yet certain that explicit grouping is an idea that's going to stay
around indefinitely. At least on the web at the present moment, explicit
domains have largely become useless, whereas tightly coupled bits of
information that emerge from a set of relationships much more often
represent a conceptual "domain" for a given set of information than an
explicit grouping does.
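As a rough illustration of a "domain" emerging from linkage rather than being
declared, here is a minimal Python sketch. It is not KLink code: the node
names, the weights and the 0.5 threshold are made up for the example, and
connected components over strong links is only one simple stand-in for
"tightly coupled".

```python
from collections import defaultdict


def emergent_groups(links, threshold=0.5):
    """Group nodes that are connected by links at or above `threshold`.

    `links` is an iterable of (source, target, weight) tuples. Direction is
    ignored here; weak links are treated as if they did not exist.
    """
    neighbours = defaultdict(set)
    nodes = set()
    for source, target, weight in links:
        nodes.update((source, target))
        if weight >= threshold:
            neighbours[source].add(target)
            neighbours[target].add(source)

    groups, seen = [], set()
    for start in sorted(nodes):
        if start in seen:
            continue
        # Depth-first walk collecting everything reachable over strong links.
        group, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbours[node] - group)
        seen |= group
        groups.append(group)
    return groups


# Hypothetical data: one weak link (0.1) separates two clusters.
links = [
    ("mail:alice", "doc:report", 0.9),
    ("doc:report", "img:chart", 0.8),
    ("img:chart", "song:track1", 0.1),
    ("song:track1", "song:track2", 0.7),
]
groups = emergent_groups(links)
```

No group is ever named or stored; the grouping falls out of the links and
changes as soon as the links do.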
I think our current representations of Domains (or for us NodeGroups) solve
similar problems -- they represent logical groupings of information -- we've
introduced them to solve things like mime-type associations or other
information that's necessary, but I've already been wondering if the need for
such is indicative of weaknesses elsewhere in the framework.
This is one that I'm still not sure about, but I'd be interested in your
opinion.
=== Domains & Documents vs. Objects =======================
Another thing that I wondered about in your paper was why, given the starting
point -- that everything could be thrown out -- and the desire to move away
from arbitrary constructs, there was still a separation between documents
and domains.
Was there a reason to not simply use a generalized "Object" abstraction where
an attribute of an object could be another object or list of objects?
This is where my thoughts tended when I was thinking about how I could map
some of my own ideas onto generalized domains (more on that in a moment) and
found that I would want domains to have properties. So -- in a system where
all assumptions can be thrown out, what is the advantage of differentiating
between documents and domains?
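To make the question concrete, here is a minimal Python sketch of such a
generalized "Object". The class name `Obj`, the `members` helper and the
sample attributes are hypothetical, chosen only for illustration; nothing
here comes from the paper or from KLink.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union

# An attribute value is a plain value, another object, or a list of objects.
AttributeValue = Union[str, int, "Obj", List["Obj"]]


@dataclass
class Obj:
    name: str
    attributes: Dict[str, AttributeValue] = field(default_factory=dict)

    def members(self) -> List["Obj"]:
        """Return the objects that this object refers to via its attributes."""
        found: List["Obj"] = []
        for value in self.attributes.values():
            if isinstance(value, Obj):
                found.append(value)
            elif isinstance(value, list):
                found.extend(v for v in value if isinstance(v, Obj))
        return found


# A "document" is an object with only plain attributes ...
letter = Obj("letter.txt", {"author": "scott"})
photo = Obj("photo.jpg")
# ... and a "domain" is an object whose attributes hold other objects.
trip = Obj("chile-trip", {"topic": "vacation", "items": [letter, photo]})
```

In this sketch the document/domain distinction is not a type, just a
question of what an object's attributes happen to contain.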
=== Lack of Addressability ================================
When looking at how to build a useful search based interface framework I
stumbled across a few issues, which led to other issues, and so on.
Initially I was working with two different conceptual problems -- two things
that I thought were missing on the modern desktop: useful search and
linkage. After a bit of thought and talking to others at the first
conference where I presented some of these ideas it became clear that these
were really part of the same problem.
At least from my (admittedly small) knowledge of modern web-based search
systems, notably Google, relevance is seen as an emergent property of context
-- in a nutshell, something's more important if it has a bunch of related
things pointing to it. This idea seemed fairly natural to me and immediately
pointed to one of the other deficiencies (specifically one that I was already
interested in) on the modern desktop: there is no generalized way to link
information on the desktop even in the somewhat crude way that we're able to
on the WWW. On the other hand the idea of linked and related information
makes a lot more sense on the desktop than it even does on the WWW. So, what
do we need to get there?
Well, the first thing that was missing was an idea of resource addressability.
We needed some way to point from a specific place in one resource to a
specific place in another.
Once there was an abstraction for addressability, it became pretty easy to
build the idea of directional links between two addresses. These will be
weighted in terms of how they're generated. The idea in such a system is
that there will be a mix of types of links -- some gathered from metadata
that points back to its source, some from explicit user or developer
connection and some gathered based on usage patterns.
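The address and link abstractions described above could be sketched roughly
as follows in Python. This is an illustration only, not the KLink API: the
names `Address`, `Link` and `LinkOrigin`, the URL-plus-location form of an
address, and the numeric weights are all assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class LinkOrigin(Enum):
    METADATA = "metadata"   # gathered from metadata pointing back to a source
    EXPLICIT = "explicit"   # created by a user or developer
    USAGE = "usage"         # inferred from usage patterns


# Assumed weights: how much a link is trusted, by how it was generated.
DEFAULT_WEIGHT = {
    LinkOrigin.EXPLICIT: 1.0,
    LinkOrigin.METADATA: 0.6,
    LinkOrigin.USAGE: 0.3,
}


@dataclass(frozen=True)
class Address:
    resource: str        # which resource, e.g. a URL
    location: str = ""   # a specific place inside that resource, if any


@dataclass(frozen=True)
class Link:
    source: Address      # links are directional: source -> target
    target: Address
    origin: LinkOrigin

    @property
    def weight(self) -> float:
        return DEFAULT_WEIGHT[self.origin]


# Hypothetical addresses: a line in a text file pointing at a mail message.
here = Address("file:///home/user/notes.txt", "line=12")
there = Address("imap://mail/inbox/42")
link = Link(here, there, LinkOrigin.EXPLICIT)
```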
Once that's there, a lot of other things come for free; the original idea was
to build such a system to make search more feasible, but as a side effect we
also get the ability to link information across applications and documents.
With the goal of breaking down current interface hierarchies this is a nice
bonus.
But of course this also gives us a place to start building a search algorithm
that uses graph traversal and path lengths / weights to determine relevance.
At least that's the idea. Get all of the information we can into a
graph-like structure (likely stored in a relational database, but that's just an
implementation detail) and do cool things with it.
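One possible reading of "graph traversal and path lengths / weights" is
sketched below in Python. It is an assumption, not the algorithm planned for
KLink and not how Google ranks pages: a node's relevance to a starting point
is taken to be the strength of the strongest path to it, where a path's
strength is the product of its link weights. The function name, the
`min_score` cutoff and the sample graph are made up for the example.

```python
import heapq


def relevance(graph, start, min_score=0.05):
    """Rank nodes reachable from `start` by their strongest path.

    `graph` maps a node to a list of (neighbour, weight) pairs with weights
    in (0, 1]. Because weights never exceed 1, longer paths can only get
    weaker, so a Dijkstra-style search (best score first) is valid.
    """
    best = {start: 1.0}
    heap = [(-1.0, start)]  # negate scores: heapq is a min-heap
    while heap:
        neg_score, node = heapq.heappop(heap)
        score = -neg_score
        if score < best.get(node, 0.0):
            continue  # stale entry; a stronger path was already found
        for neighbour, weight in graph.get(node, []):
            candidate = score * weight
            if candidate >= min_score and candidate > best.get(neighbour, 0.0):
                best[neighbour] = candidate
                heapq.heappush(heap, (-candidate, neighbour))
    del best[start]
    return sorted(best.items(), key=lambda item: -item[1])


# Hypothetical graph: the paper is reachable by two different paths.
graph = {
    "query:chile": [("mail:ricardo", 0.9), ("doc:slides", 0.5)],
    "mail:ricardo": [("doc:paper", 0.8)],
    "doc:slides": [("doc:paper", 0.9)],
}
ranked = relevance(graph, "query:chile")
```

Here the paper outranks the slides even though it is two hops away, because
the path through the mail is made of stronger links.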
Now, going back to the original point of this section -- it seems that the
idea of addressability is missing in your paper. This may be seen as
something that's above the layers that you're interested in, but it has been
pretty fundamental to my thinking about how to build user interfaces on top
of this kind of system.
Beyond that, linkage also seems to be missing. (Again, possibly for similar
reasons.) Domains are similar in many ways -- but this is where I got back
to the idea of domains needing properties, so that a link could be mapped as
a domain (i.e. a domain with two members and some attributes describing the
relationship).
=== Network Issues ========================================
One of the things that I think is a problem with both of our systems -- at
least as I see them -- is how to deal with networks and moving resources
across networks without destroying the context that they typically exist in.
In one sense, if we can assume network persistence, we could simply address
objects across the network and have references to them stored in our local
resource store. That's at least the best solution that I've been able to
come up with at the moment.
However, since I've still got the backdrop of traditional files, there's
still *something* that the users can do with the data once it's been moved
across a network. This would seem to break down even more with the idea of
a purely "document composed of attributes" system -- especially since
those attributes aren't particular to a specific document.
I guess the entire group of related objects could be sent across in some kind
of encapsulated transport or something, but here I'm just throwing around
ideas.
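Purely as a sketch of what "some kind of encapsulated transport" might look
like, the Python below packs one node, its direct neighbours and the links
between them into a single JSON payload. The bundle layout, the function
names and the sample data are hypothetical, and limiting the bundle to
direct neighbours is an arbitrary choice made to keep the example small.

```python
import json


def export_bundle(root, links, attributes):
    """Serialize `root`, its directly linked nodes and those links as JSON.

    `links` is a list of dicts with "source", "target" and "weight" keys;
    `attributes` maps a node name to a dict of its attributes.
    """
    kept_links = [l for l in links if root in (l["source"], l["target"])]
    nodes = {root}
    for link in kept_links:
        nodes.update((link["source"], link["target"]))
    bundle = {
        "root": root,
        "nodes": {name: attributes.get(name, {}) for name in sorted(nodes)},
        "links": kept_links,
    }
    return json.dumps(bundle, sort_keys=True)


def import_bundle(payload):
    """Decode a payload produced by export_bundle back into a dict."""
    return json.loads(payload)


# Hypothetical data: the song-to-song link does not involve the root,
# so it is left out of the bundle.
links = [
    {"source": "doc:paper", "target": "mail:ricardo", "weight": 0.8},
    {"source": "doc:slides", "target": "doc:paper", "weight": 0.9},
    {"source": "song:track1", "target": "song:track2", "weight": 0.7},
]
attributes = {"doc:paper": {"title": "baeza-1.pdf"}}
received = import_bundle(export_bundle("doc:paper", links, attributes))
```

The receiving side gets the resource together with a slice of its context,
though how to merge that slice into an existing local graph is left open.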
This is one of the rather nasty issues that's stuck around for me at the
moment; if you've got some sort of magical solution to such, I'd certainly be
interested in hearing it. :-)
===========================================================
Ok -- so those were the major groups of issues that jumped out at me. One
other person from the KDE project and I have worked out an informal design
document, which I'll probably formalize and kind of structure in a paper-like
way (though if I publish anything on such it would probably be in a "light"
format for the Linux-geared press, since that's really the only thing that
I'm connected to at the moment).
Anyway -- looking forward to your thoughts if you have time. Again thanks for
meeting with me in Santiago.
Cheers,
-Scott
-------------------------------------------------------