[Owncloud] Scaling considerations
Aggelos Economopoulos
aoiko at cc.ece.ntua.gr
Sun Jul 4 15:05:25 UTC 2010
On 29/06/2010 08:15 PM, Tobias Hunger wrote:
> Hi!
>> It struck me as odd that I couldn't find a discussion of the scaling
>> issues involved. Maybe I didn't look hard enough, maybe much of the
>> discussion takes place on irc or perhaps you have defined the problem
>> away :)
>
> Yep, there could be more information on the project homepage about this
> (and any other issue:-)
Yes, definitely :) But discussing the issues is a necessary first step
in that direction.
>> In any case, I'm concerned about file sharing between friends and what
>> would happen if I was running owncloud on my home box (or even some server
>> somewhere) and some of my files became too popular for my connection. As
>> things are, I'd stand a good chance of getting (accidentally and without
>> warning) DDoS'ed out of the internet.
>
> Is that really that big a problem I wonder? You are inviting people to view
> your stuff after all, so you can just ask them to stop again:-)
Well, I invite 5 people to take a look at this file, but these people
find it so interesting that they point their friends to it as well. One
of those people posts the link on some forum somewhere. Suddenly I get a
SYN storm from people that I don't know and of course can't contact
except by responding with a static error page (if that :)
> This does look a bit different if you host content that is
> visible to everybody... but then running owncloud is no different
> than running any other web server on a server at your home. Just don't
> get anything hosted there mentioned on slashdot:-)
Yes, true. But since I am building up trust relationships as part of
owncloud, this is a huge opportunity to make use of those relationships.
E.g. my well-connected owncloud instance could automatically offer to
mirror popular files of my immediate and/or trusted friends (or even
files of their friends etc). This could be push-based. I am not claiming
that this is the best approach and I'd be very interested in other
suggestions.
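
To make the push-based idea a bit more concrete, here is a minimal sketch of
how a well-connected instance might decide which files to offer to mirror.
Everything in it is an assumption made for illustration: the `reports`
structure (hit counts that friends' instances would share), the `trusted`
set, and the `threshold` are hypothetical and are not part of owncloud.

```python
def files_to_offer(reports, trusted, threshold=100):
    """Pick popular files of trusted friends that we could offer to mirror.

    reports   -- hypothetical: {friend: {file_path: hit_count}}
    trusted   -- hypothetical: set of friend names we are willing to mirror for
    threshold -- hypothetical: minimum hit count that counts as "popular"
    """
    offers = []
    for friend in sorted(reports):
        if friend not in trusted:
            continue  # only mirror for friends we trust
        for path, hits in sorted(reports[friend].items()):
            if hits >= threshold:
                offers.append((friend, path))
    return offers


if __name__ == "__main__":
    reports = {
        "tobias": {"talk.ogg": 250, "notes.txt": 3},
        "stranger": {"big.iso": 9000},
    }
    print(files_to_offer(reports, trusted={"tobias"}))
```

How the hit counts get exchanged, and whether the offer is accepted, is left
open here; the sketch only shows the selection step.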
You're of course absolutely right that one can define the problem away.
However, I think that you shouldn't. After all, you're competing against
proprietary systems that offer virtually unlimited bandwidth (and mostly
sufficient amounts of space and computation power) for "free". But more
importantly, such an architecture connects the ability to reach a lot of
people with the financial ability to pay for the resources (see below).
You could argue that the existing approach of finding somebody who wants
to "sponsor" your ideas/expression/whatever and can provide the resources
is good enough, but IME there are issues. The most important one is
that when there is later a clash of ideas, it is kinda hard to move your
net presence elsewhere.
>> Secondly, I would consider it very important to allow for mobility.
>
> Moving your data is hard with a web-server based approach. Basically you need
> to move the data and then ask everybody to update their URLs.
Depends on the scenario. If I'm changing hosting providers (with my home
machine potentially being one of the providers), I could just leave
behind a (machine interpreted) redirect and people would lazily update
to my new location url. But say my account has been shut down by the
hosting provider (e.g. because of some policy rules). Then I could use
my private key and a backup of my friends list/roster to credibly notify
them of the url change (you could get more clever here).
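
A rough sketch of the first scenario (lazy updates via a machine-interpreted
redirect) follows. It is only an illustration under stated assumptions: the
`redirects` table stands in for whatever the old host would serve (e.g. an
HTTP permanent redirect), and `contacts`, `resolve` and `refresh_contacts`
are hypothetical names, not owncloud APIs. The signed-notification scenario
is not covered, since it depends on the key management questions below.

```python
def resolve(url, redirects, max_hops=5):
    """Follow redirects left behind by old hosts and return the final URL."""
    seen = {url}
    hops = 0
    while url in redirects:
        if hops == max_hops:
            raise ValueError("too many redirects")
        url = redirects[url]
        if url in seen:
            raise ValueError("redirect loop at " + url)
        seen.add(url)
        hops += 1
    return url


def refresh_contacts(contacts, redirects):
    """Lazily update stored friend URLs in place; return who has moved."""
    moved = []
    for name, url in list(contacts.items()):
        new_url = resolve(url, redirects)
        if new_url != url:
            contacts[name] = new_url
            moved.append(name)
    return moved


if __name__ == "__main__":
    redirects = {"http://old.example/aggelos": "http://new.example/aggelos"}
    contacts = {
        "aggelos": "http://old.example/aggelos",
        "tobias": "http://t.example/",
    }
    print(refresh_contacts(contacts, redirects), contacts)
```

The hop limit and loop check are there because a chain of former hosts could
otherwise send a client in circles.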
> A more "cloudy" approach would include things like encrypting data, applying
> a globally unique ID to each piece of data and uploading it into some form of
> storage network. P2P comes into this, too, considering ownCloud's "everybody
> can run their own server" idea...
Well, I don't know about encrypting... my concern was mostly about
public files. If some files are encrypted for a group of people, I think
it's highly unlikely that the group is large enough and the files are
*that* big that computing/networking resources are going to be an issue.
> There are proposals on encrypted storage in the owncloud wiki, covering (parts
> of) this. Please comment on them!
We can all start proposing our favorite architectures, but IMHO that's a
bit backwards. The more immediate questions have more to do with what we
want out of the system.
Do we want it to scale without requiring the user to throw resources at
the problem?
Do we want the identity to be in the hands of the user (so that a
malicious hosting provider can only temporarily inconvenience its users)?
If we end up using keypairs, how do we handle key management (that's
probably the most interesting question)?
Do we care about forgery?
Do we want to enable end-to-end crypto between arbitrary users or
between "friends"?
Do we want to allow multiple master copies that can get out of sync?
Personally, I'd want a system where I don't have to pay for resources in
proportion to my popularity -- if I did, this would be a big incentive
to try to monetize my popularity. I don't think I need to remind anyone
interested in owncloud that system architecture is not apolitical.
In addition, I'd like to be able to use a friend's server w/o trusting
them with my identity. Even more so, I'd like to be able to provide
hosting for friends' internet presence _without_ them having to trust me
absolutely. This does not necessarily mean that all data on the server
should be encrypted.
Not all data is equally valuable anyway and I'd like to be able to
access some of it from a machine I don't trust.
As for securely exchanging data between users, I think it's something
worth doing but I don't see that as a priority. The social problems with
pushing adoption of solutions like owncloud are big enough as it is.
/Requiring/ encrypted data exchange is not realistic at this point IMHO.
Allowing for it is a worthy goal though.
Diverging versions of the data are another "interesting" problem. Not
sure how important that is.
My point is, we need to be clear about the requirements before we
discuss specific architecture and implementation issues.
That said, my approach to encrypted file storage would be much closer to
the first proposal (sorry Tobias :)
So, what requirements do people consider necessary? Keep in mind the
obvious tradeoff between usefulness/coolness of features on the one hand
and the implementation effort and social acceptance/usability issues on
the other.
Aggelos