[Owncloud] Syncing and Time Synchronization

Jono jono at foodnotblogs.com
Thu Jul 19 11:09:32 UTC 2012


I have to be honest, I don't understand either of these cases.

1. "mass overwriting"? Are you referring to modifying many timestamps?
2. Directory structures do not have non-sequential branches like git
repos. Maybe you are referring to something else?

If this is a 'solved problem', can you direct me to some writings on
the subject? For some reason all these checksums are not sitting well
with me. It sounds expensive any way you do it.

I will say I do like that RFC. It gives me another idea, though I am
not sure if it would apply to WebDAV. It does not use timestamps or
checksums across the network, so maybe it does not make sense, but
here it is:

In the client and server databases we maintain a sync-token for each
file. The token can just be a randomly generated value that is
regenerated on the server every time the file enters a new state. When
the server and client tokens differ, it means the version of the file
on the server is the latest. If the tokens on the server and client
match, the files match.
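
A minimal sketch of this per-file token idea, in Python. Everything in
it is hypothetical and only for illustration: the Server class, the
put() method and the needs_download() helper are not ownCloud or
SabreDAV code, and plain dicts stand in for the two databases.

```python
import secrets


class Server:
    """Hypothetical in-memory stand-in for the server database."""

    def __init__(self):
        self.tokens = {}    # path -> current sync-token
        self.contents = {}  # path -> file contents

    def put(self, path, data):
        # Every new state of the file gets a fresh random token.
        self.contents[path] = data
        self.tokens[path] = secrets.token_hex(16)
        return self.tokens[path]


def needs_download(server, client_tokens, path):
    # Tokens differ (or the client has none yet): the server copy is
    # treated as the latest. Equal tokens mean the files match.
    return client_tokens.get(path) != server.tokens.get(path)


server = Server()
client_tokens = {}  # stand-in for the client database

token = server.put("notes.txt", b"v1")
print(needs_download(server, client_tokens, "notes.txt"))  # True

client_tokens["notes.txt"] = token  # client stores token after download
print(needs_download(server, client_tokens, "notes.txt"))  # False

server.put("notes.txt", b"v2")  # new state on the server, new token
print(needs_download(server, client_tokens, "notes.txt"))  # True
```

Note that only tokens are compared here; no timestamp or checksum
crosses the network.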

We maintain timestamps in the client database. So when a file is
updated on the client side, we can check its timestamp against the
client database and see if it needs to be sent to the server.
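
The local check could look roughly like the following Python sketch.
The names locally_modified() and record_synced() are made up for
illustration, and a dict stands in for the client database. The only
real API used is os.stat().

```python
import os


def record_synced(path, client_db):
    # Remember the file's mtime at the moment it was last synced.
    client_db[path] = os.stat(path).st_mtime_ns


def locally_modified(path, client_db):
    # True if the file was never synced, or if its mtime is no longer
    # the one stored in the (hypothetical) client database.
    recorded = client_db.get(path)
    return recorded is None or os.stat(path).st_mtime_ns != recorded
```

The sketch compares for inequality rather than "newer than", on the
assumption that any change to the stored mtime, forwards or backwards,
should trigger an upload. That may soften the system clock problem, but
it does not remove it.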

I know we are steering away from using timestamps over the network,
but is it OK to use them locally? I can see problems with this when
the user decides to change their system clock, but maybe that is a
different issue.


On Wed, Jul 18, 2012 at 2:45 PM, Evert Pot <evert at rooftopsolutions.nl> wrote:
>
>> I am curious why mtime is not sufficient. My thoughts would be to
>> determine a time difference before the sync. Then use that difference
>> when determining which files to transfer and apply the time difference
>> after writing a file to the client or server as needed.
>
> Two reasons why using time is a bad idea
>
> 1. It's not accurate. Even if there's a difference of seconds, it could throw a wrench in the system; if it's hours (not uncommon for end-users), mass overwriting may ensue ;)
> 2. The latest (in time) does not always imply that it's the 'final version'. One look at an active GitHub repository, and you notice that sequential changes in files are not always sequential in time.
>
> You can argue that #2 is a potential edge case, and #1 is something you can solve with workarounds and fuzzy logic.
>
> However, synchronization is a 'solved problem' and there are very robust ways to do this. If you do it based on the modification time instead, you reinvent the wheel, but kind of in a bad way. Your wheel is made out of papier-mâché, and breaks if you go too fast with it.
>
> Evert
>
>
>>
>> -Jono
>>
>> On Wed, Jul 18, 2012 at 12:23 PM, Evert Pot <evert at rooftopsolutions.nl> wrote:
>>> On Jul 18, 2012, at 4:57 PM, Klaas Freitag wrote:
>>>
>>>> On 18.07.2012 16:30, Evert Pot wrote:
>>>> Hi Evert,
>>>>
>>>>>> We instead have to implement it as a custom WebDAV property which the client can query via PROPFIND, and get the MD5 sum for every file in return.
>>>>>
>>>>> http://tools.ietf.org/html/rfc4918#section-15.6
>>>> Ah ok, that teaches us that we can get the etag as a property too. Did not know that, thanks :-)
>>>>>
>>>>>
>>>>> Related to the sync discussion; don't ignore this rfc:
>>>>>
>>>>> http://tools.ietf.org/html/rfc6578
>>>>>
>>>>> It's easily superior to all the approaches discussed here in the last little while.
>>>> Oh yes. Is it implemented with Sabre somewhere? Sounds like the overall target we should strive for.
>>>
>>> Not yet, it's planned and there's a branch:
>>> https://github.com/evert/SabreDAV/tree/webdav-sync
>>>
>>> But it's a bit hard to say when it's ready.
>>>
>>> If you want to support it in owncloud in the future, you should at least have support for:
>>>
>>> * A concept of sync-tokens. These are similar to ETags, but apply to the contents of an entire collection.
>>> * Changelogs, containing deletes and modifications of files, also for an entire collection.
>>>
>>> This should be a solid basis for whatever sync approach you take :)
>>> Evert
>>> _______________________________________________
>>> Owncloud mailing list
>>> Owncloud at kde.org
>>> https://mail.kde.org/mailman/listinfo/owncloud
>


