[Owncloud] special characters in filenames

Jörn Friedrich Dreyer jfd at owncloud.com
Thu Aug 2 14:40:29 UTC 2012


On 02.08.2012 15:22, Frank Karlitschek wrote:
> Hi everyone,
>
> we have an interesting question where I would love to get some more opinions.
>
> The question is how ownCloud should handle special characters in filenames. ownCloud itself should always work with UTF-8 and do proper encoding so that ownCloud can work with all possible characters.
> The problems are the underlying filesystems.
>
> So the ownCloud server can run on Windows/Linux/Mac servers with lots of different filesystems, and we have clients for Windows/Linux/Mac/iOS/Android with even more different filesystems. All these filesystems have limitations on allowed characters and on the handling of uppercase/lowercase filenames.
Personally, I think any filesystem that does not handle UTF-8 filenames
should die and rot in hell. That being said, I would like to see
ownCloud use a storage backend like Apache Jackrabbit (oh, there is a
PHP implementation: http://jackalope.github.com/) or git that stores
files by their hash. But I also know that people have already tried
using git as a backend with large files, which I was told did not
perform well. I assume that is because creating hashes for large files
takes ages: ~15 sec for a 780 MB mkv with md5sum -b. On the other hand,
transferring 780 MB over the wire certainly takes longer than 15 sec.
And since we do not have background jobs with PHP, we cannot create a
cleanup job that creates the hashes from a temporary location and then
moves the files to the correct location in the content repository. I
would like to see a content repository anyway.
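
To illustrate the store-by-hash idea, here is a minimal sketch in Python
(not PHP, and not anything ownCloud ships). It assumes the upload arrives
in chunks, so the hash can be updated while the data is written to a
temporary file and no separate hashing pass is needed. The function name
store_by_hash, the two-character fan-out directory, and the choice of
SHA-256 are all assumptions made for the example.

```python
import hashlib
import os
import tempfile


def store_by_hash(chunks, repo_dir):
    """Write chunks to a temp file while hashing them, then move the
    file to a path derived from the hex digest. Returns the digest.

    Hypothetical helper: the layout <repo>/<first 2 hex chars>/<digest>
    is an assumption, not an existing ownCloud format.
    """
    os.makedirs(repo_dir, exist_ok=True)
    digest = hashlib.sha256()
    # Create the temp file inside the repository so the final move
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=repo_dir)
    try:
        with os.fdopen(fd, "wb") as tmp:
            for chunk in chunks:
                digest.update(chunk)  # hash incrementally as data arrives
                tmp.write(chunk)
        name = digest.hexdigest()
        target_dir = os.path.join(repo_dir, name[:2])
        os.makedirs(target_dir, exist_ok=True)
        os.replace(tmp_path, os.path.join(target_dir, name))
    except BaseException:
        # Do not leave a half-written temp file behind.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return name
```

Because the on-disk name consists only of hex digits, the filesystem
never sees the user's characters; the UTF-8 name would have to be kept
in separate metadata that maps it to the digest.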
> So what can/should ownCloud do if someone wants to sync a file with a special character that's supported on one platform but not on another?
> Should we change the filename? Or not sync at all?
Don't the clients already have to convert the UTF-8 paths used on the
ownCloud server to a system-compatible version?
With regard to filesystems on the server: if we store the UTF-8 version
of the path, we could use the same conversion used on the clients to
decide on a filename compatible with the server's filesystem. I agree
with Tom, however, that this will produce more frustration than it
relieves.
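
As a rough sketch of what such a conversion could look like, the Python
snippet below percent-encodes characters that Windows filesystems reject
and normalizes the name to NFC first. The function name
to_compatible_name and the percent-encoding scheme are assumptions for
illustration; this is not the conversion the clients actually use, and
it ignores other restrictions such as reserved names, trailing dots, and
case-insensitive collisions.

```python
import unicodedata

# Characters that Windows does not allow in a filename.
WINDOWS_FORBIDDEN = set('<>:"/\\|?*')


def to_compatible_name(name, forbidden=WINDOWS_FORBIDDEN):
    """Return a filename with forbidden characters percent-encoded.

    Hypothetical helper. '%' itself is encoded too, so the mapping can
    be reversed without ambiguity.
    """
    # Normalize so that composed and decomposed forms of the same
    # character end up as the same name.
    name = unicodedata.normalize("NFC", name)
    out = []
    for ch in name:
        if ch in forbidden or ch == "%" or ord(ch) < 32:
            out.append("".join("%{:02X}".format(b) for b in ch.encode("utf-8")))
        else:
            out.append(ch)
    return "".join(out)


print(to_compatible_name("a:b?.txt"))  # a%3Ab%3F.txt
```

The original UTF-8 path still has to be stored alongside the converted
one, since the converted name is what the filesystem shows and users
would otherwise see names they did not choose.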

> What do you think?
I think we should write concrete scenarios for the different possible
backends in Gherkin: https://github.com/cucumber/cucumber/wiki/Gherkin

Then we can create a test that reveals what already works and which 
cases we need to take care of. The devil is in the detail here and we 
need to test for the details.
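
A test of this kind could start from a small probe like the following
Python sketch, which creates a file with a given name in a fresh
temporary directory and reports the name the filesystem lists back. The
function probe_filename is hypothetical, and the results depend on the
platform and filesystem it runs on, which is exactly what such a test is
meant to reveal.

```python
import os
import tempfile


def probe_filename(name):
    """Create an empty file called `name` in a fresh temp directory and
    return the name the filesystem reports, or None if creation failed.

    Hypothetical helper: a returned name that differs from `name` means
    the filesystem altered it (for example by normalizing it).
    """
    with tempfile.TemporaryDirectory() as directory:
        try:
            with open(os.path.join(directory, name), "wb"):
                pass
        except (OSError, ValueError):
            # The filesystem or the OS rejected the name.
            return None
        entries = os.listdir(directory)
        return entries[0] if len(entries) == 1 else None


if __name__ == "__main__":
    # Sample names only; a real scenario list would be much longer.
    for candidate in ["plain.txt", "J\u00f6rn.txt", "a:b.txt", "what?.txt"]:
        print(repr(candidate), "->", repr(probe_filename(candidate)))
```

Running the same list of names against each backend gives a table of
what is accepted, rejected, or silently changed.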

so long

Jörn

-- 
Jörn Friedrich Dreyer (jfd at owncloud.com)
Software Developer
ownCloud GmbH

Your Data, Your Cloud, Your Way!

ownCloud GmbH, GF: Markus Rex, Holger Dyroff
Schloßäckerstrasse 26a, 90443 Nürnberg, HRB 28050 (AG Nürnberg)



