Review Request 110043: Proposed fix/workaround for legacy encoded filename handling

Róbert Szókovács szo at szo.hu
Wed May 22 16:23:03 BST 2013



> On May 7, 2013, 10:11 a.m., Róbert Szókovács wrote:
> > The solution is intentionally "shy"; I really don't want to fan the flames surrounding this issue. I just stumbled upon this location, where it can be handled painlessly. Whether or not it should be turned on by default can, in my opinion, be left to distributors.
> 
> Thomas Lübking wrote:
>     Then it's worthless.
>     
>     When I encounter broken filenames on a rw device, I know it's time for a fix.
>     When I encounter broken filenames in joliet or rockridge (latter usually caused by myself long ago - thank you, "wodim"...) I know it's time to mount norock/nojoliet.
>     Whether I do that or set a (KDE only affecting) env var hardly makes a difference.
>     
>     When my little sister™ encounters broken filenames anywhere, she knows that it's time to call her personal IT (me) with "these files won't open!" - if she could not call me, she would have no access to those files. Period.
>     She won't think to google for "kde broken filenames", because she would not think it's a "problem with the name" - the files have weird names, yes, but essentially they won't open when she clicks them.
>     That this could be due to some restrictions in UTF-8 and QString and other terms she does not know, cannot be an expected consideration.
>     
>     So either this is not a fixworthy issue at all, or it (as OPT-IN) only becomes a way for distro discrimination (works on distro X but not on distro Y), because the fact is that the filenames are broken, and if we want to assist in that situation, we assist the unskilled *only*, and the unskilled simply don't set env vars. If they did, they would also be skilled enough for convmv et al. to deal with that issue correctly.
>     
>     IOW *every* distro but Arch/Gentoo/LFS - ie. where you read a wiki for setup - likely would *have* to set this anyway and those have the users to turn it off at will.
>     
>     /2¢
> 
> Róbert Szókovács wrote:
>     OK, I'm all for making this on by default, but that would be a change from the current situation, where the default is QFile's filename encoding, based on the locale. If this becomes the default, it disrupts those who use a non-UTF-8 locale. The current code provides an environment variable to force KDE to treat the filenames as UTF-8; this patch piggybacks on that mechanism. Should we check the locale the same way QFile does?
> 
> Thomas Lübking wrote:
>     There should be no regression in regular use on non-broken FS names for anyone, not even those using non-UTF-8 locales, so yes, testing the locale to dis/enable this sounds reasonable.
>     
>     Is the solution as simple as deactivating it if the tested env is set to anything but "non_broken_names"?
> 
> Róbert Szókovács wrote:
>     No, I'm afraid we would need a heuristic similar to the one in Qt, see qtextcodec.cpp, setupLocaleMapper(): "Get the first nonempty value from $LC_ALL, $LC_CTYPE, and $LANG environment variables.", then check the CODESET part; if it's UTF-8, enable this new functionality, otherwise do as before the patch.

I uploaded the new version that checks the locale.
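
For readers following along, the heuristic discussed above can be sketched roughly as follows. This is a standalone illustration in plain C++, not the code from the patch or from qtextcodec.cpp. All function names are hypothetical, and treating spellings such as "UTF-8", "utf8" and "UTF_8" as equivalent is an assumption.

```cpp
#include <cassert>
#include <cctype>
#include <cstdlib>
#include <initializer_list>
#include <string>

// First non-empty value among LC_ALL, LC_CTYPE and LANG, or "" if none is set.
std::string effectiveCtypeLocale()
{
    for (const char *name : {"LC_ALL", "LC_CTYPE", "LANG"}) {
        const char *value = std::getenv(name);
        if (value && *value)
            return value;
    }
    return std::string();
}

// Extracts the codeset from "language[_territory][.codeset][@modifier]".
// Returns "" when the locale name carries no codeset part.
std::string codesetOf(const std::string &locale)
{
    const std::string::size_type dot = locale.find('.');
    if (dot == std::string::npos)
        return std::string();
    const std::string::size_type at = locale.find('@', dot);
    // The search starts at the dot, so a modifier can only come after it.
    assert(at == std::string::npos || at > dot);
    return at == std::string::npos ? locale.substr(dot + 1)
                                   : locale.substr(dot + 1, at - dot - 1);
}

// Assumption: compare ignoring case and punctuation, so "UTF-8" == "utf8".
bool isUtf8Codeset(const std::string &codeset)
{
    std::string normalized;
    for (unsigned char c : codeset) {
        if (std::isalnum(c))
            normalized += static_cast<char>(std::tolower(c));
    }
    return normalized == "utf8";
}

// Hypothetical gate: enable the broken-name handling only in UTF-8 locales.
bool localeUsesUtf8()
{
    return isUtf8Codeset(codesetOf(effectiveCtypeLocale()));
}
```

With LANG=hu_HU.UTF-8 and the other two variables unset, localeUsesUtf8() would return true; with LC_ALL=hu_HU.ISO-8859-2 it would return false, and the behaviour would stay as it was before the patch.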


- Róbert


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://git.reviewboard.kde.org/r/110043/#review32184
-----------------------------------------------------------


On May 17, 2013, 10:08 a.m., Róbert Szókovács wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://git.reviewboard.kde.org/r/110043/
> -----------------------------------------------------------
> 
> (Updated May 17, 2013, 10:08 a.m.)
> 
> 
> Review request for kdelibs and Thiago Macieira.
> 
> 
> Description
> -------
> 
> This patch works around the problem of filenames that are not valid UTF-8 strings. In KLocalePrivate::initFileNameEncoding(), KDE sets QFile's encoding/decoding functions to QString's toUtf8()/fromUtf8(), which in turn call QUtf8's converter functions. I replaced these with modified copies of said functions, so that when the code encounters an invalid UTF-8 string, it encodes it byte by byte, mapping the lower 128 byte values to their normal Unicode places and the upper 128 to U+18000-U+1807F; the decoder, of course, reverses this. (QUtf8 is not exported to developers, so I had to use an inefficient method; I think it would be better if we could use the state parameter for error detection.)
> To make this actually work, you have to set the KDE_UTF8_FILENAMES environment variable to a specific value ("broken_names").
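
To make the mapping concrete, here is a minimal standalone sketch in plain C++. It is not the patch itself, which works on Qt types inside klocale_kde.cpp. The names decodeFileName, encodeFileName and kEscapeBase are made up for illustration, and the rule that the encoder switches to byte mode as soon as it sees a code point in the escape range is an assumption about how the reverse direction is detected.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

const char32_t kEscapeBase = 0x18000; // bytes 0x80..0xFF map to U+18000..U+1807F

// Decodes UTF-8 into code points; returns false on any malformed sequence.
bool decodeUtf8(const std::string &in, std::u32string &out)
{
    static const char32_t minForLength[] = {0, 0x80, 0x800, 0x10000};
    out.clear();
    for (std::size_t i = 0; i < in.size();) {
        const unsigned char b = static_cast<unsigned char>(in[i]);
        int extra;
        char32_t cp;
        if (b < 0x80) { extra = 0; cp = b; }
        else if ((b & 0xE0) == 0xC0) { extra = 1; cp = b & 0x1F; }
        else if ((b & 0xF0) == 0xE0) { extra = 2; cp = b & 0x0F; }
        else if ((b & 0xF8) == 0xF0) { extra = 3; cp = b & 0x07; }
        else return false;                       // stray continuation or bad lead byte
        if (i + extra >= in.size())
            return false;                        // truncated sequence
        for (int k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                return false;
            cp = (cp << 6) | (c & 0x3F);
        }
        // Reject overlong forms, surrogates and values beyond U+10FFFF.
        if (cp < minForLength[extra] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;
        out.push_back(cp);
        i += extra + 1;
    }
    return true;
}

void appendUtf8(char32_t cp, std::string &out)
{
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

bool isEscape(char32_t cp) { return cp >= kEscapeBase && cp < kEscapeBase + 0x80; }

// On-disk bytes -> text: valid UTF-8 decodes normally, anything else byte by byte.
std::u32string decodeFileName(const std::string &bytes)
{
    std::u32string text;
    if (decodeUtf8(bytes, text))
        return text;
    text.clear();
    for (unsigned char b : bytes)
        text.push_back(b < 0x80 ? static_cast<char32_t>(b)
                                : static_cast<char32_t>(kEscapeBase + (b - 0x80)));
    assert(text.size() == bytes.size()); // byte mode yields one code point per byte
    return text;
}

// Text -> on-disk bytes. Assumption: any escape code point means byte mode.
std::string encodeFileName(const std::u32string &text)
{
    bool byteMode = false;
    for (char32_t cp : text)
        byteMode = byteMode || isEscape(cp);
    std::string bytes;
    for (char32_t cp : text) {
        if (byteMode && isEscape(cp))
            bytes += static_cast<char>(0x80 + (cp - kEscapeBase));
        else
            appendUtf8(cp, bytes); // ASCII stays one byte in either mode
    }
    return bytes;
}
```

A Latin-1 name such as the bytes 63 61 66 E9 is not valid UTF-8, so it decodes to 'c', 'a', 'f', U+18069 and encodes back to the same four bytes. One limitation of this sketch: a valid UTF-8 name that already contains code points in U+18000-U+1807F would not survive the round trip.
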
> 
> To test it, create .kde/env/KDE_UTF8_FILENAMES.sh with this content: 
> export KDE_UTF8_FILENAMES=broken_names
> Then log out, log in, and try Dolphin on the faulty files (instead of the usual boxed "?" you'll see just boxes).
> 
> 
> This addresses bug 165044.
>     http://bugs.kde.org/show_bug.cgi?id=165044
> 
> 
> Diffs
> -----
> 
>   kdecore/localization/klocale_kde.cpp b010e74 
> 
> Diff: http://git.reviewboard.kde.org/r/110043/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Róbert Szókovács
> 
>



More information about the kde-core-devel mailing list