[kde-freebsd] system:/media/cd0 and volume_label not latin symbols

Michael Nottebrock lofi at freebsd.org
Fri Apr 6 20:54:08 CEST 2007


Kevin Oberman wrote:
>> Date: Fri, 6 Apr 2007 13:37:17 +0200
>> From: Jean-Yves Lefort <jylefort at FreeBSD.org>
>> Sender: owner-freebsd-gnome at freebsd.org
>>
>> On Wed, 4 Apr 2007 12:58:29 +0200
>> Michael Nottebrock <lofi at freebsd.org> wrote:
>>
>> > On Wednesday, 4. April 2007, Jean-Yves Lefort wrote:
>> >
>> > > > So I see several solutions:
>> > > >  1. By default submit to HAL user's locale encoded mount point
>> > > >     name.
>> > >
>> > > This is not possible. All hal data must be encoded in UTF-8.
>> > >
>> > > >  2. Modify mount point naming scheme to something which is not
>> > > >     dependent on locale encoding, for example, to device name.
>> > >
>> > > I'd rather not make this the default behaviour. The volume label is
>> > > much more informative than the device name and should cause no
>> > > problems for most users.
>> > >
>> > > >  3. Change user's locale to UTF8.
>> > >
>> > > This is the recommended solution. UTF-8 is now universally supported
>> > > and I see no reason to stick to a legacy encoding.
>> >
>> > Universally supported except in FreeBSD. :( I'm not aware of any
>> > substantial work on UTF-8 since it was imported, which would mean
>> > that there's still no collation support.
>> >
>> > If even some Linux distributions despite their vastly superior UTF-8
>> > support apparently do it, I think solution 2 should at least be
>> > offered via OPTIONS right in the port - installing an alternative
>> > ruleset wouldn't be too difficult to implement.
>>
>> What would be difficult (or impossible) would be to provide a
>> satisfactory explanation of the option using the small number of
>> characters available.
>>
>> You're right that the FreeBSD libc lacks Unicode collation support,
>> but it seems that no gain is made by sticking to a legacy locale:
>>
>> $ touch A B a b
>> $ export LANG=en_US.UTF-8; ls
>> A       B       a       b
>> $ export LANG=en_US.ISO8859-1; ls
>> A       B       a       b
>>
>> As you can see, the files are incorrectly sorted with both locales. On
>> a Linux box, the sort order is correct (a A b B) in both cases.
>>
>> If someone can convince me that there are good reasons to use a legacy
>> locale, I might add the option despite the fact that its description
>> would be cryptic.
>>
> Jean-Yves,
>
> I guess the term "correct" is unclear as far as en_US locales go.

Yes, mixed-case vs case-separated collation order is in fact undefined and
thus up to the whims of the people in charge of collation definitions. As
usual, FreeBSD stuck to tradition and Linux went for usefulness (N.B. I'm
kidding. But only re. the "as usual"). I still need to roll my own
mixed-case collating locale to fix up the gtk2 file dialog ... :P
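
To make the two orderings concrete, here is a minimal sketch in Python. It
does not use libc collation tables at all; the sort key for the mixed-case
order is a hypothetical one, picked only because it reproduces the
"a A b B" listing reported for Linux above.

```python
# The four file names from the quoted "touch A B a b" example.
names = ["A", "B", "a", "b"]

# Case-separated order: plain code point comparison, so every uppercase
# letter sorts before every lowercase one.
case_separated = sorted(names)


def mixed_case_key(name):
    """Hypothetical key: compare case-insensitively first, then place
    lowercase before uppercase within the same letter."""
    return (name.casefold(), name.swapcase())


# Mixed-case order: letters interleaved regardless of case.
mixed_case = sorted(names, key=mixed_case_key)

print(" ".join(case_separated))  # A B a b
print(" ".join(mixed_case))      # a A b B
```

Both results are defensible for en_US; which one a system gives you depends
on what its collation definition says.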

I think the more compelling argument is that switching from a legacy
locale to UTF-8 is *hard* if you've used locale-specific characters in
filenames anywhere, e.g.:

$ setenv LANG de_DE.ISO8859-15
$ touch k q è é ê ý
$ ls
é è ê k q ý
$ setenv LANG de_DE.UTF-8
$ ls
k q ? ? ? ?

Oops. Not to mention the requirement to change charmaps in terminal
emulators, certain server applications (samba), etc. There are lots of
good reasons to never change the locale of a system once it has been set
up and in use. :)
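
The "?" entries in the listing above can be reproduced without touching a
file system. This is a rough sketch in Python, assuming the names were
stored as single ISO8859-15 bytes and that the listing tool prints "?" for
anything it cannot decode in the current locale; the helper name
decode_as_utf8 is made up for this illustration.

```python
# File names as typed under de_DE.ISO8859-15: one byte per character.
names = ["k", "q", "è", "é", "ê", "ý"]
raw_names = [name.encode("iso-8859-15") for name in names]


def decode_as_utf8(raw):
    """Decode bytes as UTF-8, showing '?' for each byte of a name that
    is not valid UTF-8 (assumed to mimic what the listing shows)."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return "?" * len(raw)


# What a UTF-8 locale makes of the unchanged bytes on disk.
shown = [decode_as_utf8(raw) for raw in raw_names]
print(" ".join(shown))  # k q ? ? ? ?

# The bytes each name would need to be renamed to for a UTF-8 locale.
utf8_names = [raw.decode("iso-8859-15").encode("utf-8") for raw in raw_names]
print(utf8_names[3])  # b'\xc3\xa9' for "é"
```

The ASCII names survive because ASCII bytes mean the same thing in both
encodings; every accented name would have to be renamed on disk.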

Cheers,
-- 
   ,_,   | Michael Nottebrock               | lofi at freebsd.org
 (/^ ^\) | FreeBSD - The Power to Serve     | http://www.freebsd.org
   \u/   | K Desktop Environment on FreeBSD | http://freebsd.kde.org


