Locale Name Primer

John Layt jlayt at kde.org
Sat Jun 7 14:53:15 UTC 2014


Locale Names.

A quick primer on Locale Names, seeing as we've had a few issues in
the last couple of days. I can't claim perfect knowledge, so feel free
to point out where I am wrong :-)

TL;DR:
* Don't use QLocale::bcp47Name().
* Use QLocale::name(), but you may need to modify the results.
* You usually want QLocale().name() and not QLocale::system().name().
* Don't assume the language code is alpha-2; it may be alpha-3.
* Always use case-insensitive comparisons.
* I'll improve support in Qt 5.4.

POSIX Locale:
* What we use on Linux systems and in glibc.
* Format is "lang_REGION.encoding@variant".
* How many of those components are included can depend on the system
or implementation.  In most cases the language and region are always
included, with the encoding and variant optional.
* lang is ISO 639 alpha-2 or alpha-3 (so QLocale().name().left(2) is
not valid!), usually in lower case
* REGION is ISO 3166 alpha-2, usually in upper case
* Many distros explicitly add .utf8 or .UTF-8 for Unicode, e.g. on
openSUSE "en_GB.utf8" uses UTF-8 but "en_GB" uses ISO-8859-1.
* The variant is usually used to change the script, but doesn't use an
ISO code for this, e.g. "sr_RS" uses Cyrillic script but "sr_RS@latin"
uses Latin script
* The variant can change other options like the currency, e.g. "de_DE@euro"
* Always use case-insensitive comparison as case of codes is meaningless
* Run "locale -a" to see what your distro has installed
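
As a rough sketch of why left(2) is the wrong tool, here is one way to
pull a POSIX name apart on its separators ('_', '.', '@') rather than on
fixed offsets. It uses only the C++ standard library; PosixLocaleParts
and parsePosixLocale() are names I made up for illustration, they are
not part of Qt or glibc.

```cpp
#include <string>

// Hypothetical helper type, not part of Qt or glibc.
struct PosixLocaleParts {
    std::string language;  // ISO 639 alpha-2 or alpha-3
    std::string region;    // ISO 3166 alpha-2, may be empty
    std::string encoding;  // e.g. "utf8", may be empty
    std::string variant;   // e.g. "latin" or "euro", may be empty
};

// Split "lang_REGION.encoding@variant" on its separators.
// Every component after the language is treated as optional.
PosixLocaleParts parsePosixLocale(const std::string &name)
{
    PosixLocaleParts parts;
    std::string rest = name;

    // The variant is everything after '@'.
    const std::string::size_type at = rest.find('@');
    if (at != std::string::npos) {
        parts.variant = rest.substr(at + 1);
        rest.erase(at);
    }

    // The encoding is everything after '.'.
    const std::string::size_type dot = rest.find('.');
    if (dot != std::string::npos) {
        parts.encoding = rest.substr(dot + 1);
        rest.erase(dot);
    }

    // The region is everything after '_'.
    const std::string::size_type underscore = rest.find('_');
    if (underscore != std::string::npos) {
        parts.region = rest.substr(underscore + 1);
        rest.erase(underscore);
    }

    // Whatever is left is the language code, 2 or 3 letters (or "C").
    parts.language = rest;
    return parts;
}
```

With this, "fil_PH" gives the alpha-3 language "fil" and "sr_RS@latin"
gives the variant "latin" with an empty encoding. It does no validation
of the codes, it only splits them.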

BCP47 Locale:
* IETF RFC, used in Unicode, W3C standards, etc.
* Used in Windows Vista and later, where it replaces the old LCID.
* Basic format is "lang-Script-REGION-variants-extensions"
* Always uses hyphen as a subtag separator
* Always uses the minimum subtags required to uniquely identify the
locale, e.g. "de-Latn-DE" will be reduced to "de" as Latn and DE are
the assumed defaults.
* lang is ISO 639 alpha-2 or alpha-3 code, usually in lower case
* Script is ISO 15924 alpha-4 code, usually in title case
* REGION is ISO 3166 alpha-2 code, usually in upper case
* variants are registered variant codes
* extensions are registered extension codes
* Always use case-insensitive comparison as case of codes is meaningless
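
To make the subtag rules a bit more concrete, here is a minimal sketch
that compares tags case-insensitively and picks out the leading
lang/Script/REGION subtags by their length. It is standard C++ only, and
the names (toLowerAscii, sameLocaleTag, Bcp47Parts, parseBcp47Prefix)
are hypothetical. It is deliberately simplified: it assumes an
alpha-2 region as described above and ignores variants, extensions and
anything else the RFC allows.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Lower-case an ASCII string so tags can be compared regardless of case.
std::string toLowerAscii(std::string text)
{
    std::transform(text.begin(), text.end(), text.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return text;
}

// Case-insensitive comparison of two locale tags.
bool sameLocaleTag(const std::string &a, const std::string &b)
{
    return toLowerAscii(a) == toLowerAscii(b);
}

// True if every character is an ASCII letter.
bool isAllAlpha(const std::string &text)
{
    return std::all_of(text.begin(), text.end(),
                       [](unsigned char c) { return std::isalpha(c) != 0; });
}

// Hypothetical helper type holding the first three kinds of subtag.
struct Bcp47Parts {
    std::string language;  // 2 or 3 letters
    std::string script;    // 4 letters, may be empty
    std::string region;    // 2 letters, may be empty
};

// Simplified: reads only the leading lang, Script and REGION subtags.
Bcp47Parts parseBcp47Prefix(const std::string &tag)
{
    // Split on the hyphen separator.
    std::vector<std::string> subtags;
    std::string::size_type start = 0;
    while (true) {
        const std::string::size_type end = tag.find('-', start);
        if (end == std::string::npos) {
            subtags.push_back(tag.substr(start));
            break;
        }
        subtags.push_back(tag.substr(start, end - start));
        start = end + 1;
    }

    Bcp47Parts parts;
    parts.language = subtags[0];  // the split always yields one element
    std::size_t i = 1;
    // A 4-letter subtag directly after the language is the script.
    if (i < subtags.size() && subtags[i].size() == 4 && isAllAlpha(subtags[i])) {
        parts.script = subtags[i];
        ++i;
    }
    // A 2-letter subtag after that is the region.
    if (i < subtags.size() && subtags[i].size() == 2 && isAllAlpha(subtags[i])) {
        parts.region = subtags[i];
    }
    return parts;
}
```

So "sr-Latn-RS" splits into sr / Latn / RS, while the minimal "de"
leaves script and region empty, which is exactly why code must not
assume those subtags are present.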

Qt 5.3 support:
* name() always returns "lang_REGION", except for the C locale or where
AnyCountry is set; it never returns the encoding or variant
* bcp47Name() returns the minimal BCP47 name
* No direct way to get the lang or country code, need to use
"QLocale().name().split('_').at(x)"

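Because name() can come back without a region (the C locale, or
AnyCountry), blindly indexing the second element of the split is not
safe. A minimal sketch of a more defensive split follows. It works on a
plain std::string (for example the result of
QLocale().name().toStdString()), and splitQtLocaleName() is a
hypothetical helper of mine, not a Qt API.

```cpp
#include <string>
#include <utility>

// Returns {language, country} from a "lang_REGION" style string.
// The country is empty when the name has no '_' separator.
std::pair<std::string, std::string> splitQtLocaleName(const std::string &name)
{
    const std::string::size_type underscore = name.find('_');
    if (underscore == std::string::npos) {
        // e.g. "C", or a language-only name
        return {name, std::string()};
    }
    return {name.substr(0, underscore), name.substr(underscore + 1)};
}
```

The language half is returned whole, so an alpha-3 code such as "fil"
survives intact.
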
Needed in Qt 5.4:
* languageCode()
* countryCode()
* scriptCode()
* posixName()
* encoding?

Cheers!

John.

