Complex text input in Plasma
hein at kde.org
Thu Apr 6 17:16:14 UTC 2017
In the aftermath of D5301, Martin asked me to compile a document on the
requirements for complex text input in Plasma, especially with the
opportunities provided by the Wayland transition. It makes sense to
share this document with all of you, to
= I. What Input Methods do =
Basically, with simple text input (and skipping a few stack steps in
this explanation), the user presses a key on the keyboard, it's
interpreted according to the active layout, and results in a symbol
on screen (modulo complicated ideas like dead keys and the Compose
key, which also enter into input method territory as we'll see).
With some writing systems, this isn't enough. They require multiple
key presses or pick-and-choose selection steps to produce a symbol.
The general idea is to disconnect the input from the output and
introduce a conversion in between that's more complex than a mere
keyboard layout. The conversion itself may have complex UI like
temporary feedback in the textfield, popups positioned relative
to the text field, or state visualization in shell chrome.
There are other ways to access input methods than just the keyboard
these days, too. I'll briefly mention one of them later.
= II. Practical examples of Input Method use =
a) Korean
Korean is written using an alphabet similar to German. It has about
the same number of letters, with each letter representing a vowel,
a consonant, or a diphthong.
On computers, each letter has its own key on the keyboard. As with
German, there are multiple keyboard layouts for the Korean alphabet,
although one is dominant and considered standard by far.
Unlike the German alphabet however, letters in the Korean alphabet
are grouped together into (morpho-)syllabic blocks when written.
Each block must start with a consonant and contain a vowel or
diphthong. It may optionally end in one or two consonants.
Here are some letters and their corresponding Latin letters (the
sounds they most closely correspond to - their locations in the
keyboard layout do not match QWERTY/QWERTZ):
ㅎ = h, ㅏ = a, ㄴ = n
In linear order these form the syllable "han". When written
properly, these are grouped together:
한
In the UI, when pressing the keyboard keys for each letter, the
text field contents cycle through these stages:
ㅎ -> 하 -> 한
As you can see, the existing text is replaced two times, making
the operation stateful. The Input Method Engine generates complex
events with text payload, state hints, even formatting hints (some
IMEs use text color, e.g. color inversion, to communicate state
such as "this input is not finalized yet") that are delivered to
the application. In Qt, a plugin corresponding to the Input Method
(e.g. ibus or fcitx) translates these events into QInputMethodEvent
objects that are delivered to widgets and processed there.
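The replacement stages for "han" can be reproduced with the Hangul
syllable composition arithmetic defined by Unicode. This is only a
minimal sketch of the idea: the function names are made up for
illustration, and a real engine such as ibus-hangul does considerably
more (and delivers the result as preedit events, not plain strings).

```python
# Jamo tables in the order used by Unicode's Hangul syllable composition.
LEADS = list("ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ")
VOWELS = list("ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ")
TRAILS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")


def compose(lead, vowel=None, trail=""):
    """Return the text a field would show for a partially typed block."""
    if vowel is None:
        # Only a consonant so far: show the bare letter.
        return lead
    index = (LEADS.index(lead) * 21 + VOWELS.index(vowel)) * 28
    index += TRAILS.index(trail)
    return chr(0xAC00 + index)


def preedit_stages(lead, vowel, trail):
    """Hypothetical helper: the text after each of three key presses."""
    return [compose(lead), compose(lead, vowel), compose(lead, vowel, trail)]


stages = preedit_stages("ㅎ", "ㅏ", "ㄴ")
print(" -> ".join(stages))
```

Each stage replaces the previous one in the text field, which is why
the toolkit needs an event type that can say "replace the pending
text" as opposed to "append a character".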
The rules of the Korean alphabet additionally have some
implications for cursor movement/behavior. E.g. because it's not
allowed to start a block with two consonants, and because the number
of vowels and ending consonants is limited, as keys are pressed a
block might implicitly finish composing and the cursor moves on.
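To make the implicit commit concrete, here is a simplified sketch of
a composer that tracks one pending block. The class and helper names
are hypothetical, and it deliberately ignores compound vowels and
double final consonants, which real engines handle.

```python
LEADS = list("ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ")
VOWELS = list("ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ")
TRAILS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")


def block(lead, vowel, trail=""):
    index = (LEADS.index(lead) * 21 + VOWELS.index(vowel)) * 28
    return chr(0xAC00 + index + TRAILS.index(trail))


class Composer:
    def __init__(self):
        self.committed = ""  # finalized text, cursor sits after it
        self.lead = self.vowel = self.trail = ""

    def preedit(self):
        if not self.lead:
            return ""
        if not self.vowel:
            return self.lead
        return block(self.lead, self.vowel, self.trail)

    def _commit(self):
        self.committed += self.preedit()
        self.lead = self.vowel = self.trail = ""

    def feed(self, jamo):
        if jamo in VOWELS:
            if self.lead and not self.vowel:
                self.vowel = jamo
            elif self.trail:
                # The final consonant moves over to start the next block.
                new_lead, self.trail = self.trail, ""
                self._commit()
                self.lead, self.vowel = new_lead, jamo
            else:
                raise ValueError("not handled in this sketch: " + jamo)
        elif jamo in LEADS:
            if not self.lead:
                self.lead = jamo
            elif self.vowel and not self.trail and jamo in TRAILS:
                self.trail = jamo
            else:
                # Block cannot take this consonant: finish it implicitly.
                self._commit()
                self.lead = jamo
        else:
            raise ValueError("not a basic jamo: " + jamo)


def type_keys(keys):
    composer = Composer()
    for key in keys:
        composer.feed(key)
    return composer.committed, composer.preedit()


print(type_keys("ㅎㅏㄴㄱㅡㄹ"))
print(type_keys("ㅎㅏㄴㅏ"))
```

In the second call the final consonant of the pending block is pulled
into the next block once a vowel follows, so text that was already on
screen changes again.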
b) Chinese
There are many different strategies for inputting Chinese
characters. The most common actually makes use of the Latin
alphabet, and keyboard layouts for the Latin alphabet. Chinese
characters have assigned sound values (i.e., how a human actually
pronounces a character), and there are rule systems to transcribe
these using the Latin alphabet, e.g. Pinyin.
As users write some Pinyin using Latin characters, a selection
popup will offer a list of Chinese characters matching the input.
The user picks one and it's inserted into the text field.
Chinese input methods try to be very smart in what characters
they offer, taking preceding input, common phrasings, etc. into
account, making them highly stateful things.
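A toy sketch of context-sensitive candidate ranking follows. The
candidate table and the pair weights are invented for illustration
and are tiny compared to what a real engine ships; only the shape of
the lookup is the point.

```python
# Illustrative data only: a few Pinyin syllables and matching characters.
CANDIDATES = {
    "ni": ["你", "尼", "泥"],
    "hao": ["好", "号", "毫"],
    "kou": ["口", "扣"],
}

# Hypothetical weights for (previous character, candidate) pairs.
PAIR_BONUS = {
    ("你", "好"): 10,
    ("口", "号"): 10,
}


def candidates(pinyin, previous=None):
    """Return candidates for a syllable, preferring likely follow-ups."""
    options = CANDIDATES.get(pinyin, [])
    # sorted() is stable, so the base order is kept among equal scores.
    return sorted(options, key=lambda ch: -PAIR_BONUS.get((previous, ch), 0))


print(candidates("hao"))
print(candidates("hao", previous="口"))
```

The same input yields a different popup order depending on what was
typed before, which is the statefulness described above.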
c) Modes, Input Method overlap
Korean used to be written using Chinese characters (this simple
statement is the tip of a large iceberg of complicated history
and rulesets :), and especially Korean academic writing still
makes use of them. Korean input methods therefore tend to also
offer the ability to type Chinese characters, with a mode toggle
Korean input methods also usually have a Hangul (the Korean
alphabet) vs. Latin mode toggle.
Chinese input methods tend to have mode toggles for things like
choosing between half-width and full-width characters, i.e. also
offer control over typography.
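The half-width/full-width toggle has a simple core, sketched below
under the assumption that only printable ASCII is converted: Unicode
places the full-width forms at a fixed offset from their ASCII
counterparts, with the space handled separately. The function names
are made up.

```python
FULLWIDTH_OFFSET = 0xFEE0  # U+FF01..U+FF5E mirror ASCII 0x21..0x7E


def to_fullwidth(text):
    out = []
    for ch in text:
        code = ord(ch)
        if ch == " ":
            out.append("\u3000")  # ideographic space
        elif 0x21 <= code <= 0x7E:
            out.append(chr(code + FULLWIDTH_OFFSET))
        else:
            out.append(ch)
    return "".join(out)


def to_halfwidth(text):
    out = []
    for ch in text:
        code = ord(ch)
        if ch == "\u3000":
            out.append(" ")
        elif 0xFF01 <= code <= 0xFF5E:
            out.append(chr(code - FULLWIDTH_OFFSET))
        else:
            out.append(ch)
    return "".join(out)


print(to_fullwidth("KDE 5!"))
```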
Japanese is a mashup of all of these things, with Japanese users
typing in two Japanese-specific syllabaries, Chinese characters
and Latin during a typical session.
= Other uses for input methods =
The ever more popular Emoji character set may turn writers of
languages which typically do not require an input method into
input method users.
On Fedora/Gnome systems, pressing Ctrl+Shift+e inserts a @
character into the text field, and typing a string such as
"heart" will select among suitable emoji (shown in a popup
akin to Chinese character input). The text is underlined
during composition and the underline disappears when
composition is complete.
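A rough sketch of the lookup step, assuming candidates are matched
against Unicode character names. The short emoji list is illustrative
and this is not how the ibus engine is actually implemented.

```python
import unicodedata

# A small illustrative pool; a real engine indexes far more characters
# and also matches on annotations and localized keywords.
EMOJI = ["\u2764", "\U0001F494", "\U0001F600", "\U0001F431"]


def emoji_candidates(query):
    """Return emoji whose Unicode name contains the typed string."""
    needle = query.upper()
    return [e for e in EMOJI if needle in unicodedata.name(e, "")]


for emoji in emoji_candidates("heart"):
    print(emoji, unicodedata.name(emoji))
```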
Under the hood, this is implemented using an ibus input method
Emoji input can also be made available via context menu actions
Another input method engine that's language-agnostic in basic
conception is the increasingly popular "typing booster", which
provides word completion and spell check suggestions on the
fly. This is closely related to similar features in virtual
keyboards.
As this indicates, multiple input methods may also be chained
or coexist modally or just cycled through.
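As a language-agnostic illustration of on-the-fly completion, here is
a minimal sketch that learns word frequencies from what was typed and
suggests by prefix. It is not modeled on typing-booster's actual
code; the class and method names are hypothetical.

```python
from collections import Counter


class Completer:
    def __init__(self):
        self.counts = Counter()

    def learn(self, text):
        """Record words the user has typed."""
        self.counts.update(text.lower().split())

    def suggest(self, prefix, limit=3):
        """Most frequent known words starting with the prefix."""
        prefix = prefix.lower()
        matches = [
            (word, count)
            for word, count in self.counts.items()
            if word.startswith(prefix) and word != prefix
        ]
        # Most frequent first, alphabetical among ties.
        matches.sort(key=lambda item: (-item[1], item[0]))
        return [word for word, _ in matches[:limit]]


completer = Completer()
completer.learn("plasma panel plasma popup plugin panel plasma")
print(completer.suggest("p"))
```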
= The players on the field =
Time for a brief overview of the input method components
currently in common use on free systems. Afterwards I'll talk
about how we currently interact with and expose these things
ibus is a framework and daemon for input method engines. The
ibus project provides the central daemon, as well as default
UIs for managing input method engine plugins and a default
frontend for showing a tray icon and input method popups.
It also develops a set of input method engine plugins (with a
plugin providing e.g. Korean support, or something like the
aforementioned typing-booster), although third parties can
develop and deploy their own using public API.
The config UI and the tray/popup UI (called a "panel" in
ibus parlance) can be replaced by third-party components as
well.
fcitx is a competitor to ibus that covers much of the same
ground. Like in ibus, there is a central component, engine
plugins, config frontend, and so on.
It's worth noting that there is contributor overlap between
Plasma and fcitx. I personally know less about fcitx because
my distro supports ibus better, but I don't mean to present
ibus as the default choice.
There's a handful of other entries in this space - solutions
focussing on a single language but standalone instead of
using the ibus or fcitx frameworks (e.g. the Navi/Nabi input
method for Korean), or legacy systems like scim.
Mobile input stacks duplicate much of the work found in ibus
and fcitx as well, e.g. Maliit and the Qt Virtual Keyboard
both have their own language-specific engine implementations.
This is unfortunate, as config and state are not shared
between physical and virtual keyboards, and feature set and
behavior may differ.
Input method engine plugins to ibus and fcitx and others
sometimes rely on the same library stack, e.g. libhangul for
Korean.
Input method systems interact with applications and toolkits
via protocols like XIM or the Wayland text-input protocol,
along with toolkit plugins. Qt offers a public API for input
method plugins. Qt 5 bundles plugins for compose key support,
ibus and non-X11 platforms. For Qt 4, the ibus plugin is an
independent install provided by ibus. fcitx' plugins are an
independent install as well (iirc).
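On X11, which toolkit plugin gets loaded is commonly steered through
environment variables. The sketch below builds the usual set for a
given framework. The variable names are the widely documented ones,
but treat the exact values as an assumption, since distros sometimes
set them differently; the helper name is made up.

```python
import os

SUPPORTED = ("ibus", "fcitx")


def im_environment(framework):
    """Environment variables pointing toolkits at one IM framework."""
    if framework not in SUPPORTED:
        raise ValueError("unknown framework: " + framework)
    return {
        "QT_IM_MODULE": framework,     # selects the Qt input method plugin
        "GTK_IM_MODULE": framework,    # same for GTK+ applications
        "XMODIFIERS": "@im=" + framework,  # XIM fallback
    }


env = dict(os.environ)
env.update(im_environment("ibus"))
print({key: env[key] for key in sorted(im_environment("ibus"))})
```

The resulting dictionary could be passed as the env argument of
subprocess.Popen to start a single application against a specific
framework for testing.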
= The situation in Plasma 5 right now =
Text input in Plasma 5 is currently handled by the following
* A System Settings module offering keyboard layout management
* A dynamic panel indicator for keyboard layout state and management
* A panel applet for key state (CAPS lock, etc.)
* An "Input Method Panel" (kimpanel) widget that provides
state/popup UI frontend to ibus (i.e. a "panel", replacing ibus'
default GTK+ panel UI), fcitx and scim
* KWin is smart enough to do cool things like switch keyboard
layouts automatically per virtual desktop or even window
* The wide character set support of our recommended default
typeface (Noto Sans)
And now the problems start: Once an input method is used, many
of these components become useless, unsupported or show ugly
Additionally, setup often requires heavy distro tooling or expert
knowledge.
Here's an example playbook of outfitting an existing, English
input Plasma 5 system to handle Korean input using ibus:
- Install ibus and ibus-hangul.
- Manually add ibus-daemon to the system autostart.
- Manually add the Input Method Panel widget to the panel.
- Use the (GTK+) ibus config UI to manage English and Korean
input method engines. The config UI is accessed via the Input
Method Panel widget, it's not available in System Settings.
- Use the (GTK+) ibus config UI to manage your keyboard
layouts for your input modes. The System Settings keyboard
layout module becomes useless when using ibus. While ibus
in theory has integration with the xkb "system keyboard
layout", in practice things end up fighting each other
and the System Settings layout handling has to be disabled.
- The dynamic keyboard layout panel indicator becomes useless.
- KWin-assisted layout switching becomes useless.
- Up until commits on master yesterday that made it a little
easier, the user may need expert knowledge to get the Input
Method Panel widget to work. By default ibus will start its
default bundled GTK+ panel frontend and Input Method Panel
will not work, unless disk configuration is changed or the
ibus-daemon is started with the right CLI args. On master
Input Method Panel will work, but (due to an ibus bug) the
GTK+ panel will co-exist and clutter up the tray.
- The Input Method Panel widget itself is pretty great, but
somewhat poorly integrated into the overall Plasma panel
UX. It looks and behaves as a second system tray, showing
icon buttons for the various features the active input
method engine provides (these may change dynamically during
input or as the engine is switched), with a distinct system
for hiding individual buttons.
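The autostart and "right CLI args" steps from the playbook above can
be sketched as generating an XDG autostart entry. Loud assumptions:
the panel binary path is a guess that varies by distro, and the file
name is arbitrary. The ibus-daemon options used (--daemonize, --xim,
--replace, --panel) are from its command-line help, but check them
against your installed version.

```python
import os

# Assumption: where the kimpanel backend for ibus is installed.
ASSUMED_PANEL = "/usr/lib/kimpanel-ibus-panel"


def autostart_entry(panel_cmd=ASSUMED_PANEL):
    """Desktop entry text that starts ibus with a replacement panel."""
    exec_line = "ibus-daemon --daemonize --xim --replace --panel=" + panel_cmd
    lines = [
        "[Desktop Entry]",
        "Type=Application",
        "Name=IBus",
        "Exec=" + exec_line,
    ]
    return "\n".join(lines) + "\n"


def autostart_path(config_home=None):
    """Location of the entry under the user's XDG config directory."""
    base = (
        config_home
        or os.environ.get("XDG_CONFIG_HOME")
        or os.path.expanduser("~/.config")
    )
    return os.path.join(base, "autostart", "ibus-daemon.desktop")


# Printed rather than written, so nothing is changed on disk.
print(autostart_path())
print(autostart_entry())
```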
Using fcitx, the situation is slightly better but similar. Unlike
ibus, fcitx provides a third-party config module that integrates
into System Settings. But our bundled modules become likewise
mostly useless, as does KWin assist, etc.
The playbook looks better assuming a fresh first-time
installation. Here's the best case:
- Select the Korean language while installing $distro.
- $distro takes care of pulling in the packages and setting up
- $distro takes care of arbitrating between Input Method Panel
and upstream UI frontends.
- Plasma on first log-in auto-adds Input Method Panel to the
default panel in locales it knows need it.
Even better, on some distros language management tools outside of
System Settings (e.g. YaST) can be used to add languages later that
will do many of these steps, except for adding Input Method Panel,
which the user needs to do manually.
Unfortunately this best case scenario isn't the reality in many
distros that ship Plasma 5 currently.
= The situation on Wayland =
Currently, only ibus works on Wayland. Input method popups are not
positioned correctly, but otherwise things work. The situation is
just as good or bad as on X11.
= Where we need to go =
* Input method use needs to be a primary usage scenario, not an
afterthought. We should take care of initializing the input
* Our keyboard layout management UI needs to become an input
language management UI, configuring both input methods and
keyboard layouts.
* The dynamic keyboard layout indicator and Input Method Panel
need to be merged to eliminate redundancy and make the latter
dynamic, not requiring the user to know they need to manually
add a widget.
* Input Method Panel needs to be integrated with the system tray
widget, reusing its show/hide infra and eliminating inconsistent
layout and behavior seams.
* KWin-assist (dynamic layout switching) needs to think in
input languages, not keyboard layouts.
* We should surface input method-assisted functionality like
Emoji input and typing-booster.
* Physical and virtual keyboards should share a common feature
set, common behavior and state.
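To illustrate what "thinking in input languages" could mean for
per-window switching, here is a purely hypothetical data model in
which a layout and an optional input method engine travel together.
None of these names exist in KWin or Plasma; the engine identifier is
also just a placeholder.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class InputLanguage:
    name: str
    layout: str                   # keyboard layout, e.g. an xkb name
    engine: Optional[str] = None  # input method engine, if one is needed


class PerWindowInputState:
    """Remembers the input language last used in each window."""

    def __init__(self, default):
        self.default = default
        self._by_window: Dict[str, InputLanguage] = {}

    def remember(self, window_id, language):
        self._by_window[window_id] = language

    def on_window_activated(self, window_id):
        # Layout and engine are restored as one unit.
        return self._by_window.get(window_id, self.default)


ENGLISH = InputLanguage("English", "us")
KOREAN = InputLanguage("Korean", "kr", engine="hangul")

state = PerWindowInputState(default=ENGLISH)
state.remember("chat-window", KOREAN)
print(state.on_window_activated("chat-window"))
print(state.on_window_activated("terminal"))
```

The design point is only that switching restores one combined value,
so layout and engine cannot drift apart the way they do today.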
This is catch-up work; other systems are already there. On
the flip side, we have some nice bits to start with (the good
Input Method Panel, the powerful tray implementation, generally
the power of the Plasma platform).
There's a common position that holds "but computers are used
in English, duh". But even a hardcore developer audience that
does use a non-localized system has a need to socialize with
people in their native writing system. Unfortunately, it is
precisely on those systems (not initially installed in the
native language) that the biggest pain points lie.
This work is incredibly vital for making Plasma accessible to
hundreds of millions of additional potential users. It's
incredibly vital for our mission to make free software
alternatives available to more people, and strongly aligned
with the Inclusivity goal in the KDE manifesto.
= Screenshots =
Here are some complementary screenshots of Input Method Panel
on X11, with the Korean input method engine active:
As you can see, the poorly positioned context menus are
another entry in the "integration seams" list.
And some screenshots of ibus configuration: