Complex text input in Plasma

Thu Apr 6 19:58:37 UTC 2017

On Thursday, 6 April 2017 10:16:14 PDT, Eike Hein wrote:
> Hi,
> 
> In the aftermath of D5301, Martin asked to compile a document on the
> requirements for complex text input in Plasma, especially with the
> opportunities provided by the Wayland transition. It makes sense to
> share this document with all of you, too.
> 
> 
> = I. What Input Methods do =
> 
> Basically, with simple text input (and skipping a few stack steps in
> this explanation), the user presses a key on the keyboard, it's
> interpreted according to the active layout, and results in a symbol
> on screen (modulo complicated ideas like dead keys and the Compose
> key, which also enter into input method territory as we'll see).
> 
> With some writing systems, this isn't enough. They require multiple
> key presses or pick-and-choose selection steps to produce a symbol.
> The general idea is to disconnect the input from the output and
> introduce a conversion in between that's more complex than a mere
> keyboard layout. The conversion itself may have complex UI like
> temporary feedback in the textfield, popups positioned relative
> to the text field, or state visualization in shell chrome.
> 
> There are other ways to access input methods than just the keyboard
> these days, too. I'll briefly mention one of them later.
> 
> 
> = II. Practical examples of Input Method use =
> 
> a) Korean
> 
> Korean is written using an alphabet similar to German. It has about
> the same number of letters, with each letter representing a vowel,
> a consonant, or a diphthong.
> 
> On computers, each letter has its own key on the keyboard. As with
> German, there are multiple keyboard layouts for the Korean alphabet,
> although one is dominant and considered standard by far.
> 
> Unlike the German alphabet however, letters in the Korean alphabet
> are grouped together into (morpho-)syllabic blocks when written.
> Each block must start with a consonant and contain a vowel or
> diphthong. It may optionally end in one or two consonants.
> 
> Here are some letters and their corresponding Latin letters (the
> sounds they most closely correspond to - their locations in the
> keyboard layout do not match QWERTY/QWERTZ):
> 
> ㅎ h
> ㅏ a
> ㄴ n
> 
> In linear order these form the syllable "han". When written
> properly, these are grouped together:
> 
> 한 han
> 
> In the UI, when pressing the keyboard keys for each letter, the
> text field contents cycle through these stages:
> 
> ㅎ
> 하
> 한
> 
> As you can see, the existing text is replaced two times, making
> the operation stateful. The Input Method Engine generates complex
> events with text payload, state hints, even formatting hints (some
> IMEs use text color, e.g. color inversion, to communicate state
> such as "this input is not finalized yet") that are delivered to
> the application. In Qt, a plugin corresponding to the Input Method
> (e.g. ibus or fcitx) translates these events into QInputMethodEvent
> objects that are delivered to widgets and processed there.
> 
> The rules of the Korean alphabet additionally have some
> implications for cursor movement/behavior. E.g. because it's not
> allowed to start a block with two consonants, and because the number
> of vowels and ending consonants is limited, as keys are pressed a
> block might implicitly finish composing and the cursor moves on.
> 
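A minimal Python sketch of the "han" composition steps quoted above. It uses the code point arithmetic of the Unicode Hangul Syllables block and only handles one simple block (lead consonant, vowel, optional single final consonant). The function names are made up for illustration; real engines (e.g. those built on libhangul) also deal with compound letters, block boundaries and backspace.

```python
# Compatibility jamo, listed in the order used by the Unicode
# Hangul Syllables block (19 leads, 21 vowels, 27 tails plus "no tail").
LEADS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"
VOWELS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"
TAILS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")


def compose(lead, vowel, tail=""):
    """Combine jamo into one precomposed syllable block."""
    index = (LEADS.index(lead) * len(VOWELS) + VOWELS.index(vowel)) * len(TAILS)
    return chr(0xAC00 + index + TAILS.index(tail))


def preedit_stages(keys):
    """Return the text field contents after each key press (one block only)."""
    lead = vowel = tail = ""
    stages = []
    for key in keys:
        if not lead:
            lead = key
        elif not vowel:
            vowel = key
        else:
            tail = key
        # Until a vowel arrives, the lone consonant is shown as-is.
        stages.append(compose(lead, vowel, tail) if vowel else lead)
    return stages


print(preedit_stages("ㅎㅏㄴ"))
```

Each entry in the returned list replaces the previous one in the text field, which is the statefulness described above.
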
> b) Chinese
> 
> There are many different strategies for inputting Chinese
> characters. The most common actually makes use of the Latin
> alphabet, and keyboard layouts for the Latin alphabet. Chinese
> characters have assigned sound values (i.e., how a human actually
> pronounces a character), and there are rule systems to transcribe
> these using the Latin alphabet, e.g. Pinyin.
> 
> As users write some Pinyin using Latin characters, a selection
> popup will offer a list of Chinese characters matching the input.
> The user picks one and it's inserted into the text field.
> 
> Chinese input methods try to be very smart in what characters
> they offer, taking preceding input, common phrasings, etc. into
> account, making them highly stateful things.
> 
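A toy Python sketch of the candidate popup logic, assuming a tiny hand-written table and a simple "most often picked first" ranking. The class and the table are hypothetical; real Chinese engines use large dictionaries and take surrounding text into account, as noted above.

```python
from collections import Counter

# Tiny illustrative table; a real engine ships a large dictionary.
CANDIDATES = {
    "ni": ["你", "尼", "泥"],
    "hao": ["好", "号", "毫"],
}


class ToyPinyinEngine:
    def __init__(self, table):
        self.table = table
        self.picks = Counter()  # how often the user committed each character

    def lookup(self, pinyin):
        """Return candidates, most frequently picked first."""
        base = self.table.get(pinyin, [])
        # sorted() is stable, so ties keep the table order.
        return sorted(base, key=lambda ch: -self.picks[ch])

    def commit(self, pinyin, position):
        """Insert the candidate at `position` and remember the choice."""
        ch = self.lookup(pinyin)[position]
        self.picks[ch] += 1
        return ch


engine = ToyPinyinEngine(CANDIDATES)
engine.commit("ni", 2)        # the user picks the third candidate once
print(engine.lookup("ni"))    # that candidate is now offered first
```
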
> c) Modes, Input Method overlap
> 
> Korean used to be written using Chinese characters (this simple
> statement is the tip of a large iceberg of complicated history
> and rulesets :), and especially Korean academic writing still
> makes use of them. Korean input methods therefore tend to also
> offer the ability to type Chinese characters, with a mode toggle
> between them.
> 
> Korean input methods also usually have a Hangul (the Korean
> alphabet) vs. Latin mode toggle.
> 
> Chinese input methods tend to have mode toggles for things like
> choosing between half-width and full-width characters, i.e. also
> offer control over typography.
> 
> Japanese is a mashup of all of these things, with Japanese users
> typing in two Japanese-specific syllabaries, Chinese characters
> and Latin during a typical session.
> 
> 
> = Other uses for input methods =
> 
> The ever more popular Emoji character set may turn writers of
> languages which typically do not require an input method into
> input method users.
> 
> On Fedora/Gnome systems, pressing Ctrl+Shift+e inserts a @
> character into the text field, and typing a string such as
> "heart" will select among suitable emoji (shown in a popup
> akin to Chinese character input). The text is underlined
> during composition and the underline disappears when
> composition is complete.
> 
> Under the hood, this is implemented using an ibus input method
> plugin.
> 
> Emoji input can also be made available via context menu actions
> and similar.
> 
> Another input method engine that's language-agnostic in basic
> conception is the increasingly popular "typing booster", which
> provides word completion and spell check suggestions on the
> fly. This is closely related to similar features in virtual
> keyboards.
> 
> As this indicates, multiple input methods may also be chained
> or coexist modally or just cycled through.
> 
> 
> = The players on the field =
> 
> Time for a brief overview of the input method components
> currently in common use on free systems. Afterwards I'll talk
> about how we currently interact with and expose these things
> in Plasma.
> 
> a) ibus
> 
> ibus is a framework and daemon for input method engines. The
> ibus project provides the central daemon, as well as default
> UIs for managing input method engine plugins and a default
> frontend for showing a tray icon and input method popups.
> 
> It also develops a set of input method engine plugins (with a
> plugin providing e.g. Korean support, or something like the
> aforementioned typing-booster), although third parties can
> develop and deploy their own using public API.
> 
> The config UI and the tray/popup UI (called a "panel" in
> ibus parlance) can be replaced by third-party components as
> well.
> 
> b) fcitx
> 
> fcitx is a competitor to ibus that covers much of the same
> ground. Like in ibus, there is a central component, engine
> plugins, config frontend, and so on.
> 
> It's worth noting that there is contributor overlap between
> Plasma and fcitx. I personally know less about fcitx because
> my distro supports ibus better, but I don't mean to present
> ibus as the default choice.
> 
> c) Others
> 
> There's a handful of other entries in this space - solutions
> focussing on a single language but standalone instead of
> using the ibus or fcitx frameworks (e.g. the Navi/Nabi input
> method for Korean), or legacy systems like scim.
> 
> Mobile input stacks duplicate much of the work found in ibus
> and fcitx as well, e.g. Maliit and the Qt Virtual Keyboard
> both have their own language-specific engine implementations.
> This is unfortunate, as config and state are not shared
> between physical and virtual keyboards, and feature set and
> behavior may differ.
> 
> Input method engine plugins to ibus and fcitx and others
> sometimes rely on the same library stack, e.g. libhangul for
> Korean.
> 
> 
> Input method systems interact with applications and toolkits
> via protocols like XIM or the Wayland text-input protocol,
> along with toolkit plugins. Qt offers a public API for input
> method plugins. Qt 5 bundles plugins for compose key support,
> ibus and non-X11 platforms. For Qt 4, the ibus plugin is an
> independent install provided by ibus. fcitx' plugins are an
> independent install as well (iirc).
> 
> 
> = The situation in Plasma 5 right now =
> 
> Text input in Plasma 5 is currently handled by the following
> components:
> 
> * A System Settings module offering keyboard layout management
> * A dynamic panel indicator for keyboard layout state and management
> * A panel applet for key state (CAPS lock, etc.)
> * An "Input Method Panel" (kimpanel) widget that provides
>   state/popup UI frontend to ibus (i.e. a "panel", replacing ibus'
>   default GTK+ panel UI), fcitx and scim
> * KWin is smart enough to do cool things like switch keyboard
>   layouts automatically per virtual desktop or even window
> * The wide character set support of our recommended default
>   typeface (Noto Sans)
> 
> And now the problems start: Once an input method is used, many
> of these components become useless, unsupported or show ugly
> integration seams.
> 
> Additionally, setup often requires heavy distro tooling or expert
> knowledge.
> 
> Here's an example playbook of outfitting an existing, English
> input Plasma 5 system to handle Korean input using ibus:
> 
> - Install ibus and ibus-hangul.
> 
> - Manually add ibus-daemon to the system autostart.
> 
> - Manually add the Input Method Panel widget to the panel.
> 
> - Use the (GTK+) ibus config UI to manage English and Korean
>   input method engines. The config UI is accessed via the Input
>   Method Panel widget; it's not available in System Settings.
> 
> - Use the (GTK+) ibus config UI to manage your keyboard
>   layouts for your input modes. The System Settings keyboard
>   layout module becomes useless when using ibus. While ibus
>   in theory has integration with the xkb "system keyboard
>   layout", in practice things end up fighting each other
>   and the System Settings layout handling has to be disabled.
> 
> - The dynamic keyboard layout panel indicator becomes useless.
> 
> - KWin-assisted layout switching becomes useless.
> 
> - Up until commits on master yesterday that made it a little
>   easier, the user may need expert knowledge to get the Input
>   Method Panel widget to work. By default ibus will start its
>   default bundled GTK+ panel frontend and Input Method Panel
>   will not work, unless disk configuration is changed or the
>   ibus-daemon is started with the right CLI args. On master
>   Input Method Panel will work, but (due to an ibus bug) the
>   GTK+ panel will co-exist and clutter up the tray.
> 
> - The Input Method Panel widget itself is pretty great, but
>   somewhat poorly integrated into the overall Plasma panel
>   UX. It looks and behaves as a second system tray, showing
>   icon buttons for the various features the active input
>   method engine provides (these may change dynamically during
>   input or as the engine is switched), with a distinct system
>   for hiding individual buttons.
> 
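For the "add ibus-daemon to the autostart" step in the playbook above, a hedged Python sketch that writes an XDG autostart entry. The file name and the helper names are made up. The environment variables are the ones toolkits conventionally read to select the ibus module; they are my addition, not something the playbook spells out, and a distro may handle all of this differently.

```python
from pathlib import Path


def ibus_autostart_entry():
    """Render an XDG autostart entry that launches ibus-daemon at login."""
    return "\n".join([
        "[Desktop Entry]",
        "Type=Application",
        "Name=IBus",
        # -d: daemonize, -r: replace a running instance, -x: start the XIM server
        "Exec=ibus-daemon -drx",
    ]) + "\n"


# Assumption: conventional variables for pointing GTK+, Qt and XIM clients at ibus.
IBUS_ENV = {
    "GTK_IM_MODULE": "ibus",
    "QT_IM_MODULE": "ibus",
    "XMODIFIERS": "@im=ibus",
}


def install(autostart_dir):
    """Write the entry into the given autostart directory and return its path."""
    target = Path(autostart_dir) / "ibus-daemon.desktop"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(ibus_autostart_entry())
    return target
```

Calling `install(Path.home() / ".config" / "autostart")` would cover the per-user case. It does not arbitrate between the GTK+ panel and Input Method Panel.
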
> Using fcitx, the situation is slightly better but similar. Unlike
> ibus, fcitx provides a third-party config module that integrates
> into System Settings. But our bundled modules become likewise
> mostly useless, as does KWin assist, etc.
> 
> The playbook looks better assuming a fresh first-time
> installation. Here's the best case:
> 
> - Select the Korean language while installing $distro.
> 
> - $distro takes care of pulling in the packages and setting up
>   autostart.
> 
> - $distro takes care of arbitrating between Input Method Panel
>   and upstream UI frontends.
> 
> - Plasma on first log-in auto-adds Input Method Panel to the
>   default panel in locales it knows need it.
> 
> Even better, on some distros language management tools outside of
> System Settings (e.g. YaST) can be used to add languages later that
> will do many of these steps, except for adding Input Method Panel,
> which the user needs to do manually.
> 
> Unfortunately this best case scenario isn't the reality in many
> distros that ship Plasma 5 currently.
> 
> 
> = The situation on Wayland =
> 
> Currently, only ibus works on Wayland. Input method popups are not
> positioned correctly, but otherwise things work. The situation is
> just as good or bad as on X11.
> 
> 
> = Where we need to go =
> 
> * Input method use needs to be a primary usage scenario, not an
>   afterthought. We should take care of initializing the input
>   method system.
> 
> * Our keyboard layout management UI needs to become an input
>   language management UI, configuring both input methods and
>   layouts.
> 
> * The dynamic keyboard layout indicator and Input Method Panel
>   need to be merged to eliminate redundancy and make the latter
>   dynamic, not requiring the user to know they need to manually
>   add a widget.
> 
> * Input Method Panel needs to be integrated with the System tray
>   widget, reusing its show/hide infra and eliminating inconsistent
>   layout and behavior seams.
> 
> * KWin-assist (dynamic layout switching) needs to think in
>   input languages, not keyboard layouts.
> 
> * We should surface input method-assisted functionality like
>   Emoji input and typing-booster.
> 
> * Physical and virtual keyboards should share a common feature
>   set, common behavior and state.
> 
> This is catch-up work; other systems are already there. On
> the flip side, we have some nice bits to start with (the good
> Input Method Panel, the powerful tray implementation, generally
> the power of the Plasma platform).
> 
> There's a common position that holds "but computers are used
> in English, duh". But even a hardcore developer audience that
> does use a non-localized system has a need to socialize with
> people in their native writing system. And it's unfortunately
> especially on those systems (not initially installed in the
> native language) where the biggest pain points lie.
> 
> This work is incredibly vital for making Plasma accessible to
> hundreds of millions of additional potential users. It's
> incredibly vital for our mission to make free software
> alternatives available to more people, and strongly aligned
> with the Inclusivity goal in the KDE manifesto.
> 
> 
> = Screenshots =
> 
> Here are some complementary screenshots of Input Method Panel
> on X11, with the Korean input method engine active:
> 
> http://imgur.com/a/F2d9Y
> 
> As you can see, the poorly positioned context menus are
> another entry in the "integration seams" list.
> 
> And some screenshots of ibus configuration:
> 
> http://imgur.com/a/EIW3u
> 
> 
> 
> Cheers,
> Eike

I'd like to be clearer about what input methods would need from the
kwin-wayland side.

1. Current situation
Right now, almost all input method frameworks on Linux are implemented with a
client-server architecture. The clients are the actual applications: they send
key events to the input method server, and the input method server replies
with what to do. Currently under X11, the client also sends the coordinates of
the cursor location to the input method server, and the server moves a window
to that location. That window usually shows candidates and a bunch of other
stuff.
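
A rough Python sketch of that exchange, with the server modelled as a plain object instead of a separate process. The class and method names are hypothetical and no real IPC is involved; it only shows the two kinds of messages a client sends and what the server does with them.

```python
from dataclasses import dataclass


@dataclass
class ToyInputMethodServer:
    preedit: str = ""
    window_pos: tuple = (0, 0)

    def process_key_event(self, key):
        """Handle one key from a client and reply with what the client should do."""
        if key == "Return":
            text, self.preedit = self.preedit, ""
            return ("commit", text)
        self.preedit += key
        return ("preedit", self.preedit)

    def set_cursor_location(self, x, y):
        # Under X11 these coordinates are global, so the server can
        # move its candidate window there on its own.
        self.window_pos = (x, y)


server = ToyInputMethodServer()
print(server.process_key_event("n"))
print(server.process_key_event("i"))
server.set_cursor_location(120, 340)
print(server.process_key_event("Return"))
```
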

2. Input method protocols
- XIM
- im module based custom protocols (gtk/qt/sdl...)
- wayland-input-method (It's really bad, and no one is using it. The only
support I know of is from the EFL toolkit; Qt/GTK don't work with it)

3. What makes it harder on Wayland?
Wayland does not have global positioning.
- If the client is using XIM, it works under Xwayland, but the position sent
by the client may be just an offset from the window.
- The same applies to im modules.
- The Wayland im protocol does not send the position; instead, it allows the
client to set a surface to be the input method panel, and asks the compositor
to move it around.

No existing protocol meets all of these requirements, so KWin needs to
provide some way for a client to move a window around.

Also, I'd like to add how gnome-shell is doing this, though it's not
necessarily the only way.

gnome-shell's kimpanel equivalent is bundled into gnome-shell itself, which
lets it access the window's position. The input method server sends the
offset received from the client (which is not global), and gnome-shell moves
the panel to (offset + current window location).
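
As a tiny Python sketch of that arithmetic (the function name is made up; only something with access to the window geometry, such as the shell or compositor, can supply `window_origin`):

```python
def panel_position(window_origin, cursor_offset):
    """Turn a window-relative cursor offset into a global panel position."""
    wx, wy = window_origin
    ox, oy = cursor_offset
    return (wx + ox, wy + oy)


# A window at (200, 100) whose client reports the cursor at offset (35, 60).
print(panel_position((200, 100), (35, 60)))
```
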

4. Let the input method server control the keyboard layout (which layout to use)
A keyboard layout is nothing special compared with an input method. Nowadays,
modern input method frameworks are trying to take over all of this. This is
essential for users to get the best experience if they use multiple input
methods, because there's a concept called an "input context", which does not
necessarily map one-to-one to a window.
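
To illustrate why per-window switching is not enough, here is a small Python sketch in which the server remembers the active engine or layout per input context. The class, the context ids and the engine names are all hypothetical.

```python
class ToyContextRegistry:
    def __init__(self, default="xkb:us"):
        self.default = default
        self.active = {}  # context id -> engine or layout name

    def switch(self, context_id, engine):
        """Record that the user switched engines while this context had focus."""
        self.active[context_id] = engine

    def on_focus(self, context_id):
        """Return what the server would activate when this context gains focus."""
        return self.active.get(context_id, self.default)


registry = ToyContextRegistry()
# Two text fields in the same window, each with its own input context.
registry.switch("editor:body", "hangul")
print(registry.on_focus("editor:body"))
print(registry.on_focus("editor:search"))
```

A compositor that only tracks layouts per window would treat both fields the same, which is why the input method server is the better place for this state.
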