APIs for persistent remote access and headless/hybrid sessions

Erik Jensen rkjnsn at google.com
Thu Feb 29 20:51:07 GMT 2024


> We are definitely interested in working in this area. We have a lot of
> the proposed core infrastructure that's listed in that proposed
> specification; pipewire, creation of virtual screens, libei access is
> coming soon. It's "just" a case of gluing everything together.

That's great to hear!

> The current portal API is expanding with a concept of tokens to allow
> creation of streams without user prompts and access. I was hoping this
> would suffice if it was coupled with some mechanism to get these
> tokens ahead of time?

That should work as long as there is a mechanism to provision the
tokens in an automated fashion that grants access to virtual monitor
configuration and clipboard in addition to capture and input
injection. (One of our use cases is being able to set up CRD during
the automated provisioning process for a VM, so the user can connect
immediately after provisioning.) Indeed, since our remote assistance
flow will be using the Portal APIs in any event, having less
divergence would not be unwelcome. (The proposed dedicated API still
uses PipeWire and libei, though, so it wouldn't be a big deal either
way.)
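
To illustrate what "grants access to" would need to cover, here is a
small Python sketch that checks a provisioned token's scope against
what a persistent remote access tool needs. The Access names and the
missing_access() helper are placeholders of mine, not part of any
existing portal API.

```python
from enum import Flag, auto


class Access(Flag):
    """Hypothetical capability bits a provisioned token might carry."""
    CAPTURE = auto()         # screen capture (PipeWire streams)
    INPUT = auto()           # input injection (libei)
    MONITOR_CONFIG = auto()  # virtual monitor configuration
    CLIPBOARD = auto()       # clipboard access


# Everything a persistent remote access tool needs, per the list above.
REQUIRED = (
    Access.CAPTURE | Access.INPUT | Access.MONITOR_CONFIG | Access.CLIPBOARD
)


def missing_access(granted: Access) -> Access:
    """Return the required capabilities that the token does not grant."""
    return REQUIRED & ~granted
```

A token that only grants capture and input would still be missing
monitor configuration and clipboard, which is the gap that automated
provisioning would have to close.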

I could imagine a flow that looks like the following:
 * The remote desktop tool connects to the login manager on the system
bus using some TBD API.
 * The two negotiate an authentication method. (Graphical greeter,
username & password, PAM conversation, Kerberos, et cetera.)
 * If a graphical greeter is negotiated, a token is provided to the
remote desktop tool to connect to it. (How? Is an implementation of
the needed portal interfaces provided by the greeter on the system
D-Bus, protected by the token?)
 * When the user logs in, a shared token is passed to the desktop
environment that is being started in or transitioned to headless mode.
 * The login manager provides the remote desktop tool with the token
and some kind of handle to the resulting session.
 * The remote desktop tool spins up a process running as the target
user, which uses the token to connect to the compositor via the Portal
APIs on the user bus.
 * The remote desktop tool hands off the connection from the system
process to the new user process.
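
To make the ordering of those steps concrete, here is a rough Python
model of the non-greeter path. Everything in it is hypothetical: the
class and method names and the token format are placeholders of mine,
the login manager API does not exist yet, and no real D-Bus or Portal
calls are made.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class Session:
    """Stand-in for a headless user session and its compositor."""
    user: str
    token: str  # shared token the login manager passed at session start

    def portal_connect(self, token: str) -> bool:
        # The compositor only accepts the token it was started with.
        return secrets.compare_digest(token, self.token)


@dataclass
class LoginManager:
    """Hypothetical system-bus service; the real API is TBD."""
    supported: tuple = ("greeter", "password", "pam", "kerberos")
    passwords: dict = field(default_factory=dict)  # toy credential store

    def negotiate(self, offered):
        # Pick the first offered method that the login manager supports.
        for method in offered:
            if method in self.supported:
                return method
        raise ValueError("no common authentication method")

    def log_in(self, user, password):
        # Plain comparison is fine for a sketch, not for real credentials.
        if self.passwords.get(user) != password:
            raise PermissionError("authentication failed")
        token = secrets.token_hex(16)
        # The same token goes to the new session and back to the caller,
        # together with a handle to the session.
        return token, Session(user=user, token=token)


def remote_desktop_flow(manager, user, password):
    """Walk through the steps in order; returns (method, connected)."""
    method = manager.negotiate(["password", "greeter"])
    token, session = manager.log_in(user, password)
    # In reality a separate process running as the target user would do
    # this over the user bus, and the system process would then hand the
    # network connection over to it.
    connected = session.portal_connect(token)
    return method, connected
```

With a toy credential store such as
LoginManager(passwords={"alice": "hunter2"}), calling
remote_desktop_flow(manager, "alice", "hunter2") walks through
negotiation, login, and the token-based connection in that order.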

(Note that I'm not 100% sure how everything works, so apologies if any
aspects of that don't make sense.)

That said, there are enough differences between the current Portal API
use case and what is wanted by a persistent remote access tool that I
can definitely see the argument for a separate API like that proposed
by Jonas Ådahl. E.g., having methods to modify the monitor layout in
the Portal API that are in practice only used via persistent remote
access tools via a token and which users would rarely, if ever, want
to grant via the UI to ephemeral remote assistance tools feels weird.
Having a separate API that's only available in headless mode would
also help persistent remote access tools ensure that the connection
wasn't visible from the local workstation.

> Does Chrome Remote Desktop provide the server part that runs on the
> client computer? Or is it using some network API to connect to talk to
> the existing gnome/kde remote-desktop server?

Chrome Remote Desktop provides its own server components. It uses
WebRTC to establish a direct peer-to-peer connection between the
process running on the host machine (the machine being remotely
accessed) and a web-based client running in the user's web browser on
the client machine.


More information about the kde-devel mailing list