The curious case of stuck systemd poweroff

Andreas Hartmetz ahartmetz at gmail.com
Thu Jul 14 14:43:56 BST 2016


Hello,

On Thursday, 14 July 2016, 11:11:26 CEST, Harald Sitter wrote:
> Hola!
> 
> ever since systemd and/or sddm started not killing all our session
> processes we have had problems of poweroff/reboot getting hung up
> waiting for processes to quit.
> Recently systemd then started sending them TERM by default, which in
> theory should make things behave as before, but more often than not it
> doesn't.
> 
> The reason for this is meh to debug and altogether somewhat
> convoluted. So all that follows was partially inferred from numerous
> logging attempts.
> They are all rooted in a simple fact: ksmserver is rubbish at its job
> and only terminates half the stuff in the session before handing over
> to the outside, expecting the outside to deal with it.
> 
> I found two likely holdup scenarios caused by this:
> 
> a) procfoo is still running -> ksmserver hands over to systemd ->
> systemd stops sddm -> xserver stops -> procfoo now crashes because it
> does X things (pretty sure [1] is an instance of this) -> kcrash jumps
> in -> drkonqi -> gdb -> procfoo won't react to anything but KILL now
>
Hah, that's a nice one. It should indeed be fixed in kcrash.
 
> b) procfoo is still running -> ksmserver hands over to systemd ->
> procfoo survives without X (e.g. kio slave) -> procfoo crashes for
> (maybe unrelated) reasons such as a Qt bug because network is down ->
> kcrash gets hung up on recursive crash handling for kdeinit5 or some
> other nonsense
> 
It is not even clear that surviving processes need to be killed in case 
of logout, which one also needs to consider. See below.

> Long story short: if things crash, usually the TERM from systemd won't
> do anything.
> 
> The way I see it ksmserver needs to properly TERM everything to
> protect against a). Kcrash additionally ought to not do anything when
> its session is in shutdown to guard against both a) and b) AND allow
> core dumps to be collected instead so there actually can be a trace of
> something having gone wrong.
> 
It is not really ksmserver's job to SIGTERM or even SIGKILL
applications. It implements XSMP, which involves asking applications
nicely to exit. If they didn't, they were killed just fine until systemd
"improved" things.
Not everything participates in XSMP so ksmserver doesn't see all 
processes in any case.
What systemd ought to do is:
- when shutting down, kill everything after a short timeout
- when logging out, don't kill anything (think of screen sessions and
  such)
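
To make the two rules above concrete, here is a rough sketch of the
policy, not anything systemd actually ships. The function name
stop_processes, the shutting_down flag and the 5 second default grace
period are all made up for illustration, and it works on
subprocess.Popen objects only because that keeps the example
self-contained.

```python
import signal
import subprocess
import time


def stop_processes(procs, shutting_down, grace_seconds=5.0):
    """Apply the shutdown/logout policy to a list of subprocess.Popen objects.

    Hypothetical helper: on logout nothing is touched; on shutdown every
    process gets SIGTERM, then SIGKILL once the shared grace period is over.
    Returns a dict mapping pid -> "left running", "exited" or "killed".
    """
    if not shutting_down:
        # Logout: leave everything alone (think of screen sessions).
        return {p.pid: "left running" for p in procs}

    # Shutdown: ask nicely first.
    for p in procs:
        if p.poll() is None:
            p.send_signal(signal.SIGTERM)

    # One deadline shared by all processes, not one timeout per process.
    deadline = time.monotonic() + grace_seconds
    outcome = {}
    for p in procs:
        remaining = max(0.0, deadline - time.monotonic())
        try:
            p.wait(timeout=remaining)
            outcome[p.pid] = "exited"
        except subprocess.TimeoutExpired:
            # Still alive after the grace period: SIGKILL cannot be ignored.
            p.kill()
            p.wait()
            outcome[p.pid] = "killed"
    return outcome
```

A process that is stuck in a crash handler and ignores TERM ends up in
the "killed" bucket after the grace period, which is the case the
original mail is about.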

This is a systemd problem. Before systemd it worked as described above 
and it was good.

> Thoughts?
> 
> I have no clue how we'd implement kcrash changes since that would have
> to somehow know if the session is active without doing business on
> the heap. For ksmserver we could probably lean on systemd to give a
> proc list of the session.
> 
So that would mean adding code on our side and integrating deeper with 
systemd to unbreak systemd behavior. I think systemd should do its job 
properly and get out of the way.
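
For reference, the "proc list of the session" Harald mentions could in
principle be read from the session's cgroup, since the kernel lists the
member PIDs in a cgroup.procs file, one per line. This is only a sketch:
session_pids is a made-up name, and where the session scope directory
lives (something like a session-<id>.scope directory under
/sys/fs/cgroup) is an assumption that depends on the systemd version and
cgroup setup, so the caller has to pass it in.

```python
from pathlib import Path


def session_pids(cgroup_dir):
    """Return the sorted PIDs listed in <cgroup_dir>/cgroup.procs.

    cgroup_dir is supplied by the caller; this hypothetical helper does
    not try to work out which cgroup belongs to the session.
    """
    procs_file = Path(cgroup_dir) / "cgroup.procs"
    pids = []
    for line in procs_file.read_text().splitlines():
        line = line.strip()
        if line:  # skip blank lines
            pids.append(int(line))
    return sorted(pids)
```

The list is a snapshot and may already be stale when it is used, since
processes can fork or exit in between.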

> [1] https://bugs.kde.org/show_bug.cgi?id=364340

Cheers,
Andreas



