The curious case of stuck systemd poweroff
Andreas Hartmetz
ahartmetz at gmail.com
Thu Jul 14 19:06:58 BST 2016
On Thursday, 14 July 2016 16:19:15 CEST Harald Sitter wrote:
> On Thu, Jul 14, 2016 at 3:43 PM, Andreas Hartmetz <ahartmetz at gmail.com> wrote:
> > Hello,
> >
> > On Thursday, 14 July 2016, 11:11:26 CEST, Harald Sitter wrote:
> >> Hola!
> >>
> >> Ever since systemd and/or sddm stopped killing all our session
> >> processes, we have had problems with poweroff/reboot getting hung up
> >> waiting for processes to quit.
> >> Recently systemd started sending them TERM by default, which in
> >> theory should make things behave as before, but more often than not
> >> it doesn't.
> >>
> >> The reason for this is meh to debug and altogether somewhat
> >> convoluted, so all that follows was partially inferred from numerous
> >> logging attempts.
> >> The problems are all rooted in a simple fact: ksmserver is rubbish at
> >> its job and only terminates half the stuff in the session before
> >> handing over to the outside, expecting the outside to deal with it.
> >>
> >> I found two likely holdup scenarios caused by this:
> >>
> >> a) procfoo is still running -> ksmserver hands over to systemd ->
> >> systemd stops sddm -> xserver stops -> procfoo now crashes because it
> >> does x-things (pretty sure [1] is an instance of this) -> kcrash
> >> jumps in -> drkonqi -> gdb -> procfoo won't react to anything but
> >> KILL now
> >
> > Hah, that's a nice one. It should indeed be fixed in kcrash.
> >
> >> b) procfoo is still running -> ksmserver hands over to systemd ->
> >> procfoo survives without X (e.g. kio slave) -> procfoo crashes for
> >> (maybe unrelated) reasons such as a Qt bug because the network is
> >> down -> kcrash gets hung up on recursive crash handling for kdeinit5
> >> or some other nonsense
> >
> > It is not even clear that surviving processes need to be killed in
> > case of logout, which one also needs to consider. See below.
> >
> >> Long story short: if things crash, usually the TERM from systemd
> >> won't do anything.
> >>
> >> The way I see it, ksmserver needs to properly TERM everything to
> >> protect against a). Kcrash additionally ought to not do anything when
> >> its session is in shutdown, to guard against both a) and b), AND
> >> allow core dumps to be collected instead so there actually can be a
> >> trace of something having gone wrong.
> >
> > It is not really ksmserver's job to SIGTERM or even SIGKILL
> > applications. It implements XSMP, which involves asking applications
> > nicely to die. If they didn't, they were killed just fine until
> > systemd "improved" things.
> > Not everything participates in XSMP so ksmserver doesn't see all
> > processes in any case.
> > What systemd ought to do is:
> > - when shutting down, kill everything after a short timeout
> > - when logging out, don't kill anything (think of screen sessions
> >   and such)
> >
> > This is a systemd problem. Before systemd it worked as described
> > above and it was good.
> >
> >> Thoughts?
> >>
> >> I have no clue how we'd implement kcrash changes since that would
> >> have to somehow know if the session is active without doing
> >> business on the heap. For ksmserver we could probably lean on
> >> systemd to give a proc list of the session.
> >
> > So that would mean adding code on our side and integrating deeper
> > with systemd to unbreak systemd behavior. I think systemd should do
> > its job properly and get out of the way.
>
> so no useful input then. ok.
The hell are you talking about? The action items are:
- Disable kcrash during logout
- File an upstream bug against systemd asking it to stop its ill-advised behavior
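
For reference, the "kill everything after a short timeout" behavior wanted at shutdown can be sketched in a few lines of Python. This is only an illustration of the TERM-then-KILL escalation, not how systemd or ksmserver implement it; `stop_process`, `spawn`, and the grace period values are hypothetical names and numbers chosen for the example, and the child process is a stand-in for "procfoo".

```python
import subprocess
import sys


def stop_process(proc, grace_seconds=2.0):
    """Send SIGTERM, wait up to grace_seconds, then fall back to SIGKILL.

    Returns "terminated" if the process exited within the grace period,
    or "killed" if SIGKILL was needed.
    """
    proc.terminate()  # SIGTERM on POSIX systems
    try:
        proc.wait(timeout=grace_seconds)
        return "terminated"
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL cannot be caught or ignored
        proc.wait()  # reap the child so it does not linger as a zombie
        return "killed"


def spawn(ignore_term):
    """Start a hypothetical stand-in for 'procfoo' that sleeps for a minute.

    With ignore_term=True the child ignores SIGTERM, mimicking a process
    that no longer reacts to anything but KILL.
    """
    code = "import signal, time\n"
    if ignore_term:
        code += "signal.signal(signal.SIGTERM, signal.SIG_IGN)\n"
    code += "print('ready', flush=True)\ntime.sleep(60)\n"
    proc = subprocess.Popen([sys.executable, "-c", code],
                            stdout=subprocess.PIPE)
    proc.stdout.readline()  # block until the child has set up its signals
    proc.stdout.close()
    return proc


if __name__ == "__main__":
    print(stop_process(spawn(ignore_term=False)))
    print(stop_process(spawn(ignore_term=True), grace_seconds=0.5))
```

The point of the sketch is that the escalation lives in whoever owns the shutdown, so a process that is wedged (for example, stopped under a debugger started by a crash handler) still goes away once the grace period runs out.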