The curious case of stuck systemd poweroff

Mark Gaiser markg85 at gmail.com
Tue Jul 26 12:31:46 BST 2016


On Thu, Jul 14, 2016 at 8:17 PM, Harald Sitter <sitter at kde.org> wrote:

> On Thu, Jul 14, 2016 at 8:06 PM, Andreas Hartmetz <ahartmetz at gmail.com>
> wrote:
> > On Donnerstag, 14. Juli 2016 16:19:15 CEST Harald Sitter wrote:
> >> On Thu, Jul 14, 2016 at 3:43 PM, Andreas Hartmetz
> > <ahartmetz at gmail.com> wrote:
> >> > Hello,
> >> >
> >> > Am Donnerstag, 14. Juli 2016, 11:11:26 CEST schrieb Harald Sitter:
> >> >> Hola!
> >> >>
> >> >> ever since systemd and/or sddm started not killing all our session
> >> >> processes we have had problems of poweroff/reboot getting hung up
> >> >> waiting for processes to quit.
> >> >> Recently systemd then started sending them TERM by default, which in
> >> >> theory should make things behave as before, but more often than not
> >> >> it doesn't.
> >> >>
> >> >> The reason for this is meh to debug and altogether somewhat
> >> >> convoluted. So all that follows was partially inferred from numerous
> >> >> logging attempts.
> >> >> They all root in a simple fact: ksmserver is rubbish at its job and
> >> >> only terminates half the stuff in the session before handing over to
> >> >> the outside expecting the outside to deal with it.
> >> >>
> >> >> I found two likely holdup scenarios caused by this:
> >> >>
> >> >> a) procfoo is still running -> ksmserver hands over to systemd ->
> >> >> systemd stops sddm -> xserver stops -> procfoo now crashes because it
> >> >> does x-things (pretty sure [1] is an instance of this) -> kcrash
> >> >> jumps in -> drkonqi -> gdb -> procfoo won't react to anything but
> >> >> KILL now
> >> >
> >> > Hah, that's a nice one. It should indeed be fixed in kcrash.
> >> >
> >> >> b) procfoo is still running -> ksmserver hands over to systemd ->
> >> >> procfoo survives without X (e.g. kio slave) -> procfoo crashes for
> >> >> (maybe unrelated) reasons such as a qt bug because network is down ->
> >> >> kcrash gets hung up on recursive crash handling for kdeinit5 or
> >> >> some other nonsense
> >> >
> >> > It is not even clear that surviving processes need to be killed in
> >> > case of logout, which one also needs to consider. See below.
> >> >
> >> >> Long story short: if things crash, usually the TERM from systemd
> >> >> won't do anything.
> >> >>
> >> >> The way I see it ksmserver needs to properly TERM everything to
> >> >> protect against a). Kcrash additionally ought to not do anything when
> >> >> its session is in shutdown to guard against both a) and b) AND allow
> >> >> core dumps to be collected instead so there actually can be a trace
> >> >> of something having gone wrong.
> >> >
> >> > It is not really ksmserver's job to SIGTERM or even SIGKILL
> >> > applications. It implements XSMP, which involves asking applications
> >> > nicely to die. If they didn't, they were killed just fine until
> >> > systemd "improved" things.
> >> > Not everything participates in XSMP so ksmserver doesn't see all
> >> > processes in any case.
> >> > What systemd ought to do is:
> >> > - when shutting down, kill everything after a short timeout
> >> > - when logging out, don't kill anything (think of screen sessions
> >> >   and such)
> >> >
> >> > This is a systemd problem. Before systemd it worked as described
> >> > above and it was good.
> >> >
> >> >> Thoughts?
> >> >>
> >> >> I have no clue how we'd implement kcrash changes since that would
> >> >> have to somehow know if the session is active without doing
> >> >> business on the heap. For ksmserver we could probably lean on
> >> >> systemd to give a proc list of the session.
> >> >
> >> > So that would mean adding code on our side and integrating deeper
> >> > with systemd to unbreak systemd behavior. I think systemd should do
> >> > its job properly and get out of the way.
> >>
> >> so no useful input then. ok.
> >
> > The hell are you talking about? The action items are:
> > - Disable kcrash during logout
> > - File upstream bug in systemd to stop with its ill-advised behavior
> >
> what's the bug?
>

Hi,

Thought I might add this small bit of info from the systemd 231 release
notes [1]:

        * systemd will now log about all service processes it kills
          forcibly (using SIGKILL) because they remained after the clean
          shutdown phase of the service completed. This should help
          identifying services that shut down uncleanly. Moreover if
          KillUserProcesses= is enabled in systemd-logind's configuration
          a similar log message is generated for processes killed at the
          end of each session due to this setting.
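
For reference, KillUserProcesses= is a setting in the [Login] section of
logind.conf. A minimal fragment, assuming the stock path
/etc/systemd/logind.conf (a distribution may ship a different default or
override it with a drop-in, so treat the path as an assumption):

```ini
# /etc/systemd/logind.conf  (path assumed; drop-in files may override it)
[Login]
# Kill leftover processes of a session when the user logs out.
KillUserProcesses=yes
```

systemd-logind has to be restarted, or the machine rebooted, before the
change is picked up.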

It might help in figuring out some of this.
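
To make the "TERM first, KILL whatever is left, and log it" pattern from the
release note concrete, here is a rough Python sketch. It is not how systemd
implements it; stop_with_escalation, spawn_stubborn_child and the grace
period are made-up names and values for illustration, and it assumes a POSIX
system.

```python
import signal
import subprocess
import sys

# Child that ignores SIGTERM, standing in for a process that is stuck
# (for example inside a crash handler) and only reacts to SIGKILL.
STUBBORN = (
    "import signal, time; "
    "signal.signal(signal.SIGTERM, signal.SIG_IGN); "
    "print('ready', flush=True); "
    "time.sleep(60)"
)


def spawn_stubborn_child():
    """Start the SIGTERM-ignoring child and wait until it has set up."""
    proc = subprocess.Popen([sys.executable, "-c", STUBBORN],
                            stdout=subprocess.PIPE)
    proc.stdout.readline()  # returns once the child has printed 'ready'
    return proc


def stop_with_escalation(proc, grace_seconds=2.0):
    """Send SIGTERM, wait up to grace_seconds, then fall back to SIGKILL.

    Returns the name of the signal that ended the process, or 'exited'
    if the process terminated on its own with a normal exit code.
    """
    proc.terminate()  # SIGTERM on POSIX
    try:
        proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        # Mirrors the new log message: say which process needed force.
        print(f"pid {proc.pid} ignored SIGTERM, sending SIGKILL",
              file=sys.stderr)
        proc.kill()  # SIGKILL cannot be caught or ignored
        proc.wait()
    if proc.returncode < 0:
        return signal.Signals(-proc.returncode).name
    return "exited"


if __name__ == "__main__":
    child = spawn_stubborn_child()
    print(stop_with_escalation(child, grace_seconds=0.5))
```

The log line is the interesting part: it names the process that did not go
away on TERM, which is the information the 231 change adds to the journal.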

[1]
https://lists.freedesktop.org/archives/systemd-devel/2016-July/037220.html