The curious case of stuck systemd poweroff

Harald Sitter sitter at kde.org
Thu Jul 14 15:19:15 BST 2016


On Thu, Jul 14, 2016 at 3:43 PM, Andreas Hartmetz <ahartmetz at gmail.com> wrote:
> Hello,
>
> On Thursday, 14 July 2016, 11:11:26 CEST, Harald Sitter wrote:
>> Hola!
>>
>> ever since systemd and/or sddm stopped killing all our session
>> processes we have had problems with poweroff/reboot getting hung up
>> waiting for processes to quit.
>> Recently systemd then started sending them TERM by default, which in
>> theory should make things behave as before, but more often than not it
>> doesn't.
>>
>> The reason for this is meh to debug and altogether somewhat
>> convoluted. So all that follows was partially inferred from numerous
>> logging attempts.
>> The problems all stem from a simple fact: ksmserver is rubbish at its
>> job and only terminates half the stuff in the session before handing
>> over to the outside, expecting the outside to deal with the rest.
>>
>> I found two likely holdup scenarios caused by this:
>>
>> a) procfoo is still running -> ksmserver hands over to systemd ->
>> systemd stops sddm -> X server stops -> procfoo now crashes because it
>> does X things (pretty sure [1] is an instance of this) -> kcrash jumps
>> in -> drkonqi -> gdb -> procfoo won't react to anything but KILL now
>>
> Hah, that's a nice one. It should indeed be fixed in kcrash.
>
>> b) procfoo is still running -> ksmserver hands over to systemd ->
>> procfoo survives without X (e.g. kio slave) -> procfoo crashes for
>> (maybe unrelated) reasons such as a Qt bug because network is down ->
>> kcrash gets hung up on recursive crash handling for kdeinit5 or some
>> other nonsense
>>
> It is not even clear that surviving processes need to be killed in case
> of logout, which one also needs to consider. See below.
>
>> Long story short: if things crash, usually the TERM from systemd won't
>> do anything.
>>
>> The way I see it, ksmserver needs to properly TERM everything to
>> protect against a). KCrash additionally ought to not do anything when
>> its session is in shutdown, to guard against both a) and b) AND allow
>> core dumps to be collected instead, so there actually can be a trace
>> of something having gone wrong.
>>
> It is not really ksmserver's job to SIGTERM or even SIGKILL
> applications. It implements XSMP, which involves asking applications
> nicely to die. If they didn't, they were killed just fine until systemd
> "improved" things.
> Not everything participates in XSMP, so ksmserver doesn't see all
> processes in any case.
> What systemd ought to do is:
> - when shutting down, kill everything after a short timeout
> - when logging out, don't kill anything (think of screen sessions and
>   such)
>
> This is a systemd problem. Before systemd it worked as described above
> and it was good.
>
>> Thoughts?
>>
>> I have no clue how we'd implement kcrash changes since that would have
>> to somehow know if the session is active without doing business on
>> the heap. For ksmserver we could probably lean on systemd to give a
>> proc list of the session.
>>
> So that would mean adding code on our side and integrating deeper with
> systemd to unbreak systemd behavior. I think systemd should do its job
> properly and get out of the way.

so no useful input then. ok.



