libksane seems to break QProcess::start calls

Thiago Macieira thiago at kde.org
Fri Mar 4 17:35:23 GMT 2022


On Friday, 4 March 2022 09:02:37 PST Thiago Macieira wrote:
> > I uploaded the traces here:
> > https://l3u.de/tmp/strace_brscan.txt.xz
> > https://l3u.de/tmp/strace_plustek.txt.xz
> > 
> > Thanks again for all help!
> 
> I'll take a look. Let's see... the brscan trace doesn't have any use of
> PIDFD.

Let's see the plustek scan. It also has zero uses of PIDFD (probably libksane 
is using KProcess, which triggers the QProcess pidfd protection in Qt 5). So 
PIDFD is not the issue.

Let's see what it's doing. The first mention of SIGCHLD is here:

[pid 20601] clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|
CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fb873fff910) = 20602
[pid 20601] rt_sigaction(SIGCHLD, {sa_handler=0x7fb894726350, sa_mask=[CHLD], 
sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7fb8cd33a5a0}, 
{sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0

That's NOT QProcess. I can tell because the sequence is wrong, possibly buggy: 
it's a bad idea to install the SIGCHLD handler *after* you've fork()ed your 
child, because the child might die before you get there (unless you construct 
the child code so that it can't). QProcess also doesn't set the sa_mask, which 
this code did.

And there's no execve() in PID 20602, so it definitely isn't QProcess. This is 
just something using fork() to run some code that may or may not crash, may or 
may not hang so it may need to be kill()ed.

The second mention of SIGCHLD is this child dying:

[pid 20601] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=20602, 
si_uid=1000, si_status=0, si_utime=3, si_stime=31} ---
[pid 20601] rt_sigreturn({mask=[ALRM]}) = 0

I don't know what the signal handler did, because it made no system calls 
before returning. That indicates a poorly written signal handler (it probably 
wrote to a volatile variable). That's also probably buggy.

The third mention of SIGCHLD is QProcess/forkfd:

[pid 20579] rt_sigaction(SIGCHLD, {sa_handler=0x7fb8cda513e0, sa_mask=[], 
sa_flags=SA_RESTORER|SA_SIGINFO|SA_NOCLDSTOP, sa_restorer=0x7fb8cd33a5a0}, 
{sa_handler=0x7fb894726350, sa_mask=[CHLD], sa_flags=SA_RESTORER|SA_RESTART, 
sa_restorer=0x7fb8cd33a5a0}, 8) = 0
[pid 20579] rt_sigaction(SIGPIPE, {sa_handler=SIG_IGN, sa_mask=[], 
sa_flags=SA_RESTORER, sa_restorer=0x7fb8cd33a5a0}, NULL, 8) = 0
[pid 20579] futex(0x7fb8cdd7f3e8, FUTEX_WAKE_PRIVATE, 2147483647) = 0
[pid 20579] pipe2([24, 25], O_CLOEXEC)  = 0
[pid 20579] eventfd2(0, EFD_CLOEXEC)    = 26
[pid 20579] clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|
CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fb8ca53ba90) = 20605
[pid 20579] write(26, "*\0\0\0\0\0\0\0", 8) = 8

Note the same sequence as the previous commit, with a pipe and an eventfd, 
which gets value 42. Note also how the handler for SIGCHLD is installed before 
we fork(), with no mask.

This child's death is properly reported in the fourth mention of SIGCHLD:

[pid 20579] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=20605, 
si_uid=1000, si_status=0, si_utime=25, si_stime=0} ---
[pid 20579] waitid(P_ALL, 0, {si_signo=SIGCHLD, si_code=CLD_EXITED, 
si_pid=20605, si_uid=1000, si_status=0, si_utime=0, si_stime=0}, WNOHANG|
WEXITED|WNOWAIT, NULL) = 0
[pid 20579] wait4(20605, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG, 
{ru_utime={tv_sec=0, tv_usec=248891}, ru_stime={tv_sec=0, tv_usec=3966}, ...}) 
= 20605
[pid 20579] write(25, "\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0;
\314\3\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 152) = 152
[pid 20579] close(25)                   = 0
[pid 20579] waitid(P_ALL, 0, 0x7ffc6caac790, WNOHANG|WEXITED|WNOWAIT, NULL) = 
-1 ECHILD (Keine Kind-Prozesse)
[pid 20579] rt_sigreturn({mask=[]})     = 2

And here you see a proper SIGCHLD handler, that actually does some work before 
returning.

The fifth mention of SIGCHLD is the same as the first:

[pid 20636] clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|
CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fb8720ab910) = 20637
[pid 20636] rt_sigaction(SIGCHLD, {sa_handler=0x7fb894726350, sa_mask=[CHLD], 
sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7fb8cd33a5a0}, 
{sa_handler=0x7fb8cda513e0, sa_mask=[], sa_flags=SA_RESTORER|SA_SIGINFO|
SA_NOCLDSTOP, sa_restorer=0x7fb8cd33a5a0}, 8) = 0

This installs *again* the same SIGCHLD handler at pointer address 
0x7fb894726350 after fork(). This is the first certain bug in this code, third 
possible bug overall. Since it never uninstalled its handler, it shouldn't 
attempt to install it again. There's NO condition under doing that is a 
correct thing to do in a multithreaded application. Zero.

The sixth mention is the same as the second.

The seventh mention is forkfd again, but like a proper SIGCHLD citizen, it's 
different from the fourth mention by not installing the handler again. It just 
starts the child via fork():

[pid 20579] pipe2([24, 25], O_CLOEXEC)  = 0
[pid 20579] eventfd2(0, EFD_CLOEXEC)    = 26
[pid 20579] clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|
CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fb8ca53ba90) = 20641
[pid 20579] write(26, "*\0\0\0\0\0\0\0", 8) = 8

The eighth mention when child PID 20641 dies:

[pid 20579] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=20641, 
si_uid=1000, si_status=0, si_utime=23, si_stime=0} ---
[pid 20579] rt_sigreturn({mask=[]})     = 2

And this is the reason why you got the symptom of QProcess breaking. We'd have 
expected that the eight mention be the same as the fifth, to keep the pattern 
going, but it isn't. It's the same as the second and sixth mentions. The 
SIGCHLD handler that ran is that stupidly short one, which we now know to be 
definitely buggy.


Unix signals are a very fragile thing that must be used with extreme care, 
especially in multithreaded applications. That's true for the signals that can 
only have one purpose and one source application-wide, but it's even more so 
for the cooperative ones like SIGCHLD. That's the whole reason why pidfd now 
exists: to remove this wart.

This other SIGCHLD-using code isn't cooperating at all. We can easily tell in 
that last mention: it simply returned, without calling the handler that was 
there before it was installed. Any and all SIGCHLD handlers MUST, without 
exception, call the previous handler that was there, in a multithreaded 
application. It failed to call the one that forkfd installs, so forkfd & 
QProcess never found out its own children had died.

-- 
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
   Software Architect - Intel DPG Cloud Engineering





More information about the kde-devel mailing list