[Bug 3531] Ssh will not exit when it receives SIGTERM before calling poll in client_wait_until_can_do_something until some events happen.
bugzilla-daemon at mindrot.org
bugzilla-daemon at mindrot.org
Tue Feb 14 13:58:18 AEDT 2023
https://bugzilla.mindrot.org/show_bug.cgi?id=3531
--- Comment #4 from renmingshuai <rmsh1216 at 163.com> ---
Maybe I didn't describe my problem clearly. Using ppoll instead doesn't
seem to solve my problem.
As described above, My expect script would send a SIGTERM signal to the
sftp process to kill it, when it exits because of some errors. The sftp
process would also send a SIGTERM to its child process, ssh, to kill it
after receiving the signal. However, if the ssh process, for example,
received the signal after set_control_persist_exit_time() and just
before call poll() in client_wait_until_can_do_something(), it would
call signal_handler() to handle the SIGTERM first and then sleep due to
call poll() or ppoll() instead of exitting due to receive SIGTERM. In
that case, the ssh process would never exit unless killing it by hand,
becase it's parent, my expect script, had exited and would never send
SIGTERM to it again.
client_loop
client_wait_until_can_do_something
set_control_persist_exit_time(ssh);
<- receive SIGTERM
-> call signal_handler()
poll() // sleep and never exit unless killing it by hand
In my opinion, when SIGTERM is received, ssh should not continue to
call the poll(), or call poll() with setting timeout to -1.
The following describes the reproduction.
1. To make it easier to reproduce this problem, I added sleep(5) before
calling poll().
```
# git diff
diff --git a/clientloop.c b/clientloop.c
@@ -560,6 +560,7 @@ client_wait_until_can_do_something(struct ssh *ssh,
struct pollfd **pfdp,
ssh_packet_get_rekey_timeout(ssh));
}
+ sleep(5);
ret = poll(*pfdp, *npfd_activep, ptimeout_get_ms(&timeout));
if (ret == -1) {
```
2. Run strace -Tttyvs 1024 ./myexpect.sh and myexpect.sh exited because
of some error.
```
15:16:16.575781 openat(AT_FDCWD, "/dev/null", O_RDONLY) = 0</dev/null>
<0.000023>
15:16:16.575862 fcntl(0</dev/null>, F_SETFD, FD_CLOEXEC) = 0 <0.000021>
15:16:16.575984 close(4</dev/null>) = 0 <0.000026>
15:16:16.576064 close(3</dev/null>) = 0 <0.000025>
15:16:16.576145 close(2</dev/null>) = 0 <0.000026>
15:16:16.576229 close(0</dev/null>) = 0 <0.000025>
15:16:16.576310 write(6<pipe:[427008753]>, "q", 1) = 1 <0.000380>
15:16:16.576776 close(-1) = -1 EBADF (错误的文件描述符)
<0.000026>
15:16:16.576927 exit_group(0) = ?
15:16:16.577487 +++ exited with 0 +++
```
3. While myexpect.sh was running, strace its grandchild process,ssh.
The ssh process still called poll() and then sleep even if the SIGTERM
signal is received.
```
15:16:16.573303 nanosleep({tv_sec=5, tv_nsec=0}, {tv_sec=4,
tv_nsec=998917430}) = ? ERESTART_RESTARTBLOCK (Interrupted by signal)
<0.001154>
15:16:16.574512 --- SIGTERM {si_signo=SIGTERM, si_code=SI_USER,
si_pid=1678659, si_uid=0} ---
15:16:16.574541 rt_sigreturn({mask=[]}) = -1 EINTR (Interrupted system
call) <0.000021>
15:16:16.574726 ppoll([{fd=3<socket:[427009847]>, events=POLLIN},
{fd=3<socket:[427009847]>, events=0}, {fd=4<socket:[427011086]>,
events=POLLIN}], 3, NULL, NULL, 0
```
4. When myexpect script exited, its child sftp and the child of sftp,
ssh, are still exists. They would never exit unless being killed by
hand because its parent has exited.
```
# ps -ef|grep myexpect
root 1680657 1462737 0 15:17 pts/3 00:00:00 grep --color=auto
myexpect
[root at localhost ~]# ps -ef|grep sftp
root 1678659 1 0 15:14 ? 00:00:00
/home/communities/openssh-portable/sftp rms at 10.137.16.214
root 1678663 1678659 0 15:14 ? 00:00:00 /usr/local/bin/ssh
-oForwardX11 no -oPermitLocalCommand no -oClearAllForwardings yes
-oForwardAgent no -l rms -s -- 10.137.16.214 sft p
root 1680721 1462737 0 15:17 pts/3 00:00:00 grep --color=auto
sftp
```
The attachment is myexpect.sh.
Sorry to interrupt again.
--
You are receiving this mail because:
You are watching someone on the CC list of the bug.
You are watching the assignee of the bug.
More information about the openssh-bugs
mailing list