[Bug 2167] Connection remains when fork() fails.
bugzilla-daemon at natsu.mindrot.org
Fri Nov 1 13:39:34 EST 2013
https://bugzilla.mindrot.org/show_bug.cgi?id=2167
--- Comment #2 from Tetsuo Handa <penguin-kernel at I-love.SAKURA.ne.jp> ---
This is not a kernel bug. fatal() is designed to eventually call exit(), but
this bug prevents the sshd process from ever reaching exit().
The child process calls fatal() when fork() fails in privsep_postauth().
(Note that the fork() call is replaced with -1 below for the purpose of
explanation.)
---------- privsep_postauth() in sshd.c ----------
static void
privsep_postauth(Authctxt *authctxt)
{
    u_int32_t rnd[256];

#ifdef DISABLE_FD_PASSING
    if (1) {
#else
    if (authctxt->pw->pw_uid == 0 || options.use_login) {
#endif
        /* File descriptor passing is broken or root login */
        use_privsep = 0;
        goto skip;
    }

    /* New socket pair */
    monitor_reinit(pmonitor);

    pmonitor->m_pid = -1; // Emulate the fork() failure.
    if (pmonitor->m_pid == -1)
        fatal("fork of unprivileged child failed");
    (...snipped...)
---------- privsep_postauth() in sshd.c ----------
fatal() calls cleanup_exit(255). (Note that dummy write(-1, "", $step) lines
are inserted so that progress can be matched against the strace output.)
---------- fatal() in fatal.c ----------
void
fatal(const char *fmt,...)
{
    va_list args;

    va_start(args, fmt);
    do_log(SYSLOG_LEVEL_FATAL, fmt, args);
    va_end(args);
    write(-1, "", 0);
    cleanup_exit(255);
}
---------- fatal() in fatal.c ----------
cleanup_exit(255) will eventually call _exit(255).
---------- cleanup_exit() in sshd.c ----------
/* server specific fatal cleanup */
void
cleanup_exit(int i)
{
    static int in_cleanup;
    int is_privsep_child;

    write(-1, "", 1);
    /* cleanup_exit can be called at the very least from the privsep
       wrappers used for auditing. Make sure we don't recurse
       indefinitely. */
    if (in_cleanup)
        _exit(i);
    write(-1, "", 2);
    in_cleanup = 1;
    if (the_authctxt)
        do_cleanup(the_authctxt);
    write(-1, "", 3);
    is_privsep_child = use_privsep && pmonitor != NULL &&
        !mm_is_monitor();
    write(-1, "", 4);
    if (sensitive_data.host_keys != NULL)
        destroy_sensitive_data(is_privsep_child);
    write(-1, "", 5);
    packet_destroy_all(1, is_privsep_child);
#ifdef SSH_AUDIT_EVENTS
    /* done after do_cleanup so it can cancel the PAM auth 'thread' */
    write(-1, "", 6);
    if ((the_authctxt == NULL || !the_authctxt->authenticated) &&
        (!use_privsep || mm_is_monitor()))
        audit_event(SSH_CONNECTION_ABANDON);
#endif
    write(-1, "", 7);
    _exit(i);
}
---------- cleanup_exit() in sshd.c ----------
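The in_cleanup check at the top of cleanup_exit() is a re-entrancy guard. A
minimal sketch of that pattern in isolation follows. The names guarded_cleanup()
and cleanup_runs are hypothetical and not part of OpenSSH, and the sketch
returns where the real function calls _exit(i), so that the effect of the guard
can be observed.

```c
/* Hypothetical counter: how many times the cleanup body actually ran. */
static int cleanup_runs;

/* Hypothetical stand-in for cleanup_exit(). `depth` simulates cleanup
 * steps that fail and call back into the cleanup routine. */
static void guarded_cleanup(int depth)
{
    static int in_cleanup;

    if (in_cleanup)
        return;             /* cleanup_exit() calls _exit(i) here */
    in_cleanup = 1;         /* never reset: the real process is about to exit */

    cleanup_runs++;
    if (depth > 0)
        guarded_cleanup(depth - 1); /* re-entry is cut off by the guard */
}
```

Calling guarded_cleanup(3) runs the cleanup body only once, because every
nested call hits the guard. Note that the guard protects against recursion
only. It does not help when a cleanup step blocks, which is the case examined
in this comment.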
Did we reach the _exit(i) line? Let's check the strace output.
---------- strace log start ----------
[pid 17153] socketpair(PF_FILE, SOCK_STREAM, 0, [5, 6]) = 0
[pid 17153] fcntl(5, F_SETFD, FD_CLOEXEC) = 0
[pid 17153] fcntl(6, F_SETFD, FD_CLOEXEC) = 0
[pid 17153] sendto(4, "<82>Nov 1 11:11:08 sshd[17153]:"..., 73, MSG_NOSIGNAL, NULL, 0) = 73
[pid 17153] close(4) = 0
[pid 17153] write(4294967295, "", 0) = -1 EBADF (Bad file descriptor)
[pid 17153] write(4294967295, "\0", 1) = -1 EBADF (Bad file descriptor)
[pid 17153] write(4294967295, "\0G", 2) = -1 EBADF (Bad file descriptor)
[pid 17153] write(4294967295, "\0Go", 3) = -1 EBADF (Bad file descriptor)
[pid 17153] write(4294967295, "\0Got", 4) = -1 EBADF (Bad file descriptor)
[pid 17153] getuid() = 0
[pid 17153] write(5, "\0\0\0DR", 5) = 5
[pid 17153] write(5, "\0\0\0/c1:bd:33:a5:66:d5:83:2d:0c:9"..., 67) = 67
[pid 17153] read(5, ^C <unfinished ...>
Process 17147 detached
Process 17153 detached
---------- strace log end ----------
We can see that the child process reached the write(-1, "", 4) line but
did not reach the write(-1, "", 5) line. This means that the child process
is blocked inside
    if (sensitive_data.host_keys != NULL)
        destroy_sensitive_data(is_privsep_child);
waiting to read data from fd == 5.
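The marker technique used to locate the blocking call can be sketched in
isolation. A write() to fd -1 fails with EBADF and has no other effect, but it
still appears in the strace output, and its length argument identifies the
step. The helper name trace_marker() is hypothetical. Unlike the write(-1, "", N)
lines in the listings, this sketch passes a buffer that is at least N bytes
long; the "\0G", "\0Go", "\0Got" bytes in the log are presumably just whatever
happened to follow the empty string literal in memory.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: emit a marker that is visible in strace output.
 * Returns the errno produced by the failed write(), or 0 if the write
 * unexpectedly succeeded. */
static int trace_marker(size_t step)
{
    static const char pad[16];      /* zero-filled, large enough for any step used here */
    ssize_t rc;

    if (step > sizeof(pad))
        step = sizeof(pad);
    errno = 0;
    rc = write(-1, pad, step);      /* fd -1 is never valid, so this fails */
    return rc == -1 ? errno : 0;
}
```

Placing trace_marker(1), trace_marker(2), and so on between statements and then
looking for the last marker in the trace narrows a hang down to a single call,
as was done for destroy_sensitive_data().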
What is fd == 5 connected to?
According to the strace output, fd == 5 and fd == 6 are the socket pair
created by the monitor_reinit() call in privsep_postauth().
---------- monitor_reinit() in monitor.c ----------
void
monitor_reinit(struct monitor *mon)
{
    int pair[2];

    monitor_socketpair(pair);
    mon->m_recvfd = pair[0];
    mon->m_sendfd = pair[1];
}
---------- monitor_reinit() in monitor.c ----------
destroy_sensitive_data() wrote to fd == 5 and is now trying to read the reply
from fd == 5, but nothing is writing to fd == 6, so the reply never arrives.
This is a deadlock caused by doing I/O against the wrong file descriptor.
This is why calling shutdown() seems to solve the problem.
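The behaviour described here can be reproduced outside sshd. The following is a
minimal sketch, assuming a plain AF_UNIX socket pair behaves like the one
created by monitor_socketpair(). The function names readable_within() and
same_end_demo() are hypothetical. To avoid hanging, it uses poll() with a
timeout where sshd uses a blocking read().

```c
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if fd has data to read within timeout_ms, 0 if not, -1 on error. */
static int readable_within(int fd, int timeout_ms)
{
    struct pollfd p = { .fd = fd, .events = POLLIN };
    int n = poll(&p, 1, timeout_ms);

    if (n < 0)
        return -1;
    return (n > 0 && (p.revents & POLLIN)) ? 1 : 0;
}

/* Write a request on pair[0], then check which end has data to read.
 * out[0] reports pair[0], out[1] reports pair[1].
 * Returns 0 on success, -1 if the setup failed. */
static int same_end_demo(int out[2])
{
    int pair[2];
    int rc = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, pair) == -1)
        return -1;
    if (write(pair[0], "ping", 4) == 4) {
        out[0] = readable_within(pair[0], 100); /* same end: nothing arrives */
        out[1] = readable_within(pair[1], 100); /* other end: request is queued */
        rc = 0;
    }
    close(pair[0]);
    close(pair[1]);
    return rc;
}
```

With both ends held by the same process and nobody servicing the other end,
the data written to one descriptor is only ever readable from the other one, so
a blocking read() on the descriptor that was written to would wait forever.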