Call for testing: OpenSSH 8.9
Darren Tucker
dtucker at dtucker.net
Tue Feb 22 15:11:41 AEDT 2022
On Fri, 18 Feb 2022 at 11:45, Darren Tucker <dtucker at gate.dtucker.net> wrote:
> Looks like it's actually poll vs select.
>
> $ autoreconf
> $ CC=musl-gcc ./configure --without-openssl --without-zlib --with-cflags=-DBROKEN_POLL
TL;DR: it's the combination of the rlimit sandbox and some poll
implementations which fail with EINVAL if "nfds was greater than the
number of available file descriptors".
Additional data point: it seems to be an interaction with the rlimit
sandbox since:
$ ./configure --without-sandbox && make t-exec
passes. By default, configure picks the rlimit sandbox:
$ ./configure && egrep '#define.*SANDBOX' config.h
#define SANDBOX_RLIMIT 1
On some platforms select(2) fails if it can't open a new FD (these
seem to be ones where select is implemented in userspace on top of
poll).
Here's an strace of where it fails:
30131 write(4, "\0\0\0044\7\24\357\342@\2060\350\0073hV\3\225d\202PH\0\0\1\tcurve2"..., 1080) = 1080
30131 ppoll([{fd=4, events=POLLIN}], 1, NULL, NULL, 8) = -1 EINVAL (Invalid argument)
30131 munmap(0x7f784ac57000, 4096) = 0
and the call stack where it fails (frame 0 elided since it was my
debugging hack):
#1 0x00005555555b926e in ssh_packet_read_seqnr
    (ssh=ssh@entry=0x7ffff7ca7070, typep=typep@entry=0x7fffffffe5c3 "",
    seqnr_p=seqnr_p@entry=0x7fffffffe5c4) at packet.c:1368
#2 0x00005555555be322 in ssh_dispatch_run
    (ssh=ssh@entry=0x7ffff7ca7070, mode=mode@entry=0, done=0x7ffff7ca7b98)
    at dispatch.c:96
#3 0x00005555555be429 in ssh_dispatch_run_fatal
    (ssh=ssh@entry=0x7ffff7ca7070, mode=mode@entry=0, done=<optimized out>)
    at dispatch.c:133
#4 0x000055555556106f in do_ssh2_kex (ssh=0x7ffff7ca7070) at sshd.c:2404
#5 main (ac=<optimized out>, av=<optimized out>) at sshd.c:2231
packet.c line 1368 is:
	if ((r = ppoll(&pfd, 1, timespecp, NULL)) >= 0)
		break;
If we stick this in a test program with the rlimits:
$ cat test.c
#define _GNU_SOURCE
#include <sys/resource.h>
#include <errno.h>
#include <string.h>
#include <fcntl.h>
#include <stdlib.h>
#include <stdio.h>
#include <signal.h>
#include <poll.h>

int main(void)
{
	struct rlimit rl_zero;
	int r;
	struct pollfd pfd;

	pfd.fd = open("/dev/null", O_RDWR);
	pfd.events = POLLIN|POLLOUT;
	pfd.revents = 0;

	/* baseline: poll the open descriptor before any limit applies */
	errno = 0;
	r = ppoll(&pfd, 1, NULL, NULL);
	printf("before rlimit, poll returned %d (%s)\n", r, strerror(errno));

	/* drop the file size and descriptor limits to zero */
	rl_zero.rlim_cur = rl_zero.rlim_max = 0;
	setrlimit(RLIMIT_FSIZE, &rl_zero);
	setrlimit(RLIMIT_NOFILE, &rl_zero);

	/* same descriptor, same call, now with RLIMIT_NOFILE at 0 */
	errno = 0;
	r = ppoll(&pfd, 1, NULL, NULL);
	printf("after rlimit poll returned %d (%s)\n", r, strerror(errno));
	return 0;
}
$ gcc test.c && ./a.out
before rlimit, poll returned 1 (Success)
after rlimit poll returned -1 (Invalid argument)
This happens on at least Linux+glibc and OpenBSD too. Why? It's
documented! Both Linux and OpenBSD have something like:
ERRORS
poll() and ppoll() will fail if:
[...]
[EINVAL] nfds was greater than the number of available file
descriptors.
and is in fact specified by POSIX[1]:
ERRORS
The poll() function shall fail if:
[EINVAL] The nfds argument is greater than {OPEN_MAX}
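To see why nfds=1 can count as "greater than {OPEN_MAX}", it helps to
look at what OPEN_MAX is once the limit is lowered. The sketch below
assumes sysconf(_SC_OPEN_MAX) tracks the RLIMIT_NOFILE soft limit
(this is my understanding of glibc and musl, not something checked on
every platform). The function name open_max_under_soft_limit is made
up for illustration.

```c
#include <sys/resource.h>
#include <unistd.h>

/*
 * Temporarily set the RLIMIT_NOFILE soft limit to n and return what
 * sysconf(_SC_OPEN_MAX) reports under it, or -1 if a call failed.
 * Only the soft limit is touched, so the original value can be
 * restored before returning.
 */
long
open_max_under_soft_limit(rlim_t n)
{
	struct rlimit saved, lowered;
	long open_max;

	if (getrlimit(RLIMIT_NOFILE, &saved) == -1)
		return -1;
	lowered = saved;
	lowered.rlim_cur = n;
	if (setrlimit(RLIMIT_NOFILE, &lowered) == -1)
		return -1;
	open_max = sysconf(_SC_OPEN_MAX);
	if (setrlimit(RLIMIT_NOFILE, &saved) == -1)
		return -1;
	return open_max;
}
```

If the assumption holds, open_max_under_soft_limit(0) reports 0, and
any poll() call with nfds >= 1 then exceeds it.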
This is arguably not useful behaviour (it's not creating a new
descriptor, and in this case we know the FD is perfectly valid since
we successfully wrote to it immediately before the ppoll).
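The point about the descriptor still being valid can be checked side
by side. This is a rough sketch, not code from OpenSSH: it lowers only
the soft limit (assuming that is what poll consults, which I believe
is the case on Linux) so it can be restored afterwards, and the names
fd_probe and probe_open_fd are invented for the example.

```c
#include <sys/resource.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

struct fd_probe {
	ssize_t write_ret;	/* result of write() under the limit */
	int poll_ret;		/* result of poll() under the limit */
	int poll_errno;		/* errno from poll(), 0 if it succeeded */
};

/*
 * Open /dev/null, drop the RLIMIT_NOFILE soft limit to 0, then try
 * write() and poll() on the already-open descriptor.  Returns 0 when
 * the probe ran, -1 if setup failed.
 */
int
probe_open_fd(struct fd_probe *out)
{
	struct rlimit saved, zero;
	struct pollfd pfd;
	int fd;

	if ((fd = open("/dev/null", O_RDWR)) == -1)
		return -1;
	if (getrlimit(RLIMIT_NOFILE, &saved) == -1) {
		close(fd);
		return -1;
	}
	zero = saved;
	zero.rlim_cur = 0;
	if (setrlimit(RLIMIT_NOFILE, &zero) == -1) {
		close(fd);
		return -1;
	}

	/* the descriptor predates the limit, so this should still work */
	out->write_ret = write(fd, "x", 1);

	pfd.fd = fd;
	pfd.events = POLLOUT;
	pfd.revents = 0;
	errno = 0;
	out->poll_ret = poll(&pfd, 1, 0);
	out->poll_errno = out->poll_ret == -1 ? errno : 0;

	/* put the soft limit back and clean up */
	setrlimit(RLIMIT_NOFILE, &saved);
	close(fd);
	return 0;
}
```

On a platform with the behaviour described here, write_ret comes back
as 1 while poll_ret is -1 with poll_errno set to EINVAL.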
Why does it not happen on other Linux configurations? Those have
different sandbox implementations.
We have a check for similar behaviour in select(); we probably need to
add an equivalent one for poll().
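One possible shape for such a poll() check is sketched below. It is
only an illustration and not the existing select() test: the function
name poll_works_with_nofile_zero is hypothetical, and it runs the
probe in a child process because an unprivileged process cannot raise
the hard limit again once it has been set to 0.

```c
#include <sys/resource.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/*
 * Returns 1 if poll() still works on an open descriptor after
 * RLIMIT_NOFILE has been set to 0, 0 if it fails, and -1 if the
 * probe itself could not be run.
 */
int
poll_works_with_nofile_zero(void)
{
	struct rlimit rl_zero;
	struct pollfd pfd;
	pid_t pid;
	int status;

	if ((pid = fork()) == -1)
		return -1;
	if (pid == 0) {
		/* child: open first, then drop both soft and hard limits */
		if ((pfd.fd = open("/dev/null", O_RDWR)) == -1)
			_exit(2);
		pfd.events = POLLOUT;
		pfd.revents = 0;
		rl_zero.rlim_cur = rl_zero.rlim_max = 0;
		if (setrlimit(RLIMIT_NOFILE, &rl_zero) == -1)
			_exit(2);
		/* exit 0 if poll worked, 1 if it failed */
		_exit(poll(&pfd, 1, 0) == -1 ? 1 : 0);
	}
	if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
		return -1;
	switch (WEXITSTATUS(status)) {
	case 0:
		return 1;
	case 1:
		return 0;
	default:
		return -1;
	}
}
```

A configure-time test could run something like this and, when it
returns 0, have the rlimit sandbox leave RLIMIT_NOFILE alone. How that
would be wired into the build is left open here.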
[1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/poll.html
--
Darren Tucker (dtucker at dtucker.net)
GPG key 11EAA6FA / A86E 3E07 5B19 5880 E860 37F4 9357 ECEF 11EA A6FA (new)
Good judgement comes with experience. Unfortunately, the experience
usually comes from bad judgement.