OS X poll breakage (Was: Please help test recent changes)

Damien Miller djm at mindrot.org
Tue Jan 11 15:00:36 AEDT 2022


On Mon, 10 Jan 2022, Darren Tucker wrote:

> On Fri, Jan 07, 2022 at 09:52:09AM +1100, Damien Miller wrote:
> > 1. Conversion of the ssh and sshd mainloop from select() to poll()
> > 
> > This should be entirely invisible to users, so any behaviour change
> > is a bug. If you see something and want to help debug it further,
> > uncomment the DEBUG_CHANNEL_POLL #define in channels.c for helps of
> > extra debug logging.
> 
> This change breaks (at least) the exit-status test on Mac OS X.  I
> haven't had a chance to investigate yet,
> 
> $ git bisect run sh -c "autoreconf && ./configure --without-openssl &&
> make t-exec LTESTS=exit-status"
> [... lots ...]
> 17877bc81db3846e6e7d4cfb124d966bb9c9296b is the first bad commit
> commit 17877bc81db3846e6e7d4cfb124d966bb9c9296b
> Author: djm at openbsd.org <djm at openbsd.org>
> Date:   Thu Jan 6 21:48:38 2022 +0000
> 
>     upstream: convert ssh, sshd mainloops from select() to poll();
> 
>     feedback & ok deraadt@ and markus@ has been in snaps for a few months
> 
>     OpenBSD-Commit-ID: a77e16a667d5b194dcdb3b76308b8bba7fa7239c
> 
> $ uname -a
> Darwin osx-highsierra 17.7.0 Darwin Kernel Version 17.7.0: Fri Oct 30
> 13:34:27 PDT 2020; root:xnu-4570.71.82.8~1/RELEASE_X86_64 x86_64

Here's the client side log of the failure. It comes from the
"same with early close of stdout/err" section of the test, but I can't
actually see anything get closed...

debug3: receive packet: type 91
debug2: channel_input_open_confirmation: channel 0: callback start
debug2: client_session2_setup: id 0
debug1: Sending command: exec sh -c 'sleep 2; exec > /dev/null 2>&1; sleep 3; exit 0'
debug2: channel 0: request exec confirm 1
debug3: send packet: type 98
debug2: channel_input_open_confirmation: channel 0: callback done
debug2: channel 0: open confirm rwindow 0 rmax 32768
debug2: channel 0: rcvd adjust 2097152
debug3: receive packet: type 99
debug2: channel_input_status_confirm: type 99 id 0
debug2: exec request accepted on channel 0
channel 0: invalid rfd pollfd[2].fd 4 r4 w7 e8 s-1

and (separate run) with DEBUG_CHANNEL_POLL set:

debug2: channel_input_status_confirm: type 99 id 0
debug2: exec request accepted on channel 0
debug3: dump_channel_poll: channel 0: rfd r4 w7 e8 s-1 pfd[2].fd=4 want 0x01 ev 0x01 ready 0x00 rev 0x00
debug3: dump_channel_poll: channel 0: rfd r4 w7 e8 s-1 pfd[3].fd=7 want 0x01 ev 0x00 ready 0x00 rev 0x00
debug3: dump_channel_poll: channel 0: rfd r4 w7 e8 s-1 pfd[4].fd=8 want 0x01 ev 0x00 ready 0x00 rev 0x00
debug1: channel_after_poll: pfd[2].fd 4 rev 0x0020
debug3: dump_channel_poll: channel 0: rfd r4 w7 e8 s-1 pfd[2].fd=4 want 0x01 ev 0x01 ready 0x00 rev 0x20
channel 0: invalid rfd pollfd[2].fd 4 r4 w7 e8 s-1
FAIL: exit code (with sleep) mismatch for: 255 != 0

It looks like it fails with HAVE_POLL set to 0 too.

Still looking...

-d


More information about the openssh-unix-dev mailing list