OS X poll breakage (Was: Please help test recent changes)

Damien Miller djm at mindrot.org
Tue Jan 11 18:30:29 AEDT 2022


On Tue, 11 Jan 2022, Damien Miller wrote:

> Here's the client side log of the failure. It comes from the
> "same with early close of stdout/err" section of the test, but I can't
> actually see anything get closed...
> 
> debug3: receive packet: type 91
> debug2: channel_input_open_confirmation: channel 0: callback start
> debug2: client_session2_setup: id 0
> debug1: Sending command: exec sh -c 'sleep 2; exec > /dev/null 2>&1; sleep 3; exit 0'
> debug2: channel 0: request exec confirm 1
> debug3: send packet: type 98
> debug2: channel_input_open_confirmation: channel 0: callback done
> debug2: channel 0: open confirm rwindow 0 rmax 32768
> debug2: channel 0: rcvd adjust 2097152
> debug3: receive packet: type 99
> debug2: channel_input_status_confirm: type 99 id 0
> debug2: exec request accepted on channel 0
> channel 0: invalid rfd pollfd[2].fd 4 r4 w7 e8 s-1
> 
> and (separate run) with DEBUG_CHANNEL_POLL set:
> 
> debug2: channel_input_status_confirm: type 99 id 0
> debug2: exec request accepted on channel 0
> debug3: dump_channel_poll: channel 0: rfd r4 w7 e8 s-1 pfd[2].fd=4 want 0x01 ev 0x01 ready 0x00 rev 0x00
> debug3: dump_channel_poll: channel 0: rfd r4 w7 e8 s-1 pfd[3].fd=7 want 0x01 ev 0x00 ready 0x00 rev 0x00
> debug3: dump_channel_poll: channel 0: rfd r4 w7 e8 s-1 pfd[4].fd=8 want 0x01 ev 0x00 ready 0x00 rev 0x00
> debug1: channel_after_poll: pfd[2].fd 4 rev 0x0020
> debug3: dump_channel_poll: channel 0: rfd r4 w7 e8 s-1 pfd[2].fd=4 want 0x01 ev 0x01 ready 0x00 rev 0x20
> channel 0: invalid rfd pollfd[2].fd 4 r4 w7 e8 s-1
> FAIL: exit code (with sleep) mismatch for: 255 != 0
> 
> It looks like it fails with HAVE_POLL set to 0 too.
> 
> Still looking...

Wow, it looks like Darwin's poll(2) is completely broken for
character-special devices (at least). E.g. the attached program shows
similar behaviour when run on /dev/null: it spins, with poll(2)
returning revents=POLLNVAL.
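
For anyone reading without the attachment, a minimal sketch of that kind
of test might look like the following. This is not the attached program
itself; the helper name poll_path() and the 1 second timeout are made up
for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper: open path read-only, poll it once for POLLIN
 * with a 1 second timeout and store the resulting revents in *revents.
 * Returns poll(2)'s return value, or -1 if open(2) failed.
 */
int
poll_path(const char *path, short *revents)
{
	struct pollfd pfd;
	int fd, r;

	assert(path != NULL && revents != NULL);
	if ((fd = open(path, O_RDONLY)) == -1)
		return -1;
	pfd.fd = fd;
	pfd.events = POLLIN;
	pfd.revents = 0;
	r = poll(&pfd, 1, 1000);
	*revents = pfd.revents;	/* caller checks for POLLNVAL here */
	close(fd);
	return r;
}
```

Calling poll_path("/dev/null", &rev) from a small main() and printing rev
should be enough to compare platforms: a working poll(2) would be
expected to report POLLIN for /dev/null, whereas the failure described
above is POLLNVAL (the 0x20 in the log) coming back on a valid fd.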

It looks like I'm not the first to see this either; some people noticed
it 17 years ago!

https://lists.apple.com/archives/darwin-dev/2005/May/msg00220.html

There's apparently a bug open (Apple bug 3710161), but I can't see it
and if they haven't fixed it by now then they're presumably not in any
great hurry.

Unsetting HAVE_POLL lets the test pass. It seems like some other
programs use a similar approach, e.g.

https://www.mail-archive.com/bug-gnulib@gnu.org/msg00296.html

So I think we need a HAVE_BROKEN_POLL :(

-d

