[Bug 3304] New: SSH client MUX to multiple hosts causes select: Bad file descriptor

bugzilla-daemon at mindrot.org bugzilla-daemon at mindrot.org
Sat Apr 24 11:20:27 AEST 2021


https://bugzilla.mindrot.org/show_bug.cgi?id=3304

            Bug ID: 3304
           Summary: SSH client MUX to multiple hosts causes select: Bad
                    file descriptor
           Product: Portable OpenSSH
           Version: 8.5p1
          Hardware: amd64
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P5
         Component: ssh
          Assignee: unassigned-bugs at mindrot.org
          Reporter: openssh-bugzilla at erik.ca

Created attachment 3499
  --> https://bugzilla.mindrot.org/attachment.cgi?id=3499&action=edit
OpenSSH client strace output

Hello,

We encountered an issue with the ssh client (including version 8.5p1)
where it calls select() on a closed file descriptor. The call fails and
the control master socket is closed. The issue occurs when we connect
to multiple target hosts (~100 hosts) through an SSH bastion server
(using ProxyJump) and issue a command to each target host (e.g. 'id').
We consistently encounter the following error with one of the *read*
file descriptors on a MUX channel:

select: Bad file descriptor
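
For readers unfamiliar with this error: the message is the EBADF
result that select() returns when one of the descriptors in its sets
is no longer open. The sketch below is not OpenSSH code; it is a
minimal, POSIX-only illustration in Python (the function name
select_on_closed_fd is made up for this example) of how a stale
descriptor left in the read set produces the same errno.

```python
import errno
import os
import select


def select_on_closed_fd() -> int:
    """Return the errno raised when select() is given a closed descriptor."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # simulate a channel's read fd being closed too early
    try:
        # The stale descriptor number is still passed in the read set.
        select.select([read_fd], [], [], 0)
    except OSError as exc:
        return exc.errno
    finally:
        os.close(write_fd)
    return 0


if __name__ == "__main__":
    code = select_on_closed_fd()
    # Print the symbolic errno name and the system's message for it.
    print(errno.errorcode.get(code), os.strerror(code))
```

Whether the mux master hits this through the same sequence (close
first, select afterwards on the same fd number) is what the attached
strace is meant to show.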

Tested the following versions on Debian 10 (identical results):

OpenSSH 7.9p1 (latest Debian 10 package)
OpenSSH 8.5p1 (github manual build)

Client configuration:

# Bastion: Persistent Socket and SOCKS Proxy
Host my-bastion
    User myuser
    ProxyJump none
    ControlMaster auto
    ControlPersist 28800s
    ControlPath ~/.ssh/my-bastion.sock
    DynamicForward 127.0.0.1:1080
    ExitOnForwardFailure yes
    HostName my-bastion1.mydomain.com

# Jump via Bastion for those hosts
Host *.mydomain.com
    ProxyJump my-bastion

# Catch all
Host *
    User root
    SendEnv LANG LC_*
    AddKeysToAgent yes
    ForwardAgent yes
    TCPKeepAlive yes
    ServerAliveCountMax 3
    ServerAliveInterval 20
    AddressFamily inet


Build:

(See openssh-build.txt attachment)

Steps to reproduce:

# Create a connection to the bastion (debug level 3 logging), exit
# (socket is still present on client), strace the ssh pid attached to
# the bastion socket on client host:
ssh -vvv -E ssh.log my-bastion
exit
# myuser 14510  0.5  0.0  16256  2660 ?        Ss   00:25   0:00 ssh: /home/myuser/.ssh/my-bastion.sock [mux]
strace -f -s 2048 -o strace.txt -p 14510

# separate terminal
ANSIBLE_SSH_ARGS= ansible -i my_target_hosts all -a id


When the client attempts to select() the closed file descriptor for a
MUX channel, the end result is that the control master socket is closed
and unlinked. I will attach files for:

* source locations of both the close() and select()
* ssh logs
* strace output
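
To correlate the close() and select() calls in a trace like the one
produced by the strace command above, a small script can help. The
following is a rough sketch, not part of the attachments: the function
name and regular expressions are assumptions based on strace's usual
one-line-per-syscall output, it ignores "<unfinished ...>" / "resumed"
line pairs, and it does not separate descriptors by pid. It reports,
for each select() that failed with EBADF, which descriptors in its
sets had been closed and not handed out again.

```python
import re
import sys
from typing import Iterable, List, Tuple

# Assumed strace line shapes, e.g. "14510 close(7)    = 0".
CLOSE_OK = re.compile(r"\bclose\((\d+)\)\s*=\s*0\b")
FD_RETURN = re.compile(
    r"\b(?:open|openat|socket|accept|accept4|dup|dup2|dup3)\(.*\)\s*=\s*(\d+)\b"
)
SELECT_EBADF = re.compile(r"\bselect\((.*)\)\s*=\s*-1 EBADF")
FD_SET = re.compile(r"\[([\d ]*)\]")


def stale_fds_in_failed_selects(lines: Iterable[str]) -> List[Tuple[int, List[int]]]:
    """Return (line number, closed fds listed) for each select() that hit EBADF."""
    closed = set()
    findings = []
    for lineno, line in enumerate(lines, start=1):
        match = CLOSE_OK.search(line)
        if match:
            closed.add(int(match.group(1)))
            continue
        match = FD_RETURN.search(line)
        if match:
            # The fd number was handed out again, so it is open once more.
            closed.discard(int(match.group(1)))
            continue
        match = SELECT_EBADF.search(line)
        if match:
            listed = {
                int(fd)
                for group in FD_SET.findall(match.group(1))
                for fd in group.split()
            }
            findings.append((lineno, sorted(listed & closed)))
    return findings


if __name__ == "__main__":
    # Usage: python3 this_script.py strace.txt
    with open(sys.argv[1], encoding="utf-8", errors="replace") as trace:
        for lineno, fds in stale_fds_in_failed_selects(trace):
            print(f"line {lineno}: select() EBADF, closed fds in sets: {fds}")
```

An empty list for a failing select() would suggest the heuristic
missed the close (for example, a line split by strace -f), so the
result should be checked against the raw trace.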

Let me know if you need any additional info.
Much appreciated,

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

