[Bug 3971] New: Race condition when using ControlMaster=auto with simultaneous connections

bugzilla-daemon at mindrot.org
Mon Jun 22 02:40:55 AEST 2026


https://bugzilla.mindrot.org/show_bug.cgi?id=3971

            Bug ID: 3971
           Summary: Race condition when using ControlMaster=auto with
                    simultaneous connections
           Product: Portable OpenSSH
           Version: 10.3p1
          Hardware: Other
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P5
         Component: ssh
          Assignee: unassigned-bugs at mindrot.org
          Reporter: jens.rosenboom at web.de

Created attachment 3967
  --> https://bugzilla.mindrot.org/attachment.cgi?id=3967&action=edit
fix race condition when using ControlMaster=auto with simultaneous
connections

Hello everyone,

Back in 2022 (on Wed Aug 31 23:24:12 AEST), Baptiste Jonglez
(baptiste.jonglez at inria.fr) reported an issue that still persists:
---
Hello,

I'm trying to multiplex many simultaneous SSH connections through a single
master connection, and I'm hitting a race condition while doing this.
This is not a bug; I'm either hitting a limit in the design of OpenSSH or
misusing it.

The use-case is to use Ansible to configure many hosts simultaneously,
while all connections need to go through a single "SSH bastion" via
ProxyJump.
For efficiency and to avoid hitting MaxStartups limits, I would like to
use a control master for the connection to the bastion, via the following
client configuration:

    Host bastion.example.com
      ControlMaster auto
      ControlPath /dev/shm/ssh-%h
      ControlPersist 30

    Host !bastion.example.com *.example.com
      ProxyJump bastion.example.com

However, this does not work when making simultaneous connections: all SSH
connections create a new, separate connection to the bastion.  Here is a
simple way to reproduce:

    $ for i in {1..3}; do ssh myhost.example.com "sleep 1" & done
    ControlSocket /dev/shm/ssh-bastion.example.com already exists, disabling multiplexing
    ControlSocket /dev/shm/ssh-bastion.example.com already exists, disabling multiplexing

What happens is the following:

1) each SSH process tries to connect to the control socket and fails
   (this is expected, the control socket is not yet bound)

2) each SSH process then creates a new SSH connection

3) once connected, each process tries to bind to the control socket

4a) one process successfully binds the control socket
4b) all other processes fail to bind the control socket (error message
    above)

5) in both cases, each process is now using its own separate SSH
   connection to the bastion

The window for the race condition is between 1) and 4), so it's rather
large: it includes the time to establish a new SSH connection.
[...]
---
[See
https://lists.mindrot.org/pipermail/openssh-unix-dev/2022-August/040388.html
for his full report.]
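
To make the ordering in the quoted report concrete, here is a minimal
Python sketch of it. This is not OpenSSH code: threads stand in for ssh
processes, a barrier stands in for the slow SSH handshake, and the names
connect_then_bind() and simulate_old_order() are made up for illustration.

```python
import os
import socket
import tempfile
import threading


def connect_then_bind(path, handshake_done, roles, held, idx):
    """Old ordering: probe the control socket as a client, bind only later."""
    probe = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        probe.connect(path)        # step 1: fails, nothing is bound yet
        roles[idx] = "client"
        held[idx] = probe
        return
    except OSError:
        probe.close()
    handshake_done.wait()          # step 2: stand-in for the SSH handshake
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        server.bind(path)          # step 3: every process tries to bind
        server.listen(8)
        roles[idx] = "master"      # step 4a: exactly one bind succeeds
    except OSError:
        roles[idx] = "standalone"  # step 4b: "already exists, disabling multiplexing"
    held[idx] = server


def simulate_old_order(n=3):
    """Run n simulated ssh processes; return the sorted role of each."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "ctl")
        handshake_done = threading.Barrier(n)
        roles, held = [None] * n, [None] * n
        threads = [
            threading.Thread(target=connect_then_bind,
                             args=(path, handshake_done, roles, held, i))
            for i in range(n)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        for sock in held:
            sock.close()
        return sorted(roles)
```

Because the barrier forces every thread to finish its failed connect
attempt before any of them binds, simulate_old_order(3) ends with one
"master" and two "standalone" entries, mirroring the two "disabling
multiplexing" messages in the reproduction above.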

In my case, I hit the same issue when starting multiple connections with
"konsole --tabs-from-file tabs.txt" to a target system behind a VPN
and an intrusion detection system (which seems to have delayed all
incoming connections significantly, so the issue happened every time).

Baptiste suggested using a lock file, but IMHO it is better and
simpler to swap the order of the two steps: first try to create the mux
control socket, and only then try to open it as a mux client.

With the attached patch, the steps (similar to how Baptiste described
them; with multiplexing enabled) would be:

1) each SSH process tries to bind the control socket. For the first one
   this will succeed and this process becomes the mux master. The others
   become mux clients.

2a) the mux master creates a new SSH connection and serves the control
    socket
2b) each mux client connects to the control socket (this should
    succeed, as the control socket is bound and listening already)

To this end, the patch moves the actual socket creation and binding
out of muxserver_listen() into a separate function, muxserver().
This new function is called before muxclient(); when muxserver()
succeeds, muxclient() is not called at all.
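
As a rough illustration of the bind-first ordering (again a Python
sketch, not the patch itself), the code below lets every simulated
process try to bind first and fall back to connecting as a client when
the path is already taken. The names bind_or_connect(),
connect_with_retry() and simulate_new_order() are hypothetical. The
short retry loop is an assumption of this sketch, covering the gap
between bind() and listen(); it is not a statement about how the patch
handles that gap.

```python
import errno
import os
import socket
import tempfile
import threading
import time


def connect_with_retry(path, attempts=100, delay=0.01):
    """Connect as a mux client, riding out the gap between bind() and listen()."""
    for _ in range(attempts):
        client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            client.connect(path)
            return client
        except ConnectionRefusedError:
            client.close()         # bound but not listening yet: try again
            time.sleep(delay)
    raise TimeoutError("control socket never started listening")


def bind_or_connect(path):
    """New ordering: try to become the master first, else become a client."""
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        server.bind(path)          # step 1: only the first caller succeeds
    except OSError as exc:
        server.close()
        if exc.errno != errno.EADDRINUSE:
            raise
        return "client", connect_with_retry(path)   # step 2b
    server.listen(8)               # step 2a: serve the control socket
    return "master", server


def simulate_new_order(n=3):
    """Run n simulated ssh processes; return the sorted role of each."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "ctl")
        start = threading.Barrier(n)
        results = [None] * n

        def worker(idx):
            start.wait()           # release all processes at the same moment
            results[idx] = bind_or_connect(path)

        threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        roles = sorted(role for role, _ in results)
        for _, sock in results:
            sock.close()
        return roles
```

However the threads are scheduled, simulate_new_order(3) yields one
"master" and two "client" entries, because the bind itself decides who
becomes the master.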

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

