Race condition when using ControlMaster=auto with simultaneous connections
Demi Marie Obenour
demiobenour at gmail.com
Thu Sep 1 13:37:12 AEST 2022
On 8/31/22 09:24, Baptiste Jonglez wrote:
> Hello,
>
> I'm trying to multiplex many simultaneous SSH connections through a single
> master connection, and I'm hitting a race condition while doing this.
> This is not a bug; I'm either hitting a limit in the design of OpenSSH or
> misusing it.
>
> The use-case is to use Ansible to configure many hosts simultaneously,
> while all connections need to go through a single "SSH bastion" via ProxyJump.
> For efficiency and to avoid hitting MaxStartups limits, I would like to
> use a control master for the connection to the bastion, via the following
> client configuration:
>
> Host bastion.example.com
>     ControlMaster auto
>     ControlPath /dev/shm/ssh-%h
>     ControlPersist 30
>
> Host !bastion.example.com *.example.com
>     ProxyJump bastion.example.com
>
> However, this does not work when making simultaneous connections: all SSH
> connections create a new, separate connection to the bastion. Here is a
> simple way to reproduce:
>
> $ for i in {1..3}; do ssh myhost.example.com "sleep 1" & done
> ControlSocket /dev/shm/ssh-bastion.example.com already exists, disabling multiplexing
> ControlSocket /dev/shm/ssh-bastion.example.com already exists, disabling multiplexing
>
> What happens is the following:
>
> 1) each SSH process tries to connect to the control socket and fails
> (this is expected, the control socket is not yet bound)
>
> 2) each SSH process then creates a new SSH connection
>
> 3) once connected, each process tries to bind to the control socket
>
> 4a) one process successfully binds the control socket
> 4b) all other processes fail to bind the control socket (error message above)
>
> 5) in both cases, each process is now using its own separate SSH connection to the bastion
>
> The window for the race condition is between 1) and 4), so it's rather
> large: it includes the time to establish a new SSH connection.
>
> I believe that taking a lock between steps 1) and 4) could solve the issue:
>
> 1.1) each process tries to take an exclusive lock related to the control socket
> 1.1a) one process gets the lock and can continue creating an SSH connection
> 1.1b) all other processes wait on the lock; when the lock is released, they
> go back to step 1) to connect to the control socket
>
> 4.1) once the control socket has been bound, the "lucky process" releases the lock
>
> Does this make sense? Would the project accept a patch implementing this as
> an additional option?
Not sure if this is related, but I would like to have an option to *only* use the
control socket.
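
For reference, the race in steps 1) to 5) of the quoted message and the
locking scheme proposed in steps 1.1) to 4.1) can be sketched as a small
simulation. This is only an illustration under stated assumptions, not
OpenSSH code: a plain file stands in for the control socket, a sleep stands
in for establishing the connection to the bastion, the lock is a flock(2)
on a hypothetical "<ControlPath>.lock" file, and the names run_clients and
client are made up for this sketch. It needs a Unix-like system because it
uses the fcntl module.

```python
import fcntl
import os
import tempfile
import threading
import time


def run_clients(n, use_lock):
    """Simulate n simultaneous clients; return (new_connections, outcomes)."""
    new_connections = 0
    outcomes = []
    counter_guard = threading.Lock()
    # The barrier forces the worst case: every client finishes step 1
    # before any of them starts connecting.
    start = threading.Barrier(n)

    with tempfile.TemporaryDirectory() as workdir:
        control_path = os.path.join(workdir, "control")  # stand-in for the socket
        lock_path = control_path + ".lock"               # hypothetical lock file

        def client():
            nonlocal new_connections
            # Step 1: try the control socket; it is not bound yet.
            if os.path.exists(control_path):
                outcomes.append("multiplexed")
                return
            start.wait()
            lock_fd = None
            try:
                if use_lock:
                    # Step 1.1: block until we hold the exclusive lock,
                    # then go back to step 1 and retry the control socket.
                    lock_fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o600)
                    fcntl.flock(lock_fd, fcntl.LOCK_EX)
                    if os.path.exists(control_path):
                        outcomes.append("multiplexed")
                        return
                # Step 2: establish a new connection (simulated by a sleep).
                time.sleep(0.05)
                with counter_guard:
                    new_connections += 1
                # Steps 3 and 4: try to bind the control socket.
                try:
                    fd = os.open(control_path,
                                 os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
                    os.close(fd)
                    outcomes.append("master")
                except FileExistsError:
                    # "already exists, disabling multiplexing"
                    outcomes.append("separate")
            finally:
                if lock_fd is not None:
                    # Step 4.1: closing the descriptor releases the flock.
                    os.close(lock_fd)

        threads = [threading.Thread(target=client) for _ in range(n)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

    return new_connections, sorted(outcomes)


if __name__ == "__main__":
    print("without lock:", run_clients(3, use_lock=False))
    print("with lock:   ", run_clients(3, use_lock=True))
```

Without the lock every simulated client opens its own connection and only
one of them ends up owning the control path. With the lock only the first
holder connects, and the others find the control path once they get the
lock. A real implementation would also have to handle a lock holder that
dies or fails to connect, which this sketch ignores.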
--
Sincerely,
Demi Marie Obenour (she/her/hers)