Control Sockets - Understanding ControlPersist

Wed Nov 2 08:12:47 AEDT 2022

Hello,

I'm trying to understand some irregularities in how long SSH control sockets survive when given a ControlPersist= value. I'm assigning them a ControlPersist value of 4 hours, but they last an unpredictable amount of time: anywhere from seconds to minutes, sometimes over an hour, but never the full 4 hours. I assume this is caused by something I'm doing in ignorance, but I haven't been able to pinpoint what it is yet.

- Control socket host is running OpenSSH_7.9p1 Debian-10+deb10u2, OpenSSL 1.1.1d  10 Sep 2019
- Remote hosts are running OpenSSH_6.7p1, OpenSSL 1.0.2d 9 Jul 2015

Essentially, I have a bash script that SSHs into several embedded Linux systems every ~15 seconds. This script utilizes control sockets for each connection like so:

### For creating SSH control socket in the first place
SSH_SOCKET_TIMEOUT="4h"
SSH_SOCKET_OPTIONS="-M -S $SSH_SOCKET_NAME -o ControlPersist=${SSH_SOCKET_TIMEOUT} -o StrictHostKeyChecking=no -o ConnectTimeout=5 "

### For subsequent SSH connections using the control socket
SSH_OPTIONS="-S $SSH_SOCKET_NAME -o StrictHostKeyChecking=no -o ConnectTimeout=5 -o BatchMode=yes"

These variables are set in the script and invoked with a simple:
### Master Socket creation
ssh ${SSH_SOCKET_OPTIONS} ${USERNAME}@${REMOTEHOST}
### Subsequent connections
ssh ${SSH_OPTIONS} ${USERNAME}@${REMOTEHOST}

Later in the script, the existence of each control socket is validated with a simple if check:
if [[ -S "$SSH_SOCKET_NAME" ]]; then
    return 0
else
    return 1
fi
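
A minimal sketch of a stricter check, assuming the control host's ssh supports the documented `-O check` control command: besides testing for the socket file, it asks the master process whether it is still running. The function name `socket_is_alive` and the `SSH_BIN` override are hypothetical and not part of the original script.

```shell
# Hypothetical helper, not part of the original script.
# SSH_BIN is an assumption: it lets the ssh binary be swapped out for testing.
SSH_BIN="${SSH_BIN:-ssh}"

# Usage: socket_is_alive <socket path> <user@host>
# Returns 0 only if the socket file exists and a master answers on it.
socket_is_alive() {
    # Cheap test first: no socket file means there is no master to ask.
    [ -S "$1" ] || return 1
    # Ask the master process to confirm it is running; discard its output.
    "$SSH_BIN" -S "$1" -O check "$2" >/dev/null 2>&1
}
```

This would not explain why a socket disappears, but logging its result alongside the file test would show whether the socket file vanished or the master simply stopped answering.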

The script then logs when a socket no longer exists (i.e. when that conditional returns 1) and rebuilds it as needed. Again, what I'm seeing is that the sockets last a variable length of time, from minutes to hours, but never the full 4 hours set in ControlPersist. For example:
Wed 26 Oct 2022 12:02:18 AM UTC  transfer_files.sh: INFO: Building SSH Control socket for host1.
Wed 26 Oct 2022 12:02:18 AM UTC  transfer_files.sh: INFO: Building SSH Control socket for hostN.
##### ~8 minutes elapse before all sockets are rebuilt
Wed 26 Oct 2022 12:10:10 AM UTC  transfer_files.sh: INFO: Building SSH Control socket for host1.
Wed 26 Oct 2022 12:10:10 AM UTC  transfer_files.sh: INFO: Building SSH Control socket for hostN.
##### ~38 seconds elapse
Wed 26 Oct 2022 12:10:48 AM UTC  transfer_files.sh: INFO: Building SSH Control socket for host1.
Wed 26 Oct 2022 12:10:48 AM UTC  transfer_files.sh: INFO: Building SSH Control socket for hostN.
##### ~12 minutes elapse but only one host's socket is rebuilt
Wed 26 Oct 2022 12:22:11 AM UTC  transfer_files.sh: INFO: Building SSH Control socket for host1.

Having said all this (thanks for making it this far), is there something that would cause an SSH control socket to not last the full length of its ControlPersist value? The manpage references using ControlPersist in concert with ControlMaster. Based on the manpage wording, I assumed the -M flag I used was a stand-in for ControlMaster. In testing with shorter durations, a control socket always survived for the length of the ControlPersist value and was then rebuilt exactly as intended.
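
A minimal sketch of the same master invocation with the short flags written out as -o options, assuming the ssh(1) reading that -M corresponds to ControlMaster=yes and -S to ControlPath. The helper name `master_command` and the example socket path are hypothetical. It only prints the command and does not run it.

```shell
# Sketch only: print the master-creation command with every option in -o form.
# master_command is a hypothetical helper name, not part of the original script.
# Usage: master_command <socket path> <persist time> <user@host>
master_command() {
    printf 'ssh -o ControlMaster=yes -o ControlPath=%s -o ControlPersist=%s -o StrictHostKeyChecking=no -o ConnectTimeout=5 %s\n' \
        "$1" "$2" "$3"
}

# Example with placeholder values.
master_command /tmp/example.sock 4h user@host1
```

Spelling the options out this way removes any doubt about whether the short flags map to the settings the manpage describes.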

Any help would be greatly appreciated,
William Dell