SSH hang question

Darren Tucker dtucker at dtucker.net
Sun Nov 10 18:58:47 AEDT 2019


On Sun, 10 Nov 2019 at 05:10, Steve McAfee <smcafee.social at gmail.com> wrote:
> Very rarely, but it has repeated, we see openssh on the client side
> hanging. On the server side there is no indication of connection in the
> logs. These are always scripted remote commands that do not have user
> interaction when we find it. This seems to be happening only in vm
> environments but I could be wrong.

What's the VM platform and underlying network technology?  At least
one (VMware Fusion) is known to have problems, although yours doesn't
sound exactly like this:
https://marc.info/?l=openssh-unix-dev&m=153535111501535&w=2

> It seems surprising to me that there
> would not be timeouts and retries on the protocol,

SSH is built on top of TCP, which provides the reliable bytestream and
thus implements the timeouts and retries, so if you can find the
problematic connection in the output of netstat you may get some clues
about what's going on.

One of the failure modes that can behave as you describe is the
infamous TCP MTU blackhole, wherein a large packet gets fragmented, the
2nd fragment gets dropped for some reason and the IP packet times out
during reassembly.  TCP retransmits the packet, which again gets
fragmented and the cycle repeats until TCP eventually times out the
connection.  PPPoE and 802.1Q VLANs are common culprits because they
reduce the MTUs just a little bit.
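
To make "just a little bit" concrete, here is a rough sketch of the
arithmetic.  It assumes IPv4 and TCP headers of 20 bytes each with no
options, a standard 1500-byte Ethernet MTU, and the usual 8 bytes of
PPPoE overhead; the function names are made up for illustration.

```python
# Assumed sizes: option-free IPv4 and TCP headers, standard Ethernet MTU,
# and 8 bytes of PPPoE encapsulation (6-byte PPPoE header + 2-byte PPP ID).
ETHERNET_MTU = 1500
PPPOE_OVERHEAD = 8
IPV4_HEADER = 20
TCP_HEADER = 20


def max_tcp_payload(mtu):
    """Largest TCP payload that fits in one unfragmented IP packet."""
    return mtu - IPV4_HEADER - TCP_HEADER


def overshoot(ip_packet_size, link_mtu):
    """Bytes by which an IP packet exceeds a link's MTU (0 if it fits)."""
    return max(0, ip_packet_size - link_mtu)


pppoe_mtu = ETHERNET_MTU - PPPOE_OVERHEAD

# A host that only knows about its own 1500-byte MTU fills packets to 1500.
full_size_packet = max_tcp_payload(ETHERNET_MTU) + IPV4_HEADER + TCP_HEADER

print("payload per packet on Ethernet:", max_tcp_payload(ETHERNET_MTU))
print("payload per packet over PPPoE: ", max_tcp_payload(pppoe_mtu))
print("full-size packet exceeds PPPoE MTU by:",
      overshoot(full_size_packet, pppoe_mtu), "bytes")
```

Only full-size packets are affected, which is why small interactive
traffic can work fine while a larger transfer on the same path stalls.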

I'd suggest checking:
 - netstat for the failing connections, looking for increasing Send-Q values,
 - netstat -s on problematic machines, looking for atypical counter values,
 - MTUs on the hosts and everything in between them.
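
As a rough sketch of the first check, the following compares two
snapshots of "netstat -tn" output taken some seconds apart and reports
the connections whose Send-Q grew.  It assumes the Linux column layout
(Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State); other
platforms format this differently, and the function names are
hypothetical.

```python
def parse_send_queues(netstat_output):
    """Map (local, remote) -> Send-Q for each TCP line of `netstat -tn`."""
    queues = {}
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed Linux layout:
        # Proto Recv-Q Send-Q Local-Address Foreign-Address State
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue  # skips the banner and column-header lines
        queues[(fields[3], fields[4])] = int(fields[2])
    return queues


def growing_send_queues(earlier, later):
    """Connections present in both snapshots whose Send-Q increased."""
    before = parse_send_queues(earlier)
    after = parse_send_queues(later)
    return sorted(
        conn for conn, sendq in after.items()
        if conn in before and sendq > before[conn]
    )
```

Feed it the text of two captures, for example the contents of two files
saved from "netstat -tn" a minute apart.  A Send-Q that keeps growing
means the local TCP has data the peer is not acknowledging.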

If it's none of these things then it's probably time to break out tcpdump.

> Or maybe there is some setting to make the connection reliable.

The ServerAliveInterval and ServerAliveCountMax settings can detect the
class of failure I described above, but in those cases the root cause
is a broken network and the network is what needs to be fixed.
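
For scripted commands, one way to apply those settings is on the ssh
command line.  This is only a sketch: the helper names, the host and the
values 15 and 3 are placeholders, and the option is spelled
ServerAliveCountMax in ssh_config(5).

```python
import subprocess


def build_ssh_command(host, remote_command, interval=15, count_max=3):
    """Return an ssh argv that enables client-side keepalive probing."""
    return [
        "ssh",
        "-o", f"ServerAliveInterval={interval}",
        "-o", f"ServerAliveCountMax={count_max}",
        host,
        remote_command,
    ]


def approx_detection_seconds(interval, count_max):
    """Roughly how long ssh waits before giving up on a silent server."""
    return interval * count_max


def run_remote(host, remote_command, interval=15, count_max=3):
    """Run the command; ssh disconnects if the keepalives go unanswered."""
    return subprocess.run(
        build_ssh_command(host, remote_command, interval, count_max),
        capture_output=True,
        text=True,
    )
```

With an interval of 15 and a count of 3, ssh should disconnect after
approximately 45 seconds without a reply instead of hanging.  That
turns a hang into an error your script can see; it does not fix the
network.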

-- 
Darren Tucker (dtucker at dtucker.net)
GPG key 11EAA6FA / A86E 3E07 5B19 5880 E860  37F4 9357 ECEF 11EA A6FA (new)
    Good judgement comes with experience. Unfortunately, the experience
usually comes from bad judgement.

