how to troubleshoot ssh multiplexing hanging issues?

Damien Miller djm at mindrot.org
Wed Sep 13 08:46:15 AEST 2017


On Tue, 12 Sep 2017, Ken Chang wrote:

> Everything seems to work for a few days, but then ssh starts to hang, and
> we start seeing several hundred ssh processes all trying, and failing, to
> send their messages. When I try to run ssh by hand, this is what I get:
> 
> $ ssh -vvv boss@ui1
> OpenSSH_6.6.1, OpenSSL 1.0.1e-fips 11 Feb 2013
> debug1: Reading configuration data /var/lib/worker/.ssh/config
> debug1: /var/lib/worker/.ssh/config line 1: Applying options for *
> debug1: Reading configuration data /etc/ssh/ssh_config
> debug1: /etc/ssh/ssh_config line 56: Applying options for *
> debug1: auto-mux: Trying existing master
> 
> And it hangs at that point indefinitely until Ctrl-C.
> 
> At this point in time, we do see the ssh mux process still running:
> 
> $ ps -eo pid,user,args | awk '$2=="worker" && $3=="ssh:" && $5=="[mux]" {print}'
> 29305 worker   ssh: /var/lib/worker/.ssh/cm-boss@ui1:22 [mux]
> 
> I attached strace to the ssh mux process, and this is what I see
> when the problem is happening:

A debug log from the mux process from around this point would be much
more useful. Is there any chance you could catch one?
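One possible way to catch such a log is to restart the master, while it is
still responsive, with debugging written to a file. This is only a sketch: the
function name, the SSH override variable and the default log path are
hypothetical, and it assumes the ControlPath comes from the existing
configuration and that the client supports -E (OpenSSH 6.3 or later).

```shell
#!/bin/sh
# Hypothetical helper: replace the running mux master with one that logs at
# debug3 to a file. SSH can be overridden for dry runs (assumption, not part
# of the original setup).
SSH=${SSH:-ssh}

start_debug_master() {
    dest=$1                                  # e.g. boss@ui1
    log=${2:-$HOME/ssh-mux-debug.log}        # hypothetical log location

    # Ask the current master to exit. Only do this while it still answers;
    # a master that is already hung may not respond to control commands.
    "$SSH" -O exit "$dest" 2>/dev/null || true

    # Start a new master: -M master mode, -N no remote command,
    # -f go to background, -E append debug output to the log file.
    "$SSH" -vvv -E "$log" -M -N -f "$dest"
}

# Example (not run automatically):
# start_debug_master boss@ui1 /var/lib/worker/ssh-mux-debug.log
```

When the hang recurs, the tail of that log file should show what the master
was doing around the failure.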

> accept(4, 0x7ffe26b34360, [128])        = -1 EMFILE (Too many open files)

This indicates that the mux process is running out of file descriptors.
Have you increased MaxSessions on the server? Do you have a PAM module,
login.conf or similar reducing a ulimit? Otherwise, there might be an fd
leak in the mux code. You should try a recent version; 6.6.1 is over
three years old...
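
To tell a low limit apart from a leak, it may help to compare how many
descriptors the mux process holds against its soft limit. A minimal sketch,
assuming Linux with /proc mounted; fd_usage is a hypothetical helper name:

```shell
#!/bin/sh
# Print the number of open descriptors and the soft "Max open files" limit
# for a given PID. Linux-only: relies on /proc/<pid>/fd and /proc/<pid>/limits.
fd_usage() {
    pid=$1
    # One directory entry per open descriptor.
    open=$(ls "/proc/$pid/fd" 2>/dev/null | wc -l | tr -d ' ')
    # Fourth field of the "Max open files" row is the soft limit.
    limit=$(awk '/^Max open files/ {print $4}' "/proc/$pid/limits" 2>/dev/null)
    echo "pid=$pid open_fds=$open soft_limit=$limit"
}

# Example, using the mux PID from the ps output quoted above:
# fd_usage 29305
```

If open_fds climbs steadily over several days towards soft_limit, that would
point at a leak; if the limit itself is unexpectedly small, look at whatever
is setting the ulimit.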

-d


More information about the openssh-unix-dev mailing list