issue with openssh-server running in a libvirt based centos virtual machine

Adrian Pascalau adrian27oradea at gmail.com
Mon Jan 29 20:41:26 AEDT 2018


On Sun, Jan 28, 2018 at 9:45 PM, Adrian Pascalau
<adrian27oradea at gmail.com> wrote:
> I have tested this one more time with all the hosts (ssh client and
> ssh server) in the same subnet, no routers/vpn in between. All hosts
> are connected to the same switch and the same problem persists, so it
> is not an MTU issue.
>
> I took several tcpdump traces and compared the working ssh sessions
> with the non-working ones, and this is what I found: when an
> Ethernet frame that is less than 60 bytes in size goes through the
> network, it is padded with 0x00 bytes until it is 60 bytes long
> (64 with the frame check sequence). In my network I have a linux
> bridge that connects the centos VM to the external network. When
> such padded frames go through the linux bridge, the 0x00 padding
> bytes somehow end up counted as part of the user data by the IP and
> TCP layers, so the upper-layer protocol (SSH in this case) tries to
> interpret them, and this is why Putty hangs. Those 0x00 padding
> bytes belong to the layer 2 Ethernet frame and should not be
> considered part of the user data of the higher-level protocols. I
> think I should take this to the linux bridge mailing lists.

Ok, so I found a workaround for this, even though I do not know what
caused this issue.

Basically I noticed that I have this ssh connection issue only when
the ssh client runs on a Windows host. If the ssh client runs on a
Linux host, the ssh connection works without any problem. So I
compared the tcpdump traces for ssh connections initiated from both
Windows and Linux. What I noticed is that on CentOS Linux the TCP
stack uses the timestamps option by default, and because of this the
Ethernet frames are never below 60 bytes, while on my Windows host
TCP timestamps are not used, and therefore some Ethernet frames are
shorter than 60 bytes.
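
To make the arithmetic behind this explicit, here is a small Python
sketch of the frame sizes involved. It assumes untagged Ethernet (no
VLAN tag), IPv4 without IP options, and the usual 12-byte encoding of
the TCP timestamps option (10 bytes plus two NOPs); the function names
are only illustrative.

```python
ETH_HEADER = 14            # destination MAC + source MAC + EtherType
IPV4_HEADER = 20           # IPv4 header without options
TCP_HEADER = 20            # TCP header without options
TCP_TIMESTAMP_OPTION = 12  # 10-byte option, usually padded with two NOPs
ETH_MIN_NO_FCS = 60        # minimum frame size before the 4-byte FCS


def frame_size(tcp_options=0, payload=0):
    """Size of an untagged Ethernet frame carrying one TCP segment (no FCS)."""
    return ETH_HEADER + IPV4_HEADER + TCP_HEADER + tcp_options + payload


def padding_needed(size):
    """Number of 0x00 bytes required to reach the Ethernet minimum."""
    return max(0, ETH_MIN_NO_FCS - size)


# A pure ACK (no payload) without and with the timestamps option.
ack_without_ts = frame_size()
ack_with_ts = frame_size(tcp_options=TCP_TIMESTAMP_OPTION)

print(ack_without_ts, padding_needed(ack_without_ts))  # 54 6
print(ack_with_ts, padding_needed(ack_with_ts))        # 66 0
```

Under these assumptions any segment with fewer than 6 bytes of payload
gets padded when timestamps are off, and no segment is ever padded when
they are on.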

So I enabled TCP timestamps on Windows as well by running
the command 'netsh int tcp set global timestamps=enabled', and just
like that, ssh started to work. Still, I do not know what is
causing this issue or what to blame for this behavior...

Any suggestions on how to identify which network element wrongly
assigns the Ethernet padding to the TCP payload are more than welcome.
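
One possible way to narrow this down is to capture the same frame on
each side of every element in the path and check whether the IPv4
total length still excludes the padding. The Python sketch below is
only an illustration of that check: it assumes raw, untagged IPv4
Ethernet frames (for example exported from a capture), and the helper
names are hypothetical.

```python
import struct

ETH_HEADER = 14  # untagged Ethernet header length


def ip_total_length(frame: bytes) -> int:
    """Read the IPv4 total length field from an untagged Ethernet frame."""
    ethertype = struct.unpack("!H", frame[12:14])[0]
    if ethertype != 0x0800:
        raise ValueError("not an untagged IPv4 frame")
    return struct.unpack("!H", frame[16:18])[0]


def ethernet_padding(frame: bytes) -> bytes:
    """Bytes that follow the end of the IPv4 datagram (layer 2 padding)."""
    return frame[ETH_HEADER + ip_total_length(frame):]


def tcp_payload(frame: bytes) -> bytes:
    """TCP payload as delimited by the IP and TCP headers, padding excluded."""
    ihl = (frame[ETH_HEADER] & 0x0F) * 4
    tcp_start = ETH_HEADER + ihl
    data_offset = (frame[tcp_start + 12] >> 4) * 4
    return frame[tcp_start + data_offset:ETH_HEADER + ip_total_length(frame)]


# Synthetic pure ACK: 14 + 20 + 20 = 54 bytes, padded to 60 with 0x00.
eth = bytes(12) + b"\x08\x00"
ip = struct.pack("!BBH", 0x45, 0, 40) + bytes(16)  # total length = 40
tcp = bytes(12) + bytes([5 << 4]) + bytes(7)       # data offset = 5 words
frame = eth + ip + tcp + bytes(6)

print(len(frame), len(ethernet_padding(frame)), len(tcp_payload(frame)))
```

If the headers are intact, the padding stays outside the TCP payload.
A capture point where the total length has grown to cover the padding,
or where the payload is no longer empty for a pure ACK, would point at
the element just before it.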


