Difference in buffer behaviour between 8.8 and 8.9?
Damien Miller
djm at mindrot.org
Thu May 19 08:57:25 AEST 2022
On Wed, 18 May 2022, rapier wrote:
> Hey all,
>
> I've run into a strange issue and I'm wondering if anyone has any
> insight on this. I have a modified version of OpenSSH called hpnssh
> that uses larger receive buffer sizes to improve throughput. When I
> ported that patch set to 8.9 I found that, after an initial burst of
> data, the throughput would drop to zero and just sort of stay there.
> This happens regardless of the RTT; I see the same issue at
> sub-millisecond RTTs and at 150ms.
>
> I've been able to replicate this behaviour in OpenSSH by increasing the
> size of CHAN_SES_WINDOW_DEFAULT from 64*CHAN_SES_PACKET_DEFAULT (CSPD)
> to higher values. You kind of see it in the debug log at 96*CSPD and
> you can start to see it impact throughput at 128*CSPD. With my tests I
> have it at 512*CSPD and the pause will last upwards of 5 seconds. When
> it's at 1024*CSPD (corresponding to a 32MB receive buffer) the pause
> will last 18 seconds or more.
>
> I dropped a debug statement into sshbuf_allocate in sshbuf.c to get
> a better view of what's happening. The following are the results of
> that in both 8.9p1 and 8.8p1:
[snip]
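
For reference, the multipliers quoted above can be turned into byte
counts. This is only a sketch: it assumes CHAN_SES_PACKET_DEFAULT is
32KB, which is what the "1024*CSPD = 32MB" figure in the message implies.
The helpers window_bytes() and window_mib() are hypothetical names made
up for this illustration and are not part of OpenSSH.

```c
#include <assert.h>
#include <limits.h>

/* Assumed value: 32KB, inferred from "1024*CSPD = 32MB" in the message. */
#define CHAN_SES_PACKET_DEFAULT (32UL * 1024UL)

/* Hypothetical helper: window size in bytes for a given CSPD multiplier. */
unsigned long window_bytes(unsigned long multiplier)
{
    /* Guard against overflow of the multiplication. */
    assert(multiplier <= ULONG_MAX / CHAN_SES_PACKET_DEFAULT);
    return multiplier * CHAN_SES_PACKET_DEFAULT;
}

/* Hypothetical helper: the same window size expressed in whole MiB. */
unsigned long window_mib(unsigned long multiplier)
{
    return window_bytes(multiplier) / (1024UL * 1024UL);
}
```

Under that assumption the stock 64*CSPD window is 2MB, 128*CSPD is 4MB,
512*CSPD is 16MB, and the 4096*CSPD setting mentioned for the 8.8 run
is 128MB.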
> In version 8.8 I also see a buffer growing during the course of the
> transfer but it looks like it doesn’t need to grow to the same size as
> in 8.9p1 (in this case WINDOW_DEFAULT was set to 4096*CSPD). This is on
> the same test bed, same setup, and the tests are being run one right
> after the other.
>
> The behaviour in 8.9 is very consistent. After a certain number of
> adjusted rlens for what I am assuming is the packet buffer, it goes
> into this growth phase for another buffer. In my tests that's been
> about 700 of these adjusts, which is unrelated to the size of
> CHAN_SES_WINDOW_DEFAULT.
>
> So I’m wondering what changed between 8.8 and 8.9 that might account for
> this and if this is expected/desired behaviour. I'm still working
> through the diff from 8.8 to 8.9 but nothing has leaped out at me yet.
8.9 switched from select() to poll() and included a couple of bugs that
could cause weird problems. IMO you should try to port to what's on
the top of the V_9_0 git branch, which is 9.0 + one more poll()-related
fix.
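
For readers unfamiliar with the two interfaces, here is a minimal,
generic POSIX sketch of waiting on a single descriptor with poll().
It is not OpenSSH's event loop and does not reproduce the bugs
mentioned above; wait_readable() is a hypothetical helper written only
for this illustration.

```c
#include <poll.h>

/*
 * Hypothetical helper, not OpenSSH code: wait up to timeout_ms for fd
 * to become readable. Returns 1 if readable (or hung up / in error),
 * 0 on timeout, -1 if poll() itself failed (errno is set).
 */
int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int r;

    pfd.fd = fd;
    pfd.events = POLLIN;   /* the events we are interested in */
    pfd.revents = 0;       /* filled in by poll() */

    r = poll(&pfd, 1, timeout_ms);
    if (r <= 0)
        return r;          /* 0: timed out, -1: error */

    /* POLLHUP and POLLERR are reported even though not requested. */
    return (pfd.revents & (POLLIN | POLLHUP | POLLERR)) ? 1 : 0;
}
```

Unlike select(), which takes fd_set bitmasks bounded by FD_SETSIZE,
poll() takes an array of struct pollfd and reports results per entry in
revents, so code ported from one to the other has to translate both the
requested events and how the results are read back.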
-d