Odd Performance Issue in clientloop

rapier rapier at psc.edu
Wed Nov 10 05:16:02 AEDT 2021


This isn't an issue so much as an odd situation I don't fully
understand. That said, if I can understand it, it might turn out to be
useful.

In the function client_process_net_input in clientloop.c, if I increase
the size of buf from SSH_IOBUFSZ to 128KB, I see a pretty substantial
performance improvement, mostly when using aes256-ctr.

For example, with the command

./ssh HostB -caes256-ctr "cat /dev/zero" > /dev/null

I'm seeing throughput of around 610MB/s on a 10Gb network.

With an unmodified build I see about 480MB/s.

Same hosts, same command. The only difference being the size of buf in 
client_process_net_input.

HostA is a Xeon X5675 @ 3GHz. HostB is an AMD Ryzen 7 5800X.

My initial assumption is that, since HostA is CPU bound, reducing the
number of reads has a significant impact. That said, a nearly 30%
improvement (480MB/s to 610MB/s) seems excessive for that to be all
that's going on. I'm also not seeing as much improvement with
chachapoly: only about 20%, from 310MB/s for stock to 375MB/s for the
big buffer.
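
For reference, the arithmetic behind those percentages, plus a rough
lower bound on how many read() calls a given buffer size implies, can be
sketched in C. This is only a back-of-the-envelope helper: the function
names are made up for illustration, "MB" is assumed to mean 10^6 bytes,
and the read count assumes every read() fills the buffer, which a real
socket will often not do.

```c
#include <assert.h>
#include <stddef.h>

/* Relative throughput gain of `improved` over `base`, in percent. */
static double
pct_gain(double base, double improved)
{
	assert(base > 0.0);
	return (improved - base) / base * 100.0;
}

/*
 * Lower bound on read() calls per second needed to sustain
 * `mb_per_sec` (assumed decimal megabytes) when each call returns at
 * most `bufsz` bytes. Assumes every read fills the buffer, so the real
 * call rate can only be higher than this.
 */
static double
min_reads_per_sec(double mb_per_sec, size_t bufsz)
{
	assert(bufsz > 0);
	return mb_per_sec * 1e6 / (double)bufsz;
}
```

With the figures above, pct_gain(480, 610) comes out at roughly 27% and
pct_gain(310, 375) at roughly 21%. To compare call rates, pass
min_reads_per_sec() the measured throughput along with 128*1024 and
whatever SSH_IOBUFSZ is defined as in the tree being tested.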

Additionally, I'm only seeing the improvement when HostB is sending the
data and HostA is receiving. When HostA (the Xeon) is sending (cat
/dev/zero | ./ssh HostB "cat > /dev/null") I see about 290MB/s with
either version.

I'm not suggesting any changes to the code. I'm just trying to
understand what might be happening, as it could be the buffer size,
something with the CPU architecture, the switch I'm using, the distro
(HostA is Fedora, HostB is Ubuntu), etc. Any clues would be appreciated.
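
One way to narrow this down would be to check whether the larger buffer
is actually being filled, by recording how much each read() returns. The
sketch below is a hypothetical wrapper, not anything that exists in
clientloop.c: struct read_stats and counted_read are names invented
here, and wiring it into the client in place of the existing read() call
is left as an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical counters; nothing like this exists in OpenSSH. */
struct read_stats {
	unsigned long long calls;	/* read() calls that returned data */
	unsigned long long bytes;	/* total bytes returned */
	unsigned long long full;	/* calls that filled the whole buffer */
	size_t largest;			/* largest single return value */
};

/* read() wrapper that records how much each successful call returned. */
static ssize_t
counted_read(int fd, void *buf, size_t bufsz, struct read_stats *st)
{
	ssize_t n;

	assert(buf != NULL && st != NULL);
	n = read(fd, buf, bufsz);
	if (n > 0) {
		st->calls++;
		st->bytes += (unsigned long long)n;
		if ((size_t)n > st->largest)
			st->largest = (size_t)n;
		if ((size_t)n == bufsz)
			st->full++;
	}
	/* Errors and EOF are passed through untouched for the caller. */
	return n;
}
```

If `largest` never exceeds the stock buffer size, or `full` stays near
zero with the 128KB buffer, the gain is probably not coming from fewer
reads alone.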

Here is the specific change I made:

diff --git a/clientloop.c b/clientloop.c
index bfcd50c2..8eebf9c2 100644
--- a/clientloop.c
+++ b/clientloop.c
@@ -600,7 +600,7 @@ client_suspend_self(struct sshbuf *bin, struct sshbuf *bout, struct sshbuf *berr
  static void
  client_process_net_input(struct ssh *ssh, fd_set *readset)
  {
-       char buf[SSH_IOBUFSZ];
+       char buf[128*1024];
         int r, len;

         /*


Thanks,

Chris


More information about the openssh-unix-dev mailing list