High Performance SSH/SCP - HPN-SSH when?

Corinna Vinschen vinschen at redhat.com
Wed Apr 5 21:02:38 EST 2006

On Apr  5 10:09, Darren Tucker wrote:
> Chris Rapier wrote:
> > So this is an additional decrease in performance if the HPN patch is 
> > used with 8k buffers? What can you tell me about the path? Does it have 
> > a very low RTT? The available version of HPN (HPN11) polls the SO_RCVBUF 
> > once every window so in low RTT environments the additional cycles spent 
> > on this could have an impact. The beta version of the patch (HPN12) 
> > provides a switch to disable buffer polling. I'm still working on this
> > issue (recreating the problems consistently has been an issue for me) 
> > but you might want to look at the HPN12 patch set.
>
> I think this is another symptom of the HPN patch letting the buffers get
> way too big under some conditions, so that ssh then spends a
> disproportionate amount of time compacting those buffers.
>
> Assume the 8k writes are relatively slow on Cygwin (which appears to
> be the case[1]).  ssh will be emptying the output buffer relatively
> slowly, but the CPU can encrypt much faster than the IO rate.  Normally
> the buffer would peak at 5-10 MB under these conditions, but the
> BUFFER_MAX_HPN_LEN change means that it can grow really big (up to
> 2^29 bytes).  The buffer gets compacted at 1MB consumed, so the process
> becomes "read 1MB in 8k chunks, then memmove (2^29 - 1M)".
>
> Corinna, if this is the case you should see ssh consuming a lot more
> memory, more CPU and, if you can profile it, spending a lot more time in
> memmove.
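
To put rough numbers on the compaction cost described above, here is a
small back-of-the-envelope sketch.  It is not taken from the OpenSSH or
HPN sources: the function names are hypothetical, the buffer is assumed
to stay full in steady state, and "1MB" and "10 MB" are read as binary
megabytes.

```python
CHUNK = 8 * 1024            # size of each write that drains the buffer
COMPACT_AT = 1024 * 1024    # consumed bytes that trigger a compaction


def writes_per_compaction() -> int:
    """Number of 8k writes between two compactions."""
    return COMPACT_AT // CHUNK


def copy_amplification(buffer_len: int) -> float:
    """Bytes moved by memmove for every byte actually written out.

    Assumes the buffer holds buffer_len bytes each time a compaction
    runs, so one compaction moves (buffer_len - COMPACT_AT) bytes after
    COMPACT_AT bytes have been consumed.
    """
    return (buffer_len - COMPACT_AT) / COMPACT_AT


if __name__ == "__main__":
    print("writes per compaction:", writes_per_compaction())
    print("10 MB buffer:  ", copy_amplification(10 * 2**20))
    print("2^29 byte buffer:", copy_amplification(2**29))
```

Under those assumptions a buffer near the usual 10 MB peak copies about
9 bytes per byte written, while a buffer that has grown to 2^29 bytes
copies about 511.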

As I already replied to Chris, I won't be able to run more tests in
the next few days, but I'll be back on Tuesday.

As for Cygwin's slowness, I think that neither memcpy/memmove nor
malloc is the problem.  The only function call in the client loop
that could account for Cygwin's slowness is probably select(2).
The select implementation in Cygwin is rather complicated and it's
slow by design.  The problem is that Windows' own select only works
on sockets, not on arbitrary file descriptors.  Therefore, to implement
a general select, a lot of contortions are needed.  Perhaps that's
the problem which slows down the loop when read uses a rather small

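A rough way to see how often select gets called for a given read size
is sketched below.  It is only an illustration, not the ssh client
loop: count_select_calls is a made-up helper, it pushes data through a
local socketpair, and it assumes a platform where socket.socketpair is
available.  Each pass through the loop makes exactly one select call
and reads at most read_size bytes.

```python
import select
import socket


def count_select_calls(total_bytes: int, read_size: int) -> int:
    """Transfer total_bytes over a socketpair, one select call per pass."""
    src, dst = socket.socketpair()
    src.setblocking(False)
    dst.setblocking(False)
    payload = bytes(total_bytes)
    sent = received = calls = 0
    try:
        while received < total_bytes:
            # Only ask for writability while there is still data to send.
            want_write = [src] if sent < total_bytes else []
            readable, writable, _ = select.select([dst], want_write, [])
            calls += 1
            if writable:
                sent += src.send(payload[sent:sent + 65536])
            if readable:
                received += len(dst.recv(read_size))
    finally:
        src.close()
        dst.close()
    return calls


if __name__ == "__main__":
    for size in (8 * 1024, 64 * 1024):
        print(size, count_select_calls(1024 * 1024, size))
```

With 8k reads, moving 1 MB takes at least 128 select calls, so any
fixed per-call overhead in select is paid at least that many times per
megabyte.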

Corinna Vinschen
Cygwin Project Co-Leader
Red Hat

More information about the openssh-unix-dev mailing list