Parallel transfers with sftp (call for testing / advice)

Cyril Servant cyril.servant at gmail.com
Wed May 6 02:03:25 AEST 2020


Ron Frederick wrote:
> On May 5, 2020, at 8:36 AM, Cyril Servant <cyril.servant at gmail.com> wrote:
>> Ron Frederick wrote:
>>> I haven’t reviewed the patch in detail, but one thing that jumped out at me when looking at this was your choice to use multiple channels to implement the parallelism. The SFTP protocol already allows multiple outstanding reads or writes on a single channel, without waiting for the server’s response. I don’t know if the current OpenSSH client takes advantage of that or not, but it seems to me that you’d get the same benefit from that as you would from opening multiple channels with less setup cost (not having to open the additional channels, and not having to do separate file open calls on each channel). It might also make some other things like the progress bar support easier, as all of the responses would be coming back on the same channel.
>> 
>> As Matthieu said earlier, each ssh channel speed is bound to a CPU core speed.
>> The whole point here is to use multiple ssh channels in order to increase
>> transfer speed. And as Ben just said, there is already the -R option in sftp.
>> This option indeed helps, but the channel speed is still bound to a CPU core
>> speed.
> 
> 
> Thanks. It looks like OpenSSH’s SFTP “-B” and “-R” are indeed similar to what I described.
> 
> When you say “SSH channel” here, though, do you actually mean multiple data streams sent over a single TCP connection, or multiple independent TCP connections? I haven’t tried to measure this, but I would have expected most of the cost of OpenSSH processing to be the decryption, which happens at the connection level, not any processing at the channel level. So, I’m a bit surprised that you’d be able to get much benefit from multiple cores when running multiple channels on a single TCP connection.

I said SSH channel, but I should have said SSH connection. And since these are
separate TCP connections, they can also connect to multiple servers
simultaneously. Of course, this is only useful if those servers share the same
network storage (NFS, Lustre…).
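
To make the idea concrete, here is a rough sketch of how one file could be
divided between several connections. This is not the code from the patch; the
names split_ranges and plan_transfer and the host names are made up for
illustration, and the sketch only computes which byte range each connection
would handle, without doing any actual transfer.

```python
def split_ranges(file_size, n_connections):
    """Return (offset, length) pairs that cover file_size bytes.

    Hypothetical helper: one contiguous range per connection, with the
    remainder spread over the first ranges. Empty ranges are dropped.
    """
    if n_connections < 1:
        raise ValueError("need at least one connection")
    base, extra = divmod(file_size, n_connections)
    ranges = []
    offset = 0
    for i in range(n_connections):
        length = base + (1 if i < extra else 0)
        if length == 0:
            continue
        ranges.append((offset, length))
        offset += length
    return ranges


def plan_transfer(file_size, hosts):
    """Pair each host with the byte range its connection would transfer.

    Assumes all hosts see the same file through shared storage.
    """
    ranges = split_ranges(file_size, len(hosts))
    return [(host, off, length) for host, (off, length) in zip(hosts, ranges)]


if __name__ == "__main__":
    # Hypothetical servers that export the same network filesystem.
    for host, off, length in plan_transfer(10, ["srv1", "srv2", "srv3"]):
        print(host, off, length)
```

Each (host, offset, length) entry would then be handled by its own SSH
connection, so the ranges can be read or written independently of each other.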
-- 
Cyril


More information about the openssh-unix-dev mailing list