SCP with Resume Feature

rapier rapier at psc.edu
Fri Apr 9 05:21:36 AEST 2021



On 4/7/21 10:41 AM, Ron Frederick wrote:

> That said, is the SCP implementation in OpenSSH currently doing any file-level parallelization? I wouldn’t expect it to, so I’m not sure that would explain the performance difference. If I had to guess, it’s more likely due to the fact that there’s a single round-trip with SCP for each file transfer, whereas SFTP involves separate requests to do an open(), read(), stat(), etc. each of which has its own round-trip. Some of those (such as the read() calls) are parallelized, but you still have to pay for the open() before beginning the reads, and possibly for other things like stat() when preserving attributes.
> 

No parallelization at all. I've thought about it, but I'll have to come
back to it when I have time; there are other deliverables for this
project I need to focus on first. As for the number of round trips
(RTs) - there are a couple of message round trips, but nothing
significant. The resume feature increases the number of RTs, but it's
still faster.
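
To put rough numbers on the round-trip question, here is a
back-of-envelope sketch in Python. Everything in it is an assumption for
illustration: the function name, the 50 ms RTT, the 1 Gbit/s link and
the round-trip counts are made up, not measurements of scp or sftp. It
models a strictly serial transfer in which each file pays a fixed number
of round trips before its data flows.

```python
def serial_transfer_time(num_files, file_bytes, rtt_s,
                         round_trips_per_file, bandwidth_bps):
    """Estimate seconds for a file-by-file transfer with no overlap.

    Hypothetical model: every file costs `round_trips_per_file` full
    round trips of setup, then its payload moves at link speed.
    """
    setup_s = num_files * round_trips_per_file * rtt_s
    payload_s = num_files * file_bytes * 8 / bandwidth_bps
    return setup_s + payload_s


if __name__ == "__main__":
    # Assumed scenario: 1000 files of 1 MB each, 50 ms RTT, 1 Gbit/s link.
    for trips in (1, 4):
        t = serial_transfer_time(1000, 1_000_000, 0.05, trips, 1e9)
        print(f"{trips} round trip(s) per file: {t:.1f} s")
```

With many small files the setup term dominates, while for a handful of
very large files it all but disappears, so a few extra round trips per
file are unlikely to matter for big data sets.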

I absolutely agree with Damien that pipeline stalling is the major
factor. In any case, I've been looking to learn more about
pipelining. :)

In some cases there *might* be an issue with hitting the limit on
outstanding message requests, but that's not what's happening here. I
really do want to take a closer look at this - especially if SCP is
going to default to the SFTP protocol soon. In the high-performance
computing community we do have faster transport tools like GridFTP and
Aspera, but they have some serious barriers to entry for a lot of users.
SCP is still widely used for transferring large data sets (people moving
TBs of data via SCP isn't uncommon where I work), so performance in
those environments is a concern of mine.
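
For anyone who wants to estimate when the outstanding request limit
would start to matter, here is a minimal sketch of the usual
window-over-RTT ceiling. The function names and the 50 ms RTT are
hypothetical. The 64 requests of 32768 bytes are what I recall as the
sftp(1) defaults for -R and -B, so treat them as assumptions and check
your own man page. This is only the bound imposed by the request window;
it ignores the SSH channel window, TCP and cipher overhead.

```python
import math


def window_limited_throughput(outstanding_requests, request_bytes, rtt_s):
    """Upper bound in bytes/s when at most `outstanding_requests`
    reads of `request_bytes` each can be in flight per round trip."""
    return outstanding_requests * request_bytes / rtt_s


def requests_needed(target_bytes_per_s, request_bytes, rtt_s):
    """Smallest number of in-flight requests that can sustain the target."""
    return math.ceil(target_bytes_per_s * rtt_s / request_bytes)


if __name__ == "__main__":
    rtt = 0.05  # assumed 50 ms round-trip time
    ceiling = window_limited_throughput(64, 32768, rtt)
    print(f"ceiling: {ceiling * 8 / 1e6:.1f} Mbit/s")
    # In-flight requests needed to fill 10 Gbit/s (1.25e9 bytes/s).
    print("requests for 10 Gbit/s:", requests_needed(1.25e9, 32768, rtt))
```

On a low-latency LAN the ceiling is far above what the link can carry,
which is consistent with the limit not being the problem here. It only
starts to bite on long, fast paths.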


More information about the openssh-unix-dev mailing list