Parallel transfers with sftp (call for testing / advice)

Cyril Servant cyril.servant at gmail.com
Mon May 18 18:14:49 AEST 2020


Hi Peter, and thank you for your advice, it's really appreciated.

> * A new thread queue infrastructure
> 
> Please forget about using this pattern in the OpenSSH context. Others have
> mentioned that threading requires much more care, and while that is true I
> think a simple thread queue is doable, although I think there's an
> issue or two still overlooked in the proposed implementation (specifically,
> threads combined with child processes especially requires carefully dealing
> with signals in all threads, which the change doesn't seem to do, and the
> thread_loop() logic is not clear - the function will never return).

The thread exits in thread_real_loop(), but yeah, it may not be clear enough.
Anyway, if we have to avoid using threads, we'll have to rework all this…

> * Established behavior is changed, an error condition arbitrarily introduced
> 
> The proposed change refuses file transfer if the local and remote files
> have different sizes. This has never been a requirement before, and should
> certainly not become one. It's also not necessary for what you want to do.
> If a user says to transfer a file then the behavior always was and should
> continue to be that whatever was there before on the recipient side is simply
> overwritten.

No, this error is only raised when a thread is about to write its part of the
file, just after the main thread has created the sparse file. If the file
doesn't have the expected size at that point, it means something went wrong
during the sparse file creation.
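
To make the intent of that check concrete, here is a minimal sketch, written in
Python rather than the patch's C. The names write_chunk and expected_size are
hypothetical and are not taken from the patch, and the sketch only covers a
local destination file; ftruncate() is used here merely as a stand-in for the
main thread's file creation.

```python
import os
import tempfile


def write_chunk(fd, expected_size, offset, data):
    """Write data at offset, but only if the file already has expected_size."""
    actual_size = os.fstat(fd).st_size
    if actual_size != expected_size:
        # The file created by the main thread does not look as expected.
        raise OSError(
            f"destination is {actual_size} bytes, expected {expected_size}"
        )
    view = memoryview(data)
    while len(view) > 0:
        # os.pwrite may write fewer bytes than requested, so loop.
        written = os.pwrite(fd, view, offset)
        view = view[written:]
        offset += written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "dest.bin")
        fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
        try:
            os.ftruncate(fd, 4096)  # stand-in for the main thread's step
            write_chunk(fd, 4096, 1024, b"x" * 512)
            print("chunk written, size is", os.fstat(fd).st_size)
        finally:
            os.close(fd)
```

Note that the check says nothing about what was in the destination before the
transfer started; it only compares the size right before a chunk is written.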

> * Don't add server workarounds to the client
> 
> [...]
> 
> * Ad-hoc name resolution and random server address choice

Those two features were added with our specific HPC clusters in mind.
I think we can simply remove them, and just focus on a point-to-point
operation. For information, the unpatched sftp never resolves hostnames; it
just lets the underlying ssh process do it.

> Another approach would be for the recipient to initially create the full
> size file before any data transfer, but that comes at the cost of a probably
> quite significant delay. (Maybe not though? Can you test?)

Well, without changing anything server-side, the only solution is to write
something at the end of the file. In our tests it creates a sparse file, but
indeed, this must be portable, and we have to test it on multiple platforms…
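
For testing on other platforms, something like the following could be a
starting point. It is only a sketch in Python, not the patch's C code: the
function names, the file name and the 16 MiB size are arbitrary choices for
illustration. It writes a single byte at the last offset and then compares the
logical size with the space actually allocated. Whether a hole is really
created depends on the OS and the filesystem, so the script only reports what
it sees.

```python
import os
import tempfile


def create_by_writing_last_byte(path, size):
    """Create path with logical length size by writing one byte at the end."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        if size > 0:
            os.pwrite(fd, b"\0", size - 1)
    finally:
        os.close(fd)


def sizes(path):
    """Return (logical size, allocated bytes) for path."""
    st = os.stat(path)
    # st_blocks is conventionally counted in 512-byte units on Unix systems.
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sparse.bin")
        create_by_writing_last_byte(path, 16 * 1024 * 1024)
        logical, allocated = sizes(path)
        print("logical size:", logical, "allocated:", allocated)
        if allocated < logical:
            print("the file appears to be sparse here")
        else:
            print("no hole was created on this filesystem")
```

Timing the creation for large sizes on a filesystem without sparse file support
would also show how significant the delay mentioned above is.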

Once again, thanks for your advice. I've only answered a few things here, but
as you said, portability is the main concern. And if an alternative to threads
has to be chosen, we'll have to think about it.

-- 
Cyril



More information about the openssh-unix-dev mailing list