Parallel transfers with sftp (call for testing / advice)

Nico Kadel-Garcia nkadel at gmail.com
Mon May 11 00:02:26 AEST 2020


On Fri, May 8, 2020 at 6:46 PM Matthieu Hautreux
<matthieu.hautreux at cea.fr> wrote:
>
> > On 06/05/2020 at 03:16, Nico Kadel-Garcia wrote:
> > On Tue, May 5, 2020 at 4:31 AM Peter Stuge <peter at stuge.se> wrote:
> >> Matthieu Hautreux wrote:
> >>> The change proposed by Cyril in sftp is a very pragmatic approach to
> >>> dealing with parallelism at the file transfer level. It leverages the
> >>> already existing sftp protocol and its capability to write/read file
> >>> content at specified offsets. This makes it possible to speed up sftp
> >>> transfers significantly by parallelizing the SSH channels used for
> >>> large transfers. This improvement is performed only by modifying the
> >>> sftp client, which is a very small modification compared to the
> >>> openssh codebase. The modification is not too complicated to review
> >>> and validate (I did it) and does not change the default behavior of
> >>> the CLI.
> >> I think you make a compelling argument. I admit that I haven't
> >> reviewed the patch, even though that is what matters the most.
> >>
> >> I guess that no one really minds ways to make SFTP scale, but ever
> >> since the patch was proposed I have been thinking that the parallel
> >> channel approach is likely to introduce a whole load of not very clean
> >> error conditions regarding reassembly, which need to be handled
> >> sensibly both within the sftp client and on the interface to
> >> outside/calling processes. Can you or Cyril say something about this?
> > I find it an unnecessary feature given the possibilities of
> > out-of-band parallelism with multiple scp sessions transmitting
> > different manifests of files, of sftp to do the same thing, and of
> > tools like rsync to do it more efficiently by avoiding replication of
> > previously transmitted data and by re-connecting to complete partial
> > transmissions. It sounds like a bad case of "here, let me do this at a
> > different level of the stack" that is not normally necessary and has
> > already been done more completely and efficiently by other tools.
>
> I think you misunderstood the main point that is that we want to
> overcome the bandwidth limitation of a single SSH connection for
> transferring _very_large_ files.

From painful experience, the big problem with large single files is
not optimizing the individual transmission. It's avoiding repetition:
use rsync to send only *one* copy, with verification of successful
transmission. And if you need that much parallelism for monolithic
large files, the design effort is usually better spent splitting the
files into manageable chunks.
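
As a rough sketch of that approach (file names, chunk size, and the
parallelism count are made up for illustration):

    # split a monolithic file, push the chunks in parallel rsync
    # sessions, then reassemble on the far side
    split -b 1G hugefile hugefile.part.
    ls hugefile.part.* | xargs -P 4 -I{} rsync -a --partial {} host:/dest/
    ssh host 'cat /dest/hugefile.part.* > /dest/hugefile'

Each rsync session gets its own SSH connection, so you get the
multi-connection parallelism without touching the sftp client at all.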

> A single SSH connection has a bandwidth limitation that is either the
> network bandwidth or the efficiency of the cipher/MAC on the less
> powerful core of the two connected endpoints.
>
> If you traditionally use 1GE network cards, you will probably not see
> that if you have a good processor and the right cipher/MAC, as the
> network will be the bottleneck.
>
> If you are using 10GE (or more) network cards, you will see the CPU
> limitation, and will get to your bandwidth roofline at something very
> far from your network capacity.

Or, eventually, IOPs allocated to your host or hosts for "local disk access".
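
If you want to see where a single connection tops out before blaming
the network, a quick and admittedly rough test that takes the disks
out of the picture is something like:

    # measure ssh pipe throughput only: zeroes in, bit bucket out
    dd if=/dev/zero bs=1M count=4096 | ssh -c aes128-gcm@openssh.com host 'cat > /dev/null'

Swap ciphers with -c and you can watch the roofline move with the
crypto cost on the slower endpoint.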

> However, I am not aware of anything that enables sending _very_large_
> files using multiple SSH connections. The proposed patch does that.

It's usually not really "only one large file". It's usually a set, and
the efficiencies come from parallelizing across that set rather than
inside a single transfer. Even for one genuinely monolithic file, you
can approximate multi-connection ranged transfer with existing tools;
see the rough sketch below.
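
A sketch (sizes and paths are hypothetical, and the chunk size is
chosen to divide evenly into whole megabytes so the dd arithmetic
holds):

    # pull a 5 GiB remote file over 4 SSH connections, 1.25 GiB each,
    # writing every range at its own offset in the local file
    mb=5120; per=$((mb / 4))
    for i in 0 1 2 3; do
        ssh host "dd if=/data/bigfile bs=1M skip=$((i * per)) count=$per status=none" \
          | dd of=bigfile bs=1M seek=$((i * per)) conv=notrunc status=none &
    done
    wait

It's crude, but it is the same offset read/write trick the patch uses,
done outside the sftp client.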

> Give it a try, and send or receive a single 5GB file using a 10GE
> network and you will better see the point. If you have a solution with
> current ssh/scp/sftp/rsync that gets the most out of the network
> (>1GB/s), then surely the patches are useless. But I am pretty sure
> that you will see a bandwidth of a few hundred MB/s at most,
> depending on the cores involved on both sides.

That was at a different job. Mirroring bulky MySQL databases,
including some *ginormous* and very dynamic tables, to scale up new
cloud-based services at need. "Autoscaling" is not your friend for
hosts where syncing a scaled-up server takes more than 24 hours. It
was much more efficient to boost the instance type for a few hours to
enable enhanced networking and allow more IOPs, then scale them back
down when they were populated.

> >> And another thought - if the proposed patch and/or method indeed will not
> >> go anywhere, would it still be helpful for you if the sftp client would
> >> only expose the file offset functionality? That way, the complexity of
> >> reassembly and the associated error handling doesn't enter into OpenSSH.
> > Re-assembly, error handling, and delivery verification were done by
> > rsync ages ago. It really seems like re-inventing the wheel.
>
> In the proposed patch, no re-assembly is necessary outside of the sftp
> client, as the sftp protocol was sufficiently well designed to allow
> read/write from/to particular remote offsets in files.
>
> I do not see the patch as reinventing the wheel, maybe more like
> widening it to run on wider roads.

As soon as you start tracking remote offsets, you're in considerable
danger from any process that may write to the locally written file on
the target host and corrupt your in-progress transfer. Given "very
large files", the likelihood of running out of filesystem space, and
of risking failed transfers with a poor record of the state of such
failures, increases profoundly. That's yet another set of risks to
manage.
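
Which is why, whatever reassembles the chunks, you want an end-to-end
check before trusting the result. A minimal sketch (paths hypothetical):

    # compare checksums on both ends after the transfer completes
    rsum=$(ssh host sha256sum /data/bigfile | awk '{print $1}')
    lsum=$(sha256sum bigfile | awk '{print $1}')
    [ "$rsum" = "$lsum" ] || echo "transfer corrupted, retry" >&2

rsync gives you that sort of verification for free; a hand-rolled
parallel scheme has to bolt it on.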

Faced with similar issues professionally, I'd ask "why am I dealing
with such a big damn file? Can I gracefully split it into smaller,
more manageable chunks for reliable transmission?" And I have, in
fact, done so, splitting very large database tables into numbered
pieces of "only" 100,000,000 rows each for reliable transmission
offsite. It made them far easier to archive, copy, or load for
database upgrades. Rather than re-inventing sftp, scp, etc., I
was able to migrate the problem to a much more manageable "rsync
dozens of bulky files instead of one overwhelming one".
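
For the curious, that looked roughly like this (database and table
names invented for the example):

    # export a huge table as tab-separated rows, split it into
    # numbered ~100M-row pieces, and let rsync handle retries
    mysql -N -B -e 'SELECT * FROM huge_table' mydb \
      | split -l 100000000 -d - huge_table.part.
    rsync -a --partial huge_table.part.* host:/backups/

Any one piece that fails mid-flight costs you one piece, not the
whole multi-day transfer.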

> Regards,
>
> Matthieu
>

