Support for transferring sparse files via scp/sftp correctly?

Darren Tucker dtucker at dtucker.net
Fri Apr 4 12:02:47 AEDT 2025


On Sat, 29 Mar 2025 at 16:14, Ron Frederick <ronf at timeheart.net> wrote:

> [...]
> If you don’t get all of the requested ranges in a single request,
> additional requests can be sent starting at just past the end of the last
> range previously returned.
>
> What do you think?
>

That seems like it'd work well for things with SEEK_HOLE or equivalent,
although there's always the chance of the underlying file changing between
mapping it out and doing the transfer.
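
For reference, mapping out the data ranges with SEEK_HOLE might look
roughly like the sketch below. It assumes a platform that provides both
SEEK_DATA and SEEK_HOLE; struct data_range and map_data_ranges() are
made-up names for illustration, not part of any SFTP draft or of OpenSSH.
A caller that doesn't get everything in one call can call again with
start set just past the end of the last range returned.

```c
#define _GNU_SOURCE	/* exposes SEEK_DATA / SEEK_HOLE on glibc */
#include <sys/types.h>
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical range record, for illustration only. */
struct data_range {
	off_t offset;
	off_t len;
};

/*
 * Fill out[] with up to max data ranges of fd located at or after start.
 * Returns the number of ranges stored, or -1 on error.
 */
int
map_data_ranges(int fd, off_t start, struct data_range *out, int max)
{
	int n = 0;
	off_t data, hole;

	assert(out != NULL && max >= 0);
	while (n < max) {
		data = lseek(fd, start, SEEK_DATA);
		if (data == -1) {
			if (errno == ENXIO)
				break;	/* no data at or after start */
			return -1;
		}
		/* End of this data run; EOF counts as a hole. */
		hole = lseek(fd, data, SEEK_HOLE);
		if (hole == -1)
			return -1;
		out[n].offset = data;
		out[n].len = hole - data;
		n++;
		start = hole;
	}
	return n;
}

/* Print the ranges, one per line. */
void
print_ranges(const struct data_range *r, int n)
{
	for (int i = 0; i < n; i++)
		printf("data: offset=%lld len=%lld\n",
		    (long long)r[i].offset, (long long)r[i].len);
}
```

Nothing here stops the file from changing after the map is built, which
is the race mentioned above.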

Damien pointed out that it's possible to get reasonable, though not
perfect, sparse file support by memcmp'ing your existing file buffer
against a block of zeros and skipping the write if it matches.  OpenBSD's
cp(1) does this (look for "skipholes"):
https://cvsweb.openbsd.org/cgi-bin/cvsweb/src/bin/cp/utils.c?annotate=HEAD
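
A minimal sketch of that zero-block check follows. This is not the
OpenBSD cp(1) code; write_skipping_zeroes() and the 4096-byte BLKSZ are
assumptions made for illustration. It also assumes the destination is a
new or truncated file, so the skipped regions read back as zeros.

```c
#define _XOPEN_SOURCE 700	/* for pwrite() */
#include <sys/types.h>
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define BLKSZ 4096	/* assumed comparison block size */

static const char zeroes[BLKSZ];	/* zero-initialised */

/*
 * Write len bytes from buf to fd at offset off, skipping any block that is
 * entirely zero so the filesystem can leave a hole there.
 * Returns 0 on success, -1 on error.
 */
int
write_skipping_zeroes(int fd, const char *buf, size_t len, off_t off)
{
	size_t n, done;
	ssize_t w;

	assert(buf != NULL);
	while (len > 0) {
		n = len < BLKSZ ? len : BLKSZ;
		if (memcmp(buf, zeroes, n) != 0) {
			/* Non-zero block: write it out, handling short writes. */
			for (done = 0; done < n; done += (size_t)w) {
				w = pwrite(fd, buf + done, n - done,
				    off + (off_t)done);
				if (w == -1) {
					perror("pwrite");
					return -1;
				}
			}
		}
		buf += n;
		off += (off_t)n;
		len -= n;
	}
	return 0;
}
```

If the file ends in a run of zeros, nothing is written there, so the
caller needs to ftruncate() the destination to the final length once the
copy is done.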

This seems surprisingly effective in the case where you already have the
file content in a buffer anyway, but it would be harder to do (or at least
more expensive) as part of a separate request type that returns the
ranges.  It'd be easier to implement if there were some kind of
"read-sparse" operation that could return a list of {offset, len, data}
instead of just the offsets and lengths.  This would reduce the time
between the sparse check and the read, although it's still potentially racy.
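
To make the idea concrete, here is a rough server-side sketch of what
such a "read-sparse" could do. It assumes SEEK_DATA/SEEK_HOLE are
available; struct sparse_extent, read_sparse() and the byte budget
parameter are invented for illustration and don't correspond to any
existing SFTP extension.

```c
#define _GNU_SOURCE	/* exposes SEEK_DATA / SEEK_HOLE on glibc */
#include <sys/types.h>
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical reply element: one run of data and its contents. */
struct sparse_extent {
	off_t offset;
	size_t len;
	char *data;	/* malloc'd; caller frees */
};

/*
 * Starting at *pos, return up to max extents holding at most budget bytes
 * of data in total. Each run is read immediately after it is located.
 * *pos is advanced past the last byte returned so the next call resumes
 * there. Returns the number of extents, or -1 on error.
 */
int
read_sparse(int fd, off_t *pos, size_t budget, struct sparse_extent *out,
    int max)
{
	int n = 0;
	off_t data, hole;
	size_t want;
	ssize_t got;
	char *buf;

	assert(pos != NULL && out != NULL);
	while (n < max && budget > 0) {
		data = lseek(fd, *pos, SEEK_DATA);
		if (data == -1) {
			if (errno == ENXIO)
				break;	/* only holes (or EOF) remain */
			goto fail;
		}
		if ((hole = lseek(fd, data, SEEK_HOLE)) == -1)
			goto fail;
		want = (size_t)(hole - data);
		if (want > budget)
			want = budget;
		if ((buf = malloc(want)) == NULL)
			goto fail;
		got = pread(fd, buf, want, data);
		if (got <= 0) {
			free(buf);
			if (got == 0)
				break;	/* file shrank underneath us */
			goto fail;
		}
		out[n].offset = data;
		out[n].len = (size_t)got;
		out[n].data = buf;
		n++;
		budget -= (size_t)got;
		*pos = data + got;
	}
	return n;
fail:
	while (n > 0)
		free(out[--n].data);
	return -1;
}
```

The gap between the lseek() calls and the pread() is small but not zero,
so a concurrent writer can still change the file in between.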

-- 
Darren Tucker (dtucker at dtucker.net)
GPG key 11EAA6FA / A86E 3E07 5B19 5880 E860  37F4 9357 ECEF 11EA A6FA
    Good judgement comes with experience. Unfortunately, the experience
usually comes from bad judgement.

