Support for transferring sparse files via scp/sftp correctly?

Sun Apr 6 02:20:29 AEDT 2025

On Apr 5, 2025, at 3:08 AM, Darren Tucker <dtucker at dtucker.net> wrote:
> On Sat, 5 Apr 2025 at 09:07, Lionel Cons <lionelcons1972 at gmail.com> wrote:
>> On Fri, 4 Apr 2025 at 07:07, Ron Frederick <ronf at timeheart.net> wrote:
>>> On Apr 3, 2025, at 6:02 PM, Darren Tucker <dtucker at dtucker.net> wrote:
>> [...]
>>>> Damien pointed out that it's possible to do a reasonable but not
>>>> perfect sparse file support by memcmp'ing your existing file buffer with a
>>>> block of zeros and skipping the write if it matches.  OpenBSD's cp(1) does
>>>> this (look for "skipholes"):
>>>> https://cvsweb.openbsd.org/cgi-bin/cvsweb/src/bin/cp/utils.c?annotate=HEAD
>>>> .
>> 
>> This should not be done. Either a system has SEEK_DATA/SEEK_HOLE,
>> Win32 (Windows&ReactOS) FSCTL_QUERY_ALLOCATED_RANGES, or just copy all
>> bytes.
> 
> 
> If there's a protocol extension I'd like for it to be able to support other use cases, not just the one you care about.

Is there a use case the protocol extension I proposed wouldn’t support? The extension returns the raw data/hole information reported by the operating system, intentionally divorcing itself from any read operations, as there may be use cases where a caller needs to know only what ranges are present in the file, without needing to read all (or even any) of the data.
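
To illustrate the range-only query, here is a rough Python sketch of how a server might gather the data ranges on a system with SEEK_DATA/SEEK_HOLE. The name data_ranges is made up for illustration and is not part of any wire format; the sketch assumes a platform where os.SEEK_DATA and os.SEEK_HOLE are available.

```python
import errno
import os


def data_ranges(fd):
    """Return a list of (offset, length) data ranges for an open file.

    Only lseek() is used, so no file data is read. Note that this moves
    the file position of fd.
    """
    size = os.fstat(fd).st_size
    ranges = []
    offset = 0
    while offset < size:
        try:
            # Find the start of the next data range at or after offset.
            start = os.lseek(fd, offset, os.SEEK_DATA)
        except OSError as exc:
            if exc.errno == errno.ENXIO:
                break  # nothing but a trailing hole remains
            raise
        # Find where that data range ends (EOF counts as a hole).
        end = os.lseek(fd, start, os.SEEK_HOLE)
        if end <= start:
            break  # defensive: file changed underneath us
        ranges.append((start, end - start))
        offset = end
    return ranges
```

On a filesystem that doesn't track holes, this should simply report one range covering the whole file, which is the "just copy all bytes" case.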

Note that data ranges in a file can be quite large. Reporting such large ranges isn’t a problem if only the ranges are returned, but if you want to return offsets, lengths, and data bytes, you need to break the data ranges you get back into smaller chunks, and you’ll probably also want a way to stream multiple read results at once (potentially even having them complete out of order, as sometimes happens today when scheduling parallel reads or writes). Logic for this already exists in many SFTP implementations, but it would have to be rewritten to handle these new sparse reads if you mix the ranges and the data. If you keep them separate, though, all the existing logic can be reused with just a thin wrapper around it to jump from one range to the next once the reads for the current range have all been scheduled.
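
A rough sketch of the kind of thin wrapper I have in mind, in Python: given (offset, length) data ranges obtained separately, it produces ordinary fixed-size read requests that existing parallel-read logic could consume unchanged. The name split_ranges and the 32 KiB default block size are arbitrary choices for illustration, not taken from any existing implementation.

```python
def split_ranges(ranges, block_size=32768):
    """Yield (offset, length) read requests covering only the data ranges.

    Each request is at most block_size bytes, so a large range becomes
    many ordinary reads, and holes are never requested at all.
    """
    for start, length in ranges:
        end = start + length
        pos = start
        while pos < end:
            size = min(block_size, end - pos)
            yield pos, size
            pos += size
        # Falling out of the inner loop is the "jump to the next range".


# Example: two data ranges separated by a hole, with a tiny block size.
requests = list(split_ranges([(0, 10), (100, 5)], block_size=4))
```

The receiving side would presumably write each block at its offset and then set the final file length separately, so that a trailing hole is preserved.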
-- 
Ron Frederick
ronf at timeheart.net