sftp reget/reput

Thu Sep 18 20:16:11 EST 2003

imho, rolling checksums would be more usable, e.g.
if a file changes.
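
A rough C sketch of the prefix-checksum loop Markus describes in the quoted text below. The assumptions here are mine, not from the thread: FNV-1a stands in for MD5 so the example needs no crypto library, the remote side is assumed to have sent one digest per page prefix already, and the names prefix_digests() and resume_offset() are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* FNV-1a 64-bit, a self-contained stand-in for MD5 (assumption). */
#define FNV_OFFSET 0xcbf29ce484222325ULL
#define FNV_PRIME  0x100000001b3ULL

/* Fold len bytes into a running hash state and return the new state. */
uint64_t hash_update(uint64_t state, const unsigned char *p, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        state ^= p[i];
        state *= FNV_PRIME;
    }
    return state;
}

/*
 * Sender side: out[k] becomes the hash of everything up to and including
 * page k (the last page may be short). Returns the number of digests written.
 */
size_t prefix_digests(const unsigned char *buf, size_t len, size_t pagesize,
                      uint64_t *out, size_t max)
{
    uint64_t state = FNV_OFFSET;
    size_t n = 0;

    assert(pagesize > 0);
    for (size_t off = 0; off < len && n < max; off += pagesize) {
        size_t chunk = len - off < pagesize ? len - off : pagesize;
        state = hash_update(state, buf + off, chunk);
        out[n++] = state;
    }
    return n;
}

/*
 * Receiver side: walk the local file page by page and stop at the first
 * prefix digest that differs from the remote one. Returns the number of
 * leading bytes that still agree, i.e. where the transfer should continue.
 */
size_t resume_offset(const unsigned char *local, size_t len, size_t pagesize,
                     const uint64_t *remote, size_t nremote)
{
    uint64_t state = FNV_OFFSET;
    size_t off = 0, page = 0;

    assert(pagesize > 0);
    while (off < len && page < nremote) {
        size_t chunk = len - off < pagesize ? len - off : pagesize;
        state = hash_update(state, local + off, chunk);
        if (state != remote[page])
            break;              /* first page whose prefix differs */
        off += chunk;
        page++;
    }
    return off;
}
```

If the file changed somewhere in the middle, resume_offset() points at the start of the first differing page, so the transfer restarts there rather than blindly appending.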

On Tue, Sep 16, 2003 at 09:03:13PM -0700, Dan Kaminsky wrote:
> It's a mighty inefficient codepath that literally reads data out of 
> order and sends it such; disk seek times are deadly.  That being said, 
> simply implement a cache that handles out of order transactions and only 
> writes to disk complete windows of data.  This does mean memory usage 
> can grow in case of a small missing block, but certainly we can control 
> that by monitoring our number of outstanding requests and failing to 
> issue more when the server obstinately refuses to give us one particular 
> entry.
> 
> This is, of course, directly analogous to a TCP Window.
> 
> --Dan
> 
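
A minimal sketch of the kind of cache Dan describes above: replies may arrive in any order, only the gap-free run at the front is written out, and the client stops issuing requests once one missing block would force it to buffer more than a fixed window. Everything named here (struct reorder_window, BLOCK_SIZE, WINDOW, the window_* functions) is hypothetical, and a plain memory buffer stands in for the file on disk.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE 4    /* tiny on purpose, for illustration only */
#define WINDOW     4    /* max blocks held beyond the last one written */

struct reorder_window {
    size_t base;                            /* next block index to write */
    int present[WINDOW];                    /* is this slot filled? */
    unsigned char data[WINDOW][BLOCK_SIZE];
};

void window_init(struct reorder_window *w)
{
    memset(w, 0, sizeof(*w));
}

/* May a request for block idx be issued without overflowing the window? */
int window_may_request(const struct reorder_window *w, size_t idx)
{
    return idx >= w->base && idx - w->base < WINDOW;
}

/* Store one reply. Returns 0 on success, -1 if idx is outside the window. */
int window_put(struct reorder_window *w, size_t idx,
               const unsigned char *block)
{
    size_t slot = idx % WINDOW;

    if (!window_may_request(w, idx))
        return -1;
    memcpy(w->data[slot], block, BLOCK_SIZE);
    w->present[slot] = 1;
    return 0;
}

/*
 * Write out the contiguous run starting at base and slide the window.
 * "out" stands in for the file; returns the number of blocks flushed.
 */
size_t window_flush(struct reorder_window *w, unsigned char *out)
{
    size_t n = 0;

    while (w->present[w->base % WINDOW]) {
        size_t slot = w->base % WINDOW;
        memcpy(out + w->base * BLOCK_SIZE, w->data[slot], BLOCK_SIZE);
        w->present[slot] = 0;
        w->base++;
        n++;
    }
    return n;
}
```

Memory use is capped at WINDOW * BLOCK_SIZE bytes. The cost is that a single stalled block holds up every later request until it arrives, much as with a TCP window.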
> 
> Markus Friedl wrote:
> 
> >we could modify the protocol and implement
> >rolling checksums like Niels Provos suggests:
> >
> >	MD5_CTX ctx1, ctx2;
> >
> >	MD5_Init(&ctx1);
> >
> >	while new page in data
> >	  MD5_Update(&ctx1, newpage, pagesize)
> >	  ctx2 = ctx1;
> >	  MD5_Final(digest, &ctx2)
> >	  if (compare with remote not equal)
> >	     break;
> >	end while
> >
> >	continue data transfer.
> >
> >On Wed, Sep 17, 2003 at 11:12:36AM +0800, Dmitry Lohansky wrote:
> > 
> >
> >>Hello openssh@
> >>
> >>I thought about sftp's reget/reput commands.
> >>
> >>Several days ago, Damien Miller wrote to tech at openbsd.org (it was
> >>a reply to my letter):
> >>
> >>   
> >>
> >>>Herein lies a problem which is not easy to detect or solve. For
> >>>performance reasons, the sftp client does pipelined reads/writes when
> >>>transferring files. The protocol spec allows for a server to process
> >>>these requests out of order. For example:
> >>>     
> >>>
> >>>client                     server
> >>>------                     ------
> >>>open file                  your file handle is "blah"
> >>>gimme bytes 0-8191
> >>>gimme bytes 8192-16383
> >>>gimme bytes 16384-24575
> >>>gimme bytes 24576-32767    here are bytes 24576-32767
> >>>close file                 here are bytes 16384-24575
> >>>                          here are bytes 8192-16383
> >>>                          here are bytes 0-8191
> >>>                          close successful
> >>>     
> >>>
> >>>If the client writes the bytes out in the order they are received (which
> >>>it probably should, to avoid buffering large amounts of data) then an
> >>>interruption will leave a full-length, but "holey" file on disk. There
> >>>is no general way to determine how to resume such a transfer.
> >>>     
> >>>
> >>>The best the client can do to make transfers resumable is ftruncate()
> >>>the file at the highest contiguous byte received. This will stop the
> >>>potential corruption on resume.
> >>>     
> >>>
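
To make Damien's "highest contiguous byte" concrete, here is a small sketch that computes it from the byte ranges received so far. The struct range type and contiguous_prefix() are names invented for this example, and it assumes the client has kept a list of the (offset, length) replies it has written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct range {
    size_t off;     /* first byte of a received reply */
    size_t len;     /* number of bytes in that reply */
};

/* qsort() comparator: order ranges by starting offset. */
int cmp_range(const void *a, const void *b)
{
    const struct range *ra = a, *rb = b;

    if (ra->off < rb->off)
        return -1;
    return ra->off > rb->off;
}

/*
 * Return the length of the gap-free prefix covered by the n ranges in r,
 * i.e. the offset the file could safely be truncated to.
 * Note: sorts r in place.
 */
size_t contiguous_prefix(struct range *r, size_t n)
{
    size_t end = 0;

    qsort(r, n, sizeof(*r), cmp_range);
    for (size_t i = 0; i < n; i++) {
        if (r[i].off > end)
            break;                          /* first hole found */
        if (r[i].off + r[i].len > end)
            end = r[i].off + r[i].len;
    }
    return end;
}
```

With the example exchange above, if only bytes 16384-24575 and 24576-32767 have arrived when the transfer is interrupted, the result is 0. The client would then call ftruncate(fd, (off_t)contiguous_prefix(r, n)) before exiting, and a later resume starts from that offset.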
> >>This is a good method, but if the client crashes, we may also get a "hole".
> >>What do you think about the following approach?
> >>
> >>Store extra data at the end of the file, for example:
> >>
> >><---orig-part-><-extra->
> >>[*][ ][*][ ][*][*******]
> >><---------file--------->
> >>
> >>where [*] - already loaded data, [ ] - not yet
> >>
> >>In the extra part, we can store which blocks were already loaded, with
> >>their offsets and sizes. After the download, the extra part is removed.
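
One way the "extra part" might look, strictly as a sketch: a bitmap with one bit per fixed-size block, from which the client can work out what is still missing after a crash. This simplifies Dmitry's proposal (which records offset and size per block) by assuming fixed-size blocks. The map_* names and the bitmap layout are hypothetical, not anything sftp implements.

```c
#include <assert.h>
#include <stddef.h>

/* One bit per fixed-size block: 1 = already on disk, 0 = still missing. */

/* Record that a block has been written. */
void map_mark(unsigned char *map, size_t block)
{
    map[block / 8] |= (unsigned char)(1u << (block % 8));
}

/* Has this block been written already? */
int map_has(const unsigned char *map, size_t block)
{
    return (map[block / 8] >> (block % 8)) & 1;
}

/*
 * Return the index of the first missing block at or after "from",
 * or nblocks if every remaining block is present.
 */
size_t map_next_missing(const unsigned char *map, size_t nblocks, size_t from)
{
    assert(map != NULL);
    for (size_t b = from; b < nblocks; b++)
        if (!map_has(map, b))
            return b;
    return nblocks;
}
```

For the [*][ ][*][ ][*] picture above, blocks 0, 2 and 4 are marked, and map_next_missing() yields 1, then 3, then 5 (meaning nothing is left). The map still has to reach the disk before or together with the data it describes, or a crash can leave it wrong.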
> >>
> >>Comments?
> >>-- 
> >>Dmitry Lohansky
> >>
> >>_______________________________________________
> >>openssh-unix-dev mailing list
> >>openssh-unix-dev at mindrot.org
> >>http://www.mindrot.org/mailman/listinfo/openssh-unix-dev