HiddenStore option may be useful

William Ahern william at 25thandClement.com
Sun Apr 15 14:18:18 EST 2007


On Sat, Apr 14, 2007 at 10:54:43PM -0400, Jason wrote:
> Thomas Blank wrote:
> > I'm missing a HiddenStore option in OpenSSH, known from some ftp-server 
> > implementations like ProFTPd.
> > 
> > Consider the following scenario:
> > - A process PROCA is frequently polling the directory for a file called 
> > myfile.txt
> > - Someone transfers this file via sftp or scp to the directory
> > - While transfer is going on and the file is not completely written, 
> > PROCA reads in the file and removes it
> > -> Corrupt data is seen by PROCA
> > 
> > Knowing this problem you have two solutions:
> > 1. PROCA must check if myfile.txt is changing (filesize, mtime...) and 
> > wait until it does not change any more
> > 2. sftp and scp use a HiddenStore by writing the file with a unique 
> > filename (eg. .myfile.txt) and renaming it at the end of the transfer 
> > (mv .myfile.txt myfile.txt)
> > 
> > What do you think about this?
> 
> Why not have PROCA use inotify?
> 
> See /usr/src/linux/Documentation/filesystems/inotify.txt
> 

How does that address the race condition? inotify is just a better poll in
this case. It doesn't even tell you how many processes have an open
descriptor.

Excluding OpenSSH modifications, the following might work:

1) Poll for new files (stat + sleep or dnotify or inotify or FAM or kqueue).
2) Open the file
3) unlink(2) the file
4) ?? Track the number of open file descriptors on the file and wait
   till it drops to 1 (just you). I don't know how to do this.
5) Process the file however you want, e.g. by copying the data to another
   file elsewhere.

The downside here is that you may accidentally unlink the wrong file, if a
new one was created in its place after open(2) and before unlink(2).
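
A minimal sketch of the open-then-unlink part of the steps above, in Python on a POSIX system. The function name claim_file is made up for illustration. It compares the device/inode pair of the open descriptor against the path just before unlinking, which narrows the window for removing the wrong file but does not close it: a replacement can still appear between the stat and the unlink. It also makes no attempt at the "wait until the writer is done" step, so the data read through the returned handle may still be incomplete.

```python
import os

def claim_file(path):
    """Open path, then unlink it only if it still names the file we opened.

    Returns an open binary file object, or None if the path was replaced
    or removed between open and unlink. Hypothetical helper, not part of
    any existing tool.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        opened = os.fstat(fd)      # identity of the file we actually hold
        current = os.stat(path)    # identity of whatever the path names now
        same = (opened.st_dev, opened.st_ino) == (current.st_dev, current.st_ino)
        if same:
            # Still racy: the path can be replaced between stat and unlink.
            os.unlink(path)
    except FileNotFoundError:
        same = False
    if not same:
        os.close(fd)
        return None
    # The directory entry is gone, but the descriptor keeps the data alive.
    return os.fdopen(fd, "rb")
```

The caller reads from the returned handle and closes it when finished; the storage is released once the last descriptor on the file is closed.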

Ideally, inotify or kqueue would return to you a file descriptor, rather
than a simple path [component]. And the gods would somehow bestow the Unix
API w/ a funlink(2) system call, which would magically remove filesystem
references to the file (or maybe atomically unlink a fd/path pair, failing
if the relationship doesn't exist). Then in conjunction w/ the wait for
others to lose their reference, you'd have a solid solution where you are
assured to hold a reference to a finalized version of the file. Alas, none
of this exists.

The only real answer, in this case, for race free, provably correct
behavior, is to hack up sftp. OTOH, if "good enough" is sufficient, you're
probably already there.
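
For reference, the write-then-rename behaviour Thomas describes would look roughly like the sketch below on the writing side. This is not OpenSSH code, just an illustration in Python; the function name hidden_store and the dot-prefix naming are assumptions taken from his ".myfile.txt" example. It relies on rename(2) being atomic when both names are in the same directory, so a poller looking for the final name sees either no file or the complete one.

```python
import os

def hidden_store(data, directory, name):
    """Write data under a dot-prefixed name, then rename it into place.

    Hypothetical helper illustrating the HiddenStore idea; the hidden
    name is not unique, so concurrent uploads of the same name would
    need a different temporary naming scheme.
    """
    final_path = os.path.join(directory, name)
    hidden_path = os.path.join(directory, "." + name)
    with open(hidden_path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())   # push the contents to disk before publishing
    # Same directory, hence same filesystem: the rename is atomic on POSIX.
    os.rename(hidden_path, final_path)
    return final_path
```

With this in place PROCA can keep polling for myfile.txt exactly as before, provided it ignores dot-files.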



More information about the openssh-unix-dev mailing list