SCP protocol question and outstanding requests in SFTP

Chris Rapier rapier at psc.edu
Sat Oct 13 03:28:29 EST 2007


Bob Proulx wrote:


> It seems wasteful to put a huge amount of effort into scp when rsync
> seems to fill the feature list quite well.  I think if you were to
> push your users over to rsync that they would be very happy and
> productive to be using it.

There are almost always better solutions. However, getting people to 
*use* those solutions can be painful, frustrating, and ultimately a 
waste of time and effort. So I agree with you entirely, but that doesn't 
seem to be enough.

For example, I work at an HPC site known as the Pittsburgh 
Supercomputing Center. We have a huge 3k+ node processor, hundreds of 
terabytes of rotating media and petabytes of robot-controlled tape
storage. Getting data in and out of our center is a *major* part of what
we do here and it's why we have multiple 10Gbps connections to the
outside world. We have methods in place for the high speed transfer of
data using secure authentication - gridFTP, kFTP, and a couple of
home-rolled solutions. These will give users throughput rates in the
multi-Gbps range.

So what ends up happening? A good chunk of our users end up using SSH to
transfer data and then, not knowing *why* using SSH to transfer bulk
data is a bad idea, come to us and say our networks are broken ("We have
a 1Gb link, why are we only getting 1.5Mbps? What's wrong with your
network?"). We put a lot of effort into getting gridFTP up and running
and making sure Kerberos works properly but the users just aren't using
it. I would end up spending days in some cases trying to teach users
that they can't get the performance they need out of SSH. I'd send them
statically compiled binaries of kftp, we'd send people to their sites to
get globus running, etc. 9 times out of 10 we'd stamp out one fire only
to have 5 more pop up at the *same* site (generally because of a lack of
communication between users, labs, admins, and managers).

Eventually it ended up being easier and more cost-effective to develop,
distribute, and promote the HPN-SSH patches. The result of this is that
it's actually displacing gridFTP and kFTP as a primary bulk data transfer
tool on HPNs. It's not that it's better than either of those but it's
what the users know and constantly revert to given the preference.
Personally, I believe that by working with the users' desires it actually
ends up being more secure because they aren't spending as much time
trying to circumvent the 'problem' (we had users hacking iperf, netcat,
and the like, trying to turn them into transfer tools. Didn't make any
sense to me).

So while I think rsync would be a better solution (assuming it's paired
with the HPN-SSH patches) I can't get the users to agree with me. So I
prefer to make the tools that users know and are comfortable with do
what they need them to do. I'm not always happy about doing this but it
makes our users happy and that helps keep our funding coming in. Of
course, I end up doing more work but our users don't, and it's been made
very clear to me which is more important :)

Chris

P.S. Sorry to be wordy. I'm trying to avoid writing a paper for a
conference (on SSH actually) and I get overly verbose during those
times. No, I don't think that makes sense either. ;)


More information about the openssh-unix-dev mailing list