OpenSSH (1.2.3) sshd hanging when using rsync over ssh (retry)

Guy Helmer ghelmer at cs.iastate.edu
Tue May 16 07:15:43 EST 2000


Now that the list is said to be open again, I'm resending this.  I've
since merged my changes into OpenSSH 2.1.0, which Kris imported into
FreeBSD over the weekend.

---------- Forwarded message ----------
Date: Thu, 4 May 2000 08:40:22 -0500 (CDT)
From: Guy Helmer <ghelmer at cs.iastate.edu>
To: openssh-unix-dev at mindrot.org
Subject: OpenSSH (1.2.3) sshd hanging when using rsync over ssh

I have debugged a problem with OpenSSH's sshd (as found in FreeBSD, based
on OpenSSH 1.2.3) that has been bugging me ever since I switched from
ssh-1.2.27.

I use rsync (FreeBSD port ports/net/rsync) over ssh to synchronize and
back up my main home directory and development directories to other
systems.  rsync always worked great with ssh-1.2.2[67].

Since I switched my machines to OpenSSH's sshd, rsync over ssh has hung at
random, and much more reliably when synchronizing large files.  netstat on
the sshd server machine showed data waiting in the Recv-Q of the ssh
connection but nothing in the Send-Q, so I decided to look into sshd.  I
grabbed a core from sshd during one of these hangs, and gdb showed this
stack trace:

#0  0x281e20c4 in write () from /usr/lib/libc.so.4
#1  0x804fb18 in process_output (writeset=0xbfbfed04)
    at /usr/src/secure/usr.sbin/sshd/../../../crypto/openssh/serverloop.c:366
#2  0x8050029 in server_loop (pid=43486, fdin_arg=9, fdout_arg=9, fderr_arg=11)
    at /usr/src/secure/usr.sbin/sshd/../../../crypto/openssh/serverloop.c:563
#3  0x8053b60 in do_exec_no_pty (
    command=0x80750c0 "rsync --server --sender -vlgtpr --delete . /home/ghelmer/",
    pw=0xbfbfef80, display=0x806c0a0 "mocha.cs.iastate.edu:10.0",
    auth_proto=0x806c100 "MIT-MAGIC-COOKIE-1", 
    auth_data=0x8075000 "cdf4b6cb730310be3d51a8abf77303fc")
    at /usr/src/secure/usr.sbin/sshd/../../../crypto/openssh/sshd.c:2211
#4  0x805386c in do_authenticated (pw=0xbfbfef80)
    at /usr/src/secure/usr.sbin/sshd/../../../crypto/openssh/sshd.c:2037
#5  0x80527b4 in do_authentication ()
    at /usr/src/secure/usr.sbin/sshd/../../../crypto/openssh/sshd.c:1408
#6  0x8051b43 in main (ac=1, av=0xbfbff624)
    at /usr/src/secure/usr.sbin/sshd/../../../crypto/openssh/sshd.c:970
#7  0x804aae1 in _start ()

The code around frame #1 (serverloop.c:366) was:
361     {
362             int len;
363
364             /* Write buffered data to program stdin. */
365             if (fdin != -1 && FD_ISSET(fdin, writeset)) {
366                     len = write(fdin, buffer_ptr(&stdin_buffer),
367                         buffer_len(&stdin_buffer));
368                     if (len <= 0) {
369     #ifdef USE_PIPES
370                             close(fdin);

and stdin_buffer contains:
$2 = {buf = 0x80b1000 "è\004\212D\204úc½", alloc = 45056, offset = 0, 
  end = 8192}

So, it appears sshd was stuck in a write() that wouldn't complete.  (Even
when I kill the ssh client, sshd hangs around and never notices that the
connection has gone away.)

I figured ssh-1.2.27 probably handled this case, and sure enough, it sets
fdin non-blocking and checks errno for EWOULDBLOCK in process_output.  I
added similar code to OpenSSH's serverloop.c, and now rsync over ssh works
great.
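
To illustrate the general technique (this is not my actual diff, and it is
not code from ssh-1.2.27 or OpenSSH), here is a minimal sketch.  The helper
names set_nonblocking and flush_pending are hypothetical, and a plain char
array stands in for stdin_buffer.  The idea is that a non-blocking write()
either makes some progress or returns immediately, so the server loop can
go back to select() instead of getting stuck.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: switch fd to non-blocking mode.
 * Returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: try to write the *len pending bytes in buf to fd.
 * Bytes that were written are removed from the front of buf and *len is
 * reduced accordingly.  Returns 1 if the caller should keep fd open and
 * retry later, 0 if fd hit a real error and should be closed. */
static int flush_pending(int fd, char *buf, size_t *len)
{
    ssize_t n;

    assert(buf != NULL && len != NULL);
    if (*len == 0)
        return 1;               /* nothing to do */

    n = write(fd, buf, *len);
    if (n < 0) {
        /* The reader is not ready (or we were interrupted):
         * not an error, just try again after the next select(). */
        if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
            return 1;
        return 0;               /* e.g. EPIPE when SIGPIPE is ignored */
    }

    /* Partial write: keep only the unwritten tail. */
    memmove(buf, buf + n, *len - (size_t)n);
    *len -= (size_t)n;
    return 1;
}
```

A caller would run set_nonblocking(fdin) once after creating the
descriptor, then call flush_pending() whenever select() reports fdin as
writable, closing fdin only when it returns 0.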

I'm worried that my code is tainted, though, since I looked at the
ssh-1.2.27 sources.  If you don't think it is a problem, and if you are
interested, I can send you my diffs...  I don't have ties to OpenBSD, so
I'm not sure who in particular I should contact about this.

Thanks,
Guy

Guy Helmer, Ph.D. Candidate, Iowa State University Dept. of Computer Science 
Research Assistant, Dept. of Computer Science   ---   ghelmer at cs.iastate.edu
http://www.cs.iastate.edu/~ghelmer








