Hanging ssh sessions with openssh-5.1p1 and Solaris 8 & 10

Graham Steward gpsteward at yahoo.com
Thu Jul 9 23:24:55 EST 2009


Hi,

Has anyone had any luck looking into this?

> On Mon, Aug 04, 2008 at 02:34:23PM -0400, Jeff Wieland wrote:
>> Since we upgraded OpenSSH from 5.0p1 to 5.1p1 on our Solaris 8 boxes
>> (I know, I know, we should upgrade or retire them...), we've started
>> experiencing problems with slogin'ing into these boxes, running vi,
>> and pasting text into the vi session.
>>
>> As long as we are pasting in less than 1024 characters it's fine.
>> With >= 1024 characters, the session hangs.

We've also seen this problem. It affects Apple's Mac OS X OpenSSH 5.1p1 as well as all our Solaris 8, 9 & 10 machines running 5.1p1 & 5.2p1, but Debian's OpenSSH 5.1 is fine.

A truss of the parent sshd shows it sitting at:
20193:  write(1, 0x00063078, 1)         (sleeping...) 

Here's a load of output which I hope could help identify the cause(s) of this behaviour if anyone's interested. 

I ran a DTrace script against the sshd processes on the machine and noticed one reading and writing as I pasted a large quantity of text into a file (/tmp/sshd.test_test):

# ./opensnoop -n sshd
  UID    PID CMD          D   BYTES FILE
12962   9649 sshd         R    1056 <unknown>
12962   9649 sshd         W    1022 /devices/pseudo/clone@0:ptm
12962   9649 sshd         R     480 <unknown>
12962   9649 sshd         W     386 /devices/pseudo/clone@0:ptm
12962   9649 sshd         R     512 /devices/pseudo/clone@0:ptm
12962   9649 sshd         R    9712 <unknown>
12962   9649 sshd         R     512 /devices/pseudo/clone@0:ptm
12962   9649 sshd         W    7313 /devices/pseudo/clone@0:ptm

A ptree of the vi shows all processes involved:
# ptree 10879
1083  /usr/local/sbin/sshd
  9626  /usr/local/sbin/sshd -R
    9649  /usr/local/sbin/sshd -R
      9651  -tcsh
        10879 vi /tmp/sshd.test_test

Running pfiles on pid 9649 we get:
# pfiles 9649
9649:   /usr/local/sbin/sshd -R
  Current rlimit: 256 file descriptors
   0: S_IFCHR mode:0666 dev:356,0 ino:6815752 uid:0 gid:3 rdev:13,2
      O_RDWR|O_LARGEFILE
      /devices/pseudo/mm@0:null
   1: S_IFCHR mode:0666 dev:356,0 ino:6815752 uid:0 gid:3 rdev:13,2
      O_RDWR|O_LARGEFILE
      /devices/pseudo/mm@0:null
   2: S_IFCHR mode:0666 dev:356,0 ino:6815752 uid:0 gid:3 rdev:13,2
      O_RDWR|O_LARGEFILE
      /devices/pseudo/mm@0:null
   3: S_IFDOOR mode:0444 dev:365,0 ino:92 uid:0 gid:0 size:0
      O_RDONLY|O_LARGEFILE FD_CLOEXEC  door to nscd[2394]
      /var/run/name_service_door
   4: S_IFIFO mode:0000 dev:354,0 ino:16183814 uid:12962 gid:4640 size:0
      O_RDWR|O_NONBLOCK FD_CLOEXEC
   5: S_IFSOCK mode:0666 dev:363,0 ino:2848 uid:0 gid:0 size:0
      O_RDWR|O_NONBLOCK
        SOCK_STREAM
        SO_REUSEADDR,SO_KEEPALIVE,SO_SNDBUF(49152),SO_RCVBUF(49232),IP_NEXTHOP(0.0.192.80)
        sockname: AF_INET xxx.yyy.10.222  port: 22
        peername: AF_INET xxx.yyy.12.20  port: 50166
   6: S_IFIFO mode:0000 dev:354,0 ino:16183814 uid:12962 gid:4640 size:0
      O_RDWR|O_NONBLOCK FD_CLOEXEC
   7: S_IFSOCK mode:0666 dev:363,0 ino:3489 uid:0 gid:0 size:0
      O_RDWR FD_CLOEXEC
        SOCK_STREAM
        SO_SNDBUF(16384),SO_RCVBUF(5120)
        sockname: AF_UNIX 
   8: S_IFCHR mode:0000 dev:356,0 ino:56458 uid:0 gid:0 rdev:23,9
      O_RDWR|O_NONBLOCK|O_NOCTTY|O_LARGEFILE
      /devices/pseudo/clone@0:ptm
  10: S_IFCHR mode:0000 dev:356,0 ino:56458 uid:0 gid:0 rdev:23,9
      O_RDWR|O_NONBLOCK|O_NOCTTY|O_LARGEFILE
      /devices/pseudo/clone@0:ptm
  11: S_IFCHR mode:0000 dev:356,0 ino:56458 uid:0 gid:0 rdev:23,9
      O_RDWR|O_NONBLOCK|O_NOCTTY|O_LARGEFILE
      /devices/pseudo/clone@0:ptm

Other process pfiles:
# pfiles 9626
9626:   /usr/local/sbin/sshd -R
  Current rlimit: 256 file descriptors
   0: S_IFCHR mode:0666 dev:356,0 ino:6815752 uid:0 gid:3 rdev:13,2
      O_RDONLY|O_LARGEFILE
      /devices/pseudo/mm@0:null
   1: S_IFCHR mode:0666 dev:356,0 ino:6815752 uid:0 gid:3 rdev:13,2
      O_RDWR|O_LARGEFILE
      /devices/pseudo/mm@0:null
   2: S_IFCHR mode:0666 dev:356,0 ino:6815752 uid:0 gid:3 rdev:13,2
      O_RDWR|O_LARGEFILE
      /devices/pseudo/mm@0:null
   3: S_IFDOOR mode:0444 dev:365,0 ino:92 uid:0 gid:0 size:0
      O_RDONLY|O_LARGEFILE FD_CLOEXEC  door to nscd[2394]
      /var/run/name_service_door
   4: S_IFCHR mode:0000 dev:356,0 ino:56458 uid:0 gid:0 rdev:23,9
      O_RDWR|O_NONBLOCK|O_NOCTTY|O_LARGEFILE
      /devices/pseudo/clone@0:ptm
   5: S_IFSOCK mode:0666 dev:363,0 ino:2848 uid:0 gid:0 size:0
      O_RDWR|O_NONBLOCK
        SOCK_STREAM
        SO_REUSEADDR,SO_KEEPALIVE,SO_SNDBUF(49152),SO_RCVBUF(49232),IP_NEXTHOP(0.0.192.80)
        sockname: AF_INET xxx.yyy.10.222  port: 22
        peername: AF_INET xxx.yyy.12.20  port: 50166
   6: S_IFSOCK mode:0666 dev:363,0 ino:5314 uid:0 gid:0 size:0
      O_RDWR FD_CLOEXEC
        SOCK_STREAM
        SO_SNDBUF(16384),SO_RCVBUF(5120)
        sockname: AF_UNIX 


# pfiles 10879
10879:  vi /tmp/sshd.test_test
  Current rlimit: 256 file descriptors
   0: S_IFCHR mode:0620 dev:356,0 ino:12582934 uid:12962 gid:7 rdev:24,9
      O_RDWR|O_NOCTTY|O_LARGEFILE
      /devices/pseudo/pts@0:9
   1: S_IFCHR mode:0620 dev:356,0 ino:12582934 uid:12962 gid:7 rdev:24,9
      O_RDWR|O_NOCTTY|O_LARGEFILE
      /devices/pseudo/pts@0:9
   2: S_IFCHR mode:0620 dev:356,0 ino:12582934 uid:12962 gid:7 rdev:24,9
      O_RDWR|O_NOCTTY|O_LARGEFILE
      /devices/pseudo/pts@0:9
   3: S_IFREG mode:0600 dev:85,0 ino:42979 uid:12962 gid:4640 size:24576
      O_RDWR|O_CREAT|O_EXCL
      /var/tmp/Exxka4pv

Nothing makes it into the temp file. Most of it (depending on size) shows up in a truss of the sshd process 9649, but never makes it any further.

Thanks in advance,
Graham



Darren Tucker wrote:
> On Mon, Aug 04, 2008 at 02:34:23PM -0400, Jeff Wieland wrote:
>> Since we upgraded OpenSSH from 5.0p1 to 5.1p1 on our Solaris 8 boxes
>> (I know, I know, we should upgrade or retire them...), we've started
>> experiencing problems with slogin'ing into these boxes, running vi,
>> and pasting text into the vi session.
>>
>> As long as we are pasting in less than 1024 characters it's fine.
>> With >= 1024 characters, the session hangs.
> 
> Do you know if the problem occurs on the client or server side?  ie if
> you use an older client with a newer server (and vice versa) does the
> problem occur?

It's the server side.  It still happens if you use the 5.0p1 client,
and it also happens with the SecureCRT client.

>> If you run "/usr/ucb/lptest 72 23 | cat -n" in one window, and
>> then cut and paste up to the "V" on line 13, things work as expected.
>> If you include the "W" on line 13, the vi session will hang
>> with none of the characters that are being pasted showing up.
>>
>> We've been building OpenSSH with Sun Studio 11 -- I tried building
>> it with GNU-CC 3.4.4 with the same results.  We also link against
>> a locally built zlib, since Solaris 8 doesn't have zlib 1.2.3.
>> And we've used OpenSSL 0.9.8g and 0.9.8h with the same results.
>>
>> We also tried building OpenSSH 5.1p1 on our Solaris 10 boxes using
>> Sun Studio 12, and we also get the hangs there.  The client doesn't
>> seem to matter -- we've seen it with OpenSSH 5.1p1 from both Solaris
>> and Slackware Linux, and also from SecureCRT.
>>
>> I have not been able to get anything useful from running sshd in
>> debug mode (at least, not that I recognize as useful :-) ).
> 
> Well you could post it, someone else might recognise something :-)

I'll see if I can get this done tomorrow.  It's a crazy couple of
weeks right now...

> Some versions of AIX have bugs in the tty drivers that prevent largish
> writes from working correctly.  Perhaps Solaris has something similar
> (although I can't imagine why it's only started recently).
> 
> You could try the patch below to test this theory.
> 
> Index: channels.c
> ===================================================================
> RCS file: /usr/local/src/security/openssh/cvs/openssh/channels.c,v
> retrieving revision 1.273
> diff -u -p -r1.273 channels.c
> --- channels.c	16 Jul 2008 12:42:06 -0000	1.273
> +++ channels.c	5 Aug 2008 01:08:22 -0000
> @@ -1578,11 +1578,10 @@ channel_handle_wfd(Channel *c, fd_set *r
>  			}
>  			return 1;
>  		}
> -#ifdef _AIX
> +
>  		/* XXX: Later AIX versions can't push as much data to tty */
>  		if (compat20 && c->wfd_isatty)
> -			dlen = MIN(dlen, 8*1024);
> -#endif
> +			dlen = MIN(dlen, 1024);
>  
>  		len = write(c->wfd, buf, dlen);
>  		if (len < 0 &&
> 

OK -- I can try this too.  But it isn't necessary with the 5.0p1 sshd,
so I'm thinking that something changed w.r.t. OpenSSH.
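
For anyone who wants to experiment with the idea behind the patch outside of sshd, here is a minimal standalone sketch of capping each write() to a tty. It is not the OpenSSH code: the helper name capped_write and the TTY_WRITE_CAP constant are made up for illustration, with the cap set to the 1024-byte test value from the patch. Unlike the patch, which limits a single write and relies on sshd's select loop to come back for the rest, this sketch loops until everything is written or the descriptor would block.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical cap, mirroring the 1024-byte test value in the patch above. */
#define TTY_WRITE_CAP 1024

/*
 * Write up to len bytes from buf to fd. When fd is a tty, no single
 * write() call is handed more than TTY_WRITE_CAP bytes.
 *
 * Returns the number of bytes written. This can be short (even 0) if
 * fd is non-blocking and would block. Returns -1 if an error occurs
 * before anything was written.
 */
static ssize_t
capped_write(int fd, const char *buf, size_t len)
{
	size_t done = 0;
	int is_tty = isatty(fd);

	while (done < len) {
		size_t chunk = len - done;
		ssize_t n;

		/* Only ttys get the cap; pipes, sockets and files do not. */
		if (is_tty && chunk > TTY_WRITE_CAP)
			chunk = TTY_WRITE_CAP;

		n = write(fd, buf + done, chunk);
		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			if (errno == EAGAIN || errno == EWOULDBLOCK)
				break;		/* caller should poll and retry */
			return done > 0 ? (ssize_t)done : -1;
		}
		if (n == 0)
			break;
		done += (size_t)n;
	}
	return (ssize_t)done;
}
```

If a small test program that pushes 1024 or more bytes at a pty master through something like this behaves differently from one doing a single large write(), that would point at the tty driver rather than at sshd itself.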

More information about the openssh-unix-dev mailing list