Hanging ssh sessions with openssh-5.1p1 and Solaris 8 & 10
Graham Steward
gpsteward at yahoo.com
Thu Jul 9 23:24:55 EST 2009
Hi,
Has anyone had any luck looking into this?
> On Mon, Aug 04, 2008 at 02:34:23PM -0400, Jeff Wieland wrote:
>> Since we upgraded OpenSSH from 5.0p1 to 5.1p1 on our Solaris 8 boxes
>> (I know, I know, we should upgrade or retire them...), we've started
>> experiencing problems with slogin'ing into these boxes, running vi,
>> and pasting text into the vi session.
>>
>> As long as we are pasting in less than 1024 characters it's fine.
>> With >= 1024 characters, the session hangs.
We've also seen this problem. It affects Apple's Mac OS X build of OpenSSH 5.1p1 as well as all our Solaris 8, 9 & 10 machines running 5.1p1 & 5.2p1, but Debian's OpenSSH 5.1 is fine.
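For anyone who wants to probe the 1024-character boundary without counting characters by hand, here is a minimal sketch of a payload generator. The name make_payload is hypothetical and the choice of alphabet is arbitrary; the idea is only to print a single line of a known length that can then be copied and pasted into the vi session.

```python
import string


def make_payload(n):
    """Return a single-line string of exactly n printable characters.

    Hypothetical helper: the alphabet is arbitrary, and there is no
    newline, so the paste arrives as one chunk of known length.
    """
    alphabet = string.ascii_letters + string.digits
    reps, rem = divmod(n, len(alphabet))
    return alphabet * reps + alphabet[:rem]


if __name__ == "__main__":
    # Print one payload just under and one at the reported threshold.
    for size in (1023, 1024):
        print(f"--- {size} characters ---")
        print(make_payload(size))
```

Pasting the 1023-character line and then the 1024-character line should, if the report above holds on a given machine, show the first one arriving and the second one hanging.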
Running truss against the parent sshd shows it sitting at:
20193: write(1, 0x00063078, 1) (sleeping...)
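As background on what "(sleeping...)" generally indicates: a write() sleeps when the kernel cannot queue more data on the descriptor until something drains the other end. The sketch below is an illustration only. It uses an ordinary pipe as a stand-in for the pty (an assumption, since pty buffering differs and is platform-specific), the function names are hypothetical, and non-blocking mode is used so that the point where a blocking writer would go to sleep shows up as BlockingIOError instead.

```python
import os


def fill_until_blocked(fd, chunk=b"x" * 1024):
    """Write to a non-blocking fd until the kernel refuses more data.

    Returns the number of bytes queued. A blocking writer would go to
    sleep at the point where this function sees BlockingIOError.
    """
    total = 0
    while True:
        try:
            total += os.write(fd, chunk)
        except BlockingIOError:
            return total


def drain(fd):
    """Read everything currently queued on a non-blocking fd."""
    total = 0
    while True:
        try:
            data = os.read(fd, 65536)
        except BlockingIOError:
            return total
        if not data:
            return total
        total += len(data)


def demo():
    """Fill a pipe with nobody reading, then drain it."""
    r, w = os.pipe()
    os.set_blocking(r, False)
    os.set_blocking(w, False)
    try:
        queued = fill_until_blocked(w)
        drained = drain(r)
        return queued, drained
    finally:
        os.close(r)
        os.close(w)


if __name__ == "__main__":
    queued, drained = demo()
    print(f"queued {queued} bytes before the writer would have slept")
    print(f"drained {drained} bytes from the read side")
```

The queue size it reports is a property of the local pipe implementation, not of the Solaris pty in the truss output.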
Here's a load of output which I hope could help identify the cause(s) of this behaviour if anyone's interested.
I ran a DTrace script against the sshd processes on the machine and saw one of them reading and writing as I pasted a large quantity of text into a file (/tmp/sshd.test_test):
# ./opensnoop -n sshd
UID PID CMD D BYTES FILE
12962 9649 sshd R 1056 <unknown>
12962 9649 sshd W 1022 /devices/pseudo/clone@0:ptm
12962 9649 sshd R 480 <unknown>
12962 9649 sshd W 386 /devices/pseudo/clone@0:ptm
12962 9649 sshd R 512 /devices/pseudo/clone@0:ptm
12962 9649 sshd R 9712 <unknown>
12962 9649 sshd R 512 /devices/pseudo/clone@0:ptm
12962 9649 sshd W 7313 /devices/pseudo/clone@0:ptm
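To make the trace above easier to compare, here is a small sketch that totals the bytes per direction and file. The function name tally is hypothetical, the sample rows are copied from the output above, and reading "<unknown>" as the network socket is my assumption (the trace does not say so).

```python
from collections import defaultdict

# Rows copied from the trace above (header included on purpose).
SAMPLE = """\
UID PID CMD D BYTES FILE
12962 9649 sshd R 1056 <unknown>
12962 9649 sshd W 1022 /devices/pseudo/clone@0:ptm
12962 9649 sshd R 480 <unknown>
12962 9649 sshd W 386 /devices/pseudo/clone@0:ptm
12962 9649 sshd R 512 /devices/pseudo/clone@0:ptm
12962 9649 sshd R 9712 <unknown>
12962 9649 sshd R 512 /devices/pseudo/clone@0:ptm
12962 9649 sshd W 7313 /devices/pseudo/clone@0:ptm
"""


def tally(text):
    """Sum the BYTES column per (direction, file) pair.

    Lines whose first field is not numeric (such as the header)
    are skipped. The FILE column is kept whole even if it has spaces.
    """
    totals = defaultdict(int)
    for line in text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 6 or not fields[0].isdigit():
            continue
        _uid, _pid, _cmd, direction, nbytes, path = fields
        totals[(direction, path.strip())] += int(nbytes)
    return dict(totals)


if __name__ == "__main__":
    for (direction, path), nbytes in sorted(tally(SAMPLE).items()):
        print(f"{direction} {nbytes:6d} {path}")
```

For the rows above this gives 11248 bytes read from "<unknown>", 8721 bytes written to the ptm device and 1024 bytes read back from it. If "<unknown>" is indeed the socket, those reads are encrypted SSH packets, so the counts are not directly comparable with the plaintext written to the pty.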
A ptree of the vi shows all processes involved:
# ptree 10879
1083 /usr/local/sbin/sshd
9626 /usr/local/sbin/sshd -R
9649 /usr/local/sbin/sshd -R
9651 -tcsh
10879 vi /tmp/sshd.test_test
Running pfiles on pid 9649 we get:
# pfiles 9649
9649: /usr/local/sbin/sshd -R
Current rlimit: 256 file descriptors
0: S_IFCHR mode:0666 dev:356,0 ino:6815752 uid:0 gid:3 rdev:13,2
O_RDWR|O_LARGEFILE
/devices/pseudo/mm@0:null
1: S_IFCHR mode:0666 dev:356,0 ino:6815752 uid:0 gid:3 rdev:13,2
O_RDWR|O_LARGEFILE
/devices/pseudo/mm@0:null
2: S_IFCHR mode:0666 dev:356,0 ino:6815752 uid:0 gid:3 rdev:13,2
O_RDWR|O_LARGEFILE
/devices/pseudo/mm@0:null
3: S_IFDOOR mode:0444 dev:365,0 ino:92 uid:0 gid:0 size:0
O_RDONLY|O_LARGEFILE FD_CLOEXEC door to nscd[2394]
/var/run/name_service_door
4: S_IFIFO mode:0000 dev:354,0 ino:16183814 uid:12962 gid:4640 size:0
O_RDWR|O_NONBLOCK FD_CLOEXEC
5: S_IFSOCK mode:0666 dev:363,0 ino:2848 uid:0 gid:0 size:0
O_RDWR|O_NONBLOCK
SOCK_STREAM
SO_REUSEADDR,SO_KEEPALIVE,SO_SNDBUF(49152),SO_RCVBUF(49232),IP_NEXTHOP(0.0.192.80)
sockname: AF_INET xxx.yyy.10.222 port: 22
peername: AF_INET xxx.yyy.12.20 port: 50166
6: S_IFIFO mode:0000 dev:354,0 ino:16183814 uid:12962 gid:4640 size:0
O_RDWR|O_NONBLOCK FD_CLOEXEC
7: S_IFSOCK mode:0666 dev:363,0 ino:3489 uid:0 gid:0 size:0
O_RDWR FD_CLOEXEC
SOCK_STREAM
SO_SNDBUF(16384),SO_RCVBUF(5120)
sockname: AF_UNIX
8: S_IFCHR mode:0000 dev:356,0 ino:56458 uid:0 gid:0 rdev:23,9
O_RDWR|O_NONBLOCK|O_NOCTTY|O_LARGEFILE
/devices/pseudo/clone@0:ptm
10: S_IFCHR mode:0000 dev:356,0 ino:56458 uid:0 gid:0 rdev:23,9
O_RDWR|O_NONBLOCK|O_NOCTTY|O_LARGEFILE
/devices/pseudo/clone@0:ptm
11: S_IFCHR mode:0000 dev:356,0 ino:56458 uid:0 gid:0 rdev:23,9
O_RDWR|O_NONBLOCK|O_NOCTTY|O_LARGEFILE
/devices/pseudo/clone@0:ptm
pfiles for the other processes:
# pfiles 9626
9626: /usr/local/sbin/sshd -R
Current rlimit: 256 file descriptors
0: S_IFCHR mode:0666 dev:356,0 ino:6815752 uid:0 gid:3 rdev:13,2
O_RDONLY|O_LARGEFILE
/devices/pseudo/mm@0:null
1: S_IFCHR mode:0666 dev:356,0 ino:6815752 uid:0 gid:3 rdev:13,2
O_RDWR|O_LARGEFILE
/devices/pseudo/mm@0:null
2: S_IFCHR mode:0666 dev:356,0 ino:6815752 uid:0 gid:3 rdev:13,2
O_RDWR|O_LARGEFILE
/devices/pseudo/mm@0:null
3: S_IFDOOR mode:0444 dev:365,0 ino:92 uid:0 gid:0 size:0
O_RDONLY|O_LARGEFILE FD_CLOEXEC door to nscd[2394]
/var/run/name_service_door
4: S_IFCHR mode:0000 dev:356,0 ino:56458 uid:0 gid:0 rdev:23,9
O_RDWR|O_NONBLOCK|O_NOCTTY|O_LARGEFILE
/devices/pseudo/clone@0:ptm
5: S_IFSOCK mode:0666 dev:363,0 ino:2848 uid:0 gid:0 size:0
O_RDWR|O_NONBLOCK
SOCK_STREAM
SO_REUSEADDR,SO_KEEPALIVE,SO_SNDBUF(49152),SO_RCVBUF(49232),IP_NEXTHOP(0.0.192.80)
sockname: AF_INET xxx.yyy.10.222 port: 22
peername: AF_INET xxx.yyy.12.20 port: 50166
6: S_IFSOCK mode:0666 dev:363,0 ino:5314 uid:0 gid:0 size:0
O_RDWR FD_CLOEXEC
SOCK_STREAM
SO_SNDBUF(16384),SO_RCVBUF(5120)
sockname: AF_UNIX
# pfiles 10879
10879: vi /tmp/sshd.test_test
Current rlimit: 256 file descriptors
0: S_IFCHR mode:0620 dev:356,0 ino:12582934 uid:12962 gid:7 rdev:24,9
O_RDWR|O_NOCTTY|O_LARGEFILE
/devices/pseudo/pts@0:9
1: S_IFCHR mode:0620 dev:356,0 ino:12582934 uid:12962 gid:7 rdev:24,9
O_RDWR|O_NOCTTY|O_LARGEFILE
/devices/pseudo/pts@0:9
2: S_IFCHR mode:0620 dev:356,0 ino:12582934 uid:12962 gid:7 rdev:24,9
O_RDWR|O_NOCTTY|O_LARGEFILE
/devices/pseudo/pts@0:9
3: S_IFREG mode:0600 dev:85,0 ino:42979 uid:12962 gid:4640 size:24576
O_RDWR|O_CREAT|O_EXCL
/var/tmp/Exxka4pv
Nothing makes it into the temp file. Most of the pasted text (depending on its size) shows up in a truss of sshd process 9649, but it never gets any further.
Thanks in advance,
Graham
Darren Tucker wrote:
> On Mon, Aug 04, 2008 at 02:34:23PM -0400, Jeff Wieland wrote:
>> Since we upgraded OpenSSH from 5.0p1 to 5.1p1 on our Solaris 8 boxes
>> (I know, I know, we should upgrade or retire them...), we've started
>> experiencing problems with slogin'ing into these boxes, running vi,
>> and pasting text into the vi session.
>>
>> As long as we are pasting in less than 1024 characters it's fine.
>> With >= 1024 characters, the session hangs.
>
> Do you know if the problem occurs on the client or server side? ie if
> you use an older client with a newer server (and vice versa) does the
> problem occur?
It's the server side. It still happens if you use the 5.0p1 client,
and it also happens with the SecureCRT client.
>> If you run "/usr/ucb/lptest 72 23 | cat -n" in one window, and
>> then cut and paste up to the "V" on line 13, things work as expected.
>> If you include the "W" on line 13, the vi session will hang
>> with none of the pasted characters showing up.
>>
>> We've been building OpenSSH with Sun Studio 11 -- I tried building
>> it with GNU-CC 3.4.4 with the same results. We also link against
>> a locally built zlib, since Solaris 8 doesn't have zlib 1.2.3.
>> And we've used OpenSSL 0.9.8g and 0.9.8h with the same results.
>>
>> We also tried building OpenSSH 5.1p1 on our Solaris 10 boxes using
>> Sun Studio 12, and we also get the hangs there. The client doesn't
>> seem to matter -- we've seen it with OpenSSH 5.1p1 from both Solaris
>> and Slackware Linux, and also from SecureCRT.
>>
>> I have not been able to get anything useful from running sshd in
>> debug mode (at least, not that I recognize as useful :-) ).
>
> Well you could post it, someone else might recognise something :-)
I'll see if I can get this done tomorrow. It's a crazy couple of
weeks right now...
> Some versions of AIX have bugs in the tty drivers that prevent largish
> writes from working correctly. Perhaps Solaris has something similar
> (although I can't imagine why it's only started recently).
>
> You could try the patch below to test this theory.
>
> Index: channels.c
> ===================================================================
> RCS file: /usr/local/src/security/openssh/cvs/openssh/channels.c,v
> retrieving revision 1.273
> diff -u -p -r1.273 channels.c
> --- channels.c 16 Jul 2008 12:42:06 -0000 1.273
> +++ channels.c 5 Aug 2008 01:08:22 -0000
> @@ -1578,11 +1578,10 @@ channel_handle_wfd(Channel *c, fd_set *r
>  			}
>  			return 1;
>  		}
> -#ifdef _AIX
> +
>  		/* XXX: Later AIX versions can't push as much data to tty */
>  		if (compat20 && c->wfd_isatty)
> -			dlen = MIN(dlen, 8*1024);
> -#endif
> +			dlen = MIN(dlen, 1024);
>  
>  		len = write(c->wfd, buf, dlen);
>  		if (len < 0 &&
>
OK -- I can try this too. But it isn't necessary with the 5.0p1 sshd,
so I'm thinking that something changed w.r.t. OpenSSH.
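For readers who don't want to dig through channels.c, the idea behind the test patch quoted above is just to cap how much is handed to a single write() on a tty. Below is a rough sketch of that idea in Python, not the sshd code itself: write_clamped and max_chunk are hypothetical names, a pipe stands in for the pty, and the loop is blocking, whereas sshd uses non-blocking descriptors and comes back to the remainder on a later pass.

```python
import os


def write_clamped(fd, data, max_chunk=1024):
    """Write data to fd, never passing more than max_chunk bytes
    to a single write() call.

    Returns the list of per-call byte counts. Partial writes are
    handled by advancing only by what the kernel actually accepted.
    """
    sizes = []
    view = memoryview(data)
    while view:
        written = os.write(fd, view[:max_chunk])
        sizes.append(written)
        view = view[written:]
    return sizes


if __name__ == "__main__":
    r, w = os.pipe()  # stand-in for the pty master (assumption)
    payload = b"a" * 2500
    sizes = write_clamped(w, payload)
    os.close(w)
    received = b""
    while True:
        block = os.read(r, 4096)
        if not block:
            break
        received += block
    os.close(r)
    print("per-call sizes:", sizes)
    print("round trip ok:", received == payload)
```

If the hang goes away with a 1024-byte cap like this, that would point at the tty layer choking on larger writes rather than at the SSH protocol side.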