[Bug 2071] New: sshd closes stderr but not stdout when child process exits

bugzilla-daemon at mindrot.org
Sat Feb 16 02:50:32 EST 2013


https://bugzilla.mindrot.org/show_bug.cgi?id=2071

            Bug ID: 2071
           Summary: sshd closes stderr but not stdout when child process
                    exits
    Classification: Unclassified
           Product: Portable OpenSSH
           Version: 6.1p1
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: P5
         Component: sshd
          Assignee: unassigned-bugs at mindrot.org
          Reporter: jdpaul at interstel.net

Created attachment 2220
  --> https://bugzilla.mindrot.org/attachment.cgi?id=2220&action=edit
strace output and code fragments demonstrating stderr-vs-stdout bug

Description
-----------

When using ssh to run a remote command which forks a child process and
then exits, the behavior of sshd differs for stdout and stderr: stdout
stays connected, but stderr gets closed.

This behavior occurs for Protocol 2 on OpenSSH 4.6p1 and later, on any
OS.

For Protocol 2 and OpenSSH 4.5p1 and earlier, stdout and stderr both
stay connected.

For Protocol 1 on any version, stdout and stderr both stay connected.


The effect of the stderr close is that any process on the server side
that tries to write to stderr, once sshd closes its end of the stderr
pipe, will get a SIGPIPE signal.  This will in most cases kill the
process tree (either directly or via cascading errors).
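
The SIGPIPE mechanism can be seen locally, without ssh.  The following
is only a sketch: 'true' stands in for a reader that has gone away (the
role sshd plays here), and the function name and temp-file handling are
illustrative, not taken from the attachment.  It assumes SIGPIPE has
its default disposition in the invoking shell.

```shell
#!/bin/sh
# Hypothetical helper: shows what happens to a writer once the reader
# of its pipe has exited.  'true' exits immediately and closes the read
# end; one second later the writer tries to write and is terminated.
sigpipe_demo() {
    _status_file=$(mktemp) || return 1
    {
        # Writer: the write happens after the reader is gone.
        ( sleep 1 ; echo "late write" )
        # Record the writer's exit status in a file, since the pipe
        # itself is no longer usable.
        echo "$?" > "$_status_file"
    } | true
    cat "$_status_file"
    rm -f "$_status_file"
}

writer_status=$(sigpipe_demo)
# With the default SIGPIPE disposition this is typically 141 (128 + 13).
echo "writer exit status: $writer_status"
```

If SIGPIPE is ignored in the calling environment, the write fails with
an error instead and the status is some other nonzero value.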

This is a subtle problem: in normal use it can be rare and hard to
track down, since it depends on something writing to stderr after the
parent has exited.  It can, however, be reproduced on demand as shown
below.


Demonstration
-------------

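NOTE: these examples work only when the remote user's login shell is a
Bourne-style shell (sh, ksh, bash, or similar).  A shell with a
built-in 'echo' (bash or ksh) will give the cleanest demonstration.
The client-side timestamps use '%3N' (milliseconds), which is a GNU
date extension.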


1.  stdout only:  stdout stays connected after primary command exit:

CMD="echo parent ; ( echo child stdout 1 ; sleep 2 ;
echo child stdout 2 ; sleep 2 ; echo child stdout 3 ) & echo parent end ;
exit 0"

ssh -2 $testhost "$CMD" 2>&1 |
    while read line ; do date "+%Y%m%d %T.%3N $line" ; done

This emits output like this:

20130206 15:35:33.743 parent
20130206 15:35:33.751 parent end
20130206 15:35:33.752 child stdout 1
20130206 15:35:35.742 child stdout 2
20130206 15:35:37.768 child stdout 3
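
The timestamping loop above puts $line inside the format string given
to date, so any '%' in the remote output would be interpreted by date.
A slightly more defensive variant is sketched here; the function name
stamp_lines is hypothetical, and '%3N' still assumes GNU date.

```shell
#!/bin/sh
# Hypothetical helper: prefix each line read from stdin with a
# timestamp.  The line is passed to printf as data, so '%' characters
# and backslashes in the remote output are preserved as-is.
stamp_lines() {
    while IFS= read -r line ; do
        printf '%s %s\n' "$(date '+%Y%m%d %T.%3N')" "$line"
    done
}

# Local check, no ssh needed:
printf 'child stdout 1\n100%% done\n' | stamp_lines
```

With ssh it would be used as:  ssh -2 $testhost "$CMD" 2>&1 | stamp_lines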


2.  Add writes to stderr:  this will cause the connection to break.
The time of the break is dependent on which completes first, the
parent shell exit or the child shell's write to stderr.  This is a
classic race condition, and is dependent on the CPU speed of the
remote host.

CMD="echo parent ; ( echo child stdout 1 ; echo child stderr 1 1>&2 ;
sleep 2 ; echo child stdout 2 ; echo child stderr 2 1>&2 ; sleep 2 ;
echo child stdout 3; echo child stderr 3 1>&2 ) & echo parent end ;
exit 0"

ssh -2 $testhost "$CMD" 2>&1 |
    while read line ; do date "+%Y%m%d %T.%3N $line" ; done

For affected versions of OpenSSH, this emits output like this:

20130206 15:37:43.406 parent
20130206 15:37:43.408 child stdout 1
20130206 15:37:43.410 child stderr 1
20130206 15:37:43.412 parent end
20130206 15:37:45.406 child stdout 2

Slower hosts might break the connection on the first write to stderr
('echo child stderr 1 1>&2'), while faster hosts usually break the
connection on the second ('echo child stderr 2 1>&2').

For unaffected versions of OpenSSH, it emits the full output:

20130206 16:10:44.418 parent
20130206 16:10:44.420 child stdout 1
20130206 16:10:44.551 parent end
20130206 16:10:44.553 child stderr 1
20130206 16:10:46.422 child stdout 2
20130206 16:10:46.424 child stderr 2
20130206 16:10:48.425 child stdout 3
20130206 16:10:48.427 child stderr 3
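
To compare many hosts or versions, the captured output of
demonstration 2 can be checked mechanically.  The following is a
minimal sketch; the function name classify_run and the words it prints
are my own choices, and it assumes each message appears at the end of a
timestamped line as in the samples above.

```shell
#!/bin/sh
# Hypothetical helper: reads the timestamped output of demonstration 2
# on stdin and prints "complete" if all eight expected messages
# arrived, otherwise "truncated".
classify_run() {
    _output=$(cat)
    for _expected in "parent" "parent end" \
        "child stdout 1" "child stdout 2" "child stdout 3" \
        "child stderr 1" "child stderr 2" "child stderr 3"
    do
        # Match the message at end of line, after the timestamp.
        if ! printf '%s\n' "$_output" | grep -q " $_expected\$" ; then
            echo "truncated"
            return 0
        fi
    done
    echo "complete"
}

# Example using the output shown above for an affected version:
printf '%s\n' \
    "20130206 15:37:43.406 parent" \
    "20130206 15:37:43.408 child stdout 1" \
    "20130206 15:37:43.410 child stderr 1" \
    "20130206 15:37:43.412 parent end" \
    "20130206 15:37:45.406 child stdout 2" | classify_run
```

A "truncated" result only says that output was lost; it does not by
itself identify where the connection broke.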


Software Versions
-----------------

I tested a range of OpenSSH versions, each compiled from source, to
figure out when the bug was introduced.

The bug is not present in OpenSSH-4.5p1 and earlier.

The bug is present in OpenSSH-4.6p1 and later, all the way through
OpenSSH-6.1p1.

(Confusingly, the bug IS present on Red Hat Enterprise Linux/CentOS
5.4, which include OpenSSH-4.3p2 via the openssh-4.3p2-36.el5_4.2 RPM;
Red Hat probably back-ported changes without changing the version
number.)


Analysis
--------

strace output and code fragments are included in the attachment.


Thanks -- 

JD Paul
jdpaul at interstel.net
