[Bug 3312] New: Behavior change in OpenSSH8.1 compared to OpenSSH7.5

bugzilla-daemon at mindrot.org bugzilla-daemon at mindrot.org
Tue May 18 00:26:46 AEST 2021


            Bug ID: 3312
           Summary: Behavior change in OpenSSH8.1 compared to OpenSSH7.5
           Product: Portable OpenSSH
           Version: 8.1p1
          Hardware: PPC
                OS: AIX
            Status: NEW
          Severity: major
          Priority: P5
         Component: sshd
          Assignee: unassigned-bugs at mindrot.org
          Reporter: mayasha9 at in.ibm.com


I recently built and installed OpenSSH 8.1 and found some strange
behavior compared to OpenSSH 7.5.
I would like your opinion on this change in behavior between OpenSSH
7.5 and 8.1.

Problem title:

8.1 OpenSSH waits for command run in background

Problem description:

Using ssh to run a command in the background will cause the ssh session
to stay open until the command is done executing, if using OpenSSH 8.1.
For example:

ssh -n user@hostname 'nohup /tmp/dosleep > /dev/null & > /dev/null'

...if /tmp/dosleep has a sleep of 30 seconds, this session will stay
open for 30 seconds. At earlier levels of OpenSSH, the ssh session
would exit immediately since the command is being run in the
background.
Analysis so far:
This test is run using this script:

exec 1>/tmp/dosleep.out
sleep 10
lslpp -L


# time (ssh -n root@tele1 'nohup /tmp/dosleep > /dev/null & >

...output will show it paused for 10 seconds:

real    0m11.30s
user    0m0.12s
sys     0m0.00s

...whereas the OpenSSH 7.5 version of sshd shows something more like
"real 0m0.94s", as the ssh session is allowed to exit immediately.

Redirecting stderr to null will allow it to exit immediately even when
using 8.X:
# time (ssh -n root@tele1 'nohup /tmp/dosleep > /dev/null 2>&1 & >

real    0m0.94s
user    0m0.12s
sys     0m0.00s
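
As a hedged sketch of that workaround, a small wrapper can apply the
redirections before putting the command in the background. The function
name "detach" is hypothetical and not part of OpenSSH, and the stdin
redirect is an extra precaution that was not part of the test above. The
local check uses command substitution as a stand-in for sshd, on the
assumption that both wait for the pipe to be closed by every process that
holds it:

```shell
#!/bin/sh
# detach: hypothetical helper (not part of OpenSSH). Starts a command in the
# background with stdin, stdout and stderr all pointed at /dev/null, so the
# command inherits none of the session's descriptors.
detach() {
    nohup "$@" </dev/null >/dev/null 2>&1 &
}

# Local check without ssh: $(...) reads from a pipe and only returns once
# every copy of the pipe's write end is closed. This is assumed to mirror
# how sshd treats the command's stderr pipe.
start=$(date +%s)
out=$( { detach sleep 3; } 2>&1 )
detached_elapsed=$(( $(date +%s) - start ))
echo "returned after ${detached_elapsed}s"
```

If the assumption holds, the check returns right away instead of waiting
for the 3 second sleep.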

sshd debug will show that we pause here:

debug2: notify_done: reading
debug2: channel 0: read<=0 rfd 13 len 0
debug2: channel 0: read failed
debug2: channel 0: chan_shutdown_read (i0 o3 sock -1 wfd 13 efd 15
debug2: channel 0: input open -> drain
debug2: channel 0: ibuf_empty delayed efd 15/(0)
<pause here>

In channels.c (OpenSSH 8.1), if I make the following change,
OpenSSH 8.1 behaves the same as OpenSSH 7.5 did.

        if ((len = sshbuf_len(c->input)) == 0) {
                if (c->istate == CHAN_INPUT_WAIT_DRAIN) {
        -               /*
        -                * input-buffer is empty and read-socket
        -                * tell peer, that we will not send more data:
        -                * send IEOF.
        -                * hack for extended data: delay EOF if EFD
        -                * in use.
        -                */
        -               if (CHANNEL_EFD_INPUT_ACTIVE(c))
        -                       debug2("channel %d: "
        -                           "ibuf_empty delayed efd %d/(%zu)",
        -                           c->self, c->efd,
        -               else
        -                       chan_ibuf_empty(ssh, c);
        +               chan_ibuf_empty(ssh, c);

sshd seems to think an input, an "extended file descriptor" (EFD), is
still active and that we are waiting on input from it. efd 15 is
opened for the stderr of the command.

This might mean sshd still thinks there is input to read from stderr,
since redirecting stderr to the same place as stdout (/dev/null) makes
the problem go away.
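
The suspected mechanism can be illustrated locally without ssh. This is
only a sketch under the assumption that the delay comes from the
background process inheriting the stderr pipe; the helper name
"time_with_captured_stderr" is made up for the illustration, and command
substitution stands in for sshd reading the efd:

```shell
#!/bin/sh
# time_with_captured_stderr: hypothetical helper. Runs a shell snippet with
# its stderr captured through a pipe and stores the whole seconds taken in
# the variable "elapsed".
time_with_captured_stderr() {
    start=$(date +%s)
    # The reader of $(...) sees EOF only after every process that holds the
    # write end of the pipe (here: as its stderr) has closed it or exited.
    captured=$( { eval "$1"; } 2>&1 )
    elapsed=$(( $(date +%s) - start ))
}

# Only stdout is redirected: the background sleep keeps stderr (the pipe)
# open, so the reader waits for it to finish.
time_with_captured_stderr 'sleep 2 >/dev/null &'
held=$elapsed

# stdout and stderr are both redirected: nothing holds the pipe open.
time_with_captured_stderr 'sleep 2 >/dev/null 2>&1 &'
released=$elapsed

echo "stderr inherited: ${held}s, stderr redirected: ${released}s"
```

The first call should take about as long as the sleep and the second
should return almost immediately, which matches the two timings measured
over ssh with 8.1.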

It says this is a 'hack for extended data', which is a bit worrying. 
This happens whether the program being run is a ksh shell script or a
compiled C program.

Testing by adding a circumvention and calling chan_ibuf_empty even if
we do hit that CHANNEL_EFD_INPUT_ACTIVE condition will avoid the
problem, although that would undo the 'hack' that is in place.

It would appear the logic is not correct that leads us to wait on this
file descriptor since we are running the command with nohup and in the
background. Even if something in the script were to write to stderr
after our sleep has happened, we shouldn't be displaying that on the
client side, since the nohup ought to prevent things from writing to
the screen.

But then, if I change my test script to this:

exec 1>/tmp/dosleep.out
sleep 10
lslpp -L
echo "this is going to stderr" >&2

...and run this:

# time (ssh -n root@tele1 'nohup /tmp/dosleep >/dev/null & >

...the client side actually sees what the script is writing to stderr:

<10 second pause happens here, then we see: >

this is going to stderr
real    0m11.47s
user    0m0.12s
sys     0m0.00s

I wouldn't have expected we'd see that since we are nohup'ing it.

Testing the same way with 7.5 sshd, we do not see the 'this is going to
stderr' output.
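
One possible explanation for the output getting through: as specified by
POSIX, nohup only redirects stdout and stderr when they are attached to a
terminal. Assuming the ssh session has no tty (no -t was given), stderr of
the remote command is not a terminal, so nohup would leave it alone. A
minimal local sketch of that nohup behavior, with a pipe standing in for
the session's stderr:

```shell
#!/bin/sh
# Capture only the stderr of a nohup'ed command: stderr goes to the pipe read
# by $(...), stdout is discarded, and stdin is /dev/null, so no terminal is
# involved and nohup leaves all three descriptors as they are.
seen=$(nohup sh -c 'echo "this is going to stderr" >&2' 2>&1 >/dev/null </dev/null)
echo "captured: $seen"
```

Here the message written to stderr is still captured despite nohup, which
is consistent with what the 8.1 client displays.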

What do you think, is this change in behavior a defect?

