TCP Forwarding hangs when TCP service is unresponsive, even when TCP client exits

Corey Hickey bugfood-ml at fatooh.org
Tue Sep 20 04:23:20 AEST 2022


On 2022-09-19 01:55, Damien Miller wrote:
> On Fri, 16 Sep 2022, Corey Hickey wrote:
> 
> First, thanks for the detailed investigation and for reproducing this
> with git HEAD.

You're welcome, and thank you for your attention to this.

> This is kind of a tricky case, because for some cases it's AFAIK impossible
> for the client to discern between a TCP server that a) will never respond
> from b) hasn't responded *yet*.
> 
> The solution that you proposed is unfortunately not without side
> effects - I think it changes the behaviour of half-closed TCP connection
> in a way that might lose data.

I was afraid there would be something like this.

>> [djm at lll ~]$ wget --timeout=1 --tries=1 http://127.0.0.1:8888--2022-09-19 18:16:17--  http://127.0.0.1:8888/
>> Connecting to 127.0.0.1:8888... debug3: receive packet: type 90
>> debug1: client_input_channel_open: ctype forwarded-tcpip rchan 3 win 2097152 max 32768
>> debug1: client_request_forwarded_tcpip: listen localhost port 8888, originator 127.0.0.1 port 35148
>> debug2: fd 7 setting O_NONBLOCK
>> debug2: fd 7 setting TCP_NODELAY
>> debug1: connect_next: host 127.0.0.1 ([127.0.0.1]:9999) in progress, fd=7
>> debug3: fd 7 is O_NONBLOCK
>> debug3: fd 7 is O_NONBLOCK
>> debug1: channel 1: new [127.0.0.1]
>> debug1: confirm forwarded-tcpip
>> debug3: channel 1: waiting for connection
>> debug3: channel 1: waiting for connection
>> connected.
>> HTTP request sent, awaiting response... debug3: channel 1: waiting for connection
>> debug3: channel 1: waiting for connection
>> Read error (Connection timed out) in headers.
>> Giving up.
>>
>> debug3: channel 1: waiting for connection
>> debug3: channel 1: waiting for connection
> 
> Is this what you see too?

 From the context, this appears to be for the remote port forwarding case
to localhost with an interactive SSH session. I think your example is a
bit different from mine, though--in your case, nothing is responding to
the SYNs sent to port 9999. Is the above from OpenBSD? If so, then there
might be a difference in OS behavior. On Linux, at least, a 3-way
handshake appears to be able to complete even if the service listening
on the socket is stopped.

I can reproduce your output if I configure my localhost firewall to drop
packets sent to 9999. In that case, I get output like yours, but the SSH
session does not hang upon logout.

The problem I'm seeing is a bit different: the TCP connection to the
service completes, so the SSH forwarded channel becomes fully
established. Any data sent over the channel, however, goes to a service
that is stopped. When the TCP client times out, the SSH server sends the
SSH client an EOF, but the SSH client never does it's EOF+CLOSE, since
it is waiting for an unresponsive TCP service to respond.


I can provide debug3 output for the remote forwarding scenario using an
interactive shell.

If I run:

$ ssh localhost -p 2222 -R 8888:127.0.0.1:9999 -v -v -v

...and then run wget from _within_ that interactive shell, I get:
------------------------------------------------------------------
$ wget --timeout=1 --tries=1 http://127.0.0.1:8888
--2022-09-19 10:00:45--  http://127.0.0.1:8888/
Connecting to 127.0.0.1:8888... debug3: receive packet: type 90
debug1: client_input_channel_open: ctype forwarded-tcpip rchan 3 win 2097152 max 32768
debug1: client_request_forwarded_tcpip: listen localhost port 8888, originator 127.0.0.1 port 54490
debug1: connect_next: start for host 127.0.0.1 ([127.0.0.1]:9999)
debug2: fd 7 setting O_NONBLOCK
debug2: fd 7 setting TCP_NODELAY
debug1: connect_next: connect host 127.0.0.1 ([127.0.0.1]:9999) in progress, fd=7
debug3: fd 7 is O_NONBLOCK
debug3: fd 7 is O_NONBLOCK
debug1: channel 1: new [127.0.0.1]
debug1: confirm forwarded-tcpip
debug3: channel 1: waiting for connection
connected.
HTTP request sent, awaiting response... debug1: channel 1: connected to 127.0.0.1 port 9999
debug3: send packet: type 91
debug3: receive packet: type 96
debug2: channel 1: rcvd eof
debug2: channel 1: output open -> drain
debug2: channel 1: obuf empty
debug2: chan_shutdown_write: channel 1: (i0 o1 sock 7 wfd 7 efd -1 [closed])
debug2: channel 1: output drain -> closed
Read error (Connection timed out) in headers.
Giving up.
------------------------------------------------------------------



Meanwhile, the SSH server reports:
------------------------------------------------------------------
debug1: Connection to port 8888 forwarding to localhost port 0 requested.
debug2: fd 9 setting TCP_NODELAY
debug2: fd 9 setting O_NONBLOCK
debug3: fd 9 is O_NONBLOCK
debug1: channel 3: new [forwarded-tcpip]
debug3: send packet: type 90
debug3: receive packet: type 91
debug2: channel 3: open confirm rwindow 2097152 rmax 32768
debug2: channel 3: read failed rfd 9 maxlen 32768: Broken pipe
debug2: channel 3: read failed
debug2: chan_shutdown_read: channel 3: (i0 o0 sock 9 wfd 9 efd -1 [closed])
debug2: channel 3: input open -> drain
debug2: channel 3: ibuf empty
debug2: channel 3: send eof
debug3: send packet: type 96
debug2: channel 3: input drain -> closed
------------------------------------------------------------------


If I then exit from the interactive shell, I get:
------------------------------------------------------------------
$ exit
logout
debug3: receive packet: type 96
debug2: channel 0: rcvd eof
debug2: channel 0: output open -> drain
debug2: channel 0: obuf empty
debug2: chan_shutdown_write: channel 0: (i0 o1 sock -1 wfd 5 efd 6 [write])
debug2: channel 0: output drain -> closed
debug3: receive packet: type 98
debug1: client_input_channel_req: channel 0 rtype exit-status reply 0
debug3: receive packet: type 98
debug1: client_input_channel_req: channel 0 rtype eow at openssh.com reply 0
debug2: channel 0: rcvd eow
debug2: chan_shutdown_read: channel 0: (i0 o3 sock -1 wfd 4 efd 6 [write])
debug2: channel 0: input open -> closed
debug3: receive packet: type 97
debug2: channel 0: rcvd close
debug3: channel 0: will not send data after close
debug2: channel 0: almost dead
debug2: channel 0: gc: notify user
debug2: channel 0: gc: user detached
debug2: channel 0: send close
debug3: send packet: type 97
debug2: channel 0: is dead
debug2: channel 0: garbage collecting
debug1: channel 0: free: client-session, nchannels 2
debug3: channel 0: status: The following connections are open:
   #0 client-session (t4 r2 i3/0 o3/0 e[write]/0 fd -1/-1/6 sock -1 cc -1 io 0x00/0x00)
   #1 127.0.0.1 (t4 r3 i0/0 o3/0 e[closed]/0 fd 7/7/-1 sock 7 cc -1 io 0x00/0x00)
------------------------------------------------------------------


The SSH client is hung at this point. The server reports:
------------------------------------------------------------------

debug2: channel 2: read failed rfd 10 maxlen 32768: Input/output error
debug2: channel 2: read failed
debug2: chan_shutdown_read: channel 2: (i0 o0 sock -1 wfd 10 efd -1 [closed])
debug2: channel 2: input open -> drain
debug2: channel 2: ibuf empty
debug2: channel 2: send eof
debug3: send packet: type 96
debug2: channel 2: input drain -> closed
debug1: Received SIGCHLD.
debug1: session_by_pid: pid 1840470
debug1: session_exit_message: session 0 channel 2 pid 1840470
debug2: channel 2: request exit-status confirm 0
debug3: send packet: type 98
debug1: session_exit_message: release channel 2
debug2: channel 2: write failed
debug2: chan_shutdown_write: channel 2: (i3 o0 sock -1 wfd 8 efd -1 [closed])
debug2: channel 2: send eow
debug3: send packet: type 98
debug2: channel 2: output open -> closed
debug3: mm_request_send: entering, type 30
debug2: channel 2: send close
debug3: send packet: type 97
debug3: mm_request_receive: entering
debug3: monitor_read: checking request 30
debug3: mm_answer_pty_cleanup: entering
debug1: session_by_tty: session 0 tty /dev/pts/6
debug3: mm_session_close: session 0 pid 1840469
debug3: mm_session_close: tty /dev/pts/6 ptyfd 4
debug3: channel 2: will not send data after close
debug1: session_pty_cleanup2: session 0 release /dev/pts/6
debug3: session_unused: session id 0 unused
debug3: receive packet: type 97
debug2: channel 2: rcvd close
debug3: channel 2: will not send data after close
debug2: channel 2: is dead
debug2: channel 2: gc: notify user
debug1: session_by_channel: session 0 channel 2
debug1: session_close_by_channel: channel 2 child 0
Close session: user chickey from 127.0.0.1 port 46454 id 0
debug3: session_unused: session id 0 unused
debug2: channel 2: gc: user detached
debug2: channel 2: is dead
debug2: channel 2: garbage collecting
debug1: channel 2: free: server-session, nchannels 4
debug3: channel 2: status: The following connections are open:
   #2 server-session (t4 r0 i3/0 o3/0 e[closed]/0 fd -1/-1/-1 sock -1 cc -1 io 0x00/0x00)
   #3 forwarded-tcpip: listening port 8888 for localhost port 0, connect from 127.0.0.1 port 54490 to 127.0.0.1 port 8888 (t4 r1 i3/0 o0/0 e[closed]/0 fd 9/9/-1 sock 9 cc -1 io 0x00/0x00)
------------------------------------------------------------------


Note that upon logout, there is nothing about the port-forwarding
channel (#1 on the client and #3 on the server) other than the final
reports. Logging out from the SSH session's interactive shell has no
effect on the forwarding channel, which is already hung by that point.

> IMO the root problem here is that channels in state SSH_CHANNEL_CONNECTING
> have no timeout unless there system's TCP stack implements one. Maybe
> OpenSSH should implement something conservative here.

If there's an issue there, then I think that is separate from the
trouble I'm having; as noted above, the channels in my case go into a
fully open state.

> I do notice some different behaviour between Linux (above) and OpenBSD.
> On OpenBSD the connection is accepted but obviously does not pass any
> data (of course).

This actually sounds exactly like what I'm seeing on Linux.

$ uname -a
Linux corey 5.18.0-2-amd64 #1 SMP PREEMPT_DYNAMIC Debian 5.18.5-1 (2022-06-16) x86_64 GNU/Linux

> This is harder to fix without the side effects I
> mentioned above, e.g. consider a TCP client program that connects to a
> forwarded socket, sends a message and exits without waiting for a reply.
> I think setting c->force_drain in this case could cause the message to be
> lost (though I'm not 100% sure).

 From channel_input_ieof():
------------------------------------------------------------------
         if (c->force_drain && c->istate == CHAN_INPUT_OPEN) {
                 debug("channel %d: FORCE input drain", c->self);
                 c->istate = CHAN_INPUT_WAIT_DRAIN;
                 if (sshbuf_len(c->input) == 0)
                         chan_ibuf_empty(ssh, c);
         }
------------------------------------------------------------------

And chan_ibuf_empty() is:
------------------------------------------------------------------
chan_ibuf_empty(struct ssh *ssh, Channel *c)
{
         debug2("channel %d: ibuf empty", c->self);
         if (sshbuf_len(c->input)) {
                 error("channel %d: chan_ibuf_empty for non empty buffer",
                     c->self);
                 return;
         }
         switch (c->istate) {
         case CHAN_INPUT_WAIT_DRAIN:
                 if (!(c->flags & (CHAN_CLOSE_SENT|CHAN_LOCAL)))
                         chan_send_eof2(ssh, c);
                 chan_set_istate(c, CHAN_INPUT_CLOSED);
                 break;
         default:
                 error("channel %d: chan_ibuf_empty for istate %d",
                     c->self, c->istate);
                 break;
         }
}
------------------------------------------------------------------

I don't know the full ramifications of that. It looks like there are
two immediate effects:
1. The channel input goes into a draining state.
2. If there is no buffered input, the channel input goes into a closed
    state.

For the case where there is no buffered input, that seems like it would
be ok to me, from a layman's perspective: no harm in closing something
when all data has been consumed.

If there is any buffered input, though, I don't know what ends up
happening later in the state machine.

Thanks again for your attention here. I'm figuring out what I can, but
I'm not familiar with the code or architecture.

-Corey


More information about the openssh-unix-dev mailing list