[Lutz.Jaenicke at aet.TU-Cottbus.DE: 2.9p1: HP-UX 10.20 utmp/wtmp handling broken?]

Lutz Jaenicke Lutz.Jaenicke at aet.TU-Cottbus.DE
Thu Jun 28 20:04:28 EST 2001


On Sat, Jun 23, 2001 at 12:03:06AM +0200, Markus Friedl wrote:
> i get this on openbsd-current:
> 
> Connection closed by remote host.
> debug1: channel_free: channel 0: server-session, nchannels 1
> debug3: channel_free: status: The following connections are open:
>   #0 server-session (t4 r0 i1/0 o16/0 fd 4/3)
> 
> debug1: channel_free: channel 0: dettaching channel user
> debug1: session_by_channel: session 0 channel 0
> debug1: session_close_by_channel: channel 0 kill 5870
> debug1: Received SIGCHLD.
> debug3: channel_close_fds: channel 0: r 4 w 3 e -1
> debug1: session_by_pid: pid 5870
> debug1: session_close: session 0 pid 5870
> debug1: session_pty_cleanup: session 0 release /dev/ttyqc
> Closing connection to 127.0.0.1

Today I found some time to fire up DDD against the latest CVS version:
* The problem still persists in normal operation.
--------------------------------------------------------------------------
Connection closed by remote host.
debug1: channel_free: channel 0: server-session, nchannels 2
debug3: channel_free: status: The following connections are open:
  #0 server-session (t4 r0 i1/0 o16/0 fd 7/3)

debug1: channel_free: channel 0: dettaching channel user
debug1: session_by_channel: session 0 channel 0
debug1: session_close_by_channel: channel 0 kill 8599
debug3: channel_close_fds: channel 0: r 7 w 3 e -1
debug1: channel_free: channel 1: X11 inet listener, nchannels 1
debug3: channel_free: status: The following connections are open:

debug3: channel_close_fds: channel 1: r 8 w 8 e -1
Closing connection to 141.43.132.151
--------------------------------------------------------------------------
* The problem does not appear when stepping through the code.
-> This immediately suggests a race condition.
  session_by_pid is called when SIGCHLD has been received and is
  detected at the end of serverloop2(). Because waitpid() is called
  with the WNOHANG flag, it returns at once if the child has not yet
  exited when channel_free_all() finishes; the child's exit then goes
  unnoticed and session_close_by_pid() is never called (for that child).

  As a test, I added a "sleep(1)" to the code, and
  now session_close_by_pid() is properly called:

        channel_free_all();
/* start inserted by Lutz */
        sleep(1);
/* end inserted by Lutz */
        signal(SIGCHLD, SIG_DFL);
        while ((pid = waitpid(-1, &status, WNOHANG)) > 0)
                session_close_by_pid(pid, status);

--------------------------------------------------------------------------
Connection closed by remote host.
debug1: channel_free: channel 0: server-session, nchannels 2
debug3: channel_free: status: The following connections are open:
  #0 server-session (t4 r0 i1/0 o16/0 fd 7/3)

debug1: channel_free: channel 0: dettaching channel user
debug1: session_by_channel: session 0 channel 0
debug1: session_close_by_channel: channel 0 kill 8729
debug3: channel_close_fds: channel 0: r 7 w 3 e -1
debug1: channel_free: channel 1: X11 inet listener, nchannels 1
debug3: channel_free: status: The following connections are open:

debug3: channel_close_fds: channel 1: r 8 w 8 e -1
debug1: Received SIGCHLD.
debug1: session_by_pid: pid 8729
debug1: session_close: session 0 pid 8729
debug1: session_pty_cleanup: session 0 release /dev/pts/9
Closing connection to 141.43.132.151
--------------------------------------------------------------------------
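
The race can also be reproduced outside sshd. The following is only a
minimal stand-alone sketch, not OpenSSH code: the function name
first_poll_after_kill() and the cleanup_ms parameter are made up for
illustration. A child waits for SIGTERM and then spends some time on
"cleanup" before exiting, while the parent kill()s it and polls once
with WNOHANG, much like the loop after channel_free_all().

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical demo, not sshd code: fork a child that needs cleanup_ms
 * of "cleanup" after SIGTERM, kill it, and poll once with WNOHANG.
 * Returns 0 if the poll missed the exit, 1 if it caught it, -1 on error.
 */
int first_poll_after_kill(long cleanup_ms)
{
    int ready[2];
    int status;
    char c;
    pid_t pid, first;

    if (pipe(ready) < 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(ready[0]);
        close(ready[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: block SIGTERM so it can be picked up with sigwait(). */
        sigset_t set;
        int sig;
        struct timespec cleanup;

        cleanup.tv_sec = cleanup_ms / 1000;
        cleanup.tv_nsec = (cleanup_ms % 1000) * 1000000L;
        sigemptyset(&set);
        sigaddset(&set, SIGTERM);
        sigprocmask(SIG_BLOCK, &set, NULL);
        close(ready[0]);
        if (write(ready[1], "r", 1) != 1)   /* tell the parent we are ready */
            _exit(1);
        sigwait(&set, &sig);                /* wait for the parent's kill() */
        nanosleep(&cleanup, NULL);          /* simulated cleanup before exit */
        _exit(0);
    }

    /* Parent: wait until the child is ready, then kill and poll once. */
    close(ready[1]);
    if (read(ready[0], &c, 1) != 1) {
        close(ready[0]);
        waitpid(pid, &status, 0);
        return -1;
    }
    close(ready[0]);
    kill(pid, SIGTERM);
    first = waitpid(pid, &status, WNOHANG);
    if (first == 0)
        waitpid(pid, &status, 0);           /* blocking wait reaps the child */
    return first == 0 ? 0 : (first == pid ? 1 : -1);
}
```

With a cleanup time of a few hundred milliseconds, the single WNOHANG
poll would be expected to return 0, i.e. the exit is missed, which
matches the behaviour seen without the sleep(1).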

That leaves the question of a "cleaner" solution to this problem.
After first sending the bug report, I received private mails indicating
that this effect has also been seen on Linux, so it is not purely an
HP-UX problem.

As session_close_by_channel() kill()s the child with either TERM or HUP,
the child has the opportunity to perform some cleanup before exiting, so
it is quite possible that there is a delay causing the problem.
(I use tcsh to see this effect, for what it's worth.)
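
One possible direction, given here only as a rough sketch and not as a
tested patch for sshd, would be to replace the fixed sleep with a
bounded wait that polls until the child has exited or a timeout
expires. The helper name reap_with_timeout() and the 10 ms polling
step are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>

/*
 * Hypothetical helper, not part of OpenSSH: poll waitpid() with WNOHANG
 * for at most timeout_ms milliseconds.
 * Returns the child's pid once it has been reaped, 0 on timeout,
 * or -1 on error (errno is set by waitpid()).
 */
pid_t reap_with_timeout(pid_t pid, int *status, long timeout_ms)
{
    const struct timespec step = { 0, 10 * 1000 * 1000 };  /* 10 ms */
    long waited = 0;

    for (;;) {
        pid_t r = waitpid(pid, status, WNOHANG);

        if (r < 0 && errno == EINTR)
            continue;               /* interrupted by a signal: retry */
        if (r != 0)
            return r;               /* reaped the child, or a real error */
        if (waited >= timeout_ms)
            return 0;               /* child is still running: give up */
        nanosleep(&step, NULL);     /* short pause before polling again */
        waited += 10;
    }
}
```

Unlike the sleep(1), this returns as soon as the child has exited and
delays the shutdown only when the child is slow. Whether a bounded wait
of this kind is acceptable inside sshd is a separate question.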

Best regards,
	Lutz
-- 
Lutz Jaenicke                             Lutz.Jaenicke at aet.TU-Cottbus.DE
BTU Cottbus               http://www.aet.TU-Cottbus.DE/personen/jaenicke/
Lehrstuhl Allgemeine Elektrotechnik                  Tel. +49 355 69-4129
Universitaetsplatz 3-4, D-03044 Cottbus              Fax. +49 355 69-4153


