Call for testing: OpenSSH 6.8

Corinna Vinschen vinschen at redhat.com
Mon Feb 23 21:33:44 AEDT 2015


Hi Damien,

On Feb 23 10:28, Corinna Vinschen wrote:
> On Feb 22 07:59, Damien Miller wrote:
> > On Sat, 21 Feb 2015, Corinna Vinschen wrote:
> > > - The failing last loop in the "forwarding" script as reported back
> > >   during 6.7 testing is still failing for me more often than not.  It's
> > >   always the same reason, the script tries to use in-use port numbers.
> > >   Reducing the forwarding script to only this last test loop succeeds
> > >   every time, but is quite a hack for testing.
> > 
> > Is it colliding with itself or with other services running on your
> > test host? (especially with ones windows starts itself)
> 
> It's colliding with itself, afaics.  It tries to use ports 3301/3302
> again after it already used it in a former test in the script, and with
> a high probability they are still taken.  The same ports are used for
> earlier tests in the same script.  On the testmachine, only very few
> ports are taken by Windows processes and only one of them (RDP, port
> 3389) in the range used by the tests.
> 
> Note that this is still the same as described in
> http://lists.mindrot.org/pipermail/openssh-unix-dev/2014-August/032842.html
> 
> See also
> http://lists.mindrot.org/pipermail/openssh-unix-dev/2014-August/032854.html
> 
> And this workaround still lets the test succeed:
> http://lists.mindrot.org/pipermail/openssh-unix-dev/2014-August/032862.html

I think I'm a step closer to a solution.

I just added debug output to the forwarding.sh script and it turns out
that the test prior to the "transfer over chained unix domain socket
forwards ..." test, namely

  echo "LocalForward ${base}01 127.0.0.1:$PORT" >> $OBJ/ssh_config
  echo "RemoteForward ${base}02 127.0.0.1:${base}01" >> $OBJ/ssh_config
  for p in 1 2; do
	  trace "config file: start forwarding, fork to background"
	  ${SSH} -$p -F $OBJ/ssh_config -f somehost sleep 10

	  trace "config file: transfer over forwarded channels and check result"
	  ${SSH} -F $OBJ/ssh_config -p${base}02 -o 'ConnectionAttempts=4' \
		  somehost cat ${DATA} > ${COPY}
	  test -s ${COPY}         || fail "failed copy of ${DATA}"
	  cmp ${DATA} ${COPY}     || fail "corrupted copy of ${DATA}"

	  wait
  done

leaves the ports 3301/3302 in TIME_WAIT state (as is 4242 from some
earlier test).  Here are the relevant excerpts from ps -e and (Windows)
netstat output.  The first group is the output prior to the above test:

      PID    PPID    PGID     WINPID   TTY     UID    STIME COMMAND
     1320    3632    1320       3964  ?       1049577 11:19:26 /home/corinna/sshbuild/bin/sshd
     3512    3632    3512       3444  ?       1049577 11:19:27 /home/corinna/sshbuild/bin/sshd
     3632       1    3632       3632  ?       1049577 11:17:49 /home/corinna/sshbuild/bin/sshd

Active Connections

  Proto  Local Address          Foreign Address        State           PID
  TCP    127.0.0.1:4242         0.0.0.0:0              LISTENING       3632
  TCP    127.0.0.1:61665        127.0.0.1:4242         TIME_WAIT       0
  TCP    127.0.0.1:61666        127.0.0.1:61667        TIME_WAIT       0
  TCP    127.0.0.1:61668        127.0.0.1:61669        TIME_WAIT       0
  TCP    127.0.0.1:61673        127.0.0.1:4242         TIME_WAIT       0
  TCP    127.0.0.1:61674        127.0.0.1:61675        TIME_WAIT       0
  TCP    127.0.0.1:61676        127.0.0.1:61677        TIME_WAIT       0
  TCP    127.0.0.1:61679        127.0.0.1:4242         TIME_WAIT       0
  TCP    127.0.0.1:61680        127.0.0.1:61681        TIME_WAIT       0
  TCP    127.0.0.1:61682        127.0.0.1:61683        TIME_WAIT       0
  TCP    127.0.0.1:61686        127.0.0.1:4242         TIME_WAIT       0
  TCP    127.0.0.1:61687        127.0.0.1:61688        TIME_WAIT       0
  TCP    127.0.0.1:61689        127.0.0.1:61690        TIME_WAIT       0
  TCP    127.0.0.1:61691        127.0.0.1:4242         TIME_WAIT       0
  TCP    127.0.0.1:61692        127.0.0.1:61693        TIME_WAIT       0
  TCP    127.0.0.1:61694        127.0.0.1:61695        TIME_WAIT       0

the second group is the output from right between the above test and the
"transfer over chained unix domain socket forwards..."

      PID    PPID    PGID     WINPID   TTY     UID    STIME COMMAND
     3112     376     376       3212  ?       1049577 11:19:29 /usr/bin/sleep
     3520     388     388       3716  ?       1049577 11:19:32 /usr/bin/sleep
     3056    3632    3056       2100  ?       1049577 11:19:31 /home/corinna/sshbuild/bin/sshd
     4048       1    4048       4048  ?       1049577 11:19:31 /home/corinna/sshbuild/bin/ssh
     2372    3632    2372       4024  ?       1049577 11:19:33 /home/corinna/sshbuild/bin/sshd
     2728       1    2728       2728  ?       1049577 11:19:28 /home/corinna/sshbuild/bin/ssh
     2908    3632    2908        328  ?       1049577 11:19:28 /home/corinna/sshbuild/bin/sshd
     3632       1    3632       3632  ?       1049577 11:17:49 /home/corinna/sshbuild/bin/sshd

Active Connections

  Proto  Local Address          Foreign Address        State           PID
  TCP    127.0.0.1:3301         0.0.0.0:0              LISTENING       2640
  TCP    127.0.0.1:3301         127.0.0.1:61714        CLOSE_WAIT      2640
  TCP    127.0.0.1:3302         0.0.0.0:0              LISTENING       328
  TCP    127.0.0.1:3302         127.0.0.1:61713        CLOSE_WAIT      328
  TCP    127.0.0.1:4242         0.0.0.0:0              LISTENING       3632
  TCP    127.0.0.1:4242         127.0.0.1:61696        ESTABLISHED     3632
  TCP    127.0.0.1:4242         127.0.0.1:61708        ESTABLISHED     3632
  TCP    127.0.0.1:4242         127.0.0.1:61715        CLOSE_WAIT      3632
  TCP    127.0.0.1:61673        127.0.0.1:4242         TIME_WAIT       0
  TCP    127.0.0.1:61679        127.0.0.1:4242         TIME_WAIT       0
  TCP    127.0.0.1:61680        127.0.0.1:61681        TIME_WAIT       0
  TCP    127.0.0.1:61682        127.0.0.1:61683        TIME_WAIT       0
  TCP    127.0.0.1:61686        127.0.0.1:4242         TIME_WAIT       0
  TCP    127.0.0.1:61687        127.0.0.1:61688        TIME_WAIT       0
  TCP    127.0.0.1:61689        127.0.0.1:61690        TIME_WAIT       0
  TCP    127.0.0.1:61691        127.0.0.1:4242         TIME_WAIT       0
  TCP    127.0.0.1:61692        127.0.0.1:61693        TIME_WAIT       0
  TCP    127.0.0.1:61694        127.0.0.1:61695        TIME_WAIT       0
  TCP    127.0.0.1:61696        127.0.0.1:4242         ESTABLISHED     2640
  TCP    127.0.0.1:61697        127.0.0.1:61698        TIME_WAIT       0
  TCP    127.0.0.1:61699        127.0.0.1:61700        TIME_WAIT       0
  TCP    127.0.0.1:61701        127.0.0.1:3302         TIME_WAIT       0
  TCP    127.0.0.1:61702        127.0.0.1:3301         TIME_WAIT       0
  TCP    127.0.0.1:61703        127.0.0.1:4242         TIME_WAIT       0
  TCP    127.0.0.1:61704        127.0.0.1:61705        TIME_WAIT       0
  TCP    127.0.0.1:61706        127.0.0.1:61707        TIME_WAIT       0
  TCP    127.0.0.1:61708        127.0.0.1:4242         ESTABLISHED     2988
  TCP    127.0.0.1:61709        127.0.0.1:61710        TIME_WAIT       0
  TCP    127.0.0.1:61711        127.0.0.1:61712        TIME_WAIT       0
  TCP    127.0.0.1:61713        127.0.0.1:3302         FIN_WAIT_2      3380
  TCP    127.0.0.1:61714        127.0.0.1:3301         FIN_WAIT_2      2728
  TCP    127.0.0.1:61715        127.0.0.1:4242         FIN_WAIT_2      328
  TCP    127.0.0.1:61716        127.0.0.1:61717        TIME_WAIT       0
  TCP    127.0.0.1:61718        127.0.0.1:61719        TIME_WAIT       0

This may well be a problem local to Windows.  Btw., the large number of
AF_INET sockets is a result of the way how Cygwin implements AF_LOCAL
sockets:  They are emulated by local AF_INET sockets since WIndows
doesn't know the concept of AF_LOCAL sockets.

Note that there are still sleep processes running.  So on a hunch I just
added a `sleep 30' between the two tests and, lo and behold, the
forwarding.sh test completes successfully every time:

diff --git a/regress/forwarding.sh b/regress/forwarding.sh
index f799d49..1489407 100644
--- a/regress/forwarding.sh
+++ b/regress/forwarding.sh
@@ -120,6 +120,8 @@ for p in 1 2; do
 	wait
 done
 
+sleep 30
+
 for p in 2; do
 	trace "transfer over chained unix domain socket forwards and check result"
 	rm -f $OBJ/unix-[123].fwd


Thanks,
Corinna


-- 
Corinna Vinschen
Cygwin Maintainer
Red Hat
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 819 bytes
Desc: not available
URL: <http://lists.mindrot.org/pipermail/openssh-unix-dev/attachments/20150223/ba968482/attachment-0001.bin>


More information about the openssh-unix-dev mailing list