Test port not available

Tue May 5 09:48:40 EST 2009

While building OpenSSH 5.2p1, "make tests" was failing on my system with 
the error "no sshd running on port 4242".  After much head scratching, 
cursing and rooting around in the test scripts I finally figured out the 
real cause... something is already running on port 4242 (in my case, the 
Juniper Network Connect client).

This got me thinking: it might be nice if the code used to set the port 
were a bit more resilient and/or the error message a bit more informative.

On the first point, maybe this code in test-exec.sh:

   if [ ! -z "$TEST_SSH_PORT" ]; then
     PORT="$TEST_SSH_PORT"
   else
     PORT=4242
   fi

could be something similar to:

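   if [ ! -z "$TEST_SSH_PORT" ]; then
     # An explicitly requested port always wins.
     PORT="$TEST_SSH_PORT"
   else
     # Otherwise scan a small range for the first port not in use.
     first_port=4200
     last_port=4300

     test_port=$first_port
     while [ "$test_port" -le "$last_port" ]; do
       # A port counts as free if no netstat line mentions ":port " or ".port ".
       netstat -na | grep "[:.]$test_port " >/dev/null 2>&1 || { PORT=$test_port; break; }
       test_port=`expr $test_port + 1`
     done

     # PORT is still empty if every port in the range was taken.
     if [ -z "$PORT" ]; then
       echo "Unable to find usable test port between $first_port and $last_port.  Define \$TEST_SSH_PORT."
       exit 2
     fi
   fi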

I realize the 'netstat | grep' command above may not be the best or most 
portable way to look for an available port, but the idea is the same: 
run something that checks whether the requested port is available. 
FYI, I did run the above code on several boxes (Solaris, HP-UX, 
Mac OS X, Fedora, Red Hat, CentOS, Cygwin) and it seemed to work fine.
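
One possible refinement of the netstat check, as a rough sketch only: the 
grep above also matches ports that merely appear in established 
connections, so a port could be skipped unnecessarily.  The helper below 
(port_is_listening is a made-up name, not something in test-exec.sh) 
assumes the local netstat prints LISTEN or LISTENING in its state column, 
which I have not verified on every platform.

```shell
# port_is_listening PORT
# Hypothetical helper, not part of test-exec.sh.  Reads `netstat -na` style
# output on stdin and succeeds if a listening socket appears to use PORT.
# Assumption: the state column contains the text LISTEN (or LISTENING).
port_is_listening() {
    # First grep keeps listening sockets only; the second looks for
    # ":PORT " or ".PORT " on the remaining lines.
    grep 'LISTEN' | grep "[:.]$1 " >/dev/null 2>&1
}
```

It would be used as "netstat -na | port_is_listening $test_port" in place 
of the plain grep inside the loop.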

Another possible way to check would be to run sshd -D and look at the 
output.  In my case it says:

   Bind to port 4242 on 127.0.0.1 failed: Address already in use.
   Cannot bind any address.

Which (again, in my particular case) would have been a much more helpful 
error message than "no sshd running on port 4242".

Of course if it works, then a separate ssh command would need to run to 
close it... or I suppose just kill it.
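
As a rough sketch of the "look at the output" idea: the helper below only 
scans captured sshd output for the bind failure message quoted above.  The 
name bind_failed is made up, and how the output gets captured (and how the 
test sshd is stopped afterwards) is deliberately left out.

```shell
# bind_failed PORT
# Hypothetical helper, not part of the OpenSSH test suite.  Reads captured
# sshd output on stdin and succeeds if it reports a failed bind on PORT.
# Assumption: the message has the form shown above,
#   "Bind to port 4242 on 127.0.0.1 failed: Address already in use."
bind_failed() {
    grep "Bind to port $1 on .* failed" >/dev/null 2>&1
}
```

The test script could then print something like "port $PORT is already in 
use" instead of the generic "no sshd running" message.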

Or... maybe the sshd -t option could be enhanced (or a new option created) 
that actually checks to see if the port is available, but doesn't actually 
start a server.  I suspect this really means it would try to bind to that 
port and then just close if it works, otherwise report the error. 
Problem is, on some systems I believe it can take a while before a port can 
be reused... so a passive check would be better.

If the idea of checking for multiple ports isn't acceptable, then what 
about at least enhancing the error message a bit:

    no sshd running on port 4242, try setting $TEST_SSH_PORT to a different port

PS. As a side note... while reading the code I noticed I could set 
TEST_SSH_LOGFILE to a filename to capture the output of various commands, 
instead of having all of it sent to /dev/null.  Is there a particular 
reason a logfile isn't created by default?

I mean, in the grand scheme of things, is the output produced in this file 
really that large or for some other reason unwanted?  When I first came 
across the initial error I looked for a test log file... something that 
might contain a bit more info on why the failure occurred.  It wasn't until 
I couldn't find anything and had to resort to reading the test scripts 
that I discovered a log file is created if TEST_SSH_LOGFILE is defined.