Call for testing: OpenSSH 8.7

Tom G. Christensen tgc at jupiterrise.com
Wed Aug 18 22:28:56 AEST 2021


On 18/08/2021 11:02, Darren Tucker wrote:
> I have not been able to reproduce this.  I've tried:
>   - disabling HAVE_PSELECT on a Linux system,
>   - disabling HAVE_PSELECT on a 32bit Solaris 10 VM
>   - disabling HAVE_PSELECT on a 64bit Solaris 11 VM
>   - restoring an old Solaris 7 backup onto a qemu 32bit sparc VM
> 
> Can I get some more details?  Compiler, OpenSSL version, configure 
> options, exact command used to invoke the test?  Oh, and are they 
> multiprocessor systems (maybe it's a race)?
> 

The Solaris 7 system has:
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/tgcware/libexec/gcc/sparc-sun-solaris2.7/4.5.4/lto-wrapper
Target: sparc-sun-solaris2.7
Configured with: ../gcc-4.5.4/configure --enable-obsolete 
--prefix=/usr/tgcware --with-local-prefix=/usr/tgcware/gcc45 
--bindir=/usr/tgcware/gcc45/bin --mandir=/usr/tgcware/gcc45/man 
--infodir=/usr/tgcware/gcc45/info --disable-nls --enable-shared 
--enable-threads=posix --with-gmp=/usr/tgcware --with-mpfr=/usr/tgcware 
--with-mpc=/usr/tgcware --with-cloog=/usr/tgcware 
--with-ppl=/usr/tgcware --without-gnu-ld --with-ld=/usr/ccs/bin/ld 
--with-gnu-as --with-as=/usr/tgcware/bin/gas 
--enable-languages=all,ada,obj-c++ --with-x --enable-java-awt=xlib 
--with-cpu=v7
Thread model: posix
gcc version 4.5.4 (tgcware 4.5.4-2)

$ openssl version
OpenSSL 1.0.2u  20 Dec 2019
$ file /usr/tgcware/lib/libssl.so
/usr/tgcware/lib/libssl.so:     ELF 32-bit MSB dynamic lib SPARC32PLUS 
Version 1, V8+ Required, dynamically linked, stripped
$

The system is a multi-processor system with 4x336MHz US-II CPUs.

The Solaris 9 system has:
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/tgcware/libexec/gcc/sparc-sun-solaris2.9/4.9.4/lto-wrapper
Target: sparc-sun-solaris2.9
Configured with: ../gcc-4.9.4/configure --enable-obsolete 
--prefix=/usr/tgcware --with-local-prefix=/usr/tgcware/gcc49 
--bindir=/usr/tgcware/gcc49/bin --mandir=/usr/tgcware/gcc49/man 
--infodir=/usr/tgcware/gcc49/info --disable-nls --enable-shared 
--enable-threads=posix --with-gmp=/usr/tgcware --with-mpfr=/usr/tgcware 
--with-mpc=/usr/tgcware --with-cloog=/usr/tgcware 
--with-isl=/usr/tgcware --with-cloog-backend=isl --without-gnu-ld 
--with-ld=/usr/ccs/bin/ld --with-gnu-as --with-as=/usr/tgcware/bin/gas 
--enable-languages=all,ada,obj-c++,go --with-x --enable-java-awt=xlib 
--with-cpu=v9 --with-pkgversion='tgcware 4.9.4-1' 
--with-bugurl=http://jupiterrise.com/tgcware
Thread model: posix
gcc version 4.9.4 (tgcware 4.9.4-1)

$ openssl version
OpenSSL 1.1.1k  25 Mar 2021
$ file /usr/tgcware/lib/libssl.so
/usr/tgcware/lib/libssl.so:     ELF 32-bit MSB dynamic lib SPARC32PLUS 
Version 1, V8+ Required, dynamically linked, stripped
$

The OS is running in a branded zone under Solaris 10, and the host system 
is a multi-processor system with 4x900MHz US-III+ CPUs.

On both systems for the purposes of testing I am building openssh like this:
./configure CC=gcc LDFLAGS="-L/usr/tgcware/lib -R/usr/tgcware/lib" \
    CPPFLAGS="-I/usr/tgcware/include" --prefix=/tmp/ossh
make -j4

I then run the test suite with 'make tests' or, for just the rekey tests, 
with 'make tests LTESTS=rekey SKIP_UNIT=1'.

I don't have any single-processor SPARC systems I can test with, but I 
can take CPUs offline. I just did that on the Solaris 7 system, and with 
only a single CPU online and no revert applied, the rekey test ran to 
completion with no hangs.
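
For anyone wanting to repeat the single-CPU experiment, here is a minimal 
sketch. The message above does not say how the CPUs were taken offline; 
this assumes the standard Solaris psradm(1M) tool ('psradm -f ID' takes a 
processor offline, 'psradm -n ID' brings it back) and root privileges. The 
helper names run and offline_all_but_first are hypothetical, and DRY_RUN 
defaults to 1 so that nothing is changed unless it is explicitly disabled.

```shell
#!/bin/sh
# Sketch only: offline every processor except the first one listed.
# Assumption: Solaris psradm(1M) is available and we are running as root.
# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN=${DRY_RUN:-1}

# run CMD...: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# offline_all_but_first ID...: keep the first processor ID online and
# take every remaining ID offline.
offline_all_but_first() {
    shift
    for id in "$@"; do
        run psradm -f "$id"
    done
}
```

A possible invocation would be 'offline_all_but_first 0 1 2 3', with the 
processor IDs taken from the first column of psrinfo output, followed by 
'make tests LTESTS=rekey SKIP_UNIT=1' and then 'psradm -n ID' for each 
processor to bring it back online.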


> Also a copy of the ssh.log and sshd.log from a hung instance (off-list 
> is fine)?
> 

This is from the Solaris 9 system with all 4 cpus online.
It hung almost immediately:
make[1]: Entering directory 
`/export/home/tgc/buildpkg/openssh/src/openssh-git/regress'
run test rekey.sh ...
client rekey KexAlgorithms=diffie-hellman-group1-sha1
client rekey KexAlgorithms=diffie-hellman-group14-sha1
client rekey KexAlgorithms=diffie-hellman-group14-sha256

At this point all ssh(d) processes are idle.

I've uploaded the logs here:
https://jupiterrise.com/tmp/?C=M;O=D
They should be at the top of the list.

>>     the other, and then a <defunct> child of the still running child.
>>     With truss I see that the client is still doing poll().
> 
> if you truss the sshd that's still alive and hung what's it doing?
> 

From ps, these are the relevant processes:

  F   UID   PID  PPID %C PRI NI   SZ  RSS    WCHAN S TT        TIME COMMAND
  0  3000 27640 27639  0  59 20 5376 3040 300d5f53020 S pts/13    0:00 
/bin/bash 
/export/home/tgc/buildpkg/openssh/src/openssh-git/regress/test-exec.sh 
/export/home/tgc/buildpkg/openssh/src/openssh-git/regress 
/export/home/tgc/buildpkg/openssh/src/openssh-git/regress/rekey.sh
  0  3000 27772 27640  0  59 20 7200 4640 301226e40b2 S pts/13    0:02 
/export/home/tgc/buildpkg/openssh/src/openssh-git/ssh 
-E/export/home/tgc/buildpkg/openssh/src/openssh-git/regress/ssh.log 
-oRekeyLimit=256k -oCompression=no -v -F 
/export/home/tgc/buildpkg/openssh/src/openssh-git/regress/ssh_proxy 
somehost cat > 
/export/home/tgc/buildpkg/openssh/src/openssh-git/regress/copy
  0  3000 27773 27772  0  59 20 7336 4696 30043c38b02 S pts/13    0:00 
/export/home/tgc/buildpkg/openssh/src/openssh-git/sshd -i -f 
/export/home/tgc/buildpkg/openssh/src/openssh-git/regress/sshd_proxy 
-E/export/home/tgc/buildpkg/openssh/src/openssh-git/regress/sshd.log
  0  3000 27775 27773  0  59 20 7616 2512 300f315a502 S pts/13    0:00 
/export/home/tgc/buildpkg/openssh/src/openssh-git/sshd -i -f 
/export/home/tgc/buildpkg/openssh/src/openssh-git/regress/sshd_proxy 
-E/export/home/tgc/buildpkg/openssh/src/openssh-git/regress/sshd.log
  0  3000 27776 27775  0   0       0    0                   Z  0:00 
<defunct>


Not much to see with truss:

$ truss -p 27772
poll(0xFFBFCD28, 1, -1)         (sleeping...)

$ truss -p 27773
poll(0xFFBFDB5C, 1, -1)         (sleeping...)

$ truss -p 27775
poll(0xFFBFD8C8, 1, -1)         (sleeping...)


-tgc


More information about the openssh-unix-dev mailing list