Debugging ssh-keygen dsa on Solaris8

Scott Burch somar at tiny.net
Thu Jul 26 01:01:05 EST 2001


Lutz,

Responding to some of your statements:

1) Both openssl 0.9.6b and openssh 2.9p2 were compiled by myself on
Solaris 8 as 32 bit binaries, I did not compile them for 64 bit.
(The exact same compiler and compiler flags were used...the only extra
flag I used was -g so I could use dbx).
2) If I use 0.9.5a of openssl I do not have any of these problems with
the exact same compiler and options. (This is why I think some code
change is causing this problem.)
3) I can't get this to work on any of our machines, however we do use
custom jumpstart for all of our systems....and maybe something in our
jumpstart image is causing this problem. I may have to build a new
jumpstart image from scratch from the April 2001 Solaris 8 release and
see if that makes a difference. This also happens on our 2.6 machines
(which are also built from jumpstart).

Lukas Ruf reported similar problems. He has a better situation than me,
however....he says compiles work fine on every system except one...in my
case I can't get the DSA stuff to work on any of the machines I am
working with unless I use openssl 0.9.5a. Unless anyone else has any
other ideas..I will try a new jumpstart image and see if that makes a
difference. I would ultimately like to figure out what in Solaris is
causing this not to work....it's starting to seem like some patch on
our systems is causing the problem, but I'm at a loss as to what it
might be...the system I am building on has all the recommended patches
from July 11th and is based on the January 2001 Solaris 8 release.

-Scott

On Tue, Jul 24, 2001 at 02:58:31PM -0500, Scott Burch wrote:
> (If there is anything else I can do to help let me know. The system is
> 5.8 Generic_108528-08 with the recommended patch cluster from July 11th.
> This is an Ultra10 workstation) I also have the same problem using gcc
> 2.95.3 on Solaris 8 and Solaris 2.6.

You are receiving a BUS error, which means that something is not properly
aligned (e.g. something is on an "odd" 4-byte boundary while it should be
on an 8-byte boundary).
Please understand that I don't have Solaris around, so I can only give
you a wild guess. I would think that the OpenSSL library was compiled
with some "64bit-alignment flag" (or maybe for some 64bit processor),
while OpenSSH was compiled without this flag. Hence the members of the
structure are not properly aligned and it will later fail in the OpenSSL
library.

> Reading ssh-keygen
...
> program terminated by signal BUS (invalid address alignment)
> Current function is DSA_new_method (optimized)
>   127         ret->flags=ret->meth->flags;
> (/opt/SUNWspro/bin/../WS6U2/bin/sparcv9/dbx)
> where
> =>[1] DSA_new_method(meth = ???) (optimized), at 0x4b6b8 (line ~127) in
> "dsa_lib.c"

At this point a member of a structure is accessed. malloc() always
returns memory aligned for the worst case, so the access can only fail
if a member within the structure is not properly aligned.

In any case: if the alignment (and hence the position) of members in
a structure is wrong, the program must fail anyway, as the routines
accessing the members will pick up wrong data.

Best regards,
        Lutz



More information about the openssh-unix-dev mailing list