openssh and defensive programming (or lack thereof)

Wed Dec 19 15:43:09 EST 2001

On Tue, 18 Dec 2001, James Ralston wrote:

> Before I start, a disclaimer: I've spent (literally) a few weeks
> pondering this issue and how to best raise it (where I define "best"
> as meaning "most likely to lead to a productive dialog" instead of
> "most likely to fill the openssh-unix-dev mailing list with random
> flamage").  Despite that, some of what I say will no doubt be
> abrasive, or perhaps even offensive.  That's not my intent.  My intent
> is to help make openssh a better, more secure software product.
>
> On Wed, 14 Nov 2001, Markus Friedl wrote:
>
> >> ------- Additional Comments From ralston at pobox.com  2001-11-13
> >> 15:59 ------- The question is irrelevant; regardless of how one
> >> chooses to answer it, the answer does not make sshd's behavior (of
> >> not making sure all inherited descriptors are closed) any less
> >> broken.
> >
> > no, i don't understand. if the program calling sshd breaks because
> > sshd does not close fd 142 on startup, then the program is broken.
> > it must close its filedescriptors.
>
> Ok, fine.  Let's declare that a startup procedure/program that leaves
> descriptors other than [0,1,2] open when it invokes sshd is broken.
>

First off, let me make a few things crystal clear:

1) If Markus applied every patch for every broken OS to the OpenBSD code,
it would be impossible to audit, and we would be back in the dark ages of
the SSH 1.2.x days, where hacks were piled on hacks until the code was so
bad that no one understood why it worked or did not work.  (Which is why
the original hack to v1 was applied: no one took the @#$%^&* time to
understand or care!)  [There is a reason we have 'portable' releases.]

2) This is an issue with how each OS handles tty and file descriptors.
That is (I will freely admit) not something I fully grasp on a programming
level, but that still does not invalidate what Markus and others have
pointed out as the core issue.

3) I'm all for defensive programming when it is correct, but applying a
bandaid to an application and proclaiming 'This is defensive programming'
is pure and utter horse crap.  And to steal a line from The Princess Bride,
"Anyone telling you differently is selling something."

No one (including myself) has ever gotten to the HEART of the issue (even
though Markus has brought it up over and over).  It keeps being skirted by
everyone who has been bitching about it.  *WHY* is this occurring on some
platforms?  *WHAT* are we *NOT* doing *CORRECTLY* to have it work on some
systems, and fail on others?

No one has yet come up with answers to such questions.  I repeat: *NO
ONE*.  Of course, people have suggested bandaids or what they would 'like'
OpenSSH to do, but that does not solve the bug.  It just covers it up.
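
For concreteness, the two positions argued in this thread can be sketched
in C.  This is only an illustration under stated assumptions, not code from
OpenSSH or from any vendor's startup script: set_cloexec() and
close_fd_range() are hypothetical helper names, and the highfd bound is an
assumed parameter that real code would have to choose for itself.  The
first helper is what a well-behaved caller could do so that a descriptor is
not inherited across exec; the second is roughly the kind of "close
everything above stderr" step that has been requested of sshd.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Caller side: mark fd close-on-exec so it is not passed on to a
 * program started with exec*().  Returns 0 on success, -1 on error
 * (errno is set by fcntl).
 */
int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1 ? -1 : 0;
}

/*
 * Daemon side: close every descriptor in [lowfd, highfd].  Errors
 * (typically EBADF for descriptors that were never open) are ignored.
 * highfd is an assumed, caller-chosen upper bound.
 */
void close_fd_range(int lowfd, int highfd)
{
    int fd;

    for (fd = lowfd; fd <= highfd; fd++)
        (void)close(fd);
}
```

A startup program following the first approach would call set_cloexec() on
each descriptor it opens before exec'ing the daemon.  A daemon taking the
second approach would call something like close_fd_range(3, limit) early in
main().  Note that the second approach hides a leaking caller rather than
explaining why the leak causes trouble on one platform and not another.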

Why can I take MySQL right from the raw .tar.gz file, steal the startup
script from Red Hat, and not have a problem restarting it on my Sparc SS20
running OpenBSD?  Yet when I compile the *SAME* code on Red Hat/Mandrake/etc.
using the *SAME* startup script, it hangs.  Are we calling vhangup() in the
wrong place?  Are we not following SysV rules for process groups?  SOMEONE
GIVE ME A NON-BULLSHIT reason, and not 'Because telnet does not do it',
which is not a valid reason.
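
One way to start collecting evidence for questions like these is to have
the daemon report what it actually inherited at startup, and then compare
the output between a platform where it works and one where it hangs.  The
following is a minimal diagnostic sketch, not anything that exists in
OpenSSH: fd_is_open() and report_open_fds() are hypothetical names, and the
[lowfd, highfd] range is an assumption the person debugging would pick.  It
does not explain the hang by itself; it only shows which descriptors are
open and whether each one is a tty.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Return 1 if fd refers to an open descriptor, 0 otherwise. */
int fd_is_open(int fd)
{
    return fcntl(fd, F_GETFD) != -1;
}

/*
 * Write one line to `out` for every open descriptor in [lowfd, highfd],
 * noting whether it is a terminal.  Returns the number of open
 * descriptors found in that range.
 */
int report_open_fds(FILE *out, int lowfd, int highfd)
{
    int fd, count = 0;

    for (fd = lowfd; fd <= highfd; fd++) {
        if (!fd_is_open(fd))
            continue;
        fprintf(out, "fd %d is open%s\n", fd, isatty(fd) ? " (tty)" : "");
        count++;
    }
    return count;
}
```

Calling report_open_fds(stderr, 0, 255) at the top of main(), before the
daemon opens anything of its own, would list whatever the startup script
left open.  The bound 255 here is arbitrary and only for illustration.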

Are you starting to see the point?  If not, I suggest you re-read this
email and dig through some of Linus' rants about solving bugs and not
covering them up.  He makes a lot of very good points.

- Ben
Disclaimer: I am not a foaming-at-the-mouth Linus, Linux, or OpenBSD
fan.  Hell, I rip on every OS I use and develop on equally.  Don't
believe me?  Drop by #unixhelp on EFnet and ask around.  You'll find it is
not limited to OS rants either (try perl, php, C, stupid users, people who
don't 'RTFM' or don't care to do their research, etc).