A couple of questions about OpenSSH codebase

Fri Feb 2 10:56:38 AEDT 2024

On Fri, 2 Feb 2024, PRIVET SUNSET wrote:

> Hello!
> I'm sorry in advance if I'm asking stupid questions, this is my first time
> dealing with a development list, so please excuse me if something is wrong
> with this message...
> 
> I'm pretty interested in the OpenSSH codebase, and a couple of questions
> arose while I was investigating it, and I guess this is the place where I
> can find answers.
> 1. There are a lot of allocations, even for short lived objects like
> sshbufs and sshkeys. Creating an sshbuf always requires at least one
> allocation, two allocations if it is created with sshbuf_new(). There are a
> lot of times when they are allocated and freed within the same function.
> Same thing with bitmaps. What is the reason behind not allocating them on
> the stack?

A few reasons, there may be others:

0) most of the time performance doesn't matter
1) heap allocations are cheap. malloc()/free() are among the most highly
   optimised code there is.
2) if you avoid the raw pointer interface, the sshbuf API is quite safe
   against accidental misuse and adversarial data. It makes sense to use
   it wherever possible because it's familiar, easier to reason about and
   reduces the risk of attack.
3) stack space is limited and bad things happen when you exceed this
   limit. If you exceed the heap limit then malloc will return NULL and
   you can decide what to do next. If you blow past the end of the stack
   then you get a SEGV at best and undefined behaviour at worst.

> 2. A lot of stuff in sshbuf's functions is checked against max_size. What
> is the reason behind setting the max_size in the first place? If
> sshbuf instance is not read-only and doesn't have children, why it cannot
> allocate more memory than it's max_size?

OpenSSH doesn't _need_ to buffer a lot of data, so setting a limit defends
against mistakes that could cause ssh/sshd to consume a lot of memory.
Setting this limit much less than INT_MAX makes any integer overflow
mistakes we might have made much harder to exploit.

> 3. There are a lot of very defensive checks in sshbuf code. A lot (if not
> all) of sshbuf_* functions that take a pointer to another sshbuf first
> check it with sshbuf_check_sanity(). As far as i understand, sshbuf object
> cannot become insane since all its functions preserve all invariants. It
> also cannot become insane through client code, since its members are hidden.

They could become inconsistent because we made a mistake in one of the
paths that preserve invariants though, or by memory corruption (e.g.
rowhammer). If you look through the history of the sshbuf.c file, you'll
see a couple of cases where I did make mistakes with invariant enforcement
(thankfully not exploitable in OpenSSH).

> 4. What is the reason behind not implementing sshkey as a tagged union? I
> mean encapsulating all key-type-specific mutual exclusive members in a
> union type, which will be embedded in struct sshkey, and accessed with
> checks against key type.

C doesn't really support tagged unions, just unions where everyone agrees
to follow the tagging correctly. This is fine as long as nobody makes
a mistake and nothing scribbles over memory but breaks terribly when one
of those thing happens.

Implementing sshkey as a struct is cheap (a few more pointers and we don't
load many keys anyway), simpler to reason about and slightly safer.

> Again sorry if those are noob questions. I would be grateful if someone
> would give me any insight.

Happy to answer. If you take anything from my reply, note the number of
times that I mentioned "defending against programmer mistakes" as a
reason :)

-d