port-linux.c bug with oom_adjust_restore() - causes real bad oom_adj - which can cause DoS conditions.

Gert Doering gert at greenie.muc.de
Tue May 31 22:25:10 EST 2011


On Tue, May 31, 2011 at 12:11:13PM +0100, Cal Leeming [Simplicity Media Ltd] wrote:
> Could you point out the line of code where oom_adj_save is set to the
> original value, because I've looked everywhere, and from what I can
> tell, it's only ever set to INT_MIN, and no where else is it changed.
> (C is not my strongest language tho, so I most likely have overlooked
> something). This is where I got thrown off.

oom_adjust_setup() does this:

                if ((fp = fopen(oom_adj_path, "r+")) != NULL) {
                        if (fscanf(fp, "%d", &oom_adj_save) != 1)
                                verbose("error reading %s: %s", oom_adj_path,

the "fscanf()" call will read an integer ("%d") from the file named,
and write that number into the variable being pointed to (&oom_adj_save).

The loop is a bit tricky to read as it takes different paths into 
account, and will exit after the first successful update.

fscanf() will return the number of successful conversions, so if it
was able to read "one number", the return value is "1", and it will
jump to the else block

                        else {
                                if (fprintf(fp, "%d\n", value) <= 0)
                                        verbose("error writing %s: %s",
                                           oom_adj_path, strerror(errno));
                                        verbose("Set %s from %d to %d",
                                           oom_adj_path, oom_adj_save, value);

where it will overwrite what is in that file with the new value
("value"), and then print the "Set ... from -17 to -17" message that
you saw.

> > The question here is why sshd is sometimes started with -17 and sometimes
> > with 0 - that's the bug, not that sshd keeps what it's given.
> >
> > (Ask yourself: if sshd had no idea about oom_adj at all, would this make
> > it buggy by not changing the value?)
> This was what I was trying to pinpoint down before. I had came to this
> conclusion myself, sent it to the Debian bug list, and they dismissed
> on the grounds that it was an openssh problem...

I must admit that I have no idea what is causing it, but from the logs,
it very much looks like sshd is started with "-17" in there - but only
in the problem case.

> So far, the buck has been passed from kernel-mm to debian-kernel, to
> openssh, and now back to debian-kernel lol. The most annoying thing,
> is that you can't get this bug to happen unless you physically test on
> a machine which requires the bnx2 firmwire, so I get the feeling this
> won't get resolved :/

And *that* strongly points to a bug in the bnx2 stuff - if other programs
change their behaviour based on the existance of a given driver, that
does not smell very healthy.

> > Anyway, as a workaround for your system, you can certainly set
> >
> >  oom_adj_save = 0;
> >
> > in the beginning of port-linux.c / oom_adjust_restore(), to claim that
> > "hey, this was the saved value to start with" and "restore" oom_adj to 0
> > then - but that's just hiding the bug, not fixing it.
> I'm disappointed this wasn't the correct fix, I honestly thought I had
> patched it right :(

Well, that's the short hand - "just ignore everything that happened at
init / save time, and forcibly write back '0', no matter what was there

> But, on the other hand, ssh users should really never have a default
> oom_adj of -17, so maybe 0 should be set as default anyway? If this is
> not the case, could you give reasons why??

Well, I would say "the default value in there is a matter of local policy",
so what if someone wants to make sure that whatever is run from sshd is
always privileged regarding oom, even if a local firefox etc. is running
amock and you need to ssh-in and kill the GUI stuff...

One might opt to run sshd (and all its children) at "-5" (slightly special
treatment), or "0" (no special treatment), or even at "-17" - but that's
local policy.


Since this seems to be inherited, it might even work if you just change
the sshd startup script, and insert

  echo 0 >/proc/self/oom_adj

in there, right before it starts the sshd...  "local policy at work".


USENET is *not* the non-clickable part of WWW!
Gert Doering - Munich, Germany                             gert at greenie.muc.de
fax: +49-89-35655025                        gert at net.informatik.tu-muenchen.de

More information about the openssh-unix-dev mailing list