port-linux.c bug with oom_adjust_restore() - causes real bad oom_adj - which can cause DoS conditions.

Cal Leeming [Simplicity Media Ltd] cal.leeming at simplicitymedialtd.co.uk
Tue May 31 22:18:06 EST 2011

On Tue, May 31, 2011 at 1:11 PM, Darren Tucker <dtucker at zip.com.au> wrote:
> On Tue, May 31, 2011 at 9:11 PM, Cal Leeming [Simplicity Media Ltd]
> <cal.leeming at simplicitymedialtd.co.uk> wrote:
> [...]
>> Could you point out the line of code where oom_adj_save is set to the
>> original value, because I've looked everywhere, and from what I can
>> tell, it's only ever set to INT_MIN,
> It's read from /proc/self/oom_adj at startup time in oom_adjust_setup():
>         if ((fp = fopen(oom_adj_path, "r+")) != NULL) {
>                 if (fscanf(fp, "%d", &oom_adj_save) != 1)
> This is the reason for the "Set /proc/self/oom_adj from -17 to -17"
> message, and probably why Gert commented on it.
> Basically, sshd sets the listening process to -17, then restores
> whatever was previously set for all forked processes.  If oom_adj was
> previously -17, sshd will restore that.
> [...]
>> This was what I was trying to pin down before. I had come to this
>> conclusion myself, sent it to the Debian bug list, and they dismissed
>> it on the grounds that it was an openssh problem...
> I'd suggest "grep -rl oom_adj /etc" and see if one of your system
> startup scripts sets it.  Failing that, I'd try cold booting your
> machine without your problem module, modprobe it and check
> /proc/self/oom_adj and see if the modprobe or module loading somehow
> changes that (I can't imagine that it would, but you seem to have a
> really strange case here...).
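
For reference, the save-and-restore behaviour described in the quoted
reply can be sketched roughly as below. This is not the port-linux.c
code: the function names (sketch_adjust_setup, sketch_adjust_restore)
are made up for illustration, and the path is passed in as a parameter
(an assumption) so the logic can be tried against an ordinary file
instead of /proc/self/oom_adj.

```c
#include <limits.h>
#include <stdio.h>

/*
 * Hypothetical sketch: read the current value from `path` into *saved,
 * then overwrite it with `new_value`. Returns 0 on success, -1 on error.
 * *saved is left untouched if the value cannot be read.
 */
static int sketch_adjust_setup(const char *path, int new_value, int *saved)
{
    FILE *fp = fopen(path, "r+");

    if (fp == NULL)
        return -1;
    if (fscanf(fp, "%d", saved) != 1) {
        fclose(fp);
        return -1;
    }
    rewind(fp);                     /* required between a read and a write */
    if (fprintf(fp, "%d\n", new_value) < 0) {
        fclose(fp);
        return -1;
    }
    return fclose(fp) == 0 ? 0 : -1;
}

/*
 * Hypothetical sketch: write the previously saved value back.
 * INT_MIN is used as the "nothing was saved" marker.
 */
static int sketch_adjust_restore(const char *path, int saved)
{
    FILE *fp;

    if (saved == INT_MIN)
        return -1;                  /* nothing to restore */
    if ((fp = fopen(path, "w")) == NULL)
        return -1;
    if (fprintf(fp, "%d\n", saved) < 0) {
        fclose(fp);
        return -1;
    }
    return fclose(fp) == 0 ? 0 : -1;
}
```

The point of the sketch is that the restore step writes back whatever
was read at startup. If the file already held -17 when the setup step
ran, the restore step writes -17 again, so the forked processes keep
-17 as well.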

Oh trust me, I looked *everywhere*, even to the extent of running
tripwire from a bare-bones system and looking manually at every
change made. I also searched for loads of different keywords (-17,
oom, proc, self, etc.). Spent hours on it :/

As for the comment about the modprobe, I already did all this (full
debug can be found at
), and found that when the bnx2 module isn't loaded, the problem goes
away. When it is loaded, the problem comes back.
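
A loaded/not-loaded comparison like this only needs the value read
back from the proc file. A minimal sketch of such a check follows; the
helper name read_adj_value is hypothetical and not taken from openssh.

```c
#include <stdio.h>

/*
 * Hypothetical helper: read a single integer from a proc-style file.
 * Returns 0 and stores the value in *out on success, -1 otherwise.
 */
static int read_adj_value(const char *path, int *out)
{
    FILE *fp = fopen(path, "r");
    int ok;

    if (fp == NULL)
        return -1;
    ok = (fscanf(fp, "%d", out) == 1);
    fclose(fp);
    return ok ? 0 : -1;
}
```

Calling read_adj_value("/proc/self/oom_adj", &v) once with the module
loaded and once without, and printing v each time, gives the two
values to compare.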

This is what I mean by it being a very VERY strange problem.

My guess is that the bnx2 firmware does some sort of
kernel-to-userspace weirdness, which causes userland apps (which have
to go through the bnx2 in the network stack) to somehow inherit the
-17 that all kernel processes get. Sadly, I don't know enough about
how the kernel works to even begin to debug the problem. Plus, from
what I can tell, the firmware (*.fw) is closed source.

I have a very strong feeling that the buck will probably just get
passed around until it eventually gets forgotten about :( I wish I
knew more about kernel development to try and fix this issue!

> --
> Darren Tucker (dtucker at zip.com.au)
> GPG key 8FF4FA69 / D9A3 86E9 7EEE AF4B B2D4  37C9 C982 80C7 8FF4 FA69
>     Good judgement comes with experience. Unfortunately, the experience
> usually comes from bad judgement.
