SIGCHLD race condition? (fwd)

Paul Menage pmenage at ensim.com
Wed Sep 26 10:43:41 EST 2001


Can anyone offer any advice on this issue? We've tried patching sshd so
that select() in serverloop.c is called with a maximum timeout of 10
seconds, and this doesn't appear to have had any ill effects.

Thanks,

Paul

------- Forwarded Message

Date:    Tue, 18 Sep 2001 16:49:40 -0700
From:    Paul Menage <pmenage at ensim.com>
To:      mouring at etoh.eviladmin.org
cc:      Paul Menage <pmenage at ensim.com>, openssh-unix-dev at mindrot.org
Subject: Re: SIGCHLD race condition? 

>
>Can you test against 2.9p2 or the current snapshots? There have been some
>SIGCHLD changes since the 2.5.2pX series.
>

The signal handling strategy has changed, but the race condition in
wait_until_can_do_something(), between checking child_terminated and
calling select(), is still there.
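
To make the window concrete, here is a minimal sketch of the
check-then-block pattern. It is not the serverloop.c code: only the
flag name child_terminated is borrowed from sshd, and
check_then_select(), deliver_sigchld_now() and demo_missed_sigchld()
are hypothetical names made up for illustration. The hook stands in
for a SIGCHLD that happens to arrive after the flag test but before
select(), and the 1 second timeout is there only so the demo returns.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Flag set by the handler; the name mirrors the one used in sshd. */
static volatile sig_atomic_t child_terminated = 0;

static void sigchld_handler(int sig)
{
    (void)sig;
    child_terminated = 1;
}

/* Hypothetical stand-in for a SIGCHLD arriving inside the window. */
static void deliver_sigchld_now(void)
{
    raise(SIGCHLD);
}

/*
 * Check-then-block pattern. Returns -2 if the flag was already set,
 * otherwise whatever select() returns (0 means the timeout expired).
 */
static int check_then_select(int fd, void (*in_window)(void),
                             struct timeval *tv)
{
    fd_set readset;

    if (child_terminated)
        return -2;          /* child exit noticed, no need to block */

    if (in_window != NULL)
        in_window();        /* signal lands here: flag is set too late */

    FD_ZERO(&readset);
    FD_SET(fd, &readset);
    /* With tv == NULL this would block until fd becomes readable. */
    return select(fd + 1, &readset, NULL, NULL, tv);
}

/* Returns 1 if select() timed out while the flag sat unnoticed. */
int demo_missed_sigchld(void)
{
    int fds[2];
    int ret;
    struct timeval tv = { 1, 0 };   /* bounded only so the demo ends */

    if (pipe(fds) != 0)
        return -1;
    child_terminated = 0;
    signal(SIGCHLD, sigchld_handler);

    ret = check_then_select(fds[0], deliver_sigchld_now, &tv);

    close(fds[0]);
    close(fds[1]);
    return ret == 0 && child_terminated == 1;
}
```

The handler has already run by the time select() is entered, so
nothing interrupts the call. With a NULL timeout and no traffic on
the descriptor, it would never return.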

I can reproduce exactly the same lockup with RedHat/RawHide 2.9p2. 

Would putting a maximum timeout on select() break anything? If not,
it would at least prevent the system from deadlocking permanently,
even if it's not a very elegant fix.
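
Roughly what I have in mind, as a sketch rather than the actual
patch: watch_sigchld() and wait_bounded() are hypothetical names, and
the loop is a simplified stand-in for the real wait code. The point
is only that select() never gets a NULL timeout, so a SIGCHLD that
slips in before the call is noticed when the timeout expires and the
flag is tested again.

```c
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Set by the SIGCHLD handler, tested by the wait loop. */
static volatile sig_atomic_t child_exited = 0;

static void on_sigchld(int sig)
{
    (void)sig;
    child_exited = 1;
}

/* Install the handler and clear the flag (hypothetical helper). */
void watch_sigchld(void)
{
    child_exited = 0;
    signal(SIGCHLD, on_sigchld);
}

/*
 * Wait until fd is readable or a child exit has been noticed.
 * Returns 1 if fd is readable, 0 if a child exited, -1 on error.
 */
int wait_bounded(int fd, long max_wait_sec)
{
    for (;;) {
        fd_set readset;
        struct timeval tv;
        int ret;

        if (child_exited)
            return 0;

        /* A signal arriving here is still missed by this select()... */
        FD_ZERO(&readset);
        FD_SET(fd, &readset);
        tv.tv_sec = max_wait_sec;   /* ...but the wait is now bounded */
        tv.tv_usec = 0;

        ret = select(fd + 1, &readset, NULL, NULL, &tv);
        if (ret > 0)
            return 1;
        if (ret < 0 && errno != EINTR)
            return -1;
        /* Timeout or EINTR: go round again and retest the flag. */
    }
}
```

Calling wait_bounded(fd, 10) corresponds to the 10 second limit
mentioned above. The race window is not closed; a missed signal just
costs at most max_wait_sec seconds instead of hanging forever.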

Paul

------- End of Forwarded Message





