openssh 7.6 and 7.7 on Oracle Linux 7 (compiled from source) doesn't start correctly with systemd
Damien Miller
djm at mindrot.org
Thu Aug 23 14:01:15 AEST 2018
On Wed, 22 Aug 2018, Peter Stuge wrote:
> kevin martin wrote:
> > not sure why having the systemd notify code in openssh as a
> > configure time option would be such a bad thing.
>
> At the very least it introduces a dependency on libsystemd into sshd,
> which is undesirable for reasons of security and convenience. The
> principle of "you are done when you can not remove any more" confirms
> that it is unwise to add dependencies without very careful consideration.
>
>
> I've read through the Debian and Red Hat bug reports.
>
> There are two different but related problems here:
>
> 1. For systemctl [re]start, when a .service file has Type=simple,
> systemd assumes that service startup can never fail, and immediately
> considers this service successfully started when the exec() of sshd
> has succeeded.
>
> That's debatable design within systemd, but it's hard for systemd to
> know when a given service has actually started successfully, and
> services which fit that assumption do exist.
>
> So when sshd detects an error on startup and exits with an error code
> shortly after being started, systemd considers the service to first
> have started successfully and then to have exited with an error, so
> it then restarts the service. Repeat.
>
> When service limits are exhausted the service ends up in a failed state.
>
> Meanwhile, the systemctl [re]start command doesn't report any error
> to the administrator, because systemd considers the service to have
> [re]started successfully once. This is "error messages are lost".
>
>
> 2. For systemctl reload, systemd can and arguably should send SIGHUP
> to sshd. More uncertainty and assumptions within systemd follow;
> sshd re-execs, meaning that the PID stays the same, so systemd
> doesn't receive SIGCHLD, and so even if 1. is fixed, here systemd will
> not understand that an error during startup of the new sshd is
> to be considered a failed reload. I.e. the above problems apply here
> again. The systemctl reload sshd command is always immediately
> successful, even if the re-exec'ed sshd detects an error in the config
> file.
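
[Illustration of problem 1 above. This is a minimal sketch, not systemd or
sshd code. The stand-in "daemon" is just a Python child process that exits
with status 255, and the names FAILING_DAEMON, start_simple and
start_with_grace are hypothetical. The grace-period variant is only one
possible way to notice an early exit; it is an assumption made for contrast,
not something systemd does or that was proposed in this thread.]

```python
import subprocess
import sys

# Stand-in for a daemon that detects a startup error (e.g. a bad config)
# right after exec and exits non-zero. Hypothetical, for illustration only.
FAILING_DAEMON = [sys.executable, "-c", "import sys; sys.exit(255)"]


def start_simple(argv):
    """Report success as soon as the child was spawned (exec succeeded).

    Popen raises if the exec itself fails, so reaching the return means
    only that the program was launched, not that it initialised correctly.
    """
    child = subprocess.Popen(argv)
    return child, "started"


def start_with_grace(argv, grace_seconds=5.0):
    """Treat an exit within the grace period as a failed start (assumption)."""
    child = subprocess.Popen(argv)
    try:
        status = child.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        # Still running after the grace period: consider it started.
        return child, "started"
    return child, f"failed (exit status {status})"
```

With start_simple the caller is told "started" even though the child exits
with an error a moment later, which is the "error messages are lost" case.
start_with_grace can report the failure, at the cost of delaying every
successful start by the grace period.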

Thanks for the detailed write-up, Peter.

I agree: what is happening here seems to be mostly bad assumptions and
inflexibility inside systemd.

I'm surprised that systemd made these design decisions, because sshd is
not doing anything historically unique with regard to startup or reload
behaviour and "works with existing daemons" seems to be requirement #0
if you're writing an init system.

Maybe the other daemon vendors didn't push back against this, but I'm
willing to.

-d
More information about the openssh-unix-dev mailing list