Inexplicable EOF
David Newall
openssh at davidnewall.com
Sat Sep 26 16:47:48 AEST 2020
Hi All,
I'm seeing something which I would not believe could happen had I not
seen it for myself. It's intermittent and thus not easily repeatable.
On the client, I ran "ssh -i cert -x -T -oBatchMode=yes example.com
batchjob" and ssh exited cleanly and immediately, displaying no
output. When I saw this, I repeated the command a few times, with the
same result each time.
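For anyone trying to catch this in the act, here is a minimal sketch of a wrapper that records the exit status and stderr of each run. The function name run_logged and the log path /tmp/ssh-debug.log are hypothetical, not part of the original setup. Adding -v to the command makes ssh report on stderr which address it is connecting to.

```shell
#!/bin/sh
# run_logged LOGFILE CMD [ARGS...]
# Hypothetical helper: runs CMD, appending a timestamp, the command line,
# CMD's stderr and its exit status to LOGFILE. Returns CMD's exit status.
run_logged() {
    log=$1
    shift
    printf '%s: running: %s\n' "$(date)" "$*" >> "$log"
    "$@" 2>> "$log"
    rc=$?
    printf '%s: exit status %s\n' "$(date)" "$rc" >> "$log"
    return "$rc"
}

# Example invocation (commented out; same command as above plus -v):
# run_logged /tmp/ssh-debug.log ssh -v -i cert -x -T -oBatchMode=yes example.com batchjob
```

With this in place, a run that exits "cleanly" but silently still leaves a record of its exit status and of any debug output ssh produced.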
On example.com, cert.pub is installed in .ssh/authorized_keys. Batchjob
is executable and in PATH, and starts (after #! /bin/sh) with 'echo
"`date`: $0 $*" >> /tmp/log'. Nothing appeared in /tmp/log on those
occasions when the client's ssh exited cleanly, immediately and without
displaying output. Nothing appeared in /var/log/syslog to indicate that
something was awry.
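The logging line described above, written out as a standalone sketch. The function name log_invocation and the LOG override are assumptions added for illustration; the original script writes straight to /tmp/log.

```shell
#!/bin/sh
# Hypothetical reconstruction of the first lines of batchjob.
# LOG defaults to /tmp/log as in the original; the override is an assumption
# that makes the snippet easy to try without touching the real log.
LOG=${LOG:-/tmp/log}

# Append a timestamp, the script name and the given arguments to $LOG.
log_invocation() {
    echo "$(date): $0 $*" >> "$LOG"
}

log_invocation "$@"
```

If this line runs at all, an entry is appended to the log, so an empty log suggests the script was never started on that machine.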
The one clue that I can think of, if it is a clue, is that I have two
DNS A records published for example.com, one for the primary instance
and the other for a hot-backup. Both are virtual servers hosted on
different Linux servers. The primary is running and the hot-backup is
not. DRBD is used to keep the hot-backup synchronised with the
primary. The idea is that I can switch server roles without needing to
reconfigure the client. My experience is that openssh tries all A
records in sequence until it connects, and that this is a viable strategy.
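To see how each published address behaves on its own, a sketch along these lines could be used. The function names unique_addrs and probe_all are hypothetical, and it assumes a glibc system where "getent ahosts" is available. HostKeyAlias keeps host-key checking tied to the name rather than to the raw address.

```shell
#!/bin/sh
# unique_addrs: read "getent ahosts"-style lines on stdin and print each
# address once, in first-seen order.
unique_addrs() {
    awk '!seen[$1]++ { print $1 }'
}

# probe_all HOST [SSH_OPTIONS...]
# Hypothetical helper: try one ssh connection per address that HOST
# resolves to, and report whether the remote "true" command succeeded.
probe_all() {
    host=$1
    shift
    getent ahosts "$host" | unique_addrs | while read -r addr; do
        printf '%s: ' "$addr"
        # stdin comes from /dev/null so ssh cannot swallow the address list
        ssh -o BatchMode=yes -o ConnectTimeout=5 -o HostKeyAlias="$host" \
            "$@" "$addr" true </dev/null \
            && echo "ok" || echo "failed (status $?)"
    done
}

# Example invocation (commented out):
# probe_all example.com -i cert
```

Run while the problem is occurring, this would show whether one of the two addresses accepts the connection and returns without running anything.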
I feel that openssh would print something if it were unable to get a
connection and that, having got a connection, batchjob on example.com
would write something to /tmp/log. For ssh to exit cleanly with nothing
output and nothing logged stumps me.
Example.com is x64 architecture running Ubuntu 16.04.5. It's reasonably
up-to-date except for openssh-server, which is 1:7.2p2-4ubuntu2.4.
Client is Raspberry Pi 3B+ running Raspbian 9.4. Openssh-client is
1:7.4p1-10+deb9u3.
Both machines have plenty of free memory.
It's uncommon for me to reach out for ideas, but I'm reaching out now.
Any ideas?
Thanks in advance.
David
More information about the openssh-unix-dev
mailing list