[Bug 3639] server thread aborts during client login after receiving SSH2_MSG_KEXINIT due to ssh_sandbox_violation
bugzilla-daemon at mindrot.org
bugzilla-daemon at mindrot.org
Mon Dec 18 18:00:16 AEDT 2023
https://bugzilla.mindrot.org/show_bug.cgi?id=3639
--- Comment #26 from JM <jtm.moon.forum.user+mindrot at gmail.com> ---
tl;dr a seccomp sandbox violation `20` occurs from a `read` (still).
This is just a more detailed retelling of what was previously
discussed.
Scroll to end for thoughts...
### problem specifics
Failed `read` in parent process
read(fd, s + pos, n - pos);
which is
read(5, '\014\221b', 4);
returns `0`.
Failed `read` will cause the following audit event (journalctl -f -x)
Dec 17 15:59:35 host1 kernel: audit: type=1326
audit(1702857575.824:3180): auid=4294967295 uid=107 gid=65534
ses=4294967295 pid=1920 comm="sshd"
exe="/root/Projects/openssh-9.2p1-WIP/sshd" sig=31 arch=40000028
syscall=20 compat=1 ip=0xf7afd80c code=0x0
And the same when compiled with
`CFLAGS="-DDSANDBOX_SECCOMP_FILTER_DEBUG"`
Dec 17 22:37:50 pifuboo audit[10678]: SECCOMP auid=4294967295
uid=107 gid=65534 ses=4294967295 pid=10678 comm="sshd"
exe="/root/Projects/openssh-9.2p1-WIP/sshd" sig=31 arch=40000028
syscall=20 compat=1 ip=0xf77de80c code=0x0
Dec 17 22:37:50 pifuboo audit[10678]: ANOM_ABEND auid=4294967295
uid=107 gid=65534 ses=4294967295 pid=10678 comm="sshd"
exe="/root/Projects/openssh-9.2p1-WIP/sshd" sig=31 res=1
The failed linux system call `20` is `epoll_create1` according to
`ausyscall`
$ ausyscall 20
epoll_create1
So the `read` at some point calls syscall `20`. See section "Summary
Thoughts" about this.
Here is the failed `read` call
https://github.com/openssh/openssh-portable/blob/V_9_2_P1/atomicio.c#L66
It is always the `read` call with values `fd=5`, `n=4`.
`read` returns `0`.
`errno` is not changed after `read` returns.
The stack just before the failed `read` call is:
#1 0x004701f8 in atomicio6
(f=f at entry=0xf7c65478 <read>, fd=fd at entry=5, _s=0xfffeead8,
_s at entry=0xfffeead0, n=n at entry=4, cb=cb at entry=0x0,
cb_arg=cb_arg at entry=0x0)
at atomicio.c:67
#2 0x00470284 in atomicio
(f=f at entry=0xf7c65478 <read>, fd=fd at entry=5,
_s=_s at entry=0xfffeead0, n=n at entry=4)
at atomicio.c:101
#3 0x00434520 in mm_request_receive
(sock=5, m=m at entry=0x4f2b88)
at monitor_wrap.c:149
#4 0x00431178 in monitor_read
(ssh=ssh at entry=0x4f3388, pmonitor=pmonitor at entry=0x4f2498,
ent=0x4e0114 <mon_dispatch_proto20>, pent=pent at entry=0xfffeeb78)
at monitor.c:501
#5 0x00433b5c in monitor_child_preauth
(ssh=ssh at entry=0x4f3388, pmonitor=0x4f2498)
at monitor.c:301
#6 0x0040a388 in privsep_preauth
(ssh=0x4f3388)
at sshd.c:502
#7 main
(ac=<optimized out>, av=0x4e31a0)
at sshd.c:2240
(line numbers in `atomicio.c` slightly off due to insertion of
`raise(SIGINT);`)
The debug log from the start of client connection is:
debug3: fd 5 is not O_NONBLOCK
debug1: Server will not fork when running in debugging mode.
debug3: send_rexec_state: entering fd = 8 config len 3296
debug3: ssh_msg_send: type 0
debug3: send_rexec_state: done
debug1: rexec start in 5 out 5 newsock 5 pipe -1 sock 8
process 32277 is executing new program:
/root/Projects/openssh-9.2p1/sshd
[Thread debugging using libthread_db enabled]
Using host libthread_db library
"/lib/arm-linux-gnueabihf/libthread_db.so.1".
debug3: recv_rexec_state: entering fd = 5
debug3: ssh_msg_recv entering
debug3: recv_rexec_state: done
debug2: parse_server_config_depth: config rexec len 3296
debug3: rexec:14 setting Port 55222
debug3: rexec:22 setting HostKey
/root/Projects/openssh-9.2p1/ssh_host_ecdsa_key
debug3: rexec:23 setting HostKey
/root/Projects/openssh-9.2p1/ssh_host_ed25519_key
debug3: rexec:24 setting HostKey
/root/Projects/openssh-9.2p1/ssh_host_rsa_key
debug3: rexec:45 setting AuthorizedKeysFile .ssh/authorized_keys
debug3: rexec:113 setting Subsystem sftp
/usr/libexec/sftp-server
debug1: sshd version OpenSSH_9.2, OpenSSL 1.1.1w 11 Sep 2023
debug1: private host key #0: ecdsa-sha2-nistp256
SHA256:7OUDaY7vmsaJPDkqGWPmdiw5kjY4bVSwCd94nJqT7/o
debug1: private host key #1: ssh-ed25519
SHA256:CuPO+bnbHMCkaNEybTHeYSjdNpiNdAlntO9gh0V9lxs
debug1: private host key #2: ssh-rsa
SHA256:ZYZLLhbWdOMFKDGw3pcn954Wz6RhwtDoI5WjJsZpXhk
debug1: inetd sockets after dupping: 3, 3
debug3: process_channel_timeouts: setting 0 timeouts
debug3: channel_clear_timeouts: clearing
Connection from 192.168.124.214 port 57930 on 192.168.124.214 port
55222 rdomain ""
debug1: Local version string SSH-2.0-OpenSSH_9.2
debug1: Remote protocol version 2.0, remote software version
OpenSSH_8.4p1 Raspbian-5+deb11u2
debug1: compat_banner: match: OpenSSH_8.4p1 Raspbian-5+deb11u2 pat
OpenSSH* compat 0x04000000
debug2: fd 3 setting O_NONBLOCK
debug3: ssh_sandbox_init: preparing seccomp filter sandbox
[Detaching after fork from child process 32308]
debug2: Network child is on pid 32308
debug3: preauth child monitor started
debug3: privsep user:group 107:65534 [preauth]
debug1: permanently_set_uid: 107/65534 [preauth]
debug3: ssh_sandbox_child: setting PR_SET_NO_NEW_PRIVS [preauth]
debug3: ssh_sandbox_child: attaching seccomp filter program
[preauth]
debug1: monitor_read_log: child log fd closed
debug3: mm_request_receive: entering
Dumping `/proc/$parentpid/status` just before the `read` failure shows:
Seccomp: 0
Seccomp_filters: 0
Dumping `/proc/$childpid/status` just before the `read` failure shows:
Seccomp: 3
Seccomp_filters: 1
File descriptor 5 of the parent process is a STREAM (according to
`lsof`)
$ lsof -p 11715
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE
NAME
sshd 11715 root cwd DIR 179,2 4096 2
/
sshd 11715 root rtd DIR 179,2 4096 2
/
sshd 11715 root txt REG 179,2 1318404 72912
/root/Projects/openssh-9.2p1-WIP/sshd
sshd 11715 root mem REG 179,2 42628 2737
/lib/arm-linux-gnueabihf/libnss_files-2.31.so
sshd 11715 root mem REG 179,2 116324 4913
/lib/arm-linux-gnueabihf/libgcc_s.so.1
sshd 11715 root mem REG 179,2 137364 2748
/lib/arm-linux-gnueabihf/libpthread-2.31.so
sshd 11715 root mem REG 179,2 13864 2668
/lib/arm-linux-gnueabihf/libdl-2.31.so
sshd 11715 root mem REG 179,2 1315688 2667
/lib/arm-linux-gnueabihf/libc-2.31.so
sshd 11715 root mem REG 179,2 95880 11965
/lib/arm-linux-gnueabihf/libz.so.1.2.11
sshd 11715 root mem REG 179,2 2150824 11138
/usr/lib/arm-linux-gnueabihf/libcrypto.so.1.1
sshd 11715 root mem REG 179,2 9796 2753
/lib/arm-linux-gnueabihf/libutil-2.31.so
sshd 11715 root mem REG 179,2 210340 1499
/lib/arm-linux-gnueabihf/libcrypt.so.1.1.0
sshd 11715 root mem REG 179,2 17708 14672
/usr/lib/arm-linux-gnueabihf/libarmmem-v7l.so
sshd 11715 root mem REG 179,2 146888 2655
/lib/arm-linux-gnueabihf/ld-2.31.so
sshd 11715 root 0u CHR 1,3 0t0 5
/dev/null
sshd 11715 root 1u CHR 1,3 0t0 5
/dev/null
sshd 11715 root 2u CHR 136,4 0t0 7
/dev/pts/4
sshd 11715 root 3u IPv4 8165255 0t0 TCP
localhost:55522->localhost:48024 (ESTABLISHED)
sshd 11715 root 5u unix 0x00000000c618f590 0t0 8165266
type=STREAM
### Other miscellaneous observations:
* the child process quickly becomes "defunct"
Oddly, I can see that a child process is created by debug-printing PIDs
at certain points. e.g. a debug log message prints
debug2: Network child is on pid 11719
But later, just before the failed `read`, that child process is in a
"defunct" state. e.g. command `ps -ef` shows
$ ps -ef
...
sshd: [accepted]
[sshd] <defunct>
I suspect the child process is immediately dying and the later parent
process `read` then fails.
* Years ago on this same system, I locally built 8.4p1, 8.6p1, 9.0p1,
that have run just fine.
8.4p1 was built Feb 2021
8.6p1 was built Jun 2022
9.0p1 was built Jun 2022
Yet, I downloaded those same old versions today and they failed. Each
hit the same child process abort.
* I verified the address of `f` is function `read`
(https://github.com/openssh/openssh-portable/blob/V_9_2_P1/atomicio.c#L66)
with code snippet:
if (f == read) {
debug3("read(%d, '%s', %u); (errno=%u)", fd, s + pos, n - pos,
errno);
} else {
debug3("(f=%p) (%d, '%s', %u); (errno=%u)", f, fd, s + pos, n
- pos, errno);
}
> Could you add try adding a similar printf+getpid+exit
* verified within the `sshd` process, `__NR_epoll_create1 = 357` and
`__NR_getpid = 20` via `debug3` prints, e.g. code
debug3("__NR_getpid = %d", __NR_getpid);
debug3("__NR_epoll_create1 = %d", __NR_epoll_create1);
int _pid = getpid();
debug3("getpid() = %d", _pid);
* built with `./configure --with-sandbox=no` and it runs okay (no child
process aborts)
* other sandboxes failed to compile due to missing headers or kernel
capabilities (and I didn't feel like chasing these down)
* systrace
* rlimit
* capsicum
* Various `fcntl` `GET` checks of file descriptor 5. `errno` was set to
`0` before each call to `fcntl`.
fcntl(5, F_GETFD) returned 1 (0x00000001) (errno=0)
fcntl(5, F_GETOWN_EX) returned 0 (0x00000000) (errno=0)
owner.type=0, owner.pid=0
fcntl(5, F_GETOWN) returned 0 (0x00000000) (errno=0)
fcntl(5, F_GETPIPE_SZ) returned -1 (0xffffffff) (errno=9)
fcntl(5, F_GET_SEALS) returned -1 (0xffffffff) (errno=22)
fcntl(5, F_GETLEASE ) returned 2 (0x00000002) (errno=0)
* for posterity, if anyone else can repro this,
then manually add this code in `atomicio.c` function `atomicio6` to
cause a GDB break:
if (fd == 5 && n == 4 && pos == 0 && errno == 32) {
raise(SIGINT);
}
Those are the happenstance values before the failed `read` call.
Add the prior snippet just before code:
res = (f) (fd, s + pos, n - pos);
gdb command:
$ gdb --args "$(realpath .)/sshd" -ddddd -f sshd_config
### Summary Thoughts
> So perhaps the problem here is that either it's picking up 32bit vs 64bit headers, or that the binary is some kind of 32bit compatibility mode but the sandbox is expecting the 64bit syscalls.
Looking at the above `grep -r -Ee '__NR_getpid|__NR_epoll_create1'`,
maybe the resident compiled libc
used header `/usr/include/asm-generic/unistd.h` where `#define
__NR_epoll_create1 20`,
and openssh-server builds is uses header
`/usr/include/arm-linux-gnueabihf/asm/unistd-eabi.h`
where `#define __NR_getpid (__NR_SYSCALL_BASE + 20)` (or vice-versa).
Looking at the files
$ l /usr/include/arm-linux-gnueabihf/asm/unistd-eabi.h
-rw-r--r-- 1 root root 19938 Apr 5 2023
/usr/include/arm-linux-gnueabihf/asm/unistd-eabi.h
$ l /usr/include/asm-generic/unistd.h
-rw-r--r-- 1 root root 31480 Apr 5 2023
/usr/include/asm-generic/unistd.h
Looking at the compiled `libc`
$ find /usr -name 'libc.so'
/usr/lib/arm-linux-gnueabihf/libc.so
$ ls -l /usr/lib/arm-linux-gnueabihf/libc.so
-rw-r--r-- 1 root root 289 Oct 3 12:55
/usr/lib/arm-linux-gnueabihf/libc.so
So maybe my include or library pathing is was reconfigured (messed up)
some time in October 2023 (oh man, how screwed am I?)
### in the meantime ...
I have a workaround using `--with-sandbox=no`.
If you'd like me to try something else then please let me know.
Otherwise, I've spent a fair amount of time in this rabbit hole and
need to get back to other things (i.e. other rabbit holes ).
-James Moon
(https://github.com/jtmoon79/)
(https://twitter.com/jtmoon1979/)
--
You are receiving this mail because:
You are watching someone on the CC list of the bug.
You are watching the assignee of the bug.
More information about the openssh-bugs
mailing list