[Bug 3822] New: --with-linux-memlock-onfault causes segfaults with some combinations of PAM modules

bugzilla-daemon at mindrot.org bugzilla-daemon at mindrot.org
Thu May 8 00:52:58 AEST 2025


https://bugzilla.mindrot.org/show_bug.cgi?id=3822

            Bug ID: 3822
           Summary: --with-linux-memlock-onfault causes segfaults with
                    some combinations of PAM modules
           Product: Portable OpenSSH
           Version: 10.0p2
          Hardware: Other
               URL: https://bugs.debian.org/1103418
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P5
         Component: PAM support
          Assignee: unassigned-bugs at mindrot.org
          Reporter: cjwatson at debian.org

I noticed the new --with-linux-memlock-onfault option when upgrading
Debian's OpenSSH package to 10.0, and it seemed like a good idea, so I
enabled it.
However, shortly afterwards I got a report of somewhat random but
frequent segfaults in sshd-session (https://bugs.debian.org/1103418). 
It took me a while to track this down and establish that it wasn't the
fault of a Debian patch; in fact it turns out to be reproducible with
an unmodified OpenSSH portable tree, provided that I've configured
--with-pam and --with-linux-memlock-onfault, and that I have "enough"
PAM modules enabled.  The segfaults are generally somewhere under
pam_authenticate when a PAM module tries to allocate memory.  They
usually look something like this (there are stack traces with more
detail in the linked Debian bug report - this is just for
illustration):

  __printf_buffer (libc.so.6 + 0x6261d)
  __vasprintf_internal (libc.so.6 + 0x87a6b)
  ___asprintf_chk (libc.so.6 + 0x11abef)
  n/a (pam_ecryptfs.so + 0x23d7)
  pam_sm_authenticate (pam_ecryptfs.so + 0x2ace)
  n/a (libpam.so.0 + 0x44de)
  pam_authenticate (libpam.so.0 + 0x3be3)
  n/a (/home/cjwatson/openssh/sshd-session + 0x3dbf0)

But the code in pam_ecryptfs looks innocent.  And in mlockall(2), I see
this paragraph which looks relevant:

       If MCL_FUTURE has been specified, then a later system call (e.g.,
       mmap(2), sbrk(2), malloc(3)), may fail if it would cause the
       number of locked bytes to exceed the permitted maximum (see
       below).  In the same circumstances, stack growth may likewise
       fail: the kernel will deny stack expansion and deliver a SIGSEGV
       signal to the process.

I think this is probably because RLIMIT_MEMLOCK is typically not all
that huge (8 MiB by default on Debian), and PAM modules, the libraries
they depend on, and whatever memory they allocate on the stack might
well exceed that.  Commenting out the call to memlock_onfault_setup in
platform_pre_session_start makes the segfaults go away.  I tried just
disabling MCL_FUTURE for the sshd-session process, but that didn't seem
to help.
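
To check how close a process is to the limit, something like the
following could be dropped into a small test program.  This is only a
diagnostic sketch, not OpenSSH code: memlock_soft_limit and locked_kib
are hypothetical helper names, and reading the VmLck field from
/proc/self/status assumes a Linux procfs is mounted.

```c
#include <limits.h>
#include <stdio.h>
#include <sys/resource.h>

/*
 * Return the soft RLIMIT_MEMLOCK in bytes, or -1 on error.
 * An unlimited setting (RLIM_INFINITY) is reported as LLONG_MAX.
 */
long long
memlock_soft_limit(void)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
		return -1;
	if (rl.rlim_cur == RLIM_INFINITY)
		return LLONG_MAX;
	return (long long)rl.rlim_cur;
}

/*
 * Return the VmLck value (in kB) from /proc/self/status, or -1 if
 * the file cannot be read or the field is not present.
 */
long
locked_kib(void)
{
	FILE *f;
	char line[256];
	long kib = -1;

	if ((f = fopen("/proc/self/status", "r")) == NULL)
		return -1;
	while (fgets(line, sizeof(line), f) != NULL) {
		/* Lines look like "VmLck:       0 kB". */
		if (sscanf(line, "VmLck: %ld kB", &kib) == 1)
			break;
	}
	fclose(f);
	return kib;
}
```

Printing both values before and after pam_authenticate in a test build
would show whether the locked size is approaching the limit when the
crash occurs.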

Perhaps it would make sense to lock memory in the session process only
once the session has been established (do_authenticated or server_loop2
or so)?  By that point the session process is just processing and
forwarding packets and shouldn't be doing anything like as much memory
allocation as it does while establishing a connection.  It's perhaps
not quite as good for the stated rationale of
6c49e5f7dcaf886b4a702a6c003cae9dca04d3ea (always being able to connect
even while the system is under heavy kcompactd load), but it should
still help with established connections.
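
As a rough illustration of that idea, a deferred lock could look like
the sketch below.  Everything in it is an assumption rather than
OpenSSH code: lock_memory_after_auth is a hypothetical name, the flag
combination MCL_CURRENT | MCL_FUTURE | MCL_ONFAULT is a guess at what
the configure option enables (MCL_ONFAULT needs Linux 4.4 or later),
and the point at which it would be called is left open.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: lock current and future mappings, faulting
 * pages in lazily, intended to be called only after authentication
 * has finished.  Returns 0 on success and -1 on failure; failure is
 * reported but treated as non-fatal.
 */
int
lock_memory_after_auth(void)
{
	if (mlockall(MCL_CURRENT | MCL_FUTURE | MCL_ONFAULT) != 0) {
		fprintf(stderr, "unable to lock memory: %s\n",
		    strerror(errno));
		return -1;
	}
	return 0;
}
```

Calling this after the PAM stack has run would keep the modules' own
allocations and stack use outside the MCL_FUTURE window, though
anything they left mapped would still count towards RLIMIT_MEMLOCK at
the time of the call.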

For now I'm just going to disable --with-linux-memlock-onfault, but I
thought I should let you know in case you have any other ideas.


