[Bug 3670] New: [ssh-agent] 100% CPU spin in cleanup_handler signal handler

bugzilla-daemon at mindrot.org
Sat Mar 9 07:19:41 AEDT 2024


https://bugzilla.mindrot.org/show_bug.cgi?id=3670

            Bug ID: 3670
           Summary: [ssh-agent] 100% CPU spin in cleanup_handler signal
                    handler
           Product: Portable OpenSSH
           Version: 9.6p1
          Hardware: amd64
                OS: Mac OS X
            Status: NEW
          Severity: enhancement
          Priority: P5
         Component: ssh-agent
          Assignee: unassigned-bugs at mindrot.org
          Reporter: benhamilton at google.com

On macOS 13.3, I got the following 100% CPU spin in `ssh-agent`'s
`cleanup_handler()`:

```
8438 _sigtramp  (in libsystem_platform.dylib) + 29  [0x7ff819a3e5ed]
  8438 cleanup_handler  (in ssh-agent) + 9  [0x10d0c5429]
    8438 cleanup_socket  (in ssh-agent) + 81  [0x10d0c3d11]
      8438 sshlog  (in ssh-agent) + 116  [0x10d0f3504]
        8438 sshlogv  (in ssh-agent) + 127  [0x10d0f35af]
          8438 snprintf  (in libsystem_c.dylib) + 156  [0x7ff8198c60d4]
            8438 vsnprintf_l  (in libsystem_c.dylib) + 41  [0x7ff8198c6020]
              8438 _vsnprintf  (in libsystem_c.dylib) + 256  [0x7ff8198e87ce]
                8438 __vfprintf  (in libsystem_c.dylib) + 113  [0x7ff8198b9ef7]
                  8438 localeconv_l  (in libsystem_c.dylib) + 52  [0x7ff8198be2f4]
                    8438 _os_unfair_lock_lock_slow  (in libsystem_platform.dylib) + 258  [0x7ff819a3cb67]
                      8438 _os_unfair_lock_recursive_abort  (in libsystem_platform.dylib) + 23  [0x7ff819a4237b]
```

The issue is that all work performed inside a signal handler must be
async-signal-safe. It cannot acquire mutexes or touch any global state:

https://man7.org/linux/man-pages/man7/signal-safety.7.html

In particular, `cleanup_socket()` calls `sshlog()`, which reaches
`snprintf()` via `sshlogv()`, and `snprintf()` is not async-signal-safe.

To fix this, `cleanup_handler()` should not call `cleanup_socket()`
directly. Instead, it should:

1) Set a global boolean to true
2) Signal a file descriptor which the main `ssh-agent` `poll()` loop
can use to wake up

Then, the main `poll()` loop can check if that boolean is set to true,
and if so, clean up and exit.
