r5py is a Python interface to R5, a traffic routing engine written in Java. Communication between the two languages is handled by JPype.
A simulation running in r5py was mysteriously crashing with the following error in its Java backend:
[login1:2465519:0:2465731] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x18)
==== backtrace (tid:2465731) ====
0 0x0000000000012990 __funlockfile() :0
=================================
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00001512d49d5d20 (sent by kill), pid=2465519, tid=2465731
#
# JRE version: OpenJDK Runtime Environment Temurin-21.0.7+6 (21.0.7+6) (build 21.0.7+6-LTS)
# Java VM: OpenJDK 64-Bit Server VM Temurin-21.0.7+6 (21.0.7+6-LTS, mixed mode, sharing, tiered, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# J 1982 c2 java.lang.ThreadLocal.get(Ljava/lang/Thread;)Ljava/lang/Object; java.base@21.0.7 (35 bytes) @ 0x00001512d49d5d20 [0x00001512d49d5cc0+0x0000000000000060]
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h %e" (or dumping to /rhea/scratch/brussel/000/vsc00000/core.2465519)
#
# An error report file with more information is saved as:
# /rhea/scratch/brussel/000/vsc00000/hs_err_pid2465519.log
[5.861s][warning][os] Loading hsdis library failed
#
# If you would like to submit a bug report, please visit:
# https://github.com/adoptium/adoptium-support/issues
#
Aborted (core dumped)
The logs showed that very early on, when import r5py was executed, several warnings appeared about signal handlers being modified:
Warning: SIGSEGV handler modified!
Warning: SIGILL handler modified!
Warning: SIGFPE handler modified!
Warning: SIGBUS handler modified!
Signal Handlers:
SIGSEGV: ucs_error_signal_handler in libucs.so.0, mask=00000000000000000000000000000000, flags=SA_ONSTACK|SA_SIGINFO, unblocked
*** Handler was modified!
*** Expected: javaSignalHandler in libjvm.so, mask=11100100110111111111111111111110, flags=SA_RESTART|SA_SIGINFO
SIGBUS: ucs_error_signal_handler in libucs.so.0, mask=00000000000000000000000000000000, flags=SA_ONSTACK|SA_SIGINFO, unblocked
*** Handler was modified!
*** Expected: javaSignalHandler in libjvm.so, mask=11100100110111111111111111111110, flags=SA_RESTART|SA_SIGINFO
SIGFPE: ucs_error_signal_handler in libucs.so.0, mask=00000000000000000000000000000000, flags=SA_ONSTACK|SA_SIGINFO, unblocked
*** Handler was modified!
*** Expected: javaSignalHandler in libjvm.so, mask=11100100110111111111111111111110, flags=SA_RESTART|SA_SIGINFO
SIGPIPE: javaSignalHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO, unblocked
SIGXFSZ: javaSignalHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO, unblocked
SIGILL: ucs_error_signal_handler in libucs.so.0, mask=00000000000000000000000000000000, flags=SA_ONSTACK|SA_SIGINFO, unblocked
*** Handler was modified!
*** Expected: javaSignalHandler in libjvm.so, mask=11100100110111111111111111111110, flags=SA_RESTART|SA_SIGINFO
SIGUSR2: SR_handler in libjvm.so, mask=00000000000000000000000000000000, flags=SA_RESTART|SA_SIGINFO, unblocked
SIGHUP: ucs_debug_signal_handler in libucs.so.0, mask=00000000000000000000000000000000, flags=none, unblocked
SIGINT: signal_handler in libpython3.12.so.1.0, mask=00000000000000000000000000000000, flags=SA_ONSTACK, unblocked
SIGTERM: SIG_DFL, mask=00000000000000000000000000000000, flags=none, unblocked
SIGQUIT: SIG_DFL, mask=00000000000000000000000000000000, flags=none, unblocked
SIGTRAP: SIG_DFL, mask=00000000000000000000000000000000, flags=none, unblocked
Consider using jsig library.
These warnings are generated by Java (libjvm), which detects that its handlers for SIGSEGV, SIGILL, SIGFPE and SIGBUS have been replaced by handlers from UCX (libucs). It even recommends using the jsig library to make this kind of change properly.
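When the handler dump is long, it can help to pull out only the entries owned by a given library. The following is a minimal sketch, assuming the log lines keep the "SIGNAL: handler in library, ..." layout shown above. The function name handlers_from and the sample text are illustrative and are not part of r5py, JPype or UCX.

```python
import re

# Matches lines such as:
#   SIGSEGV: ucs_error_signal_handler in libucs.so.0, mask=..., flags=...
# Lines without " in <library>," (for example SIG_DFL entries) do not match.
HANDLER_LINE = re.compile(r"^(SIG[A-Z0-9]+): (\S+) in (\S+?),")


def handlers_from(log_text, library_prefix="libucs"):
    """Return {signal: handler} for handlers living in the given library."""
    found = {}
    for line in log_text.splitlines():
        match = HANDLER_LINE.match(line.strip())
        if match is None:
            continue
        signal_name, handler, library = match.groups()
        if library.startswith(library_prefix):
            found[signal_name] = handler
    return found


# A few lines copied from the dump above, used as sample input.
sample = """\
SIGSEGV: ucs_error_signal_handler in libucs.so.0, mask=00000000000000000000000000000000, flags=SA_ONSTACK|SA_SIGINFO, unblocked
SIGPIPE: javaSignalHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO, unblocked
SIGHUP: ucs_debug_signal_handler in libucs.so.0, mask=00000000000000000000000000000000, flags=none, unblocked
SIGTERM: SIG_DFL, mask=00000000000000000000000000000000, flags=none, unblocked
"""

for signal_name, handler in handlers_from(sample).items():
    print(signal_name, handler)
```

Passing library_prefix="libjvm" instead lists the signals that Java still controls.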
We are not intentionally intercepting those signals in UCX, though; that is just its default behaviour, which is known to cause issues in other contexts (JuliaParallel/MPI.jl#337). The solution is to disable signal interception in UCX by setting the UCX_ERROR_SIGNALS environment variable to an empty value:
UCX_ERROR_SIGNALS=""
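If changing the job script or shell environment is not convenient, the same variable can be set from Python before r5py is imported. This is a minimal sketch that assumes nothing in the process has loaded libucs before this point, since UCX would otherwise already have installed its handlers. The helper name disable_ucx_error_signals is hypothetical.

```python
import os


def disable_ucx_error_signals(env=os.environ):
    """Set UCX_ERROR_SIGNALS to an empty value in the given environment."""
    env["UCX_ERROR_SIGNALS"] = ""
    return env


# Must run before anything that may load libucs or start the JVM.
disable_ucx_error_signals()

# import r5py  # import only after the variable has been set
```

Setting the variable in the shell, as shown above, remains the more robust option because it takes effect before the Python interpreter itself starts.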