Skip to content

Commit 7c1d428

Browse files
authored
feat: introduce optional handler strategy (#1027)
* reintroduce #ifdef symmetry regarding the usage of the handler_strategy (which must work on all UNIXes) and the query towards the handler_strategy option which must only work on Linux. * ensure we "leave" the signal-handler when we invoke the CLR/Mono runtime handler * ensure the page_allocator is only enabled when we have an actual native crash, we don't allocate before * continuing the signal chain at the end is something we want for both strategies, because CHAIN_AT_START will reach this execution-path only if the runtime-handler decided that it was an actual native crash. * add trace logs to at-start chaining, so we can see the behavior in the field when debugging is enabled * ensure page-allocator is only referenced on UNIXes * add integration test for managed and native crash * ignore 32-bit Linux build in the integration test * extract skip_condition to check what provokes the invalid syntax * clean up sigaltstack initialization * overwrite it only when the flag `SS_DISABLED` is set and the query didn't result in an error * if the query was successful but the flag is anything but `SS_DISABLED`, only log the size and flags of the current stack * if the query failed then log the corresponding error * clean up run assertion on output, so we can actually see what's happening. * create a non-faulty `sigaltstack` for the GHA runner * disable the test on ASAN runs since that would require an instrumented runtime :-) dotnet/runtime#13458 * Add changelog.
1 parent 67cc95c commit 7c1d428

File tree

9 files changed

+308
-33
lines changed

9 files changed

+308
-33
lines changed

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -5,6 +5,7 @@
55
**Features**:
66

77
- Provide version information for non-static Windows binaries. ([#1076](https://github.com/getsentry/sentry-native/pull/1076), [crashpad#110](https://github.com/getsentry/crashpad/pull/110))
8+
- Add an alternative handler strategy to `inproc` to support `.NET` on Linux and `Mono` on Android (specifically, [.NET MAUI](https://github.com/dotnet/android/issues/9055#issuecomment-2261347912)). ([#1027](https://github.com/getsentry/sentry-native/pull/1027))
89

910
**Fixes**:
1011

include/sentry.h

Lines changed: 38 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -768,6 +768,14 @@ typedef enum {
768768
SENTRY_USER_CONSENT_REVOKED = 0,
769769
} sentry_user_consent_t;
770770

771+
/**
772+
* The crash handler strategy.
773+
*/
774+
typedef enum {
775+
SENTRY_HANDLER_STRATEGY_DEFAULT = 0,
776+
SENTRY_HANDLER_STRATEGY_CHAIN_AT_START = 1,
777+
} sentry_handler_strategy_t;
778+
771779
/**
772780
* Creates a new options struct.
773781
* Can be freed with `sentry_options_free`.
@@ -1492,6 +1500,36 @@ SENTRY_EXPERIMENTAL_API void sentry_options_set_traces_sample_rate(
14921500
SENTRY_EXPERIMENTAL_API double sentry_options_get_traces_sample_rate(
14931501
sentry_options_t *opts);
14941502

1503+
#ifdef SENTRY_PLATFORM_LINUX
1504+
1505+
/**
1506+
* Returns the currently set strategy for the handler.
1507+
*
1508+
* This option does only work for the `inproc` backend on `Linux` and `Android`.
1509+
*
1510+
* The main use-case is when the Native SDK is used in the context of the
1511+
* CLR/Mono runtimes which convert some POSIX signals into managed-code
1512+
* exceptions and discontinue the signal chain.
1513+
*
1514+
* If this happens and we invoke the previous handler at the end (i.e., after
1515+
* our handler processed the signal, which is the default strategy) we will end
1516+
* up sending a native crash in addition to the managed code exception (which
1517+
* will either generate another crash-event if uncaught or could be handled in
1518+
* the managed code and neither terminate the application nor create a crash
1519+
* event). To correctly process the signals of CLR/Mono applications, we must
1520+
* invoke the runtime handler at the start of our handler.
1521+
*/
1522+
SENTRY_EXPERIMENTAL_API sentry_handler_strategy_t
1523+
sentry_options_get_handler_strategy(const sentry_options_t *opts);
1524+
1525+
/**
1526+
* Sets the handler strategy.
1527+
*/
1528+
SENTRY_EXPERIMENTAL_API void sentry_options_set_handler_strategy(
1529+
sentry_options_t *opts, sentry_handler_strategy_t handler_strategy);
1530+
1531+
#endif // SENTRY_PLATFORM_LINUX
1532+
14951533
/* -- Session APIs -- */
14961534

14971535
typedef enum {

src/backends/sentry_backend_inproc.c

Lines changed: 40 additions & 33 deletions
Original file line numberDiff line numberDiff line change
@@ -16,27 +16,6 @@
1616
#include "transports/sentry_disk_transport.h"
1717
#include <string.h>
1818

19-
/**
20-
* Android's bionic libc seems to allocate alternate signal handler stacks for
21-
* every thread and also references them from their internal maintenance
22-
* structs.
23-
*
24-
* The way we currently set up our sigaltstack seems to interfere with this
25-
* setup and causes crashes whenever an ART signal handler touches the thread
26-
* that called `sentry_init()`.
27-
*
28-
* In addition to this problem, it also means there is no need for our own
29-
* sigaltstack on Android since our signal handler will always be running on
30-
* an alternate stack managed by bionic.
31-
*
32-
* Note: In bionic the sigaltstacks for 32-bit devices have a size of 16KiB and
33-
* on 64-bit devices they have 32KiB. The size of our own was set to 64KiB
34-
* independent of the device. If this is a problem, we need figure out
35-
* together with Google if there is a way in which our configs can coexist.
36-
*
37-
* Both breakpad and crashpad are way more defensive in the setup of their
38-
* signal stacks and take existing stacks into account (or reuse them).
39-
*/
4019
#define SIGNAL_DEF(Sig, Desc) { Sig, #Sig, Desc }
4120

4221
#define MAX_FRAMES 128
@@ -108,8 +87,8 @@ startup_inproc_backend(
10887

10988
// set up an alternate signal stack if noone defined one before
11089
stack_t old_sig_stack;
111-
if (sigaltstack(NULL, &old_sig_stack) == -1 || old_sig_stack.ss_sp == NULL
112-
|| old_sig_stack.ss_size == 0) {
90+
int ret = sigaltstack(NULL, &old_sig_stack);
91+
if (ret == 0 && old_sig_stack.ss_flags == SS_DISABLE) {
11392
SENTRY_TRACEF("installing signal stack (size: %d)", SIGNAL_STACK_SIZE);
11493
g_signal_stack.ss_sp = sentry_malloc(SIGNAL_STACK_SIZE);
11594
if (!g_signal_stack.ss_sp) {
@@ -118,9 +97,11 @@ startup_inproc_backend(
11897
g_signal_stack.ss_size = SIGNAL_STACK_SIZE;
11998
g_signal_stack.ss_flags = 0;
12099
sigaltstack(&g_signal_stack, 0);
121-
} else {
122-
SENTRY_TRACEF(
123-
"using existing signal stack (size: %d)", old_sig_stack.ss_size);
100+
} else if (ret == 0) {
101+
SENTRY_TRACEF("using existing signal stack (size: %d, flags: %d)",
102+
old_sig_stack.ss_size, old_sig_stack.ss_flags);
103+
} else if (ret == -1) {
104+
SENTRY_WARNF("Failed to query signal stack size: %s", strerror(errno));
124105
}
125106

126107
// install our own signal handler
@@ -554,21 +535,47 @@ handle_ucontext(const sentry_ucontext_t *uctx)
554535
}
555536

556537
#ifdef SENTRY_PLATFORM_UNIX
557-
// give us an allocator we can use safely in signals before we tear down.
558-
sentry__page_allocator_enable();
559-
560538
// inform the sentry_sync system that we're in a signal handler. This will
561539
// make mutexes spin on a spinlock instead as it's no longer safe to use a
562540
// pthread mutex.
563541
sentry__enter_signal_handler();
564542
#endif
565543

566-
sentry_value_t event = make_signal_event(sig_slot, uctx);
567-
568544
SENTRY_WITH_OPTIONS (options) {
569-
sentry__write_crash_marker(options);
545+
#ifdef SENTRY_PLATFORM_LINUX
546+
// On Linux (and thus Android) CLR/Mono converts signals provoked by
547+
// AOT/JIT-generated native code into managed code exceptions. In these
548+
// cases, we shouldn't react to the signal at all and let their handler
549+
// discontinue the signal chain by invoking the runtime handler before
550+
// we process the signal.
551+
if (sentry_options_get_handler_strategy(options)
552+
== SENTRY_HANDLER_STRATEGY_CHAIN_AT_START) {
553+
SENTRY_TRACE("defer to runtime signal handler at start");
554+
// there is a good chance that we won't return from the previous
555+
// handler and that would mean we couldn't enter this handler with
556+
// the next signal coming in if we didn't "leave" here.
557+
sentry__leave_signal_handler();
558+
559+
// invoke the previous handler (typically the CLR/Mono
560+
// signal-to-managed-exception handler)
561+
invoke_signal_handler(
562+
uctx->signum, uctx->siginfo, (void *)uctx->user_context);
563+
564+
// let's re-enter because it means this was an actual native crash
565+
sentry__enter_signal_handler();
566+
SENTRY_TRACE(
567+
"return from runtime signal handler, we handle the signal");
568+
}
569+
#endif
570570

571+
#ifdef SENTRY_PLATFORM_UNIX
572+
// use a signal-safe allocator before we tear down.
573+
sentry__page_allocator_enable();
574+
#endif
575+
576+
sentry_value_t event = make_signal_event(sig_slot, uctx);
571577
bool should_handle = true;
578+
sentry__write_crash_marker(options);
572579

573580
if (options->on_crash_func) {
574581
SENTRY_TRACE("invoking `on_crash` hook");
@@ -580,7 +587,7 @@ handle_ucontext(const sentry_ucontext_t *uctx)
580587
sentry_envelope_t *envelope = sentry__prepare_event(
581588
options, event, NULL, !options->on_crash_func);
582589
// TODO(tracing): Revisit when investigating transaction flushing
583-
// during hard crashes.
590+
// during hard crashes.
584591

585592
sentry_session_t *session = sentry__end_current_session_with_status(
586593
SENTRY_SESSION_STATUS_CRASHED);

src/sentry_options.c

Lines changed: 18 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -57,6 +57,7 @@ sentry_options_new(void)
5757
opts->shutdown_timeout = SENTRY_DEFAULT_SHUTDOWN_TIMEOUT;
5858
opts->traces_sample_rate = 0.0;
5959
opts->max_spans = 0;
60+
opts->handler_strategy = SENTRY_HANDLER_STRATEGY_DEFAULT;
6061

6162
return opts;
6263
}
@@ -594,3 +595,20 @@ sentry_options_set_backend(sentry_options_t *opts, sentry_backend_t *backend)
594595
sentry__backend_free(opts->backend);
595596
opts->backend = backend;
596597
}
598+
599+
#ifdef SENTRY_PLATFORM_LINUX
600+
601+
sentry_handler_strategy_t
602+
sentry_options_get_handler_strategy(const sentry_options_t *opts)
603+
{
604+
return opts->handler_strategy;
605+
}
606+
607+
void
608+
sentry_options_set_handler_strategy(
609+
sentry_options_t *opts, sentry_handler_strategy_t handler_strategy)
610+
{
611+
opts->handler_strategy = handler_strategy;
612+
}
613+
614+
#endif // SENTRY_PLATFORM_LINUX

src/sentry_options.h

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -71,6 +71,7 @@ typedef struct sentry_options_s {
7171
long user_consent;
7272
long refcount;
7373
uint64_t shutdown_timeout;
74+
sentry_handler_strategy_t handler_strategy;
7475
} sentry_options_t;
7576

7677
/**
Lines changed: 64 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,64 @@
1+
namespace dotnet_signal;
2+
3+
using System;
4+
using System.Runtime.InteropServices;
5+
6+
class Program
7+
{
8+
[DllImport("crash", EntryPoint = "native_crash")]
9+
static extern void native_crash();
10+
11+
[DllImport("crash", EntryPoint = "enable_sigaltstack")]
12+
static extern void enable_sigaltstack();
13+
14+
[DllImport("sentry", EntryPoint = "sentry_options_new")]
15+
static extern IntPtr sentry_options_new();
16+
17+
[DllImport("sentry", EntryPoint = "sentry_options_set_handler_strategy")]
18+
static extern IntPtr sentry_options_set_handler_strategy(IntPtr options, int strategy);
19+
20+
[DllImport("sentry", EntryPoint = "sentry_options_set_debug")]
21+
static extern IntPtr sentry_options_set_debug(IntPtr options, int debug);
22+
23+
[DllImport("sentry", EntryPoint = "sentry_init")]
24+
static extern int sentry_init(IntPtr options);
25+
26+
static void Main(string[] args)
27+
{
28+
var githubActions = Environment.GetEnvironmentVariable("GITHUB_ACTIONS") ?? string.Empty;
29+
if (githubActions == "true") {
30+
// Set up our own `sigaltstack` for this thread if we're running on GHA because of a failure to run any
31+
// signal handler after the initial setup. This behavior is locally non-reproducible and likely runner-related.
32+
// I ran this against .net7/8/9 on at least 10 different Linux setups, and it worked on all, but on GHA
33+
// it only works if we __don't__ accept the already installed `sigaltstack`.
34+
enable_sigaltstack();
35+
}
36+
37+
// setup minimal sentry-native
38+
var options = sentry_options_new();
39+
sentry_options_set_handler_strategy(options, 1);
40+
sentry_options_set_debug(options, 1);
41+
sentry_init(options);
42+
43+
var doNativeCrash = args is ["native-crash"];
44+
if (doNativeCrash)
45+
{
46+
native_crash();
47+
}
48+
else
49+
{
50+
try
51+
{
52+
Console.WriteLine("dereference a NULL object from managed code");
53+
var s = default(string);
54+
var c = s.Length;
55+
}
56+
catch (NullReferenceException exception)
57+
{
58+
Console.WriteLine("dereference another NULL object from managed code");
59+
var s = default(string);
60+
var c = s.Length;
61+
}
62+
}
63+
}
64+
}
Lines changed: 19 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,19 @@
1+
#include <signal.h>
2+
#include <stdlib.h>
3+
void native_crash(void)
4+
{
5+
*(int *)10 = 100;
6+
}
7+
8+
void enable_sigaltstack(void)
9+
{
10+
const size_t signal_stack_size = 16384;
11+
stack_t signal_stack;
12+
signal_stack.ss_sp = malloc(signal_stack_size);
13+
if (!signal_stack.ss_sp) {
14+
return;
15+
}
16+
signal_stack.ss_size = signal_stack_size;
17+
signal_stack.ss_flags = 0;
18+
sigaltstack(&signal_stack, 0);
19+
}
Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,8 @@
1+
<Project Sdk="Microsoft.NET.Sdk">
2+
<PropertyGroup>
3+
<OutputType>Exe</OutputType>
4+
<TargetFramework>net8.0</TargetFramework>
5+
<ImplicitUsings>enable</ImplicitUsings>
6+
<Nullable>enable</Nullable>
7+
</PropertyGroup>
8+
</Project>

0 commit comments

Comments
 (0)