[kernel] Don't panic when racing during secondary CPU shutdown
platform_halt_secondary_cpus's job is to shut down any and all online
secondary CPUs. It does this by querying the set of online CPUs,
applying a mask to mask off the primary (CPU-0) and then calling
mp_unplug_cpu_mask.
platform_halt_secondary_cpus can be called concurrently by multiple
threads. If that happens, it's possible for the calls to race and
result in the "loser" calling mp_unplug_cpu_mask with a mask
containing CPUs that are already offline, in which case
mp_unplug_cpu_mask returns ZX_ERR_BAD_STATE.
This CL changes platform_halt_secondary_cpus to ignore a
ZX_ERR_BAD_STATE from mp_unplug_cpu_mask if it finds that the
secondary CPUs have already been offlined.
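The race and the fix can be sketched with a small user-space model. This
is an illustration only, not kernel code: FakeMp, unplug,
halt_secondary_cpus, simulate_race, and the kOk/kErrBadState values are
hypothetical stand-ins, and the model assumes (as described above) that
unplugging an already-offline CPU fails with a bad-state error.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the kernel's types and status codes.
using cpu_mask_t = uint32_t;
using status_t = int;
constexpr status_t kOk = 0;
constexpr status_t kErrBadState = -1;      // illustrative value only
constexpr cpu_mask_t kPrimary = 1u << 0;   // models the boot CPU (CPU-0)

// Toy model of the set of online CPUs.
struct FakeMp {
  cpu_mask_t online;

  // Assumed behavior: asking to unplug a CPU that is already offline fails.
  status_t unplug(cpu_mask_t mask) {
    if ((mask & ~online) != 0) {
      return kErrBadState;
    }
    online &= ~mask;
    return kOk;
  }
};

// Same shape as the patched logic, but takes the (possibly stale) snapshot
// of the online mask as a parameter so the race can be replayed in order.
status_t halt_secondary_cpus(FakeMp& mp, cpu_mask_t snapshot) {
  status_t status = mp.unplug(snapshot & ~kPrimary);
  if (status == kOk) {
    return kOk;
  }
  // The unplug failed. If only the primary CPU is left, the job is done.
  if (status == kErrBadState && mp.online == kPrimary) {
    return kOk;
  }
  return status;
}

// Replays the race: both callers snapshot the mask before either unplugs.
// Returns the status seen by the "loser".
status_t simulate_race() {
  FakeMp mp{0xF};                        // CPUs 0-3 online
  cpu_mask_t winner_snapshot = mp.online;
  cpu_mask_t loser_snapshot = mp.online;
  halt_secondary_cpus(mp, winner_snapshot);
  return halt_secondary_cpus(mp, loser_snapshot);
}
```

In this model the loser's unplug attempt fails, but because only the
primary CPU remains online the failure is treated as success. If some
other secondary CPU were still online, the error would be returned
unchanged.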
Tested by running several instances of dm_reboot_bootloader_test in a
loop for several hours.
Fixed: 81355
Change-Id: I599fc650e5541a5ac2c529acce46f0b2a5343f2e
Reviewed-on: https://fuchsia-review.googlesource.com/c/fuchsia/+/559962
Reviewed-by: Rasha Eqbal <rashaeqbal@google.com>
Commit-Queue: Nick Maniscalco <maniscalco@google.com>
diff --git a/zircon/kernel/platform/halt_helper.cc b/zircon/kernel/platform/halt_helper.cc
index d05526d..a625a12 100644
--- a/zircon/kernel/platform/halt_helper.cc
+++ b/zircon/kernel/platform/halt_helper.cc
@@ -43,5 +43,16 @@
// "Unplug" online secondary CPUs before halting them.
cpu_mask_t primary = cpu_num_to_mask(BOOT_CPU_ID);
cpu_mask_t mask = mp_get_online_mask() & ~primary;
- return mp_unplug_cpu_mask(mask, deadline);
+ zx_status_t status = mp_unplug_cpu_mask(mask, deadline);
+ if (status == ZX_OK) {
+ return ZX_OK;
+ }
+
+ // mp_unplug_cpu_mask failed. Perhaps another thread was trying to shutdown the secondary CPUs.
+ // If the primary CPU is the only one left, then we've done our job.
+ if (status == ZX_ERR_BAD_STATE && mp_get_online_mask() == primary) {
+ return ZX_OK;
+ }
+
+ return status;
}