Nmi branch test #1

Open

dg1197 wants to merge 155 commits into rex-rs:rex-linux from jampflah:nmi_branch_test

dg1197 commented May 2, 2026

No description provided.

djwillia and others added 30 commits

February 16, 2026 02:44


          BPF_PROG_LOAD_DJW

e72dadd


          not-working code

b55891d


          update to 5.15

272f3f5


          alloc executable mem for prog

21511f3


          take out printk

b20d1d3


          fix bpf_trace_printk by always enabling the trace event

0d014c5


          ELF Loader in kernel

5d3aa69


          ELF loader working for example program

58b46fa


          free program memory

00f0b96


          Page permission solved

310e084


          refactor page permission code

64ec0ef


          might have fixed the page fault (permission error)

fa99bab


          fix problem with empty section

ccf42e1


          Fixed memory conflict for distributed apps

0db49c4


          Rewritten to resolve conflicts

ebec9a8


          map support

a091d49


          load the ELF as "inner-unikernel base"

7e46a02


          implement subprog-loading functions to support multiple programs in the same file

78177f6


          cleanup debug statements

1a1647a


          Patch GOT at load time

f9cfdfe

When we compile the rust programs with PIE, the compiler creates the
Global Offset Table (GOT) to hold the addresses of the extern variables.
The GOT is supposed to be fixed up at program load time by the dynamic
loader. However, we do not have a dynamic loader, so the GOT entries
are left unpatched and contain absolute addresses. This causes problems
when the program is triggered in the kernel -- the use of absolute
addresses sends the code to non-existent pages.

Add a new GOT fix step when the base program is loaded.

Signed-off-by: Jinghao Jia <jinghao7@illinois.edu>


          Add specialized trace_printk for inner-unikernel

a9b5ab7

Add a new trace_printk function only used by inner-unikernel programs.
This function always pads a null character at the end.

Signed-off-by: Jinghao Jia <jinghao7@illinois.edu>


          fixup .rela.dyn data

a0fbfe2

Signed-off-by: Jinghao Jia <jinghao7@illinois.edu>


          remove generated files

d88927f

Signed-off-by: Jinghao Jia <jinghao7@illinois.edu>


          allow larger programs

1765b2d

Signed-off-by: Jinghao Jia <jinghao7@illinois.edu>


          kernel support for cleanup mechanism

4cb8627

Signed-off-by: Jinghao Jia <jinghao7@illinois.edu>


          Add support for rust panic handling and stack unwinding

b3d7339

Add a new iu_dispatcher_func to dispatch inner-unikernel programs so
that rust panics can be handled. The dispatcher has a prototype of:

extern asmlinkage unsigned int iu_dispatcher_func(
        const void *ctx,
        const struct bpf_insn *insnsi,
        unsigned int (*bpf_func)(const void *,
                                 const struct bpf_insn *));

which shares the same signature as bpf_dispatcher_nop_func but differs
in linkage, as it is implemented directly in assembly.

The function will save the stack pointer and frame pointer to designated
per-cpu variables before calling into the program.

If the execution is successful (i.e. no exceptions), the function will
just return normally.

   +-----------------------+
   | iu_dispatcher_func:   |
   | movq %rsp, %gs:iu_sp  |
   | movq %rbp, %gs:iu_fp  |                +-----------+
   | call *%rdx            |--------------->| iu_prog1: |
   |                       |                | ...       |
   | iu_exit:              |<---------------| ret       |
   | ret                   |                +-----------+
   | ...                   |
   +-----------------------+

Under exceptional cases (where a rust panic is fired), rust_begin_unwind
(i.e. the panic handler) will transfer the control flow to the
iu_landingpad function, which, after dumping some information to the
kernel ring buffer, will issue a direct jump to iu_panic_trampoline, a
global label in the middle of iu_dispatcher_func. The trampoline code
restores the old stack pointer and frame pointer values, effectively
unwinding the stack. It then sets a return value of -EINVAL and jumps to
iu_exit to return from iu_dispatcher_func.

         +-----------------------+
         | iu_dispatcher_func:   |
         | movq %rsp, %gs:iu_sp  |
         | movq %rbp, %gs:iu_fp  |                +-----------+
         | call *%rdx            |--------------->| iu_prog1: |
         |                       |         +------| ...       |
   +---->| iu_exit:              |         |      | ret       |
   |     | ret                   |         |      +-----------+
   |     |                       |         |
   |     | iu_panic_trampoline:  |<-----+  | panic!()
   |     | movq %gs:iu_sp, %rsp  |      |  |
   |     | movq %gs:iu_fp, %rbp  |      |  |      +-------------------------+
   |     | movq $(-EINVAL), %rax |      |  +----->| iu_landingpad:          |
   +-----| jmp iu_exit           |      |         | ...                     |
         +-----------------------+      +---------| jmp iu_panic_trampoline |
                                                  +-------------------------+

Signed-off-by: Jinghao Jia <jinghao7@illinois.edu>
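
The control flow described in the dispatcher commit above (b3d7339) can be approximated in user-space C with setjmp/longjmp. This is only an analogy under stated assumptions, not the kernel implementation: the jmp_buf stands in for the per-cpu iu_sp/iu_fp slots, and the names sketch_dispatch, sketch_landingpad, prog_ok and prog_panics are hypothetical.

```c
#include <errno.h>
#include <setjmp.h>
#include <stddef.h>

typedef unsigned int (*prog_fn)(const void *ctx);

/* Stand-in for the saved stack/frame pointers (per-cpu in the commit).
 * A single global makes this sketch single-threaded and non-reentrant. */
static jmp_buf saved_context;

/* Plays the role of iu_landingpad: jump back into the dispatcher. */
static void sketch_landingpad(void)
{
    longjmp(saved_context, 1);
}

/* Plays the role of iu_dispatcher_func. */
static unsigned int sketch_dispatch(const void *ctx, prog_fn prog)
{
    /* Save the context before calling into the program. */
    if (setjmp(saved_context) != 0) {
        /* Trampoline path: the stack is already unwound to this frame. */
        return (unsigned int)-EINVAL;
    }
    /* Normal path: the program returns and its value is passed through. */
    return prog(ctx);
}

/* Hypothetical program that completes normally. */
static unsigned int prog_ok(const void *ctx)
{
    (void)ctx;
    return 7;
}

/* Hypothetical program that "panics" partway through. */
static unsigned int prog_panics(const void *ctx)
{
    (void)ctx;
    sketch_landingpad();
    return 0; /* never reached */
}
```

Calling sketch_dispatch(NULL, prog_ok) takes the normal path, while sketch_dispatch(NULL, prog_panics) returns through the trampoline path with -EINVAL converted to unsigned int.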


          Link inner-unikernel invocation with new dispatcher

77fdb69

This currently only works for program invocations where
bpf_dispatcher_nop_func was originally used. It does cover all tracing
programs (i.e. those invoked via trace_call_bpf). Other program types
(e.g. XDP) are not supported.

Signed-off-by: Jinghao Jia <jinghao7@illinois.edu>


          Save callee-saved registers in new dispatcher

400402d

Signed-off-by: Jinghao Jia <jinghao7@illinois.edu>


          x86: iu_unwind_64.S: fix indentation

33b0e12

Signed-off-by: Jinghao Jia <jinghao7@illinois.edu>


          allocate vmapped per-cpu iu_stack

a8f0c5f

Signed-off-by: Jinghao Jia <jinghao7@illinois.edu>

chinrw and others added 24 commits

February 16, 2026 02:44


          perf: remove unused a.out obj

216aed8


          rex: cleanup rex_load_prog{,_base}

f8fd4f2

 - Remove debug prints
 - Remove commented out code

Signed-off-by: Jinghao Jia <jinghao7@illinois.edu>


          rex: directly compute final value of relocations

c736140

Apparently lld does things differently from bfd and mold -- it puts a 0
at the relative relocation address instead of the addend. Let's just
directly compute the final value with *ABS*+addend to make it more
robust.

Signed-off-by: Jinghao Jia <jinghao7@illinois.edu>
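
A minimal user-space sketch of the idea in the relocation commit above (c736140): rather than trusting whatever the linker left at the relocation target, write the computed value directly. It assumes the final value of a relative relocation is the load base plus the addend, as for R_X86_64_RELATIVE. The struct rela_entry layout and the function name apply_relative_relocs are hypothetical simplifications, not the Rex loader's code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RELOC_RELATIVE 8 /* numeric value of R_X86_64_RELATIVE */

/* Hypothetical, simplified relocation entry (type is stored unpacked). */
struct rela_entry {
    uint64_t offset; /* patch location, relative to the image start */
    uint64_t type;   /* relocation type */
    int64_t addend;  /* value added to the load base */
};

/* Overwrite each relative relocation target with load_base + addend,
 * regardless of the bytes already there (0 with lld, the addend with
 * bfd and mold, per the commit message). Returns the number applied. */
static size_t apply_relative_relocs(uint8_t *image, size_t image_len,
                                    uint64_t load_base,
                                    const struct rela_entry *rela, size_t n)
{
    size_t applied = 0;

    for (size_t i = 0; i < n; i++) {
        uint64_t value;

        if (rela[i].type != RELOC_RELATIVE)
            continue;
        /* Skip entries whose 8-byte target would fall outside the image. */
        if (rela[i].offset > image_len ||
            image_len - rela[i].offset < sizeof(value))
            continue;

        value = load_base + (uint64_t)rela[i].addend;
        memcpy(image + rela[i].offset, &value, sizeof(value));
        applied++;
    }
    return applied;
}
```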


          kbuild: add simple meson support

6543b82

Signed-off-by: Jinghao Jia <jinghao7@illinois.edu>


          kbuild/meson: s/build_dir/kbuild_dir/g

Signed-off-by: Jinghao Jia <jinghao7@illinois.edu>


          kbuild/meson: fix build file format

cc90426

Signed-off-by: Jinghao Jia <jinghao7@illinois.edu>


          kbuild/meson: use nproc instead of '32'

e5ced5a

Signed-off-by: Ruowen Qin <ruowenq2@illinois.edu>


          kbuild/meson: explicitly capture and check nproc

edc1297

Signed-off-by: Jinghao Jia <jinghao7@illinois.edu>


          kbuild/meson: enable build_always_stable to ensure vmlinux is up-to-date

ffb8efe

Signed-off-by: Ruowen Qin <ruowenq2@illinois.edu>


          rex: remove unnecessary zeroing of rex program segments

The previous __vmalloc() invocation already has __GFP_ZERO flag set so
there is no need to zero the memory again. Plus, the address calculation
is incorrect, which causes accidental zeroing of real data.

Fixes: 23903f1 ("Rewritten to resolve conflicts")
Signed-off-by: Jinghao Jia <jinghao7@illinois.edu>


          rex: introduce rex_log_buf per-cpu buffer for formatting and printing

765d3cd

Signed-off-by: Jinghao Jia <jinghao7@illinois.edu>


          x86/rex: print rex panic message from per-cpu log buffer

72b7a53

Signed-off-by: Jinghao Jia <jinghao7@illinois.edu>


          rex: do not count total memory size and page counts incrementally

63fb4f1

We previously counted the total memory and page counts needed for the
program incrementally. This causes problems when the linker (e.g. mold)
generates a gap page between LOAD segments, as that gap page will not be
counted.

Instead, directly calculate the total memory and page counts by aligning
the largest memory address found in the LOAD segments to a page boundary.

Fixes: 88b2c24 ("Fixed memory conflict for distributed apps")
Signed-off-by: Jinghao Jia <jinghao7@illinois.edu>
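
The sizing approach from the commit above (63fb4f1) can be sketched in user-space C as follows. It assumes a 4 KiB page size and segments laid out from address 0; struct load_segment, SKETCH_PAGE_SIZE and both function names are hypothetical and only illustrate the arithmetic.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL /* assumed page size */

/* Hypothetical view of a LOAD segment: start address and in-memory size. */
struct load_segment {
    uint64_t vaddr;
    uint64_t memsz;
};

/* Total bytes needed: the largest end address, rounded up to a page. */
static uint64_t total_image_size(const struct load_segment *segs, size_t n)
{
    uint64_t max_end = 0;

    for (size_t i = 0; i < n; i++) {
        uint64_t end = segs[i].vaddr + segs[i].memsz;

        if (end > max_end)
            max_end = end;
    }
    return (max_end + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/* Total pages needed, including any gap pages between segments. */
static uint64_t total_image_pages(const struct load_segment *segs, size_t n)
{
    return total_image_size(segs, n) / SKETCH_PAGE_SIZE;
}
```

For two segments covering 0x0..0x1800 and 0x3000..0x3100, summing per-segment page counts gives 3 pages and misses the gap page at 0x2000, whereas this calculation gives 4.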


          samples/kprobes: add missing MODULE_DESCRIPTION() to kprobe_target

cd6db28

This fixes the following warning from modpost:

  WARNING: modpost: missing MODULE_DESCRIPTION() in samples/kprobes/kprobe_target.o

Fixes: c16cb95 ("samples/kprobe: add kprobe target module")
Signed-off-by: Jinghao Jia <jinghao7@illinois.edu>


          kbuild/meson: do not clean meson.build during

e7f41e6

Signed-off-by: Jinghao Jia <jinghao7@illinois.edu>


          Sanity testing done

a9d3c21

Signed-off-by: dmo <dganesh3@illiois.edu>


          Sanity testing done

8382a19

Signed-off-by: dmo <dganesh3@illiois.edu>


          Add support for sched_ext_ops_cfs

51970aa


          print statements on KCFI errors - reaching scx_enable

e9ba1a8

Signed-off-by: dmo <dganesh3@illiois.edu>


          Merging origin/sched_ext_ops_cfg.

64fbf3c


          Fix merge duplication error.

08e7e17


          Plumbing Rex scheduler name through Rex sched_ext path

4c9ae66

Signed-off-by: dmo <dganesh3@illiois.edu>


          Fix leak on SCX_EXIT_ERROR_STALL (mode 1 scx simple soft stall).

8d2cf1f


          Merge branch 'rex-linux' into nmi_branch_test

90e2e16

Signed-off-by: Dhanush Ganesh <dganesh3@illinois.edu>

jinghao-jia pushed a commit that referenced this pull request


          gpio: mvebu: fix NULL pointer dereference in suspend/resume

b9ad50d

mvebu_pwm_suspend() and mvebu_pwm_resume() are called for all GPIO
banks during suspend/resume, but not all banks have PWM functionality.
GPIO banks without PWM have mvchip->mvpwm set to NULL.

Calling mvebu_pwm_suspend() with mvpwm == NULL causes a NULL pointer
dereference when it tries to access mvpwm->blink_select.

  Unable to handle kernel NULL pointer dereference at virtual address 00000020 when write
  [00000020] *pgd=00000000
  Internal error: Oops: 815 [#1] PREEMPT ARM
  Modules linked in:
  CPU: 0 UID: 0 PID: 406 Comm: sh Not tainted 6.12.74-rt12-yocto-standard-g4e96f98fb7db-dirty #353
  Hardware name: Marvell Armada 370/XP (Device Tree)
  PC is at regmap_mmio_read+0x38/0x54
  LR is at regmap_mmio_read+0x38/0x54
  pc : [<c05fd2ac>]    lr : [<c05fd2ac>]    psr: 200f0013
  sp : f0c11d10  ip : 00000000  fp : c100d2f0
  r10: c14fb854  r9 : 00000000  r8 : 00000000
  r7 : c1799c00  r6 : 00000020  r5 : 00000020  r4 : c179c7c0
  r3 : f0a231a0  r2 : 00000020  r1 : 00000020  r0 : 00000000
  Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
  Control: 10c5387d  Table: 135ec059  DAC: 00000051
  Call trace:
   regmap_mmio_read from _regmap_bus_reg_read+0x78/0xac
   _regmap_bus_reg_read from _regmap_read+0x60/0x154
   _regmap_read from regmap_read+0x3c/0x60
   regmap_read from mvebu_gpio_suspend+0xa4/0x14c
   mvebu_gpio_suspend from dpm_run_callback+0x54/0x180
   dpm_run_callback from device_suspend+0x124/0x630
   device_suspend from dpm_suspend+0x124/0x270
   dpm_suspend from dpm_suspend_start+0x64/0x6c
   dpm_suspend_start from suspend_devices_and_enter+0x140/0x8e8
   suspend_devices_and_enter from pm_suspend+0x2fc/0x308
   pm_suspend from state_store+0x6c/0xc8
   state_store from kernfs_fop_write_iter+0x10c/0x1f8
   kernfs_fop_write_iter from vfs_write+0x270/0x468
   vfs_write from ksys_write+0x70/0xf0
   ksys_write from ret_fast_syscall+0x0/0x54

Add a NULL check for mvchip->mvpwm before calling the PWM
suspend/resume functions.

Fixes: 757642f ("gpio: mvebu: Add limited PWM support")
Signed-off-by: Yun Zhou <yun.zhou@windriver.com>
Link: https://patch.msgid.link/20260608084334.2960803-1-yun.zhou@windriver.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>

jinghao-jia pushed a commit that referenced this pull request


          net/mlx5: Fix slab-out-of-bounds in mlx5_query_nic_vport_mac_list

894e036

mlx5_query_nic_vport_mac_list() sizes its firmware command buffer using
the PF's log_max_current_uc/mc_list capabilities. When querying a VF
vport with a larger configured max (via devlink), the firmware response
can overflow this buffer:

 BUG: KASAN: slab-out-of-bounds in mlx5_query_nic_vport_mac_list+0x453/0x4c0 [mlx5_core]
 Read of size 4 at addr ff1100013ffc8a12 by task kworker/u96:2/385

 CPU: 12 UID: 0 PID: 385 Comm: kworker/u96:2 Not tainted 7.0.0-rc6+ #1 PREEMPT
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009)
 Workqueue: mlx5_esw_wq esw_vport_change_handler [mlx5_core]
 Call Trace:
  <TASK>
  dump_stack_lvl+0x69/0xa0
  print_report+0x176/0x4e4
  kasan_report+0xc8/0x100
  mlx5_query_nic_vport_mac_list+0x453/0x4c0 [mlx5_core]
  esw_update_vport_addr_list+0x2e3/0xda0 [mlx5_core]
  esw_vport_change_handle_locked+0xa1f/0x1060 [mlx5_core]
  esw_vport_change_handler+0x6a/0x90 [mlx5_core]
  process_one_work+0x87f/0x15e0
  worker_thread+0x62b/0x1020
  kthread+0x375/0x490
  ret_from_fork+0x4dc/0x810
  ret_from_fork_asm+0x11/0x20
  </TASK>

Fix by querying the vport's own HCA caps to size the buffer correctly.
Refactor the function to allocate and return the MAC list internally,
removing the caller's dependency on knowing the correct max.

Fixes: e16aea2 ("net/mlx5: Introduce access functions to modify/query vport mac lists")
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260604135849.458060-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

jinghao-jia pushed a commit that referenced this pull request


          net/rds: fix NULL deref in rds_ib_send_cqe_handler() on masked atomic completion

34080db

rds_ib_xmit_atomic() always programs a masked atomic opcode
(IB_WR_MASKED_ATOMIC_CMP_AND_SWP or IB_WR_MASKED_ATOMIC_FETCH_AND_ADD)
for every RDS atomic cmsg.  But the completion-side switch in
rds_ib_send_unmap_op() only handles the non-masked opcodes, so a masked
atomic completion falls through to default and returns rm == NULL while
send->s_op is left set.  rds_ib_send_cqe_handler() then dereferences the
NULL rm via rm->m_final_op, oopsing in softirq context.  An unprivileged
AF_RDS sendmsg() of an atomic cmsg over an active RDS/IB connection
triggers it; on hardware that natively accepts masked atomics (mlx4,
mlx5) no extra setup is needed.

  RDS/IB: rds_ib_send_unmap_op: unexpected opcode 0xd in WR!
  Oops: general protection fault [#1] SMP KASAN
  KASAN: null-ptr-deref in range [0x0000000000000190-0x0000000000000197]
  RIP: rds_ib_send_cqe_handler+0x25c/0xb10 (net/rds/ib_send.c:282)
  Call Trace:
   <IRQ>
   rds_ib_send_cqe_handler (net/rds/ib_send.c:282)
   poll_scq (net/rds/ib_cm.c:274)
   rds_ib_tasklet_fn_send (net/rds/ib_cm.c:294)
   tasklet_action_common (kernel/softirq.c:943)
   handle_softirqs (kernel/softirq.c:573)
   run_ksoftirqd (kernel/softirq.c:479)
   </IRQ>
  Kernel panic - not syncing: Fatal exception in interrupt

Handle the masked atomic opcodes in the same case as the non-masked
ones: they map to the same struct rds_message.atomic union member, so
the existing container_of()/rds_ib_send_unmap_atomic() body is correct
for them.

Fixes: 20c72bd ("RDS: Implement masked atomic operations")
Reported-by: Xiang Mei <xmei5@asu.edu>
Signed-off-by: Weiming Shi <bestswngs@gmail.com>
Reviewed-by: Allison Henderson <achender@kernel.org>
Link: https://patch.msgid.link/20260606192447.1179255-2-bestswngs@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
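
The shape of the net/rds fix above can be illustrated with a small user-space sketch: the masked atomic opcodes share the case that already handles the non-masked ones, so they no longer fall through to the default that returns NULL. All names here (the sketch_wr_opcode enum, struct sketch_message, sketch_unmap_op) are hypothetical stand-ins, not the RDS or verbs definitions.

```c
#include <stddef.h>

/* Hypothetical opcodes mirroring the four atomic variants in the commit. */
enum sketch_wr_opcode {
    SKETCH_WR_ATOMIC_CMP_AND_SWP,
    SKETCH_WR_ATOMIC_FETCH_AND_ADD,
    SKETCH_WR_MASKED_ATOMIC_CMP_AND_SWP,
    SKETCH_WR_MASKED_ATOMIC_FETCH_AND_ADD,
    SKETCH_WR_UNKNOWN,
};

/* Hypothetical message that owns the completed operation. */
struct sketch_message {
    int atomic_unmapped;
};

/* Returns the owning message, or NULL for an unexpected opcode.
 * Callers must check for NULL before dereferencing the result. */
static struct sketch_message *sketch_unmap_op(enum sketch_wr_opcode op,
                                              struct sketch_message *owner)
{
    switch (op) {
    case SKETCH_WR_ATOMIC_CMP_AND_SWP:
    case SKETCH_WR_ATOMIC_FETCH_AND_ADD:
    case SKETCH_WR_MASKED_ATOMIC_CMP_AND_SWP:   /* added by the fix */
    case SKETCH_WR_MASKED_ATOMIC_FETCH_AND_ADD: /* added by the fix */
        owner->atomic_unmapped = 1;
        return owner;
    default:
        return NULL;
    }
}
```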

jinghao-jia pushed a commit that referenced this pull request


          netfilter: nf_conntrack: destroy stale expectfn expectations on unregister

c300941

NAT helpers such as nf_nat_h323 store a raw pointer to module text in
exp->expectfn (e.g. ip_nat_q931_expect). nf_ct_helper_expectfn_unregister()
only unlinks the callback descriptor and never walks the expectation table,
so an expectation pending at module removal survives with a dangling
exp->expectfn into freed module text.

When the expected connection arrives, init_conntrack() invokes
exp->expectfn(), now a stale pointer into the unloaded module. Reproduced
on a KASAN build by loading the H.323 helpers, creating a Q.931
expectation, unloading nf_nat_h323, then connecting to the expected port:

 Oops: int3: 0000 [#1] SMP KASAN NOPTI
 RIP: 0010:0xffffffffa06102d1
  init_conntrack.isra.0 (net/netfilter/nf_conntrack_core.c:1862)
  nf_conntrack_in (net/netfilter/nf_conntrack_core.c:2049)
  ipv4_conntrack_local (net/netfilter/nf_conntrack_proto.c:223)
  nf_hook_slow (net/netfilter/core.c:619)
  __ip_local_out (net/ipv4/ip_output.c:120)
  __tcp_transmit_skb (net/ipv4/tcp_output.c:1715)
  tcp_connect (net/ipv4/tcp_output.c:4374)
  tcp_v4_connect (net/ipv4/tcp_ipv4.c:345)
  __sys_connect (net/socket.c:2167)
 Modules linked in: nf_conntrack_h323 [last unloaded: nf_nat_h323]

Reaching the dangling state requires CAP_SYS_MODULE in the initial user
namespace to remove a NAT helper that still has live expectations, so this
is a robustness fix; leaving an expectation pointing at freed text is wrong
regardless.

Add nf_ct_helper_expectfn_destroy(), which walks the expectation table and
drops every expectation whose ->expectfn matches the descriptor being torn
down. Call it from each NAT helper's exit path after the existing RCU grace
period, so no expectation outlives the code it points at and no extra
synchronize_rcu() is introduced. With the fix, the same reproducer runs to
completion without the Oops.

Fixes: f587de0 ("[NETFILTER]: nf_conntrack/nf_nat: add H.323 helper port")
Reported-by: Xiang Mei <xmei5@asu.edu>
Assisted-by: Claude:claude-opus-4-8
Signed-off-by: Weiming Shi <bestswngs@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
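
The table walk described in the netfilter commit above amounts to removing every entry whose callback pointer matches the one being torn down. A minimal user-space sketch on a singly linked list, with no locking or RCU, is shown here; struct sketch_expect, sketch_expect_push, sketch_expectfn_destroy and the two helper callbacks are hypothetical and do not reflect the conntrack data structures.

```c
#include <stddef.h>
#include <stdlib.h>

typedef void (*expectfn_t)(void);

/* Hypothetical pending expectation holding a raw callback pointer. */
struct sketch_expect {
    expectfn_t expectfn;
    struct sketch_expect *next;
};

/* Two hypothetical helper callbacks to tell entries apart. */
static void helper_a(void) {}
static void helper_b(void) {}

/* Prepend a new expectation; returns the new head (old head on failure). */
static struct sketch_expect *sketch_expect_push(struct sketch_expect *head,
                                                expectfn_t fn)
{
    struct sketch_expect *e = malloc(sizeof(*e));

    if (!e)
        return head;
    e->expectfn = fn;
    e->next = head;
    return e;
}

/* Unlink and free every entry whose callback equals fn, so that no
 * entry outlives the code it points at. Returns the number removed. */
static size_t sketch_expectfn_destroy(struct sketch_expect **head,
                                      expectfn_t fn)
{
    struct sketch_expect **link = head;
    size_t removed = 0;

    while (*link) {
        struct sketch_expect *cur = *link;

        if (cur->expectfn == fn) {
            *link = cur->next; /* unlink before freeing */
            free(cur);
            removed++;
        } else {
            link = &cur->next;
        }
    }
    return removed;
}
```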

jinghao-jia pushed a commit that referenced this pull request


          drm/xe/display: fix oops in suspend/shutdown without display

68938cc

The xe driver keeps track of whether to probe display, and whether
display hardware is there, using xe->info.probe_display. It gets set to
false if there's no display after intel_display_device_probe(). However,
the display may also be disabled via fuses, detected at a later time in
intel_display_device_info_runtime_init().

In this case, the xe driver does for_each_intel_crtc() on uninitialized
mode config in xe_display_flush_cleanup_work(), leading to a NULL
pointer dereference, and generally calls display code with display info
cleared.

Check for intel_display_device_present() after
intel_display_device_info_runtime_init(), and reset
xe->info.probe_display as necessary. Also do unset_display_features()
for completeness, although display runtime init has already done
that. This will need to be unified across all cases later.

Move intel_display_device_info_runtime_init() call slightly earlier,
similar to i915, to avoid a bunch of unnecessary setup for no display
cases.

Note #1: The xe driver has no business doing low level display plumbing
like for_each_intel_crtc() to begin with. It all needs to happen in
display code.

Note #2: The actual bug is present already in commit 44e6949
("drm/xe/display: Implement display support"), but the oops was likely
introduced later at commit ddf6492 ("drm/xe/display: Make display
suspend/resume work on discrete").

Fixes: 44e6949 ("drm/xe/display: Implement display support")
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/work_items/7904
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/work_items/6150
Cc: stable@vger.kernel.org # v6.8+
Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
Link: https://patch.msgid.link/20260515160920.1082842-1-jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
(cherry picked from commit 7c3eb9f)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>

jinghao-jia pushed a commit that referenced this pull request


          ipv6: Fix a potential NPD in cleanup_prefix_route()

b70c687

addrconf_get_prefix_route() can return the fib6_null_entry sentinel
entry which has a NULL fib6_table pointer. Therefore, before setting the
route's expiration time, check that we are not working with this entry,
as otherwise a NPD will be triggered [1].

Note that the other callers of addrconf_get_prefix_route() are not
susceptible to this bug:

1. addrconf_prefix_rcv(): Requests a route with the 'RTF_ADDRCONF |
   RTF_PREFIX_RT' flags which are not set on fib6_null_entry.

2. modify_prefix_route(): Fixed by commit a747e02 ("ipv6: avoid
   possible NULL deref in modify_prefix_route()").

3. __ipv6_ifa_notify(): Calls ip6_del_rt() which specifically checks for
   fib6_null_entry and returns an error.

[1]
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000006: 0000 [#1] SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000030-0x0000000000000037]
[...]
Call Trace:
<TASK>
__kasan_check_byte (mm/kasan/common.c:573)
lock_acquire.part.0 (kernel/locking/lockdep.c:5842 (discriminator 1))
_raw_spin_lock_bh (kernel/locking/spinlock.c:182 (discriminator 1))
cleanup_prefix_route (net/ipv6/addrconf.c:1280)
ipv6_del_addr (net/ipv6/addrconf.c:1342)
inet6_addr_del.isra.0 (net/ipv6/addrconf.c:3119)
inet6_rtm_deladdr (net/ipv6/addrconf.c:4812)
rtnetlink_rcv_msg (net/core/rtnetlink.c:6997)
netlink_rcv_skb (net/netlink/af_netlink.c:2555)
netlink_unicast (net/netlink/af_netlink.c:1344)
netlink_sendmsg (net/netlink/af_netlink.c:1899)
__sock_sendmsg (net/socket.c:802 (discriminator 4))
____sys_sendmsg (net/socket.c:2698)
___sys_sendmsg (net/socket.c:2752)
__sys_sendmsg (net/socket.c:2784)
do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:121)

Fixes: 5eb902b ("net/ipv6: Remove expired routes with a separated list of routes.")
Reported-by: Ji'an Zhou <eilaimemedsnaimel@gmail.com>
Reviewed-by: David Ahern <dahern@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20260609145448.768318-1-idosch@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

