KASAN reports a NULL instruction fetch (RIP=0x0) from
dc_stream_program_cursor_position():
BUG: kernel NULL pointer dereference, address: 0000000000000000
RIP: 0010:0x0
Call Trace:
dc_stream_program_cursor_position+0x344/0x920 [amdgpu]
amdgpu_dm_atomic_commit_tail+...
[ +1.041013] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ +0.000027] #PF: supervisor instruction fetch in kernel mode
[ +0.000013] #PF: error_code(0x0010) - not-present page
[ +0.000012] PGD 0 P4D 0
[ +0.000017] Oops: Oops: 0010 [#1] SMP KASAN NOPTI
[ +0.000017] CPU: 0 UID: 0 PID: 10 Comm: kworker/0:1 Tainted: G E 6.18.0+ #3 PREEMPT(voluntary)
[ +0.000023] Tainted: [E]=UNSIGNED_MODULE
[ +0.000010] Hardware name: ASUS System Product Name/ROG STRIX B550-F GAMING (WI-FI), BIOS 1401 12/03/2020
[ +0.000016] Workqueue: events drm_mode_rmfb_work_fn
[ +0.000022] RIP: 0010:0x0
[ +0.000017] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
[ +0.000015] RSP: 0018:ffffc9000017f4c8 EFLAGS: 00010246
[ +0.000016] RAX: 0000000000000000 RBX: ffff88810afdda80 RCX: 1ffff110457000d1
[ +0.000014] RDX: 1ffffffff87b75bd RSI: 0000000000000000 RDI: ffff88810afdda80
[ +0.000014] RBP: ffffc9000017f538 R08: 0000000000000000 R09: ffff88822b800690
[ +0.000013] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffc3dbac20
[ +0.000014] R13: 0000000000000000 R14: ffff88811ab80000 R15: dffffc0000000000
[ +0.000014] FS: 0000000000000000(0000) GS:ffff888434599000(0000) knlGS:0000000000000000
[ +0.000015] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ +0.000013] CR2: ffffffffffffffd6 CR3: 000000010ee88000 CR4: 0000000000350ef0
[ +0.000014] Call Trace:
[ +0.000010] <TASK>
[ +0.000010] dc_stream_program_cursor_position+0x344/0x920 [amdgpu]
[ +0.001086] ? __pfx_mutex_lock+0x10/0x10
[ +0.000015] ? unwind_next_frame+0x18b/0xa70
[ +0.000019] amdgpu_dm_atomic_commit_tail+0x1124/0xfa20 [amdgpu]
[ +0.001040] ? ret_from_fork_asm+0x1a/0x30
[ +0.000018] ? filter_irq_stacks+0x90/0xa0
[ +0.000022] ? __pfx_amdgpu_dm_atomic_commit_tail+0x10/0x10 [amdgpu]
[ +0.001058] ? kasan_save_track+0x18/0x70
[ +0.000015] ? kasan_save_alloc_info+0x37/0x60
[ +0.000015] ? __kasan_kmalloc+0xc3/0xd0
[ +0.000013] ? __kmalloc_cache_noprof+0x1aa/0x600
[ +0.000016] ? drm_atomic_helper_setup_commit+0x788/0x1450
[ +0.000017] ? drm_atomic_helper_commit+0x7e/0x290
[ +0.000014] ? drm_atomic_commit+0x205/0x2e0
[ +0.000015] ? process_one_work+0x629/0xf80
[ +0.000016] ? worker_thread+0x87f/0x1570
[ +0.000020] ? srso_return_thunk+0x5/0x5f
[ +0.000014] ? __kasan_check_write+0x14/0x30
[ +0.000014] ? srso_return_thunk+0x5/0x5f
[ +0.000013] ? _raw_spin_lock_irq+0x8a/0xf0
[ +0.000015] ? __pfx__raw_spin_lock_irq+0x10/0x10
[ +0.000016] ? srso_return_thunk+0x5/0x5f
[ +0.000013] ? __kasan_check_write+0x14/0x30
[ +0.000014] ? srso_return_thunk+0x5/0x5f
[ +0.000013] ? __wait_for_common+0x204/0x460
[ +0.000015] ? sched_clock_noinstr+0x9/0x10
[ +0.000014] ? __pfx_schedule_timeout+0x10/0x10
[ +0.000014] ? local_clock_noinstr+0xe/0xd0
[ +0.000015] ? __pfx___wait_for_common+0x10/0x10
[ +0.000014] ? srso_return_thunk+0x5/0x5f
[ +0.000013] ? __wait_for_common+0x204/0x460
[ +0.000014] ? __pfx_schedule_timeout+0x10/0x10
[ +0.000015] ? __kasan_kmalloc+0xc3/0xd0
[ +0.000015] ? srso_return_thunk+0x5/0x5f
[ +0.000013] ? wait_for_completion_timeout+0x1d/0x30
[ +0.000015] ? srso_return_thunk+0x5/0x5f
[ +0.000013] ? drm_crtc_commit_wait+0x32/0x180
[ +0.000015] ? srso_return_thunk+0x5/0x5f
[ +0.000013] ? drm_atomic_helper_wait_for_dependencies+0x46a/0x800
[ +0.000019] commit_tail+0x231/0x510
[ +0.000017] drm_atomic_helper_commit+0x219/0x290
[ +0.000015] ? __pfx_drm_atomic_helper_commit+0x10/0x10
[ +0.000016] drm_atomic_commit+0x205/0x2e0
[ +0.000014] ? __pfx_drm_atomic_commit+0x10/0x10
[ +0.000013] ? __pfx_drm_connector_free+0x10/0x10
[ +0.000014] ? __pfx___drm_printfn_info+0x10/0x10
[ +0.000017] ? srso_return_thunk+0x5/0x5f
[ +0.000013] ? drm_atomic_set_crtc_for_connector+0x49e/0x660
[ +0.000015] ? drm_atomic_set_fb_for_plane+0x155/0x290
[ +0.000015] drm_framebuffer_remove+0xa9b/0x1240
[ +0.000014] ? finish_task_switch.isra.0+0x15a/0x840
[ +0.000015] ? __switch_to+0x385/0xda0
[ +0.000015] ? srso_safe_ret+0x1/0x20
[ +0.000013] ? __pfx_drm_framebuffer_remove+0x10/0x10
[ +0.000016] ? kasan_print_address_stack_frame+0x221/0x280
[ +0.000015] drm_mode_rmfb_work_fn+0x14b/0x240
[ +0.000015] process_one_work+0x629/0xf80
[ +0.000012] ? srso_return_thunk+0x5/0x5f
[ +0.000013] ? __kasan_check_write+0x14/0x30
[ +0.000019] worker_thread+0x87f/0x1570
[ +0.000013] ? __pfx__raw_spin_lock_irqsave+0x10/0x10
[ +0.000014] ? __pfx_try_to_wake_up+0x10/0x10
[ +0.000017] ? srso_return_thunk+0x5/0x5f
[ +0.000013] ? kasan_print_address_stack_frame+0x227/0x280
[ +0.000017] ? __pfx_worker_thread+0x10/0x10
[ +0.000014] kthread+0x396/0x830
[ +0.000013] ? __pfx__raw_spin_lock_irq+0x10/0x10
[ +0.000015] ? __pfx_kthread+0x10/0x10
[ +0.000012] ? srso_return_thunk+0x5/0x5f
[ +0.000013] ? __kasan_check_write+0x14/0x30
[ +0.000014] ? srso_return_thunk+0x5/0x5f
[ +0.000013] ? recalc_sigpending+0x180/0x210
[ +0.000015] ? srso_return_thunk+0x5/0x5f
[ +0.000013] ? __pfx_kthread+0x10/0x10
[ +0.000014] ret_from_fork+0x31c/0x3e0
[ +0.000014] ? __pfx_kthread+0x10/0x10
[ +0.000013] ret_from_fork_asm+0x1a/0x30
[ +0.000019] </TASK>
[ +0.000010] Modules linked in: rfcomm(E) cmac(E) algif_hash(E) algif_skcipher(E) af_alg(E) snd_seq_dummy(E) snd_hrtimer(E) qrtr(E) xt_MASQUERADE(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) xt_mark(E) xt_tcpudp(E) nft_compat(E) nf_tables(E) x_tables(E) bnep(E) snd_hda_codec_alc882(E) snd_hda_codec_atihdmi(E) snd_hda_codec_realtek_lib(E) snd_hda_codec_hdmi(E) snd_hda_codec_generic(E) iwlmvm(E) snd_hda_intel(E) binfmt_misc(E) snd_hda_codec(E) snd_hda_core(E) mac80211(E) snd_intel_dspcfg(E) snd_intel_sdw_acpi(E) snd_hwdep(E) snd_pcm(E) libarc4(E) snd_seq_midi(E) snd_seq_midi_event(E) snd_rawmidi(E) amd_atl(E) intel_rapl_msr(E) snd_seq(E) intel_rapl_common(E) iwlwifi(E) jc42(E) snd_seq_device(E) btusb(E) snd_timer(E) btmtk(E) btrtl(E) edac_mce_amd(E) eeepc_wmi(E) polyval_clmulni(E) btbcm(E) ghash_clmulni_intel(E) asus_wmi(E) ee1004(E) platform_profile(E) btintel(E) snd(E) nls_iso8859_1(E) aesni_intel(E) soundcore(E) i2c_piix4(E) cfg80211(E) sparse_keymap(E) wmi_bmof(E) bluetooth(E) k10temp(E) rapl(E)
[ +0.000300] i2c_smbus(E) ccp(E) joydev(E) input_leds(E) gpio_amdpt(E) mac_hid(E) sch_fq_codel(E) msr(E) parport_pc(E) ppdev(E) lp(E) parport(E) efi_pstore(E) nfnetlink(E) dmi_sysfs(E) autofs4(E) cdc_ether(E) usbnet(E) amdgpu(E) amdxcp(E) hid_generic(E) i2c_algo_bit(E) drm_ttm_helper(E) ttm(E) drm_exec(E) drm_panel_backlight_quirks(E) gpu_sched(E) drm_suballoc_helper(E) video(E) drm_buddy(E) usbhid(E) drm_display_helper(E) r8152(E) hid(E) mii(E) cec(E) ahci(E) rc_core(E) igc(E) libahci(E) wmi(E)
[ +0.000294] CR2: 0000000000000000
[ +0.000013] ---[ end trace 0000000000000000 ]---
The crash happens when we unconditionally call into the timing generator
manual trigger hook:
pipe_ctx->stream_res.tg->funcs->program_manual_trigger(...)
On some configurations the timing generator (tg), its funcs table, or the
program_manual_trigger callback can be NULL. Guard all of these before
calling the hook. If the first pipe matching the stream cannot trigger,
keep scanning to find another matching pipe with a valid hook.
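
A minimal sketch of the guarded lookup described above. The struct
layouts, trigger_first_capable_pipe() and count_trigger() are simplified,
hypothetical stand-ins for illustration only; they are not the actual
amdgpu DC definitions or the code in this patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the DC types. */
struct timing_generator;

struct timing_generator_funcs {
	/* Optional hook; may be NULL on some hardware generations. */
	void (*program_manual_trigger)(struct timing_generator *tg);
};

struct timing_generator {
	const struct timing_generator_funcs *funcs;
	int trigger_count;	/* illustration only: counts hook calls */
};

struct pipe_ctx {
	const void *stream;
	struct {
		struct timing_generator *tg;
	} stream_res;
};

/* Stub hook used for illustration; records that it was called. */
static void count_trigger(struct timing_generator *tg)
{
	assert(tg != NULL);
	tg->trigger_count++;
}

/*
 * Fire the manual trigger on the first pipe that matches @stream and
 * has a usable hook. A pipe with a NULL tg, funcs table or callback is
 * skipped so that a later matching pipe still gets a chance.
 * Returns true if a hook was called.
 */
static bool trigger_first_capable_pipe(struct pipe_ctx *pipes, size_t count,
				       const void *stream)
{
	for (size_t i = 0; i < count; i++) {
		struct timing_generator *tg = pipes[i].stream_res.tg;

		if (pipes[i].stream != stream)
			continue;

		/* Guard every level of the pointer chain before calling. */
		if (!tg || !tg->funcs || !tg->funcs->program_manual_trigger)
			continue;

		tg->funcs->program_manual_trigger(tg);
		return true;
	}

	return false;
}
```

Using "continue" rather than "break" on a failed guard is what lets the
scan fall through to another pipe that matches the same stream.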
The issue was originally found on Vg20/DCE 12.1.
Mario successfully tested on Polaris 11/DCE 11.2.
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Alexander Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Fixes: ba448f9ed62c ("drm/amd/display: mouse event trigger to boost RR when idle")
Suggested-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-and-tested-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>