twx-linux

Author	SHA1	Message	Date
Christian König	7a8f285cb5	drm/amdgpu: fix amdgpu_cs_p1_user_fence commit `35588314e9` upstream. The offset is just 32bits here so this can potentially overflow if somebody specifies a large value. Instead reduce the size to calculate the last possible offset. The error handling path incorrectly drops the reference to the user fence BO resulting in potential reference count underflow. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-09-23 11:01:10 +02:00
Ilpo Järvinen	ead3dbc92b	drm/amdgpu: Use RMW accessors for changing LNKCTL [ Upstream commit `ce7d88110b` ] Don't assume that only the driver would be accessing LNKCTL. ASPM policy changes can trigger write to LNKCTL outside of driver's control. And in the case of upstream bridge, the driver does not even own the device it's changing the registers for. Use RMW capability accessors which do proper locking to avoid losing concurrent updates to the register value. Suggested-by: Lukas Wunner <lukas@wunner.de> Fixes: `a2e73f56fa` ("drm/amdgpu: Add support for CIK parts") Fixes: `62a3755341` ("drm/amdgpu: add si implementation v10") Link: https://lore.kernel.org/r/20230717120503.15276-6-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-09-19 12:20:15 +02:00
Srinivasan Shanmugam	a15f309eb9	drm/amdgpu: Update min() to min_t() in 'amdgpu_info_ioctl' [ Upstream commit `a0cc8e1512` ] Fixes the following: WARNING: min() should probably be min_t(size_t, size, sizeof(ip)) + ret = copy_to_user(out, &ip, min((size_t)size, sizeof(ip))); And other style fixes: WARNING: Prefer 'unsigned int' to bare use of 'unsigned' WARNING: Missing a blank line after declarations Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-09-19 12:20:12 +02:00
Arnd Bergmann	826ef15769	drm/amdgpu: avoid integer overflow warning in amdgpu_device_resize_fb_bar() [ Upstream commit `822130b5e8` ] On 32-bit architectures comparing a resource against a value larger than U32_MAX can cause a warning: drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1344:18: error: result of comparison of constant 4294967296 with expression of type 'resource_size_t' (aka 'unsigned int') is always false [-Werror,-Wtautological-constant-out-of-range-compare] res->start > 0x100000000ull) ~~~~~~~~~~ ^ ~~~~~~~~~~~~~~ As gcc does not warn about this in dead code, add an IS_ENABLED() check at the start of the function. This will always return success but not actually resize the BAR on 32-bit architectures without high memory, which is exactly what we want here, as the driver can fall back to bank switching the VRAM access. Fixes: `31b8adab32` ("drm/amdgpu: require a root bus window above 4GB for BAR resize") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-09-19 12:20:11 +02:00
Greg Kroah-Hartman	79ea9eb723	Revert "drm/amdgpu: install stub fence into potential unused fence pointers" This reverts commit `04bd3a362d` which is commit `187916e6ed` upstream. It is reported to cause lots of log spam, so it should be dropped for now. Reported-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/27d08b24-3581-4451-b8db-5df144784d6a@roeck-us.net Reported-by: From: Chia-I Wu <olvaffe@gmail.com> Link: https://lore.kernel.org/r/CAPaKu7RTgAMBLHbwtp4zgiBSDrTFtAj07k5qMzkuLQy2Zr+sZA@mail.gmail.com Cc: Christian König <christian.koenig@amd.com> Cc: Lang Yu <Lang.Yu@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-09-02 09:18:13 +02:00
shanzhulig	b870b9a47f	drm/amdgpu: Fix potential fence use-after-free v2 [ Upstream commit `2e54154b9f` ] fence Decrements the reference count before exiting. Avoid Race Vulnerabilities for fence use-after-free. v2 (chk): actually fix the use after free and not just move it. Signed-off-by: shanzhulig <shanzhulig@gmail.com> Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-08-26 15:26:46 +02:00
Lang Yu	04bd3a362d	drm/amdgpu: install stub fence into potential unused fence pointers [ Upstream commit `187916e6ed` ] When using cpu to update page tables, vm update fences are unused. Install stub fence into these fence pointers instead of NULL to avoid NULL dereference when calling dma_fence_wait() on them. Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-08-26 15:26:43 +02:00
Chia-I Wu	968e27fd03	amdgpu: validate offset_in_bo of drm_amdgpu_gem_va [ Upstream commit `9f0bcf49e9` ] This is motivated by OOB access in amdgpu_vm_update_range when offset_in_bo+map_size overflows. v2: keep the validations in amdgpu_vm_bo_map v3: add the validations to amdgpu_vm_bo_map/amdgpu_vm_bo_replace_map rather than to amdgpu_gem_va_ioctl Fixes: `9f7eb5367d` ("drm/amdgpu: actually use the VM map parameters") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-07-27 08:43:52 +02:00
Bas Nieuwenhuizen	0336c8f072	drm/amdgpu: Validate VM ioctl flags. commit `a2b308044d` upstream. None have been defined yet, so reject anybody setting any. Mesa sets it to 0 anyway. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-07-27 08:43:31 +02:00
Sukrut Bellary	2a2641a842	drm:amd:amdgpu: Fix missing buffer object unlock in failure path [ Upstream commit `60ecaaf548` ] smatch warning - 1) drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c:3615 gfx_v9_0_kiq_resume() warn: inconsistent returns 'ring->mqd_obj->tbo.base.resv'. 2) drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:6901 gfx_v10_0_kiq_resume() warn: inconsistent returns 'ring->mqd_obj->tbo.base.resv'. Signed-off-by: Sukrut Bellary <sukrut.bellary@linux.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-06-21 15:45:36 +02:00
Chia-I Wu	0038055135	drm/amdgpu: fix xclk freq on CHIP_STONEY commit `b447b079cf` upstream. According to Alex, most APUs from that time seem to have the same issue (vbios says 48Mhz, actual is 100Mhz). I only have a CHIP_STONEY so I limit the fixup to CHIP_STONEY Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-06-14 11:09:49 +02:00
Guchun Chen	c1420276be	drm/amdgpu: disable sdma ecc irq only when sdma RAS is enabled in suspend commit `8b229ada26` upstream. sdma_v4_0_ip is shared on a few asics, but in sdma_v4_0_hw_fini, driver unconditionally disables ecc_irq which is only enabled on those asics enabling sdma ecc. This will introduce a warning in suspend cycle on those chips with sdma ip v4.0, while without sdma ecc. So this patch correct this. [ 7283.166354] RIP: 0010:amdgpu_irq_put+0x45/0x70 [amdgpu] [ 7283.167001] RSP: 0018:ffff9a5fc3967d08 EFLAGS: 00010246 [ 7283.167019] RAX: ffff98d88afd3770 RBX: 0000000000000001 RCX: 0000000000000000 [ 7283.167023] RDX: 0000000000000000 RSI: ffff98d89da30390 RDI: ffff98d89da20000 [ 7283.167025] RBP: ffff98d89da20000 R08: 0000000000036838 R09: 0000000000000006 [ 7283.167028] R10: ffffd5764243c008 R11: 0000000000000000 R12: ffff98d89da30390 [ 7283.167030] R13: ffff98d89da38978 R14: ffffffff999ae15a R15: ffff98d880130105 [ 7283.167032] FS: 0000000000000000(0000) GS:ffff98d996f00000(0000) knlGS:0000000000000000 [ 7283.167036] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 7283.167039] CR2: 00000000f7a9d178 CR3: 00000001c42ea000 CR4: 00000000003506e0 [ 7283.167041] Call Trace: [ 7283.167046] <TASK> [ 7283.167048] sdma_v4_0_hw_fini+0x38/0xa0 [amdgpu] [ 7283.167704] amdgpu_device_ip_suspend_phase2+0x101/0x1a0 [amdgpu] [ 7283.168296] amdgpu_device_suspend+0x103/0x180 [amdgpu] [ 7283.168875] amdgpu_pmops_freeze+0x21/0x60 [amdgpu] [ 7283.169464] pci_pm_freeze+0x54/0xc0 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2522 Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-17 11:48:16 +02:00
Guchun Chen	20ca90ceda	drm/amdgpu/gfx: disable gfx9 cp_ecc_error_irq only when enabling legacy gfx ras commit `4a76680311` upstream. gfx9 cp_ecc_error_irq is only enabled when legacy gfx ras is assert. So in gfx_v9_0_hw_fini, interrupt disablement for cp_ecc_error_irq should be executed under such condition, otherwise, an amdgpu_irq_put calltrace will occur. [ 7283.170322] RIP: 0010:amdgpu_irq_put+0x45/0x70 [amdgpu] [ 7283.170964] RSP: 0018:ffff9a5fc3967d00 EFLAGS: 00010246 [ 7283.170967] RAX: ffff98d88afd3040 RBX: ffff98d89da20000 RCX: 0000000000000000 [ 7283.170969] RDX: 0000000000000000 RSI: ffff98d89da2bef8 RDI: ffff98d89da20000 [ 7283.170971] RBP: ffff98d89da20000 R08: ffff98d89da2ca18 R09: 0000000000000006 [ 7283.170973] R10: ffffd5764243c008 R11: 0000000000000000 R12: 0000000000001050 [ 7283.170975] R13: ffff98d89da38978 R14: ffffffff999ae15a R15: ffff98d880130105 [ 7283.170978] FS: 0000000000000000(0000) GS:ffff98d996f00000(0000) knlGS:0000000000000000 [ 7283.170981] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 7283.170983] CR2: 00000000f7a9d178 CR3: 00000001c42ea000 CR4: 00000000003506e0 [ 7283.170986] Call Trace: [ 7283.170988] <TASK> [ 7283.170989] gfx_v9_0_hw_fini+0x1c/0x6d0 [amdgpu] [ 7283.171655] amdgpu_device_ip_suspend_phase2+0x101/0x1a0 [amdgpu] [ 7283.172245] amdgpu_device_suspend+0x103/0x180 [amdgpu] [ 7283.172823] amdgpu_pmops_freeze+0x21/0x60 [amdgpu] [ 7283.173412] pci_pm_freeze+0x54/0xc0 [ 7283.173419] ? __pfx_pci_pm_freeze+0x10/0x10 [ 7283.173425] dpm_run_callback+0x98/0x200 [ 7283.173430] __device_suspend+0x164/0x5f0 v2: drop gfx11 as it's fixed in a different solution by retiring cp_ecc_irq funcs(Hawking) Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2522 Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-17 11:48:16 +02:00
Hamza Mahfooz	eed63477ae	drm/amdgpu: fix an amdgpu_irq_put() issue in gmc_v9_0_hw_fini() commit `922a76ba31` upstream. As made mention of in commit `08c677cb0b` ("drm/amdgpu: fix amdgpu_irq_put call trace in gmc_v10_0_hw_fini") and commit `13af556104` ("drm/amdgpu: fix amdgpu_irq_put call trace in gmc_v11_0_hw_fini"). It is meaningless to call amdgpu_irq_put() for gmc.ecc_irq. So, remove it from gmc_v9_0_hw_fini(). Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2522 Fixes: 3029c855d79f ("drm/amdgpu: Fix desktop freezed after gpu-reset") Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-17 11:48:15 +02:00
Chia-I Wu	7f497a9451	drm/amdgpu: add a missing lock for AMDGPU_SCHED [ Upstream commit `2397e3d8d2` ] mgr->ctx_handles should be protected by mgr->lock. v2: improve commit message v3: add a Fixes tag Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Fixes: `52c6a62c64` ("drm/amdgpu: add interface for editing a foreign process's priority v3") Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-17 11:48:12 +02:00
Alex Deucher	4279e87da6	drm/amdgpu: fix error checking in amdgpu_read_mm_registers for soc15 commit `0dcdf8498e` upstream. Properly skip non-existent registers as well. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2442 Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-03-17 08:45:06 +01:00
Alex Deucher	2f45b20c39	Revert "drm/amdgpu: make display pinning more flexible (v2)" This reverts commit `6302709784` which is commit `81d0bcf990` upstream. This commit causes hiberation regressions on some platforms on kernels older than 6.1.x (6.1.x and newer kernels works fine) so let's revert it from 5.15 and older stable kernels. This should be reverted from 6.0.x as well, but that kernel is no longer supported. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=216917 Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: kolAflash@kolahilft.de Cc: jrf@mailbox.org Cc: mario.limonciello@amd.com Cc: stable@vger.kernel.org # 5.15.x Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-01-24 07:20:01 +01:00
Alex Deucher	6302709784	drm/amdgpu: make display pinning more flexible (v2) commit `81d0bcf990` upstream. Only apply the static threshold for Stoney and Carrizo. This hardware has certain requirements that don't allow mixing of GTT and VRAM. Newer asics do not have these requirements so we should be able to be more flexible with where buffers end up. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2270 Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2291 Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2255 Acked-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-01-14 10:16:40 +01:00
Alex Deucher	dfc01905b8	drm/amdgpu: handle polaris10/11 overlap asics (v2) commit `1d4624cd72` upstream. Some special polaris 10 chips overlap with the polaris11 DID range. Handle this properly in the driver. v2: use local flags for other function calls. Acked-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-01-14 10:16:40 +01:00
Nathan Chancellor	a42a23bdae	drm/amdgpu: Fix type of second parameter in trans_msg() callback [ Upstream commit `f0d0f10873` ] With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG), indirect call targets are validated against the expected function pointer prototype to make sure the call target is valid to help mitigate ROP attacks. If they are not identical, there is a failure at run time, which manifests as either a kernel panic or thread getting killed. A proposed warning in clang aims to catch these at compile time, which reveals: drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c:412:15: error: incompatible function pointer types initializing 'void ()(struct amdgpu_device , u32, u32, u32, u32)' (aka 'void ()(struct amdgpu_device , unsigned int, unsigned int, unsigned int, unsigned int)') with an expression of type 'void (struct amdgpu_device , enum idh_request, u32, u32, u32)' (aka 'void (struct amdgpu_device , enum idh_request, unsigned int, unsigned int, unsigned int)') [-Werror,-Wincompatible-function-pointer-types-strict] .trans_msg = xgpu_ai_mailbox_trans_msg, ^~~~~~~~~~~~~~~~~~~~~~~~~ 1 error generated. drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c:435:15: error: incompatible function pointer types initializing 'void ()(struct amdgpu_device , u32, u32, u32, u32)' (aka 'void ()(struct amdgpu_device , unsigned int, unsigned int, unsigned int, unsigned int)') with an expression of type 'void (struct amdgpu_device , enum idh_request, u32, u32, u32)' (aka 'void (struct amdgpu_device , enum idh_request, unsigned int, unsigned int, unsigned int)') [-Werror,-Wincompatible-function-pointer-types-strict] .trans_msg = xgpu_nv_mailbox_trans_msg, ^~~~~~~~~~~~~~~~~~~~~~~~~ 1 error generated. The type of the second parameter in the prototype should be 'enum idh_request' instead of 'u32'. Update it to clear up the warnings. Link: https://github.com/ClangBuiltLinux/linux/issues/1750 Reported-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-01-14 10:16:16 +01:00
Xiongfeng Wang	7c1ddf7c66	drm/amdgpu: Fix PCI device refcount leak in amdgpu_atrm_get_bios() [ Upstream commit `ca54639c77` ] As comment of pci_get_class() says, it returns a pci_device with its refcount increased and decreased the refcount for the input parameter @from if it is not NULL. If we break the loop in amdgpu_atrm_get_bios() with 'pdev' not NULL, we need to call pci_dev_put() to decrease the refcount. Add the missing pci_dev_put() to avoid refcount leak. Fixes: `d38ceaf99e` ("drm/amdgpu: add core driver (v4)") Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-01-14 10:15:35 +01:00
Yang Yingliang	3725a8f26b	drm/amdgpu: fix pci device refcount leak [ Upstream commit `b85e285e3d` ] As comment of pci_get_domain_bus_and_slot() says, it returns a pci device with refcount increment, when finish using it, the caller must decrement the reference count by calling pci_dev_put(). So before returning from amdgpu_device_resume\|suspend_display_audio(), pci_dev_put() is called to avoid refcount leak. Fixes: `3f12acc8d6` ("drm/amdgpu: put the audio codec into suspend state before gpu reset V3") Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-01-14 10:15:31 +01:00
Alex Deucher	d3f5be8246	drm/amdgpu: Partially revert "drm/amdgpu: update drm_display_info correctly when the edid is read" [ Upstream commit `602ad43c3c` ] This partially reverts `20543be93c`. Calling drm_connector_update_edid_property() in amdgpu_connector_free_edid() causes a noticeable pause in the system every 10 seconds on polled outputs so revert this part of the change. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2257 Cc: Claudio Suarez <cssk@net-c.es> Acked-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-08 11:23:54 +01:00
Claudio Suarez	00570fafc2	drm/amdgpu: update drm_display_info correctly when the edid is read [ Upstream commit `20543be93c` ] drm_display_info is updated by drm_get_edid() or drm_connector_update_edid_property(). In the amdgpu driver it is almost always updated when the edid is read in amdgpu_connector_get_edid(), but not always. Change amdgpu_connector_get_edid() and amdgpu_connector_free_edid() to keep drm_display_info updated. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Claudio Suarez <cssk@net-c.es> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Stable-dep-of: `602ad43c3c` ("drm/amdgpu: Partially revert "drm/amdgpu: update drm_display_info correctly when the edid is read"") Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-08 11:23:54 +01:00
Christian König	feb97cf45e	drm/amdgpu: always register an MMU notifier for userptr commit `b39df63b16` upstream. Since switching to HMM we always need that because we no longer grab references to the pages. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> CC: stable@vger.kernel.org Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-12-02 17:40:06 +01:00
Danijel Slivka	5bf8c7798b	drm/amdgpu: set vm_update_mode=0 as default for Sienna Cichlid in SRIOV case [ Upstream commit `65f8682b9a` ] For asic with VF MMIO access protection avoid using CPU for VM table updates. CPU pagetable updates have issues with HDP flush as VF MMIO access protection blocks write to mmBIF_BX_DEV0_EPF0_VF0_HDP_MEM_COHERENCY_FLUSH_CNTL register during sriov runtime. v3: introduce virtualization capability flag AMDGPU_VF_MMIO_ACCESS_PROTECT which indicates that VF MMIO write access is not allowed in sriov runtime Signed-off-by: Danijel Slivka <danijel.slivka@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-11-10 18:14:23 +01:00
Alex Deucher	243c8f42ba	Revert "drm/amdgpu: make sure to init common IP before gmc" This reverts commit `7b0db849ea` which is commit `a8671493d2` upstream. The patches that this patch depends on were not backported properly and the patch that caused the regression that this patch set fixed was reverted in commit `412b844143` ("Revert "PCI/portdrv: Don't disable AER reporting in get_port_device_capability()""). This isn't necessary and causes a regression so drop it. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2216 Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Sasha Levin <sashal@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: <stable@vger.kernel.org> # 5.10 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-10-26 13:25:56 +02:00
Shuah Khan	357db159e9	Revert "drm/amdgpu: use dirty framebuffer helper" This reverts commit `867b2b2b68` which is commit `66f99628eb` upstream. With this commit, dmesg fills up with the following messages and drm initialization takes a very long time. This commit has bee reverted from 5.4 [drm] Fence fallback timer expired on ring sdma0 [drm] Fence fallback timer expired on ring gfx [drm] Fence fallback timer expired on ring sdma0 [drm] Fence fallback timer expired on ring gfx [drm] Fence fallback timer expired on ring sdma0 [drm] Fence fallback timer expired on ring sdma0 [drm] Fence fallback timer expired on ring sdma0 [drm] Fence fallback timer expired on ring gfx Cc: <stable@vger.kernel.org> # 5.10 Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-10-26 13:25:56 +02:00
Shuah Khan	98ab15bfdc	Revert "drm/amdgpu: move nbio sdma_doorbell_range() into sdma code for vega" This reverts commit `9f55f36f74` which is commit `e3163bc8ff` upstream. This commit causes repeated WARN_ONs from drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amd gpu_dm.c:7391 amdgpu_dm_atomic_commit_tail+0x23b9/0x2430 [amdgpu] dmesg fills up with the following messages and drm initialization takes a very long time. Cc: <stable@vger.kernel.org> # 5.10 Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-10-26 13:25:56 +02:00
hongao	1c7d957c5d	drm/amdgpu: fix initial connector audio value [ Upstream commit `4bb71fce58` ] This got lost somewhere along the way, This fixes audio not working until set_property was called. Signed-off-by: hongao <hongao@uniontech.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-10-26 13:25:49 +02:00
Hamza Mahfooz	867b2b2b68	drm/amdgpu: use dirty framebuffer helper [ Upstream commit `66f99628eb` ] Currently, we aren't handling DRM_IOCTL_MODE_DIRTYFB. So, use drm_atomic_helper_dirtyfb() as the dirty callback in the amdgpu_fb_funcs struct. Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-28 11:10:39 +02:00
Luben Tuikov	09867977fc	drm/amdgpu: Fix check for RAS support [ Upstream commit `084e2640e5` ] Use positive logic to check for RAS support. Rename the function to actually indicate what it is testing for. Essentially, make the function a predicate with the correct name. Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Stable-dep-of: `6c20490663` ("drm/amdgpu: Don't enable LTR if not supported") Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-28 11:10:38 +02:00
Jingwen Chen	fda04a0bab	drm/amd/amdgpu: fixing read wrong pf2vf data in SRIOV commit `9a458402fb` upstream. [Why] This fixes `892deb4826` ("drm/amdgpu: Separate vf2pf work item init from virt data exchange"). we should read pf2vf data based at mman.fw_vram_usage_va after gmc sw_init. commit `892deb4826` breaks this logic. [How] calling amdgpu_virt_exchange_data in amdgpu_virt_init_data_exchange to set the right base in the right sequence. v2: call amdgpu_virt_init_data_exchange after gmc sw_init to make data exchange workqueue run v3: clean up the code logic v4: add some comment and make the code more readable Fixes: `892deb4826` ("drm/amdgpu: Separate vf2pf work item init from virt data exchange") Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com> Reviewed-by: Horace Chen <horace.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-28 11:10:37 +02:00
Alex Deucher	7b0db849ea	drm/amdgpu: make sure to init common IP before gmc [ Upstream commit `a8671493d2` ] Move common IP init before GMC init so that HDP gets remapped before GMC init which uses it. This fixes the Unsupported Request error reported through AER during driver load. The error happens as a write happens to the remap offset before real remapping is done. Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373 The error was unnoticed before and got visible because of the commit referenced below. This doesn't fix anything in the commit below, rather fixes the issue in amdgpu exposed by the commit. The reference is only to associate this commit with below one so that both go together. Fixes: `8795e182b0` ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()") Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-28 11:10:20 +02:00
Victor Skvortsov	9d18013dac	drm/amdgpu: Separate vf2pf work item init from virt data exchange [ Upstream commit `892deb4826` ] We want to be able to call virt data exchange conditionally after gmc sw init to reserve bad pages as early as possible. Since this is a conditional call, we will need to call it again unconditionally later in the init sequence. Refactor the data exchange function so it can be called multiple times without re-initializing the work item. v2: Cleaned up the code. Kept the original call to init_exchange_data() inside early init to initialize the work item, afterwards call exchange_data() when needed. Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com> Reviewed By: Shaoyun.liu <Shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-28 11:10:19 +02:00
Peng Ju Zhou	87a4e51fb8	drm/amdgpu: indirect register access for nv12 sriov [ Upstream commit `8b8a162da8` ] unify host driver and guest driver indirect access control bits names Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com> Reviewed-by: Emily.Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-28 11:10:19 +02:00
Alex Deucher	9f55f36f74	drm/amdgpu: move nbio sdma_doorbell_range() into sdma code for vega [ Upstream commit `e3163bc8ff` ] This mirrors what we do for other asics and this way we are sure the sdma doorbell range is properly initialized. There is a comment about the way doorbells on gfx9 work that requires that they are initialized for other IPs before GFX is initialized. However, the statement says that it applies to multimedia as well, but the VCN code currently initializes doorbells after GFX and there are no known issues there. In my testing at least I don't see any problems on SDMA. This is a prerequisite for fixing the Unsupported Request error reported through AER during driver load. Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373 The error was unnoticed before and got visible because of the commit referenced below. This doesn't fix anything in the commit below, rather fixes the issue in amdgpu exposed by the commit. The reference is only to associate this commit with below one so that both go together. Fixes: `8795e182b0` ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()") Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-28 11:10:19 +02:00
Chengming Gui	187908079d	drm/amd/amdgpu: skip ucode loading if ucode_size == 0 [ Upstream commit `39c84b8e92` ] Restrict the ucode loading check to avoid frontdoor loading error. Signed-off-by: Chengming Gui <Jack.Gui@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-20 12:38:32 +02:00
Qu Huang	dfb27648ee	drm/amdgpu: mmVM_L2_CNTL3 register not initialized correctly [ Upstream commit `b8983d4252` ] The mmVM_L2_CNTL3 register is not assigned an initial value Signed-off-by: Qu Huang <jinsdb@126.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-15 11:32:03 +02:00
Candice Li	0410256867	drm/amdgpu: Check num_gfx_rings for gfx v9_0 rb setup. [ Upstream commit `c351938350` ] No need to set up rb when no gfx rings. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-15 11:32:02 +02:00
YiPeng Chai	c19656cd95	drm/amdgpu: Move psp_xgmi_terminate call from amdgpu_xgmi_remove_device to psp_hw_fini [ Upstream commit `9d705d7741` ] V1: The amdgpu_xgmi_remove_device function will send unload command to psp through psp ring to terminate xgmi, but psp ring has been destroyed in psp_hw_fini. V2: 1. Change the commit title. 2. Restore amdgpu_xgmi_remove_device to its original calling location. Move psp_xgmi_terminate call from amdgpu_xgmi_remove_device to psp_hw_fini. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-15 11:32:02 +02:00
Dusica Milinkovic	afa169f79d	drm/amdgpu: Increase tlb flush timeout for sriov [ Upstream commit `373008bfc9` ] [Why] During multi-vf executing benchmark (Luxmark) observed kiq error timeout. It happenes because all of VFs do the tlb invalidation at the same time. Although each VF has the invalidate register set, from hardware side the invalidate requests are queue to execute. [How] In case of 12 VF increase timeout on 12*100ms Signed-off-by: Dusica Milinkovic <Dusica.Milinkovic@amd.com> Acked-by: Shaoyun Liu <shaoyun.liu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-05 10:28:58 +02:00
Leo Li	15d0aeb017	drm/amdgpu: Check BO's requested pinning domains against its preferred_domains commit `f5ba140436` upstream. When pinning a buffer, we should check to see if there are any additional restrictions imposed by bo->preferred_domains. This will prevent the BO from being moved to an invalid domain when pinning. For example, this can happen if the user requests to create a BO in GTT domain for display scanout. amdgpu_dm will allow pinning to either VRAM or GTT domains, since DCN can scanout from either or. However, in amdgpu_bo_pin_restricted(), pinning to VRAM is preferred if there is adequate carveout. This can lead to pinning to VRAM despite the user requesting GTT placement for the BO. v2: Allow the kernel to override the domain, which can happen when exporting a BO to a V4L camera (for example). Signed-off-by: Leo Li <sunpeng.li@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-08-21 15:15:24 +02:00
Ruili Ji	03b9e01659	drm/amdgpu: To flush tlb for MMHUB of RAVEN series commit `5cb0e3fb2c` upstream. amdgpu: [mmhub0] no-retry page fault (src_id:0 ring:40 vmid:8 pasid:32769, for process test_basic pid 3305 thread test_basic pid 3305) amdgpu: in page starting at address 0x00007ff990003000 from IH client 0x12 (VMC) amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00840051 amdgpu: Faulty UTCL2 client ID: MP1 (0x0) amdgpu: MORE_FAULTS: 0x1 amdgpu: WALKER_ERROR: 0x0 amdgpu: PERMISSION_FAULTS: 0x5 amdgpu: MAPPING_ERROR: 0x0 amdgpu: RW: 0x1 When memory is allocated by kfd, no one triggers the tlb flush for MMHUB0. There is page fault from MMHUB0. v2:fix indentation v3:change subject and fix indentation Signed-off-by: Ruili Ji <ruiliji2@amd.com> Reviewed-by: Philip Yang <philip.yang@amd.com> Reviewed-by: Aaron Liu <aaron.liu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-07-07 17:52:15 +02:00
Dave Airlie	be585921f2	drm/amdgpu/cs: make commands with 0 chunks illegal behaviour. commit `31ab27b14d` upstream. Submitting a cs with 0 chunks, causes an oops later, found trying to execute the wrong userspace driver. MESA_LOADER_DRIVER_OVERRIDE=v3d glxinfo [172536.665184] BUG: kernel NULL pointer dereference, address: 00000000000001d8 [172536.665188] #PF: supervisor read access in kernel mode [172536.665189] #PF: error_code(0x0000) - not-present page [172536.665191] PGD 6712a0067 P4D 6712a0067 PUD 5af9ff067 PMD 0 [172536.665195] Oops: 0000 [#1] SMP NOPTI [172536.665197] CPU: 7 PID: 2769838 Comm: glxinfo Tainted: P O 5.10.81 #1-NixOS [172536.665199] Hardware name: To be filled by O.E.M. To be filled by O.E.M./CROSSHAIR V FORMULA-Z, BIOS 2201 03/23/2015 [172536.665272] RIP: 0010:amdgpu_cs_ioctl+0x96/0x1ce0 [amdgpu] [172536.665274] Code: 75 18 00 00 4c 8b b2 88 00 00 00 8b 46 08 48 89 54 24 68 49 89 f7 4c 89 5c 24 60 31 d2 4c 89 74 24 30 85 c0 0f 85 c0 01 00 00 <48> 83 ba d8 01 00 00 00 48 8b b4 24 90 00 00 00 74 16 48 8b 46 10 [172536.665276] RSP: 0018:ffffb47c0e81bbe0 EFLAGS: 00010246 [172536.665277] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [172536.665278] RDX: 0000000000000000 RSI: ffffb47c0e81be28 RDI: ffffb47c0e81bd68 [172536.665279] RBP: ffff936524080010 R08: 0000000000000000 R09: ffffb47c0e81be38 [172536.665281] R10: ffff936524080010 R11: ffff936524080000 R12: ffffb47c0e81bc40 [172536.665282] R13: ffffb47c0e81be28 R14: ffff9367bc410000 R15: ffffb47c0e81be28 [172536.665283] FS: 00007fe35e05d740(0000) GS:ffff936c1edc0000(0000) knlGS:0000000000000000 [172536.665284] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [172536.665286] CR2: 00000000000001d8 CR3: 0000000532e46000 CR4: 00000000000406e0 [172536.665287] Call Trace: [172536.665322] ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu] [172536.665332] drm_ioctl_kernel+0xaa/0xf0 [drm] [172536.665338] drm_ioctl+0x201/0x3b0 [drm] [172536.665369] ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu] [172536.665372] ? selinux_file_ioctl+0x135/0x230 [172536.665399] amdgpu_drm_ioctl+0x49/0x80 [amdgpu] [172536.665403] __x64_sys_ioctl+0x83/0xb0 [172536.665406] do_syscall_64+0x33/0x40 [172536.665409] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2018 Signed-off-by: Dave Airlie <airlied@redhat.com> Cc: stable@vger.kernel.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-06-09 10:21:24 +02:00
Alice Wong	3ee67465f7	drm/amdgpu/ucode: Remove firmware load type check in amdgpu_ucode_free_bo [ Upstream commit `ab0cd4a9ae` ] When psp_hw_init failed, it will set the load_type to AMDGPU_FW_LOAD_DIRECT. During amdgpu_device_ip_fini, amdgpu_ucode_free_bo checks that load_type is AMDGPU_FW_LOAD_DIRECT and skips deallocating fw_buf causing memory leak. Remove load_type check in amdgpu_ucode_free_bo. Signed-off-by: Alice Wong <shiwei.wong@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-06-09 10:20:53 +02:00
Tomasz Moń	1fcfe37d17	drm/amdgpu: Enable gfxoff quirk on MacBook Pro commit `4593c1b6d1` upstream. Enabling gfxoff quirk results in perfectly usable graphical user interface on MacBook Pro (15-inch, 2019) with Radeon Pro Vega 20 4 GB. Without the quirk, X server is completely unusable as every few seconds there is gpu reset due to ring gfx timeout. Signed-off-by: Tomasz Moń <desowin@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-04-20 09:23:28 +02:00
Tianci Yin	e5afacc826	drm/amdgpu/vcn: improve vcn dpg stop procedure [ Upstream commit `6ea239adc2` ] Prior to disabling dpg, VCN need unpausing dpg mode, or VCN will hang in S3 resuming. Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Tianci Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-20 09:23:20 +02:00
Tushar Patel	d2e0931e6d	drm/amdkfd: Fix Incorrect VMIDs passed to HWS [ Upstream commit `b7dfbd2e60` ] Compute-only GPUs have more than 8 VMIDs allocated to KFD. Fix this by passing correct number of VMIDs to HWS v2: squash in warning fix (Alex) Signed-off-by: Tushar Patel <tushar.patel@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-20 09:23:20 +02:00
Aurabindo Pillai	9b5d1b3413	drm/amd: Add USBC connector ID [ Upstream commit `c5c948aa89` ] [Why&How] Add a dedicated AMDGPU specific ID for use with newer ASICs that support USB-C output Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-20 09:23:18 +02:00

1 2 3 4 5 ...

8272 Commits