path: root/drivers/phy
author	Lang Yu <Lang.Yu@amd.com>	2024-04-26 14:56:35 +0800
committer	Alex Deucher <alexander.deucher@amd.com>	2024-05-20 17:17:53 -0400
commit	eb853413d02c8d9b27942429b261a9eef228f005 (patch)
tree	0ba5eb129a4142ed3765aadcf47dce47bc590b2e /drivers/phy
parent	2a705f3e49d20b59cd9e5cc3061b2d92ebe1e5f0 (diff)
drm/amdkfd: Let VRAM allocations go to GTT domain on small APUs
Small APUs (i.e. consumer and embedded products) usually have a small
carveout of device memory which cannot satisfy the memory allocation
requirements of most compute workloads. We can't even run a basic MNIST
example (https://github.com/pytorch/examples/tree/main/mnist) with the
default 512MB carveout.

Error log:
"torch.cuda.OutOfMemoryError: HIP out of memory. Tried to allocate
84.00 MiB. GPU 0 has a total capacity of 512.00 MiB of which 0 bytes
is free. Of the allocated memory 103.83 MiB is allocated by PyTorch,
and 22.17 MiB is reserved by PyTorch but unallocated"

Though the carveout size can be enlarged through BIOS settings, that is
inflexible and may draw complaints. It also prevents the memory resource
from being used effectively between host and device.

The solution is the MI300A approach, i.e. let VRAM allocations go to GTT,
so that device and host can flexibly and effectively share the memory
resource.

v2: Report local_mem_size_private as 0. (Felix)

Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
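The change amounts to a domain-selection decision in the KFD allocation path. The sketch below only illustrates that idea and is not the actual patch: the kfd_pick_alloc_domain() helper is hypothetical, while AMD_IS_APU, KFD_IOC_ALLOC_MEM_FLAGS_VRAM and the AMDGPU_GEM_DOMAIN_* defines are existing amdgpu/KFD identifiers.

#include "amdgpu.h"
#include "amdgpu_amdkfd.h"

/*
 * Illustrative sketch (hypothetical helper): pick the backing domain for
 * a KFD allocation. On small APUs a "VRAM" request is redirected to the
 * GTT domain so it is backed by system memory instead of the carveout;
 * the private VRAM size is then reported to user space as 0 (v2 note).
 */
static u32 kfd_pick_alloc_domain(struct amdgpu_device *adev, u64 alloc_flags)
{
	if (alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_VRAM) {
		/* Small APU: back the VRAM request with GTT/system memory. */
		if (adev->flags & AMD_IS_APU)
			return AMDGPU_GEM_DOMAIN_GTT;
		return AMDGPU_GEM_DOMAIN_VRAM;
	}

	/* Non-VRAM requests already go to GTT. */
	return AMDGPU_GEM_DOMAIN_GTT;
}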
Diffstat (limited to 'drivers/phy')
0 files changed, 0 insertions, 0 deletions