Llama.cpp: Split Mode Tensor Fix Incoming?
A fix is reportedly on the way for crashes in llama.cpp's split mode tensor operations. The issue mainly affects users running multiple GPUs: tests of this mode reportedly show a significant performance uplift, but also frequent crashes caused by VRAM exhaustion. The upcoming fix targets that specific failure and is intended to improve stability on multi-GPU setups.
AI IMPACT If the fix lands as reported, users running large models across multiple GPUs with llama.cpp should see better stability, letting them keep the performance gains without the VRAM-related crashes.