A user encountered a CUDA error when attempting to load a Qwen-3.6-27b model with tensor split mode enabled in the latest version of llama.cpp. The error message indicates that the `llama_params_fit` function is not implemented for tensor split mode, so fitting the parameters to device memory fails. The issue occurred on a system with dual 3090 GPUs running Ubuntu Server 24.04 and CUDA 13.0.
AI IMPACT This issue highlights potential compatibility problems when using advanced features like tensor split mode with specific model quantizations and hardware setups in local LLM deployments.
RANK_REASON User-reported technical issue with open-source software and hardware configuration.
AI-generated summary · Google Gemini · from 1 source.