magic incantation to get llama-bench to work with MTP?
Users on the r/LocalLLaMA subreddit are looking for a way to run llama-bench with MTP (multi-token prediction), because the standard methods that work with llama-server fail when applied to llama-bench. The core issue appears to be one of compatibility: commenters speculate that llama-bench may not support speculative decoding.