Guide released for running Qwen and ASR models locally

Thomas Bley has released new slides on running large language models locally. The presentation covers multi-token prediction using the Qwen3.6 35B-A3B model with Nextn quantization, as well as speech recognition with Qwen-3-ASR, which now works directly with llama.cpp.

IMPACT Provides a guide for local execution of open-source LLMs and ASR models, enabling broader experimentation and use.

RANK_REASON The cluster describes a technical presentation and guide for running open-source models locally, which falls under research and development.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. Mastodon (fosstodon.org) · TIER_1 · English (EN)

    New week, new slides: Run LLMs Locally. Now including multi-token prediction using Qwen3.6 35B-A3B with Nextn quantization. Also, speech recognition using Qwen-3-ASR is now working directly with Llama.cpp and is included in the slides. https://codeberg.org/thbley/talks/raw/branch/ma…