RTX 5080 + RTX 3090 setup achieves 80+ tokens/sec with Qwen 3.6 27B (Q8)

By PulseAugur Editorial · [1 source] · 2026-06-13 09:55

A user has detailed a dual-GPU setup for running the Qwen 3.6 27B language model at Q8 quantization, reporting a generation speed of more than 80 tokens per second. The setup pairs an RTX 5080 with an RTX 3090 graphics card.
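As a rough illustration of why a 27B model at Q8 would be split across both cards, the arithmetic can be sketched as below. This is a back-of-the-envelope estimate, not the author's method: the 8.5 bits-per-weight figure assumes a llama.cpp-style Q8_0 format, which this summary does not confirm, and the VRAM sizes are the cards' published specifications rather than numbers from the post. KV cache and activations are ignored.

```python
# Back-of-the-envelope VRAM estimate. All constants are assumptions for
# illustration; none of them are taken from the linked post.
PARAMS_BILLION = 27.0   # parameter count implied by the "27B" model name
BITS_PER_WEIGHT = 8.5   # assumed: Q8_0-style format (8-bit values plus a per-block scale)
VRAM_GB = {"RTX 5080": 16, "RTX 3090": 24}  # published memory sizes of the two cards


def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GB (10^9 bytes)."""
    return params_billion * bits_per_weight / 8


weights_gb = weight_footprint_gb(PARAMS_BILLION, BITS_PER_WEIGHT)
total_vram_gb = sum(VRAM_GB.values())

print(f"weights: ~{weights_gb:.1f} GB")
for name, gb in VRAM_GB.items():
    # Neither card alone is expected to hold the full set of weights.
    print(f"{name}: {gb} GB, fits alone: {weights_gb < gb}")
print(f"combined: {total_vram_gb} GB, headroom: ~{total_vram_gb - weights_gb:.1f} GB")
```

Under these assumptions the weights alone exceed either card's memory but fit in the combined 40 GB, with the remaining headroom available for context and runtime overhead.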

IMPACT Demonstrates achievable inference speeds for large language models on consumer-grade hardware.

RANK_REASON User-level hardware setup and performance report for a specific model.

infra

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

Mastodon · mastodon.social · TIER_1 · English (EN) · 2026-06-13 09:55

RTX 5080 and RTX 3090 Setup: 80 Tok/s on Qwen 3.6 27B Q8 https://imil.net/blog/posts/2026/rtx-5080-+-rtx-3090-setup-80+-tok-s-on-qwen-3.6-27b-q8/ # HackerNews # Tech # AI

LINKS imil.net/…/rtx-5080-+-rtx-3090-setup-80+-…
