StepFun 3.7 Flash model shows speed benchmarks on M5 Max chip

By PulseAugur Editorial · [1 source] · 2026-05-29 04:04

A Reddit user shared speed benchmarks for the StepFun 3.7 Flash model running on an M5 Max chip with 128GB of RAM. The model was fast and responsive at short context lengths under 16k tokens and remained usable up to 64k, though memory usage became a factor at longer contexts.

IMPACT Provides performance data for local LLM deployment, aiding users in hardware selection and expectation setting.

RANK_REASON User-generated benchmark of an open-source model.

Read on r/LocalLLaMA →

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

r/LocalLLaMA TIER_1 English(EN) · /u/Beamsters · 2026-05-29 04:04

StepFun 3.7 Flash - Speed Benchmark in M5 Max

https://www.reddit.com/r/LocalLLaMA/comments/1tqqebc/stepfun_37_flash_speed_benchmark_in_m5_max/
