Jetson Orin NX powers Hermes Agent with 65K context and fast inference

By PulseAugur Editorial · [1 sources] · 2026-06-09 11:10

A user has configured a Jetson Orin NX to run the Hermes Agent. The build prioritizes silence and aesthetic appeal while delivering over 10 tokens/sec for text generation and 300 tokens/sec for prompt processing. The setup supports a context window of at least 65,000 tokens; in one test, a Gemma 4 26B model reached 10.21 tokens/sec at 60,000 tokens of context.
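To put the throughput figures in perspective, the following is a rough back-of-the-envelope sketch, not a measurement from the build. It assumes the reported rates (300 tokens/sec for prompt processing, 10.21 tokens/sec for generation) stay constant over the whole request, which real hardware does not guarantee at long context. The 500-token reply length and the function name `estimate_latency_seconds` are illustrative assumptions, not values from the original post.

```python
def estimate_latency_seconds(prompt_tokens, output_tokens, prompt_rate, gen_rate):
    """Estimate request latency under a constant-throughput assumption.

    prompt_rate and gen_rate are in tokens per second.
    Returns (prefill_seconds, decode_seconds, total_seconds).
    """
    prefill_s = prompt_tokens / prompt_rate   # time to ingest the prompt
    decode_s = output_tokens / gen_rate       # time to generate the reply
    return prefill_s, decode_s, prefill_s + decode_s


# Rates taken from the summary; the 500-token reply is a hypothetical example.
prefill, decode, total = estimate_latency_seconds(
    prompt_tokens=60_000,
    output_tokens=500,
    prompt_rate=300.0,
    gen_rate=10.21,
)
print(f"prefill ~{prefill:.0f} s, decode ~{decode:.0f} s, total ~{total:.0f} s")
```

Under these assumptions, ingesting a cold 60,000-token prompt takes about 200 seconds, so prompt processing rather than generation would dominate the wait for a short reply.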

IMPACT Demonstrates efficient local LLM deployment on compact hardware, enabling advanced agent capabilities.

RANK_REASON User-driven hardware and software configuration for a specific AI agent.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

r/LocalLLaMA TIER_1 English(EN) · /u/Reddactor · 2026-06-09 11:10

Jetson Orin NX Build for Hermes Agent + Benchmarking

https://www.reddit.com/r/LocalLLaMA/comments/1u11wvo/jetson_orin_nx_build_for_hermes_agent_benchmarking/
