PulseAugur

Guide: Run LLMs on AMD NPUs with FastFlowLM on Fedora

This guide explains how to run large language models (LLMs) on AMD NPUs using FastFlowLM on Fedora Linux. Because pre-built packages are not available for Fedora, the four-layer setup requires building XRT, the NPU plugin, and FastFlowLM from source. The main pitfalls are leaving the IOMMU disabled and symlinking XRT components incorrectly. The guide gives step-by-step instructions for installing dependencies, building and installing XRT and the NPU plugin, and configuring memory lock limits, and it stresses that the `amd_iommu=off` kernel parameter must not be set.
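Two of the prerequisites named above, the absence of `amd_iommu=off` and the memory lock limit, can be inspected before any building starts. The following is a minimal sketch, not part of the original guide: the function names are hypothetical, it assumes a Linux system that exposes `/proc/cmdline`, and it only reports the current values because the summary does not state what memory lock limit is required.

```python
import resource
from pathlib import Path


def iommu_disabled(cmdline: str) -> bool:
    """Return True if the kernel command line contains the amd_iommu=off token."""
    return "amd_iommu=off" in cmdline.split()


def memlock_limits() -> tuple[int, int]:
    """Return the (soft, hard) RLIMIT_MEMLOCK values for this process, in bytes."""
    return resource.getrlimit(resource.RLIMIT_MEMLOCK)


def describe(limit: int) -> str:
    """Render a limit value, mapping RLIM_INFINITY to 'unlimited'."""
    return "unlimited" if limit == resource.RLIM_INFINITY else f"{limit} bytes"


if __name__ == "__main__":
    # /proc/cmdline holds the parameters the running kernel was booted with.
    cmdline = Path("/proc/cmdline").read_text()
    if iommu_disabled(cmdline):
        print("amd_iommu=off is set; remove it from the kernel parameters and reboot")
    else:
        print("amd_iommu=off is not set")

    soft, hard = memlock_limits()
    print(f"memlock soft limit: {describe(soft)}")
    print(f"memlock hard limit: {describe(hard)}")
```

The check only looks for the exact `amd_iommu=off` token; it does not prove that the IOMMU is active, which can also depend on firmware settings.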

IMPACT Enables running LLMs on AMD NPUs, potentially expanding hardware options for AI inference.

RANK_REASON Guide on setting up specific hardware and software for a particular task.


AI-generated summary · Google Gemini · from 1 source.


COVERAGE [1]

  1. dev.to (LLM tag) · TIER_1 · English (EN) · Ankit Khandelwal

    Running LLMs on AMD NPU with FastFlowLM - Fedora Guide

    Tested on Fedora 44, kernel 7.0.12, ROG Flow Z13 (Ryzen AI Max 390 / Strix Halo NPU).

    Goal: copy-paste setup that gets `flm valida…