ENTITY WebGPU

PulseAugur coverage of WebGPU: every cluster mentioning WebGPU across labs, papers, and developer communities, ranked by signal.

Total · 30d: 25 (25 over 90d)

Releases · 30d: 0 (0 over 90d)

Papers · 30d: 4 (4 over 90d)

SENTIMENT · 30D: 7 days with sentiment data

RECENT · PAGE 1/2 · 25 TOTAL

TOOL · CL_112417 · Jun 25 · 20:17

Kuma project compiles PyTorch models for browser execution via WebGPU

A new project called Kuma aims to compile PyTorch models into self-contained WebGPU executables. This approach would allow models to run directly in the browser without needing Python or a server-side runtime. The proje…
TOOL · CL_110863 · Jun 25 · 18:35

230M LFM2.5 model runs in-browser at 1,400 tokens/sec

A 230-million-parameter model, LFM2.5, now runs in a web browser at 1,400 tokens per second. This performance is achieved through custom WebGPU kernels, developed by individuals previously as…
TOOL · CL_110104 · Jun 24 · 19:15

SDXL image generation runs locally in browser via WebGPU

An open-source browser extension runs the SDXL image generation model locally using WebGPU. The extension lets users install and run the model directly within their browser without complex set…
TOOL · CL_103941 · Jun 22 · 15:01

Moebius image inpainting model ported to browser using Claude Code

Simon Willison successfully ported the Moebius 0.2B image inpainting model to run in a web browser using Claude Code. The process involved converting the model to ONNX format and leveraging WebGPU for browser-based exec…
RESEARCH · CL_103994 · Jun 22 · 14:37

New AI frameworks enable safer multi-agent systems and browser-based latent collaboration

Researchers have developed a new framework for hierarchical multi-agent reinforcement learning that enforces safety constraints while maintaining efficiency. This approach uses a constraint manifold at a low level to en…
TOOL · CL_97443 · Jun 17 · 17:06

Gemma 4-E2B runs in-browser at 255 tok/s with WebGPU kernels

A demo and WebGPU kernels for Gemma 4-E2B have been released, enabling in-browser operation at approximately 255 tokens per second. The optimization was reportedly aided by Fable 5 before its shutdown. The release inclu…
TOOL · CL_88761 · Jun 13 · 07:54

Browser-based AI controls virtual hand in physics sandbox

A new AI sandbox called Semantic Hand allows users to control a virtual hand in a browser environment using natural language prompts. The system leverages local AI models like Nemotron 3 Nano 4B or Qwen 3.5-4B, running …
TOOL · CL_71883 · Jun 4 · 19:59

Chrome extension enables local Stable Diffusion 1.5 image generation

A new Chrome extension allows users to run Stable Diffusion 1.5 locally within their browser using WebGPU. This tool requires Chrome 113+ and approximately 4-6 GB of RAM, with an initial download of about 2.1 GB for the…
TOOL · CL_66475 · Jun 2 · 09:33

Browser-based real-time voice changer released as MVP

A developer has created a real-time voice changer that operates entirely within a web browser. The tool uses WebAssembly, ONNX Runtime, and WebGPU. The creator has released it as a minimum vi…
TOOL · CL_53185 · May 26 · 21:41

Chrome proposes on-device Embedding API for web apps

Google Chrome is proposing a new Embedding API that would allow web developers to generate vector embeddings for content directly on the user's device. This API aims to leverage on-device AI infrastructure to enable fea…
TOOL · CL_53214 · May 26 · 21:34

Ollama v0.30.0, Qwen3.5 35B, and 1-bit AI on WebGPU

Ollama's v0.30.0 pre-release is set to improve llama.cpp interoperability. Separately, a new Qwen3.5 35B model is available in GGUF and GPTQ formats, optimized for local inference on consumer GPUs. Additionally, PrismML…
RESEARCH · CL_53000 · May 26 · 18:53

PrismML releases compact 1-bit/ternary image diffusion models

PrismML has released new text-to-image diffusion transformer models, Binary and Ternary Bonsai Image 4B. These models are significantly smaller than previous versions, at approximately 3 GB, and can run local…
TOOL · CL_52547 · May 26 · 14:25

wllama brings GGUF LLMs to browser via WebAssembly and WebGPU

A new tool called wllama lets users run GGUF large language models directly within their web browser. Using WebAssembly and WebGPU, wllama bypasses typical browser limitations like the 4 GB memory constraint a…
TOOL · CL_41024 · May 20 · 16:21

WebLLM brings AI models to browsers via WebGPU

WebLLM is a new project that enables large language models to run directly within web browsers using WebGPU for hardware acceleration. This client-side execution enhances user privacy and reduces server costs by keeping…
TOOL · CL_40924 · May 19 · 12:09

Browser-native GPU architecture enables MRI digital twins

Researchers have developed a new browser-native GPU architecture for creating interactive MRI digital twins. This decentralized approach bypasses traditional server-side rendering, executing complex 3D simulations direc…
TOOL · CL_25390 · May 10 · 19:36

ProxyFace adds local, emotional avatars to AI chats

ProxyFace is an open-source project that adds a local, expressive avatar to AI interactions. It utilizes a small, on-device emotion model and eye-tracking to make the avatar react to AI output and the user's gaze. The p…
TOOL · CL_15669 · May 5 · 04:00

Researchers create high-fidelity mobile avatars using local blendshapes

Researchers have developed a new method for creating high-fidelity human avatars that can operate on mobile devices. This approach utilizes local linear blendshapes within specific body parts to capture complex appearan…
RESEARCH · CL_15141 · May 5 · 00:02

Run LLMs locally with LFM 2 and Transformers.js, using WebGPU

Thomas Bley has released new slides detailing how to run large language models (LLMs) locally using LFM 2. The presentation also covers using Transformers.js with WebGPU for privacy filters, function calling, and embedd…
RESEARCH · CL_13698 · May 3 · 12:41

Born ML framework achieves pure Go GPU acceleration with v0.8.0 update

Born ML has released version 0.8.0, transitioning its GPU backend from Rust's wgpu-native to a pure Go implementation called gogpu/wgpu. This change eliminates the need for external runtime dependencies like .dll or .so…
RESEARCH · CL_08477 · Apr 29 · 05:05

Nvidia's Nemotron 3 Nano Omni and Llama.cpp enable local LLM execution

Thomas Bley has released new presentation slides detailing how to run large language models locally. The slides cover Nvidia's Nemotron 3 Nano Omni, built-in tools for Llama.cpp, and the use of Transformers.js with WebG…