ENTITY nvidia-smi

nvidia-smi

PulseAugur coverage of nvidia-smi: every cluster mentioning nvidia-smi across labs, papers, and developer communities, ranked by signal.

Total · 5 over 90d
Releases · 0 over 90d
Papers · 0 over 90d
Sentiment · 4 day(s) with sentiment data over 30d

RECENT · PAGE 1/1 · 5 TOTAL

TOOL · CL_120373 · Jul 1 · 15:38

DGX Spark GPU overheating solved by clock-locking with nvidia-smi

A developer has found a workaround for overheating on the DGX Spark GPU when running large language models such as Qwen2.5 through Ollama. The GPU, specifically the GB10, lacks user-accessible power and fan controls, …

TOOL · CL_106135 · Jun 20 · 01:36

KV cache memory problem plagues LLM serving, vLLM's PagedAttention offers solution

The KV cache is a critical component in LLM inference, storing past computations to avoid recomputing them for each new token. However, its memory footprint can become a significant bottleneck, especially in production …
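To make the memory concern concrete, here is a rough back-of-the-envelope sketch of KV cache size. The model shape used (32 layers, 8 KV heads, head dimension 128, 16-bit values) is an illustrative assumption and does not come from the article, and `kv_cache_bytes` is a hypothetical helper name.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, num_tokens, bytes_per_value=2):
    """Approximate KV cache size in bytes.

    Each token stores one key vector and one value vector (hence the factor 2)
    per layer and per KV head.
    """
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_value * num_tokens


# Assumed model shape, for illustration only.
per_token = kv_cache_bytes(32, 8, 128, num_tokens=1)
full_context = kv_cache_bytes(32, 8, 128, num_tokens=8192)

print(per_token // 1024, "KiB per token")            # 128 KiB per token
print(full_context // 2**30, "GiB for 8192 tokens")  # 1 GiB for 8192 tokens
```

Under these assumptions a single 8192-token sequence already needs about 1 GiB, and the figure grows linearly with the number of concurrent sequences, which is why serving systems pay close attention to how this memory is allocated.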
TOOL · CL_87068 · Jun 12 · 06:22

Local LLM Hardware Guide: VRAM, Quantization, and Performance

Running large language models (LLMs) locally, particularly those with 70 billion parameters, presents significant hardware challenges, primarily concerning VRAM capacity. While marketing often suggests minimal requireme…
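As a rough illustration of why VRAM dominates the discussion, the sketch below estimates the memory needed for model weights alone at different quantization levels. It assumes 1 GB = 10^9 bytes and ignores the KV cache, activations, and runtime overhead, so real requirements are higher; `weight_memory_gb` is a hypothetical helper name.

```python
def weight_memory_gb(params_billion, bits_per_weight):
    """Approximate weight storage in GB: parameters * bits / 8 bits per byte."""
    return params_billion * bits_per_weight / 8


for bits in (16, 8, 4):
    print(f"70B at {bits}-bit: ~{weight_memory_gb(70, bits):.0f} GB")
# 70B at 16-bit: ~140 GB
# 70B at 8-bit: ~70 GB
# 70B at 4-bit: ~35 GB
```

Even at 4-bit, the weights of a 70-billion-parameter model exceed the 24 GB of a single consumer card under this estimate, before any context is loaded.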
TOOL · CL_71693 · Jun 4 · 16:45

User doubles LLM inference speed by fixing PCIe slot bottleneck

A user building a multi-GPU setup for local LLM inference discovered a significant performance bottleneck caused by a misconfigured PCIe slot. One of the four RTX 3090 GPUs was incorrectly placed in a slot that only sup…
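One way to look for this kind of problem is to compare each GPU's current PCIe link width with the maximum it reports. The sketch below is an assumption about how such a check could be written, not the method the user followed. It relies on the `pcie.link.width.current` and `pcie.link.width.max` fields of `nvidia-smi --query-gpu`; the helper names are hypothetical, and the parser assumes every field is numeric.

```python
import subprocess

QUERY = "index,pcie.link.width.current,pcie.link.width.max"


def parse_link_widths(csv_text):
    """Parse 'index, current, max' rows into (index, current, max) integer tuples."""
    rows = []
    for line in csv_text.strip().splitlines():
        index, current, maximum = (int(field.strip()) for field in line.split(","))
        rows.append((index, current, maximum))
    return rows


def narrowed_links(rows):
    """Return the GPUs whose current link width is below the maximum they report."""
    return [row for row in rows if row[1] < row[2]]


def query_link_widths():
    """Run nvidia-smi (requires an NVIDIA driver) and return the parsed rows."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_link_widths(result.stdout)


# Made-up sample output for four GPUs, used here instead of a live query.
sample = "0, 16, 16\n1, 16, 16\n2, 4, 16\n3, 16, 16\n"
for index, current, maximum in narrowed_links(parse_link_widths(sample)):
    print(f"GPU {index}: running at x{current}, reports up to x{maximum}")
```

On a machine with NVIDIA GPUs, `narrowed_links(query_link_widths())` would run the same check against live data. A mismatch is only a hint: it still needs to be confirmed against the motherboard's slot layout.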
TOOL · CL_13691 · May 3 · 13:20

Utilyze offers open-source tool for deeper GPU performance insights beyond load

Utilyze is a new open-source tool designed to provide deeper insights into GPU performance beyond simple load percentages. It directly accesses GPU performance counters to measure the actual utilization and efficiency o…
