PulseAugur / Brief

Brief

last 24h
[1/1] 224 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

  1. Rigel: Reverse-Engineering the Metal 4.1 Tensor Compute Path on the Apple M4 Max GPU

    Researchers have reverse-engineered the Metal 4.1 tensor compute path on Apple's M4 Max GPU and report that the fp8 matmul2d operation is emulated rather than hardware-accelerated: it runs on the GPU's shader cores, accumulates in at least fp32 precision, and uses neither a dedicated matrix datapath nor the Apple Neural Engine. The findings, detailed in a paper titled "Rigel," come from empirical characterization and microbenchmarking. The authors also developed a fused kernel that outperforms the decomposed path by up to 12.9%.

    IMPACT Shows that fp8 matmul2d is emulated on the M4 Max GPU rather than run on dedicated matrix hardware, which tempers performance expectations for fp8 AI workloads on Apple hardware.
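
To make "emulated fp8 matmul with wide accumulation" concrete, here is a minimal Python sketch of the general idea: each fp8 operand is decoded to a wider float, and products are summed in the wider type. It is an illustration only, not the Metal API and not the kernel from the paper. The E4M3 encoding (1 sign, 4 exponent, 3 mantissa bits, bias 7) is an assumption, since the summary does not say which fp8 format is used, and the names `decode_e4m3` and `matmul_fp8_emulated` are hypothetical.

```python
def decode_e4m3(byte: int) -> float:
    """Decode one fp8 byte to a Python float.

    Assumption: OCP-style E4M3 layout (1 sign, 4 exponent, 3 mantissa bits,
    bias 7, no infinities, NaN when exponent and mantissa are all ones).
    """
    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> 3) & 0xF
    man = byte & 0x7
    if exp == 0xF and man == 0x7:
        return float("nan")
    if exp == 0:
        # Subnormal: no implicit leading 1, exponent fixed at 1 - bias.
        return sign * (man / 8.0) * 2.0 ** (1 - 7)
    return sign * (1.0 + man / 8.0) * 2.0 ** (exp - 7)


def matmul_fp8_emulated(a, b):
    """Multiply an M x K by a K x N matrix of fp8 bytes (lists of lists).

    Operands are decoded one by one and accumulated in a Python float
    (double precision), standing in for an "at least fp32" accumulator.
    """
    m, k, n = len(a), len(b), len(b[0])
    out = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            acc = 0.0  # wide accumulator, never rounded back to fp8
            for p in range(k):
                acc += decode_e4m3(a[i][p]) * decode_e4m3(b[p][j])
            out[i][j] = acc
    return out


# Usage: [1.0, 2.0] times the column [1.5, -1.0] in E4M3 encoding.
result = matmul_fp8_emulated([[0x38, 0x40]], [[0x3C], [0xB8]])
```

The decode step is the extra per-element work that emulation adds on general-purpose cores. A dedicated matrix datapath would instead consume the fp8 operands directly.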