ARC-AGI
PulseAugur coverage of ARC-AGI: every cluster mentioning ARC-AGI across labs, papers, and developer communities, ranked by signal.
-
AI agents lose accuracy when rewriting their own memory, study finds
A new paper from UIUC researchers demonstrates that AI agents experience a significant decrease in accuracy when their memory is consolidated or rewritten by the LLM itself. The study, which tested GPT-5.4 across variou…
-
Neuro-inspired phase encoding boosts Vision Transformer learning efficiency
Researchers have introduced Kuramoto Oscillatory Phase Encoding (KoPE), a novel neuro-inspired mechanism designed to enhance the learning efficiency of Vision Transformers. By incorporating an evolving phase state along…
-
ARC-AGI solver success predicted by structural grid descriptors
Researchers have developed a method using structural grid descriptors to predict the success of symbolic solvers on ARC-AGI tasks. Across numerous runs and distinct solver architectures, these descriptors, measured at 5…
-
New framework measures information flow in AI spatial reasoning
Researchers have introduced a new framework called "interaction locality" to measure how information flows within AI models during spatial reasoning tasks. This framework analyzes whether computations remain localized o…
-
New API uses LLMs for universal text-based optimization
Researchers have developed "optimize_anything," a universal API that uses LLMs to solve a wide range of optimization problems by treating them as text-based improvements. This system demonstrates state-of-the-art result…
-
GIM benchmark evaluates LLMs on integrated cognitive tasks
Researchers have introduced the Grounded Integration Measure (GIM), a new benchmark designed to evaluate large language models by integrating multiple cognitive domains. GIM comprises 820 original problems that require …
-
Poetiq's AI harness beats Opus 4.7 using Gemini 3 Flash
The AI startup Poetiq has developed a self-optimizing harness that achieves new state-of-the-art performance on coding and ARC-AGI benchmarks. This harness, utilizing Google's Gemini 3 Flash model, has surpassed Anthrop…
-
VCBench benchmark tests LLMs for venture capital founder success prediction
Researchers have introduced VCBench, a novel benchmark designed to evaluate the capabilities of large language models in predicting founder success within the venture capital industry. This benchmark includes a dataset …
-
Researcher tackles ARC challenge, seeking non-LLM AGI research paths
A researcher is tackling the ARC challenge, a test for artificial general intelligence, with a focus on AGI3. The challenge offers a research direction distinct from large language models. The ARC Prize aims to a…