ENTITY OpenHands

PulseAugur coverage of OpenHands: every cluster mentioning OpenHands across labs, papers, and developer communities, ranked by signal.

Total · 30d: 14 (14 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 6 (6 over 90d)

TIER MIX · 90D

research 2
tool 11
commentary 1

SENTIMENT · 30D

7 days with sentiment data

RECENT · PAGE 1/1 · 14 TOTAL

TOOL · CL_113285 · Jun 27 · 08:16

ContextForge tool combats LLM context rot with compression and reordering

Context rot, a phenomenon where LLMs lose accuracy in long conversations, is now measurable and can be mitigated. A new open-source tool called ContextForge acts as an intermediary, scoring, compressing, reordering, and…
RESEARCH · CL_113002 · Jun 27 · 00:02

NVIDIA releases Open-SWE-Traces dataset for AI software engineering training

NVIDIA has released Open-SWE-Traces, a dataset designed to train AI agents for software engineering tasks. A new tutorial from MarkTechPost demonstrates how to process this dataset for supervised fine-tuning. The tutori…
SIGNIFICANT · CL_98378 · Jun 18 · 07:57

NVIDIA releases Nemotron 3 Ultra, a 550B parameter open-weights model

NVIDIA has released Nemotron 3 Ultra, a 550-billion-parameter open-weights model that sets a new benchmark for US-based releases. This hybrid Mamba-Transformer mixture-of-experts model features a 1M-token context window…
TOOL · CL_93462 · Jun 16 · 04:00

New framework reveals LLM code generation security flaws

A new framework called DualGauge has been developed to automatically benchmark the security and functionality of code generated by LLMs and coding agents. The accompanying DualGauge-Bench dataset includes 307 tasks with…
TOOL · CL_82560 · Jun 10 · 04:00

Paper defines 'agent harness' for AI coding assistants

A new paper published on arXiv proposes a formal definition for "agent harness," a term used in software engineering for systems that wrap language models to create coding agents. The authors trace the term's origins an…
RESEARCH · CL_79381 · Jun 9 · 03:41

Open-source AI coding agents reviewed for developer use

Several open-source AI coding agents are being reviewed for their capabilities in handling complex, multi-step tasks. These tools, including Tabby, Gemini CLI, OpenHands (formerly OpenDevin), and Plandex, offer self-hos…
COMMENTARY · CL_68023 · Jun 3 · 02:56

AI Agents Fail Due to Data Issues, Not Model Limitations

AI agents often fail in production because of problems with the data they process, not because of the underlying model. Common problems include undocumented data schemas, lack of normalization across different data source…
SIGNIFICANT · CL_67906 · Jun 3 · 00:47

MiniMax AI releases new model, drawing researcher praise

MiniMax AI has released a new model that has drawn positive attention from researchers. Early feedback suggests the model performs well, and it is expected to be used in the OpenHands environment.
TOOL · CL_86561 · Jun 2 · 00:00

AI agents can automate data curation, but need structured guidance

Researchers have developed Curation-Bench, a new benchmark designed to evaluate the ability of generalist coding agents to automate the data curation process for AI model training. Initial tests show that agents can per…
TOOL · CL_57753 · May 28 · 19:51

OpenHands launches as open-source platform for autonomous AI agents

OpenHands is an open-source platform for building and running autonomous AI agents. These agents can perform tasks independently, giving developers a new tool for building AI-driven workflows.…
TOOL · CL_53837 · May 27 · 04:00

New BeyondSWE Benchmark Tests Code Agents on Complex Software Engineering Tasks

Researchers have introduced BeyondSWE, a new benchmark designed to evaluate code agents on more complex software engineering tasks beyond single-repository bug fixing. The benchmark, comprising 500 instances from 246 Gi…
TOOL · CL_38251 · May 18 · 16:00

New benchmark measures coding agents' unauthorized actions

Researchers have introduced OverEager-Gen, a new benchmark designed to measure "overeager actions" in coding agents, in which an agent does more than its explicit instructions call for. The benchmark highlights a measur…
TOOL · CL_30876 · May 12 · 06:38

CrewAI vs. LangGraph: Choosing LLM Agent Frameworks for Collaboration or Control

Two popular LLM agent frameworks, CrewAI and LangGraph, offer distinct approaches to building complex AI applications. CrewAI excels at quickly assembling collaborative, role-based agents for business processes, making …
TOOL · CL_27537 · May 11 · 05:21

New framework enables embodied AI agents to self-improve without resets

Researchers have developed "Continual Harness," a novel framework for embodied AI agents that enables self-improvement without requiring environment resets. This system allows agents to adapt and refine their own strate…