PulseAugur

Grand Portage National Monument

PulseAugur coverage of Grand Portage National Monument — every cluster mentioning Grand Portage National Monument across labs, papers, and developer communities, ranked by signal.

Total · 30d: 0 (0 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 0 (0 over 90d)
TIER MIX · 90D

No coverage in the last 90 days.

RELATIONSHIPS
TIMELINE
  1. 2026-05-08 research_milestone A paper details a fix for gradient starvation in GRPO for binary rewards, significantly improving performance on GSM8K.
SENTIMENT · 30D

3 days with sentiment data

RECENT · PAGE 1/2 · 32 TOTAL
  1. RESEARCH · CL_29077 ·

    Open-source AntAngelMed model offers efficient medical AI with 103B parameters

    Researchers have introduced AntAngelMed, a 103 billion parameter open-source medical language model. It utilizes a Mixture-of-Experts (MoE) architecture, activating only 6.1 billion parameters per query for enhanced eff…
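    The headline figure — 6.1B of 103B parameters active per query — follows from top-k expert routing: only the selected experts' weights participate in a forward pass. A minimal sketch of that routing pattern (illustrative only; the layer sizes, gating function, and `moe_forward` name here are assumptions, not AntAngelMed's actual architecture):

    ```python
    import numpy as np

    def moe_forward(x, gate_w, experts, k=2):
        """Toy Mixture-of-Experts layer: route the input to its top-k experts.

        Only the chosen experts run, which is why a large MoE model can
        activate a small fraction of its total parameters per query.
        """
        logits = x @ gate_w                      # one gating score per expert
        top_k = np.argsort(logits)[-k:]          # indices of the k best experts
        weights = np.exp(logits[top_k])
        weights /= weights.sum()                 # softmax over selected experts
        # Combine only the selected experts' outputs; the rest stay idle.
        return sum(w * experts[i](x) for w, i in zip(weights, top_k))

    rng = np.random.default_rng(0)
    d, n_experts = 8, 16
    experts = [lambda x, W=rng.normal(size=(d, d)): x @ W for _ in range(n_experts)]
    gate_w = rng.normal(size=(d, n_experts))
    y = moe_forward(rng.normal(size=d), gate_w, experts, k=2)
    print(y.shape)  # (8,) -- only 2 of 16 experts ran
    ```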

  2. RESEARCH · CL_27590 ·

    New methods enhance LLM reasoning for long-context and multilingual tasks

    Researchers have developed new methods for improving large language model reasoning capabilities, particularly for long-context and multilingual tasks. One approach, OGLS-SD, uses outcome-guided logit steering to calibr…

  3. TOOL · CL_25615 ·

    New RL algorithm fix boosts GSM8K accuracy by 45 points

    Researchers have identified a critical issue in the Group Relative Policy Optimization (GRPO) algorithm when applied to binary rewards, leading to "gradient starvation." This occurs when all responses in a group are eit…
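    The failure mode is visible directly in GRPO's group-relative advantage. With binary rewards, any group whose responses are all correct (or all wrong) has zero reward variance, so every advantage — and hence the policy gradient for that prompt — is zero. A minimal sketch of the standard GRPO normalization (the paper's fix itself is not reproduced here, since the summary is truncated):

    ```python
    import numpy as np

    def grpo_advantages(rewards, eps=1e-8):
        """Group-relative advantages as in GRPO: normalize each response's
        reward against its group's mean and standard deviation."""
        r = np.asarray(rewards, dtype=float)
        return (r - r.mean()) / (r.std() + eps)

    # Mixed group: informative advantages, nonzero gradient signal.
    mixed = grpo_advantages([1, 0, 1, 0])    # roughly [+1, -1, +1, -1]

    # Degenerate group: all-correct binary rewards. Every advantage is 0,
    # so this prompt contributes no gradient -- "gradient starvation."
    starved = grpo_advantages([1, 1, 1, 1])
    print(starved)  # [0. 0. 0. 0.]
    ```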

  4. TOOL · CL_21988 ·

    New Pair-GRPO algorithms enhance LLM alignment stability and generalization

    Researchers have introduced the Pair-GRPO family, a novel theoretical framework designed to enhance the stability and generality of reinforcement learning for aligning large language models. This family includes two var…

  5. TOOL · CL_21953 ·

    New S-trace method improves RLVR efficiency and credit assignment

    Researchers have introduced Selective Eligibility Traces (S-trace), a novel method designed to enhance the reasoning capabilities of large language models within the Reinforcement Learning with Verifiable Rewards (RLVR)…
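    The truncated summary does not describe S-trace's selectivity criterion, so the sketch below shows only the classic accumulating eligibility-trace mechanics such a method builds on: past steps' score functions decay geometrically, so a single terminal reward assigns more credit to recent steps than to early ones. Function name and constants are illustrative assumptions:

    ```python
    import numpy as np

    def trace_update(e, grad_logp, gamma=0.99, lam=0.9):
        """Classic accumulating eligibility trace: decay accumulated credit
        and add the current step's score function (TD(lambda)-style)."""
        return gamma * lam * e + grad_logp

    # Credit assignment over a 3-step trajectory with one terminal reward.
    grads = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
    e = np.zeros(2)
    for g in grads:
        e = trace_update(e, g)

    reward = 1.0
    update = reward * e   # earlier steps receive geometrically decayed credit
    print(update)
    ```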

  6. TOOL · CL_22388 ·

    MotionGRPO enhances egocentric motion recovery with reinforcement learning

    Researchers have introduced MotionGRPO, a new framework designed to improve the recovery of full-body 3D human motion from head-mounted device signals. This method addresses limitations in existing diffusion-based techn…

  7. TOOL · CL_22082 ·

    New theory explains RLVR optimization dynamics and step-size thresholds

    Researchers have developed a theoretical framework for Reinforcement Learning with Verifiable Rewards (RLVR), a technique used to fine-tune large language models with binary feedback. The study introduces a 'Gradient Ga…

  8. TOOL · CL_20388 ·

    New Balanced Aggregation method improves GRPO training for LLMs

    Researchers have identified and proposed a solution for aggregation bias in GRPO-style training, a method used to enhance reasoning and code generation in large language models. The study reveals that standard GRPO's ag…
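    One well-known aggregation bias in GRPO-style losses comes from averaging per-token losses within each sequence before averaging across the group: every sequence gets equal weight, so each token of a short response counts more than each token of a long one. Whether this length bias is the exact bias the paper targets is unclear from the truncated summary; the sketch below only makes the weighting difference concrete:

    ```python
    import numpy as np

    # Two responses in one group: a short one (4 tokens) and a long one
    # (16 tokens), each token carrying a loss of 1.0 for clarity.
    losses = [np.ones(4), np.ones(16)]

    # Sequence-level aggregation (standard GRPO style): mean over tokens
    # within each sequence, then mean over sequences.
    seq_mean = np.mean([l.mean() for l in losses])

    # Token-level aggregation: one global mean over all tokens.
    tok_mean = np.concatenate(losses).mean()

    # Per-token weights under the sequence-level scheme expose the bias:
    short_w = 1 / (2 * 4)     # 0.125 per short-response token
    long_w = 1 / (2 * 16)     # 0.03125 per long-response token
    print(short_w / long_w)   # 4.0 -- short responses dominate per token
    ```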

  9. TOOL · CL_19903 ·

    vLLM V1 engine rewrite achieves parity with V0 after backend fixes

    Hugging Face's vLLM team detailed the process of aligning their new V1 engine with the V0 reference, focusing on ensuring backend parity before addressing Reinforcement Learning (RL) objective changes. They identified a…

  10. RESEARCH · CL_20273 ·

    OpenSearch-VL offers open recipe for advanced multimodal search agents

    Researchers have developed OpenSearch-VL, a novel, fully open-source recipe for training advanced multimodal deep search agents. This approach utilizes a curated pipeline for high-quality training data, a diverse tool e…

  11. TOOL · CL_18543 ·

    Faithful-Agent framework improves GUI agents' grounding in screen evidence

    Researchers have developed a new framework called Faithful-Agent to improve the reliability of vision-language model-based GUI agents. This framework addresses the issue of agents acting unfaithfully by prioritizing gro…

  12. TOOL · CL_18884 ·

    MICA framework enhances LLM emotional support dialogues with novel RL approach

    Researchers have introduced MICA, a novel reinforcement learning framework designed to improve the performance of large language models in multi-turn emotional support dialogues. This critic-free approach addresses chal…

  13. TOOL · CL_18768 ·

    Pass-rate rewards fail to boost AI code generation, study finds

    A new research paper explores the effectiveness of using pass-rate rewards in reinforcement learning for code generation tasks. The study found that while pass-rate rewards can alleviate the issue of sparse rewards, the…

  14. TOOL · CL_18556 ·

    New framework grounds LLM reasoning in causal models for fact verification

    Researchers have developed a new framework that grounds multi-hop reasoning in Large Language Models (LLMs) using Structural Causal Models (SCMs). This approach treats fact verification as a causal inference process, ai…

  15. TOOL · CL_15638 ·

    VAnim framework generates SVG animations using rendering-aware reinforcement learning

    Researchers have introduced VAnim, a novel framework designed to generate Scalable Vector Graphics (SVG) animations from text descriptions. This approach models animation as sparse state updates on an SVG DOM tree, sign…
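    One plausible reading of "sparse state updates on an SVG DOM tree" is that an animation is a sequence of keyframed attribute deltas, each touching a single node and attribute rather than regenerating the whole SVG per frame. A minimal sketch under that assumption (the update format and field names here are hypothetical, not VAnim's actual representation):

    ```python
    import xml.etree.ElementTree as ET

    svg = ET.fromstring(
        '<svg xmlns="http://www.w3.org/2000/svg">'
        '<circle id="ball" cx="10" cy="50" r="5" fill="red"/>'
        '</svg>'
    )

    # Each update is a sparse delta: (time, element id, attribute, value).
    updates = [
        (0.5, "ball", "cx", "40"),
        (1.0, "ball", "cx", "70"),
        (1.0, "ball", "fill", "blue"),
    ]

    for t, elem_id, attr, value in updates:
        for node in svg.iter():
            if node.get("id") == elem_id:
                node.set(attr, value)   # apply the delta in place

    ball = next(n for n in svg.iter() if n.get("id") == "ball")
    print(ball.get("cx"), ball.get("fill"))  # 70 blue
    ```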

  16. RESEARCH · CL_15514 ·

    New benchmark and models advance generalized moment retrieval in videos

    Researchers have introduced Generalized Moment Retrieval (GMR), a new framework for video analysis that moves beyond the assumption of a single matching moment per query. This approach aims to retrieve all relevant temp…

  17. RESEARCH · CL_12572 ·

    AI model finetuning mostly idempotent, DPO can amplify traits

    A guide explores advanced techniques for post-training large language models, focusing on Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO), and Group Relative Policy Optimization (GRPO). These methods …

  18. RESEARCH · CL_11403 ·

    New Kernelized Advantage Estimation improves LLM reasoning with nonparametric statistics

    Researchers have introduced Kernelized Advantage Estimation (KAE) to enhance the reasoning capabilities of large language models (LLMs) through reinforcement learning. KAE addresses limitations in existing methods like …

  19. RESEARCH · CL_10264 ·

    Meituan deploys Generative Bid Shading to optimize ad spending

    Researchers have developed Generative Bid Shading (GBS), a new approach for optimizing ad bidding in real-time advertising. GBS utilizes an autoregressive generative model to create shading ratios and a reward preferenc…

  20. RESEARCH · CL_09211 ·

    IBM releases Granite 4.1 LLMs with 512K context and Apache 2.0 license

    IBM has released the Granite 4.1 family of large language models, comprising 3B, 8B, and 30B parameter versions. These models were trained on approximately 15 trillion tokens through a five-stage pre-training process th…