PulseAugur

Pfadfinder und Pfadfinderinnen Österreichs

PulseAugur coverage of Pfadfinder und Pfadfinderinnen Österreichs: every cluster that mentions the organization across labs, papers, and developer communities, ranked by signal.

Total · 30d: 0 (0 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 0 (0 over 90d)
TIER MIX · 90D

No coverage in the last 90 days.

RECENT · PAGE 1/1 · 18 TOTAL
  1. TOOL · CL_22524 ·

    AI model optimizes HAPS base station positioning in windy maritime networks

    Researchers have developed a new framework using deep reinforcement learning to dynamically position High-Altitude Platform Stations (HAPS) in maritime networks. This approach specifically addresses challenges posed by …

  2. TOOL · CL_20509 ·

    HELM system optimizes GPU HBM for generative recommender latency

    Researchers have developed HELM, a system designed to optimize the performance of generative recommender models by dynamically managing High Bandwidth Memory (HBM) allocation between embedding (EMB) and KV caches. Exist…

  3. TOOL · CL_20435 ·

    Counter-Dyna cuts HVAC control training time to 5 weeks

    Researchers have developed Counter-Dyna, a novel method for data-efficient reinforcement learning in HVAC control systems. This approach utilizes counterfactual surrogate models that leverage state-space invariances, si…

  4. TOOL · CL_19903 ·

    vLLM V1 engine rewrite achieves parity with V0 after backend fixes

    A post published via Hugging Face detailed the process of aligning the new vLLM V1 engine with the V0 reference, focusing on ensuring backend parity before addressing Reinforcement Learning (RL) objective changes. They identified a…

  5. TOOL · CL_18782 ·

    New OGPO algorithm boosts sample efficiency for generative control policies in robotics

    Researchers have introduced Off-policy Generative Policy Optimization (OGPO), a novel algorithm designed for sample-efficient finetuning of generative control policies in robotics. OGPO leverages off-policy critic netwo…

  6. TOOL · CL_18538 ·

    PERSA pipeline uses RLHF to align LLM feedback with instructor style

    Researchers have developed PERSA, a novel approach using Reinforcement Learning from Human Feedback (RLHF) to adapt large language models for generating personalized educational feedback. This method specifically target…

  7. TOOL · CL_16702 ·

    Author demystifies reinforcement learning math with new blog series

    A new blog series aims to demystify the mathematics behind reinforcement learning, starting with foundational concepts and progressing towards advanced algorithms like Proximal Policy Optimization (PPO). The initial pos…

  8. RESEARCH · CL_16149 ·

    AI agents leverage reinforcement learning to enhance software test case generation and code coverage

    Researchers have developed two novel approaches for automated test case generation using large language models (LLMs) and reinforcement learning. The first method, PPO-LLM, employs Proximal Policy Optimization (PPO) to …

  9. RESEARCH · CL_15452 ·

    TUR-DPO enhances LLM alignment by incorporating topology and uncertainty into preference optimization

    Researchers have introduced TUR-DPO, a novel method for aligning large language models with human preferences. Unlike standard Direct Preference Optimization (DPO), TUR-DPO incorporates topology and uncertainty awarenes…

  10. TOOL · CL_16233 ·

    New research shows high entropy leads to symmetry equivariant policies in Dec-POMDPs

    A new paper explores how high entropy regularization can lead to symmetry-equivariant policies in Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs). The research demonstrates that sufficiently hi…

  11. RESEARCH · CL_11904 ·

    New C++ engine HASE achieves 33M steps/sec for multi-agent RL training

    Researchers have developed a new C++ engine called Hide-And-Seek-Engine (HASE) designed to significantly improve the efficiency of training reinforcement learning agents in decentralized, partially observable environmen…

  12. RESEARCH · CL_08685 ·

    xLSTM networks enhance deep reinforcement learning for automated stock trading

    Researchers have developed a new automated stock trading system utilizing Extended Long Short-Term Memory (xLSTM) networks combined with deep reinforcement learning (DRL). This approach aims to overcome the limitations …

  13. RESEARCH · CL_06928 ·

    AI framework optimizes land use for ecosystem services in Lake Malawi Basin

    Researchers have developed a deep reinforcement learning framework to optimize land-use allocation in the Lake Malawi Basin, aiming to enhance ecosystem service value. The system uses a Proximal Policy Optimization agen…

  14. RESEARCH · CL_06752 ·

    Researchers develop new methods to debias and improve reward models for LLMs

    Researchers have developed new methods to improve the reliability and interpretability of reward models (RMs) used in aligning large language models (LLMs). One approach introduces a causally motivated intervention tech…

  15. RESEARCH · CL_06317 ·

    GradMAP AI learns decentralized grid-edge device control with faster training

    Researchers have developed GradMAP, a novel gradient-based multi-agent proximal learning method designed for coordinating decentralized grid-edge devices. This approach trains independent neural network policies for eac…

  16. RESEARCH · CL_05416 ·

    DVPO and EVPO advance LLM post-training with novel RL optimization techniques

    Researchers have introduced DVPO, a new reinforcement learning framework designed for improving Large Language Model (LLM) post-training, particularly when dealing with noisy or incomplete supervision signals. DVPO util…

  17. RESEARCH · CL_01553 ·

    OpenAI releases Proximal Policy Optimization for simpler, effective reinforcement learning

    OpenAI has released Proximal Policy Optimization (PPO), a new reinforcement learning algorithm that offers comparable or superior performance to existing methods while being simpler to implement and tune. PPO strikes a …

  18. SIGNIFICANT · CL_02559 ·

    OpenAI Five AI defeats Dota 2 world champions in historic esports match

    OpenAI Five has achieved a significant milestone by defeating the world champions of Dota 2 in two consecutive games at the OpenAI Five Finals. This marks the first time an AI has publicly triumphed over professional es…