English(EN) From RLHF to RLHB: The Case for Learning from Human Behavior - with Jeffrey Wang and Joe Reeve of Amplitude

Amplitude提倡通过用户行为学习而非明确反馈来开发AI

作者 PulseAugur 编辑部 · [1 个来源] · 2023-06-08 20:00

以产品分析闻名的Amplitude公司正大力投入将AI整合到其产品中。他们正在探索超越传统基于人类反馈的强化学习（RLHF）的方法，RLHF依赖于明确的、通常成本高昂且可能存在偏见的的用户输入。Amplitude提倡从产品内的真实用户行为中学习，并引用了GitHub Copilot和Midjourney等示例，这些示例通过用户交互自然地收集隐式反馈。这种方法旨在为训练AI模型提供更真实、更具成本效益的数据，可能使AI分析比AI本身更重要。 AI

排序理由该条目讨论了一种新的AI训练数据收集方法，从RLHF转向从真实用户行为中学习，这是一个AI开发中的研究导向型主题。

在 Latent Space Podcast 阅读 →

AI 生成摘要 · Google Gemini · 来自 1 个来源。我们如何撰写摘要 →

报道来源 [1]

Latent Space Podcast TIER_1 English(EN) · Jeffrey Wang and Joe Reeve · 2023-06-08 20:00

From RLHF to RLHB: The Case for Learning from Human Behavior - with Jeffrey Wang and Joe Reeve of Amplitude

Welcome to the almost 3k latent space explorers that joined us last month! We’re holding our first SF listener meetup with <a href="https://changelog.com/practicalai" target="_blank">Practical AI</a> next Monday;…

报道来源 [1]

From RLHF to RLHB: The Case for Learning from Human Behavior - with Jeffrey Wang and Joe Reeve of Amplitude

相关话题