
Reverse-engineering LLM training costs; finetuning unlocks latent copyright recall

By PulseAugur Editorial · [3 sources] · 2026-05-02 01:42

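A recent preprint shows that finetuning a large language model (LLM) on a single author's works can cause the model to reproduce, verbatim, copyrighted material that was not part of the finetuning data. The behavior appears to stem from latent information in the pretraining data rather than from the finetuning dataset itself. The study also reports that finetuning on synthetic text does not produce comparable verbatim output, a result that could shift copyright liability toward model developers.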

Impact: By highlighting latent data recall in LLMs, this research could redefine the copyright liability of AI labs.

Ranking rationale: This cluster discusses the findings of a new preprint on LLM behavior and its copyright implications.


AI-generated summary · Google Gemini · from 3 sources.

Sources [3]

Mastodon — fosstodon.org TIER_1 English(EN) · [email protected] · 2026-05-02 01:42

Dwarkesh's two-hour blackboard discussion with Reiner Pope infers how frontier LLMs are actually trained and served from API price lists and public benchmarks alone

Dwarkesh's two-hour blackboard discussion with Reiner Pope deduces how frontier LLMs are actually trained and served, using just API price lists, public benchmark numbers, and memory-bandwidth arithmetic. The thesis: you can reverse-engineer frontier architecture from the per-tok…
Mastodon — fosstodon.org TIER_1 English(EN) · [email protected] · 2026-05-02 01:42

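A new preprint finds that finetuning frontier LLMs on a single author's novels can unlock verbatim output from dozens of other copyrighted books the model never saw at finetune time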

A new preprint finds that finetuning frontier LLMs on one author's novels unlocks verbatim output from dozens of other copyrighted books the model never saw at finetune time. The real finding is the control: synthetic-text finetuning produces near-zero extraction. So the copies a…
Mastodon — mastodon.social TIER_1 English(EN) · [email protected] · 2026-05-02 01:58

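Dwarkesh's two-hour blackboard discussion with Reiner Pope infers how frontier LLMs are actually trained and served from API price lists and public benchmarks alone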

Dwarkesh's two-hour blackboard discussion with Reiner Pope deduces how frontier LLMs are actually trained and served, using just API price lists, public benchmark numbers, and memory-bandwidth arithmetic. The thesis: you can reverse-engineer frontier architecture from the per-tok…
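
As a rough illustration of the kind of memory-bandwidth arithmetic the snippet mentions, the sketch below estimates an upper bound on decode speed and a per-token serving cost. It is not taken from the discussion itself. It assumes decoding is limited by reading the active weights from memory once per step, and it ignores KV-cache traffic and compute limits. The function names and every number in the usage lines (parameter count, bytes per parameter, bandwidth, hourly price, batch size) are hypothetical placeholders, not figures from the sources.

```python
def decode_tokens_per_second(active_params: float,
                             bytes_per_param: float,
                             bandwidth_bytes_per_s: float) -> float:
    """Upper bound on decode steps per second for one sequence.

    Assumption: each decode step must read every active weight once,
    so step time is at least (weight bytes) / (memory bandwidth).
    """
    bytes_per_step = active_params * bytes_per_param
    return bandwidth_bytes_per_s / bytes_per_step


def cost_per_million_tokens(dollars_per_hour: float,
                            tokens_per_second: float,
                            batch_size: int) -> float:
    """Serving cost per million output tokens.

    Assumption: one weight read is shared by the whole batch, so
    aggregate throughput scales linearly with batch size.
    """
    tokens_per_hour = tokens_per_second * batch_size * 3600
    return dollars_per_hour / tokens_per_hour * 1e6


# Hypothetical inputs: 100B active parameters at 1 byte each,
# 3 TB/s of memory bandwidth, $2 per accelerator-hour, batch of 64.
speed = decode_tokens_per_second(100e9, 1.0, 3e12)
cost = cost_per_million_tokens(2.0, speed, 64)
print(f"{speed:.1f} tokens/s per sequence, ${cost:.3f} per million tokens")
```

Running the same arithmetic in reverse, from a published per-token price back to a plausible active-parameter count, needs further assumptions about batch size, hardware price, and margin that this sketch does not attempt to supply.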

