Microsoft trained its MAI models on unlicensed web data despite promising "enterprise grade, clean and commercially licensed data"

Microsoft's MAI models were trained on unlicensed web data

By the PulseAugur editorial team · [1 source] · 2026-06-05 12:10

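Microsoft reportedly trained its MAI models on unlicensed web data, contradicting its public claim that it uses only "enterprise grade, clean and commercially licensed data." The company's approach resembles that of other AI labs: it relies on the fair use doctrine and leaves the choice over data collection to website owners.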

Impact: Raises questions about data provenance and licensing practices in AI model training.

Ranking rationale: The article discusses a company's practices and statements rather than a new release or event.

AI-generated summary · Google Gemini · based on 1 source.

Sources [1]

The Decoder (Tier 1, English) · Matthias Bastian · 2026-06-05 12:10

Microsoft trained its MAI models on unlicensed web data despite promising "enterprise grade, clean and commercially licensed data"

Microsoft sells its LLM training approach as different …