Microsoft's MAI-Thinking-1 model is under scrutiny over its training data. Despite claims that it was trained only on clean, commercially licensed data, the model appears to have been trained on content from Common Crawl and the public web. This calls into question the integrity of its data sourcing and the promise to use licensed datasets exclusively.
IMPACT: Raises questions about data provenance and ethical AI training practices.
Source: Mastodon (fosstodon.org)
AI-generated summary · Google Gemini · from 1 source.