PulseAugur

AI firms scan rare books for training data, then destroy them

AI companies are reportedly buying physical copies of older books from secondhand bookstores, particularly titles that have not yet been digitized. The books are scanned to train AI models, and the physical copies are often destroyed afterward. The practice raises questions about data sourcing and the preservation of physical media.

IMPACT This practice highlights novel data acquisition methods for AI training, potentially impacting the value of physical media and raising ethical considerations.

RANK_REASON The item discusses a practice by AI companies rather than a direct release or research finding.

Read on Mastodon — fosstodon.org →

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. Mastodon · fosstodon.org · TIER_1 · English (EN)

    So apparently #AI companies are buying old books at secondhand bookstores which aren't yet available in digital form, scan them in to feed their models, and then shred the books afterwards. #books #bookstodon https://www.tagesschau.de/kultur/ki-firmen-antiquarische-buecher-1…