Tech companies and libraries are rapidly digitizing millions of physical books for AI training and digital preservation. Google, Anthropic, and institutions such as Harvard and Ohio University are scaling up their scanning operations, and some firms destroy the physical copies after scanning them to obtain training data for large language models.
· Ohio University Libraries joins Google Books Library Project for digitization
· Harvard and Google releasing 1 million public domain books as AI training data
· Anthropic scanned and destroyed millions of books for AI model training
· Google Books team moved 90,000 books across continent for scanning infrastructure
· AI startups building secret scanning operations to acquire physical book collections at scale
drawn from Ohio University, Transparency Coalition, The Washington Post, Harici · updated 116d ago