“It just seems dishonest. It seems like theft”: The Atlantic magazine uncovers that the tech giants are illegally scraping millions of songs to train their AI models and not paying grassroots musicians a cent
It reveals the four huge tranches of songs they’re stealing from
The Atlantic magazine has published an article that pithily sums up where we are regarding AI music and the attempts to regulate it so that musicians are fairly compensated. Musicians are pretty much at the mercy of big tech, concludes writer Alex Reisner.
Reisner’s investigation has found that there are four giant datasets being shared within the tech development community. One contains 12 million tracks, another 9 million, and there are two smaller ones that encompass 100,000 each. They contain every major artist in every genre you could possibly think of. He estimates that it would take 91 years to listen to the largest of those datasets.
And the AI developers are using them quite blatantly. “Three are distributed as a list of links to songs on YouTube or Spotify,” writes Reisner. “AI developers download the actual audio using tools that automate the job, some of which allow developers to bypass logins, advertisements, and mechanisms that might earn money or subscribers for creators. Such tools violate the terms of service of these platforms.”
The fourth dataset is the Free Music Archive, which was started in 2009 by the New Jersey radio station WFMU to provide a service for listeners. Primarily it’s a way for artists to share music with fans, but anyone wishing to use content from FMA in, say, a for-profit video has to pay.
Needless to say, the FMA has been scraped by big tech looking to train its AI models. When its head Hessel Van Oorschot sent Google a letter alleging this, the tech giant’s response was, in his words, “a big middle finger.” Google’s letter argued that “we believe everyone benefits from a vibrant content ecosystem.”
The Amsterdam-based Van Oorschot could do little against such blatant bullying. “For me to fly to America and start a lawsuit with Google made no sense,” he said.
Reisner mentioned one artist who was hitting back. Benn Johnson is a professional musician who noticed in 2025 that tech companies were “scraping my music without my consent, then generating shittier music with it that is inadvertently associated with my name, and then attempting to resell that in the same economy in which I make money.” He developed a tool to “poison” generative AI by adding noise to audio files that humans can’t hear but that confuses the bots.
But not every musician is as tech-savvy as Johnson. Reisner spoke to another, Derek Clegg, who has 250 songs in the FMA dataset. He said he merely wished to be able to opt out of being scraped. “It just seems dishonest. It seems like theft,” he said. “There’s going to have to be a reckoning.” Let’s hope so...

Beth Simpson is a freelance music expert whose work has appeared in Classic Rock, Classic Pop, Guitarist and Total Guitar magazine. She is the author of 'Freedom Through Football: Inside Britain's Most Intrepid Sports Club' and her second book 'An American Cricket Odyssey' was published in 2025.