
The Battle for the Black Box: Why the Music Industry is Fighting to Open AI's Training Data

The headline says it all: "UMG, Sony Challenge Suno’s AI Secrecy Bid." What began in 2024 as a landmark copyright lawsuit by the Recording Industry Association of America (RIAA) against AI music heavyweights Suno and Udio has escalated into an all-out street fight over transparency.

Universal Music Group (UMG) and Sony Music Entertainment are aggressively moving to peel back the curtain on exactly how these AI models were built. At the absolute center of this legal chess match is a single, heavily guarded number: The Model Training Figure.

What is the "Secrecy Bid" All About?

During the legal discovery process, Suno was forced to hand over data detailing the inner workings of its platform to the major labels. Armed with this data and audio-fingerprinting technology (via Audible Magic), UMG and Sony say they uncovered the true scale of the alleged copying.

The labels immediately petitioned the court to dramatically expand their lawsuit, seeking to add 61,026 specific copyrighted recordings to the case against Suno (and another 30,000+ against Udio), up from just a few hundred in the original complaint.

In response, Suno filed a motion to keep the total number of audio files used to train its AI model completely sealed from the public, claiming that revealing the sheer volume of data would cause it "severe commercial and competitive harm" by letting rivals benchmark against its tech.

The Labels’ Counter-Punch: No More Redactions

UMG and Sony aren't buying it. In their latest filings, the labels urged the federal court to reject Suno's secrecy bid, arguing:

  • The Public Right to Know: There is a "strong presumption of public access" regarding court pleadings.

  • The Scale of Copying: The exact size of the dataset is not a harmless trade secret; it speaks directly to the nature, extent, and willful magnitude of the copyright infringement.

  • The Irony: Suno has already publicly admitted in court that its model was trained on "tens of millions" of recordings, which "presumably included" the labels' property. The majors argue that keeping the precise number a secret now is legally meritless.

The Legal Landscape: A Tale of Two Strategies

The tech-vs-music landscape has split into two fascinating, opposing philosophies:

   ┌─────────────────────────────────────────────────────────┐
   │                  THE MUSIC INDUSTRY SPLIT               │
   └────────────────────────────┬────────────────────────────┘
                                │
         ┌──────────────────────┴──────────────────────┐
         ▼                                             ▼
┌─────────────────────────────────┐           ┌─────────────────────────────────┐
│     THE LITIGATORS (UMG/Sony)   │           │      THE PARTNERS (Warner)      │
├─────────────────────────────────┤           ├─────────────────────────────────┤
│ • Fight to the end in court     │           │ • Settled out of court (2025)   │
│ • Seeking up to $150k per work  │           │ • Chose early licensing deals   │
│ • Goal: Set a legal precedent   │           │ • Goal: Build revenue layers    │
└─────────────────────────────────┘           └─────────────────────────────────┘

While UMG and Sony are pushing for scorched-earth litigation that could result in billions of dollars in statutory damages, Warner Music Group quietly exited the lawsuit in late 2025. Warner chose instead to strike a commercial licensing partnership with Suno, betting that working with the technology early is more lucrative than fighting it in a multi-year court battle.
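The "billions" figure is easy to sanity-check. The short Python sketch below multiplies the 61,026 recordings the labels want to add to the Suno case by the statutory damages range in U.S. copyright law. It assumes the commonly cited $750 minimum per work and the $150,000 maximum per work for willful infringement, and it assumes every recording would count as a separate infringed work. The function name and constants are illustrative only; what a court would actually award, if anything, is unknown.

```python
# Back-of-the-envelope statutory damages range.
# Recording count comes from the article; the per-work amounts are
# assumptions based on the commonly cited U.S. statutory range.

SUNO_RECORDINGS = 61_026        # recordings the labels seek to add (from the article)
MIN_PER_WORK = 750              # USD, assumed statutory minimum per work
MAX_WILLFUL_PER_WORK = 150_000  # USD, assumed ceiling for willful infringement


def damages_range(works):
    """Return (low, high) exposure in USD for a given number of works."""
    return works * MIN_PER_WORK, works * MAX_WILLFUL_PER_WORK


low, high = damages_range(SUNO_RECORDINGS)
print(f"Low end:  ${low:,}")    # Low end:  $45,769,500
print(f"High end: ${high:,}")   # High end: $9,153,900,000
```

Even at the assumed minimum, the exposure is in the tens of millions of dollars. At the assumed maximum it passes $9 billion for Suno alone, which is why the number of works at issue matters so much to both sides.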

Why the "Training Figure" Matters to Every Creator

This fight is about much more than a corporate dispute between tech startup founders and billionaire record executives. The outcome of the Suno and Udio cases will establish the global default rules for generative AI.

The Core Debate: Suno and Udio maintain that scraping open audio surfaces (like YouTube, SoundCloud, and Bandcamp) to train an AI is legally protected under the Fair Use Doctrine, comparing an AI "learning" from music to a human musician listening to the radio for inspiration. The labels counter that automated, industrial-scale scraping and commercial cloning is straight-up theft.

If the courts force Suno and Udio to disclose the scale of their training data, it opens a massive door for independent artists, producers, and smaller international catalogs (who don't have the money for RIAA lawyers) to audit these AI companies and demand their own piece of the pie.

What's Next?

The clock is ticking. With fact discovery deadlines looming and a federal judge already vacating a similar secrecy order in the parallel Udio case, the tech industry's "black box" method of building AI is under existential threat.

If UMG and Sony win this round, the music industry will force a future where AI cannot hide behind trade secrecy. Every single track used to train a machine will require a receipt.
