MBW Views is a series of opinion pieces written by leading music industry professionals… with something to say.
The following MBW editorial comes from Ed Newton-Rex (pictured), CEO of ethical generative AI non-profit Fairly Trained and former VP of Audio at Stability AI.
Newton-Rex announced in November that he was leaving his position at Stability AI due to concerns over the company’s view that training generative AI models on copyrighted works constitutes “fair use”. Before joining Stability AI in 2022, Newton-Rex founded Jukedeck, the pioneering AI music creation platform, selling it to TikTok/ByteDance in 2019.
He then became product director of TikTok’s in-house AI lab, and later of the music app Voisey (sold to Snap in late 2020).
Billboard reported last month that Lyria, Google’s powerful music generation model, was trained on copyrighted music without the permission of rights holders. There is a danger that we will become accustomed to stories like this, given the lawsuits piling up in the world of text and image generation. But if true, this is a particularly interesting case, as it would mark a reversal in approach for a company that was once at the forefront of fair music generation models.
To understand how Google’s approach to training data in AI music generation has changed, we need to go back to 2016. AI music was in its infancy; I led one of the few startups in the industry. Out of nowhere, Google announced Magenta, a group working on AI creativity in a number of fields, including music. In a short time, they released interactive demos, created Ableton Live plugins, published articles, and open-sourced their code. They played an important role in bringing AI creativity into public consciousness.
And here’s the thing: like other AI music startups at the time, they respected creators’ rights. Their models were trained on data they had the right to use, whether classical music that had entered the public domain or music datasets they themselves had commissioned and made public. I knew several members of the Magenta team at the time, and it was clear that they took this approach out of a deep respect for the musicians themselves, drawn in part from their own considerable experience in music.
But something, somewhere in the company, seems to have changed that philosophy.
In November 2023, Google announced Lyria, their latest AI music generation effort. Rumors had been circulating for a while about this secret music model, which amazed everyone who heard it – and it did not disappoint. Lyric generation, vocal generation, high-quality instrumentals, style transfer – it had it all. 2023 had already seemed like a turning point for AI music, and Lyria seemed to confirm that this year would be remembered as the year everything changed.
One of the interesting things about Lyria’s announcement was how prominently its music industry partnerships featured. Artists from Charlie Puth to John Legend had licensed their voices, and YouTube’s Music AI Incubator, which counts Universal Music Group as a partner, participated in its development. The message was clear: Google was the company that acted ethically. Didn’t these partnerships prove it?
“In my opinion – and I imagine this is shared by many in the creative industries – this ‘ask forgiveness, not permission’ approach is not the best way to go about acquiring training data for generative AI. It does not give rights holders a level playing field.”
Enter the Billboard story. Four sources report that “Google trained its model on a wide range of music – including copyrighted major-label recordings – and then went and showed it to rights holders, rather than asking permission first.”
In my opinion – and I imagine this is shared by many in the creative industries – this “ask forgiveness, not permission” approach is not the best way to go about acquiring training data for generative AI. It does not give rights holders a level playing field for negotiations: “Negotiating with a company as massive as YouTube was made more difficult because it had already taken what it wanted,” according to sources familiar with the ensuing label discussions who spoke to Billboard.
It also raises the question of whether Google retrained the model after removing the data of rights holders who said no. The permission-agnostic approach they apparently took during initial training, as well as their documented argument that generative AI training is covered by fair use, doesn’t inspire much confidence.
These are not academic concerns. Although Lyria is not yet widely accessible, it is available in beta in Dream Track, a YouTube Shorts experiment, and demos of a much broader set of features have been shared visibly and intentionally across the web, no doubt reinforcing investor confidence that Google remains a leader in AI.
It’s hard to argue that it’s not already being used commercially. And when you loudly announce and demo a product, while launching it in beta, you put additional pressure on the rights holders on the other side of the negotiating table. Any rights holder who opts out deprives consumers of what has already been promised to them.
Lyria was touted by Google as responsible generative AI: the announcement claimed that it “protects music artists and the integrity of their work” and “[sets] the standard for the responsible development and deployment of music generation tools.” I suspect people might perceive these statements differently after Billboard’s revelations.
Why this about-face in Google’s position? How does a company go from being one of the most respectful of musicians’ rights to adopting this approach in just seven years? It could be a combination of factors.
“Google isn’t the worst offender when it comes to music training data… But their claim that they set the standard for responsible development of these tools doesn’t stand up to scrutiny.”
The Google Brain team behind Magenta was merged with DeepMind earlier in 2023, so it’s possible that the original core of the Magenta team has been diluted, along with its musician-centric philosophy. Competitive pressure may have led the team to believe they had to sacrifice training data ethics to stay ahead. Or, perhaps more likely, they simply saw teams in other areas, internally or externally, training on copyrighted works and getting away with it (for now), and the company’s assessment of the legal risks changed.
Either way, they obviously decided it was worth the risk. I’m sure I’m not the only one hoping that this decision will eventually be overturned.
There are many AI music generation companies today that have remained true to Magenta’s original philosophy of respecting the rights of musicians – we have certified eight of them at Fairly Trained. For individuals and businesses who care about human musicians, the choice of which models to use should be obvious. And I suspect the artists who partnered with Google to launch Lyria aren’t thrilled with Billboard’s revelations either.
Google isn’t the worst offender when it comes to music training data. In their defense, they at least make some effort to obtain licenses before making their platform widely available, which can’t be said of all music AI companies (without naming names). But their claim that they set the standard for responsible development of these tools doesn’t stand up to scrutiny.
I have immense respect for the technological advancements Google’s music AI researchers have made over the years. But I hope those who have been around since the Magenta days can overcome what could be formidable internal opposition and return the company to the original Magenta philosophy, which truly set the standard for responsible AI music development.Music Business Worldwide