Stability AI, the startup behind the AI-powered art generator Stable Diffusion, has released an open AI model for generating sounds and songs that it says was trained exclusively on royalty-free recordings.
Called Stable Audio Open, the generative model takes a textual description (e.g., “A rock beat played in a processed studio, a drum session on an acoustic kit”) and produces a recording lasting up to 47 seconds. The model was trained using approximately 486,000 samples from the free music libraries Freesound and Free Music Archive.
Stability AI says the model can be used to create drum beats, instrument riffs, ambient noises and “production assets” for videos, films and TV shows, as well as for “editing” existing songs or applying the style of one song (e.g. smooth jazz) to another.
“One of the main benefits of this open source version is that users can fine-tune the model on their own custom audio data,” Stability AI wrote in an article on its company blog. “For example, a drummer could fine-tune on samples from their own drum recordings to generate new rhythms.”
Stable Audio Open has its limitations, however. It can’t produce full songs, melodies or vocals – at least not good ones. Stability AI says the model is not optimized for this and suggests that users looking for these capabilities opt for the company’s premium Stable Audio service.
Stable Audio Open also cannot be used commercially; its terms of service prohibit it. And it doesn’t perform equally well across musical styles and cultures, or with descriptions in languages other than English – biases that Stability AI attributes to the training data.
“The data source potentially lacks diversity and not all cultures are represented equally in the dataset,” Stability AI wrote in a description of the model. “Samples generated from the model will reflect biases in the training data.”
Stability AI – which has long fought to turn around its declining business – recently became the subject of controversy after its vice president of generative audio, Ed Newton-Rex, resigned over his disagreement with the company’s position that training generative AI models on copyrighted works constitutes “fair use.” Stable Audio Open appears to be an attempt to counter this narrative, while not-so-subtly advertising Stability AI’s paid products.
As music generators, including those from Stability, gain popularity, copyright – and how some generator creators might abuse it – is becoming a focal point of attention.
In May, Sony Music, which represents artists such as Billy Joel, Doja Cat and Lil Nas X, sent a letter to 700 AI companies warning against “unauthorized use” of its content for training audio generators. And in March, the first US law to combat AI abuse in music was signed into law in Tennessee.