Meet the AI jukebox that creates songs

The new OpenAI model generates genre-specific music with lyrics

Photo: OpenAI Jukebox can categorise musicians based on the way they sound.

OpenAI, an independent research organisation aimed at developing “friendly AI,” has been cranking out a lot of impressive work over the past few months. The organisation, for example, recently released the source code for GPT-2, the language model behind the text-generating tool Talk to Transformer. Now, OpenAI is adding Jukebox to its repertoire of AI tools: an AI that generates raw audio of genre-specific songs.

OpenAI recently announced the release of Jukebox, describing it as an AI that generates music in the raw audio domain. Raw audio is uncompressed audio stored as a plain sequence of sample values. Researchers not affiliated with OpenAI have previously said that generating music with the idiosyncrasies and nuances of a real musical performance is only possible with raw audio. Generating it, however, is difficult when the training data is digital music that has already been “cleaned up.”
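To make “raw audio” concrete, here is a minimal Python sketch that builds one second of a pure tone as a list of amplitude values and packs it as headerless 16-bit samples. The 44,100 samples-per-second rate (CD quality), the 440 Hz tone, and the function names are assumptions chosen for illustration; none of this is Jukebox code.

```python
import math
import struct

SAMPLE_RATE = 44_100  # samples per second (CD quality); assumed for this sketch


def sine_samples(freq_hz: float, seconds: float, rate: int = SAMPLE_RATE) -> list[int]:
    """Return 16-bit amplitude values for a pure tone, one integer per timestep."""
    count = int(rate * seconds)
    return [
        round(32767 * math.sin(2 * math.pi * freq_hz * n / rate))
        for n in range(count)
    ]


def to_raw_bytes(samples: list[int]) -> bytes:
    """Pack samples as headerless little-endian 16-bit PCM."""
    return struct.pack(f"<{len(samples)}h", *samples)


tone = sine_samples(440.0, 1.0)
raw = to_raw_bytes(tone)
print(len(tone))             # number of samples in one second of mono audio
print(len(raw))              # two bytes per sample
print(SAMPLE_RATE * 4 * 60)  # timesteps in a four-minute mono track
```

At this rate a four-minute track runs to more than 10 million timesteps, which gives a sense of why a model that must predict every sample has a much harder job than one that works with notes or sheet music.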

Thus, OpenAI trained convolutional neural networks with a curated list of 1.2 million songs to generate the raw-sounding music. In the organisation’s paper describing Jukebox, the researchers say that the 1.2 million songs were paired with their corresponding lyrics and metadata, collected from LyricWiki. The metadata for each song included information like genre, artist, album, and any associated playlist keywords.
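One way to picture a single entry in such a dataset is the Python sketch below. The class name, field names, and placeholder values are all hypothetical and only mirror the fields listed above; this is not OpenAI’s actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class SongRecord:
    """Hypothetical container for one training example; not OpenAI's actual schema."""

    audio_path: str
    lyrics: str
    genre: str
    artist: str
    album: str
    playlist_keywords: list[str] = field(default_factory=list)

    def conditioning(self) -> dict[str, str]:
        """Labels a model could be conditioned on when generating audio."""
        return {"artist": self.artist, "genre": self.genre, "lyrics": self.lyrics}


example = SongRecord(
    audio_path="songs/example.wav",  # placeholder path
    lyrics="placeholder lyrics",
    genre="placeholder genre",
    artist="placeholder artist",
    album="placeholder album",
    playlist_keywords=["placeholder"],
)
print(example.conditioning())
```

Keeping the labels next to the audio is what would let a model be asked for a particular artist, genre, or set of lyrics at generation time.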

Essentially, OpenAI’s software engineers trained the convolutional neural networks, which are machine learning algorithms especially good at identifying patterns in images and language, with the 1.2 million songs and all of their related metadata. Using that training data, the neural networks made their own songs. In other words, the OpenAI team fed the algorithms all of those songs and their associated metadata, and then had them produce raw musical samples that follow the same patterns found in the training samples.

The songs created by Jukebox are stunningly realistic. One of the tracks, for example, was generated after Jukebox received only lyrics, which were co-written by a language modeling tool and OpenAI researchers. That means Jukebox was able to take the provided lyrics and generate an appropriate singing voice, instrumentals, and genre. Jukebox literally created all of that song, except for the lyrics, entirely from scratch.

“While Jukebox represents a step forward in musical quality, coherence, length of audio sample, and ability to condition on artist, genre, and lyrics, there is a significant gap between these generations and human-created music,” the researchers wrote. “For example, while the generated songs show local musical coherence, follow traditional chord patterns, and can even feature impressive solos, we do not hear familiar larger musical structures such as choruses that repeat.”

Looking forward, OpenAI will be moving toward music made by humans in collaboration with machines.

“We expect human and model collaborations to be an increasingly exciting creative space,” OpenAI says in its press release. The organisation adds, however: “While Jukebox is an interesting research result, [the musicians who’ve tested it so far] did not find it immediately applicable to their creative process given some of its current limitations.”
