In a bid to keep up with Google, Meta has unveiled its own AI-powered music generator called MusicGen. What sets MusicGen apart is that Meta has made it open source, unlike Google's MusicLM. This move allows developers and enthusiasts to access and use the tool.
MusicGen takes a text description, such as "An '80s driving pop song with heavy drums and synth pads in the background," and converts it into approximately 12 seconds of audio. It can also be guided by reference audio, in which case the output follows both the text description and the melody of the reference.
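As a rough illustration of the text-to-audio workflow described above, the following minimal sketch uses Meta's `audiocraft` Python package. Several things here are assumptions rather than details from this article: that `audiocraft` and PyTorch are installed, that the checkpoint name `facebook/musicgen-small` is available, and the helper names `generate_clip` and `save_example`, which are hypothetical and exist only for this example.

```python
# Minimal sketch, not an official recipe. Assumes the `audiocraft` package
# (and PyTorch) are installed; the import is done lazily so that merely
# defining these helpers does not require the library or a GPU.

def generate_clip(prompt, duration_s=12.0, model_name="facebook/musicgen-small"):
    """Generate one clip from a text prompt; returns (waveform, sample_rate).

    `model_name` is an assumed checkpoint identifier; larger checkpoints
    need more GPU memory.
    """
    from audiocraft.models import MusicGen

    model = MusicGen.get_pretrained(model_name)
    # Ask for roughly the clip length mentioned in the article.
    model.set_generation_params(duration=duration_s)
    # generate() takes a list of descriptions and returns a batch of waveforms.
    wav = model.generate([prompt])
    return wav[0].cpu(), model.sample_rate


def save_example():
    """Hypothetical driver: writes musicgen_sample.wav to the working directory."""
    from audiocraft.data.audio import audio_write

    prompt = ("An '80s driving pop song with heavy drums "
              "and synth pads in the background")
    wav, sample_rate = generate_clip(prompt)
    # audio_write appends the file extension and normalizes loudness.
    audio_write("musicgen_sample", wav, sample_rate, strategy="loudness")
```

Calling `save_example()` downloads the pre-trained weights on first use, so it needs network access and, for the larger checkpoints, a suitably sized GPU.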
AK (@_akhaliq) wrote on Twitter on June 9, 2023: "Meta just released MusicGen, a simple and controllable model for music generation. MusicGen is a single stage auto-regressive Transformer model trained over a 32kHz EnCodec tokenizer with 4 codebooks sampled at 50 Hz. Unlike existing methods like MusicLM, MusicGen doesn't not…"
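To put the figures in the tweet into perspective, here is a small back-of-the-envelope calculation. It uses only the numbers quoted above (32 kHz audio, 4 codebooks, 50 Hz frame rate) together with the roughly 12-second clip length mentioned earlier; the function name `token_budget` is made up for this example, and the result is an approximation, not a description of the model's internals.

```python
# Back-of-the-envelope arithmetic from the figures quoted in the tweet.
SAMPLE_RATE_HZ = 32_000   # audio sample rate of the EnCodec tokenizer
FRAME_RATE_HZ = 50        # token frames per second
NUM_CODEBOOKS = 4         # one token per codebook in every frame


def token_budget(duration_s):
    """Return rough sample, frame, and token counts for a clip of this length."""
    frames = int(duration_s * FRAME_RATE_HZ)
    return {
        "audio_samples": int(duration_s * SAMPLE_RATE_HZ),
        "frames": frames,
        "tokens": frames * NUM_CODEBOOKS,
        "samples_per_frame": SAMPLE_RATE_HZ // FRAME_RATE_HZ,
    }


budget = token_budget(12)
print(budget)
# {'audio_samples': 384000, 'frames': 600, 'tokens': 2400, 'samples_per_frame': 640}
```

In other words, a 12-second clip corresponds to about 384,000 raw audio samples but only about 2,400 discrete tokens, which is what makes it practical for a Transformer to model the audio directly.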
To develop MusicGen, Meta trained it on an extensive dataset of 20,000 hours of music. This included 10,000 licensed music tracks, which were deemed of high quality, as well as 390,000 instrument-only tracks sourced from Shutterstock and Pond5, prominent stock media libraries. While Meta has not shared the code used for training the model, pre-trained models are available for anyone with compatible hardware, primarily a GPU with around 16GB of memory.
Does this mean AI is about to surpass musicians at their craft? Not yet: MusicGen still falls short of human musicians. The generated songs are reasonably melodic, particularly for basic prompts such as "ambient chiptunes music." They are on par with, if not slightly better than, the results from Google's AI music generator, MusicLM. However, they lack the finesse necessary to win accolades.
For instance, when prompted to generate "jazzy elevator music," MusicGen produced the following sample:
In comparison, here is MusicLM's rendition:
In a more complex test, MusicGen was challenged with the prompt "Lo-fi slow BPM electro chill with organic samples." Surprisingly, MusicGen outperformed MusicLM in terms of musical coherence, delivering a track that could easily find a place on Lofi Girl. Here is MusicGen's sample:
And this is MusicLM's attempt:
To diversify the assessment, both tools were asked to generate a piano composition in the style of George Gershwin. However, Google's MusicLM applies a filter that blocks prompts mentioning specific artists, a measure aimed at addressing copyright concerns. MusicGen has no such restriction. Regrettably, its output for the prompt "Background piano music in the style of Gershwin" was unsatisfactory:
Although generative music has made significant strides, ethical and legal concerns still loom. MusicGen, like other AI systems, learns from existing music to produce similar effects, which has raised concerns among artists and users of generative AI. Homemade tracks made with generative AI have gone viral, often imitating the sound of real artists. Music labels have been quick to flag such tracks to streaming services, citing intellectual property concerns. Whether "deepfake" music infringes copyright remains unclear, pending ongoing lawsuits over the rights of artists whose work is used to train AI systems without their knowledge or consent.
Meta, for its part, has not imposed usage restrictions on MusicGen. The company says that all the music used to train MusicGen is covered by legal agreements with the rights holders, including a deal with Shutterstock.
The emergence of AI music generation technology has ushered in a new era, but its impact and legal implications are still being defined. As the technology evolves, artists, labels, and rights holders eagerly await guidance and clarity regarding the boundaries of generative AI in the music industry.