Lauren Johns
- Feb 27

Artificial Intelligence in Music: A Useful Tool or a Replacement?

By: Lauren Johns

In a society of wolf criers, theories about the world ending are nothing new. But modern-day technology is adding fuel to this fire, with the skyrocketing consumption of AI-based gadgets. According to Forbes, 50% of mobile users in the U.S. use AI voice assistants like Alexa and Siri. In addition, the chatbot-turned-mobile-app “ChatGPT” gained 1 million users within five days of its release. Due to AI’s popularity and rapid adoption, the global AI market is worth around $150.2 billion (Radix Blog).

An artist’s illustration of artificial intelligence (AI) language models, credit to: Google DeepMind

These chatbots and virtual assistants attempt to mimic human conversation and human thought: telling jokes, using basic manners, writing stories and essays, creating art, and even singing and writing music. AI does not have a mind of its own, but it is programmed to lessen the burden on us.

Let’s take the singing concept and expand on it a bit. If AI has the power to instantly create music and vocals, what does this mean for the future of the music industry? Will real musicians be replaced in an effort to reduce budget and save time?

Lena Marvin, Institutional Repository and Reference Librarian, is worried about musicians who write atmospheric music: elevator music, hold music and stingers (short clips of music typically found in movies and TV shows).

“When I was in college, I was friends with a guy whose dad wrote the musical stinger for the villain in the soap opera ‘Days of Our Lives.’ Every time that musical stinger played, their family got paid for the royalties. It is absolutely possible that this could go away and be replaced by AI generated sound. Who would you pay in this case? The person who entered the prompt?”

Furthermore, Marvin believes that the evolution of AI will lead to auto-generation. Video games, for instance, may no longer require sound designers.

“Video game soundtracks may shift to being generated within the video game itself,” Marvin said. “You could actually build it into the programming based on character setting. So if they're in a dungeon next to a creek, it makes creek noises, it makes dungeon noises, it's all automated.”

If you need a comparison, no one tells the game Tetris how many blocks to drop; it’s programmed.

“There's this really great podcast called 20,000 Hertz, which is all about the history of sound,” Marvin said. “They did a really wonderful episode about who designed the Windows startup sound.”

She goes on to state that nowadays, a simple noise like that wouldn’t require a human being.

“There are artists along the way who've been paid for various things that’ve informed all of our listening forever.”

Photo credit to arstechnica

Marvin described a situation in Japan where motorcycles were too quiet (a safety hazard), so sound designers had to be hired to create artificial noises.

“Another noise you may not think about is the turn signal in your car,” Marvin said. “It used to be a mechanical thing but now it’s just a noise played through your car speakers.”

Before discussing this topic, Marvin did a segment on UMSL Radio about the history of electronic music and AI-generated music, which piqued her interest.

“I’ve been playing with some AI music software like MusicFX, this one requires a sign up for access, but you can use word descriptions to generate music,” Marvin said. “They reference an old feature on Google called, ‘Feeling lucky’, where it will just give random suggestions. Once you decide on a prompt, it creates 30 second tracks.”

Marvin adds that it will not generate music with certain queries that mention specific artists or include vocals.

“I’ve also experimented with MusicLM,” Marvin said. “You can incorporate vocals but there isn’t much language capability. It sounds like a language that isn’t actually a language, venturing into uncanny valley territory.”

For those unfamiliar, “uncanny valley” is a phenomenon where a robot or other being closely resembles a human but something feels slightly off or peculiar. In this case, the vocals sound like a human singing but are so incomprehensible that they may evoke a sense of unease or dread (definitions from TechTarget and Dictionary.com).

“The vocals on MusicLM remind me of this song that came out in the ’70s,” Marvin said. “It was meant to emulate how English speaking sounds to Italians. Someone told me it was like listening to English out of focus.”

Speaking of peculiar music, if you’ve been on any video-based apps lately, you may have encountered some AI music covers. The thumbnail is usually an AI-enhanced image of a famous musician or band, and the video’s creator turns into Ursula with all the voice stealing.

“I have seen some of the deep fake videos out there and they're pretty convincing,” said Matthew Henry, Associate Professor of Music and professional drummer. “You have to have enough reference material to make it work, I say around 100 hours worth. People that have had podcasts or something, so fortunately, you couldn’t deep fake me.”

To elaborate, deep fakes are videos in which faces and audio are manipulated to replace one person with another (Merriam-Webster).

When asked if “deep fakes” could capture emotions, Henry zeroed in on the lack of interaction.

“I think the deep fakes that are out there, which are really good, you can't interact with them,” Henry said. “You couldn't ask a fake Joe Biden or Donald Trump a question and have them respond. With singing, you can’t stop the AI and say, ‘hey, why'd you sing that note that way?’ It's not like there's a conversation that can happen. It's just a presentation of material.”

To further his point, he explained that sometimes the inflection of the singer could sound like a form of emotion but it’s subjective.

“You can have a shrill sounding voice or a Billie Eilish kind of sounding voice,” Henry said. “I think all of that is going to convey emotion. But is it the emotion in the music or is it the emotion perceived by the listener?”

Tossing the idea of emotion aside, deep fake videos can be used for a wide range of endeavors: creating one’s ideal music collaboration (replacing Zac Efron and Zendaya with Taylor Swift and Harry Styles in “Rewrite the Stars”), comedy (System of a Down singing “Barbie Girl”) and manipulating the speaking voice of a former president (deep fake Obama parody by BuzzFeed, profanity warning). If you want to see someone reacting to the popular deep fakes, check out the channel Nick Higgs the Singer.

To continue down this rabbit hole, there’s also a channel on TikTok called “There I Ruined It.” Unfortunately, its YouTube channel was taken down due to copyright claims.

Regarding copyright, Marvin claims that many view AI as a plagiarism machine.

“In a sense, all culture is remix culture,” Marvin said.

The question of the ages is: can you copyright a voice?

“They say imitation is the biggest form of flattery,” Henry said. “But I think artists are going to have to copyright their voice style. If you sing or create covers that sound really similar to the artist, you might not be able to monetize it unless you pay the artist.”

This brings the concept of creating an album of cover songs, and its implications, into play.

“I'm in a band in St. Louis and if we wanted to have an album where we cover Taylor Swift songs, we’d have to pay the publisher,” Henry said. “If we just want to play it at a bar we can do whatever we want. Generally you have to get the permission of the company that publishes Taylor Swift's music.”

He also states that if you want to create a video (about Taylor Swift) and sell it, making money off of something tangible, it’s a copyright battle waiting to happen.

AI album art, taken from: tunelabs

Another issue arises with regard to song ownership.

“We have to redefine what intellectual property is,” Henry said. “If you write a piece of music and you put your name on it and a date then it's copyrighted. But if you give a prompt to the AI and it writes the music and then you put your name on it, do you still own it?”

According to the Cybertalk blog, if a song is generated with very little human interaction, nobody owns it and it becomes public domain. But if you do intervene by entering a prompt and tailoring it to your unique style, you own the song. This may not apply to every circumstance, however; it’s far from black and white. The law has yet to fully adapt to this concept.

Returning to the concept of deep fakes, AI is a mastermind at creating shams. Marvin stresses that if you are going to create an AI cover, you need to credit it as such and not try to profit off of it. It goes without saying, don’t steal someone’s identity.

Marvin shares that voice imitations can be used for purposes other than entertainment, namely theft.

A CNBC article on the subject states that last March, the Federal Trade Commission (FTC) sent out a consumer alert warning people that scammers could target the elderly and ask for money while mimicking the voice of a loved one.

“It’s all pre-scripted, so if the victim asks questions like, ‘what do you need the money for?’, they will have something prepared,” Marvin said. “Bad voice quality can be blamed on bad signal or other phone issues.”

According to the same article, there are preventative measures you can take. First, make sure to fact-check everything that is being said. Next, if you have a panicked “family member” on the line, consider hanging up and calling or texting them using their own number.

Despite the harms of the above scenario, AI is neither evil nor good. According to Marvin, it depends on its creator and the purpose it was designed for.

“There was a new Beatles song released in 2023 called ‘Now and Then.’ It was something they recorded years ago,” Zachary Cairn, Associate Professor of Music Theory, said. “People assumed they were going to recreate the late John Lennon’s voice but that wasn’t true.”

As mentioned in an NPR article, John Lennon recorded a song demo on an analog tape with a TV playing in the background, so the remaining band members decided to clean it up and isolate the vocals using AI.

New Beatles song, photo credit to: Upworthy

Another instance of using AI for good is the elimination of busy work.

“There's a research project that I'm working on right now that deals with the analysis of about 500 rock songs that were written in the ’80s and the early ’90s,” Cairn said. “Corpus analysis is a big thing in music theory these days, where I’m trying to look for specific musical features and trends in a really large body of repertoire. If there was some sort of AI tool that I could train to look for these features and then kind of set it loose on the music, that would save me a lot of time.”

Henry has a similar viewpoint, believing that AI will help more than hinder, especially with regard to songwriting.

“AI is just a glorified search engine, it could help with songwriting by collecting popular music trends and analyzing all that data,” Henry said.

The pessimistic mindset that AI would replace real singers and take away from productivity was not a perspective Henry was willing to consider.

“Replace is the wrong word,” Henry said. “It’s not like they are going away. In fact, it could have an artist last after their death. They already have holograms of Tupac, Michael Jackson and Nat King Cole.”

He further explains that the music industry has the potential to benefit from AI.

“I don’t think it's going to hurt the music industry, but it’s going to change the music industry,” Henry said. “It's definitely going to shift. And we have to be ready to understand how we can use it to make music better.”

After reading this article, if you want to experiment with AI music production or vocal effects, look no further:

Retrieval-based Voice Conversion (RVC, version 2; free to use).
ACE Studio (functions like a digital instrument; you can choose from a variety of authorized, commercially available AI voices, so you won’t have to worry about copyright).
Ultimate Vocal Remover (free to use, but it might be a lengthy process, especially if you have spotty WiFi).
Respeecher (voice cloning software; great for content creators or people with too much free time).
