April 5, 2024

A super-brief introduction to Music AI for non-engineers and newbies alike

We’re starting a series of articles focused on music innovation, and we’ve decided to begin with a general AI piece to set the stage. We also considered naming this article “Music AI for Dummies—BMAT Edition,” but we ended up opting for a more cautious approach.

If you’re an engineer or deeply engaged in AI research, coding, engineering, or a related field, this article may not unveil groundbreaking discoveries. Consider yourself warned, and feel free to enjoy its occasional puns and entertaining references.

Alan had AI dream

In 1950, Alan Turing published a paper entitled “Computing Machinery and Intelligence”, which opened the doors to what is now globally known as Artificial Intelligence (AI). Five years later, the Logic Theorist was presented: a program designed to mimic human problem-solving skills and often considered the first artificial intelligence program. Since then, considerable effort has been dedicated to AI, mainly in academia.

So, what is AI Music

AI music is music that is composed, produced, or generated with the assistance of artificial intelligence (AI) technologies. AI music-generation platforms are like smart synthesisers that make their own music. They’re trained on loads of music data, like vocals, chords, and rhythms, to create new songs by themselves.

How does AI music work

It uses algorithms, Machine Learning and other computational methods to analyse musical data, understand musical patterns and come up with original tunes. So, instead of having a human composer, you’ve got a computer doing the job. AI music can cover many creative outputs, from simple melodies to complex compositions that sound like your favourite artists made them.

Take Daddy’s Car, a song composed with AI in the style of the Beatles. Dropped 7 years ago, it was among the first examples we encountered at an AI Song Contest that still runs today. It’s worth mentioning that although AI handled the composition, the song was brought to life by (human) musicians who took care of the performance, vocals, recording, and production.

What has changed since the beginning

At its core, not much. Most of the engines are symbolic, and none is quite there yet – we said what we said. On the marketing side, a lot has changed. Large Language Models have impressed the world, and almost everyone now acknowledges that AI can replicate human intelligence and perform most tasks that traditionally require human cognitive abilities, including creative projects.

These models represent the progress enabled by Artificial Neural Networks (ANNs), computer systems made up of connected artificial neurons. Inspired by the intricate neural networks of the human brain, these artificial neurons are organised into interconnected layers, each passing its output on to the next.

As ANNs grow and develop, they show that they can achieve things once thought possible only for humans, like being creative.

What is and what isn’t gen AI

Generative AI, which raised strong concerns across the music industry in 2023, refers to a class of artificial intelligence techniques that generate new data, content, or outputs that resemble, but are not identical to, the data they were trained on. Most non-generative AI use cases in the music industry, however, shouldn’t worry creators.

AI operates through a complex interplay of algorithms, data, and computational power, processing vast amounts of data, recognising patterns within that data, and making decisions or predictions based on those patterns.

Since these AI algorithms can spot patterns and trends in vast amounts of data, they enable informed decision-making and can be very helpful to most actors in the value chain. Some existing use cases include mapping of users’ listening habits by Digital Service Providers (DSPs), assisted mixing and mastering, automated music tagging, metadata matching, data reconciliation, and many more.

In short, generative AI music creates new content based on learned patterns, while non-generative AI assists in tasks like analysing listening habits and tagging music without generating new content.

Machine Deep Learning 101

One of the key components of AI is Machine Learning – a subset of AI that focuses on enabling computers to learn from data without being explicitly programmed. Machine Learning algorithms are trained on large datasets, where they iteratively adjust their parameters to improve performance on a specific task. This process allows AI systems to recognise patterns and make predictions or decisions with increasing accuracy over time.
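
To make “iteratively adjust their parameters” a bit more concrete, here is a minimal sketch of the idea, not a real music model. It assumes a made-up dataset that follows y = 2x + 1 and a two-parameter model; the names `train` and `mean_squared_error`, the learning rate and the step count are all illustrative choices of ours.

```python
# Toy dataset: points that follow y = 2x + 1 (made-up numbers for illustration).
data = [(x, 2.0 * x + 1.0) for x in range(-5, 6)]


def mean_squared_error(w, b):
    """Average squared gap between the model's predictions and the targets."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def train(steps=500, learning_rate=0.01):
    """Fit y = w*x + b by gradient descent, starting from w = b = 0."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
        # Nudge each parameter slightly in the direction that lowers the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w, b = train()
print(f"learned w={w:.3f}, b={b:.3f}, error={mean_squared_error(w, b):.6f}")
```

Nobody tells the program that the answer is 2 and 1. It starts from zero and keeps correcting itself against the data, which is the “learning” part in a nutshell.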

[Image: AI, Machine Learning, and deep learning]

Deep learning, a subfield of Machine Learning, has gained significant traction in recent years due to its ability to learn hierarchical representations of data automatically. Deep learning models are neural networks with multiple layers of interconnected nodes that extract intricate features from raw data, enabling more sophisticated tasks like image recognition, natural language processing, and autonomous driving.

Cool cats, according to AI

Image recognition is a good example for understanding the mechanics of deep learning. To train AI software on images of cats, the neural network’s input layer takes in millions of raw cat images. It transforms this data into numerical values and passes it through subsequent network layers, each refining the input representation. As this process continues, the information in the hidden layers becomes increasingly abstract, losing direct correlation with the original image data.
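
The layer-by-layer journey described above can be sketched in a few lines. This is a toy under loud assumptions: the “image” is four made-up pixel values, the weights are random rather than learned, and the layer sizes and function names (`make_layer`, `apply_layer`) are ours, so the final “cat score” is meaningless. It only shows how numbers flow from the input through hidden layers to an output.

```python
import math
import random

random.seed(0)  # fixed seed so the made-up weights are reproducible


def make_layer(n_inputs, n_outputs):
    """Random weights and zero biases; a real network would learn these."""
    weights = [[random.uniform(-1, 1) for _ in range(n_inputs)]
               for _ in range(n_outputs)]
    biases = [0.0] * n_outputs
    return weights, biases


def relu(values):
    """Keep positive signals, silence negative ones."""
    return [max(0.0, v) for v in values]


def sigmoid(value):
    """Squash any number into the range 0..1."""
    return 1.0 / (1.0 + math.exp(-value))


def apply_layer(layer, inputs):
    """Weighted sum of the inputs for every neuron in the layer."""
    weights, biases = layer
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


# A pretend 2x2 greyscale image, flattened to four pixel brightness values.
pixels = [0.9, 0.1, 0.4, 0.7]

hidden_1 = make_layer(4, 3)
hidden_2 = make_layer(3, 2)
output = make_layer(2, 1)

activations_1 = relu(apply_layer(hidden_1, pixels))
activations_2 = relu(apply_layer(hidden_2, activations_1))
cat_score = sigmoid(apply_layer(output, activations_2)[0])

print(activations_1, activations_2, cat_score)
```

Notice that the numbers in `activations_2` no longer look anything like pixel values. That is the abstraction the paragraph above talks about, just at a very small scale.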

A case in point is the Meow Generator, a collection of Machine-Learning algorithms that have been unleashing thousands of disturbing cat faces into the world – 15,749 of them, to be exact. These creepy cats are the work of Alexia Jolicoeur-Martineau, who started with a dataset of 10,000 cat pictures and focused on their faces. To create new cat faces, she used a generative adversarial network (GAN), which consists of two neural networks: one generating images and the other judging their authenticity.

[Image: AI cat image identification]

These networks were trained to recognise cat faces and generate new ones. The generator adapts based on the discriminator’s feedback until the generated images are almost indistinguishable from real ones – almost.
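
For the curious, the generator-versus-discriminator tug of war can be sketched without a single cat. This is not the Meow Generator’s code. It is a toy under stated assumptions: the “real data” are plain numbers scattered around 4, the generator has one parameter `mu`, the discriminator is a one-line logistic model, and every name and setting here is hypothetical.

```python
import math
import random

random.seed(42)  # fixed seed so repeated runs give the same numbers

REAL_MEAN = 4.0  # the "real data" are numbers scattered around 4 (made up)
NOISE = 0.5


def sigmoid(value):
    # Clamp the input so math.exp never overflows.
    value = max(-60.0, min(60.0, value))
    return 1.0 / (1.0 + math.exp(-value))


def train_gan(steps=2000, learning_rate=0.05):
    mu = 0.0         # generator parameter: the value its fakes cluster around
    w, c = 0.0, 0.0  # discriminator parameters: D(x) = sigmoid(w * x + c)
    history = []
    for _ in range(steps):
        real = random.gauss(REAL_MEAN, NOISE)
        fake = mu + random.gauss(0.0, NOISE)

        # Discriminator step: push D(real) towards 1 and D(fake) towards 0.
        d_real = sigmoid(w * real + c)
        d_fake = sigmoid(w * fake + c)
        grad_w = -(1.0 - d_real) * real + d_fake * fake
        grad_c = -(1.0 - d_real) + d_fake
        w -= learning_rate * grad_w
        c -= learning_rate * grad_c

        # Generator step: shift mu so the updated discriminator
        # rates the same fake as more likely to be real.
        d_fake = sigmoid(w * fake + c)
        grad_mu = -(1.0 - d_fake) * w
        mu -= learning_rate * grad_mu
        history.append(mu)
    return mu, history


mu, history = train_gan()
print(f"generator centre after training: {mu:.2f} (real centre: {REAL_MEAN})")
```

The generator never sees the real numbers. It only gets the discriminator’s verdict and adjusts accordingly. Small GANs like this are known to wobble rather than settle neatly, so where `mu` ends up depends on the learning rate and the number of steps.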

[Image: AI cat image identification]

Eventually, the AI determines the relationship between the observed data and the desired output—such as recognising a cat in the final layer. Consequently, the reasoning behind the AI’s decision becomes opaque to human understanding, making it challenging for operators to trace how a specific result, like identifying a cat, is reached.

AI systems also utilise natural language processing (NLP) techniques to understand and generate human language, computer vision to interpret visual information from images or videos, and reinforcement learning to make sequential decisions through trial and error.

Everybody will be AI dancing, and BMAT will be doing it right

We and AI go way back, and this serves as extra motivation for us to become a tech leader in that field for the music industry. As such, we are working on pioneering ideas such as the use of embeddings to match and homogenise music data and metadata databases.

[Image: Musical works embedding]

Fundamentally, we’re exploring new AI techniques and applying them to entity resolution within the music industry to develop the best data-matching engine. This engine will optimise specific use cases and enable scalable back-end services. These include the creation of master databases to unify different versions of musical entities, establishing links between these entities, and identifying music compositions and phonograms within unstructured texts from platforms like YouTube.

[Image: BMAT technology using AI in 2024: embeddings]

In particular, we investigate and evaluate the use of so-called embeddings (vector representations of raw data that encapsulate its semantic meaning) for comparing entities and the use of Transformer-type neural networks. Embeddings can be generated from raw text, images, or audio.
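
As a rough illustration of comparing entities through embeddings, here is a minimal sketch. It does not reflect BMAT’s actual engine: the titles and the 4-dimensional vectors are invented for the example (real embeddings come from a trained model and typically have hundreds of dimensions), and cosine similarity with a 0.9 threshold is simply one common way to compare vectors that we assume here.

```python
import math


def cosine_similarity(a, b):
    """Close to 1.0 when two vectors point the same way, near 0 when unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical catalogue: invented titles with invented embeddings.
catalogue = {
    "Blue Harbour": [0.9, 0.1, 0.3, 0.0],
    "Midnight Tram": [0.1, 0.9, 0.0, 0.4],
    "Paper Kites": [0.2, 0.2, 0.9, 0.3],
}


def best_match(query_vector, entries, threshold=0.9):
    """Return (title, score) for the closest entry, or (None, score) if too far."""
    title, vector = max(
        entries.items(),
        key=lambda item: cosine_similarity(query_vector, item[1]),
    )
    score = cosine_similarity(query_vector, vector)
    return (title, score) if score >= threshold else (None, score)


# Pretend embedding of a messy metadata string referring to "Blue Harbour".
query = [0.8, 0.2, 0.35, 0.05]
print(best_match(query, catalogue))
```

The appeal of this approach is that two records do not need matching spellings to be linked. If their embeddings land close together, they are treated as candidates for the same entity, and the threshold decides how close is close enough.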

At BMAT, we have trained an AI system able to encode the musical information of an audio file into an embedding, and we use these embeddings for massive cover song identification with very interesting results. Our preliminary experiments using a collection of 3 million songs achieved the expected results for standard covers, but we also observed promising side effects of the algorithm. We realised that we’ve built a system able to spot copyright infringements from a wide musical perspective. We plan to keep developing this idea to fully understand the capabilities of the technology for similar use cases.

The UPF-BMAT Chair on Artificial Intelligence and Music is another AI initiative in which BMAT is engaged. Think of it as a sturdy chair supporting the weight of innovation and creativity in the music industry in Spain.

In collaboration with UPF, this Chair involves researchers and staff from the Music Technology Group and BMAT, establishing partnerships with institutions and experts in the national music sector. This includes applied research, software development, and creating advanced technologies beneficial to the sector, promoting open innovation through public-private collaboration.

And we also made sure that the Chair is as comfortable as a studio recliner for those long brainstorming sessions that await us.

This is (not) the end, my friend

If you’re worried that AI will destroy the music industry, chill. It’s like giving a fish, a highly intelligent one though, a bicycle – impressive, but not really doing the trick.

Or maybe you’re reading this from the future ROFL-ing. Only time will tell.

There’s a popular internet quote saying that if you want to know the future, you have to look at the past. Although there’s no evidence that Albert Einstein actually said this, hear us out.

After the release of The Jazz Singer in 1927, all bets were off for live musicians who played in movie theatres. Thanks to synchronised sound, live musicians became unnecessary at the movies, and all of a sudden, more than 80% of musicians in the industry were out of work.

In 1930, the American Federation of Musicians established a new organisation named the Music Defense League. They initiated harsh advertising campaigns to fight the spread of what they believed was a dreadful threat: recorded sound, or as they termed it, “canned music.”

[Image: 1930 Music Defense League]

It’s hard not to notice the similarities.

AI is making waves across various sectors, including business, product development, technology and creativity. AI’s impact extends to the music industry itself with applications ranging from AI-generated music to innovative tools for music operations and royalty distribution – and that would be our cue.

With new AI laws on copyrighted music in Europe and never-ending industry conversations, we felt like it was time to launch an initiative to examine AI’s impact from every angle. In other words, we’ve started a company-wide effort to explore how AI is reshaping the music industry.

Like the rest of the planet, we plan to ride the AI wave: cut costs and boost productivity while shifting our focus to more strategic human endeavours. Our version is oriented towards the music industry, because that’s our bread and butter. At the same time, we aspire to elevate our standards to ensure every artist has the best chance of getting paid. That’s been our thing since day 1.

This is how we founded the BMAT AI Defense League (sadly not the AI Musical Justice League), a cross-functional in-house division dedicated to driving innovation across several sectors. We’re leveraging AI to fuel growth, optimise business workflows, streamline development processes, and push the boundaries of digital innovation through research and product development. And a few other things, but we’ll leave those classified for now.

We’ll disclose more as we go, so keep an AI on us.

