BBC Research & Development

Posted by Chris Baume

One of the universal appeals of music lies in its mysterious ability to manipulate and reflect our emotions. Even the simplest of tunes can evoke strong feelings of joy, fear, anger, sadness and anything in between. Music is a huge part of what the BBC does - in fact it broadcasts over 200,000 different tracks every week. With so much music to choose from, especially in the digital age, there is more and more interest in finding ways of navigating music collections in a more human way. Some of our colleagues are looking at ways of finding TV programmes by mood, but can we do something similar for music?

The alliteratively-named 'Making Musical Moods Metadata' is a collaborative project between BBC R&D, Queen Mary University of London (QMUL) and I Like Music. Part of the project involves researching how information about the mood of music tracks can be added to large collections. I Like Music is a company that provides the BBC with an online music library called the 'Desktop Jukebox', which includes over a million songs. Labelling each of these by hand would take many years, so we are developing software that will do it automatically.

The Desktop Jukebox interface.

Calculating emotions

As you can imagine, getting a computer to understand human emotions has its challenges - three, in fact. The first one is how to numerically define mood. This is a complicated task as not only do people disagree on the mood of a track, but music often expresses a combination of emotions. Over the years, researchers have come up with various models, notably Hevner's adjective circle, which defines eight mood categories, and Russell's circumplex model, which represents mood as a point on a two-dimensional plane. Both approaches have their drawbacks, so our partners at QMUL's Centre for Digital Music are developing a model which combines the strengths of both. The model will be based on earlier research conducted on the emotional similarity of common keywords.
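To give a feel for the two-dimensional idea, a mood can be treated as a point whose coordinates are valence (how positive the emotion is) and arousal (how energetic it is), so that similar moods sit close together on the plane. The toy Python sketch below is purely illustrative - the coordinates are made up for the example and are not taken from the model our partners are developing.

    from math import dist

    # A mood as a point on the valence-arousal plane: (valence, arousal),
    # each roughly in the range -1 (negative/calm) to +1 (positive/excited).
    # The coordinates below are illustrative guesses, not measured values.
    moods = {
        'happy':   ( 0.8,  0.5),
        'angry':   (-0.6,  0.8),
        'sad':     (-0.7, -0.5),
        'relaxed': ( 0.6, -0.6),
    }

    # Moods that sit close together on the plane are treated as similar
    print(dist(moods['happy'], moods['relaxed']))   # fairly similar
    print(dist(moods['happy'], moods['sad']))       # very different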

Russell's circumplex model.

The next challenge is processing the raw digital music into a format that the computer can handle. This should be a small set of numbers that represents what a track sounds like. These are created by running the music through a set of algorithms, each of which produces an array of numbers called 'features'. These features represent different properties of the music, such as the tempo and what key it's written in. They also include statistics about the frequencies, loudness and rhythm of the music. The trick lies in finding the right set of features that describe all the properties of music that are important for expressing emotion.
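As a rough sketch of what feature extraction can look like, the Python example below uses the open-source librosa library - an assumption for illustration only, not necessarily what the project uses - to summarise a track as a short vector covering tempo, timbre, tonality and loudness.

    import numpy as np
    import librosa  # assumed library, used here only to illustrate the idea

    def extract_features(path):
        """Summarise a track as a small, fixed-length vector of audio features."""
        y, sr = librosa.load(path, mono=True)

        # Rhythm: estimated tempo in beats per minute
        tempo, _ = librosa.beat.beat_track(y=y, sr=sr)

        # Timbre: mean and spread of 13 MFCCs, describing spectral shape
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)

        # Harmony: average chroma vector, a rough proxy for key and tonality
        chroma = librosa.feature.chroma_stft(y=y, sr=sr).mean(axis=1)

        # Loudness: statistics of the RMS energy envelope
        rms = librosa.feature.rms(y=y)

        return np.concatenate([
            [tempo],
            mfcc.mean(axis=1), mfcc.std(axis=1),
            chroma,
            [rms.mean(), rms.std()],
        ])

Each track, however long, ends up as the same small list of numbers, which is what makes the next step - machine learning - possible.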

Now for the final challenge. We need to find out exactly how the properties of the music work together to produce different emotions. Even the smartest musicologists struggle with this question, so - rather lazily - we're leaving it to the computer to work it out.

Machine learning is a method of getting a computer to 'learn' how two things are related by analysing lots of real-life examples. In this case, it is looking at the relationship between musical features and mood. There are a number of algorithms we could use, but initially we are using the popular 'support vector machine' (SVM) which has been shown to work for this task and can handle both linear and non-linear relationships.
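A minimal sketch of this kind of classifier, assuming scikit-learn and the hypothetical extract_features function, track_paths and mood_labels from above, might look like the following. It is an illustration of the general approach rather than the project's actual training code.

    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC
    from sklearn.model_selection import cross_val_score

    # X: one feature vector per track; y: one mood label per track,
    # taken from the hand-applied keywords (both hypothetical here)
    X = np.array([extract_features(p) for p in track_paths])
    y = np.array(mood_labels)

    # An RBF kernel lets the SVM capture non-linear relationships between
    # the audio features and the mood labels; scaling the features first
    # stops any single feature from dominating the kernel.
    model = make_pipeline(StandardScaler(), SVC(kernel='rbf', C=1.0))

    # Estimate how well the classifier generalises to unseen tracks
    scores = cross_val_score(model, X, y, cv=5)
    print('mean accuracy: %.2f' % scores.mean())

    model.fit(X, y)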

For the learning stage to be successful, the computer will need to be 'trained' using thousands of songs that have accompanying information about the mood of each track. This kind of collection is very hard to come across, and researchers often struggle to find appropriate data sets. Not only that, but the music should cover a wide range of musical styles, moods and instrumentation.

Production music

Although the Desktop Jukebox is mostly composed of commercial music tracks, it also houses a huge collection of what is known as 'production music'. This is music that has been recorded using session artists, and so is wholly owned by the music publishers who get paid each time the tracks are used. This business model means that they are keen to make their music easy to find and search, so every track is hand-labelled with lots of useful information.

Through our project partners at I Like Music, we obtained over 128,000 production music tracks to use in our research. The tracks, which are sourced from over 80 different labels, include music from every genre.

The average production music track is described by 40 keywords, of which 16 describe the genre, 12 describe the mood and 5 describe the instrumentation. Over 36,000 different keywords are used to describe the music, the top 100 of which are shown in the tag cloud below. Interestingly, about a third of the keywords only appear once, including such gems as 'kangaroove', 'kazoogaloo', 'pogo-inducing' and 'hyper-bongo'.

A tag cloud of the top 100 keywords used to describe production music. The more common the keyword, the larger the font size.

Drawing a mood map

In order to investigate how useful the keywords are in describing emotion and mood, the relationships between the words were analysed. The way we did this was to calculate the co-occurrence of keyword pairs - that is, how often a pair of words appears together in the description of a music track. The conjecture was that words which often appear together have similar meanings.
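Counting co-occurrences is straightforward: for every track, count each unordered pair of keywords that describes it. The short Python sketch below shows the idea on a few made-up tracks; the data is hypothetical.

    from collections import Counter
    from itertools import combinations

    # Hypothetical example data: one set of mood keywords per track
    track_keywords = [
        {'happy', 'uplifting', 'energetic'},
        {'sad', 'reflective'},
        {'happy', 'energetic', 'driving'},
    ]

    cooccurrence = Counter()
    for keywords in track_keywords:
        # Count every unordered pair of keywords describing the same track
        for pair in combinations(sorted(keywords), 2):
            cooccurrence[pair] += 1

    # Pairs that appear together most often are assumed to be closest in meaning
    print(cooccurrence.most_common(5))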

Using the top 75 mood keywords, we calculated the co-occurrence of each pair in the production music database to produce a large matrix. In order to make any sense out of it, we used a graph visualisation tool to lay out the keywords and the connections between them. Those with strong connections (that often appeared together) were positioned close to each other, and those with weak connections further apart. You can explore the resulting graph using any up-to-date desktop browser.
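The layout idea is a force-directed one: strongly co-occurring keywords attract each other, weakly connected ones drift apart. The article doesn't name the tool we used, so the sketch below simply illustrates the principle with the networkx library and the hypothetical cooccurrence counts from the previous example.

    import networkx as nx

    # Build a weighted graph from the co-occurrence counts computed above
    G = nx.Graph()
    for (a, b), count in cooccurrence.items():
        G.add_edge(a, b, weight=count)

    # A force-directed (spring) layout pulls strongly connected keywords
    # together and pushes weakly connected ones apart, producing a 'mood map'
    positions = nx.spring_layout(G, weight='weight', seed=42)

    for keyword, (x, y) in positions.items():
        print('%-12s  x=%+.2f  y=%+.2f' % (keyword, x, y))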

Part of a graph which describes how keywords link together.

We found that the keywords arranged themselves into a logical pattern, with negative emotions on the left and positive emotions on the right, and energetic emotions at the top and lethargic emotions at the bottom. This roughly fits Russell's circumplex model, suggesting that it may be a suitable way to describe moods in the production music library. However, more research is required before a model is chosen.

Next steps

We have been working with the University of Manchester to extract features from over 128,000 production music files. Once that work is complete, we will be able to start training and testing musical mood classifiers which can automatically label music tracks. Watch this space for updates, and hopefully a working online demo.

For further information on the latest developments in music emotion recognition, check out the proceedings of the main music information retrieval conferences.