
Automatic music classification system puts songs in their place

A new computer system is able to accurately classify music according to genre

It's a growing problem: a dizzying number of songs get released to online music stores and streaming services or uploaded to archives around the world each day, and those songs need to be categorized. But how? Play the same song to 10 people and they might each put it into a different genre or subgenre. An automated genre identification system developed by researchers in India, which they claim is the best yet, could be the answer.

The system, created by a group led by Arijit Ghosal of the Neotia Institute of Technology Management and Science, is predicated on the idea that musical genres are characterized by pitch, tempo, amplitude variation patterns (changes between loud and soft), and periodicity (how, and to what extent, the music repeats phrases). Most major genres can be identified by pitch analysis – which reflects the melody – alone, but including the others makes for more accurate readings.
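In code terms, each song boils down to a single feature vector combining those four families. The sketch below is purely illustrative – the dimensions and grouping are assumptions, not the paper's exact layout:

```python
import numpy as np

# Hypothetical layout of one song's feature vector; sizes are illustrative only
pitch = np.zeros(88)        # per-band short-time mean-square power (see below)
tempo = np.zeros(1)         # estimated beats per minute
amplitude = np.zeros(2)     # texture summary of the smoothed signal
periodicity = np.zeros(2)   # mean and std of per-frame max cross-correlation
feature_vector = np.concatenate([pitch, tempo, amplitude, periodicity])  # 93 values
```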

To analyze music for pitch features, the system breaks it down into 88 frequency bands, each divided into short frames, and calculates the short-time mean-square power (a measure related to the sound wave's voltage and current) for each frame, both individually and as an average. For tempo, it starts with a novelty curve, which follows changes in the song's timbre, or tone color – essentially, when the instrumentation changes. It then takes a Fourier transform of that curve, which deconstructs it into many sine curves, each corresponding to a different frequency; the dominant frequency can then be read off as the beats per minute.
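To make the pitch and tempo steps concrete, here is a minimal Python sketch. The band layout (88 piano-key bands spanning MIDI notes 21–108), the frame length, the use of spectral flux as the novelty curve, and the 30–240 BPM search range are all assumptions filled in for illustration; the paper specifies only 88 bands, short frames, and a timbre-based novelty curve:

```python
import numpy as np
from scipy.signal import butter, sosfilt

def pitch_features(x, sr, frame_len=1024):
    """Short-time mean-square power in 88 frequency bands (one value per band)."""
    feats = []
    for midi in range(21, 109):                            # 88 piano keys (assumed)
        f0 = 440.0 * 2 ** ((midi - 69) / 12.0)             # band center frequency, Hz
        lo, hi = f0 * 2 ** (-0.5 / 12), f0 * 2 ** (0.5 / 12)  # half-semitone edges
        sos = butter(2, [lo, hi], btype="bandpass", fs=sr, output="sos")
        band = sosfilt(sos, x)
        n = len(band) // frame_len * frame_len
        frames = band[:n].reshape(-1, frame_len)
        stmsp = (frames ** 2).mean(axis=1)                 # mean-square power per frame
        feats.append(stmsp.mean())                         # averaged across frames
    return np.array(feats)

def tempo_bpm(x, sr, n_fft=2048, hop=512):
    """Novelty curve -> Fourier transform -> beats per minute."""
    frames = np.lib.stride_tricks.sliding_window_view(x, n_fft)[::hop]
    mags = np.abs(np.fft.rfft(frames * np.hanning(n_fft), axis=1))
    # Spectral flux: sum of positive magnitude changes between adjacent frames
    novelty = np.maximum(mags[1:] - mags[:-1], 0.0).sum(axis=1)
    novelty -= novelty.mean()
    spec = np.abs(np.fft.rfft(novelty))                    # periodicities in the curve
    freqs = np.fft.rfftfreq(len(novelty), d=hop / sr)      # in Hz
    valid = (freqs >= 0.5) & (freqs <= 4.0)                # 30-240 BPM range (assumed)
    return 60.0 * freqs[valid][np.argmax(spec[valid])]
```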

For amplitude variation patterns, the signal is smoothed and then matrix operations are performed on it to extract the equivalent of the signal's texture. For periodicity, the system divides the signal into frames of 100 samples each and calculates cross-correlations between them, then takes the maximum cross-correlation of each frame and uses those maxima to compute a mean and standard deviation.
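The periodicity step is straightforward enough to sketch directly. How frames are paired for the cross-correlation isn't spelled out, so correlating each frame with the next is an assumption:

```python
import numpy as np

def periodicity_features(x, frame_len=100):
    """Mean and standard deviation of per-frame maximum cross-correlation."""
    n = len(x) // frame_len * frame_len
    frames = x[:n].reshape(-1, frame_len)
    maxima = []
    for a, b in zip(frames[:-1], frames[1:]):        # consecutive pairing (assumed)
        a, b = a - a.mean(), b - b.mean()
        cc = np.correlate(a, b, mode="full")         # cross-correlation at all lags
        denom = np.sqrt((a ** 2).sum() * (b ** 2).sum())
        maxima.append(cc.max() / denom if denom > 0 else 0.0)
    maxima = np.asarray(maxima)
    return maxima.mean(), maxima.std()               # the two summary features
```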

All of this information is fed into a classification scheme. The researchers tested their method with three classifiers: a multilayer perceptron (MLP), an artificial neural network consisting of multiple layers of neuron-like units called perceptrons; support vector machines (SVMs), which learn a decision boundary from a set of training data; and random sample consensus (RANSAC), which fits a model to a randomly selected subset of the data, checks how well the remaining data agrees with it, and iterates, keeping the best-fitting estimate as the final one.
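The MLP and SVM halves of that comparison are easy to reproduce with off-the-shelf tools; note that scikit-learn ships RANSAC only as a regressor, so it is omitted here. Everything below – the data, the feature dimensions, the network size – is placeholder, shown only to illustrate the train-and-compare setup:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Placeholder data: one 93-value feature vector per song, seven genre labels
rng = np.random.default_rng(0)
X = rng.normal(size=(490, 93))
y = rng.integers(0, 7, size=490)

for name, clf in [
    ("MLP", MLPClassifier(hidden_layer_sizes=(50,), max_iter=2000, random_state=0)),
    ("SVM", SVC(kernel="rbf")),
]:
    pipe = make_pipeline(StandardScaler(), clf)      # scale features, then classify
    scores = cross_val_score(pipe, X, y, cv=5)       # 5-fold cross-validation
    print(f"{name}: mean accuracy {scores.mean():.2f}")
```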

RANSAC outperformed the other two classifiers on both feature sets derived from a database of 490 songs spanning seven genres. The methodology also proved more accurate – or, in the researchers' words, "substantially better" – than approaches used in previous studies when tested on the same data.

The researchers believe that their genre identification system could be easily incorporated into existing music databases and recommendation services.

A paper describing the research was published in the International Journal of Computational Intelligence Studies. An earlier version of the same system was described in a paper presented at the International Conference on Advanced Computing, Networking, and Informatics in 2013.

Sources: Inderscience, Neotia Institute of Technology Management and Science

4 comments
Deadpan
The standardization of art is the beginning of the death of it.
felix
What was the accuracy on the 7 class problem? I achieved 78% when I tackled this task for my thesis in 2003.
Dave Lawrence
The stated opinion that playing the same piece of music to ten people would result in ten different genre classifications makes this classification system redundant by its absolute lack of the one thing music inspires in the listener - an emotional response.
Got to give the guys kudos for the work though, it's certainly inspiring stuff, but maybe better suited to the study of things that are less emotive.
JohnDelaneyIV
I experience what I call the "Bob Seger Effect" on Pandora, and it seems to have crept into iTunes Genius. If I favorite three to four Bob Seger songs I end up with a stream of '70s Southern Rock, which I don't actually like all that much.
While I see very big differences between Seger & Lynyrd Skynyrd, apparently the machine(s) don't. Creedence Clearwater, okay. Some ZZ Top, sure. I'd maybe throw some AC/DC in with a mix like that.
The other trick here is judging the listener's current mood. What I want to hear changes all the time.
If this is an improvement, awesome!