Research Description

Here is a synopsis of our research.

Research Problem

The overwhelming volume of music available online creates a need for tools that can efficiently access and organize these songs. However, a gap remains between highly mathematical, impersonal recommendations and recommendations that embody a user's actual perceptions. We want to create an efficient recommendation program that accurately represents a user's musical preferences.


Current music recommendation systems include Pandora and Last.fm. Pandora uses a team of music analysts to tag songs, which introduces the possibility of human error. Last.fm uses user-generated tags to determine song similarity; these tags are often ambiguous and ignore the quantifiable content of a song's audio data.

People do not perceive music as a set of numerical similarities; therefore, we must consider abstract features that model how music is perceived.


Research Question & Hypothesis

We have crafted our research question around the areas where we believe the field of music recommendation software most needs to advance:

How can music perception be incorporated into an automated recommendation system that focuses on an individual’s music preferences and listening trends?

Our question takes into account how listeners perceive music and combines that with quantifiable, automated characteristics from audio files.
In response to this question, our hypothesis is:

If we can computationally analyze qualities and trends within a specific listener’s song list, then we can generate a model of the user’s perception of music that will account for abstract factors like mood.

The Beginning

Originally, our methodology focused on developing MATLAB code to extract features from a song, as described in our Thesis Proposal. Features are mathematically defined components of the music, such as tempo or key signature. In audio feature extraction, one starts with a song file, obtains the corresponding audio signal, and then extracts data points, or features, from this waveform. We planned to use music theory to determine which higher-order features to extract from this data, and then compare the features of songs to each other to estimate their degree of musical similarity for recommendation.
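As a hedged illustration of this pipeline (not our actual MATLAB code), the sketch below estimates one such feature, tempo, from a raw waveform by autocorrelating its energy envelope. The sampling rate, frame size, BPM search range, and synthetic click-track input are all assumptions made for the example.

```python
import numpy as np

def estimate_tempo(signal, sr, min_bpm=60, max_bpm=180):
    """Estimate tempo (BPM) by autocorrelating the signal's energy envelope."""
    # Energy envelope: mean squared amplitude over short (~10 ms) frames
    frame = int(0.01 * sr)
    n_frames = len(signal) // frame
    env = np.array([np.mean(signal[i * frame:(i + 1) * frame] ** 2)
                    for i in range(n_frames)])
    # Onset strength: half-wave-rectified first difference of the envelope
    onset = np.maximum(np.diff(env), 0.0)
    onset -= onset.mean()
    # Autocorrelation of the onset strength (non-negative lags only)
    ac = np.correlate(onset, onset, mode="full")[len(onset) - 1:]
    # Search only lags that correspond to the plausible BPM range
    env_sr = sr / frame                      # envelope frames per second
    min_lag = int(env_sr * 60.0 / max_bpm)
    max_lag = int(env_sr * 60.0 / min_bpm)
    best_lag = min_lag + np.argmax(ac[min_lag:max_lag])
    return 60.0 * env_sr / best_lag

# Synthetic input: a click track at 120 BPM (one click every half second)
sr = 8000
t = np.arange(sr * 5)
clicks = ((t % (sr // 2)) < 100).astype(float)  # 100-sample click per beat
print(round(estimate_tempo(clicks, sr)))        # expect roughly 120
```

Real songs would of course need a decoder to obtain the waveform, and robust tempo tracking is considerably more involved; the point is only to show how a waveform becomes a single numeric feature.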

However, we recently realized that this method, being so mathematically oriented, is rather impersonal from a user's perspective: it fails to account for the individuality of the user. We have therefore been refocusing our approach so that the program is more attentive to each user's individual musical preferences.



This change in approach led us to search for APIs that generate musical features, which we can then apply to music recommendation in a way that models a user's musical cognition. Using an API lets us concentrate on the user-specificity of the program, especially on the perception of music. We found The Echo Nest, an API that returns numerical values corresponding to an input song's features. We will develop a cognitive music model to represent listeners' musical perception and then choose the appropriate features from The Echo Nest's API for our program.
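To illustrate how API-generated features could drive recommendation, here is a minimal sketch that ranks songs by cosine similarity over a small feature vector. The feature names mirror fields in The Echo Nest's audio summary (tempo, energy, danceability, valence), but the values, the normalization, and the choice of cosine similarity are illustrative assumptions, not our final design.

```python
import math

# Hypothetical per-song feature vectors in the style of The Echo Nest's
# audio summary. Tempo is normalized to [0, 1] here; values are invented.
songs = {
    "song_a": {"tempo": 0.60, "energy": 0.80, "danceability": 0.70, "valence": 0.65},
    "song_b": {"tempo": 0.58, "energy": 0.75, "danceability": 0.72, "valence": 0.60},
    "song_c": {"tempo": 0.20, "energy": 0.15, "danceability": 0.30, "valence": 0.25},
}

def cosine_similarity(a, b):
    """Cosine similarity between two feature dicts sharing the same keys."""
    keys = sorted(a)
    dot = sum(a[k] * b[k] for k in keys)
    na = math.sqrt(sum(a[k] ** 2 for k in keys))
    nb = math.sqrt(sum(b[k] ** 2 for k in keys))
    return dot / (na * nb)

def recommend(seed, catalog):
    """Rank every other song in the catalog by similarity to the seed."""
    return sorted(
        (name for name in catalog if name != seed),
        key=lambda name: cosine_similarity(catalog[seed], catalog[name]),
        reverse=True,
    )

print(recommend("song_a", songs))  # song_b should rank above song_c
```

A user-specific model would go further, e.g. by weighting each feature according to how strongly it predicts that listener's preferences rather than treating all features equally.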

Project Outline

The overall outline of our project is as follows:
          (1) Compilation of research on current music recommendation systems, feature generation, The Echo Nest’s API, and statistical analysis methods. Although this is listed only as a first step, this is an ongoing action.
          (2) Experimentation with feature generation and contact of experts based on our testing and research proposal.
          (3) Use of Focus Group sessions to determine satisfaction levels of current music recommendation systems and elements of songs on which people focus.
          (4) Testing of The Echo Nest API for use as a feature generation engine.
          (5a) Creation and release of an online survey to compare human perception features of songs (from the Five-Factor Model Experiment) to features from The Echo Nest API.
          (5b) Comparison of the Five-Factor Model and The Echo Nest’s API features to find correlations using statistical methods such as the ANOVA test.
          (6) Prediction of the Five-Factor Model from the API correlations made in (5b).
          (7) Generation and release of final online survey to observe if similarities between the Five-Factor Model and The Echo Nest’s API led to valid recommendations.
          (8) Analysis of survey in (7).
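As a rough illustration of the statistical comparison in steps (5b) and (6), the sketch below correlates survey ratings with an API feature using a Pearson coefficient and fits a simple linear predictor. All numbers here are fabricated for the example, and our actual analysis may instead use methods such as ANOVA.

```python
import numpy as np

# Invented example data for 8 songs: a human-rated "mood" score from a
# survey (1-7 scale) and an API-style "valence" feature on [0, 1]. Real
# data would come from steps (5a) and (4).
survey_mood = np.array([2.0, 6.5, 4.0, 1.5, 5.5, 3.0, 6.0, 2.5])
api_valence = np.array([0.15, 0.90, 0.55, 0.10, 0.70, 0.35, 0.85, 0.20])

# Pearson correlation coefficient between the two sets of ratings
r = np.corrcoef(survey_mood, api_valence)[0, 1]
print(f"r = {r:.2f}")

# A strong positive r suggests the API feature tracks the perceived
# quality, so a linear fit could predict survey ratings from it (step 6).
slope, intercept = np.polyfit(api_valence, survey_mood, 1)
predicted = slope * 0.50 + intercept  # predicted mood for valence = 0.5
```

In the real study, a significance test would accompany the correlation before any feature is trusted as a proxy for a perceived quality.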

Documents and Presentations

Below are materials further describing our project.

Thesis Proposal

A report detailing the background of our project and the procedure we propose to follow.
Download

Thesis Proposal Video

A video containing our presentation of our proposal in front of a panel of experts and Gemstone staff.
Watch

Presentation Slides

Slides illustrating an overview of our project with notes which can be used as a quick reference guide. 
Download

Budget & Glossary

A document including the most recent estimate of our budget and a glossary of important terms.
Download