Search Results

Now showing 1 - 7 of 7
  • How Does the Spotify API Compare to the Music Emotion Recognition State-of-the-Art?
    Publication . Panda, Renato; Redinho, Hugo; Gonçalves, Carolina; Malheiro, Ricardo; Paiva, Rui Pedro
    Features are arguably the key factor in any machine learning problem. Over the decades, myriad audio features and, more recently, feature-learning approaches have been tested in Music Emotion Recognition (MER) with scarce improvements. Here, we shed some light on the suitability for MER of the audio features provided by the API of Spotify, the leading music streaming service. To this end, 12 Spotify API features were obtained for 704 songs of our 900-song dataset, annotated in terms of Russell's quadrants. These were compared to emotionally-relevant features obtained previously, using feature ranking and emotion classification experiments. We verified that the energy, valence and acousticness features from Spotify are highly relevant to MER. However, the 12-feature set is unable to match the performance of the state-of-the-art feature set (58.5% vs. 74.7% F1-measure). Combining the Spotify and state-of-the-art sets leads to small improvements with fewer features (top5: +2.3%, top10: +1.1%), while not improving the maximum results (100 features). From this we conclude that Spotify provides some higher-level, emotionally-relevant features. Such extractors are desirable, since they are closer to human concepts and allow interpretable rules to be extracted, which is harder with hundreds of abstract features. Still, additional emotionally-relevant features are needed to improve MER. (A hedged code sketch of this pipeline follows the results list.)
  • A Comparison Study of Deep Learning Methodologies for Music Emotion Recognition
    Publication . Louro, Pedro; Redinho, Hugo; Malheiro, Ricardo; Paiva, Rui Pedro; Panda, Renato
    Classical machine learning techniques have dominated Music Emotion Recognition (MER). However, improvements have slowed down due to the complex and time-consuming task of handcrafting new emotionally-relevant audio features. Deep learning methods have recently gained popularity in the field because of their ability to automatically learn relevant features from spectral representations of songs, eliminating that need. Nonetheless, there are limitations, such as the need for large amounts of quality labeled data, a common problem in MER research. To understand the effectiveness of these techniques, a comparison study using various classical machine learning and deep learning methods was conducted. The results showed that an ensemble of a Dense Neural Network and a Convolutional Neural Network achieved a state-of-the-art 80.20% F1-score, an improvement of around 5% over the best baseline results. This suggests that future research should take advantage of both paradigms, that is, combine handcrafted features with feature learning. (See the ensemble sketch after this list.)
  • Improving Deep Learning Methodologies for Music Emotion Recognition
    Publication . Louro, Pedro Lima; Redinho, Hugo; Malheiro, Ricardo; Paiva, Rui Pedro; Panda, Renato
    Music Emotion Recognition (MER) has traditionally relied on classical machine learning techniques. Progress with these techniques has plateaued due to the demanding process of crafting new, emotionally-relevant audio features. Recently, deep learning (DL) methods have surged in popularity within MER due to their ability to learn features automatically from the input data. Nonetheless, these methods need large, high-quality labeled datasets, a well-known hurdle in MER studies. We present a comparative study of various classical and DL techniques carried out to evaluate these approaches. Unless stated otherwise, the methodologies presented were developed by our team. It was found that a combination of Dense Neural Networks (DNN) and Convolutional Neural Networks (CNN) achieved an 80.20% F1-score, marking an improvement of approximately 5% over the best previous results. This indicates that future research should blend manual feature engineering and automated feature learning to enhance results.
  • Audio Features for Music Emotion Recognition: a Survey
    Publication . Panda, Renato; Malheiro, Ricardo; Paiva, Rui Pedro
    The design of meaningful audio features is a key need to advance the state-of-the-art in Music Emotion Recognition (MER). This work presents a survey of the existing emotionally-relevant computational audio features, supported by the music psychology literature on the relations between eight musical dimensions (melody, harmony, rhythm, dynamics, tone color, expressivity, texture and form) and specific emotions. Based on this review, current gaps and needs are identified and strategies for future research on feature engineering for MER are proposed, namely ideas for computational audio features capturing elements of musical form, texture and expressivity, which should be further researched. Finally, although the focus of this article is on classical feature engineering methodologies (based on handcrafted features), perspectives on deep learning-based approaches are discussed. (A librosa-based illustration of such dimension-to-feature mappings follows the results list.)
  • MERGE App: A Prototype Software for Multi-User Emotion-Aware Music Management
    Publication . Louro, Pedro; Branco, Guilherme; Redinho, Hugo; Santos, Ricardo Correia Nascimento Dos; Malheiro, Ricardo; Panda, Renato; Paiva, Rui Pedro
    We present a prototype software for multi-user music library management based on the perceived emotional content of songs. The tool offers music playback, song filtering by metadata, and automatic emotion prediction based on arousal and valence, with the possibility of personalizing the predictions by letting each user edit these values according to their own emotional assessment. This is an important feature for handling both classification errors and subjectivity, inherent aspects of emotion perception. A path-based playlist generation function is also implemented (see the hedged sketch after this list). A multi-modal audio-lyrics regression methodology is proposed for emotion prediction, with accompanying validation experiments on the MERGE dataset. The results are promising, showing higher overall performance on train-validation-test splits (73.20% F1-score with the best dataset/split combination).
  • Exploring Song Segmentation for Music Emotion Variation Detection
    Publication . Ferreira, Tomas; Redinho, Hugo; Louro, Pedro L.; Malheiro, Ricardo; Paiva, Rui Pedro; Panda, Renato
    This paper evaluates the impact of song segmentation on Music Emotion Variation Detection (MEVD). In particular, the All-In-One song-structure segmentation system was employed to this end and compared to a fixed 1.5-second window approach. Acoustic features were extracted for each obtained segment/window and classified with SVMs. The attained results (best F1-score of 55.9%) suggest that, despite its promise, the potential of this song segmentation approach was not fully exploited, possibly due to the small dataset employed. Nevertheless, the preliminary results are encouraging. (A sketch of the fixed-window baseline follows the results list.)
  • Exploring Deep Learning Methodologies for Music Emotion Recognition
    Publication . Louro, Pedro; Redinho, Hugo; Malheiro, Ricardo; Paiva, Rui Pedro; Panda, Renato
    Classical machine learning techniques have dominated Music Emotion Recognition (MER). However, improvements have slowed down due to the complex and time-consuming task of handcrafting new emotionally-relevant audio features. Deep learning methods have recently gained popularity in the field because of their ability to automatically learn relevant features from spectral representations of songs, eliminating that need. Nonetheless, there are limitations, such as the need for large amounts of quality labeled data, a common problem in MER research. To understand the effectiveness of these techniques, a comparison study using various classical machine learning and deep learning methods was conducted. The results showed that an ensemble of a Dense Neural Network and a Convolutional Neural Network achieved a state-of-the-art 80.20% F1-score, an improvement of around 5% over the best baseline results. This suggests that future research should take advantage of both paradigms, that is, combine handcrafted features with feature learning.
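
Illustrative code sketches

For the Spotify API study above, the following is a minimal sketch of the kind of pipeline described: fetching track-level audio features from the Spotify Web API and classifying Russell's quadrants from them. The exact 12-feature list, the classifier, and the evaluation split are illustrative assumptions, not the paper's setup; a valid OAuth token is assumed (note also that Spotify has since restricted access to this endpoint for new applications).

```python
# Sketch: Spotify audio features -> Russell's quadrant classification.
# The 12-feature list, classifier, and split are illustrative assumptions.
import requests
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

FEATURES = ["energy", "valence", "acousticness", "danceability", "tempo",
            "loudness", "speechiness", "instrumentalness", "liveness",
            "mode", "key", "time_signature"]  # assumed 12-feature set

def spotify_features(track_ids, token):
    """Fetch audio features for up to 100 track IDs in one request."""
    resp = requests.get(
        "https://api.spotify.com/v1/audio-features",
        params={"ids": ",".join(track_ids)},
        headers={"Authorization": f"Bearer {token}"},
    )
    resp.raise_for_status()
    # Entries can be None for unknown IDs; filtered out for simplicity.
    return [[t[f] for f in FEATURES]
            for t in resp.json()["audio_features"] if t is not None]

def quadrant_f1(X, y):
    """Train/test once and report the macro F1 over the four quadrants."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
    clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
    return f1_score(y_te, clf.predict(X_te), average="macro")
```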
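
The two deep learning studies listed ("A Comparison Study..." and "Exploring...") both report that an ensemble of a Dense Neural Network and a Convolutional Neural Network reached 80.20% F1-score. A hedged sketch of that late-fusion idea in Keras follows; layer sizes, input shapes, and the simple probability averaging are illustrative assumptions, not the papers' exact architectures.

```python
# Sketch of a DNN + CNN late-fusion ensemble: one branch on handcrafted
# feature vectors, one on mel-spectrogram patches, averaged probabilities.
# All layer sizes and input shapes are illustrative assumptions.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

def build_dnn(n_features, n_classes=4):
    return keras.Sequential([
        layers.Input((n_features,)),
        layers.Dense(256, activation="relu"),
        layers.Dropout(0.3),
        layers.Dense(n_classes, activation="softmax"),
    ])

def build_cnn(spec_shape=(128, 128, 1), n_classes=4):
    return keras.Sequential([
        layers.Input(spec_shape),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.GlobalAveragePooling2D(),
        layers.Dense(n_classes, activation="softmax"),
    ])

def ensemble_predict(dnn, cnn, feats, specs):
    """Late fusion: average per-class probabilities of both trained models."""
    # (model.compile / model.fit training steps omitted for brevity)
    probs = (dnn.predict(feats) + cnn.predict(specs)) / 2.0
    return np.argmax(probs, axis=1)
```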
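
As a companion to the survey, a small illustration of how some of its eight musical dimensions map to standard computational descriptors, here computed with librosa. The mapping is a deliberate simplification: as the survey argues, dimensions such as form, texture and expressivity still lack good extractors and are therefore absent below.

```python
# Illustrative mapping of musical dimensions to computable descriptors.
# A simplification of the survey, not an exhaustive or authoritative set.
import librosa
import numpy as np

def dimension_features(path):
    y, sr = librosa.load(path, sr=22050, mono=True)
    tempo, _ = librosa.beat.beat_track(y=y, sr=sr)             # rhythm
    rms = librosa.feature.rms(y=y)                             # dynamics
    centroid = librosa.feature.spectral_centroid(y=y, sr=sr)   # tone color
    chroma = librosa.feature.chroma_stft(y=y, sr=sr)           # harmony
    f0 = librosa.yin(y, fmin=65.0, fmax=2093.0, sr=sr)         # melody (F0)
    return {
        "tempo_bpm": float(tempo),
        "rms_mean": float(np.mean(rms)),
        "centroid_mean_hz": float(np.mean(centroid)),
        "chroma_mean": chroma.mean(axis=1),        # 12 pitch-class averages
        "f0_std_hz": float(np.std(f0)),
    }
```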
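
The MERGE App abstract mentions a path-based playlist generation function without detailing it. One plausible reading, sketched under that assumption (the actual MERGE algorithm may differ): interpolate waypoints between a start and an end point in the arousal-valence plane and greedily pick the nearest unused song for each waypoint.

```python
# Hedged sketch of path-based playlist generation in the arousal-valence
# plane; the actual MERGE App algorithm is not described in the abstract.
import numpy as np

def av_path_playlist(song_av, start, end, length):
    """song_av: (n_songs, 2) array of [arousal, valence] per song."""
    waypoints = np.linspace(np.asarray(start, float),
                            np.asarray(end, float), length)
    used, playlist = set(), []
    for wp in waypoints:
        dists = np.linalg.norm(song_av - wp, axis=1)
        dists[list(used)] = np.inf   # never repeat a song
        idx = int(np.argmin(dists))
        used.add(idx)
        playlist.append(idx)
    return playlist

# Example: 100 songs with AV values in [-1, 1], a calm-to-energetic path.
rng = np.random.default_rng(0)
songs = rng.uniform(-1, 1, size=(100, 2))
print(av_path_playlist(songs, start=(-0.8, 0.2), end=(0.9, 0.8), length=8))
```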
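
Finally, for the MEVD paper, a sketch of its fixed 1.5-second window baseline: slice the audio into consecutive windows, summarize each window with simple statistics, and classify every window with an SVM. The abstract does not specify the acoustic features used, so the MFCC statistics and SVM settings below are illustrative assumptions.

```python
# Sketch of the fixed-window MEVD baseline: per-window features + SVM.
# MFCC statistics and SVM settings are illustrative assumptions.
import librosa
import numpy as np
from sklearn.svm import SVC

def window_features(path, win_sec=1.5, sr=22050):
    """One feature row per consecutive 1.5-second window of the song."""
    y, sr = librosa.load(path, sr=sr, mono=True)
    hop = int(win_sec * sr)
    rows = []
    for start in range(0, max(len(y) - hop, 1), hop):
        mfcc = librosa.feature.mfcc(y=y[start:start + hop], sr=sr, n_mfcc=20)
        rows.append(np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)]))
    return np.vstack(rows)

# Per-window emotion labels (e.g., Russell's quadrants) train the classifier:
# clf = SVC(kernel="rbf", C=1.0).fit(X_train_windows, y_train_windows)
# emotion_track = clf.predict(window_features("song.wav"))  # hypothetical file
```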