Downbeat Detection with Conditional Random Fields and Deep Learned Features

Simon Durand #1, Slim Essid #1

#1	Laboratoire Traitement et Communication de l'Information [Paris] (LTCI) Télécom ParisTech CNRS : UMR5141

References

International Society for Music Information Retrieval (ISMIR), New York City, USA, August 2016, pp. 386-392

Abstract

In this paper, we introduce a novel Conditional Random Field (CRF) system that detects the downbeat sequence of musical audio signals. Feature functions are computed from four deep learned representations based on harmony, rhythm, melody and bass content to take advantage of the high-level and multi-faceted aspect of this task. Downbeats being dynamic, the powerful CRF classification system allows us to combine our features with an adapted temporal model in a fully data-driven fashion. Some meters being under-represented in our training set, we show that data augmentation enables a statistically significant improvement of the results by taking into account class imbalance. An evaluation of different configurations of our system on nine datasets shows its efficiency and potential over a heuristic-based approach and four downbeat tracking algorithms.
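
As a rough illustration of how a chain-structured model can combine per-beat evidence with a temporal model, the sketch below runs Viterbi decoding over bar positions. It is not the system described in the abstract: the state space (position within a 4-beat bar), the hand-set transition scores, and the toy downbeat probabilities are all assumptions made for the example, whereas a CRF would learn such scores from data.

```python
import math

def viterbi_decode(unary, transition):
    """Return the highest-scoring state sequence for a linear-chain model.

    unary[t][s]      : log-score of state s at beat t
    transition[p][s] : log-score of moving from state p to state s
    """
    n_states = len(unary[0])
    score = list(unary[0])  # any state may start the sequence
    backpointers = []
    for t in range(1, len(unary)):
        new_score, pointers = [], []
        for s in range(n_states):
            # Best predecessor of state s at beat t.
            best = max(range(n_states), key=lambda p: score[p] + transition[p][s])
            new_score.append(score[best] + transition[best][s] + unary[t][s])
            pointers.append(best)
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

# Assumption: 4 beats per bar; state 0 is the downbeat, states 1..3 the other beats.
BEATS_PER_BAR = 4
FORBIDDEN = -1e9  # stands in for log(0)
transition = [
    [0.0 if s == (p + 1) % BEATS_PER_BAR else FORBIDDEN for s in range(BEATS_PER_BAR)]
    for p in range(BEATS_PER_BAR)
]

# Hypothetical per-beat downbeat probabilities (made-up values, not real data).
downbeat_probs = [0.2, 0.9, 0.1, 0.3, 0.6, 0.8, 0.2, 0.1]
unary = [
    [math.log(p)] + [math.log(1.0 - p)] * (BEATS_PER_BAR - 1)
    for p in downbeat_probs
]

path = viterbi_decode(unary, transition)
downbeats = [t for t, s in enumerate(path) if s == 0]
print(path)       # [3, 0, 1, 2, 3, 0, 1, 2]
print(downbeats)  # [1, 5]
```

Note that beat 4 has a fairly high downbeat probability on its own (0.6), yet the cyclic temporal model overrides it in favour of the globally consistent phase. This is the kind of benefit a sequence model offers over frame-wise thresholding.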

Category

Paper in proceedings

Research Area(s)

Statistics/Applications
Computer Science/Machine Learning
Computer Science/Neural and Evolutionary Computing
Computer Science/Sound
Computer Science/Signal and Image Processing
Engineering Sciences/Signal and Image processing
Statistics/Machine Learning

Identifier(s)

Bibliographic key SD:ISMIR-16

Last update

on January 19, 2017 by Simon Durand