
Feature Adapted Convolutional Neural Networks for Downbeat Tracking

Simon Durand #1, Juan P. Bello #2, Bertrand David #1, Gaël Richard #1
#1 Laboratoire Traitement et Communication de l'Information [Paris] (LTCI)
  • Télécom ParisTech
  • CNRS : UMR5141
#2 Music and Audio Research Lab [New York] (MARL)
  • New York University [New York]
References
ICASSP 2016, Shanghai, China, September 2016
Abstract

We present a novel system for automatically estimating downbeat positions from music audio signals. New rhythmic and melodic features are introduced, and feature-adapted convolutional neural networks are used to exploit their specific properties. In particular, invariance to melody transposition, chroma data augmentation and length-specific rhythmic patterns prove useful for learning downbeat likelihood. After the data is segmented into tatums, complementary features related to melody, rhythm and harmony are extracted, and the likelihood of a tatum being at a downbeat position is computed with these networks. The downbeat sequence is then extracted with a flexible temporal hidden Markov model. We show the efficiency and robustness of our approach with a comparative evaluation conducted on nine datasets.
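
The final decoding step described in the abstract can be illustrated with a minimal sketch. It is not the authors' model: it assumes a single fixed bar length in tatums, a simple cyclic bar-position state space, and an arbitrary probability of regular progression, whereas the abstract describes a flexible temporal hidden Markov model. The function name `decode_downbeats` and the parameters `tatums_per_bar` and `p_regular` are hypothetical, and the input is assumed to be one network-estimated downbeat likelihood per tatum.

```python
import numpy as np


def decode_downbeats(likelihood, tatums_per_bar=8, p_regular=0.95):
    """Viterbi decoding of downbeat tatum indices (illustrative sketch only).

    likelihood     : per-tatum probability of being a downbeat (assumed input)
    tatums_per_bar : fixed bar length in tatums (assumption, not from the paper)
    p_regular      : probability of advancing one position in the bar (assumption)
    """
    p = np.clip(np.asarray(likelihood, dtype=float), 1e-6, 1 - 1e-6)
    n, n_states = len(p), tatums_per_bar

    # Transition matrix: mostly move to the next bar position (cyclically),
    # with a small uniform probability of jumping anywhere.
    trans = np.full((n_states, n_states), (1 - p_regular) / n_states)
    for s in range(n_states):
        trans[s, (s + 1) % n_states] += p_regular
    log_trans = np.log(trans)

    # Emissions: state 0 is "downbeat", every other state is "not a downbeat".
    log_emit = np.tile(np.log(1 - p)[:, None], (1, n_states))
    log_emit[:, 0] = np.log(p)

    # Forward pass in the log domain with a uniform prior over bar positions.
    delta = -np.log(n_states) + log_emit[0]
    backptr = np.zeros((n, n_states), dtype=int)
    for t in range(1, n):
        scores = delta[:, None] + log_trans  # scores[i, j]: from state i to j
        backptr[t] = np.argmax(scores, axis=0)
        delta = scores[backptr[t], np.arange(n_states)] + log_emit[t]

    # Backtrack the most likely state sequence.
    states = np.empty(n, dtype=int)
    states[-1] = int(np.argmax(delta))
    for t in range(n - 1, 0, -1):
        states[t - 1] = backptr[t, states[t]]

    return np.flatnonzero(states == 0)
```

Given a likelihood curve with one weak peak, the regular bar structure in this sketch lets the decoder keep the downbeat that the likelihood alone would leave ambiguous, which is the role a temporal model plays after the networks.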

Category
Paper in proceedings
Research Area(s)
Computer Science/Signal and Image Processing
Statistics/Machine Learning
Computer Science/Computers and Society
Computer Science/Neural and Evolutionary Computing
Computer Science/Sound
Engineering Sciences/Signal and Image processing
Identifier(s)
Bibliographic key DBDR:ICASSP-16
Last update
on February 2, 2018 by Gaël Richard

