
Feature Adapted Convolutional Neural Networks for Downbeat Tracking

Simon Durand #1, Juan P. Bello #2, Bertrand David #1, Gaël Richard #1
#1 Laboratoire Traitement et Communication de l'Information [Paris] (LTCI)
  • Télécom ParisTech
  • CNRS : UMR5141
#2 Music and Audio Research Lab [New York] (MARL)
  • New York University [New York]
ICASSP 2016, Shanghai, China, September 2016

We define a novel system for the automatic estimation of downbeat positions from music audio signals. New rhythmic and melodic features are introduced, and feature-adapted convolutional neural networks are used to take advantage of their specific properties. In particular, invariance to melody transposition, chroma data augmentation and length-specific rhythmic patterns prove useful for learning the downbeat likelihood. After the data is segmented into tatums, complementary features related to melody, rhythm and harmony are extracted, and the likelihood of a tatum being at a downbeat position is computed with the aforementioned neural networks. The downbeat sequence is then extracted with a flexible temporal hidden Markov model. We then demonstrate the efficiency and robustness of our approach with a comparative evaluation conducted on 9 datasets.
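
The last stage described above, turning per-tatum downbeat likelihoods into a downbeat sequence, can be illustrated with a minimal Viterbi sketch. This is not the model used in the paper: the fixed number of tatums per bar, the cyclic bar-position states, the uniform `jump_prob` that lets the path leave the cycle, and the function name `decode_downbeats` are all assumptions made for illustration only.

```python
import math


def decode_downbeats(downbeat_prob, tatums_per_bar=4, jump_prob=0.01):
    """Return indices of tatums decoded as downbeats (illustrative sketch).

    downbeat_prob  : per-tatum downbeat likelihoods in [0, 1], e.g. network outputs
    tatums_per_bar : assumed fixed bar length in tatums (hypothetical simplification)
    jump_prob      : assumed probability mass, in (0, 1], spread uniformly over all
                     states so the path may occasionally leave the regular cycle
    """
    if not downbeat_prob:
        return []
    k = tatums_per_bar
    eps = 1e-9

    def log_emit(p, state):
        # State 0 is the downbeat; every other bar position is "not a downbeat".
        p = min(max(p, eps), 1.0 - eps)
        return math.log(p if state == 0 else 1.0 - p)

    # Transition i -> (i + 1) % k is strongly favoured; any other move is a rare jump.
    log_trans = [
        [
            math.log(jump_prob / k + ((1.0 - jump_prob) if j == (i + 1) % k else 0.0))
            for j in range(k)
        ]
        for i in range(k)
    ]

    # Uniform prior over the bar position of the first tatum.
    score = [math.log(1.0 / k) + log_emit(downbeat_prob[0], s) for s in range(k)]
    backpointers = []
    for p in downbeat_prob[1:]:
        new_score, pointers = [], []
        for j in range(k):
            best_i = max(range(k), key=lambda i: score[i] + log_trans[i][j])
            new_score.append(score[best_i] + log_trans[best_i][j] + log_emit(p, j))
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)

    # Backtrack from the best final state to recover the bar-position path.
    state = max(range(k), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [t for t, s in enumerate(path) if s == 0]


if __name__ == "__main__":
    likelihoods = [0.9, 0.1, 0.1, 0.1, 0.8, 0.2, 0.1, 0.1, 0.9, 0.1]
    print(decode_downbeats(likelihoods))
```

Because the cyclic transition dominates, a single weak likelihood at an expected downbeat does not break the decoded bar structure; a more flexible model would add states for several bar lengths and tempo changes.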

Paper in proceedings
Research Area(s)
Computer Science/Signal and Image Processing
Statistics/Machine Learning
Computer Science/Computers and Society
Computer Science/Neural and Evolutionary Computing
Computer Science/Sound
Engineering Sciences/Signal and Image Processing
Bibliographic key DBDR:ICASSP-16
Last update
on February 2, 2018 by Gaël Richard
