
Multichannel audio source separation: variational inference of time-frequency sources from time-domain observations

Simon Leglaive #1, Roland Badeau #1, Gaël Richard #1
#1 Laboratoire traitement et communication de l'information (LTCI)
  • Télécom ParisTech
  • Institut Mines-Télécom
  • Université Paris-Saclay
References
42nd International Conference on Acoustics, Speech and Signal Processing (ICASSP), New Orleans, USA, IEEE, March 2017
Abstract

Many methods for multichannel audio source separation are based on probabilistic approaches in which the sources are modeled as latent random variables in a time-frequency (TF) domain. For reverberant mixtures, most of these methods approximate the time-domain convolutive mixing process in the TF domain, assuming short mixing filters. The latent TF sources are then inferred from the TF mixture observations. In this paper, we propose to infer the latent TF sources from the time-domain observations instead. This approach allows us to model the convolutive mixing process exactly. The inference procedure relies on a variational expectation-maximization algorithm. In significant reverberation conditions, we show that our approach leads to a signal-to-distortion ratio improvement of 5.5 dB.
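
The time-domain convolutive mixing process mentioned in the abstract can be illustrated with a minimal sketch: each microphone signal is the sum of the sources, each convolved with its own mixing filter. This is not the paper's model or inference algorithm. The function names, the toy signals, and the filter values below are all hypothetical and chosen only for illustration.

```python
def convolve(signal, filt):
    """Full linear convolution of two sequences (pure Python)."""
    out = [0.0] * (len(signal) + len(filt) - 1)
    for n, s in enumerate(signal):
        for k, h in enumerate(filt):
            out[n + k] += s * h
    return out


def convolutive_mix(sources, filters):
    """Mix J sources into I channels: x_i = sum_j (a_ij * s_j).

    sources: list of J equal-length sequences.
    filters: filters[i][j] is the (hypothetical) impulse response from
             source j to microphone i; all filters share one length.
    """
    n_out = len(sources[0]) + len(filters[0][0]) - 1
    mixture = []
    for channel_filters in filters:
        x = [0.0] * n_out
        # Accumulate the contribution of every source to this channel.
        for s, a in zip(sources, channel_filters):
            for t, v in enumerate(convolve(s, a)):
                x[t] += v
        mixture.append(x)
    return mixture


# Toy example: 2 sources, 2 microphones, 3-tap filters (made-up values).
sources = [[1.0, 0.0, 0.0, 0.0], [0.0, 2.0, 0.0, 0.0]]
filters = [
    [[1.0, 0.5, 0.25], [0.5, 0.0, 0.0]],
    [[0.0, 1.0, 0.0], [1.0, 0.0, -0.5]],
]
mixture = convolutive_mix(sources, filters)
```

In realistic reverberant conditions the mixing filters are much longer than a typical TF analysis window, which is the situation in which the short-filter TF approximation described in the abstract becomes inaccurate.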

Keywords
Multichannel audio source separation, time-domain convolutive model, time-frequency source model, variational EM algorithm
Category
Paper in proceedings
Research Area(s)
Engineering Sciences/Signal and Image processing
Identifier(s)
HAL ref. hal-01416347
Bibliographic key SL:ICASSP-17
Last update
on March 20, 2017 by Roland Badeau