Audio-visual object localization and separation using low-rank and sparsity

Conference paper


Pu, J., Panagakis, Y., Petridis, S. and Pantic, M. 2017. Audio-visual object localization and separation using low-rank and sparsity. 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). New Orleans, LA, USA, 05-09 Mar 2017. Institute of Electrical and Electronics Engineers (IEEE). pp. 2901-2905. https://doi.org/10.1109/ICASSP.2017.7952687
Type: Conference paper
Title: Audio-visual object localization and separation using low-rank and sparsity
Authors: Pu, J., Panagakis, Y., Petridis, S. and Pantic, M.
Abstract

The ability to localize visual objects that are associated with an audio source and, at the same time, separate the audio signal is a cornerstone of several audio-visual signal processing applications. Past efforts usually focused on localizing only the visual objects, without audio separation abilities. Moreover, they often rely on computationally expensive pre-processing steps to segment image pixels into object regions before applying localization approaches. We aim to address the problem of audio-visual source localization and separation in an unsupervised manner. The proposed approach employs low-rank modelling to capture the background visual and audio information, and sparsity to extract the sparsely correlated components between the audio and visual modalities. In particular, the model decomposes each dataset into a sum of two terms: a low-rank matrix capturing the uncorrelated background information, and a sparse correlated component modelling the sound source in the visual modality and the associated sound in the audio modality. To this end, a novel optimization problem involving the minimization of nuclear norms and matrix ℓ1-norms is solved. We evaluated the proposed method in 1) visual localization and audio separation and 2) visual-assisted audio denoising. The experimental results demonstrate the effectiveness of the proposed method.
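
The abstract does not give the paper's objective function, so the following is only a minimal sketch of the general low-rank plus sparse idea it builds on. It assumes a generic single-matrix decomposition M = L + S (often called robust PCA or principal component pursuit), solved with a standard ADMM scheme that alternates singular value thresholding (for the nuclear norm) and entrywise soft thresholding (for the ℓ1-norm). It is not the authors' joint audio-visual model; the function names and the default values of `lam` and `mu` are illustrative assumptions.

```python
import numpy as np


def soft_threshold(X, tau):
    """Entrywise shrinkage: the proximal operator of tau * l1-norm."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)


def svd_threshold(X, tau):
    """Shrink singular values: the proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * soft_threshold(s, tau)) @ Vt


def low_rank_plus_sparse(M, lam=None, mu=None, tol=1e-7, max_iter=1000):
    """Split M into L (low-rank) + S (sparse) by minimizing
    ||L||_* + lam * ||S||_1 subject to L + S = M.

    Generic robust PCA sketch (hypothetical helper), not the paper's model.
    """
    M = np.asarray(M, dtype=float)
    m, n = M.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))  # common default weight (assumption)
    if mu is None:
        mu = 0.25 * m * n / (np.abs(M).sum() + 1e-12)  # common penalty heuristic
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    Y = np.zeros_like(M)  # Lagrange multiplier for the constraint L + S = M
    norm_M = np.linalg.norm(M)
    for _ in range(max_iter):
        L = svd_threshold(M - S + Y / mu, 1.0 / mu)
        S = soft_threshold(M - L + Y / mu, lam / mu)
        R = M - L - S  # constraint residual
        Y = Y + mu * R
        if np.linalg.norm(R) <= tol * norm_M:
            break
    return L, S
```

In a setting like the one described above, one might stack vectorized video frames (or spectrogram frames) as the columns of `M`, so that `L` would absorb slowly varying background and `S` would hold the localized, sparse activity. How the paper couples the audio and visual sparse terms is not stated in the abstract and is not modelled here.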

Conference: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Page range: 2901-2905
ISSN: 2379-190X
ISBN
Electronic: 9781509041176
Electronic: 9781509041169
Paperback: 9781509041183
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Publication dates
Print: 19 Jun 2017
Publication process dates
Deposited: 06 Mar 2018
Accepted: 12 Dec 2016
Output status: Published
Accepted author manuscript
Copyright Statement

© 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

Digital Object Identifier (DOI): https://doi.org/10.1109/ICASSP.2017.7952687
Language: English
Book title: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Permalink: https://repository.mdx.ac.uk/item/8786q
