Audio-assisted movie dialogue detection
Article
Kotti, M., Ververidis, D., Panagakis, Y., Kotropoulos, C., Maragos, P. and Pitas, I. 2008. Audio-assisted movie dialogue detection. IEEE Transactions on Circuits and Systems for Video Technology. 18 (11), pp. 1618-1627. https://doi.org/10.1109/TCSVT.2008.2005613
Type | Article |
---|---|
Title | Audio-assisted movie dialogue detection |
Authors | Kotti, M., Ververidis, D., Panagakis, Y., Kotropoulos, C., Maragos, P. and Pitas, I. |
Abstract | An audio-assisted system is investigated that detects whether a movie scene is a dialogue or not. The system is based on actor indicator functions, that is, functions that define whether an actor speaks at a given time instant. In particular, the cross-correlation and the magnitude of the corresponding cross-power spectral density of a pair of indicator functions are input to various classifiers, such as voted perceptrons, radial basis function networks, random trees, and support vector machines, for dialogue/non-dialogue detection. To boost classifier efficiency, AdaBoost is also exploited. The aforementioned classifiers are trained using ground-truth indicator functions determined by human annotators for 41 dialogue and another 20 non-dialogue audio instances. For testing, actual indicator functions are derived by applying audio activity detection and actor clustering to audio recordings. Twenty-three instances are randomly chosen among the aforementioned instances, 17 of which correspond to dialogue scenes and 6 to non-dialogue ones. Accuracy ranging between 0.739 and 0.826 is reported. |
Publisher | Institute of Electrical and Electronics Engineers (IEEE) |
Journal | IEEE Transactions on Circuits and Systems for Video Technology |
ISSN | 1051-8215 |
Electronic ISSN | 1558-2205 |
Publication dates | |
Online | 23 Sep 2008 |
01 Nov 2008 | |
Publication process dates | |
Deposited | 06 Mar 2018 |
Accepted | 11 Jul 2008 |
Output status | Published |
Accepted author manuscript | |
Copyright Statement | © 2008 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. |
Digital Object Identifier (DOI) | https://doi.org/10.1109/TCSVT.2008.2005613 |
Language | English |
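
The two features named in the abstract, the cross-correlation of a pair of actor indicator functions and the magnitude of the corresponding cross-power spectral density, can be illustrated with a small sketch. Everything below is an assumption made for illustration: the abstract does not state the sampling rate, any normalization, or how the features are passed to the classifiers, and the names `cross_correlation`, `cross_power_magnitude`, `actor_a`, and `actor_b` are hypothetical. The spectral density is approximated here as the DFT of the cross-correlation sequence.

```python
import numpy as np


def cross_correlation(a, b):
    """Full linear cross-correlation of two equal-length indicator sequences.

    The zero-lag term sits at index len(a) - 1 of the returned array.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.correlate(a, b, mode="full")


def cross_power_magnitude(a, b):
    """Magnitude of the DFT of the cross-correlation sequence.

    Used here as a simple stand-in for the cross-power spectral density
    magnitude; the paper's exact estimator is not given in the abstract.
    """
    return np.abs(np.fft.fft(cross_correlation(a, b)))


# Hypothetical toy scene: two actors alternate in turns of 5 samples each.
# An indicator function is 1 while the actor speaks and 0 otherwise.
actor_a = np.tile([1] * 5 + [0] * 5, 3)
actor_b = 1 - actor_a

r = cross_correlation(actor_a, actor_b)
mag = cross_power_magnitude(actor_a, actor_b)

zero_lag = r[len(actor_a) - 1]  # 0 here: the two actors never speak at once
print("zero-lag correlation:", zero_lag)
print("largest correlation value:", r.max())
print("feature vector length:", len(r) + len(mag))
```

In this toy case the actors never overlap, so the zero-lag correlation vanishes while the correlation peaks at a lag of one turn length, which is the kind of turn-taking structure a dialogue classifier could exploit.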
https://repository.mdx.ac.uk/item/8783x