Analysis of machine learning methods for speech disfluency classification
Conference paper
Sharma, N., Gandhi, V. and Mahapatra, P. 2024. Analysis of machine learning methods for speech disfluency classification. Yang, X.S., Sherratt, S., Dey, N. and Joshi, A. (eds.) 9th International Congress on Information and Communication Technology. London, UK, 19-22 Feb 2024. Singapore: Springer. pp. 13-22. https://doi.org/10.1007/978-981-97-3556-3_2
Type | Conference paper |
---|---|
Title | Analysis of machine learning methods for speech disfluency classification |
Authors | Sharma, N., Gandhi, V. and Mahapatra, P. |
Abstract | Speech is essential for communication: it allows us to express ourselves and to use speech-based systems. A disfluency is any interruption in speaking, and it can often adversely affect an individual's quality of life. This paper presents an experimental study of several methods for identifying and categorizing speech disfluencies, focusing on two types: prolongation and repetition. We have investigated various machine learning algorithms using the University College London Archive of Stuttered Speech dataset, a popular disfluency dataset created by University College London. Manual segmentation, although time-consuming, has been performed on ten speech files from the dataset, generating a total of 335 disfluent speech samples to train the classifiers. Two feature extraction methods have been applied: Linear Predictive Cepstral Coefficients (LPCCs) and Mel-Frequency Cepstral Coefficients (MFCCs). A comparison of many classifiers and their variants reveals that subspace kNN achieves the highest test accuracy, 87.1%, with MFCC features. The future plan is to develop a system that automatically segments and classifies disfluent speech. |
Sustainable Development Goals | 3 Good health and well-being |
Middlesex University Theme | Health & Wellbeing |
Research Group | Foundations of Computing group |
Conference | 9th International Congress on Information and Communication Technology |
Page range | 13-22 |
Proceedings Title | Proceedings of Ninth International Congress on Information and Communication Technology: ICICT 2024, London, Volume 2 |
Series | Lecture Notes in Networks and Systems |
Editors | Yang, X.S., Sherratt, S., Dey, N. and Joshi, A. |
ISSN | 2367-3370 |
ISSN (electronic) | 2367-3389 |
ISBN (hardcover) | 9789819735556 |
ISBN (electronic) | 9789819735563 |
Publisher | Springer |
Place of publication | Singapore |
Publication date (online) | 10 Aug 2024 |
Accepted | Dec 2023 |
Deposited | 04 Nov 2024 |
Output status | Published |
Accepted author manuscript | File access level: Open |
Copyright Statement | This version of the paper has been accepted for publication, after peer review and is subject to Springer Nature’s AM terms of use (https://www.springernature.com/gp/open-science/policies/accepted-man...), but is not the Version of Record and does not reflect post-acceptance improvements, or any corrections. The Version of Record is available online at: http://dx.doi.org/10.1007/978-981-97-3556-3_2 |
Digital Object Identifier (DOI) | https://doi.org/10.1007/978-981-97-3556-3_2 |
Web address (URL) of conference proceedings | https://doi.org/10.1007/978-981-97-3556-3 |
Language | English |
https://repository.mdx.ac.uk/item/1v0469
Restricted files
Accepted author manuscript