Comparative analysis of various machine learning techniques for classification of speech disfluencies

Sharma, N., Kumar, V., Mahapatra, P. and Gandhi, V. 2023. Comparative analysis of various machine learning techniques for classification of speech disfluencies. Speech Communication. 150, pp. 23-31. https://doi.org/10.1016/j.specom.2023.04.003
Type: Article
Title: Comparative analysis of various machine learning techniques for classification of speech disfluencies
Authors: Sharma, N., Kumar, V., Mahapatra, P. and Gandhi, V.
Abstract

Speech plays a vital role in communication: from expressing oneself to using speech-based platforms, speech is a necessity. Any disruption in speech is referred to as a disfluency and can impact one’s quality of life. This paper presents an experimental study of various techniques for the detection and classification of speech disfluencies. Six types of disfluency are examined, namely Interjection, Sound Repetition, Word Repetition, Phrase Repetition, Revision and Prolongation (6 classes). The paper also goes a step further by including clean speech signals as an added class alongside the six disfluencies, which makes the study more robust, with 7 classes in total. Various machine learning approaches have been investigated on the University College London Archive of Stuttered Speech (UCLASS) dataset, a standard disfluency dataset generated by University College London (UCL). Five feature extraction techniques have been used: Mel Frequency Cepstral Coefficients (MFCC), Linear Predictive Cepstral Coefficients (LPCC), Gammatone Frequency Cepstral Coefficients (GFCC), Mel-filterbank energy features, and Spectrograms. Comparative analysis of various classifiers shows that MFCC, GFCC, and Spectrograms achieved greater than 90% accuracy on both the 6-class and 7-class tasks with the k-nearest neighbours (kNN) classifier. As future work, the authors aim to tackle the challenge of detecting multiple disfluencies present simultaneously in a speech sample.
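As a rough illustration of the kNN classification step named in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes each speech segment has already been reduced to a fixed-length feature vector (for example, 13 averaged MFCC values); the helper names, the choice of k = 3, the Euclidean distance and the synthetic data are all hypothetical and say nothing about the accuracies reported in the paper.

```python
import math
import random
from collections import Counter

# The seven class labels named in the abstract (six disfluencies plus clean speech).
CLASSES = [
    "Interjection", "Sound Repetition", "Word Repetition",
    "Phrase Repetition", "Revision", "Prolongation", "Clean",
]


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training vectors nearest to query."""
    # Euclidean distance from the query to every training vector, nearest first.
    dists = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def make_synthetic_features(n_per_class, dim=13, seed=0):
    """Hypothetical stand-in for real features: one well-separated cluster per class."""
    rng = random.Random(seed)
    xs, ys = [], []
    for idx, label in enumerate(CLASSES):
        for _ in range(n_per_class):
            # Cluster centre at idx * 5.0 in every dimension, with Gaussian noise.
            xs.append([idx * 5.0 + rng.gauss(0.0, 0.5) for _ in range(dim)])
            ys.append(label)
    return xs, ys


def accuracy(train_x, train_y, test_x, test_y, k=3):
    """Fraction of test vectors whose predicted label matches the true label."""
    hits = sum(
        knn_predict(train_x, train_y, x, k) == y for x, y in zip(test_x, test_y)
    )
    return hits / len(test_y)


if __name__ == "__main__":
    train_x, train_y = make_synthetic_features(20, seed=0)
    test_x, test_y = make_synthetic_features(5, seed=1)
    acc = accuracy(train_x, train_y, test_x, test_y)
    print(f"accuracy on synthetic data: {acc:.2f}")
```

With real recordings, the synthetic vectors would be replaced by features extracted from labelled UCLASS segments, and k and the distance metric would need to be tuned rather than assumed.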

Keywords: Disfluency; Speech Recognition; Feature Extraction; Speech Signals
Sustainable Development Goals: 9 Industry, innovation and infrastructure
Middlesex University Theme: Health & Wellbeing
Publisher: Elsevier
Journal: Speech Communication
ISSN: 0167-6393
Publication dates
Online: 23 Apr 2023
Print: May 2023
Publication process dates
Submitted: 07 Nov 2022
Accepted: 22 Apr 2023
Deposited: 06 Nov 2023
Output status: Published
Accepted author manuscript
License
Copyright Statement

© 2023. This manuscript version is made available under the CC-BY-NC-ND 4.0 license https://creativecommons.org/licenses/by-nc-nd/4.0/

Digital Object Identifier (DOI): https://doi.org/10.1016/j.specom.2023.04.003
Language: English
Permalink: https://repository.mdx.ac.uk/item/vxy13


Related outputs

Analysis of machine learning methods for speech disfluency classification
Sharma, N., Gandhi, V. and Mahapatra, P. 2024. Analysis of machine learning methods for speech disfluency classification. Yang, X.S., Sherratt, S., Dey, N. and Joshi, A. (ed.) 9th International Congress on Information and Communication Technology. London, UK 19 - 22 Feb 2024 Singapore Springer. pp. 13-22 https://doi.org/10.1007/978-981-97-3556-3_2
Classification of EEG signals on standing, walking and running dataset using LSTM-RNN
Murugavalli, K., Ramalakshmi, R., Pallikonda Rajasekaran, M. and Gandhi, V. 2022. Classification of EEG signals on standing, walking and running dataset using LSTM-RNN. Sharma, V., Singh, M. and Sinha, J. (ed.) International Conference on Advances in Computing, Communication Control and Networking (ICAC3N). Greater Noida, India 16 - 17 Dec 2022 IEEE. pp. 1624-1630 https://doi.org/10.1109/ICAC3N56670.2022.10074500
Bridging neuroscience and robotics: spiking neural networks in action
Jones, A., Gandhi, V., Mahiddine, A. and Huyck, C. 2023. Bridging neuroscience and robotics: spiking neural networks in action. Sensors. 23 (21), pp. 1-14. https://doi.org/10.3390/s23218880
Classification of EEG signals on SEED dataset using improved CNN
Ramar, B., Ramalakshmi, R., Gandhi, V. and Pandiselvam, P. 2023. Classification of EEG signals on SEED dataset using improved CNN. 2nd International Conference on Edge Computing and Applications. Namakkal, India 19 - 21 Jul 2023 IEEE. pp. 1095-1102 https://doi.org/10.1109/ICECAA58104.2023.10212279
Exploration of functional connectivity of brain to assess cognitive and physical health parameters using brain-computer interface
Murugavalli, K., Ramalakshmi, R., Pallikonda Rajasekaran, M. and Gandhi, V. 2023. Exploration of functional connectivity of brain to assess cognitive and physical health parameters using brain-computer interface. International Journal of Biomedical Engineering and Technology. 43 (2), pp. 101-130. https://doi.org/10.1504/IJBET.2022.10052922
Neuromorphic building blocks for locomotion pattern generation
Yang, Z. and Gandhi, V. 2022. Neuromorphic building blocks for locomotion pattern generation. 2022 International Conference on Machine Learning, Control, and Robotics (MLCR). Suzhou, China 29 - 31 Oct 2022 IEEE. https://doi.org/10.1109/MLCR57210.2022.00010
Robot Operating System (ROS) controlled anthropomorphic robot hand
Krawczyk, M., Gandhi, V. and Yang, Z. 2022. Robot Operating System (ROS) controlled anthropomorphic robot hand. Journal of Scientific and Industrial Research. 81 (9), pp. 901-910. https://doi.org/10.56042/jsir.v81i09.45313
Exploring new traffic prediction models to build an intelligent transport system for Smart Cities
Mehta, V., Mapp, G. and Gandhi, V. 2022. Exploring new traffic prediction models to build an intelligent transport system for Smart Cities. IEEE/IFIP Network Operations and Management Symposium. Hungary 25 - 29 Apr 2022 pp. 1-6
Developing traffic predictions from source to destination using probabilistic modelling
Mehta, V., Gandhi, V. and Mapp, G. 2021. Developing traffic predictions from source to destination using probabilistic modelling. Third UK Mobile, Wearable and Ubiquitous Systems Research Symposium. Online via Zoom 05 - 06 Jul 2021
Exploring real time traffic signalling using probabilistic approach in intelligent transport system
Mehta, V., Gandhi, V. and Mapp, G. 2018. Exploring real time traffic signalling using probabilistic approach in intelligent transport system. Mobi-UK 2018. University of Cambridge, Cambridge, UK 12 - 13 Sep 2018
Exploring real time traffic signalling using probabilistic approach in intelligent transport system
Mehta, V., Gandhi, V. and Mapp, G. 2018. Exploring real time traffic signalling using probabilistic approach in intelligent transport system. 3rd CommNet2 PhD Autumn School. University of Sheffield, Sheffield, UK 17 - 19 Sep 2018
Design and development of the sEMG-based exoskeleton strength enhancer for the legs
Cenit, M. and Gandhi, V. 2020. Design and development of the sEMG-based exoskeleton strength enhancer for the legs. Journal of Mechatronics, Electrical Power, and Vehicular Technology. 11 (2), pp. 64-74. https://doi.org/10.14203/j.mev.2020.v11.64-74
A survey of modern exogenous fault detection and diagnosis methods for swarm robotics
Graham Miller, O. and Gandhi, V. 2020. A survey of modern exogenous fault detection and diagnosis methods for swarm robotics. Journal of King Saud University – Engineering Science. 33 (1), pp. 43-53. https://doi.org/10.1016/j.jksues.2019.12.005
What makes a social robot good at interacting with humans?
Onyeulo, E. and Gandhi, V. 2020. What makes a social robot good at interacting with humans? Information. 11 (1), pp. 1-13. https://doi.org/10.3390/info11010043
Developing traffic prediction and congestion algorithms for a C-ITS network
Mehta, V., Gandhi, V. and Mapp, G. 2019. Developing traffic prediction and congestion algorithms for a C-ITS network. Second UK Mobile, Wearable and Ubiquitous Systems Research Symposium. Dept of Computer Science, University of Oxford, UK 01 Jul 2019
Wrist movement detector for ROS based control of the robotic hand
Krawczyk, M., Yang, Z., Gandhi, V., Karamanoglu, M., Franca, F., Priscila, L., Xiaochen, W. and Geng, T. 2018. Wrist movement detector for ROS based control of the robotic hand. Advances in Robotics & Automation. 7 (1). https://doi.org/10.4172/2168-9695.1000182
Development of an EMG-controlled mobile robot
Bisi, S., De Luca, L., Shrestha, B., Yang, Z. and Gandhi, V. 2018. Development of an EMG-controlled mobile robot. Robotics. 7 (3), pp. 1-13. https://doi.org/10.3390/robotics7030036
Project-based cooperative learning to enhance competence while teaching engineering modules
Gandhi, V., Yang, Z. and Aiash, M. 2017. Project-based cooperative learning to enhance competence while teaching engineering modules. International Journal of Continuing Engineering Education and Life-Long Learning. 27 (3), pp. 198-208. https://doi.org/10.1504/IJCEELL.2017.10003462
ROS based autonomous control of a humanoid robot
Kalyani, G., Gandhi, V., Yang, Z. and Geng, T. 2016. ROS based autonomous control of a humanoid robot. 25th International Conference on Artificial Neural Networks (ICANN). Barcelona, Spain 06 - 09 Sep 2016 Springer. pp. 550-551 https://doi.org/10.1007/978-3-319-44778-0
Using robot operating system (ROS) and single board computer to control bioloid robot motion
Kalyani, G., Yang, Z., Gandhi, V. and Geng, T. 2017. Using robot operating system (ROS) and single board computer to control bioloid robot motion. 18th Towards Autonomous Robotic Systems (TAROS) Conference. Guildford, Surrey, UK 19 - 21 Jul 2017 Springer. pp. 41-50 https://doi.org/10.1007/978-3-319-64107-2_4
Neuron-based control mechanisms for a robotic arm and hand
Singh, N., Huyck, C., Gandhi, V. and Jones, A. 2017. Neuron-based control mechanisms for a robotic arm and hand. International Journal of Computer, Electrical, Automation, Control and Information Engineering. 11 (2), pp. 221-229. https://doi.org/10.5281/zenodo.1128871
Brain computer interface: a review
Parmar, P., Joshi, A. and Gandhi, V. 2015. Brain computer interface: a review. NUiCONE 2015: 5th Nirma University International Conference on Engineering. Nirma University, Ahmedabad, India 26 - 28 Nov 2015 Institute of Electrical and Electronics Engineers (IEEE). pp. 1-6 https://doi.org/10.1109/NUICONE.2015.7449615
Characterising information correlation in a stochastic Izhikevich neuron
Yang, Z., Gandhi, V., Karamanoglu, M. and Graham, B. 2015. Characterising information correlation in a stochastic Izhikevich neuron. International Joint Conference on Neural Networks (IJCNN 2015). Killarney, Republic of Ireland 12 - 17 Jul 2015 Institute of Electrical and Electronics Engineers (IEEE). pp. 1-5
EMG based elbow joint powered exoskeleton for biceps brachii strength augmentation
Krasin, V., Gandhi, V., Yang, Z. and Karamanoglu, M. 2015. EMG based elbow joint powered exoskeleton for biceps brachii strength augmentation. International Joint Conference on Neural Networks (IJCNN 2015). Killarney, Republic of Ireland 12 - 17 Jul 2015 Institute of Electrical and Electronics Engineers (IEEE). pp. 1-6 https://doi.org/10.1109/IJCNN.2015.7280643
Evaluating quantum neural network filtered motor imagery brain-computer interface using multiple classification techniques
Gandhi, V., Prasad, G., Coyle, D., Behera, L. and McGinnity, T. 2015. Evaluating quantum neural network filtered motor imagery brain-computer interface using multiple classification techniques. Neurocomputing. 170, pp. 161-167. https://doi.org/10.1016/j.neucom.2014.12.114
Brain-computer interfacing for assistive robotics: electroencephalograms, recurrent quantum neural networks and user-centric graphical user interfaces
Gandhi, V. 2014. Brain-computer interfacing for assistive robotics: electroencephalograms, recurrent quantum neural networks and user-centric graphical user interfaces. Elsevier.
EEG-based mobile robot control through an adaptive brain–robot interface
Gandhi, V., Prasad, G., Coyle, D., Behera, L. and McGinnity, T. 2014. EEG-based mobile robot control through an adaptive brain–robot interface. IEEE Transactions on Systems Man and Cybernetics: Systems. 44 (9), pp. 1278-1285. https://doi.org/10.1109/TSMC.2014.2313317
Image classification based on textural features using Artificial Neural Network (ANN)
Shah, S. and Gandhi, V. 2004. Image classification based on textural features using Artificial Neural Network (ANN). Journal of The Institution of Engineers (India): Series A. 84, pp. 72-77.
Image classification based on textural features using unsupervised neural network
Gandhi, V. 2006. Image classification based on textural features using unsupervised neural network. 1st International Indian Geographical Congress. Hyderabad, India 05 - 07 Oct 2006
A recurrent quantum neural network model enhances the EEG signal for an improved brain-computer interface
Gandhi, V., Arora, V., Behera, L., Prasad, G., Coyle, D. and McGinnity, T. 2011. A recurrent quantum neural network model enhances the EEG signal for an improved brain-computer interface. in: IET Seminar on Assisted Living 2011 London Institution of Engineering and Technology. pp. 42-47
Quantum neural network based surface EMG signal filtering for control of robotic hand
Gandhi, V. and McGinnity, M. 2013. Quantum neural network based surface EMG signal filtering for control of robotic hand. IJCNN 2013: The International Joint Conference on Neural Networks. Dallas, TX, USA 04 - 09 Aug 2013
Intelligent adaptive user interfaces for BCI based robotic control
Gandhi, V., Prasad, G., McGinnity, M., Coyle, D. and Behera, L. 2013. Intelligent adaptive user interfaces for BCI based robotic control. BCI meeting. USA Graz University of Technology Publishing House. https://doi.org/10.3217/978-3-85125-260-6-130
Quantum neural network-based EEG filtering for a brain-computer interface
Gandhi, V., Prasad, G., Coyle, D., Behera, L. and McGinnity, T. 2013. Quantum neural network-based EEG filtering for a brain-computer interface. IEEE Transactions on Neural Networks and Learning Systems. https://doi.org/10.1109/TNNLS.2013.2274436
EEG filtering with quantum neural networks for a Brain-Computer Interface (BCI)
Gandhi, V., Prasad, G., Coyle, D., Behera, L. and McGinnity, T. 2012. EEG filtering with quantum neural networks for a Brain-Computer Interface (BCI). Young researchers futures meeting: Neural engineering. University of Warwick 19 - 21 Sep 2012 pp. 21
A novel EEG signal enhancement approach using a recurrent quantum neural network for a Brain Computer Interface
Gandhi, V., Prasad, G., Coyle, D., Behera, L. and McGinnity, M. 2011. A novel EEG signal enhancement approach using a recurrent quantum neural network for a Brain Computer Interface. Technically Assisted Rehabilitation. Berlin, Germany 17 - 18 Mar 2011
An intelligent Adaptive User Interface (iAUI) for enhancing the communication in a Brain-Computer Interface (BCI)
Gandhi, V., Prasad, G., Coyle, D., Behera, L. and McGinnity, M. 2011. An intelligent Adaptive User Interface (iAUI) for enhancing the communication in a Brain-Computer Interface (BCI). UKIERI workshop on the Fusion of Brain-Computer Interface and Assistive Robotics. University of Ulster 07 - 08 Jul 2011
EEG denoising with a recurrent quantum neural network for a brain-computer interface
Gandhi, V., Arora, V., Behera, L., Prasad, G., Coyle, D. and McGinnity, T. 2011. EEG denoising with a recurrent quantum neural network for a brain-computer interface. 2011 International Joint Conference on Neural Networks (IJCNN). San Jose, CA, USA 31 Jul - 05 Aug 2011 IEEE. pp. 1583-1590 https://doi.org/10.1109/IJCNN.2011.6033413
Interfacing a dynamic interface paradigm for multiple target selection using a two class brain-computer interface
Gandhi, V., Coyle, D., Prasad, G., Bharti, C., Behera, L. and McGinnity, M. 2009. Interfacing a dynamic interface paradigm for multiple target selection using a two class brain-computer interface. Indo-US Workshop on System of Systems Engineering. IIT Kanpur, India 26 - 28 Oct 2009 https://doi.org/10.1049/cp.2009.1690
A novel paradigm for multiple target selection using a two class brain computer interface
Gandhi, V., Prasad, G., Coyle, D., Behera, L. and McGinnity, M. 2009. A novel paradigm for multiple target selection using a two class brain computer interface. Irish Signal & Systems Conference. Dublin, Ireland 10 - 11 Jun 2009 Dublin IET. https://doi.org/10.1049/cp.2009.1690