Development of speech emotion recognition system using optimized convolutional neural network

  • B. F. Adebiyi LAUTECH
  • A. O. Oke Department of Computer Engineering, LAUTECH
  • A. S. Falohun Department of Computer Engineering, LAUTECH
  • O. O. Awodoye Department of Computer Engineering, LAUTECH

Abstract

Speech Emotion Recognition (SER) allows systems to interpret emotions in human speech, creating more natural and responsive interactions between people and machines. Because emotion detection is complex, several deep learning techniques have been applied to it, yet limited research has focused on optimizing key hyperparameters of the Convolutional Neural Network (CNN) for a more efficient system. Hence, this research optimized a CNN with the Mantis Search Algorithm (MSA), chosen for its ease of implementation, its ability to preserve population diversity during optimization and to escape local optima, and its balance between exploration and exploitation operators. Audio data for four emotions (anger, fear, happiness and neutrality) were acquired from the Toronto Emotional Speech Set (TESS), available on Kaggle.com. The audio data were then converted into text using speech-to-text code and preprocessed using Natural Language Processing (NLP) techniques: tokenization, stop-word removal, lemmatization, punctuation removal and lowercase conversion. MSA was then applied to select the optimal filter size and learning rate for the CNN. The optimized CNN (MSA-CNN) was implemented in MATLAB R2023a. The performance of the system was evaluated and compared with a CNN classifier using False Positive Rate (FPR), Specificity (Spec), Sensitivity (Sen), Precision (Prec), Accuracy (Acc), and Recognition Time (RT). The optimized speech emotion recognition system improved on the CNN for every metric considered.
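
The count-based evaluation metrics named in the abstract (FPR, specificity, sensitivity, precision and accuracy) follow from the standard confusion-matrix definitions. The sketch below is a minimal Python illustration of those definitions for a single emotion class treated one-vs-rest. It is not the authors' MATLAB implementation: the function name `binary_metrics` and the example counts are hypothetical, and Recognition Time is omitted because it is a timing measurement rather than a count-based ratio.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute one-vs-rest metrics from confusion-matrix counts.

    tp, fp, tn, fn are the true-positive, false-positive,
    true-negative and false-negative counts for one class.
    """
    def ratio(num, den):
        # Guard against empty denominators (e.g. no negative samples).
        return num / den if den else 0.0

    return {
        "fpr": ratio(fp, fp + tn),          # false alarms among actual negatives
        "specificity": ratio(tn, tn + fp),  # equals 1 - FPR when negatives exist
        "sensitivity": ratio(tp, tp + fn),  # recall on actual positives
        "precision": ratio(tp, tp + fp),    # correct share of predicted positives
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }


# Hypothetical counts for one emotion class (not results from the paper).
metrics = binary_metrics(tp=90, fp=5, tn=285, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

For a four-class problem such as this one, the same function would be applied once per emotion, counting that emotion as positive and the other three as negative.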

Published
2025-01-23
How to Cite
Adebiyi, B., Oke, A., Falohun, A., & Awodoye, O. (2025). Development of speech emotion recognition system using optimized convolutional neural network. LAUTECH Journal of Engineering and Technology, 18(4), 129-139. Retrieved from https://laujet.com/index.php/laujet/article/view/765