Alex Net-Based Speech Emotion Recognition Using 3D Mel-Spectrograms
Journal Title: International Journal of Innovations in Science and Technology - Year 2024, Vol 6, Issue 2
Abstract
Speech Emotion Recognition (SER) is considered a challenging task in the domain of Human-Computer Interaction (HCI) due to the complex nature of audio signals. To address this challenge, we devised a novel method to fine-tune Convolutional Neural Networks (CNNs) for accurate recognition of speech emotion. This research used the spectrogram representation of audio signals as input to train a modified AlexNet model capable of processing signals of varying lengths. The IEMOCAP dataset was used to identify four emotional states (happy, sad, angry, and neutral) from speech. Each audio signal was preprocessed to extract a 3D spectrogram whose three dimensions represent time, frequency, and color-coded amplitude as key features. The output of the modified AlexNet model is a 256-dimensional vector. The model achieved adequate accuracy, highlighting the effectiveness of CNNs and 3D Mel-Spectrograms for precise and efficient speech emotion recognition and paving the way for further advances in this domain.
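The preprocessing step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the parameters, so the sample rate, FFT size, hop length, number of mel bands, and the color ramp below are all assumptions, and the function names are hypothetical. This is not the authors' implementation. It only shows how a waveform might become a time x frequency x color array whose time axis grows with the length of the signal.

```python
import numpy as np

def hz_to_mel(f):
    # Common HTK-style mel scale formula.
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    """Triangular filters evenly spaced on the mel scale, shape (n_mels, n_fft//2 + 1)."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_pts = mel_to_hz(mel_pts)
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        lo, mid, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def log_mel_spectrogram(signal, sr=16000, n_fft=512, hop=160, n_mels=64):
    """Return a (n_mels, n_frames) log-mel spectrogram in dB (assumed parameters)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[t * hop : t * hop + n_fft] * window for t in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2   # (n_frames, n_fft//2 + 1)
    mel = mel_filterbank(n_mels, n_fft, sr) @ power.T  # (n_mels, n_frames)
    return 10.0 * np.log10(mel + 1e-10)                # small floor avoids log(0)

def to_rgb_image(log_mel):
    """Map dB values to a (n_mels, n_frames, 3) array using a simple blue-to-red ramp."""
    span = log_mel.max() - log_mel.min()
    x = (log_mel - log_mel.min()) / (span + 1e-12)     # normalized to [0, 1]
    return np.stack([x, 1.0 - np.abs(2.0 * x - 1.0), 1.0 - x], axis=-1)

# Usage: a one-second 440 Hz tone sampled at 16 kHz.
tone = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
image = to_rgb_image(log_mel_spectrogram(tone))
print(image.shape)
```

With these assumed settings, a one-second clip yields 97 frames and a two-second clip yields 197, so the width of the image depends on the utterance length. Any model consuming such images therefore has to accept variable-width input or pool over time before producing a fixed-size vector.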
Authors and Affiliations
Sara Ali, Bushra Naz, Sanam Narejo, Zohaib Ahmed
Unlocking Potential: Personality-Aware TVET Course Recommendations Revolutionize Skill Development
Personality is a complex amalgamation of ideas, behaviors, and social constructs that shape our self-perception and influence our interactions with others. It tends to remain relatively stable over time. The developmen...
Implementation of Renthub System: An Intelligent Online Rental Marketplace with ML-Powered Personalized Product Discovery and Recommendations
The rapid expansion of peer-to-peer rental services has significantly influenced the share economy by connecting consumers with short-term access to diverse rental products. However, existing platforms primarily focus...
Investigate the Operating Temperature Effect on Fast Pyrolysis Products of Food Waste with Hydrogen
Energy crises and environmental pollution are the main issues of concern all over the world and the disposal of wastes by converting into gaseous products can reduce this to a level. Investigating how operating tempera...
Design of a Modified Wilkinson Power Divider for Ultra-Wideband Antipodal Vivaldi Antenna Arrays
This paper presents the design of a two-way Modified Wilkinson power divider (MWPD) feeding network for a two-element Antipodal Vivaldi Antenna (AVA) array, operating in the 3–10 GHz ultra-wideband (UWB) frequency...
An Efficient and Robust Deep Learning Approach for Vehicle Recognition using Light-weight Deep Network
In the realm of intelligent transportation systems, automatic number plate detection has emerged as a crucial research topic due to its wide range of applications, including traffic violation monitoring, support for au...