MFCC as Features for Speaker Classification using Machine Learning

Xiaojie Mu; Cheol-Hong Min

doi:10.1109/AIIoT58121.2023.10174566

Conference proceeding

2023 IEEE World AI IoT Congress (AIIoT), pp.0566-0570

June 7, 2023

DOI: https://doi.org/10.1109/AIIoT58121.2023.10174566

Keywords

Feature extraction; Machine learning; MFCC; Object recognition; Pitch Frequency; Security; Speaker recognition; Speech recognition; Support vector machines

Abstract

Speech recognition is an important area in modern-day systems for security and communications. In this preliminary study, a speaker recognition system using MFCC (Mel Frequency Cepstral Coefficients) and machine learning has been designed, developed, and evaluated for two applications: male and female (gender) identification and speaker identification. Pitch frequency and MFCC are used as features for both applications. Male and female speech signals are recorded as .wav files at a sampling rate of 16 kHz using a phone application in a quiet place and then transferred to a computer. The speech data is processed in MATLAB and analyzed in the time and frequency domains. MFCC features are extracted from the speech signal over short windows of 20 to 40 milliseconds for use in the classifier. This preliminary study uses a 32 ms window, which corresponds to a frame length of N = 512 samples at the 16 kHz sampling frequency (0.032 s × 16,000 samples/s = 512). The MFCC features are used to train Tree, SVM (support vector machine), and KNN (k-nearest neighbor) models and to make predictions, with the highest prediction accuracy reaching 99.6%. The study therefore shows MFCC to be an excellent feature for speaker recognition classification tasks.
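
The frame-length arithmetic and the general shape of an MFCC computation can be illustrated with a short sketch. The abstract states that the authors used MATLAB; the Python/NumPy code below is not their implementation. Only the 16 kHz sampling rate and the 32 ms window come from the abstract. The 50% frame overlap, Hamming window, 26 mel filters, 13 retained coefficients, and all function names are assumptions chosen for illustration.

```python
import numpy as np

SAMPLE_RATE = 16_000      # Hz, from the abstract
FRAME_SECONDS = 0.032     # 32 ms window, from the abstract
FRAME_LEN = round(SAMPLE_RATE * FRAME_SECONDS)  # N = 512 samples

# Everything below is an illustrative assumption, not taken from the paper.
HOP_LEN = FRAME_LEN // 2  # 50% overlap between consecutive frames
N_MELS = 26               # number of triangular mel filters
N_MFCC = 13               # number of cepstral coefficients kept per frame


def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters evenly spaced on the mel scale, shape (n_mels, n_fft // 2 + 1)."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    bank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):       # rising edge of the triangle
            bank[m - 1, k] = (k - left) / (center - left)
        for k in range(center, right):      # peak and falling edge
            bank[m - 1, k] = (right - k) / (right - center)
    return bank


def frame_signal(signal, frame_len, hop_len):
    """Split a 1-D signal into overlapping frames, shape (n_frames, frame_len)."""
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(signal) - frame_len) // hop_len
    idx = np.arange(frame_len)[None, :] + hop_len * np.arange(n_frames)[:, None]
    return signal[idx]


def dct_ii(x, n_out):
    """Unnormalized type-II DCT along the last axis, keeping the first n_out terms."""
    n = x.shape[-1]
    k = np.arange(n_out)[:, None]
    basis = np.cos(np.pi * k * (2 * np.arange(n)[None, :] + 1) / (2 * n))
    return x @ basis.T


def mfcc(signal, sample_rate=SAMPLE_RATE):
    """Return an (n_frames, N_MFCC) array of MFCC features."""
    frames = frame_signal(np.asarray(signal, dtype=float), FRAME_LEN, HOP_LEN)
    frames = frames * np.hamming(FRAME_LEN)
    power = np.abs(np.fft.rfft(frames, n=FRAME_LEN, axis=1)) ** 2 / FRAME_LEN
    mel_energy = power @ mel_filterbank(N_MELS, FRAME_LEN, sample_rate).T
    log_mel = np.log(mel_energy + 1e-10)    # small offset avoids log(0)
    return dct_ii(log_mel, N_MFCC)


if __name__ == "__main__":
    # One second of a synthetic 220 Hz tone stands in for a recorded .wav file.
    t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
    features = mfcc(np.sin(2 * np.pi * 220.0 * t))
    print(FRAME_LEN, features.shape)
```

With these assumed settings, a one-second signal at 16 kHz yields 61 frames of 512 samples each, so the feature matrix has shape (61, 13). Each row could then serve as one training example for a classifier such as the Tree, SVM, or KNN models named in the abstract.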

Details

Title: MFCC as Features for Speaker Classification using Machine Learning
Author/Creator: Xiaojie Mu - University of St. Thomas - Minnesota; Cheol-Hong Min - University of St. Thomas - Minnesota
Publication Details: 2023 IEEE World AI IoT Congress (AIIoT), pp.0566-0570
Publisher: IEEE
Academic Unit: Electrical & Computer Engineering
Language: English
Resource Type: Conference proceeding
Record Identifier: 991015164275003691
