Abstract
Speech recognition is an important area in modern-day systems for security and communications. In this preliminary study, a speaker recognition system using Mel Frequency Cepstral Coefficients (MFCC) and machine learning has been designed, developed, and evaluated for gender identification and speaker identification applications. Pitch frequency and MFCC features are used for both tasks. Male and female speech signals are recorded as .wav files at a sampling rate of 16 kHz using a phone application in a quiet place and then transferred to a computer. The speech data is processed in MATLAB and analyzed in the time and frequency domains. MFCC features are extracted from the speech signal over short windows of 20 to 40 ms for use in the classifier. This preliminary study extracts the features over 32 ms windows, which corresponds to a frame length of N = 512 samples at the 16 kHz sampling frequency (0.032 s × 16,000 samples/s = 512). The MFCC features are used to train decision tree, SVM, and KNN models and to make predictions with them, with the highest prediction accuracy reaching 99.6%. These results indicate that MFCC is an excellent feature for speaker recognition classification tasks.
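The relationship between window duration, sampling rate, and frame length N can be illustrated with a short sketch. The study itself uses MATLAB; the example below uses Python with NumPy purely for illustration. The 50% frame overlap, the Hamming window, the synthetic 150 Hz tone standing in for recorded speech, and the helper names `frame_length` and `split_into_frames` are all assumptions, not details taken from the study. The per-frame power spectrum computed at the end is commonly the first stage of an MFCC pipeline; the mel filterbank and cepstral steps are omitted.

```python
import numpy as np


def frame_length(sample_rate_hz: int, frame_ms: float) -> int:
    """Number of samples in one analysis frame of the given duration."""
    return int(round(sample_rate_hz * frame_ms / 1000.0))


def split_into_frames(signal: np.ndarray, n: int, hop: int) -> np.ndarray:
    """Slice a 1-D signal into frames of n samples, advancing by hop samples."""
    if len(signal) < n:
        raise ValueError("signal is shorter than one frame")
    count = 1 + (len(signal) - n) // hop
    # Row i holds the sample indices i*hop, i*hop + 1, ..., i*hop + n - 1.
    idx = np.arange(n)[None, :] + hop * np.arange(count)[:, None]
    return signal[idx]


fs = 16_000                      # sampling rate stated in the abstract (Hz)
N = frame_length(fs, 32)         # 32 ms window -> 512 samples

# Synthetic 1-second, 150 Hz tone as a stand-in for a recorded utterance.
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 150.0 * t)

# Assumed 50% overlap between consecutive frames.
frames = split_into_frames(tone, N, hop=N // 2)

# Assumed Hamming window, then the one-sided power spectrum of each frame.
windowed = frames * np.hamming(N)
power = np.abs(np.fft.rfft(windowed, n=N, axis=1)) ** 2

print(N, frames.shape, power.shape)
```

With these assumptions, the 20 to 40 ms range quoted above corresponds to 320 to 640 samples per frame at 16 kHz. The 32 ms window gives exactly 512 samples, a power of two, which is convenient for the FFT.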