SMCS: Automatic Real-Time Classification of Ambient Sounds, Based on a Deep Neural Network and Mel Frequency Cepstral Coefficients

Abstract:

This paper presents a model that classifies ambient sounds automatically and in real time, using the sound dataset provided in the Kaggle free sounds competition. Two data preprocessing techniques are applied: length normalization, which unifies the audio inputs to a single time interval, and property normalization, which standardizes the sampling frequency and bit depth. The model also includes a DNN (Deep Neural Network) capable of classifying common environmental sounds. The input to the network is formed by MFCC (Mel Frequency Cepstral Coefficients) vectors, which reduces processing time and improves the model's responsiveness in detecting sounds, especially those considered warning signs of environmental threats, thereby facilitating the mobility of people with hearing impairment.
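As an illustration of the two preprocessing steps named in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes clips are already decoded into lists of 16-bit PCM integers; the function names, the 2-second target length, and the 16 kHz rate are hypothetical choices made for this example. It covers only length normalization and the bit-depth part of property normalization; resampling to a common sampling frequency is omitted.

```python
def normalize_length(samples, sample_rate, target_seconds):
    """Zero-pad or truncate a clip so it has a fixed number of samples."""
    target_len = int(round(sample_rate * target_seconds))
    if len(samples) >= target_len:
        # Clip is too long: keep only the first target_len samples.
        return list(samples[:target_len])
    # Clip is too short: append silence (zeros) up to the target length.
    return list(samples) + [0] * (target_len - len(samples))


def pcm16_to_float(samples):
    """Map 16-bit PCM integers to floats in the range [-1.0, 1.0)."""
    return [s / 32768.0 for s in samples]


# Hypothetical values chosen only for the example.
SAMPLE_RATE = 16000      # assumed common sampling frequency, in Hz
TARGET_SECONDS = 2.0     # assumed fixed clip duration

short_clip = [1000] * 8000    # 0.5 s of audio
long_clip = [1000] * 40000    # 2.5 s of audio

fixed_short = normalize_length(short_clip, SAMPLE_RATE, TARGET_SECONDS)
fixed_long = normalize_length(long_clip, SAMPLE_RATE, TARGET_SECONDS)
scaled = pcm16_to_float([-32768, 0, 16384])
```

After this step every clip has the same length and value range, so MFCC vectors computed from the clips would share one shape, which is what a fixed-size network input requires.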