Gender Recognition of Teen and Adult Voices in Non-Tonal and Tonal Languages in Uncontrolled Environments

Enrique Díaz-Ocampo, Areli Karina Martínez-Tapia, Andrea Magadán-Salazar, Raúl Pinto-Elías, Máximo López-Sánchez, Yael Bensoussan

Abstract

Voice gender recognition systems automate the detection of a speaker's gender from the acoustic signal of the voice. These systems can be trained on recordings from uncontrolled environments, which contain different types of noise and varied speaker characteristics. However, current systems are biased toward the training language, which is usually English. The present work focuses on gender recognition of adult and teen voices in a group of tonal languages and in Spanish under uncontrolled environments. The features comprised seven derived from pitch and two others: the mean of the fourth formant and the vocal tract length. Two scenarios were built: a training-test scenario on one dataset, and a second validation scenario using the other dataset. The metrics used were accuracy, recall, F1-score, and area under the ROC curve. The algorithms used were Multilayer Perceptron and Random Forest. Despite the bias in the datasets, the biological features and the algorithms were robust to the change of language.

Keywords

Voice gender recognition; fundamental frequency; vocal tract length; tonal language; Spanish language
