Document Type
Article
Publication Date
2-1-2024
Journal / Book Title
JASA Express Letters
Abstract
The effects of different acoustic representations and normalizations were compared for classifiers predicting perception of children's rhotic versus derhotic /ɹ/. Formant and Mel frequency cepstral coefficient (MFCC) representations for 350 speakers were z-standardized, either relative to values in the same utterance or age-and-sex data for typical /ɹ/. Statistical modeling indicated age-and-sex normalization significantly increased classifier performances. Clinically interpretable formants performed similarly to MFCCs and were endorsed for deep neural network engineering, achieving mean test-participant-specific F1-score = 0.81 after personalization and replication (σx = 0.10, med = 0.83, n = 48). Shapley additive explanations analysis indicated the third formant most influenced fully rhotic predictions.
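The two normalizations contrasted in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, which this record does not describe; the function names, the three-column formant layout, and every numeric value below are hypothetical placeholders chosen only to show the arithmetic of z-standardizing against the same utterance versus against external reference norms.

```python
import numpy as np


def z_within_utterance(features):
    """Z-standardize each feature column against the same utterance's frames."""
    features = np.asarray(features, dtype=float)
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    std = np.where(std == 0, 1.0, std)  # guard against constant columns
    return (features - mean) / std


def z_against_norms(features, norm_mean, norm_std):
    """Z-standardize each feature column against external reference norms."""
    features = np.asarray(features, dtype=float)
    return (features - np.asarray(norm_mean, dtype=float)) / np.asarray(norm_std, dtype=float)


# Hypothetical frames x (F1, F2, F3) values in Hz; not data from the article.
frames = np.array([
    [500.0, 1500.0, 2000.0],
    [520.0, 1450.0, 1900.0],
    [540.0, 1400.0, 1800.0],
])

# Hypothetical age-and-sex reference values for typical /ɹ/; placeholders only.
HYPOTHETICAL_NORM_MEAN = np.array([500.0, 1500.0, 2200.0])
HYPOTHETICAL_NORM_STD = np.array([50.0, 100.0, 200.0])

within = z_within_utterance(frames)
against_norms = z_against_norms(frames, HYPOTHETICAL_NORM_MEAN, HYPOTHETICAL_NORM_STD)
```

In this sketch, within-utterance scores always center on zero for each feature, so they describe movement inside the utterance but not how far a speaker sits from typical values. Scores computed against reference norms keep that offset, which is one plausible reading of why the abstract reports a benefit for age-and-sex normalization.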
DOI
10.1121/10.0024632
Rights
© 2024 Author(s). All article content, except where otherwise noted, is licensed under a Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Montclair State University Digital Commons Citation
Benway, Nina R.; Preston, Jonathan L.; Salekin, Asif; Hitchcock, Elaine; and McAllister, Tara, "Evaluating acoustic representations and normalization for rhoticity classification in children with speech sound disorders" (2024). Department of Communication Sciences and Disorders Faculty Scholarship and Creative Works. 169.
https://digitalcommons.montclair.edu/communcsci-disorders-facpubs/169
Published Citation
Nina R. Benway, Jonathan L. Preston, Asif Salekin, Elaine Hitchcock, Tara McAllister; Evaluating acoustic representations and normalization for rhoticity classification in children with speech sound disorders. JASA Express Lett. 1 February 2024; 4 (2): 025201. https://doi.org/10.1121/10.0024632