Document Type

Article

Publication Date

2-1-2024

Journal / Book Title

JASA Express Letters

Abstract

The effects of different acoustic representations and normalizations were compared for classifiers predicting perception of children's rhotic versus derhotic /ɹ/. Formant and Mel frequency cepstral coefficient (MFCC) representations for 350 speakers were z-standardized, either relative to values in the same utterance or age-and-sex data for typical /ɹ/. Statistical modeling indicated age-and-sex normalization significantly increased classifier performances. Clinically interpretable formants performed similarly to MFCCs and were endorsed for deep neural network engineering, achieving mean test-participant-specific F1-score = 0.81 after personalization and replication (σx = 0.10, med = 0.83, n = 48). Shapley additive explanations analysis indicated the third formant most influenced fully rhotic predictions.
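The two normalization schemes named in the abstract can be illustrated with a minimal sketch. It assumes z-standardization means subtracting a reference mean and dividing by a reference standard deviation. The function names, the F3 values, and the norm mean and SD are hypothetical and are not taken from the article. The use of the population standard deviation is also an assumption.

```python
from statistics import mean, pstdev


def z_within_utterance(values):
    """Z-standardize values against the mean and SD of the same utterance."""
    mu = mean(values)
    sd = pstdev(values)  # population SD; this choice is an assumption
    return [(v - mu) / sd for v in values]


def z_against_norms(values, norm_mean, norm_sd):
    """Z-standardize values against external (e.g., age-and-sex) reference data."""
    return [(v - norm_mean) / norm_sd for v in values]


# Hypothetical third-formant (F3) measurements in Hz for one utterance.
f3_track = [2000.0, 2200.0, 2400.0]

# Scheme 1: relative to values in the same utterance.
within = z_within_utterance(f3_track)

# Scheme 2: relative to hypothetical age-and-sex reference values for typical /ɹ/.
against = z_against_norms(f3_track, norm_mean=2500.0, norm_sd=250.0)
```

In this sketch, within-utterance scores always center on zero, so they retain only the shape of the track. Scores against external norms retain how far the production sits from the reference values, which is one plausible reading of why the two schemes could behave differently in a classifier.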

DOI

10.1121/10.0024632

Rights

© 2024 Author(s). All article content, except where otherwise noted, is licensed under a Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Published Citation

Nina R. Benway, Jonathan L. Preston, Asif Salekin, Elaine Hitchcock, Tara McAllister; Evaluating acoustic representations and normalization for rhoticity classification in children with speech sound disorders. JASA Express Lett. 1 February 2024; 4 (2): 025201. https://doi.org/10.1121/10.0024632
