Document Type
Conference Proceeding
Publication Date
1-1-2024
Journal / Book Title
15th International Conference on Information Intelligence Systems and Applications IISA 2024
Abstract
Early detection of tuberculosis (TB) remains a critical challenge. This research presents a novel approach that uses audio information from cough recordings to predict TB. We move beyond traditional image-based methods (such as sputum smear microscopy and chest X-rays) and explore the feasibility of using cough recordings to differentiate TB cases. Two main audio processing techniques, Mel-Spectrograms and Mel-Frequency Cepstral Coefficients (MFCCs), are used to encode the audio recordings as features for deep learning models that perform TB classification. Our proposed methods draw on a large challenge dataset encompassing clinical data from over 1,105 participants and over 502,252 cough recordings. Notably, a simple 1D convolutional neural network (CNN) trained on MFCC features achieves an accuracy of 91%, exceeding the World Health Organization's (WHO) requirements for TB screening tests. Our findings highlight the potential of MFCC features and 1D CNNs for accurate TB detection from cough sound data. This approach aligns with the Occam's Razor principle, favoring simpler models (such as 1D CNNs) when simpler and more complex models achieve comparably good results. This research opens the door to further study in diverse populations and to translation into accessible TB screening solutions, especially in resource-limited settings where only cough recordings can be collected, highlighting its real-world impact.
DOI
10.1109/IISA62523.2024.10786619
Journal ISSN / Book ISBN
85215807149 (Scopus)
Montclair State University Digital Commons Citation
Yadav, Jyoti; Varde, Aparna S.; Liu, Hao; Antoniou, George; and Xie, Lei, "Audiovisual Multimodal Cough Data Analysis for Tuberculosis Detection" (2024). School of Computing Faculty Scholarship and Creative Works. 73.
https://digitalcommons.montclair.edu/computing-facpubs/73