Document Type

Article

Publication Date

8-27-2025

Journal / Book Title

Computational Economics

Abstract

We examine U.S. publicly traded bank holding companies (BHCs) that failed during the 2007–2009 global financial crisis. Using consolidated data at the BHC level and 10-K filings, we investigate the determinants of bank failures during this period using nonlinear machine learning (ML). The in-sample analysis demonstrates that 90% of the failed banks can be classified during 2007–2009. In addition, our sensitivity analysis for interpretable ML shows that net tone is among the top five important features. However, the power of tone/text is less evident when we consider predictive (out-of-sample) analysis. While nonlinear ML models such as random forest and support vector regressions benefit from textual data in forming predictions, linear models that rely on actuarial data attain a similar or even better performance. Overall, our paper demonstrates that the least complex linear models use conventional financial ratios efficiently in predicting the failure of publicly traded banks, deeming more complex ML algorithms with 10-K textual data redundant. Our findings remain robust, even when incorporating large language models such as FinBERT (Huang et al. in Contemp Account Res 40(2):806–841, 2023).

DOI

10.1007/s10614-025-10969-2

Rights

This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Published Citation

Gupta, A., Lu, C., Simaan, M. et al. When Positive Sentiment is not so Positive: Textual Analytics and Bank Failures. Comput Econ (2025). https://doi.org/10.1007/s10614-025-10969-2
