Document Type
Article
Publication Date
12-22-2025
Journal / Book Title
Big Data and Cognitive Computing
Abstract
This study investigates whether incorporating highlighted information in discharge notes improves the quality of summaries generated by Large Language Models (LLMs). Specifically, it evaluates the effect of highlighted versus unhighlighted inputs when fine-tuning the LLaMA2-13B model for summarization. We fine-tuned LLaMA2-13B in two variants using the MIMIC-IV-Ext-BHC dataset: one on highlighted discharge notes (H-LLaMA) and the other on the same set of notes without highlighting (U-LLaMA). Highlighting was performed automatically using a Cardiology Interface Terminology (CIT) presented in our previous work. H-LLaMA and U-LLaMA were evaluated on a randomly selected test set of 100 discharge notes using multiple metrics (including BERTScore, ROUGE-L, BLEU, and SummaC_CONV). Additionally, ChatGPT-4o served as an LLM-based judge, rating coherence, fluency, conciseness, and correctness, and completeness was evaluated manually on a random sample of 40 notes. H-LLaMA consistently outperformed U-LLaMA across all metrics. Compared with U-summaries (generated by U-LLaMA), H-summaries (generated by H-LLaMA) achieved higher BERTScore (63.75 vs. 59.61), ROUGE-L (23.43 vs. 21.82), BLEU (10.4 vs. 8.41), and SummaC_CONV (67.7 vs. 40.2). Manual review also showed improved completeness for H-summaries (54.8% vs. 47.6%). All improvements were statistically significant (p < 0.05). Moreover, the LLM-based evaluation indicated higher average ratings for H-summaries in coherence, correctness, and conciseness.
DOI
10.3390/bdcc10010004
Montclair State University Digital Commons Citation
Dehkordi, Mahshad Koohi Habibi; Perl, Yehoshua; Deek, Fadi P.; and Liu, Hao, "Fine-Tuning LLaMA2 for Summarizing Discharge Notes: Evaluating the Role of Highlighted Information" (2025). School of Computing Faculty Scholarship and Creative Works. 85.
https://digitalcommons.montclair.edu/computing-facpubs/85
Rights
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Published Citation
Koohi Habibi Dehkordi, M., Perl, Y., Deek, F. P., & Liu, H. (2026). Fine-Tuning LLaMA2 for Summarizing Discharge Notes: Evaluating the Role of Highlighted Information. Big Data and Cognitive Computing, 10(1), 4. https://doi.org/10.3390/bdcc10010004