Stratifying no-show patients into multiple risk groups via a holistic data analytics-based framework

Journal / Book Title

Decision Support Systems


Accurate prediction of patient no-shows plays a crucial role because it enables researchers to increase the efficiency of their scheduling systems. The purpose of the current study is to formulate a novel hybrid data mining-based methodology that a) accurately predicts no-show patients, b) yields a parsimonious model through a comprehensive variable selection procedure, c) does not suffer from data imbalance, and d) provides healthcare agencies with a patient-specific risk level. Our study suggests that an Artificial Neural Network (ANN) should be employed as the classification algorithm for predicting patient no-shows, using the set of variables selected by both a Genetic Algorithm (GA) and Simulated Annealing (SA). In addition, we used Random Under Sampling (RUS) to improve the performance of the model in predicting the minority group (no-show patients). The patient-specific risk scores were justified by applying a threshold sensitivity analysis. We also developed a web-based decision support tool that clinics can adopt. Clinics can incorporate their own intuition or incentives to make the final decision in cases where the model is not confident enough (i.e., when the estimated probabilities fall near the decision boundary). These insights enable health care professionals to improve clinic utilization and patient outcomes.
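The three data-handling steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the variable names in the usage note, the 0.5 decision boundary, the 0.1 margin, and the three group labels are all assumptions made for illustration, and the ANN that would produce the no-show probability is omitted.

```python
import random


def common_variables(ga_selected, sa_selected):
    """Keep only the variables chosen by both selection procedures (GA and SA)."""
    return sorted(set(ga_selected) & set(sa_selected))


def random_under_sample(records, labels, seed=0):
    """Randomly drop rows from larger classes until every class has the
    same number of rows as the smallest (minority) class."""
    rng = random.Random(seed)
    by_class = {}
    for rec, lab in zip(records, labels):
        by_class.setdefault(lab, []).append(rec)
    n_min = min(len(rows) for rows in by_class.values())
    sampled = []
    for lab, rows in by_class.items():
        for rec in rng.sample(rows, n_min):
            sampled.append((rec, lab))
    rng.shuffle(sampled)
    return [rec for rec, _ in sampled], [lab for _, lab in sampled]


def risk_group(p_no_show, boundary=0.5, margin=0.1):
    """Map an estimated no-show probability to a risk group.

    The boundary and margin are hypothetical values. Probabilities inside
    the band around the boundary are flagged for the clinic's own judgment.
    """
    if p_no_show >= boundary + margin:
        return "high"
    if p_no_show <= boundary - margin:
        return "low"
    return "uncertain"
```

For example, `common_variables(["age", "lead_time", "sms"], ["lead_time", "age", "insurance"])` keeps only `age` and `lead_time`, and `risk_group(0.5)` falls in the "uncertain" band where, as the abstract describes, the clinic would make the final decision. In practice the margin would be chosen through the kind of threshold sensitivity analysis the abstract mentions.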