Document Type

Article

Publication Date

1-9-2025

Journal / Book Title

PLoS ONE

Abstract

This study uses Oracle SQL certification exam questions to explore the design of automatic classifiers for exam questions that contain code snippets. Question classification assigns each question a class label drawn from the exam topics. With these labels, questions can be selected from the test bank according to the testing scope to assemble a more suitable test paper. Classifying questions that contain code snippets is more challenging than classifying questions with general text descriptions. We use factorial experiments to identify how two factors, the feature representation scheme and the machine learning method, affect the performance of the question classifiers. The experimental results show that the classifier combining the TF-IDF scheme with a Logistic Regression model performed best on the weighted macro-average AUC and F1 performance indices, while the classifier combining TF-IDF with a Support Vector Machine performed best on weighted macro-average Precision. Moreover, across all performance indices, the feature representation scheme was the main factor affecting classifier performance, followed by the machine learning method.
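
As a rough illustration of the TF-IDF feature representation named in the abstract, the following minimal sketch weights the tokens of a few exam-style questions using only the Python standard library. It is not the authors' implementation: the tokenizer, the smoothed IDF formula, the function names `tokenize` and `tfidf_vectors`, and the sample questions are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase the text and keep word-like tokens,
    # so SQL keywords and identifiers (e.g. "select", "group") are retained.
    return re.findall(r"[a-z_][a-z0-9_]*", text.lower())


def tfidf_vectors(docs):
    """Return one {token: weight} dict per document.

    Term frequency is the token count divided by the document length.
    The IDF is a smoothed variant, log((1 + N) / (1 + df)) + 1, chosen
    here for illustration only.
    """
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents in which each token appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            token: (count / total)
            * (math.log((1 + n_docs) / (1 + doc_freq[token])) + 1.0)
            for token, count in counts.items()
        })
    return vectors


# Hypothetical exam-style questions, invented for this example.
questions = [
    "Which SELECT statement uses GROUP BY with HAVING to filter groups?",
    "Which statement about CREATE TABLE constraints is true?",
    "Which SELECT statement joins employees and departments?",
]
vectors = tfidf_vectors(questions)
```

In this sketch a token that appears in every question, such as "which", gets a lower weight than a token that appears in only one, such as "having". That is the property that lets a downstream classifier focus on topic-specific SQL keywords. The resulting vectors could then be passed to any classifier, for example a logistic regression model.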

DOI

10.1371/journal.pone.0309050

Rights

This is an open access article distributed under the terms of the Creative Commons Attribution License

Published Citation

Chen H-Y, Shih P-C, Wang Y (2025) Exploration of designing an automatic classifier for questions containing code snippets—A case study of Oracle SQL certification exam questions. PLoS ONE 20(1): e0309050. https://doi.org/10.1371/journal.pone.0309050
