Document Type

Preprint

Publication Date

1-1-2024

Journal / Book Title

Information Security Journal

Abstract

[Context:] Failing to predict vulnerabilities in the early stages of development can lead to vulnerable code being written and deployed in the final software product. Vulnerability prediction that uses software metrics as features can support the discovery process by localizing vulnerable code. Existing studies have successfully employed metrics for vulnerability prediction on some platforms (C/C++ or Java projects). We propose that a comparative evaluation of how these metrics perform in projects written in different languages can help developers decide whether a metrics-based prediction approach can be effective in the context of their own project. [Objective:] The purpose of this research is to analyze and compare the performance of software metrics in vulnerability prediction across programming language contexts (Java vs. Python). [Method:] We conducted experiments on vulnerabilities reported for the Java projects Apache Tomcat (releases 6 and 7) and Apache CXF, and for two Python projects (Django and Keystone). We applied machine learning to predict whether a particular type of code component (Java and Python classes) is vulnerable or non-vulnerable. [Results:] We found that metrics-based prediction identifies vulnerable Java classes with higher recall and precision than vulnerable Python classes. [Conclusion:] This class-level study will help developers predict vulnerabilities at the class level and assist in secure coding in object-oriented programming.
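
The recall and precision named in the results can be illustrated with a minimal sketch. It assumes each class carries a binary label (1 = vulnerable, 0 = non-vulnerable). The function name `precision_recall` and the six example labels are hypothetical, chosen only for illustration; they are not data or code from the study.

```python
def precision_recall(actual, predicted):
    """Return (precision, recall) for binary labels where 1 = vulnerable."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # vulnerable, flagged
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # clean, but flagged
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # vulnerable, missed
    # Precision: share of flagged classes that are truly vulnerable.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of truly vulnerable classes that were flagged.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels for six classes (illustrative only, not study data).
actual = [1, 0, 1, 1, 0, 0]
predicted = [1, 0, 0, 1, 1, 0]
precision, recall = precision_recall(actual, predicted)
```

Higher recall means fewer vulnerable classes are missed; higher precision means fewer non-vulnerable classes are flagged for review.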

DOI

10.1080/19393555.2023.2240343

Journal ISSN / Book ISBN

85175105286 (Scopus)
