Machine Learning-Based Malware Detection and Malicious URL Classification System for Detecting Cyberattacks and Achieving Cybersecurity

Document Type

Article

Publication Date

2025

Abstract

As Internet-based applications have become more widely used, the risk of cyberattacks against networks has grown, and cybersecurity has become more important in fields such as education, government, and finance. In this paper, we present a novel system that both detects malware and classifies URLs. For malware detection, we developed machine learning techniques that operate on portable executable (PE) header data. We train and test decision tree and random forest models on selected PE header features, and the classifier that yields the highest classification accuracy is retained to classify new samples. We applied feature engineering, hyperparameter tuning, and confusion matrix heatmaps to optimize and improve the models. For URL classification, we adopted random forest, support vector machine, and logistic regression classifiers, with the main objective of obtaining broad and accurate classifications of URLs. We evaluated the performance of the proposed system using accuracy, precision, recall, and F1-score. Among the tested models, the random forest model yielded the highest accuracy: 99.98% for malware detection and 90.59% for malicious URL detection. The simulation results and a comparison with other state-of-the-art approaches demonstrate that our system is robust in detecting cyberattacks and achieving cybersecurity.
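
The evaluation and model-selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (evaluate, select_best), the toy labels, and the candidate predictions are all hypothetical, and it assumes a binary setting in which 1 marks a malicious sample and 0 a benign one. It computes accuracy, precision, recall, and F1-score for each candidate and keeps the one with the highest accuracy.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def evaluate(y_true, y_pred, positive=1):
    """Compute binary classification metrics from true and predicted labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return Metrics(accuracy, precision, recall, f1)


def select_best(predictions, y_true):
    """Return (name, metrics) of the candidate with the highest accuracy."""
    scored = {name: evaluate(y_true, y_pred) for name, y_pred in predictions.items()}
    best = max(scored, key=lambda name: scored[name].accuracy)
    return best, scored[best]


# Toy labels (1 = malicious, 0 = benign); invented for illustration only,
# not taken from the paper's datasets.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
predictions = {
    "decision_tree": [1, 0, 0, 1, 0, 1, 1, 0],
    "random_forest": [1, 0, 1, 1, 0, 1, 1, 0],
}
best_name, best_metrics = select_best(predictions, y_true)
print(best_name, best_metrics)
```

In practice the predictions would come from trained classifiers evaluated on held-out data. Selecting on accuracy alone mirrors the abstract's description, although precision and recall may matter more when malicious samples are rare.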
