Information Systems

Predicting Chronic Kidney Disease Using Hybrid Machine Learning Based on Apache Spark

Nermin Abdelhakim Othman, The British University in Egypt
Nagwa Goher, Helwan University
Manal Abdel Fattah, Helwan University

Document Type

Article

Publication Date

Winter 2-23-2022

Abstract

Chronic kidney disease (CKD) has become a widespread disease. It is related to various serious risks, such as cardiovascular disease, heightened risk, and end-stage renal disease, which can feasibly be avoided through early detection and treatment of people at risk of this disease. Machine learning algorithms are a source of significant assistance for medical scientists in diagnosing the disease accurately at an early stage. Recently, Big Data platforms have been integrated with machine learning algorithms to add value to healthcare. Therefore, this paper proposes hybrid machine learning techniques that combine feature selection methods and machine learning classification algorithms on a big data platform (Apache Spark) to detect CKD. Two feature selection techniques, Relief-F and the chi-squared feature selection method, were applied to select the important features. Six machine learning classification algorithms were used in this research: decision tree (DT), logistic regression (LR), Naive Bayes (NB), Random Forest (RF), support vector machine (SVM), and Gradient-Boosted Trees (GBT Classifier) as ensemble learning algorithms. Four evaluation measures, namely accuracy, precision, recall, and F1-measure, were applied to validate the results. For each algorithm, the cross-validation results and the testing results were computed on the full feature set, the features selected by Relief-F, and the features selected by the chi-squared method. The results showed that the SVM, DT, and GBT classifiers with the selected features achieved the best performance, at 100% accuracy. Overall, the features selected by Relief-F performed better than both the full feature set and the features selected by the chi-squared method.
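The four evaluation measures named in the abstract can be illustrated with a minimal sketch. It is written in plain Python rather than Apache Spark, it is not the authors' implementation, and the function name `binary_metrics` and the toy labels are assumptions made only for illustration (they are not data from the paper). Labels are assumed to be encoded as 1 for CKD and 0 for not CKD.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int],
                   positive: int = 1) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 for a binary classifier.

    Hypothetical helper for illustration; `positive` is the label treated
    as the disease class.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with respect to the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = len(pairs)
    # Each ratio falls back to 0.0 when its denominator is zero.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels (assumed, not from the paper): 1 = CKD, 0 = not CKD.
toy_true = [1, 1, 1, 0, 0, 0, 1, 0]
toy_pred = [1, 1, 0, 0, 0, 1, 1, 0]
toy_result = binary_metrics(toy_true, toy_pred)
```

On a real Spark pipeline these measures would normally come from the platform's own evaluators; the sketch only makes the definitions concrete.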

Recommended Citation

Othman, Nermin Abdelhakim; Goher, Nagwa; and Abdel Fattah, Manal, "Predicting Chronic Kidney Disease Using Hybrid Machine Learning Based on Apache Spark" (2022). Information Systems. 5.
https://buescholar.bue.edu.eg/info_sys/5
