Undergraduate Thesis
COMBINATION OF MEAN IMPUTATION AND MULTIPLE IMPUTATION BY CHAINED EQUATIONS (MICE) FOR HANDLING MISSING DATA AND IMPROVING CLASSIFICATION PERFORMANCE EVALUATION IN DIABETES MELLITUS PREDICTION
The Pima Indians Diabetes 2020 dataset contains missing data. Missing data reduce the effective sample size, which can cause statistical information to be lost and can lead to overfitting on the training data. One way to handle missing data is imputation. This study aims to improve classification performance on the Pima Indians Diabetes 2020 dataset by combining single imputation, using the Mean method on attributes with 10% or less missing data, with multiple imputation, using MICE on attributes with more than 10% missing data. The imputed data were then classified with the Multi Layer Perceptron (MLP) and Support Vector Machine (SVM) methods to measure the change in classification performance. Before missing data were handled, classification achieved an accuracy of 78.947%, a precision of 78.554%, and a recall of 76.616%. After missing data were handled with the Mean and MICE methods, classification achieved an accuracy of 84.221%, a precision of 82.462%, and a recall of 82.462%. Accuracy, precision, and recall thus increased by 5.274, 3.908, and 5.846 percentage points, respectively. It can be concluded that handling missing data with the combined Mean and MICE methods improves the classification performance of diabetes mellitus prediction, as evaluated with MLP and SVM.
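The 10% routing rule described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the column names, the toy values, the use of `None` as the missing-value marker, and the helper names `missing_fraction`, `split_by_missingness`, and `impute_mean` are all assumptions made for illustration. Only the mean step is carried out here; the columns routed to MICE are merely identified.

```python
from statistics import mean

THRESHOLD = 0.10  # attributes with <= 10% missing values get mean imputation


def missing_fraction(values):
    """Return the share of entries in a column that are None."""
    return sum(v is None for v in values) / len(values)


def split_by_missingness(columns, threshold=THRESHOLD):
    """Assign each incomplete column to 'mean' or 'mice' by its missing share."""
    plan = {"mean": [], "mice": []}
    for name, values in columns.items():
        frac = missing_fraction(values)
        if frac == 0:
            continue  # complete column, nothing to impute
        plan["mean" if frac <= threshold else "mice"].append(name)
    return plan


def impute_mean(values):
    """Replace each None with the mean of the observed values."""
    fill = mean(v for v in values if v is not None)
    return [fill if v is None else v for v in values]


# Toy data (hypothetical values, 10 rows per column).
columns = {
    "glucose": [148, 85, None, 89, 137, 116, 78, 115, 197, 125],        # 10% missing
    "insulin": [None, None, None, 94, 168, None, 88, None, 543, None],  # 60% missing
    "age": [50, 31, 32, 21, 33, 30, 26, 29, 53, 54],                    # complete
}

plan = split_by_missingness(columns)

# Apply single (mean) imputation only to the low-missingness columns.
for name in plan["mean"]:
    columns[name] = impute_mean(columns[name])
```

The columns listed in `plan["mice"]` would then be passed to a chained-equations imputer. One possible choice, which the abstract does not name, is scikit-learn's `IterativeImputer`, an experimental estimator that must be enabled by importing `enable_iterative_imputer` from `sklearn.experimental`.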
Inventory Code | Barcode | Call Number | Location | Status |
---|---|---|---|---|
2307000655 | T89866 | T898662023 | Central Library (Reference) | Available but not for loan |