SPAM CLASSIFICATION IN EMAIL USING THE SUPPORT VECTOR MACHINE METHOD AND ANOMALY DETECTION (KLASIFIKASI SPAM PADA EMAIL MENGGUNAKAN METODE SUPPORT VECTOR MACHINE DAN DETEKSI ANOMALY)
Email is a written communication tool used in everyday life, and spam is one of its main problems. This study applies a machine learning approach, the Support Vector Machine (SVM), to classify spam in email. Two datasets are used: one that has not been vectorized and one that has already been vectorized. For the unvectorized data, the first step is text processing to convert the data into numeric form. Once both datasets are in vectorized form, anomalies are detected with Isolation Forest so that outliers can be removed. The data is then resampled with SMOTE so that the classes are balanced. The final step is classification with the Support Vector Machine, splitting the data with K-Fold Cross Validation and normalizing it with Min Max Scaler. For the Emails dataset, the best validation results were an average accuracy of 96.80%, recall 98.70%, precision 95.12%, F1 score 96.88%, FPR 5.11%, AUC 96.79%, and error 3.19%. For the Spambase dataset, the best validation results were an average accuracy of 94.08%, recall 92.55%, precision 95.31%, F1 score 93.91%, FPR 4.42%, AUC 94.06%, and error 5.91%. Based on these results, the method is considered suitable for spam classification in email.
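The metrics reported in the abstract can all be derived from a binary confusion matrix. The sketch below shows the standard definitions in plain Python. It is an illustration, not the study's code: the class and function names and the example counts are hypothetical. The abstract does not state how AUC was computed. The `auc_single_threshold` entry assumes hard (thresholded) predictions, where AUC reduces to (recall + 1 - FPR) / 2. The reported figures are numerically consistent with that formula, but this remains an assumption.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts for a binary spam classifier (spam is the positive class)."""
    tp: int  # spam correctly flagged as spam
    fp: int  # legitimate mail wrongly flagged as spam
    tn: int  # legitimate mail correctly passed
    fn: int  # spam wrongly passed as legitimate


def _ratio(numerator: float, denominator: float) -> float:
    # Return 0.0 instead of raising when a class is empty.
    return numerator / denominator if denominator else 0.0


def classification_metrics(c: ConfusionCounts) -> dict:
    total = c.tp + c.fp + c.tn + c.fn
    accuracy = _ratio(c.tp + c.tn, total)
    recall = _ratio(c.tp, c.tp + c.fn)      # true positive rate
    precision = _ratio(c.tp, c.tp + c.fp)
    f1 = _ratio(2 * precision * recall, precision + recall)
    fpr = _ratio(c.fp, c.fp + c.tn)         # false positive rate
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "f1": f1,
        "fpr": fpr,
        # Assumption: AUC for hard labels (a single operating point).
        "auc_single_threshold": (recall + (1.0 - fpr)) / 2.0,
        "error": 1.0 - accuracy,
    }


if __name__ == "__main__":
    # Hypothetical counts for one validation fold, not taken from the study.
    fold = ConfusionCounts(tp=90, fp=5, tn=95, fn=10)
    for name, value in classification_metrics(fold).items():
        print(f"{name}: {value:.2%}")
```

Under K-Fold Cross Validation, these values would be computed once per fold and then averaged, which matches the "average" figures quoted in the abstract. Because error is the complement of accuracy, each dataset's accuracy and error should sum to roughly 100%, up to rounding.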
Inventory Code | Barcode | Call Number | Location | Status |
---|---|---|---|---|
2007000059 | T32024 | T320242020 | Central Library (Reference) | Available but not for loan |