Text
CLASSIFICATION OF PDF MALWARE IN GARBA RUJUKAN DIGITAL (GARUDA) KEMDIKBUD DIKTI USING THE RANDOM FOREST METHOD
The Portable Document Format (PDF) is one of the most widely used document formats, and its object structure is flexible and easy to work with. For the same reason, attackers use PDF files to carry out attacks. The dataset in this study comes from Garba Rujukan Digital (GARUDA) and consists of a collection of PDF files. Features are extracted from each PDF file with the pdfid tool and used in a multiclass classification process. The dataset is imbalanced, which is addressed by resampling: oversampling with SMOTE and undersampling with NearMiss. Classification with the Random Forest method yields an accuracy of 99.94%, a precision of 99.95%, a recall of 99.94%, an F1-score of 99.94%, and an out-of-bag (OOB) error of 0.06%. The model's accuracy was then validated with stratified k-fold cross-validation; the highest mean accuracy, 99.74%, was obtained with 7 folds.
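The oversampling step mentioned in the abstract can be illustrated with a minimal, standard-library sketch of the SMOTE idea: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its nearest minority-class neighbours. This is not the thesis's code. The function name `smote_oversample`, the feature columns, and the sample counts are hypothetical and chosen only for illustration; a real pipeline would more likely rely on an existing library implementation.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic rows by interpolating between minority neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base sample (excluding itself)
        others = [row for j, row in enumerate(minority) if j != i]
        others.sort(key=lambda row: math.dist(base, row))
        neighbour = rng.choice(others[:k])
        # Random position along the segment from base to neighbour
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical pdfid-style keyword counts: [/JS, /OpenAction, /EmbeddedFile]
minority_rows = [[2, 1, 0], [3, 1, 1], [2, 2, 0], [4, 1, 1]]
new_rows = smote_oversample(minority_rows, n_new=6)
balanced_minority = minority_rows + new_rows
print(len(balanced_minority))
```

Because every synthetic row is a convex combination of two real minority rows, its feature values stay within the range already observed for that class. NearMiss works in the opposite direction, discarding majority-class samples rather than generating new minority ones.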
Inventory Code | Barcode | Call Number | Location | Status |
---|---|---|---|---|
2207005488 | T86331 | T863312022 | Central Library (Reference) | Available but not for loan |