Undergraduate Thesis (Skripsi)
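PERBANDINGAN ALGORITMA JARO-WINKLER DISTANCE DAN LEVENSHTEIN DISTANCE DALAM MENDETEKSI KEMIRIPAN DOKUMEN BAHASA INDONESIA [Comparison of the Jaro-Winkler Distance and Levenshtein Distance Algorithms for Detecting Similarity in Indonesian-Language Documents]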
Document similarity detection calculates the similarity between two or more documents based on either semantic or lexical similarity. This research detects lexical similarity by applying string matching techniques to each document. Jaro-Winkler Distance and Levenshtein Distance are algorithms commonly used in string matching. Jaro-Winkler Distance involves calculating the lengths of the strings in the documents, counting their common characters, and counting transpositions. Levenshtein Distance calculates the minimum number of single-character edits (insertions, deletions, or substitutions) needed to transform one string into the other. Testing was done with a total of 19 authentic documents and 6 comparative documents. The results show that the average error of Levenshtein Distance is 7.86%, while the average error of Jaro-Winkler Distance is 24.45%. As for computing time, four out of five test configurations show that Jaro-Winkler Distance is faster than Levenshtein Distance.
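The two measures described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation, which is not shown in this record. The function names, the prefix scaling factor `p = 0.1`, and the prefix cap of 4 characters are assumptions based on the conventional textbook definitions. The Jaro-Winkler function returns a similarity score in [0, 1] where 1 means identical, which is the usual formulation even though it is often called a "distance".

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the row buffer as short as possible
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def jaro(s1: str, s2: str) -> float:
    """Jaro similarity: string lengths, common characters, and transpositions."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as common only if they lie within this window.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i in range(len1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s1[i] == s2[j]:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Walk both strings' matched characters in order; mismatches are half-transpositions.
    k = 0
    half_transpositions = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                half_transpositions += 1
            k += 1
    t = half_transpositions // 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1: str, s2: str, p: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted for a shared prefix (p and max_prefix are assumed defaults)."""
    j = jaro(s1, s2)
    prefix = 0
    for c1, c2 in zip(s1, s2):
        if c1 != c2 or prefix == max_prefix:
            break
        prefix += 1
    return j + prefix * p * (1 - j)


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))             # 3
    print(round(jaro_winkler("MARTHA", "MARHTA"), 4))   # 0.9611
```

To compare the two on the same scale, one common choice (again an assumption, not necessarily what the thesis used) is to normalise the Levenshtein result as `1 - levenshtein(a, b) / max(len(a), len(b))` for non-empty strings.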
Inventory Code | Barcode | Call Number | Location | Status |
---|---|---|---|---|
2007000078 | T35560 | T355602020 | Central Library (REFERENSI) | Available but not for loan |