Undergraduate Thesis
TOPIC MODELING USING THE PRE-TRAINED LANGUAGE MODEL INDOBERT AND A VARIATIONAL AUTOENCODER (VAE)
The amount of information and documents scattered across the internet today is very large, which makes it difficult to find information by topic. This poses a challenge for grouping and managing information such as online news headline data. The solution proposed here is a topic modeling system that groups information and documents according to their topics. This research uses a topic modeling method that combines a pre-trained BERT language model with a Variational Autoencoder (VAE). The approach leverages BERT's capability for text embedding and the VAE's capability for dimensionality reduction and latent representation, and uses the K-means algorithm to cluster the data. For model training, 5,000 news headlines in 10 different categories were collected from the online media CNN Indonesia, detik.com, and Kompas. Testing was conducted on 2,000 news headlines that were not used in the training stage. The topic modeling system produces 10 groups, with an average C_v coherence score of 0.78, a lowest value of 0.76, and a highest value of 0.80.
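The abstract outlines a pipeline of IndoBERT text embedding, VAE-based dimensionality reduction, and K-means clustering into 10 topics. The sketch below illustrates that pipeline under stated assumptions: the `indobenchmark/indobert-base-p1` checkpoint, PyTorch for the VAE, and scikit-learn's KMeans are placeholder choices, and all layer sizes and hyperparameters are illustrative rather than the thesis' exact configuration.

```python
# Minimal sketch of the described pipeline: IndoBERT embeddings -> VAE latent
# codes -> K-means into 10 topic clusters. Hyperparameters are assumptions.
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel
from sklearn.cluster import KMeans

MODEL_NAME = "indobenchmark/indobert-base-p1"  # assumed IndoBERT variant

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
bert = AutoModel.from_pretrained(MODEL_NAME)

def embed(headlines, batch_size=32):
    """Mean-pooled IndoBERT embeddings for a list of news headlines."""
    vecs = []
    bert.eval()
    with torch.no_grad():
        for i in range(0, len(headlines), batch_size):
            enc = tokenizer(headlines[i:i + batch_size], padding=True,
                            truncation=True, max_length=64, return_tensors="pt")
            out = bert(**enc).last_hidden_state            # (B, T, 768)
            mask = enc["attention_mask"].unsqueeze(-1)     # (B, T, 1)
            vecs.append((out * mask).sum(1) / mask.sum(1)) # mean over tokens
    return torch.cat(vecs)

class VAE(nn.Module):
    """Small VAE that compresses 768-d embeddings into a latent code."""
    def __init__(self, in_dim=768, hidden=256, latent=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, latent)
        self.logvar = nn.Linear(hidden, latent)
        self.dec = nn.Sequential(nn.Linear(latent, hidden), nn.ReLU(),
                                 nn.Linear(hidden, in_dim))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize
        return self.dec(z), mu, logvar

def train_vae(x, epochs=50, lr=1e-3):
    """Full-batch VAE training with reconstruction + KL divergence loss."""
    vae = VAE(in_dim=x.shape[1])
    opt = torch.optim.Adam(vae.parameters(), lr=lr)
    for _ in range(epochs):
        recon, mu, logvar = vae(x)
        recon_loss = nn.functional.mse_loss(recon, x)
        kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        loss = recon_loss + kld
        opt.zero_grad()
        loss.backward()
        opt.step()
    return vae

# Usage with a list of training headlines:
# x = embed(headlines)                     # e.g. the 5,000 training headlines
# vae = train_vae(x)
# with torch.no_grad():
#     _, mu, _ = vae(x)                    # latent means as document features
# topics = KMeans(n_clusters=10, n_init=10).fit_predict(mu.numpy())
```

The C_v coherence reported in the abstract could then be computed, for example, with gensim's CoherenceModel over the top words of each cluster, though how the thesis extracts topic words and evaluates coherence is not detailed in this record.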
| Inventory Code | Barcode | Call Number | Location | Status |
|---|---|---|---|---|
| 2407002733 | T143978 | T1439782024 | Central Library (REFERENCES) | Available but not for loan |