20.04.3709
C -
Scientific Work - Undergraduate Thesis (S1) - Reference
Natural Language Processing
62 times
Hate speech and abusive words spread widely on social media. Their impact can be severe, leading to discrimination, social conflict, and even genocide. Hate speech also varies by target type, category, and level. This research addresses the classification of hate speech and abusive words in Twitter text written in Indonesian, English, and a mixture of both, down to the target type, category, and level. Multilabel hate speech text classification is investigated using RFDT, BiLSTM, and BiLSTM with a pre-trained BERT model. Classifier Chains, Label Powerset, and Binary Relevance are used as data transformation methods, and TF-IDF is used for feature extraction, in combination with the RFDT classification method. Three preprocessing scenarios are compared to find the best results: full preprocessing, preprocessing without stopword removal, and preprocessing without both stemming and stopword removal. The problem of text in Indonesian, English, and a mixture of both is handled in two ways: leaving the text untranslated and translating it into Indonesian. The best result, an accuracy of 76.12%, was obtained using the RFDT classification method with Classifier Chains, without translation, without stemming, and without stopword removal. The research also shows that translation, stemming, and stopword removal are not effective preprocessing steps, and that dependencies between labels greatly affect the classification results.
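The three data transformation methods named in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the toy feature vectors (standing in for TF-IDF rows), the three label columns, and all function names are assumptions made for illustration only. The sketch shows how each method reshapes a multilabel dataset before any classifier is trained.

```python
from typing import Dict, List, Tuple

# Hypothetical toy data: each row of X stands in for a TF-IDF vector,
# each row of Y is a binary label vector, e.g. (hate_speech, abusive, targeted).
X: List[List[float]] = [[0.1, 0.0], [0.0, 0.7], [0.5, 0.5], [0.3, 0.2]]
Y: List[List[int]] = [[1, 0, 0], [1, 1, 0], [1, 1, 0], [0, 0, 0]]


def binary_relevance(X, Y):
    """Build one independent binary dataset per label column."""
    n_labels = len(Y[0])
    return [(X, [y[j] for y in Y]) for j in range(n_labels)]


def label_powerset(Y) -> Tuple[List[int], Dict[Tuple[int, ...], int]]:
    """Map every distinct label combination to a single class id."""
    class_ids: Dict[Tuple[int, ...], int] = {}
    targets: List[int] = []
    for y in Y:
        key = tuple(y)
        if key not in class_ids:
            class_ids[key] = len(class_ids)
        targets.append(class_ids[key])
    return targets, class_ids


def classifier_chain(X, Y):
    """Build one dataset per label; dataset j appends labels 0..j-1 as features.

    Training uses the true earlier labels, as done here; at prediction time
    a chain would feed in the earlier classifiers' predictions instead.
    """
    n_labels = len(Y[0])
    datasets = []
    for j in range(n_labels):
        Xj = [list(x) + list(y[:j]) for x, y in zip(X, Y)]
        yj = [y[j] for y in Y]
        datasets.append((Xj, yj))
    return datasets


br = binary_relevance(X, Y)
lp_targets, lp_classes = label_powerset(Y)
cc = classifier_chain(X, Y)
```

Binary Relevance treats the labels as unrelated, whereas Label Powerset and Classifier Chains both carry information about label co-occurrence into training, which is the kind of label dependency the abstract reports as strongly affecting the results.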
All 1 copies are currently on loan
Name | RAHMAT HENDRAWAN |
Type | Individual |
Editor | |
Translator |
Name | Universitas Telkom |
City | |
Year | 2020 |
Rental fee | IDR 0,00 |
Daily fine | IDR 0,00 |
Type | Non-circulating |