Text Classification of British English and American English Using Support Vector Machine

MUHAMMAD ROMI ARIO UTOMO

Basic Information

19.04.1531
C
Scientific Work - Undergraduate Thesis (S1) - Reference

Abstract: English is widely used in international communication, and the two varieties most often encountered are British English and American English. Despite their similarities, the two differ in fundamental ways, from vocabulary to grammar. Learners of English therefore need to know which variety they are learning. This study builds a text classification system that classifies sentences according to the variety of English used in the text, with the aim of supporting the English learning process. The dataset is divided into two classes, British English and American English, and the data are split using 10-fold cross-validation. The features combine N-grams, Term Frequency-Inverse Document Frequency (TF-IDF) weighting, and an additional word dictionary. In the TF-IDF weighting process, a Document Frequency (DF) threshold of 2.0 is applied. Classification is carried out with a Support Vector Machine (SVM) using a linear kernel, and the best accuracy obtained is 96.53%.

Keywords: text classification, support vector machine, British English, American English.
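The feature pipeline named in the abstract (N-grams, TF-IDF weighting with a DF threshold, and a word dictionary) can be illustrated with a minimal standard-library sketch. Everything specific here is an assumption for illustration, not the thesis's implementation: the function names, the tiny word lists, the example sentences, the use of word unigrams and bigrams, the reading of the DF threshold as "keep terms that occur in at least 2 documents", and the plain log(N/DF) form of IDF. The SVM training step itself is not shown.

```python
import math
from collections import Counter

# Hypothetical mini dictionaries; the thesis's actual word list is not given.
BRITISH_WORDS = {"colour", "favourite", "lift", "flat"}
AMERICAN_WORDS = {"color", "favorite", "elevator", "apartment"}


def word_ngrams(text, n_max=2):
    """Return all word n-grams of length 1..n_max from a lowercased sentence."""
    tokens = text.lower().split()
    grams = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def tfidf_features(docs, min_df=2, n_max=2):
    """Build sparse TF-IDF vectors, keeping only terms with DF >= min_df."""
    doc_grams = [word_ngrams(d, n_max) for d in docs]

    # Document frequency: number of documents containing each term.
    df = Counter()
    for grams in doc_grams:
        df.update(set(grams))

    vocab = {term for term, count in df.items() if count >= min_df}
    n_docs = len(docs)

    vectors = []
    for grams in doc_grams:
        tf = Counter(g for g in grams if g in vocab)
        # Assumed weighting: raw term count times log(N / DF).
        vectors.append({t: c * math.log(n_docs / df[t]) for t, c in tf.items()})
    return vocab, vectors


def dictionary_counts(text):
    """Count tokens found in the British and American word lists."""
    tokens = text.lower().split()
    british = sum(t in BRITISH_WORDS for t in tokens)
    american = sum(t in AMERICAN_WORDS for t in tokens)
    return british, american


docs = [
    "the colour of the lift",
    "the color of the elevator",
    "my favourite colour",
    "my favorite color",
]
vocab, vectors = tfidf_features(docs, min_df=2)
print(sorted(vocab))
print(vectors[0])
print([dictionary_counts(d) for d in docs])
```

In a complete system, each sparse vector and its dictionary counts would be merged into one feature vector per sentence and passed to a linear-kernel SVM, with accuracy averaged over the 10 cross-validation folds.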

Subject

Text mining
 

Catalog

Text Classification of British English and American English Using Support Vector Machine
 
 
Indonesia

Circulation

Rp. 0
Rp. 0
No

Author

MUHAMMAD ROMI ARIO UTOMO
Individual
YULIANT SIBARONI, NIKEN DWI WAHYU C
 

Publisher

Universitas Telkom
Bandung
2019

