POS Tagger Improvisation using HMM with the Addition of Foreign Word Labels on Telkom University News

WINKIE SETYONO

Basic Information

22.04.2283
006.35
Scientific Work - Undergraduate Thesis (S1) - Reference

News is a medium through which the public obtains information every day. A news article carries a lot of information and is composed of structured sentences. Each language, Indonesian and foreign languages alike, has its own sentence structure. Nowadays, however, many media outlets mix Indonesian with foreign languages, which produces sentence structures that differ from those of Bahasa Indonesia. To classify the words in such text, Part-of-Speech (POS) Tagging is needed: it determines the class of each word in a sentence by learning from a corpus of the language. The structure of the language affects the tagging results of the POS Tagger, and words that are not in the corpus can reduce its accuracy. With these new sentence structures, the POS Tagger needs a larger corpus to learn from, but the current corpus does not cover them yet. We sought to improve on earlier results by adding data to the corpus, taken from online media, whose sentence structure differs from that of the Indonesian-language corpus. We added about 242 sentences with 7,043 tokens to the corpus, focusing on Foreign Word (FW) tags, which total 3,819 tags. After testing several scenarios, the POS Tagger reached an accuracy of 94.7% using the Hidden Markov Model (HMM) method, with an F1-score of 78% for the FW tag.
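As an illustration of the approach the abstract describes, the following is a minimal sketch of a bigram HMM tagger with Viterbi decoding and a per-tag F1 score. It is not the thesis implementation: the toy corpus, the three-tag set (NN, VB, FW), the add-one smoothing, and all function names are assumptions made for this example.

```python
from collections import defaultdict
import math

# Toy tagged corpus (hypothetical; NOT the thesis data).
# Assumed tags: NN = noun, VB = verb, FW = foreign word.
TRAIN = [
    [("mahasiswa", "NN"), ("mengikuti", "VB"), ("workshop", "FW")],
    [("dosen", "NN"), ("membuka", "VB"), ("seminar", "NN")],
    [("mahasiswa", "NN"), ("membuka", "VB"), ("startup", "FW")],
]


def train_hmm(sentences):
    """Count tag-to-tag transitions and tag-to-word emissions."""
    trans = defaultdict(lambda: defaultdict(int))
    emit = defaultdict(lambda: defaultdict(int))
    vocab = set()
    for sent in sentences:
        prev = "<s>"  # sentence-start pseudo-tag
        for word, tag in sent:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            vocab.add(word)
            prev = tag
    return trans, emit, sorted(emit), vocab


def log_prob(counts, key, n_outcomes):
    """Add-one smoothed log probability (an assumption of this sketch)."""
    total = sum(counts.values())
    return math.log((counts.get(key, 0) + 1) / (total + n_outcomes))


def viterbi(words, trans, emit, tags, vocab):
    """Return the most probable tag sequence for a list of words."""
    if not words:
        return []
    n_words = len(vocab) + 1  # one extra slot for unknown words
    n_tags = len(tags)
    # best[i][tag] = (log score of best path ending in tag, previous tag)
    best = [{}]
    for t in tags:
        score = log_prob(trans["<s>"], t, n_tags) + log_prob(emit[t], words[0], n_words)
        best[0][t] = (score, None)
    for i in range(1, len(words)):
        best.append({})
        for t in tags:
            e = log_prob(emit[t], words[i], n_words)
            best[i][t] = max(
                (best[i - 1][p][0] + log_prob(trans[p], t, n_tags) + e, p)
                for p in tags
            )
    # Follow the back-pointers from the best final tag.
    last = max(tags, key=lambda t: best[-1][t][0])
    path = [last]
    for i in range(len(words) - 1, 0, -1):
        last = best[i][last][1]
        path.append(last)
    return path[::-1]


def f1_for_tag(gold, pred, tag):
    """F1 score for a single tag, e.g. FW."""
    tp = sum(g == tag and p == tag for g, p in zip(gold, pred))
    fp = sum(g != tag and p == tag for g, p in zip(gold, pred))
    fn = sum(g == tag and p != tag for g, p in zip(gold, pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    trans, emit, tags, vocab = train_hmm(TRAIN)
    print(viterbi(["mahasiswa", "mengikuti", "workshop"], trans, emit, tags, vocab))
```

With a corpus this small the numbers mean nothing; the point is only that adding sentences containing FW-tagged tokens changes both the transition counts into FW and the emission counts for FW, which is the mechanism the corpus extension relies on.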

Subject

NATURAL LANGUAGE PROCESSING
NATURAL MATERIALS

Catalog

POS Tagger Improvisation using HMM with the Addition of Foreign Word Labels on Telkom University News
 
 
Indonesia

Circulation

Rp. 0
Rp. 0
No

Author

WINKIE SETYONO
Individual
Donni Richasdy, Mahendra Dwifebri Purbalaksono
 

Publisher

Universitas Telkom, S1 Informatika
Bandung
2022

Collection

Competency

  • CII4E4 - TUGAS AKHIR (Final Project)
