Sentiment classification of user reviews plays a vital role in business decision-making, especially on e-commerce platforms such as Tokopedia. This study evaluates the performance of several sentiment classification models: Logistic Regression, LinearSVC, and BERT in both baseline and fine-tuned form. The models are evaluated with accuracy, precision, recall, and F1-score on Tokopedia review data labelled according to user ratings. The fine-tuned BERT model achieves the best and most consistent results, with 92% accuracy and an F1-score of 0.92 for each class. This indicates that fine-tuned BERT can effectively capture the semantic context of user reviews, and its consistent performance across classes makes it suitable for reliable sentiment classification in real-world applications. Furthermore, the predictions of the fine-tuned BERT model are explained with Local Interpretable Model-agnostic Explanations (LIME) to identify the features, in this case words, that indicate positive or negative sentiment. These words are highlighted in color: orange for positive and blue for negative. This makes the model more transparent and its predictions easier to trust.
Keywords: sentiment classification; Tokopedia; machine learning; BERT; XAI; LIME.
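
As an illustration of the evaluation metrics named in the abstract, the following is a minimal sketch of how accuracy and per-class precision, recall, and F1-score can be computed from true and predicted labels. The function name `per_class_metrics` and the toy labels are hypothetical and serve only as an example; they are not the study's data or code.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return overall accuracy and a dict of per-class precision, recall, F1."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[t] += 1          # t was the true class but was missed

    report = {}
    for c in labels:
        predicted = tp[c] + fp[c]
        actual = tp[c] + fn[c]
        precision = tp[c] / predicted if predicted else 0.0
        recall = tp[c] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[c] = {"precision": precision, "recall": recall, "f1": f1}

    accuracy = sum(tp.values()) / len(y_true)
    return accuracy, report


# Hypothetical toy labels, for illustration only.
y_true = ["positive", "positive", "positive", "negative", "negative", "negative"]
y_pred = ["positive", "positive", "negative", "negative", "negative", "positive"]

accuracy, report = per_class_metrics(y_true, y_pred)
print(f"accuracy = {accuracy:.2f}")
for label, scores in report.items():
    print(label, {name: round(value, 2) for name, value in scores.items()})
```

Reporting the scores per class, rather than only the overall accuracy, is what allows a statement such as "an F1-score of 0.92 for each class" to be checked for consistency across the positive and negative classes.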