Design Amharic Text Sentiment Analysis Model Using Machine Learning Techniques. In Case of Restaurant Reviews

Gedif, Birku; Assefa, Yibeltal

URI: http://hdl.handle.net/123456789/8607

Date: 2025-02-12

Abstract:

Sentiment analysis is a natural language processing task that tracks public attitudes toward a particular product, service, or topic. It is a challenging research area that covers many novel sub-problems, and both business organizations and academics are working to find the best system for it. This study focuses on unstructured Amharic restaurant reviews collected from the web. Its objective was to design an Amharic text sentiment analysis model using supervised machine learning techniques and to evaluate the performance of the classifiers. The paper explores three supervised classification approaches (naïve Bayes, support vector machine (SVM), and k-nearest neighbor (KNN)) with different feature selection schemes to obtain a sentence-level sentiment analysis model for a domain-specific restaurant review dataset. The proposed model has the following components: data preparation; preprocessing (tokenization, normalization, and stop-word filtering); feature extraction and selection to build the feature vector; and polarity classification. Performance analysis was carried out on the classifiers using n-gram features. In the experiments, all three algorithms achieved good accuracy. With bigram features, term frequency (TF) and TF-IDF gave the maximum accuracies for SVM, 80.43% and 79.49% respectively. Term frequency and term occurrence gave the maximum accuracies for the naïve Bayes classifier, 78.37% and 78.00% respectively, also with bigram features. TF-IDF gave the maximum accuracy for KNN, 78.00%, with 4-gram features. The main challenge was that opinion holders sometimes use objective text to express their opinions, and the classifier did not distinguish such factual statements from opinions. Complexities of this kind make sentiment mining more challenging, and resolving this one requires subjectivity/objectivity classification.
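
The three feature weighting schemes named in the abstract (term occurrence, term frequency, and TF-IDF over word n-grams) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, the exact formulas used in the study are not given in the abstract, and the English placeholder tokens stand in for Amharic tokens that would already have been tokenized, normalized, and filtered for stop words. TF is assumed here to be the count divided by the number of n-grams in the sentence, and IDF is assumed to be log(N / df), one common variant.

```python
import math
from collections import Counter


def word_ngrams(tokens, n=2):
    """Return the word n-grams (as tuples) of one tokenized sentence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def term_occurrence(tokens, n=2):
    """Raw count of each n-gram in one sentence."""
    return Counter(word_ngrams(tokens, n))


def term_frequency(tokens, n=2):
    """Counts divided by the total number of n-grams in the sentence (assumed definition)."""
    counts = term_occurrence(tokens, n)
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()} if total else {}


def tf_idf(corpus, n=2):
    """TF-IDF weights per sentence, using idf = log(N / df) (assumed variant)."""
    tfs = [term_frequency(tokens, n) for tokens in corpus]
    # Document frequency: number of sentences containing each n-gram.
    df = Counter(gram for tf in tfs for gram in tf)
    n_docs = len(corpus)
    return [
        {gram: w * math.log(n_docs / df[gram]) for gram, w in tf.items()}
        for tf in tfs
    ]


# Placeholder sentences (hypothetical; real input would be preprocessed Amharic reviews).
corpus = [
    ["food", "was", "very", "good"],
    ["service", "was", "very", "slow"],
    ["food", "was", "cold"],
]

occ = term_occurrence(corpus[0])   # bigram counts for the first sentence
tf = term_frequency(corpus[0])     # normalized bigram frequencies
weights = tf_idf(corpus)           # one weight dictionary per sentence
```

Each resulting dictionary maps an n-gram to its weight and would be turned into a fixed-length feature vector before being passed to a classifier such as naïve Bayes, SVM, or KNN. Changing `n` to 4 gives the 4-gram setting mentioned for KNN.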
