JUCS - Journal of Universal Computer Science 23(8): 755-768, doi: 10.3217/jucs-023-08-0755
Comparative Evaluation of Algorithms for Sentiment Analysis over Social Networking Services
Akrivi Krouska, Christos Troussas, Maria Virvou
‡ University of Piraeus, Piraeus, Greece
Open Access
Abstract
Twitter is a highly popular social networking service and web-based communication platform on which millions of users exchange public messages daily, known as tweets, expressing their opinions and feelings about various issues. Twitter represents one of the largest and most dynamic datasets for data mining and sentiment analysis. Therefore, Twitter Sentiment Analysis constitutes a prominent and active research area with significant applications in industry and academia. The purpose of this paper is to provide a guideline for choosing the optimal algorithms for sentiment analysis services. In this context, five well-known learning-based classifiers (Naive Bayes, Support Vector Machine, k-Nearest Neighbor, Logistic Regression and C4.5) and a lexicon-based approach (SentiStrength) have been evaluated based on confusion matrices, using three different datasets (OMD, HCR and STS-Gold) and two test models (percentage split and cross validation). The results demonstrate the superiority of Naive Bayes and Support Vector Machine regardless of datasets and test methods.
Keywords
social networking services, Twitter, sentiment analysis, polarity detection, learning machines, lexicon-based classification