JUCS - Journal of Universal Computer Science 23(8): 755-768, doi: 10.3217/jucs-023-08-0755

Comparative Evaluation of Algorithms for Sentiment Analysis over Social Networking Services

Akrivi Krouska^‡, Christos Troussas^‡, Maria Virvou^‡

‡ University of Piraeus, Piraeus, Greece

Corresponding author: Akrivi Krouska (akrouska@unipi.gr)

This article is freely available under the J.UCS Open Content License.

Citation: Krouska A, Troussas C, Virvou M (2017) Comparative Evaluation of Algorithms for Sentiment Analysis over Social Networking Services. JUCS - Journal of Universal Computer Science 23(8): 755-768. https://doi.org/10.3217/jucs-023-08-0755

Abstract

Twitter is a highly popular social networking service and web-based communication platform on which millions of users exchange public messages, known as tweets, every day, expressing their opinions and feelings on various issues. Twitter represents one of the largest and most dynamic datasets for data mining and sentiment analysis. Therefore, Twitter Sentiment Analysis constitutes a prominent and active research area with significant applications in industry and academia. The purpose of this paper is to provide a guideline for selecting the most suitable algorithms for sentiment analysis services. In this context, five well-known learning-based classifiers (Naive Bayes, Support Vector Machine, k-Nearest Neighbor, Logistic Regression and C4.5) and a lexicon-based approach (SentiStrength) were evaluated on the basis of confusion matrices, using three different datasets (OMD, HCR and STS-Gold) and two test methods (percentage split and cross-validation). The results demonstrate the superiority of Naive Bayes and Support Vector Machine regardless of dataset and test method.
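To make the evaluation terms concrete, the following is a minimal Python sketch of how confusion-matrix metrics, a percentage split and k-fold cross-validation can be computed. It is an illustration only, not the authors' experimental setup: the function names, the 70% training share, the fold count and the toy labels are all assumptions, and the splits are neither shuffled nor stratified.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, positive="positive"):
    """Count (tp, fp, fn, tn) for a binary polarity task."""
    counts = Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == positive:
            counts["tp" if guess == positive else "fn"] += 1
        else:
            counts["fp" if guess == positive else "tn"] += 1
    return counts["tp"], counts["fp"], counts["fn"], counts["tn"]


def metrics(tp, fp, fn, tn):
    """Derive accuracy, precision, recall and F1 from the four counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def percentage_split(n_samples, train_percent=70):
    """Split indices once; train_percent is an assumed value."""
    cut = n_samples * train_percent // 100
    indices = list(range(n_samples))
    return indices[:cut], indices[cut:]


def k_fold_splits(n_samples, k=10):
    """Yield (train, test) index lists; each sample is tested exactly once."""
    indices = list(range(n_samples))
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


if __name__ == "__main__":
    # Toy labels, invented for illustration only.
    y_true = ["positive", "positive", "negative", "negative", "positive"]
    y_pred = ["positive", "negative", "negative", "positive", "positive"]
    print(metrics(*confusion_matrix(y_true, y_pred)))
```

A percentage split yields a single estimate from one held-out portion, whereas cross-validation averages the metrics over k held-out folds, which is why the two can rank classifiers differently on small datasets.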

Keywords

social networking services, Twitter, sentiment analysis, polarity detection, learning machines, lexicon-based classification