Corresponding author: Jesus Serrano-Guerrero (jesus.serrano@uclm.es) © Jesus Serrano-Guerrero, Bashar Alshouha, Francisco P. Romero, Jose A. Olivas. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY-ND 4.0). This license allows reusers to copy and distribute the material in any medium or format in unadapted form only, and only so long as attribution is given to the creator. The license allows for commercial use. Citation:
Serrano-Guerrero J, Alshouha B, Romero FP, Olivas JA (2022) Affective Knowledge-enhanced Emotion Detection in Arabic Language: A Comparative Study. JUCS - Journal of Universal Computer Science 28(7): 733-757. https://doi.org/10.3897/jucs.72590
Online opinions and reviews carry sentiments and emotions that can be very useful, especially to Internet suppliers, who can learn from them whether their services and products are meeting their customers' expectations. To detect these sentiments and emotions, most applications resort to lexicon-based approaches. The major issue is that most well-known emotion lexicons have been developed for the English language; for other languages such as Arabic, fewer tools are available, and their quality is often poor.
The goal of this study is to compare the performance of two types of algorithms, shallow machine learning-based and deep learning-based, on emotion detection in the Arabic language. To improve the performance of the algorithms, two lexicons, originally developed in other languages and translated into Arabic, were used to add emotional features to the different information models used to represent opinions. All approaches were tested on two datasets: SemEval 2018 Task 1: Affect in Tweets, and LAMA+DINA. The semantic approaches outperform the classical algorithms; that is, the information provided by the lexicons clearly improves the results of the algorithms. In particular, the BiLSTM algorithm outperforms the rest of the algorithms using word2vec. In contrast to other languages, the best results were obtained using the NRC lexicon.
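The idea of adding emotional features from a lexicon to an information model can be illustrated with a minimal sketch. It is not the implementation used in the study: the emotion set, the four-entry toy lexicon, the vocabulary, and the function names `emotion_features`, `bag_of_words`, and `augmented_vector` are all hypothetical, and a plain bag-of-words count vector stands in for whichever representation is actually used.

```python
from collections import Counter

# Hypothetical emotion categories; a real lexicon may define a different set.
EMOTIONS = ["anger", "fear", "joy", "sadness"]

# Hypothetical toy lexicon mapping Arabic words to emotion categories.
# It only illustrates the data shape; it is not taken from any real lexicon.
TOY_LEXICON = {
    "سعيد": {"joy"},      # happy
    "حزين": {"sadness"},  # sad
    "خائف": {"fear"},     # afraid
    "غاضب": {"anger"},    # angry
}


def emotion_features(tokens, lexicon=TOY_LEXICON, emotions=EMOTIONS):
    """Count how many tokens the lexicon associates with each emotion."""
    counts = Counter()
    for token in tokens:
        for emotion in lexicon.get(token, ()):
            counts[emotion] += 1
    return [counts[emotion] for emotion in emotions]


def bag_of_words(tokens, vocabulary):
    """Represent a text as raw term counts over a fixed vocabulary."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


def augmented_vector(tokens, vocabulary, lexicon=TOY_LEXICON, emotions=EMOTIONS):
    """Concatenate the text representation with the lexicon-based features."""
    return bag_of_words(tokens, vocabulary) + emotion_features(tokens, lexicon, emotions)


if __name__ == "__main__":
    vocabulary = ["أنا", "سعيد", "حزين"]  # hypothetical vocabulary
    tokens = ["أنا", "سعيد", "اليوم"]     # "I am happy today", already tokenized
    print(augmented_vector(tokens, vocabulary))
```

In this sketch the lexicon contributes one extra dimension per emotion, so any classifier that consumes the base vector can consume the augmented one without modification. The same concatenation could in principle be applied to dense representations such as averaged word embeddings.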