JUCS - Journal of Universal Computer Science 29(11): 1319-1335, doi: 10.3897/jucs.112604
Deep Random Forest and AraBert for Hate Speech Detection from Arabic Tweets
Kheir Eddine Daouadi, Yaakoub Boualleg, Oussama Guehairia§
‡ Echahid Cheikh Larbi Tebessi University, Tebessa, Algeria
§ Mohamed Khider University of Biskra, Biskra, Algeria
Open Access
Abstract
Hate speech detection from Arabic tweets has attracted considerable research attention, and numerous systems and techniques have been proposed to address this classification challenge. Nonetheless, three major limitations persist: the use of deep learning models with an excessive number of hyperparameters, the reliance on hand-crafted features, and the need for a large amount of training data to achieve satisfactory performance. In this study, we propose Contextual Deep Random Forest (CDRF), a hate speech detection approach that combines contextual embedding with Deep Random Forest. The experimental findings show that the Arabic contextual embedding model is highly effective for hate speech detection, outperforming the static embedding models. Additionally, we show that the proposed CDRF significantly enhances the performance of Arabic hate speech classification.
Keywords
Twitter, Hate Speech Detection, Arabic Tweet Classification, Contextual Deep Random Forest, Fine-tuning, Pre-trained Contextual Embedding