JUCS - Journal of Universal Computer Science 30(5): 674-693, doi: 10.3897/jucs.104790

Automatic Sarcasm Detection on Cross-Platform Social Media Datasets: A GLoVe and Bi-LSTM Based Approach

Saima Farhan^‡, Rubiya Shoukat^‡, Aqsa Aslam^‡

‡ Lahore College for Women University, Lahore, Pakistan

Corresponding author: Aqsa Aslam (aqsa.aslam28@gmail.com)

This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY-ND 4.0). This license allows reusers to copy and distribute the material in any medium or format in unadapted form only, and only so long as attribution is given to the creator. The license allows for commercial use.

Citation: Farhan S, Shoukat R, Aslam A (2024) Automatic Sarcasm Detection on Cross-Platform Social Media Datasets: A GLoVe and Bi-LSTM Based Approach. JUCS - Journal of Universal Computer Science 30(5): 674-693. https://doi.org/10.3897/jucs.104790

Abstract

Sarcastic remarks have become commonplace on social media platforms, where people express negative feelings in a seemingly positive or mocking manner. This contradictory nature of sarcasm makes its detection a very challenging task. Many researchers have proposed solutions for automatic sarcasm detection on single-domain datasets. Most of them have considered only the content of the text and have ignored its context, although context is the most important factor in determining whether a text is sarcastic or not. This study aims to detect sarcastic remarks in a multi-domain dataset by using a Bi-LSTM model with pre-trained GloVe word embeddings, because both GloVe embeddings and Bi-LSTM are good at capturing contextual information from the provided data. The dataset is generated by concatenating three publicly available datasets: the gosh tweets, the news headlines dataset, and the sarcasm corpus v2 dataset. The GloVe embeddings extract contextual and semantic features from the text, and the Bi-LSTM model is trained and tested on those features. The proposed model achieved an accuracy of 86.35%, with recall, precision, and F1-score of 88% each. Different experiments were conducted to test the model's reliability. The findings indicate that the proposed model yields state-of-the-art or comparable results. The proposed study aids in improving the performance of sentiment analysis. It will also help individuals and organizations to accurately identify people's sentiments about individuals or an organization's products.

Keywords

NLP, Sentiment Analysis, Sarcasm, Deep Learning, Bi-LSTM, GloVe embeddings