JUCS - Journal of Universal Computer Science 29(5): 491-507, doi: 10.3897/jucs.94081
Developed Models Based on Transfer Learning for Improving Fake News Predictions
Tahseen A. Wotaifi‡, Ban N. Dhannoon§
‡ University of Babylon, Babil, Iraq
§ Al-Nahrain University, Baghdad, Iraq
Open Access
Amid global concern about the spread of fake news on social media, a large body of research has emerged to address the phenomenon. The rapid growth of social media and online forums has made it easy for legitimate news to become mixed with misleading news, which distorts people’s perceptions. This study therefore uses deep learning, pre-trained models, and machine learning to predict Arabic and English fake news on three publicly available datasets: the Fake-or-Real dataset, the AraNews dataset, and the Sentimental LIAR dataset. A hybrid network based on the GloVe (Global Vectors) and FastText pre-trained models is proposed to improve fake news prediction. In this network, a CNN (Convolutional Neural Network) identifies the most important features, a BiGRU (Bidirectional Gated Recurrent Unit) captures long-term dependencies in the sequences, and a multi-layer perceptron (MLP) classifies each news article as fake or real. In addition, an Improved Random Forest Model is built on the embedding values extracted from the BERT (Bidirectional Encoder Representations from Transformers) pre-trained model together with the relevant speaker-based features. These relevant features are identified by a fuzzy model based on feature selection methods. Accuracy was used to measure the quality of the proposed models: prediction accuracy reached 0.9935, 0.9473, and 0.7481 on the Fake-or-Real, AraNews, and Sentimental LIAR datasets, respectively. The proposed models showed a significant improvement in the accuracy of predicting Arabic and English fake news compared with previous studies that used the same datasets.
Fake News, Pre-trained Model, Hybrid Network, Improved Random Forest, Fuzzy Model
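To make the data flow of the hybrid network in the abstract concrete, the following is a minimal, untrained forward-pass sketch in plain NumPy: embedding lookup, then a 1D convolution with max pooling, then a bidirectional GRU, then a small MLP with a sigmoid output. It is an illustration under stated assumptions and not the authors' implementation. The layer sizes, kernel width, pooling size, random weights (standing in for GloVe or FastText vectors), the reading of the output as a "fake" probability, and all function names are hypothetical, since the abstract does not specify them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; the abstract does not give the real hyperparameters.
VOCAB, EMB_DIM, FILTERS, KERNEL, POOL, HIDDEN, DENSE = 1000, 50, 32, 3, 2, 16, 8


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def conv1d_relu(x, w, b):
    """Valid 1D convolution over time with ReLU. x: (T, D), w: (K, D, F)."""
    k = w.shape[0]
    windows = np.stack([x[t:t + k] for t in range(x.shape[0] - k + 1)])
    out = np.tensordot(windows, w, axes=([1, 2], [0, 1])) + b  # (T-k+1, F)
    return np.maximum(out, 0.0)


def max_pool1d(x, size):
    """Non-overlapping max pooling over time; a trailing remainder is dropped."""
    steps = x.shape[0] // size
    return x[:steps * size].reshape(steps, size, x.shape[1]).max(axis=1)


def init_gru(in_dim, hidden):
    """Random GRU weights for the update (z), reset (r) and candidate (h) parts."""
    p = {}
    for gate in "zrh":
        p["W" + gate] = rng.normal(0, 0.1, (in_dim, hidden))
        p["U" + gate] = rng.normal(0, 0.1, (hidden, hidden))
        p["b" + gate] = np.zeros(hidden)
    return p


def gru_last_state(x, p):
    """Run a GRU over x (T, F) and return the final hidden state (H,)."""
    h = np.zeros(p["Uz"].shape[0])
    for x_t in x:
        z = sigmoid(x_t @ p["Wz"] + h @ p["Uz"] + p["bz"])
        r = sigmoid(x_t @ p["Wr"] + h @ p["Ur"] + p["br"])
        h_cand = np.tanh(x_t @ p["Wh"] + (r * h) @ p["Uh"] + p["bh"])
        h = (1.0 - z) * h + z * h_cand
    return h


params = {
    "emb": rng.normal(0, 0.1, (VOCAB, EMB_DIM)),  # stand-in for pre-trained vectors
    "conv_w": rng.normal(0, 0.1, (KERNEL, EMB_DIM, FILTERS)),
    "conv_b": np.zeros(FILTERS),
    "gru_fwd": init_gru(FILTERS, HIDDEN),
    "gru_bwd": init_gru(FILTERS, HIDDEN),
    "w1": rng.normal(0, 0.1, (2 * HIDDEN, DENSE)),
    "b1": np.zeros(DENSE),
    "w2": rng.normal(0, 0.1, (DENSE, 1)),
    "b2": np.zeros(1),
}


def predict_fake_probability(token_ids, p):
    """Embedding -> CNN -> BiGRU -> MLP; returns a value in (0, 1)."""
    x = p["emb"][token_ids]                        # (T, EMB_DIM)
    x = conv1d_relu(x, p["conv_w"], p["conv_b"])   # local n-gram features
    x = max_pool1d(x, POOL)                        # keep the strongest responses
    h = np.concatenate([
        gru_last_state(x, p["gru_fwd"]),           # left-to-right pass
        gru_last_state(x[::-1], p["gru_bwd"]),     # right-to-left pass
    ])
    hidden = np.maximum(h @ p["w1"] + p["b1"], 0.0)
    return float(sigmoid(hidden @ p["w2"] + p["b2"])[0])


tokens = rng.integers(0, VOCAB, size=40)  # a made-up 40-token article
prob = predict_fake_probability(tokens, params)
print(f"untrained score: {prob:.3f}")
```

Because the weights are random, the printed score carries no meaning; the sketch only shows how the CNN output feeds the BiGRU and how the two final GRU states are concatenated before the MLP. In practice a deep learning framework would be used so the weights can be trained.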