Deep Learning in Arabic Tweets Fake News Detection: Comparison
Please note this is a comparison between Version 1 by Manal Kalkatawi and Version 2 by Lindsay Dong.

Fake news has been around for a long time, but the rise of social networking applications in recent years has rapidly accelerated its spread among individuals. Fake news negatively impacts various aspects of life (economic, social, and political). Identifying fake news manually on these open platforms is challenging, as they allow anyone to build networks and publish news in real time. Therefore, building automatic systems that assess news credibility on social networks using artificial intelligence techniques, including machine learning and deep learning, has attracted the attention of researchers. Deep learning methods have shown promising results in recognizing fake news written in English.

  • Arabic language
  • Twitter
  • fake news
  • deep learning

1. Introduction

In recent years, the rapid growth of social networks has facilitated the exchange of news among users. Social networks can be used to inform society about the latest news, but they can also be a source of fake news. Twitter is considered one of the most widespread social networks in the Arab region [1]. Posting news on Twitter is less costly in terms of both money and time than any other medium. Its simplicity and lack of content monitoring enable fake news to reach a wide range of users rapidly [2]. Fake news refers to false, misleading, and fabricated news delivered intentionally [3]. Fake news dissemination aims to deceive the audience for political, social, or financial gain. Consequently, fake news poses significant risks to individuals, organizations, and governments [4]. Thus, there is an urgent need for efficient techniques to detect and eliminate fake news in social networks to prevent its negative impacts.
Many fact-checking websites, such as Anti-Rumors Authority and Misbar, have been launched to check the veracity of news propagated on the internet in an early attempt to reduce the impact of fabricated news. These websites depend on human experts to manually confirm or reject the validity of the news [2]. Handling a large volume of news manually consumes time and effort and does not scale. More recently, machine learning and deep learning methods have become popular for tackling anomaly detection problems such as fake news detection [5]. These methods currently adopt two types of learning techniques to construct automated systems for false news detection on social networks: news content-based learning and social context-based learning. The news content-based methodology mainly focuses on the writing style of the news text to discover syntactic or semantic patterns for classifying news. The social context-based methodology primarily analyses user behaviour and engagement on social media. Social context features can be extracted from user profiles, discussions, and the networks connecting users [4].
Informal writing styles, spelling errors, and the use of diverse dialects make processing Arabic content on social media more difficult. Other challenges that aggravate the processing complexity are the massive vocabulary and complex morphological patterns of the Arabic language, as well as the limited availability of Arabic datasets. As a result of these difficulties, little research has focused on detecting and eliminating fake news in Arabic. Few studies have proposed models using deep learning algorithms to identify fabricated news posted on the Twitter platform [3][5][6]. In general, these models were trained to target a specific topic of fake news posts and relied only on the textual content of the Tweets to produce a classification. Furthermore, the detection performance of the current models still requires improvement.

2. Deep Learning in Arabic Tweets Fake News Detection

Detecting fake news in Arabic is still in its infancy compared with other languages, such as English. The models are discussed below and summarized in Table 1.

Table 1. A summary of existing detection approaches for fake news in Arabic.

| Ref | Year | Dataset | Topic | Classification Approach | Feature Type: News Content | Feature Type: Social Context | Textual Feature Representations | Result |
|---|---|---|---|---|---|---|---|---|
| [7] | 2018 | 800 Tweets | General | NB, SVM, DT | | | - | Accuracy 0.899 |
| [8] | 2019 | 177 Tweets | General | EM | | | - | F1-score 0.80 |
| [9] | 2019 | 268 labeled blog posts, 20,392 unlabeled blog posts | General | CNN | | | Word2vec (CBOW), char-level embeddings | F1-score 0.63 |
| [10] | 2019 | 9000 Tweets | General | RF, SVM, DT, NB | | | - | F1-score 0.776 |
| [11] | 2019 | 1862 Tweets | Syrian crisis | LR, RF, DT, AdaBoost | | | - | Accuracy 0.76 |
| [12] | 2020 | 4547 news | General | LSTM, mBERT | | | Word-level embeddings, char-level embeddings, mBERT | F1-score 0.643 |
| [13] | 2020 | AraNews (97,310 news), ATB (48,655 news), ANS (4547 news) | General | mBERT, AraBERT, XLM-R Base, XLM-R Large | | | mBERT, AraBERT, XLM-R Base, XLM-R Large | F1-score 0.70 |
| [14] | 2020 | 6895 news articles | Political | NB, XGBoost, CNN | | | BOW, TF-IDF, fastText | F1-score 0.984 |
| [1] | 2021 | 1862 Tweets | Syrian crisis | KNN, DT, NB, LR, LDA, SVM, RF, XGBoost | | | TF, TF-IDF, BoW | Accuracy 0.82 |
| [15] | 2021 | 37,000 Tweets | COVID-19 | NB, LR, SVM, MLP, RF, XGB | | | BOW, TF-IDF | F1-score 0.933 |
| [6] | 2021 | 10,828 Tweets | COVID-19 | AraBERT, mBERT, distilBERT-multi, mBERT COV19, AraBERT COV19 | | | AraBERT, mBERT, distilBERT-multi, mBERT COV19, AraBERT COV19 | F1-score 0.9578 |
| [16] | 2021 | COVID-19-Fakes (70,959 Tweets), ArCOV19-Rumors (3032 Tweets), ANS (4091 news), AraNews (108,194 news) | COVID-19, general | CNN, RNN, GRU, AraBERT v1, AraBERT v2, AraBERT v02, QARiB, Ar-Electra, MARBERT, ARBERT | | | Word2vec, fastText, doc2vec, GloVe, AraBERT v1, AraBERT v2, AraBERT v02, QARiB, Ar-Electra, MARBERT, ARBERT | F1-score 0.95 |
| [3] | 2021 | 8786 Tweets | COVID-19 | XGB, RF, NB, SVM, SGD, CNN, RNN, CRNN | | | TF-IDF, word2vec, fastText | F1-score 0.54 |
| [17] | 2022 | 3157 Tweets | COVID-19 | LR, KNN, CART, SVM, NB, RF, AdaBoost, Bagging, ExtraTree | | | TF-IDF, GloVe | F1-score 0.935 |
| [18] | 2022 | 4299 Tweets | COVID-19 | RF, DT, XGBoost, SVM, KNN, NB, SGD, LR, RNN, BiRNN, GRU, BiGRU, LSTM, BiLSTM | | | N-Gram, TF-IDF, word2vec | Accuracy 0.81 |
| [19] | 2022 | 1098 news articles | Hajj | SVM, RF, NB | | | - | F1-score 0.79 |
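
The Result column mixes accuracy and F1-score, so the reported numbers are not always directly comparable. As a minimal sketch (the labels and predictions below are made-up toy data, not taken from any cited study, and `binary_metrics` is a hypothetical helper name), the following shows how the two metrics can diverge when fake items are the minority class:

```python
def binary_metrics(y_true, y_pred, positive="fake"):
    """Return precision, recall, F1 and accuracy for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Hypothetical toy data: 3 fake and 7 real items.
# The classifier flags only one item as fake (correctly) and misses two.
y_true = ["fake"] * 3 + ["real"] * 7
y_pred = ["fake"] + ["real"] * 9
scores = binary_metrics(y_true, y_pred)
print(scores)
```

In this toy case the accuracy is 0.8 while the F1-score for the fake class is about 0.5, because two of the three fake items are missed. This is one reason to read accuracy and F1 figures from different studies with care.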
Detection models based on news content features make up the majority of current studies on news verification. Helwe et al. [9] developed an approach using two CNN models to assess the credibility of Arabic weblog posts. Both models have the same layers except for the embedding layer: the first model used pre-trained word-level embeddings, while the second used character-level embeddings. In the first iteration, each model was trained on the labelled dataset; afterwards, each model's predictions on the unlabelled data were selected to re-train the other model. In the experiment, they compared the proposed model to a support vector machine (SVM) trained with a TF-IDF feature representation, a CNN trained with character-level vector representations, a CNN trained with word-level vector representations, and a combined model based on Word-CNN and Char-CNN. Their proposed model achieved the highest F1-score, 0.63. The fundamental limitation of the work is the small amount of labelled data.
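
The exchange of predictions between a word-level and a character-level model can be sketched as a co-training loop. The code below is only an illustration under stated assumptions: a tiny Naive Bayes classifier stands in for the CNNs, the selection rule (most confident prediction first) and the toy English texts are invented for the example, and none of the names come from the cited work.

```python
import math
from collections import Counter


def word_view(text):
    """Word-level view: whitespace tokens."""
    return text.split()


def char_view(text, n=3):
    """Character-level view: character n-grams."""
    s = text.replace(" ", "_")
    return [s[i:i + n] for i in range(max(len(s) - n + 1, 1))]


class NaiveBayes:
    """Tiny multinomial Naive Bayes (a stand-in for a CNN, by assumption)."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.prior = Counter(labels)
        self.counts = {c: Counter() for c in self.classes}
        for feats, y in zip(docs, labels):
            self.counts[y].update(feats)
        self.vocab = {f for c in self.classes for f in self.counts[c]}
        return self

    def predict_proba(self, feats):
        logp = {}
        for c in self.classes:
            total = sum(self.counts[c].values()) + len(self.vocab)
            lp = math.log(self.prior[c])
            for f in feats:
                lp += math.log((self.counts[c][f] + 1) / total)  # add-one smoothing
            logp[c] = lp
        top = max(logp.values())
        exp = {c: math.exp(v - top) for c, v in logp.items()}
        z = sum(exp.values())
        return {c: v / z for c, v in exp.items()}


def co_train(labeled, unlabeled, rounds=2, per_round=1):
    """Each model pseudo-labels unlabeled texts for the OTHER model."""
    views = [word_view, char_view]
    train_sets = [list(labeled), list(labeled)]  # one training set per model
    pool = list(unlabeled)
    models = [None, None]
    for r in range(rounds + 1):
        # (Re-)train each model on its current training set.
        for i, view in enumerate(views):
            texts, labels = zip(*train_sets[i])
            models[i] = NaiveBayes().fit([view(t) for t in texts], labels)
        if r == rounds or not pool:
            break
        # Move each model's most confident predictions to the other model.
        for i, view in enumerate(views):
            scored = []
            for text in pool:
                proba = models[i].predict_proba(view(text))
                label = max(proba, key=proba.get)
                scored.append((proba[label], text, label))
            scored.sort(reverse=True)
            for _, text, label in scored[:per_round]:
                train_sets[1 - i].append((text, label))
                pool.remove(text)
    return models


# Hypothetical toy data, for illustration only.
labeled = [("miracle cure shocking secret", "fake"),
           ("ministry confirms official report", "real")]
unlabeled = ["shocking miracle secret revealed",
             "official ministry statement report"]
word_model, char_model = co_train(labeled, unlabeled)
print(word_model.predict_proba(word_view("shocking miracle cure")))
```

The design point is that the two views make partly independent errors, so confident predictions from one model can serve as extra training signal for the other when labelled data is scarce.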
The previous study [9] focused on examining the credibility of news blogs. Other works [12][13][19] assessed news verification models using fabricated news generated from real news stories by modifying their semantics. Jude Khouja [12] proposed a system for claim verification based on the textual information of news. The work introduced a publicly available Arabic News Stance (ANS) dataset to determine the veracity of claims. The author acquired a subset of news titles from the Arabic news texts (ANT) dataset and modified the titles to generate fake claims. Two approaches were used to train and test claim classification on the generated dataset: long short-term memory (LSTM) and a pre-trained BERT model. The study reported that LSTM achieved the highest result for false claim recognition with an F1-score of 0.643. Nagoudi et al. [13] developed a method for automatically manipulating real multi-topic news to generate a fake news dataset, AraNews. They used transformer-based pre-trained models to detect manipulated Arabic news. Furthermore, they experimented with various modelling settings to examine the impact of their generated data on fake news verification models compared to a human-created fake news dataset. The authors reported that automatically generated news positively affects the fake news detection task. Compared to previous work [12], it improved the F1-score by 0.0576. The ANS and AraNews datasets were utilized by another work [16], which examined the performance of language models such as AraBERT, QARiB, and AraGPT2 when applied to the Arabic fake news detection task. Each model was trained and evaluated using the ANS and AraNews datasets. The results showed that AraBERT and QARiB had some ability to identify false news, with a similar accuracy of 0.80. In both experiments, AraGPT2 achieved the lowest accuracy.
Himdi et al. [19] proposed a machine learning model to assess the veracity of Arabic news articles. They gathered factual news articles related to a single domain, the Hajj. The acquired dataset was then used to construct fake news articles through crowdsourcing. They extracted a set of linguistic features, including emotional, syntactic, polarity, and part-of-speech features. The extracted features were used to train three classifiers, Naïve Bayes (NB), Random Forest (RF), and SVM, to detect Arabic false news. The results demonstrated that the extracted linguistic features could effectively detect fake news, and the best classifier was RF, which had a 0.79 accuracy rate.
Another approach, described in [14], exploited textual features to automatically identify satirical news. The authors released a dataset collected from a variety of news websites. They analysed the linguistic properties of the news and concluded that false news involves highly positive and negative keywords and tends to be written in a more subjective tone. Machine learning and deep learning models were trained to identify satirical news. A CNN with pre-trained word embeddings achieved the highest performance with an accuracy of 0.98.
During the COVID-19 pandemic, a large amount of false information was disseminated through various social networking applications. The effect of misinformation is not confined to individual lives but also extends to society and the economy. Several studies [3][6][15][16][17][18] have concentrated on assessing the credibility of information related to the spread of COVID-19 in Arabic communities via social media platforms such as Twitter.
Combining information from both news content and social context sources may result in a better detection rate. Incorporating social context features, such as user behaviour and user profiles, with news content features is uncommon in deep learning-based studies. Several efforts have been made to propose models for detecting misinformation using traditional machine learning algorithms that draw on both news content and social context aspects [1][7][8][10][11].
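
One simple way to combine the two feature families is to concatenate a text representation with numeric user-profile features before classification. The sketch below is only an assumption-laden illustration: the vocabulary, the tweet dictionary layout, and the profile fields (`followers`, `verified`, `account_age_days`) are hypothetical and do not reproduce the feature sets of the cited studies.

```python
import math
from collections import Counter


def content_features(text, vocabulary):
    """News content view: bag-of-words counts over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocabulary]


def social_features(user):
    """Social context view: hypothetical user-profile signals."""
    return [
        math.log1p(user["followers"]),         # damp very large follower counts
        1.0 if user["verified"] else 0.0,      # verified-account flag
        math.log1p(user["account_age_days"]),  # older accounts score higher
    ]


def fuse(tweet, vocabulary):
    """Concatenate both views into one feature vector."""
    return content_features(tweet["text"], vocabulary) + social_features(tweet["user"])


# Hypothetical example record, for illustration only.
vocabulary = ["vaccine", "official", "cure"]
tweet = {
    "text": "Miracle cure replaces vaccine",
    "user": {"followers": 120, "verified": True, "account_age_days": 30},
}
vector = fuse(tweet, vocabulary)
print(vector)
```

The fused vector can then be passed to any classifier. Scaling the profile features (here with a logarithm) keeps them in a range comparable to the word counts.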
The detection of Arabic Tweets containing fabricated news is still in its early stages. As a result, few Arabic datasets for detecting fake news Tweets are publicly available to the research community. Few studies have addressed the veracity of Arabic Tweets using deep learning techniques, and the majority of them focus on detecting fake news relating to a certain topic, such as COVID-19. Additionally, they relied only on Tweet text to produce a classification. A major challenge for these identification systems is that the underlying textual characteristics vary across different types of fake news. For this reason, models that use only the textual content of news Tweets may have a generalizability issue. In contrast, machine learning models commonly use both news content and social context features to identify various types of fake news more accurately. However, the existing models' detection performance still needs to be improved.