Multilingual Evidence for Fake News Detection: Comparison
Please note this is a comparison between Version 1 by Mikhail Kuimov and Version 2 by Rita Xu.

The rapid spread of deceptive information on the internet can have severe and irreparable consequences. As a result, it is important to develop technology that can detect fake news. Although significant progress has been made in this area, current methods are limited because they focus on only one language and do not incorporate multilingual information. Multiverse is a new feature based on multilingual evidence that can be used for fake news detection and can improve existing approaches.

  • fake news detection
  • multilinguality
  • news similarity

1. Introduction

The fast consumption of information from social media and news websites has become a daily routine for millions of users. Many readers have neither the time nor the interest (and/or skills) to fact-check every announced event. This opens up a wide range of opportunities to manipulate the opinions of citizens. One of them is fake news, which contains information about events that never happened in real life (or represents real events in extremely narrow and biased ways). The effects of fake news can range from damaging the reputation of a person, organization, or country to inciting immediate emotional reactions that lead to destructive actions in the physical world.
Since the exploitation of Facebook to influence public opinion during the 2016 U.S. presidential election [1], there has been significant interest in fake news. However, the dissemination of false information not only misinforms readers but can also result in much more serious consequences. For instance, the spreading of a baseless rumor alleging that Hillary Clinton was involved in child sex trafficking led to a dangerous situation at a Washington D.C. pizzeria [2]. The global pandemic in 2020 led to the rise of an infodemic [3], which could have even more severe consequences by exacerbating the epidemiological situation and endangering people’s health. Furthermore, the events of 2022 showed how politics and global events could be dramatically influenced by the spread of fake news. The Russia–Ukraine conflict was accompanied by an intense information war [4] featuring an enormous number of fake stories. Beyond politics, the World Cup 2022 was surrounded by rumors from both organizers and visitors that had an impact on security during the competition [5].
The issue of fake news has garnered significant public attention and has also become a subject of growing interest in academic circles. With the proliferation of online content, there is a great deal of optimism about the potential of automated methods for detecting fake news. Numerous studies have been conducted on fake news detection, utilizing a variety of information from diverse sources. While misinformation mitigation is represented in the artificial intelligence field by different tasks (e.g., stance detection, fact-checking, and source credibility classification, inter alia), researchers focus on the supervised fake news classification task.

2. User Behavior for Fake News Detection

Before discussing automatic fake news detection methods, researchers analyze how real-life users react to fake information and how they check its veracity. In [6][19], a very broad analysis of users’ behavior was conducted. The authors discovered that when people attempt to check information credibility, they rely on a limited set of features, such as:
  • Is this information compatible with other things that I believe to be true?
  • Is this information internally coherent? Do the pieces form a plausible story?
  • Does it come from a credible source?
  • Do other people believe it?
Thus, people can rely on the news text, its source, and their judgment. However, if they have enough internal motivation, they can also refer to external sources for evidence. These external sources can be knowledgeable sources or other people. The conclusions from [7][20] repeat the previous results: individuals rely on both their judgment of the source and the message. When these factors do not provide a definitive answer, people turn to external resources to authenticate the news. The intentional and institutional reactions sought confirmation from institutional sources, with some respondents answering simply “Google”. Moreover, several works have explored methods for countering fake information received by users and convincing them with facts. In [8][21], it was shown that explicitly emphasizing the myth, and even repeating it together with its refutation, can help users pay attention and remember the truth. Additionally, participants who received messages across different media platforms [9][22] and different perspectives on the information [10][23] showed greater awareness of news evidence. Consequently, information obtained from external searches is an important feature for evaluating news authenticity and seeking evidence. Furthermore, obtaining different perspectives from different media sources adds more confidence to the decision-making process.

3. Fake News Detection Datasets

Several news datasets focused on misinformation have been created to support automatic fake news detection, each with a different labeling strategy. The comparison of all discussed datasets is presented in Table 1. The Fake News Challenge (, accessed on 14 December 2022), launched in 2016, was a big step in identifying fake news. The objective of FNC-1 was a stance detection task [11][24]. The dataset includes 300 topics, with 5–20 news articles each. In total, it consists of 50,000 labeled claim-article pairs. The dataset was derived from the Emergent project [12][25]. Another publicly available dataset is LIAR [13][26]. This dataset contains 12,800 manually labeled short statements collected in various contexts from (, accessed on 14 December 2022). The contexts include news releases, TV or radio interviews, campaign speeches, etc. The labels for news truthfulness are fine-grained and span multiple classes: pants-fire, false, barely-true, half-true, mostly-true, and true. Claim verification is also the focus of the Fact Extraction and VERification dataset (FEVER) [14][27]; 185,445 claims were manually verified against the introductory sections of Wikipedia pages and classified as SUPPORTED, REFUTED, or NOTENOUGHINFO. For the first two classes, the annotators also recorded the sentences forming the necessary evidence for their judgments. FakeNewsNet [15][28] contains two comprehensive datasets that include news content, social context, and dynamic information. Moreover, unlike all of the datasets described above, this dataset stores a visual component in addition to the textual information. All news was collected via PolitiFact and GossipCop (, accessed on 31 August 2021) crawlers. In total, 187,014 fake and 415,645 real news items were crawled. Another dataset collected for supervised learning is the FakeNewsDataset [16][6]. The authors conducted a lot of manual work to collect and verify the data.
As a result, they managed to collect 240 fake and 240 legit news items on 6 different domains—sports, business, entertainment, politics, technology, and education. All of the news articles in the dataset are from the year 2018. One large dataset is NELA-GT-2018 [17][29]. In this dataset, the authors attempted to overcome some limitations that could be observed in previous works: (1) Engagement-driven—the majority of the datasets, for news articles and claims, contained only data that were highly engaged with on social media or received attention from fact-checking organizations; (2) lack of ground truth labels—all current large-scale news article datasets do not have any form of labeling for misinformation research. To overcome these limitations, they gathered a wide variety of news sources from varying levels of veracity and scraped article data from the gathered sources’ RSS feeds twice a day for 10 months in 2018. As a result, a new dataset was created consisting of 713,534 articles from 194 news and media producers.
Table 1. The datasets covered in related work. The majority of datasets for fake news detection tasks are in English.
Following the events of 2020, work has been ongoing toward creating COVID-19 fake news detection datasets. The COVID-19 Fake News dataset [23][7] is based on information from public fact-verification websites and social media. It consists of 10,700 tweets (5600 real and 5100 fake posts) related to COVID-19. In addition, the ReCOVery [20][32] multimodal dataset was created. It incorporates 140,820 labeled tweets and 2029 news articles on coronavirus collected from reliable and unreliable sources. However, all of the above datasets share one main limitation: they are monolingual and dedicated only to the English language. For languages other than English, the following datasets can be mentioned: the French satiric dataset [24][35], GermanFakeNC [21][33], the Spanish Fake News Corpus [22][34], and the Arabic Claims Dataset [18][30]. These datasets do not fully close the multilingualism gap in fake news detection: they are monolingual as well and mostly cover fake news classification tasks, missing, for instance, fact verification and evidence generation problems.
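The fine-grained label scheme described for LIAR is often reduced to two classes before training a binary classifier. The following is a minimal sketch of such a reduction, not a procedure taken from any of the cited papers: the function name `liar_to_binary`, the "fake"/"real" output strings, and the default cut-off at half-true are all assumptions made for illustration.

```python
# Six LIAR truthfulness labels, ordered from least to most truthful.
LIAR_LABELS = [
    "pants-fire",
    "false",
    "barely-true",
    "half-true",
    "mostly-true",
    "true",
]


def liar_to_binary(label, first_real="half-true"):
    """Collapse a fine-grained LIAR label into "fake" or "real".

    `first_real` is the lowest label still counted as "real". The default
    cut-off is an illustrative assumption, not a value from the dataset paper.
    Unknown labels raise ValueError (from list.index).
    """
    rank = LIAR_LABELS.index(label)
    threshold = LIAR_LABELS.index(first_real)
    return "real" if rank >= threshold else "fake"
```

Moving the cut-off changes the class balance, so the choice should be reported alongside any result obtained this way.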

4. Fake News Classification Methods

Based on the previously described datasets, multiple methods have been developed to tackle the problem of fake news classification. The feature sets used in all existing methods can be divided into two categories: (1) internal features that can be obtained from different preprocessing strategies and a linguistic analysis of the input text; (2) external features that are extracted from a knowledge base, the internet, or social networks, and give additional information about the facts from the news, its propagation in social media, and users’ reactions. In other words, internal methods rely on the text itself while external methods rely on meta-information about the text.

4.1. Methods Based on Internal Features

Linguistic and psycholinguistic features are helpful in fake news classification tasks. In [16][6], a strong baseline model using such a feature set was built on the FakeNewsDataset. The set of features used in this work is as follows:
  • Ngrams: tf–idf values of unigrams and bigrams from a bag-of-words representation of the input text.
  • Punctuation such as periods, commas, dashes, question marks, and exclamation marks.
  • Psycholinguistic features extracted with LIWC lexicon. Alongside some statistical information, LIWC also provides emotional and psychological analysis.
  • Readability that estimates the complexity of a text. The authors used content features such as the number of characters, complex words, long words, the number of syllables, word types, and others. In addition, they used several readability metrics, including the Flesch–Kincaid, Flesch Reading Ease, Gunning Fog, and Automatic Readability Index.
  • Syntax is a set of features derived from production rules based on context-free grammar (CFG) trees.
Using this feature set, the system yields strong results. That is why this work relies on it as a baseline, further extending this set with the newly developed features. Based on such features, different statistical machine learning models can be trained. In [16][6], the authors trained an SVM classifier on the feature set presented above. Naïve Bayes, Random Forest, KNN, and AdaBoost were also frequently used as fake news classification models [25][26][27][36,37,38]. In [28][39], the authors explore the potential of using emotional signals extracted from text to detect fake news. The authors analyzed the set of emotions present in true and fake news to test the hypothesis that trusted news sources do not use emotions to affect the reader’s opinion while fake news does. They discovered that negative emotions, disgust, and surprise tend to appear in fake news and can give a strong signal for fake news classification. In addition to linguistic features, feature extraction strategies based on deep learning architectures were also explored. In [29][40], the classical CNN-based architecture for text classification was successfully applied to the fake news detection task. Given the recent surge in the use of Transformer architectures in natural language processing, models like BERT [30][31][10,41] and RoBERTa [32][9] have achieved high results in classifying general-topic fake news, as well as in detecting COVID-19-related fake news. In addition to text features, images included in news articles can serve as strong indicators for veracity identification. Visual content can be manipulated, for instance, via deepfakes [33][42] or by combining images from different contexts in a misleading format [34][43]. While multimodal fake news detection is a developing field, several approaches were already presented in [35][36][44,45].
It is evident that models based on internal feature sets have a significant advantage in their ease of use, as they do not require extensive additional time for feature extraction. Furthermore, such models can be highly efficient in terms of inference time and memory usage, as they rely solely on internal information from the input news. However, if researchers take into account the aspect of explainability for end users, the evidence generated from such internal features is unlikely to be sufficient to persuade the user of the model’s accuracy and to justify the label assigned to the news.
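To make the internal feature families from this section more concrete, the sketch below computes three of them with the Python standard library only: tf–idf values over unigrams and bigrams, punctuation counts, and the Automatic Readability Index. It is an illustrative approximation under stated assumptions, not the implementation from [16][6]: the function names, the regex tokenizer, the unsmoothed idf, and the sentence splitter are all simplifications chosen here. LIWC and CFG-based syntax features are omitted because they require external resources.

```python
import math
import re
from collections import Counter

# Punctuation marks named in the feature list: period, comma, dash,
# question mark, exclamation mark (ASCII hyphen stands in for dashes).
PUNCTUATION = ".,-?!"


def tokenize(text):
    """Lowercase word tokens via a simple regex (an assumed simplification)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def ngrams(tokens, n):
    """Contiguous n-grams joined by a single space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf_features(docs):
    """Return one {term: tf-idf} dict per document for unigrams and bigrams.

    Uses relative term frequency and idf = log(N / df) without smoothing,
    so a term that occurs in every document gets weight 0.
    """
    term_counts = []
    for doc in docs:
        tokens = tokenize(doc)
        term_counts.append(Counter(tokens + ngrams(tokens, 2)))
    # Document frequency: each term counted once per document containing it.
    doc_freq = Counter(term for counts in term_counts for term in counts)
    n_docs = len(docs)
    features = []
    for counts in term_counts:
        total = sum(counts.values()) or 1
        features.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return features


def punctuation_counts(text):
    """Raw count of each punctuation mark in the text."""
    return {mark: text.count(mark) for mark in PUNCTUATION}


def automatic_readability_index(text):
    """ARI = 4.71 * chars/words + 0.5 * words/sentences - 21.43."""
    words = re.findall(r"[A-Za-z0-9']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return 0.0
    chars = sum(len(word) for word in words)
    return 4.71 * chars / len(words) + 0.5 * len(words) / len(sentences) - 21.43
```

In practice, the per-document dictionaries would be converted to fixed-length vectors and concatenated with the other feature groups before being passed to a classifier such as an SVM.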

4.2. Methods Based on External Features

Although internal feature-based models can achieve high scores in the fake news classification task, the decisions of such models are hard to interpret. As a result, additional signals from external sources can add more confidence to the reasoning behind a model’s decision. If the news appears on a social network, information about the users who liked or reposted the item and the resulting propagation can serve as valuable features for fake news classification. It was shown in [37][46] that fake news tends to spread more quickly over social networks than true news. As a result, to combat fake news in the early stages of its appearance, several methods have been created to detect anomalous behavior in reposts or retweets [38][39][47,48]. In [40][49], different data about specific users were explored. The author extracted locations, profile images, and political biases to create a feature set. User comments related to a news article can also serve as a valuable source of information for detecting fake news, and this approach was explored in [41][13]. The dEFEND system was created to explain fake news detection. The information from users’ comments was used to find related evidence and validate the facts from the original news. The Factual News Graph (FANG) system from [42][12] was presented to connect the content of news, news sources, and user interactions to create a complete social picture of the inspected news.
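As an illustration of how propagation can be turned into external features, the sketch below derives a few speed-related statistics from repost timestamps. It is a hypothetical example and does not reproduce the methods in [38][39][47,48]: the function name `propagation_features`, the feature names, and the one-hour early window are assumptions chosen here.

```python
from statistics import median


def propagation_features(repost_times, early_window=3600.0):
    """Summarize how fast an item spread.

    repost_times: seconds elapsed between publication and each repost.
    early_window: length in seconds of the "early" period (assumed 1 hour).
    """
    times = sorted(repost_times)
    if not times:
        return {
            "n_reposts": 0,
            "early_reposts": 0,
            "median_gap": None,
            "reposts_per_hour": 0.0,
        }
    # Gaps between consecutive reposts; empty when there is a single repost.
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    # Observation span in hours, floored at one second to avoid division by zero.
    span_hours = max(times[-1], 1.0) / 3600.0
    return {
        "n_reposts": len(times),
        "early_reposts": sum(1 for t in times if t <= early_window),
        "median_gap": median(gaps) if gaps else None,
        "reposts_per_hour": len(times) / span_hours,
    }
```

A high share of early reposts or a very small median gap would be the kind of signal that an anomaly detector over reposts could flag, although the thresholds would have to be learned from labeled data.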