Negation and Speculation Corpora in Natural Language Processing

Negation and speculation are universal linguistic phenomena that affect the performance of Natural Language Processing (NLP) applications, such as those for opinion mining and information retrieval, especially in biomedical data. 

  • negation
  • Natural Language Processing
  • annotation process
  • speculation

1. Corpora and Annotation Process

Several research works have addressed the construction of NLP resources to meet the challenges of detecting negation and speculation [1][2]. Most of the annotated corpora have been collected in the biomedical domain, such as from clinical texts, radiology reports, and sometimes medical-related scientific publications [3][4]. For reasons of confidentiality and ethics, it is difficult to obtain access to biomedical corpora generated from patients’ records, even if anonymized; moreover, the anonymization of such data may itself remove important events. Therefore, biomedical corpora obtained from scientific publications and reports are increasingly available, such as those from the PubMed Central (PMC) portal (https://pubmed.ncbi.nlm.nih.gov/ (accessed on 20 February 2022)). Other corpora are available for domains such as short stories [5][6], reviews [7], social media posts [8], news [9], and financial articles [10]. By contrast, regardless of language, most data generated by social media platforms and product review websites are informal and contain grammatical mistakes. Such data have many uncertainty and factuality issues, making them challenging to annotate.
The following sections discuss the annotation process, the guidelines for annotating negation and speculation, and the measure of inter-annotator agreement (IAA). They also explore various corpora in several natural languages on the basis of criteria such as domain, size, language, annotated elements, and availability.

2. Annotation Guidelines

Negation and speculation corpora are usually annotated in two steps: first, the token that signals the phenomenon, the cue, is annotated; then the cue’s scope, the sequence of words that is negated or speculated, is marked. Further elements, called events, can be annotated, especially in the biomedical domain, and in negation corpora from various domains beyond the biomedical, the element of focus has also been annotated [11][12][13].
Vincze et al. [1] created annotation guidelines (https://rgai.sed.hu/sites/rgai.sed.hu/files/Annotation%20guidelines2.1.pdf (accessed on 17 September 2021)) for use during the annotation of biomedical texts in order to ensure the quality of the annotated data. These are the most widely used guidelines for negation and speculation, and have been adapted for various domains [2][6][14] and multiple natural languages [4][15][16]. The general rule is to consider only sentences with negating or speculative particles, including complex cases that express non-existence or uncertainty about something without containing an explicit negating or speculative cue. The guidelines suggest annotating keywords and their scope by following the ‘min–max’ strategy [1]. The smallest unit that expresses negation or uncertainty is labeled a cue, although in special cases multiple tokens may together be considered the cue when a single word cannot express the phenomenon. The scope, by contrast, covers the longest sequence of tokens affected by a cue. To assure the quality of the annotation process, the guidelines clearly describe complex cases, such as the implicit negation employed in the Arabic language [4].
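As a simplified illustration of the ‘min–max’ strategy (the sentence and spans below are invented for this overview, not drawn from any of the corpora discussed), a cue/scope annotation can be represented as token spans, with the cue as the smallest negating unit and the scope as the longest affected sequence:

    # Minimal sketch of a cue/scope annotation following the 'min-max' strategy.
    # The example sentence and spans are hypothetical.
    tokens = ["There", "is", "no", "evidence", "of", "pneumothorax", "."]
    annotation = {
        "cue": (2, 3),    # "no": the smallest unit expressing negation
        "scope": (2, 6),  # "no evidence of pneumothorax": the longest affected span
    }
    print(" ".join(tokens[slice(*annotation["scope"])]))  # -> no evidence of pneumothorax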
It is recommended that several annotators with the same level of experience carry out the annotation process; during annotation they are not allowed to communicate with one another but may consult an annotation expert. The guidelines written by an expert linguist should be adhered to; however, any problematic cases that arise may lead this expert to adjust the guidelines. Although in most cases more than one annotator annotates the entire corpus, for large corpora another approach may be applied whereby a single annotator annotates each sentence, with random checks by a second annotator. An expert linguist resolves any disagreements that arise between annotators and assures the quality of the process by measuring the IAA.
The IAA measures how consistently multiple annotators make the same decision and indicates how precise the annotation guidelines are. Usually, Cohen’s [17] or Fleiss’s [18] kappa coefficient is used when annotating negation and speculation data. Cohen’s kappa measures the agreement between only two annotators, whereas Fleiss’s kappa allows for more.
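For illustration, Cohen’s kappa is defined as kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The following minimal sketch applies scikit-learn to hypothetical token-level cue labels (not taken from any of the corpora discussed):

    # Minimal sketch: Cohen's kappa between two annotators' token-level cue labels.
    # The label sequences are hypothetical.
    from sklearn.metrics import cohen_kappa_score

    annotator_a = ["CUE", "O", "O", "CUE", "O", "O", "CUE", "O"]
    annotator_b = ["CUE", "O", "CUE", "CUE", "O", "O", "CUE", "O"]

    print(cohen_kappa_score(annotator_a, annotator_b))  # -> 0.75 for these labels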

3. Corpora

3.1. English Corpora

This subsection explores ten English corpora that have been annotated with negation or speculation. As shown in Table 1, they cover texts extracted from various domains (biomedical, clinical, reviews, and others).
Table 1. English corpora annotated for negation and/or speculation.
BioInfer was the first biomedical corpus to include negation annotation [19]. It was collected from abstracts of biomedical scientific papers, with 1100 sentences annotated with entities, their relationships, and dependencies. In addition, 6% of the annotated relationships include negated cases using the ‘not’ cue. This corpus is publicly available (http://mars.cs.utu.fi/BioInfer/ (accessed on 1 October 2021)), and was built for information extraction systems to find relations between proteins, genes, and RNAs.
The Genome Information Acquisition (GENIA) corpus was originally annotated with parts of speech (PoS), syntactic trees, and terms [20][21]. In 2008, half of the GENIA corpus, 1000 Medline abstracts, was annotated with negated biological events and two levels of uncertainty [22]. It consists of 9372 sentences in which 36,114 events were identified, and it is considered the first event-annotated corpus. A chief biologist and three graduates undertook the annotation of negated, uncertain, and other event types (http://www.nactem.ac.uk/meta-knowledge/Annotation_Guidelines.pdf (accessed on 17 September 2021)). The corpus and its annotation guidelines are publicly available under the terms of the Creative Commons Attribution 3.0 Public License (http://geniaproject.org/genia-corpus/event-corpus (accessed on 17 September 2021)).
BioScope is a well-known biomedical corpus annotated with negating/speculative cues and their scope [1]. It consists of documents from various sources (clinical radiology reports, full biological papers from the FlyBase and BioMed Central (BMC) websites, and biological paper abstracts from the GENIA corpus [20]), covering many types of text in the biomedical domain. It comprises 6383 clinical texts (radiology reports), 2670 sentences from full scientific papers, and 11,871 sentences from scientific paper abstracts. Two independent annotators annotated the corpus, following annotation guidelines written by an expert linguist. This expert followed the ‘min–max’ strategy during the annotation process and modified the guidelines several times as ambiguous cases arose between the annotators. As a result, the guidelines followed in annotating the BioScope corpus have been adapted to multiple domains [23]. Around 13% of the entire corpus contains negating expressions and more than 16% contains speculative sentences. The reliability of the annotation process was evaluated using the IAA rate, defined as the Fβ=1 measure of one annotation, considering the second to be the ‘gold standard’. The average IAA was 0.85 for negation scope and 0.81 for speculation scope. The corpus is freely available for academic and research purposes (https://rgai.sed.hu/sites/rgai.sed.hu/files/bioscope.zip (accessed on 17 September 2021)).
The Computational Natural Language Learning (CoNLL) 2010 shared task was dedicated to identifying speculation cues and their scope in two sub-corpora: biological publications and Wikipedia articles [24]. The first consists of 14,541 sentences, the biological part of the BioScope corpus, as the training set, with an evaluation set of 790 of 5003 sentences from the PMC database. The Wikipedia sub-corpus includes 2484 of 11,111 sentences as the training set, with an evaluation set of 2346 of 9634 sentences. The corpus was annotated by two independent annotators who followed the ‘min–max’ strategy used for the BioScope corpus. The chief linguist who wrote the annotation guidelines resolved any disagreements between the annotators. The corpus is publicly available for research purposes (http://www.inf.u-szeged.hu/rgai/conll2010st (accessed on 2 February 2022)).
The research community has also investigated other domains, and Morante et al. highlight the need for corpora covering domains other than the biomedical [25]. The biomedical annotation guidelines should be adapted to new domains, such as product reviews [12] and short stories by Conan Doyle [26].
The Product Review corpus is the first corpus in the review domain to have been annotated for negation [12]. It consists of 2111 sentences from product reviews extracted from Google Product Search. The number of negated sentences in this corpus is 679, and each sentence was annotated manually to define its cues and scope. Unlike the BioScope guidelines [1], the authors of this corpus did not include negation cues in the scope. The IAA between two annotators on a sample of the dataset was 0.91, which is high. As a result, to complete the annotation of the entire corpus, each annotator applied the guidelines to a separate part of the reviews.
Blanco and Moldovan introduced the negation focus using the PropBank corpus [27]; the focus of negation in a sentence is the part of the scope that is most directly or explicitly negated. This corpus was selected for its semantic annotation, and because it is not restricted to the biomedical domain [28]. The authors worked on 3779 sentences marked with MNEG, annotating the negation focus. Half of these sentences were annotated by two annotators, achieving an IAA of 0.72. Any disagreements were then carefully examined and resolved, and the annotators were given a revised version of the guidelines to annotate the remaining half.
The Simon Fraser University (SFU) Review Corpus is a large annotated English corpus comprising 400 documents extracted from the Epinions.com website and belonging to various domains, such as books, hotels, movies, and consumer product reviews [29]. Each document is labelled as a positive or negative review, and every review was submitted by a different person. Konstantinova et al. annotated the corpus at the token level with negating and speculative cues, and at the sentence level with their linguistic scope [2]. Their annotation guidelines were adapted from the BioScope corpus guidelines [1] and tailored to the review domain; for example, no cues were included in the scope. Of the sentences, 22% contained speculative instances, while 18% contained negating instances. Due to the large number of sentences, 17,263, the entire corpus was annotated by one linguist, and a second linguist annotated 10% of the documents at random to measure the IAA. The IAA was 0.92 for negation cues and 0.89 for speculation cues, and 0.87 and 0.86 for their respective scopes. The original corpus, its annotated form, and the annotation guidelines are published on the authors’ website (http://www.sfu.ca/~mtaboada/research/SFU_Review_Corpus.html (accessed on 23 August 2021)).
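Several of the corpora above, such as BioScope, report scope-level IAA as an F-measure obtained by treating one annotator’s output as the gold standard. The sketch below illustrates that computation with exact-span matching over hypothetical scope spans (the token index tuples are invented):

    # Minimal sketch: scope-level IAA as F1, treating annotator B as the gold standard.
    # The scope spans (tuples of token indices) are hypothetical.
    scopes_a = {(2, 3, 4, 5), (10, 11)}      # spans marked by annotator A
    scopes_b = {(2, 3, 4, 5), (10, 11, 12)}  # spans marked by annotator B (gold)

    true_positives = len(scopes_a & scopes_b)  # exact-span matches
    precision = true_positives / len(scopes_a)
    recall = true_positives / len(scopes_b)
    f1 = 2 * precision * recall / (precision + recall)
    print(f1)  # -> 0.5 for these spans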
The ConanDoyle-neg corpus consists of two Sherlock Holmes stories by Arthur Conan Doyle: The Hound of the Baskervilles (HB) and The Adventure of Wisteria Lodge (WL) [6]. Morante and Daelemans annotated them with negation cues, their scope, and event information. The annotation guidelines were adapted from those of BioScope [1], yet with several differences. The authors focused on narrative texts, and in addition to defining the annotation of negation cues and their scope, they defined their events [26]. In this corpus, negation cues and their scope may be discontinuous. They annotated 850 negating sentences of the 3640 in the HB story and 145 of the 783 in WL. The corpus was annotated by two annotators, an MSc student and a researcher, both with a background in linguistics. The IAA, based on the F1 measure, was 0.85 and 0.77 for scope in the HB and WL stories, respectively. The corpus and annotation guidelines are publicly available (https://www.clips.ua.ac.be/BiographTA/corpora.html (accessed on 1 November 2021)). The corpus was used alongside the PropBank corpus in the *SEM 2012 Shared Task (https://www.clips.ua.ac.be/sem2012-st-neg/ (accessed on 1 November 2021)), dedicated to resolving the scope and focus of negation [30].
The Twitter Negation Corpus contains tweets downloaded using the Twitter API [8]. Two authors manually annotated the tweets with negation cues and their scope. Of the 4000 tweets, 539 involve negation, comprising 615 negation scopes. The IAA was measured at both the token and the scope level, with values of 0.98 at the token level and 0.73 for the full scope.
The DeepTutor Negation (DT-Neg) corpus is the first corpus to focus on negation phenomena in dialogue-based systems [14]. It consists of texts extracted from tutorial interactions in which high-school students solved conceptual physics problems, as logged by an intelligent tutoring system. The authors automatically detected 2603 explicit negation cues in 27,785 student responses, using a compiled list of cue words [1][25]. Their annotation guidelines are based on Morante’s work [25] for validating negation cues and annotating their scope and focus. The annotation was performed by five graduate and research students, yielding 1088 negation cues and 458 instances of scope/focus. The IAA was based on 500 randomly selected instances divided into five parts, with two annotators annotating each part. The average sentence-level agreement was 0.66 for both scope and focus. The corpus is freely available for academic and research purposes (http://deeptutor.memphis.edu/resources.htm (accessed on 23 September 2021)).
The SFU Opinion and Comments Corpus (SOCC) contains 10,339 opinion articles, with 663,173 comments, from the Canadian newspaper The Globe and Mail [13]. The corpus has three components: the articles, the comments, and the comment threads, all of which are publicly available (https://github.com/sfu-discourse-lab/SOCC (accessed on 5 February 2022)). The authors selected a subset of the corpus, 1043 comments, to annotate with four layers: constructiveness, toxicity, negation, and appraisal. The main aim of this corpus was to study the relationship between negation and appraisal. A research student and a linguist performed the annotation using guidelines developed to annotate the negation cue, scope, and focus. The guidelines include a new annotation, ‘xscope’, to label the implied content of an inexplicit scope. The annotation process produced 1397 negation cues, 1349 instances of scope, 1480 of focus, and 34 of ‘xscope’. The IAA was based on 50 comments from the beginning of the annotation process and 50 from its conclusion. Agreement was measured using percentage agreement for nominal data, computed for both label and span and then combined to produce an average agreement. The agreement for the first 50 comments was 0.96, 0.88, and 0.47 for the cue, scope, and focus, respectively; for the last 50 comments, the agreement was 0.70, 0.63, and 0.43. The annotated corpus is publicly available under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (https://www.kaggle.com/mtaboada/sfu-opinion-and-comments-corpus-socc (accessed on 7 March 2022)).

3.2. Corpora in Other Languages

This section explores non-English corpora annotated with negation and speculation, as shown in Table 2. It also highlights the first parallel negation corpus (English–Chinese). As seen in this survey, Spanish has the second-largest number of corpora annotated for these two phenomena, and it is the language nearest to Arabic.
Table 2. Non-English corpora annotated with negation and/or speculation.
Swedish: A subset of the Stockholm Electronic Patient Record (EPR) corpus [31] was randomly selected, giving 6740 sentences annotated with certain/uncertain expressions, as well as negating and speculative cues [32]. The Stockholm EPR corpus is a clinical corpus of free texts in the Assessment category, comprising patient records from the city of Stockholm. The annotation guidelines are similar to those of the BioScope corpus [1], but certain expressions, such as those containing question marks, were not annotated. Three annotators worked together on the entire corpus, consisting of 6996 expressions. The average numbers of annotated cues are 1624 speculative and 1008 negating. The IAA was measured by a pairwise F-measure, with approximate values of 0.8, 0.5, 0.9, and 0.6 for negation cues, speculation cues, certain cases, and uncertain cases, respectively. The authors plan to make the corpus available for research purposes.
Hungarian: Vincze presented the first Hungarian corpus annotated with uncertainty cues; the hUnCertainty corpus consists of 15,203 sentences from several domains [33]. The author randomly selected 1081 paragraphs from the Hungarian Wikipedia, comprising 9722 sentences. The second sub-corpus consists of 300 pieces (5481 sentences) of criminal news from the HVG Hungarian news portal. The annotation guidelines were adapted, with slight modifications, from the seven categories used in earlier works [34][35]. The frequencies of uncertainty cues in the Wikipedia and news sub-corpora were 5980 and 2361, respectively.
Japanese: Matsuyoshi et al. annotated the first Japanese corpus with negation cues, scope, and focus, covering texts from two domains [36]. It consists of 5178 sentences of user review data randomly selected from Rakuten Travel data, and 5582 sentences from newspaper articles (Groups A and B) from the Balanced Corpus of Contemporary Written Japanese (BCCWJ). After filtering the data, the authors proposed annotation guidelines for negation in Japanese; the candidate sentences numbered 1246 for the reviews and 901 for the newspaper articles. The totals of negation cues were 1023 and 762, respectively, and 300 and 190 sentences in each domain include a negation scope. Two annotators marked the focus for the newspaper documents in Group A, where 66% of segments include focus particles. The next step was to resolve disagreements and annotate the remaining part of the corpus using a single annotator, and 165 of 490 instances of negation scope were found to include focus particles. The authors plan to publish their annotated corpus for the research community (http://cl.cs.yamanashi.ac.jp/ (accessed on 1 January 2022)).
Dutch: The Erasmus Medical Center (EMC) clinical corpus includes several types of anonymized clinical documents, such as entries by general practitioners (GP), specialists’ letters (SP) from the IPCI database, radiology reports (RD), and discharge letters (DL) from the EMC in the Netherlands [37]. The authors extracted the Dutch medical terms from the Unified Medical Language System (UMLS) for use in annotating the corpus with negation. A medical term was marked as ‘Negated’ if the text indicated that the corresponding finding was absent; otherwise, it was annotated as ‘Not Negated’. Two independent annotators followed the compiled guidelines, with support from an expert linguist to resolve disagreements. As a result, 1804 of the 12,888 instances of medical terms were reported as negated. The IAA for the entire corpus was 0.92 on average. The authors mention that their corpus is free for research, upon request.
Chinese: The Chinese Negation and Speculation (CNeSp) corpus consists of 16,841 sentences from various domains, annotated with negating and speculative cues and their scope [10]. It includes 19 scientific articles, 311 financial articles, and 821 product reviews. The authors adjusted the BioScope corpus guidelines [1] to make them suitable for the Chinese language. These modifications include that the subject should be within the scope and that the scope should be a continuous sequence of tokens. Two annotators carried out the annotation process and an expert linguist resolved disagreements. As a result, the percentages of negating sentences were 13.2%, 17.5%, and 52.9% for scientific articles, financial articles, and product reviews, respectively. These figures indicate that the review domain has a high proportion of negation, independent of language. The percentages of speculative sentences were 21.6%, 30.5%, and 22.6%, respectively. Kappa-based average IAA values of 0.90 for negation and 0.89 for speculation show that the annotation guidelines were well formulated. The authors have made this corpus publicly available for research purposes (http://nlp.suda.edu.cn/corpus/CNeSp/ (accessed on 10 January 2022)).
Chinese: Kang et al. collected 36,828 sentences from 400 admission notes and 400 discharge summaries from one month of data, March 2011, in the EMR database of Peking Union Medical College Hospital [16]. This corpus includes data from four biomedical categories: diseases, symptoms, treatments, and laboratory tests. The BioScope corpus guidelines [1] and other Chinese guidelines [10] were adapted to annotate the corpus. Three domain experts formulated an initial draft and then applied the guidelines to a sample of the corpus to refine them. The outcome of the process is 21,767 negating sentences from the entire corpus. The IAA measure was based on only 80 notes, annotated by two annotators, reaching 0.79 as measured by Cohen’s kappa.
German: The German Negation and Speculation Corpus (GNSC) is the first annotated German corpus in the biomedical domain [38]. It consists of eight anonymized discharge summaries containing medical histories and 175 clinical notes from the nephrology domain, which are shorter than discharge summaries. First, the medical terms were automatically pre-annotated with predefined types using an annotation tool [39], following the UMLS unified coding standards for EHRs. Second, a human annotator rectified incorrect annotations and classified whether a given finding occurred in a positive, negative, or speculative context. Finally, a linguist revised the annotated sentences to assure the data quality. As a result, for the discharge summaries the negated and speculative sentences numbered 106 and 22 of 1076 sentences, and for the clinical notes 337 and 4 of 1158 sentences. These results show that speculation rarely arises in this corpus.
Arabic: BioArabic is the first Arabic corpus annotated with negation and speculation for biomedical Arabic texts [4]. The corpus consists of 10,165 sentences extracted from 70 medical and biological articles collected from the Iraqi Journal of Biotechnology, the Journal of Damascus University for Health Sciences, and Biotechnology News. An expert linguist adapted the annotation guidelines of the BioScope corpus [1] to Arabic biomedical texts. Five linguist annotators performed the annotation according to the developed guidelines, which are described in detail in the paper. The expert linguist resolved disagreements arising between the annotators, resulting in 1297 negated sentences and 1376 speculative ones. Unfortunately, the measure of agreement between the annotators was not reported, and the corpus is not readily available online.
Spanish: The IULA Clinical Record corpus consists of 3194 sentences extracted from 300 anonymized clinical records from the main hospitals in Barcelona, Spain [15]. These sentences were manually annotated with negation cues and their scope by three linguists, advised by a clinician. Marimon et al. used English annotation guidelines from several domains: BioScope from the biomedical domain, ConanDoyle-neg from the short stories domain, and guidelines covering general concepts [1][25][40]. As in the BioScope guidelines, the annotation includes neither the negation cue nor the subject of the record in the scope. After applying these annotation rules to the entire corpus, the team reported 1093 negated sentences. The percentage of sentences that include a negation scope, roughly 35%, is relatively high compared with English corpora. The Cohen kappa IAA rates were 0.85 between annotator 1 and annotators 2 or 3, and 0.88 between annotators 2 and 3. The corpus is publicly available under a CC-BY-SA 3.0 license (http://eines.iula.upf.edu/brat//#/NegationOnCR_IULA/ (accessed on 5 January 2022)).
Spanish: UHU-HUVR is a corpus of 276 radiology reports and 328 patients’ personal histories (written in free text) obtained from the Virgen del Rocío Hospital in Seville, Spain [41]. It is considered the first Spanish corpus to include affixal negation. Cruz et al. adapted the THYME corpus guidelines to annotate the negation cues and scope, using two domain experts [42]. The results of the annotation are 1079 negated sentences of the 3065 in the patient histories and 1219 negated sentences of the 5347 in the radiology documents. As observed in the report, the percentage of negated sentences in the histories is relatively high, since they are patients’ descriptions of clinical conditions, which are not 100% factual, whereas the radiology reports give the radiologists’ observations. The IAA measure for the negation scope was 0.72.
Spanish: The SFU ReviewSP-NEG corpus extends the SFU Review corpus [29]; it consists of 400 review documents from domains such as movies, music, and various product reviews, extracted from the Ciao.es website [43]. Each domain contains 25 positive and 25 negative review documents, based on the stars awarded by the reviewer. It is the first Spanish corpus to include events in the annotation and to capture discontinuous negation cues. Like its English equivalent, the entire corpus was manually annotated with negation cues and their scope and events [2]. Although the BioScope corpus guidelines had been written for the biomedical domain, the authors adapted them to the review domain alongside a typology of negation patterns in Spanish [1]. The annotation process was supervised by experts and performed by two trained annotators, who identified 3022 negated sentences among the 9455 sentences. The IAA kappa coefficient is 0.95 for negated events and 0.94 for scope, both of which are relatively high compared to other Spanish negation-annotated corpora. This corpus is publicly available under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (http://clic.ub.edu/corpus/es/node/171 (accessed on 13 November 2021)).
Italian: Altuna et al. annotated two corpora from contrasting domains [9]. The first consists of 71 story documents adapted from Fact-Ita Bank [44]. The second consists of 301 tweets used in the Factuality Annotation (FactA) task at the Evaluation of NLP and Speech Tools for Italian (EVALITA) 2016 [45]. The authors based their annotation guidelines on earlier guidelines [26][27], covering negation cues, scope, and focus. In general, every negation cue is associated with its scope and focus; the first corpus of 1290 sentences contains 282 negation cues and 278 negated sentences, while the second comprises 71 negation cues and 59 negated sentences. The authors intend to make the annotated data available on the Human Language Technology-Natural Language Processing (HLT-NLP) website of the FBK organization in order to implement a system for negation detection in Italian. The agreement was roughly 0.7 for scope and 0.6 for focus, with the IAA based on the average pairwise F-measure.
English–Chinese: Many corpora in various natural languages have been annotated with monolingual negation to support negation detection; however, negation had not been studied in parallel across languages. The NegPar corpus is the first English–Chinese parallel corpus annotated with negation for narrative texts [5]. The corpus uses four of Conan Doyle’s Sherlock Holmes stories and maps them to their Chinese translation by Mengyuan Lin. The annotation guidelines of the ConanDoyle-neg corpus [6] were adapted for the English part of the corpus. Although the English side of the corpus had already been annotated, most of it was reannotated to capture semantic phenomena. Qianchu Liu, a native Mandarin speaker, created the annotation guidelines for the Chinese part with support from the other two authors. The Chinese translation often converts a positive English statement into a negative one; therefore, the number of negated sentences in the Chinese corpus is slightly higher than in the original English corpus: 1304 and 1762 negated sentences were annotated for the English and Chinese parts, respectively. The annotation of the Chinese side was based on projections from the English corpus, but the projection offered only imperfect help: the word-level F1 measure was 0.39, 0.45, and 0.24 for negation cues, scope, and events, respectively. The corpus, with its annotation guidelines, has been published for public use (https://github.com/qianchu/NegPar (accessed on 25 January 2022)).
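As a rough illustration of the projection idea (the sentence pair, alignment, and labels below are invented and do not reproduce the NegPar procedure itself), a cue label on the English side can be carried over to the aligned Chinese tokens:

    # Minimal sketch: projecting a negation-cue label through a word alignment.
    # The sentence pair and the alignment are hypothetical.
    en_tokens = ["I", "did", "not", "see", "him"]
    zh_tokens = ["我", "没", "看见", "他"]
    en_cue_indices = {2}                  # "not" is the English cue
    alignment = {0: 0, 2: 1, 3: 2, 4: 3}  # English token index -> Chinese token index

    zh_cue_indices = {alignment[i] for i in en_cue_indices if i in alignment}
    print([zh_tokens[j] for j in sorted(zh_cue_indices)])  # -> ['没']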
French: ESSAI is a corpus of French clinical trial protocols obtained mainly from the registry website of the National Cancer Institute [46][47]. The French protocol for such clinical trials has two parts: a summary of the trial, which presents its purpose and the methods applied, and its detailed description. ESSAI comprises 6547 sentences; 1025 sentences are negating, while 630 are speculative. The IAA measure of the negation annotation is 0.80, which is not high compared to English corpora [46].
French: Grabar et al. introduced another French corpus called CAS, which consists of 3811 sentences from clinical cases drawn from the scientific literature and training materials [48]. These clinical cases were published in various journals and websites in French-speaking countries such as Belgium, Switzerland, and Canada, and relate to several medical specialties [21]. The purpose of clinical cases is to describe clinical situations of real (de-identified) or fictitious patients. The corpus was automatically annotated with negation using different supervised learning techniques trained on the ESSAI corpus, while uncertainty cues and their scope were annotated on the basis of heuristic rules. Of the 3811 sentences, 804 are negating and 226 include speculation. Later, in 2021, two annotators manually verified the negation annotation of CAS, achieving a Cohen’s kappa IAA of 0.84 [46].
Brazilian Portuguese: Dalloux et al. presented Brazilian Portuguese clinical trial protocols provided by the Brazilian Registry of Clinical Trials (REBEC) [46]. Each protocol includes a sample title, scientific title, description, and inclusion and exclusion criteria. Negation in this corpus provides valuable information on the specification of the target and on patient recruitment. Three students annotated different parts of these protocols and identified 643 negating sentences out of 3228, a relatively low proportion for this corpus. The authors compiled another Brazilian Portuguese corpus, collected from three hospitals in Brazil and covering medical specialties such as cardiology and nephrology [46]. This corpus contains 9808 sentences of clinical narratives. The negation cues and their scopes were manually annotated, yielding 1751 negating cases with a Cohen’s kappa coefficient of 0.74.
Spanish: The Negation and Uncertainty annotations in Biomedical texts in Spanish (NUBES) corpus is the largest publicly available Spanish corpus for negation, and the first to include annotations of speculation cues, their scope, and speculative events [3]. NUBES consists of 29,682 sentences from the anonymized health records of a private Spanish hospital. The corpus was extracted from several sections, namely Surgical History, Physical Examination, and Diagnostic Tests, and divided into 10 batches. Two linguists jointly drafted the initial annotation guidelines, based on the IULA guidelines [15], and then extended them to include uncertainty. One batch was annotated using the initial guidelines, which were then improved after a medical expert resolved the disagreements between the annotators. The Cohen kappa IAA was calculated at 0.8 on the basis of this sample batch. Later, the same batch was annotated by a third annotator to resolve all disagreements and create a ‘gold standard’. Consequently, the other nine batches could be annotated by a single annotator using the final version of the annotation guidelines. The corpus of 29,682 sentences contains 7567 negated and 2219 uncertain instances. The authors enriched the IULA guidelines by incorporating uncertainty and have made their guidelines publicly available (https://github.com/Vicomtech/NUBes-negation-uncertainty-biomedical-corpus (accessed on 20 January 2022)).
Spanish: NewsComm is the first Spanish corpus collected from newspapers and annotated for negation phenomena; it is also considered the first Spanish corpus to be manually annotated with the negation focus alongside negation cues and scope. The corpus consists of 2955 comments posted in response to 18 news articles on nine topics (immigration, politics, technology, terrorism, economy, society, religion, refugees, and real estate) in an online Spanish newspaper, two articles per topic [11]. A linguistic analysis of the negation focus arrived at 10 conditions for the various forms of negation in this corpus, and the annotation criteria are described in detail in the guidelines (http://clic.ub.edu/publications (accessed on 25 February 2022)). The two trained annotators selected to annotate this corpus manually had earlier annotated the SFU ReviewSP-NEG corpus [43]. They found 2965 negating structures in the corpus, with corresponding negation cues, scope, and focus. The result of the annotation process shows that 45% of the sentences (2247 of 4980) are negating. Furthermore, the kappa IAA measure is 0.83, a high value for this first Spanish corpus annotated with the negation focus. The authors have made the corpus freely available for research purposes.
Mexican Spanish: T-MexNeg is the first Mexican Spanish corpus annotated for negation phenomena [49]. It consists of 13,704 tweets collected from September 2017 to April 2019 using the standard streaming APIs from Twitter (https://developer.twitter.com/en/docs/tutorials/consuming-streaming-data (accessed on 1 March 2022)). To limit the collection to Mexico, tweets were filtered by the language tag ‘es’ and the user location ‘mx’. Although the corpus is very large, it was annotated manually for negation cues, scope, and events. The authors adapted the SFU ReviewSP-NEG guidelines [43] to Mexican Spanish and to the nature of the collected corpus. The annotation process was carried out in two stages: binary classification for the presence of negation, and manual annotation by three teams of two annotators, who were linguistics students, together with a linguist. As a result, the T-MexNeg corpus of 13,704 tweets includes 4895 tweets with negation. The IAA was measured with Cohen’s kappa coefficient and has a value of 0.89, which is relatively high. The authors have made this corpus publicly available for research purposes (https://gitlab.com/gil.iingen/negation_twitter_mexican_spanish (accessed on 1 March 2022)).
Arabic: The ArNeg corpus is another Arabic corpus annotated with negation, for formal Arabic texts [50]. It consists of 6000 sentences collected from Wikipedia and the King Saud University Corpus of Classical Arabic (KSUCCA). The corpus covers sentences from topics such as biography, media, science, and technology. Mahany et al. wrote clear annotation guidelines, to which two independent native Arabic speakers adhered during the annotation process. The percentage of negated sentences was found to be 18% for the Wikipedia sub-corpus and 29% for the KSUCCA sub-corpus. To measure agreement, one of the annotators applied the guidelines to 20% of the entire corpus, and the IAA was recorded at 0.98.

References

  1. Vincze, V.; Szarvas, G.; Farkas, R.; Móra, G.; Csirik, J. The BioScope Corpus: Biomedical Texts Annotated for Uncertainty, Negation and Their Scopes. BMC Bioinform. 2008, 9 (Suppl. S11), S9.
  2. Konstantinova, N.; De Sousa, S.C.M.; Cruz, N.P.; Maña, M.J.; Taboada, M.; Mitkov, R. A Review Corpus Annotated for Negation, Speculation and Their Scope. In Proceedings of the 8th International Conference on Language Resources and Evaluation, LREC 2012; European Language Resources Association (ELRA): Istanbul, Turkey, 2012; pp. 3190–3195.
  3. Lima, S.; Perez, N.; Cuadros, M.; Rigau, G. NUBES: A Corpus of Negation and Uncertainty in Spanish Clinical Texts. In Proceedings of the LREC 2020—12th International Conference on Language Resources and Evaluation; European Language Resources Association (ELRA): Marseille, France, 2020; pp. 5772–5781.
  4. Al-Khawaldeh, F.T. Speculation and Negation Annotation for Arabic Biomedical Texts: BioArabic Corpus. World Comput. Sci. Inform. Technol. J. (WCSIT) 2016, 6, 8–11.
  5. Liu, Q.; Fancellu, F.; Webber, B. NegPar: A Parallel Corpus Annotated for Negation. In Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC 2018); European Language Resources Association (ELRA): Miyazaki, Japan, 2018; pp. 3464–3472.
  6. Morante, R.; Daelemans, W. ConanDoyle-Neg: Annotation of Negation Cues and Their Scope in Conan Doyle Stories. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC’12); European Language Resources Association (ELRA): Istanbul, Turkey, 2012; pp. 1563–1568.
  7. Jiménez-Zafra, S.M.; Díaz, N.P.C.; Morante, R.; Martín-Valdivia, M.T. NEGes 2019 Task: Negation in Spanish. In Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2019), CEUR Workshop, Bilbao, Spain, 24 September 2019; pp. 329–341.
  8. Reitan, J.; Faret, J.; Gambäck, B.; Bungum, L. Negation Scope Detection for Twitter Sentiment Analysis. In Proceedings of the 6th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, Stroudsburg, PA, USA, 17 September 2015; pp. 99–108.
  9. Altuna, B.; Minard, A.-L.; Speranza, M. The Scope and Focus of Negation: A Complete Annotation Framework for Italian. In Proceedings of the Workshop Computational Semantics Beyond Events and Roles; Association for Computational Linguistics (ACL): Valencia, Spain, 2017; pp. 34–42.
  10. Zou, B.; Zhu, Q.; Zhou, G. Negation and Speculation Identification in Chinese Language. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing; Association for Computational Linguistics (ACL): Stroudsburg, PA, USA, 2015; pp. 656–665.
  11. Taulé, M.; Nofre, M.; González, M.; Martí, M.A. Focus of Negation: Its Identification in Spanish. Nat. Lang. Eng. 2021, 27, 131–152.
  12. Councill, I.G.; McDonald, R.; Velikovich, L. What’s Great and What’s Not: Learning to Classify the Scope of Negation for Improved Sentiment Analysis. In Proceedings of the ACL Workshop on Negation and Speculation in Natural Language Processing, Uppsala, Sweden, 10 July 2010; pp. 51–59.
  13. Kolhatkar, V.; Wu, H.; Cavasso, L.; Francis, E.; Shukla, K.; Taboada, M. The SFU Opinion and Comments Corpus: A Corpus for the Analysis of Online News Comments. Corpus Pragmat. 2020, 4, 155–190.
  14. Banjade, R.; Rus, V. DT-Neg: Tutorial Dialogues Annotated for Negation Scope and Focus in Context. In Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016; European Language Resources Association (ELRA): Portorož, Slovenia, 2016; pp. 3768–3771.
  15. Marimon, M.; Vivaldi, J.; Bel, N. Annotation of Negation in the IULA Spanish Clinical Record Corpus. In Proceedings of the Workshop Computational Semantics Beyond Events and Roles; Association for Computational Linguistics (ACL): Stroudsburg, PA, USA, 2017; pp. 43–52.
  16. Kang, T.; Zhang, S.; Xu, N.; Wen, D.; Zhang, X.; Lei, J. Detecting Negation and Scope in Chinese Clinical Notes Using Character and Word Embedding. Comput. Methods Prog. Biomed. 2017, 140, 53–59.
  17. Cohen, J. A Coefficient of Agreement for Nominal Scales. Educ. Psychol. Measur. 1960, 20, 37–46.
  18. Fleiss, J.L. Measuring Nominal Scale Agreement among Many Raters. Psychol. Bull. 1971, 76, 378–382.
  19. Pyysalo, S.; Ginter, F.; Heimonen, J.; Björne, J.; Boberg, J.; Järvinen, J.; Salakoski, T. BioInfer: A Corpus for Information Extraction in the Biomedical Domain. BMC Bioinform. 2007, 8, 50.
  20. Ohta, T.; Tateisi, Y.; Kim, J.D. The GENIA Corpus: An Annotated Research Abstract Corpus in Molecular Biology Domain. In Proceedings of the Second International Conference on Human Language Technology Research, San Diego, CA, USA, 24–27 March 2002; pp. 82–86.
  21. Kim, J.-D.; Ohta, T.; Tateisi, Y.; Tsujii, J. GENIA Corpus-A Semantically Annotated Corpus for Bio-Textmining. Bioinformatics 2003, 19, i180–i182.
  22. Kim, J.-D.; Ohta, T.; Tsujii, J. Corpus Annotation for Mining Biomedical Events from Literature. BMC Bioinform. 2008, 9, 10.
  23. Konstantinova, N.; De Sousa, S.C.M. Annotating Negation and Speculation: The Case of the Review Domain. In Proceedings of the Second Student Research Workshop Associated with the International Conference on Recent Advances in Natural Language Processing (RANLP), Hissar, Bulgaria, 12–14 September 2011; pp. 139–144.
  24. Farkas, R.; Vincze, V.; Móra, G.; Csirik, J.; Szarvas, G. The CoNLL-2010 Shared Task: Learning to Detect Hedges and Their Scope in Natural Language Text. In Proceedings of the CoNLL 2010—14th Conference on Computational Natural Language Learning: Shared Task, Uppsala, Sweden, 15–16 July 2010; ACL Anthology: New York, NY, USA, 2010; pp. 1–12.
  25. Morante, R.; Schrauwen, S.; Daelemans, W. Corpus-Based Approaches to Processing the Scope of Negation Cues: An Evaluation of the State of the Art. In Proceedings of the 9th International Conference on Computational Semantics, IWCS 2011; Association for Computational Linguistics (ACL): Stroudsburg, PA, USA, 2011; pp. 350–354.
  26. Morante, R.; Schrauwen, S.; Daelemans, W. Annotation of Negation Cues and Their Scope Guidelines v1.0; University of Antwerp: Antwerp, Belgium, 2011.
  27. Blanco, E.; Moldovan, D. Semantic Representation of Negation Using Focus Detection. In Proceedings of the ACL-HLT 2011—49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies; Association for Computational Linguistics (ACL): Portland, OR, USA, 2011; pp. 581–589.
  28. Palmer, M.; Gildea, D.; Kingsbury, P. The Proposition Bank: An Annotated Corpus of Semantic Roles. Comput. Linguist. 2005, 31, 71–106.
  29. Taboada, M.; Anthony, C.; Voll, K. Methods for Creating Semantic Orientation Dictionaries. In Proceedings of the 5th International Conference on Language Resources and Evaluation, LREC 2006; European Language Resources Association (ELRA): Genoa, Italy, 2006; pp. 427–432.
  30. Morante, R.; Blanco, E. ∗SEM 2012 Shared Task: Resolving the Scope and Focus of Negation. In Proceedings of the SEM 2012—1st Joint Conference on Lexical and Computational Semantics; Association for Computational Linguistics (ACL): Montréal, QC, Canada, 2012; pp. 265–274.
  31. Dalianis, H.; Hassel, M.; Velupillai, S. The Stockholm EPR Corpus: Characteristics and Some Initial Findings. In Proceedings of the 14th International Symposium for Health Information Management Research (ISHIMR 2009), Kalmar, Sweden, 14–16 October 2009; pp. 243–249.
  32. Dalianis, H.; Velupillai, S. How Certain Are Clinical Assessments? Annotating Swedish Clinical Text for (Un) Certainties, Speculations and Negations. In Proceedings of the 7th International Conference on Language Resources and Evaluation, LREC 2010; European Language Resources Association (ELRA): Valletta, Malta, 2010; pp. 3071–3075.
  33. Vincze, V. Uncertainty Detection in Hungarian Texts. In Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers, Dublin, Ireland, 23–29 August 2014; Dublin City University and Association for Computational Linguistics: Dublin, Ireland, 2014; pp. 1844–1853.
  34. Szarvas, G.; Vincze, V.; Farkas, R.; Móra, G.; Gurevych, I. Cross-Genre and Cross-Domain Detection of Semantic Uncertainty. Comput. Linguist. 2012, 38, 335–367.
  35. Vincze, V. Weasels, Hedges and Peacocks: Discourse-Level Uncertainty in Wikipedia Articles. In Proceedings of the Sixth International Joint Conference on Natural Language Processing; Asian Federation of Natural Language Processing: Nagoya, Japan, 2013; pp. 383–391.
  36. Matsuyoshi, S.; Otsuki, R.; Fukumoto, F. Annotating the Focus of Negation in Japanese Text. In Proceedings of the 9th International Conference on Language Resources and Evaluation, LREC 2014; European Language Resources Association (ELRA): Reykjavik, Iceland, 2014; pp. 1743–1750.
  37. Afzal, Z.; Pons, E.; Kang, N.; Sturkenboom, M.C.J.M.; Schuemie, M.J.; Kors, J.A. ContextD: An Algorithm to Identify Contextual Properties of Medical Terms in a Dutch Clinical Corpus. BMC Bioinform. 2014, 15, 373.
  38. Cotik, V.; Roller, R.; Xu, F.; Uszkoreit, H.; Budde, K.; Schmidt, D. Negation Detection in Clinical Reports Written in German. In Proceedings of the 5th Workshop on Building and Evaluating Resources for Biomedical Text Mining (BioTxtM 2016); The COLING 2016 Organizing Committee: Osaka, Japan, 2016; pp. 115–124.
  39. Roller, R.; Uszkoreit, H.; Xu, F.; Seiffe, L.; Mikhailov, M.; Staeck, O.; Budde, K.; Halleck, F.; Schmidt, D. A Fine-Grained Corpus Annotation Schema of German Nephrology Records. In Proceedings of the Clinical Natural Language Processing Workshop; The COLING 2016 Organizing Committee: Osaka, Japan, 2016; pp. 69–77.
  40. Mutalik, P.G.; Deshpande, A.; Nadkarni, P.M. Use of General-Purpose Negation Detection to Augment Concept Indexing of Medical Documents. J. Am. Med. Inform. Assoc. 2001, 8, 598–609.
  41. Cruz, N.; Morante, R.; Maña López, M.J.; Mata Vázquez, J.; Parra Calderón, C.L. Annotating Negation in Spanish Clinical Texts. In Proceedings of the Workshop Computational Semantics Beyond Events and Roles; Association for Computational Linguistics (ACL): Stroudsburg, PA, USA, 2017; pp. 53–58.
  42. Styler, W.F.; Bethard, S.; Finan, S.; Palmer, M.; Pradhan, S.; de Groen, P.C.; Erickson, B.; Miller, T.; Lin, C.; Savova, G.; et al. Temporal Annotation in the Clinical Domain. Trans. Assoc. Comput. Linguist. 2014, 2, 143–154.
  43. Jiménez-Zafra, S.M.; Taulé, M.; Martín-Valdivia, M.T.; Ureña-López, L.A.; Martí, M.A. SFU ReviewSP-NEG: A Spanish Corpus Annotated with Negation for Sentiment Analysis. A Typology of Negation Patterns. Lang. Resour. Eval. 2018, 52, 533–569.
  44. Minard, A.; Marchetti, A.; Speranza, M. Event Factuality in Italian: Annotation of News Stories from the Ita-TimeBank. In Proceedings of CLiC-it 2014, First Italian Conference on Computational Linguistic, Pisa, Italy, 9–11 December 2014; pp. 260–264.
  45. Minard, A.-L.; Speranza, M.; Caselli, T. The EVALITA 2016 Event Factuality Annotation Task (FactA). In EVALITA. Evaluation of NLP and Speech Tools for Italian; Accademia University Press: Torino, Italy, 2016; pp. 32–39.
  46. Dalloux, C.; Claveau, V.; Grabar, N.; Oliveira, L.E.S.; Moro, C.M.C.; Gumiel, Y.B.; Carvalho, D.R. Supervised Learning for the Detection of Negation and of Its Scope in French and Brazilian Portuguese Biomedical Corpora. Nat. Lang. Eng. 2021, 27, 181–201.
  47. Dalloux, C.; Claveau, V.; Grabar, N. Speculation and Negation Detection in French Biomedical Corpora. In Proceedings of the Recent Advances in Natural Language Processing, Varna, Bulgaria, 2–4 September 2019; pp. 223–232.
  48. Grabar, N.; Claveau, V.; Dalloux, C. CAS: French Corpus with Clinical Cases. In Proceedings of the Ninth International Workshop on Health Text Mining and Information Analysis; Association for Computational Linguistics (ACL): Stroudsburg, PA, USA, 2018; pp. 122–128.
  49. Bel-Enguix, G.; Gómez-Adorno, H.; Pimentel, A.; Ojeda-Trueba, S.-L.; Aguilar-Vizuet, B. Negation Detection on Mexican Spanish Tweets: The T-MexNeg Corpus. Appl. Sci. 2021, 11, 3880.
  50. Mahany, A.; Fouad, M.M.; Aloraini, A.; Khaled, H.; Nawaz, R.; Aljohani, N.R.; Ghoniemy, S. Supervised Learning for Negation Scope Detection in Arabic Texts. In Proceedings of the Tenth International Conference on Intelligent Computing and Information Systems (ICICIS), Cairo, Egypt, 5 December 2021; pp. 177–182.