Phyto products are widely used as natural products, for example in medicines, cosmetics, or so-called “superfoods”. However, the exact metabolite composition of these products is often still unknown because metabolite identification is time-consuming. Non-target screening by LC-HRMS/MS could overcome this problem, as it identifies compounds based on their retention time, accurate mass, and fragmentation pattern. In particular, computational tools, such as deconvolution algorithms, retention time prediction, in silico fragmentation, and sophisticated search algorithms for comparing spectral similarity against mass spectral databases, help researchers to profile metabolite contents more exhaustively.
1. Introduction
Natural products represent a large market: in the US, sales reached 166 billion US dollars in 2019, a growth of 4.8% compared to 2018
[1]. Phyto-analysis is therefore an important field, both to assess the metabolites in phyto samples for compounds with possible health benefits and to control sample quality with respect to contamination from the environment, cultivation, or additives, e.g., pesticides, fertilizers, or other toxic compounds. A possible workflow for the detection of metabolites and contaminants in phyto samples using non-target screening is shown in
Figure 1. Examples that illustrate the need for quality control of phyto samples are pesticides in wine and the addition of diethylene glycol to wine
[2] in 1984 in Austria, which was used to simulate higher quality and to obtain a higher volume of product
[3]. However, due to the Matthew effect, the detection of those contaminants is hampered
[4]. This term describes the tendency to choose the targets of a study based on their occurrence in prior studies, rather than considering that additional factors may influence the studied object. Compounds selected in this way represent the so-called “known knowns” and “known unknowns” described in the “Rumsfeld Quadrants” (
Figure 2) by Stein
[5].
Figure 1. Workflow for non-target screening.
Figure 2. “Rumsfeld Quadrants” showing the intersection of yes/no answers for whether analysts expect a compound to be identified in the sample (prior probability) and whether it was identified in a library search. Adapted with permission from
[5].
Nevertheless, genuinely new knowledge is generated by finding unexpected compounds in a sample, which are described as “unknown knowns” and “unknown unknowns”. To find those unknowns, non-target screening is the method of choice. For this purpose, three major techniques are used to acquire spectra after sample preparation. The first is nuclear magnetic resonance (NMR)
[6][7]; the other two are liquid chromatography coupled to high-resolution mass spectrometry (LC-HRMS) and gas chromatography coupled to mass spectrometry (GC-MS)
[8]. GC-MS was long used as the gold standard for detecting volatile compounds because its instrumentation is standardized, which makes data acquired by different laboratories comparable. A major disadvantage of GC-MS is that polar and thermally labile compounds must be derivatized before they can be analyzed
[9]. In contrast, LC-HRMS/MS can be used for those compounds with only minor sample preparation. In particular, the ability to obtain fragment spectra by collision-induced dissociation after the ionization process in LC-HRMS
[10], which represent a fingerprint of the compound’s structure, facilitates identification. After the spectra have been acquired, data processing is needed to separate the spectra of individual compounds from each other using retention times and precursor masses. Non-target screening then requires spectral databases in combination with search algorithms, so that the acquired fragment spectra can be compared with spectra of reference standards to identify compounds. However, the availability of these reference spectra is a bottleneck. Of a total of 129 million registered compounds
[11], the largest database for non-volatile compounds includes just over 40,000 spectra of more than 15,000 compounds
[12], i.e., roughly 0.01% of the registered compounds. This illustrates that only a small fraction of the compounds in samples can be found by comparing LC-HRMS/MS fragment spectra with databases, and more effort is needed to expand the number of compounds in these libraries. This identification technique has been widely used for water pollutants
[13][14][15][16][17] and drugs
[18][19][20]. Jorge et al. published a review describing several targeted applications of LC-MS for metabolite screening
[21].
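The comparison of an acquired fragment spectrum with reference spectra can be illustrated with a minimal sketch. It assumes centroided spectra given as lists of (m/z, intensity) pairs and uses a simple greedy cosine score with an assumed m/z tolerance of 0.01; the function names, the tolerance, and the toy peaks are illustrative only and do not reproduce the search algorithm of any particular database or of the studies cited above.

```python
import math


def cosine_similarity(query, reference, mz_tol=0.01):
    """Greedy cosine score between two centroided spectra.

    Each spectrum is a list of (m/z, intensity) pairs. Two peaks are
    paired when their m/z values differ by at most mz_tol (assumed value).
    """
    # Collect all peak pairs within tolerance, with their intensity product.
    candidates = []
    for i, (mz_q, int_q) in enumerate(query):
        for j, (mz_r, int_r) in enumerate(reference):
            if abs(mz_q - mz_r) <= mz_tol:
                candidates.append((int_q * int_r, i, j))
    # Assign pairs greedily, largest product first; each peak is used once.
    candidates.sort(reverse=True)
    used_q, used_r = set(), set()
    dot = 0.0
    for product, i, j in candidates:
        if i not in used_q and j not in used_r:
            used_q.add(i)
            used_r.add(j)
            dot += product

    norm_q = math.sqrt(sum(inten * inten for _, inten in query))
    norm_r = math.sqrt(sum(inten * inten for _, inten in reference))
    if norm_q == 0 or norm_r == 0:
        return 0.0
    return dot / (norm_q * norm_r)


def best_library_match(query, library, mz_tol=0.01):
    """Return (name, score) of the best-scoring entry in a non-empty library."""
    scores = {name: cosine_similarity(query, ref, mz_tol)
              for name, ref in library.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


if __name__ == "__main__":
    # Toy spectra with invented peaks, for illustration only.
    query = [(100.0, 10.0), (150.0, 5.0)]
    library = {
        "reference_A": [(100.004, 9.0), (150.003, 6.0)],
        "reference_B": [(80.0, 10.0), (150.003, 1.0)],
    }
    print(best_library_match(query, library))
```

A score close to 1 indicates that the matched peaks have similar relative intensities, while a score of 0 means that no peaks were matched. In practice, intensity scaling, precursor mass filtering, and a minimum number of matched peaks are usually applied in addition.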
2. Applications
Non-target screening can be applied in various fields of phyto-analysis. Several reviews describing workflows for non-target screening have been published, and the most important steps are also summarized here. These workflows can be found in references
[22][23][24][25]. Kalogiouri et al. gave an overview of different efforts for the analysis of olive oils
[26], in particular of approaches that distinguish extra virgin from virgin olive oil, a distinction that reflects quality, by non-target metabolite profiling combined with chemometric tools. Their overview also mentions articles in which the origins of olive oils were discriminated with non-target screening. Another quality-control application of non-target screening in phyto samples is the determination of toxic compounds, such as natural toxins or pesticides. Routine use of non-target screening could facilitate this task because of its high throughput and relatively simple sample preparation, especially if more appropriate high-quality spectra become available in public repositories. Righetti et al. described in their review different methods for evaluating mycotoxin contamination originating from fungal infestation of crops
[27]. Carlier et al. identified and quantified toxic compounds of sea mango
[28]. An additional approach was published by Pérez-Ortega et al.
[29], who used an in-house database for the annotation of contaminants, such as mycotoxins and pesticides, in food samples.
The identification of pesticide residues and other contaminants in phyto samples is an important task due to their possible toxic effects, which is why these and other compounds are regulated worldwide
[30]. Several studies have investigated the pesticide contamination of alcoholic beverages, such as beer and wine
[31][32]. Bolaños et al. studied the presence of pesticides in 5 beer samples and 15 wine samples by acquiring data with a UHPLC–MS/MS system
[2]. Due to matrix effects in wine, the wine samples were diluted. The results showed that no pesticides were present in the beer samples, but several were found in several wine samples. This result is consistent with the study of Inoue et al.
[33]. That study determined the number of pesticides present throughout the different steps of beer brewing by using LC-MS/MS. For this purpose, the authors spiked ground malt with more than 300 pesticides and brewed beer with this contaminated malt. The results showed that most pesticides were reduced in the wort and adsorbed onto the grain after mashing. At the end of the brewing process, only pesticides with a log P below 2 were found in the finished beer. Apart from beer, several articles on wine analysis have also been published. A review of different studies of untargeted wine analysis was published by Pinu et al.
[34]. Ruocco et al. were able to determine the vintage and color of German and Italian wines by conducting metabolite profiling
[35]. In addition, Arbulu et al. identified 411 metabolites of Graciano red wine and were able to differentiate Tempranillo and Graciano wines from each other using 15 metabolites as biomarkers
[36]. Further, the authors created a database containing 2080 oenological compounds. Arapitsas et al. differentiated six wine cultivars by chemometric marker detection and principal component analysis
[37]. Diaz et al. differentiated three Spanish wines with protected designations of origin using their metabolic profiles
[38]. A similar approach was described by Li et al. for the differentiation of five truffle species by their metabolic profiles
[39].
Most of the non-target approaches in phyto-analysis aimed at identifying phenolic compounds
[40][41][42][43][44]. Most authors of these articles annotated the compounds manually using data from the literature or from spectral databases. Lin et al. profiled oligomeric proanthocyanidins by comparing the acquired MS/MS spectra with computed fragment spectra
[45]. A metabolite profiling of
Rhus coriaria (sumac), in which metabolites were annotated using the literature, was conducted by Abu-Reidah et al.
[46]. Furthermore, Regazzoni et al. profiled gallotannins and flavonoids with mass spectral databases
[47].
El Sayed et al. characterized aloe vera species by comparing acquired MS/MS spectra with literature data
[48]. This plant is also widely used in cosmetics, which were screened for illicit ingredients by Meng et al.
[49]. In this study, 123 cosmetic samples were screened on an Orbitrap instrument, using an in-house mass spectral database to annotate the compounds in these samples.
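Annotation against an in-house database typically starts with an accurate-mass lookup. The following minimal sketch assumes that the database is a simple mapping from entry name to theoretical m/z and that candidates are accepted within a tolerance of 5 ppm. The function names, the tolerance, and the toy entries are hypothetical and are not taken from the study by Meng et al. or from any other cited work.

```python
def ppm_error(measured_mz, theoretical_mz):
    """Relative mass error in parts per million (ppm)."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def annotate_by_mass(measured_mz, database, tol_ppm=5.0):
    """Return database entries whose theoretical m/z lies within tol_ppm.

    database maps an entry name to its theoretical m/z (assumed layout).
    Hits are returned as (name, ppm error), smallest absolute error first.
    """
    hits = []
    for name, theoretical_mz in database.items():
        error = ppm_error(measured_mz, theoretical_mz)
        if abs(error) <= tol_ppm:
            hits.append((name, error))
    hits.sort(key=lambda hit: abs(hit[1]))
    return hits


if __name__ == "__main__":
    # Toy database with invented entries, for illustration only.
    in_house_db = {"entry_A": 195.0877, "entry_B": 195.1000}
    for name, error in annotate_by_mass(195.0880, in_house_db):
        print(f"{name}: {error:.2f} ppm")
```

A mass match alone yields only candidate annotations, because isomers share the same accurate mass. Retention time and a comparison of fragment spectra are normally needed to confirm an identification.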