1. Introduction
Camellia sinensis (L.) Kuntze (green tea) yields a common drink that is consumed around the world and is ranked second only to water in popularity and preference
[1][2][3]. Although
C. sinensis is native to China
[4], it is now commercially grown in both tropical and subtropical parts of the world
[5].
Biological phenomena can be investigated through the analysis of transcriptional regulation (transcriptomics), gene end-products (proteomics), and metabolic products (metabolomics)
[6]. Large bioinformatics databases covering various plant species have accumulated rapidly and are widely used for molecular analyses of secondary metabolism, abiotic stress tolerance, and related topics. Plants provide teas for human consumption, and about 80% of the world's population relies on traditional teas for primary health care, especially in Asian and African countries
[7]. Green tea contains about 4000 bioactive compounds, of which polyphenols represent one-third
[8]. Phytochemical screening of green tea has revealed the presence of alkaloids, saponins, tannins, catechin, and polyphenols (Figure 1), which are considered tea quality parameters
[9].
Figure 1. Chemical compounds in Camellia sinensis.
Endophytic bacteria associated with medicinal plants live in the internal tissues of their hosts without causing any harmful effect, and can stimulate the production of secondary metabolites with various biological activities
[10]. Bacterial endophytes may also be beneficial to their hosts, producing a wide range of natural products that could be harnessed for potential use in medicinal, agricultural, or environmental fields
[11]. Some of the endophytic bacteria may produce the same secondary metabolites as plants, making them a promising source of novel compounds
[10]. Endophytic bacteria colonize internal tissues to form symbiotic, mutualistic, commensal, and trophobiotic relationships with the host. Several endophytic bacteria appear to derive from the rhizosphere or phyllosphere; some may be transmitted through the seeds of the parent plant. In addition, endophytic bacteria can promote plant growth and act as biocontrol agents. Many endophytes are members of the
Pseudomonas, Burkholderia, and
Bacillus genera, which are known for their wide range of secondary metabolic products, including antimicrobials, growth promoting molecules, and volatile organic compounds
[11] (Figure 2).
Figure 2. Role of endophytic bacteria and environmental stress on primary and secondary metabolites that affect antioxidant and antimicrobial activities of Camellia sinensis, and how omics technologies allow understanding of these biological phenomena.
2. Transcriptomics
Transcriptomics provides information on the occurrence and relative abundance of RNA transcripts, indicating the active components within the cell
[12]. Serial analysis of gene expression and microarrays have been applied to many model systems, aiming to study the predominantly expressed genes in stem cells. The resulting sequence reads are typically 30 to 400 base pairs in length, depending on the DNA-sequencing technology used, and are commonly aligned to a reference genome and assessed for quality. Data processing and quality assessment tools generally provide diagnostic visualizations
[13].
Using Illumina sequencing, Wang et al.
[14] obtained about 57.35 million RNA-Seq reads, which were assembled into 216,831 transcripts with an average length of 356 bp and an N50 of 529 bp. Pathway analysis showed that both carbohydrate metabolism and calcium signaling pathways could play an important role in
C. sinensis cold stress responses. The transcriptome from poly(A)+ RNA of
C. sinensis was analyzed at an unprecedented depth (2.59 gigabase pairs) by Shi et al.
[15] in order to identify most of the genes associated with flavonoid, theanine, and caffeine biosynthetic pathways. Tan et al.
[16] analyzed a floral transcriptome of
C. sinensis and assembled 26.9 million clean reads into 75,531 unigenes, averaging 402 bp. Both
C. sinensis transcriptome information and genetic maps provide a valuable basis for molecular biology investigations, such as functional gene isolation, quantitative trait loci mapping, and marker-assisted selection breeding in this important species.
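The N50 values quoted for these assemblies can be made concrete with a short calculation. The following is a minimal sketch, assuming the common definition of N50 as the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled bases. The function name and the example lengths are hypothetical and are not data from the cited studies.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)  # longest sequences first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half of the
        # total assembled bases is the N50.
        if running >= half_total:
            return length


# Hypothetical transcript lengths in bp (illustration only).
example_lengths = [1200, 800, 529, 400, 356, 300, 200, 150]
example_n50 = n50(example_lengths)
```

Because N50 is weighted toward long sequences, it is usually larger than the mean length, which is consistent with the reported N50 of 529 bp against an average transcript length of 356 bp.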
MicroRNAs (miRNAs) are small endogenous RNAs that play an important role in plant development and growth, as well as in stress responses
[17].
C. sinensis has 47,452 expressed sequence tags (ESTs) available, and 14 new
C. sinensis miRNAs were identified by EST analysis. These miRNAs target 51 mRNAs, which can act as transcription factors and participate in several cellular processes such as oxidation-reduction, transmembrane transport, signal transduction, and stress response. Indeed, gene ontology analysis based on these targets suggested that 37 biological processes were involved, and the Kyoto Encyclopedia of Genes and Genomes pathway enrichment analysis inferred that the identified miRNAs participate in 13 metabolic networks. Jaiprakash et al.
[18] applied an RNA isolation protocol that used guanidine hydrochloride on lyophilized leaves of
C. sinensis, in order to explore the use of lyophilized tissues for RNA isolation from tea leaves. High RNA yields (~500 μg/g dry weight of leaf tissue) were obtained, and the RNA was suitable for Northern blotting, reverse transcription, and microarray analysis. RNA obtained from lyophilized leaves was therefore of high quality, undegraded, and useful for downstream applications
[18]. A comparative transcriptome analysis by RNA sequencing showed that exogenous calcium increased
C. sinensis thermotolerance
[19].
3. Proteomics
Proteomics provides expertise in identifying and quantifying the cellular levels of each protein encoded by the genome
[12]. The most popular methods are based on the combination of two-dimensional gel electrophoresis with mass spectrometry (MS). Other methods, such as high-throughput, quantitative, and western-blot analysis have also been implemented, but require extraordinary resources and efforts. With these methods, the proteomes of several cell structures and organelles, such as the mitochondria and cytoskeleton, can be assessed. In MS experiments, compounds are identified by accurate measurements of their mass-to-charge ratios
[13]. In proteomic analysis, typical MS data sets are formed by a list of proteolytic peptides characterized by their mass-to-charge ratios (MS spectra, MS1). Moreover, these peptides may be further fragmented and the fragments measured in the resulting mass spectra (MS-MS spectra or tandem MS spectra, MS2), and this information may be used to deduce their sequences. For some complex samples, fractionation followed by separation of the proteolytic peptides by high performance liquid chromatography (HPLC) should be performed prior to MS analysis (LC-MS). Several search engines have been developed to predict peptides and proteins
[20][21], comparing experimentally measured spectra with theoretical spectra.
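As an illustration of how a theoretical value is obtained for comparison with a measured MS1 signal, the mass-to-charge ratio of a peptide can be computed from its sequence. The following is a minimal sketch, assuming an unmodified peptide, monoisotopic residue masses, and ionization by protonation only. The function name and the example peptide are hypothetical, and real search engines handle modifications, isotopes, and fragment ions that are omitted here.

```python
PROTON = 1.007276   # mass of a proton, Da
WATER = 18.010565   # monoisotopic mass of H2O, Da

# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def peptide_mz(sequence, charge):
    """Theoretical m/z of an unmodified peptide carrying `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    # Neutral peptide mass = sum of residue masses + one water (termini).
    neutral_mass = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    return (neutral_mass + charge * PROTON) / charge


# Hypothetical tripeptide, singly and doubly protonated.
mz_1 = peptide_mz("GAS", 1)
mz_2 = peptide_mz("GAS", 2)
```

The same peptide therefore appears at different m/z values depending on its charge state, which is one reason why charge assignment precedes database matching.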
The reliability of a given protein or peptide identification is measured by quality scores. The overall quality of entire MS data sets is commonly measured by the false discovery rate (FDR), which is the expected proportion of incorrect assignments among the accepted assignments. One of the most popular approaches to calculating the FDR is based on the use of a target-decoy database. In addition, several MS visualization tools have been developed
[13].
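The target-decoy idea can be sketched in a few lines. The example below assumes one simple and commonly used estimator, in which the number of decoy matches accepted at a given score threshold is taken as an estimate of the number of false target matches, so that the FDR is approximated by decoys divided by targets. The function name, the score scale, and the example matches are hypothetical; published tools differ in the exact estimator and in how target and decoy databases are searched.

```python
def target_decoy_fdr(matches, threshold):
    """Estimate the FDR among target matches scoring >= threshold.

    matches: iterable of (score, is_decoy) pairs, higher score = better.
    """
    accepted = [is_decoy for score, is_decoy in matches if score >= threshold]
    decoys = sum(1 for is_decoy in accepted if is_decoy)
    targets = len(accepted) - decoys
    if targets == 0:
        # No accepted target matches; report 0.0 by convention in this sketch.
        return 0.0
    # Decoy hits stand in for the unknown number of false target hits.
    return min(1.0, decoys / targets)


# Hypothetical peptide-spectrum matches as (score, is_decoy).
example_matches = [
    (95, False), (90, False), (88, False), (80, True), (75, False),
    (70, False), (60, True), (55, False), (50, True), (45, True),
]
fdr_strict = target_decoy_fdr(example_matches, 85)
fdr_medium = target_decoy_fdr(example_matches, 70)
fdr_loose = target_decoy_fdr(example_matches, 40)
```

Lowering the score threshold accepts more identifications but raises the estimated FDR, so in practice the threshold is chosen to meet a preset FDR level.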
MS-based proteomics is becoming the standard approach for systematic characterization of post-translational modifications, e.g., phosphorylation, glycosylation, ubiquitination or sumoylation, acetylation, and methylation
[22]. Post-translational modifications are important biochemical processes for regulating various signaling pathways and determining specific cell fate. Therefore, the identification and understanding of these covalent modifications is critical in the study of the phytochemicals present in plants and of the external factors that affect their composition. Li et al.
[23] compared the protein complement of
C. sinensis tea pollen under different storage conditions. Proteins were partially identified using a combination of two-dimensional polyacrylamide gel electrophoresis (2D-PAGE), matrix-assisted laser desorption-ionization time-of-flight mass spectrometry (MALDI-TOF/MS), and the MASCOT and Xproteo search engines. The analysis revealed that more proteins related to stress response, nucleic acid and fat metabolism, and membrane transport were lost at room temperature than at −20 °C, while defense- and energy metabolism-related proteins showed the reverse relationship. A rapid quality control methodology was developed for
C. sinensis arabinogalactan proteins (AGP)
[24]. Using the vectorial angle method and IR spectrum analysis, the 1200–800 cm−1 region in second-derivative IR spectra was determined as the key fingerprinting region of
C. sinensis AGP, with the 1090–900 cm−1 region reflecting its common and conserved characteristics. The major monosaccharides showed intense peaks at about 1075 cm−1 (galactose) and 1045 cm−1 (arabinose), and uronic acids at about 1018 cm−1, in second-derivative IR spectra. Variable regions were identified at about 1134–1094 cm−1 and 900–819 cm−1, probably owing to compositional and structural differences between AGPs. The constructed methodology was tested on
C. sinensis AGP extracted by three treatments and purified to apparent homogeneity, namely water-, pectinase-, and trypsin-extracted
C. sinensis AGP, with arabinose/galactose ratios of 1.37, 1.57, and 1.82, respectively
[24].
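The source does not give the formula behind the vectorial angle method. A minimal sketch is shown below, under the assumption that two spectra sampled at the same wavenumbers are treated as vectors and compared through the cosine of the angle between them, with values near 1 indicating closely matching fingerprints. The function name and the intensity values are hypothetical.

```python
import math


def vector_angle_cosine(spectrum_a, spectrum_b):
    """Cosine of the angle between two spectra sampled on the same grid."""
    if len(spectrum_a) != len(spectrum_b):
        raise ValueError("spectra must have the same number of points")
    dot = sum(a * b for a, b in zip(spectrum_a, spectrum_b))
    norm_a = math.sqrt(sum(a * a for a in spectrum_a))
    norm_b = math.sqrt(sum(b * b for b in spectrum_b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("spectra must not be all zeros")
    return dot / (norm_a * norm_b)


# Hypothetical second-derivative intensities at shared wavenumbers.
reference = [0.10, 0.45, 0.80, 0.30, 0.05]
sample = [0.12, 0.40, 0.85, 0.28, 0.07]
similarity = vector_angle_cosine(reference, sample)
```

Because the cosine is insensitive to overall scaling, two spectra that differ only in concentration would give a similarity of 1 under this assumption.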
4. Metabolomics
Primary and secondary plant metabolites are the final recipients of the flow of biological information in the cell, and their levels affect protein stability and gene expression
[25]. Measurements of these metabolites reflect the plant cellular state and produce critical insights into cellular processes that control biochemical phenotypes of the cell, tissue, or even whole organism. Metabolomics allows evaluation of medicinal plants, not only on the basis of pharmacologically important metabolites, but also based on the fingerprints of minor metabolites and bioactive molecules
[26][27]. Further, metabolomics can also be used for better characterization and quality control of plant extracts, tinctures, and phytotherapeutic products
[26].
In fact, metabolomics is conceived as the comprehensive, qualitative, and quantitative study of all the small molecules in an organism
[28]. The four conceptual approaches in metabolomics are:
- (1) Target analysis;
- (2) Metabolite profiling;
- (3) Metabolomics; and
- (4) Metabolic fingerprinting [29].
Target analysis covers the identification and quantification of a small set of known metabolites (targets) using a particular analytical technique, with best performance for the chemicals of interest. Metabolomics employs complementary analytical methods to determine and quantify the largest possible number of metabolites, either identified or unknown. Metabolic fingerprinting is applied to generate a metabolic signature or mass profile of the sample of interest for comparison with a large sample population, to track differences between samples.
The discipline of metabolomics aims to identify the complete set of metabolites in the cell
[12]. Medicinal plant metabolomes are of particular interest as a valuable natural resource for evidence-based development of new phytotherapeutic agents and nutraceuticals
[30][31][32][33][34]. Platforms dedicated to comparative metabolomics are evolving into new technologies to monitor drug metabolism, chemical toxicology, and disease development. Non-target metabolomics is a useful method for the simultaneous analysis of many compounds present in herbal products. LC-MS can determine the presence, amount, and sometimes structure of plant metabolites in complex plant mixtures. Nuclear magnetic resonance (NMR) is a common method used in metabolomics, and in contrast to MS-based approaches, in most cases an analytical separation is not required
[13]. Pattern recognition methods, such as principal component analysis (PCA), partial least squares discriminant analysis (PLS-DA), and batch-learning self-organizing map (BL-SOM) analysis, are usually employed to reduce the dimensionality of a metabolite fingerprint and to classify the analyzed samples
[27]. PCA and PLS-DA are latent variable methods, in which new components are computed as linear combinations of the original variables. PCA is an unsupervised method, whereas PLS-DA is a supervised method that uses the group membership of each sample. Metabolomics typically uses multivariate analysis to statistically process the large amount of analytical chemistry data. PCA and hierarchical cluster analysis allow differentiation of green tea samples from black tea samples
[35].
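The statement that principal components are linear combinations of the original variables can be illustrated with a small calculation. The following is a minimal sketch that extracts only the first principal component by power iteration on the covariance matrix; this is one of several equivalent ways to compute it and is not claimed to be the procedure used in the cited studies. The function name, the three variables, and all intensity values are hypothetical.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (loadings, scores) for PC1 of a samples-by-variables table."""
    n, p = len(rows), len(rows[0])
    # Mean-center each variable (column).
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in rows]
    # Sample covariance matrix of the variables (p x p).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    # Power iteration: repeated multiplication by the covariance matrix
    # converges to its dominant eigenvector, i.e. the PC1 loadings.
    # The non-uniform start vector is an arbitrary choice for this sketch.
    v = [1.0 + j for j in range(p)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Scores are the projections of the centered samples onto the loadings.
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centered]
    return v, scores


# Hypothetical intensities for three variables per sample
# (catechins, theaflavins, caffeine); illustration only.
green = [[9.0, 1.0, 3.0], [8.5, 1.2, 3.1], [9.2, 0.8, 2.9]]
black = [[2.0, 6.0, 3.0], [2.5, 5.5, 3.2], [1.8, 6.3, 2.8]]
loadings, scores = first_principal_component(green + black)
green_scores, black_scores = scores[:3], scores[3:]
```

In this toy table the two groups fall on opposite sides of zero along the first component even though no group labels were supplied, which is what makes PCA unsupervised. The sign of a component is arbitrary, so only the separation, not its direction, is meaningful.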
The study of the chemical composition of a typical tea beverage shows that the primary metabolite content is similar (% wt/wt solids) in green and black tea, with 3% amino acids, 6% peptides and proteins, 2% organic acids, 7% sugars, 4% other carbohydrates, and 3% lipids in both
[36].
C. sinensis leaf contains a very high level of polyphenols, especially flavonoids, an important characteristic that distinguishes it from other plants
[37]. These flavonoids and their oxidative products, formed through fermentation and drying processes, largely produce the color and taste of black tea. The flavonoid content of various tea clones correlates with chalcone synthase activity in the tea leaf. Le Gall et al.
[38] collected
C. sinensis from various countries for metabolomics analysis by
1H NMR, in order to establish whether teas could be discriminated according to country of origin or quality. PCA showed that Longjing teas (highest quality Chinese tea) had higher levels of theanine, gallic acid, caffeine, epigallocatechin-3-gallate (EGCG), and epicatechin (EC) gallate, and lower levels of epigallocatechin (EGC), than other teas. Different cultivation methods affect green tea quality by altering the metabolic profile
[37][39].
Metabolome changes were investigated in green tea and shade-cultured
C. sinensis (tencha) by LC-MS and gas chromatography-mass spectrometry (GC-MS) coupled with multivariate data analysis
[40]. PCA and orthogonal projections to latent structures discriminant analysis (OPLS-DA) clearly showed that green tea had higher levels of galloylquinic acid, EGC, EC, succinic acid, and fructose than tencha, along with lower levels of gallocatechin, strictinin, apigenin glucosyl arabinoside, quercetin
p-coumaroylglucosylrhamnosylgalactoside, kaempferol
p-coumaroylglucosylrhamnosylgalactoside, malic acid, and pyroglutamic acid. The effects of climatic conditions on
C. sinensis metabolites in three different growing areas of Jeju Island, South Korea, were investigated through
1H NMR spectroscopy
[41]. Pattern recognition methods, such as OPLS-DA and PCA, revealed clear discriminations of green teas from the different growing areas. Variations of theanine, quinic acid, glucose, EC, EGC, EGCG, caffeine, and amino acid profile were responsible for the discriminations. The dependence of global
C. sinensis metabolome on plucking positions was investigated through
1H NMR spectroscopy coupled with multivariate statistical analysis
[42]. OPLS-DA and PCA were employed to find a metabolic discrimination among fresh
C. sinensis leaves plucked at different positions, from old to young leaves. The results showed a clear metabolic discrimination among
C. sinensis leaves (increased levels of theanine, caffeine, and gallic acid, and decreased levels of catechins, glucose, and sucrose) as the tea plant grows. Moreover, differences in metabolism were observed between the tea leaf and stem.
Seasonal variations of phenolic compounds were studied, using an HPLC method, in fresh tea shoots grown in Australia
[43]. The contents of EGCG, ECG, and CG were higher in fresh tea shoots in the warm months and lower in the cool months. Factors that may induce seasonal variations in tea shoots include day length, sunlight, and temperature, which vary markedly across seasons. In another study, GC coupled with time-of-flight MS and multivariate data analysis was employed to evaluate green tea quality, showing changes dependent on green tea varieties and manufacturing processes
[44].
In conclusion, the synthesis of several phenolic compounds related to tea quality, such as EGCG, ECG, and CG, is temperature sensitive or temperature dependent
[43][45][46]. Moreover, it has also been suggested that EGCG synthesis depends on longer day length or stronger sunlight during the summer months
[36][39].