Steady progress in genomics, bioinformatics, and chemical analytics has greatly facilitated lasso peptide discovery over the last decade. In addition to the class-defining modifications, namely leader peptide excision and core peptide cyclization, a series of unique PTMs, including disulfuration, phosphorylation, C-terminal methylation, acetylation, and hydroxylation, have been unveiled recently, further increasing the diversity of structures and properties and complicating the maturation mechanisms.
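Because each of these tailoring reactions changes the elemental composition of the peptide by a fixed amount, they are typically first spotted as characteristic mass shifts in MS data. The following is a minimal sketch, not taken from the cited studies, that derives the monoisotopic shift of each modification from an assumed net elemental change; the names `MONO`, `PTM_DELTA`, and `mass_shift` are illustrative only.

```python
# Monoisotopic atomic masses in Da (standard values, rounded).
MONO = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "P": 30.97376200,
}

# Assumed net elemental change per modification (idealised chemistry):
#   disulfide bond  : two thiols lose 2 H
#   phosphorylation : O-H becomes O-PO3H2, net gain of HPO3
#   methylation     : H replaced by CH3, net gain of CH2
#   acetylation     : H replaced by COCH3, net gain of C2H2O
#   hydroxylation   : C-H becomes C-OH, net gain of O
PTM_DELTA = {
    "disulfide bond": {"H": -2},
    "phosphorylation": {"H": 1, "P": 1, "O": 3},
    "methylation": {"C": 1, "H": 2},
    "acetylation": {"C": 2, "H": 2, "O": 1},
    "hydroxylation": {"O": 1},
}


def mass_shift(delta: dict) -> float:
    """Return the monoisotopic mass change (Da) for a net elemental change."""
    return sum(MONO[element] * count for element, count in delta.items())


for name, delta in PTM_DELTA.items():
    print(f"{name:16s} {mass_shift(delta):+.4f} Da")
```

Under these assumptions, a polyphosphorylated species would simply carry an integer multiple of the phosphorylation shift, since each additional phosphate in the chain again adds HPO3.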
2. Disulfuration
Disulfide bonds are rare among known RiPP families and may play an auxiliary role in maintaining the correct conformation, which is crucial for biological activity. To the best of our knowledge, disulfide bonds are characteristic of only three classes of RiPPs: glycocins, the post-translationally glycosylated bacteriocins featuring sugar moieties on Ser, Thr, or Cys residues and two nested disulfide bonds that stabilize their unique helix–loop–helix structures [8]; cyclotides, featuring a head-to-tail cyclic peptide backbone with a cystine knot arrangement of three conserved disulfide bonds [9]; and conopeptides, the cone-snail-derived RiPPs containing a high frequency of PTMs, including disulfide bond(s) [10]. A few members of other classes, such as lanthipeptides, cyanobactins, sactipeptides, and lasso peptides, also contain disulfide bond(s). Two thiol-disulfide oxidoreductases and a protein-disulfide isomerase (PDI) were reported to form the disulfide bond(s) in glycocins and cyclotides, respectively [11][12], whereas the mechanism of disulfide bond formation in conopeptides remains elusive.
3. Phosphorylation
Phosphorylation was the earliest characterized tailoring process in lasso peptides. Paeninodin, from the firmicute Paenibacillus dendritiformis C454, is a class II lasso peptide with a C-terminal Ser residue, and its BGC encodes an additional putative tailoring kinase (PadeK) (Figure 2a) [13]. Both unphosphorylated and phosphorylated paeninodin were detected in extracts after heterologous expression of the paeninodin cluster in Escherichia coli. Deletion of the kinase gene padeK resulted in the production of only unphosphorylated paeninodin, while complementation of the deleted gene by co-expression of padeK from a second vector restored the phosphorylated compound, directly linking the kinase PadeK to the tailoring phosphorylation of paeninodin. The precursor peptide PadeA, rather than the threaded lasso peptide, was verified to be the substrate of PadeK, which specifically modifies the hydroxyl group of the C-terminal Ser. This site is highly conserved in the precursor sequences from various lasso peptide BGCs that feature a homologous kinase gene. Phosphorylation therefore appears to precede the core maturation steps catalyzed by the B2 and C proteins (Figure 2b) [13]. Owing to the low solubility of PadeK, the homologous kinase ThcoK from another firmicute, Thermobacillus composti KWC4, was characterized in vitro instead. Replacing padeK with thcoK in the paeninodin heterologous expression system yielded the phosphorylated peptide with only minor amounts of unmodified compound, demonstrating the feasibility of the hybrid gene cluster. Sequence alignments of lasso peptide-tailoring kinases revealed a conserved His-Lys-Asp-Asp motif, and the essential roles of these four catalytic residues were further demonstrated by site-directed mutagenesis [13].
Figure 2. C-terminal phosphorylation of paeninodin and related lasso peptides. (a) The BGCs of phosphorylated lasso peptides. (b) Proposed biosynthetic pathway of paeninodin. The precursor peptide is phosphorylated by PadeK at the C-terminal Ser residue and then matured by the B1, B2, and C proteins to generate paeninodin. (c) Polyphosphorylation of lasso peptides. ThcoK and SyanK also polyphosphorylate the precursor peptide at the C-terminal Ser residue. (d) The structure of pseudomycoidin.
4. Methylation
Methylation is a versatile modification in the biosynthesis of various natural products. Lassomycin, discovered from Lentzea kentuckyensis sp., is an intriguing lasso peptide that exhibits potent activity against a variety of Mycobacterium tuberculosis strains, with minimum inhibitory concentration (MIC) values of 0.8–3 μg/mL, and is inactive against symbionts of the human microbiota [14]. Although the initial structure elucidation indicated that lassomycin adopted an unthreaded structure [14], subsequent chemical synthesis of this peptide showed that the reported structure was incorrect and that a characteristic threaded conformation is essential for its anti-tuberculosis (TB) activity [15][16]. In addition, lassomycin features a unique methyl ester at the C-terminal carboxyl group, and the putative O-methyltransferase LasF, encoded in its BGC, was proposed to be responsible for the C-terminal methylation (Figure 3a) [14].
Figure 3. C-terminal methylation of lassomycin and related lasso peptides. (a) The BGCs of methylated lasso peptides. (b) Sequence alignment of precursor peptides. (c) Proposed biosynthetic pathway of C-terminally methylated lasso peptides. StspM methylates the C-terminal carboxyl group of the precursor peptide.
5. Acetylation
A novel lasso peptide BGC encoding albusnodin was found in S. albus DSM 41398; it includes a putative acetyltransferase gene (albT) as well as the canonical genes albA, albB, and albC [17]. The only product observed upon heterologous expression was the threaded, C-terminal Cys-truncated albusnodin with an acetyl group attached to the ε-amino group of Lys10 (Figure 4) [17]. Sequence alignments showed that Lys10 is highly conserved among precursor peptides in an array of lasso peptide BGCs that resemble the BGC architecture of albusnodin. Heterologous expression of the albusnodin cluster lacking the acetyltransferase gene albT gave no trace of the predicted unacetylated intermediate [17], suggesting that acetylation is essential and occurs at an early stage of albusnodin biosynthesis rather than as the last step. Moreover, the BGC of the antitumor lasso peptide ulleungdin also contains an acetyltransferase gene downstream of B2, yet acetylated ulleungdin was not detected [18]. This acetyltransferase therefore appears to be unrelated to ulleungdin biosynthesis.
Figure 4. The BGC and structure of albusnodin. The acetyl group attached to the ε-amino of K10 is highlighted in red.
6. Hydroxylation
The RES-701 lasso peptides, originally isolated from Streptomyces sp. RE-896, are regarded as selective endothelin type B receptor (ETBR) antagonists. RES-701-2 and RES-701-4 contain a C-terminal 7-hydroxy-tryptophan, in contrast to the unhydroxylated RES-701-1 and RES-701-3 [19][20][21]. Recently, RES-701-3 and RES-701-4, which differ in the hydroxylation of the C-terminal tryptophan residue, were rediscovered through genome mining from the marine S. caniferus CA-271066, and their BGC (hereafter termed res) was identified with an additional gene (resE) encoding a hypothetical protein (Figure 6b). Although direct evidence is lacking, ResE was proposed to catalyze the 7-hydroxylation of the C-terminal tryptophan residue, which remains to be verified [22].
7. Epimerization
The function of MslH in the epimerization of the C-terminal l-Trp was further validated in vitro [23]. The full-length precursor MslA is the preferred substrate of MslH, as the comparable reaction with the leaderless core peptide produced only a minor amount of d-Trp. Like CanB1 in canucin A biosynthesis, MslB1 is a bifunctional protein that not only assists the proteolysis of the leader peptide catalyzed by MslB2, but also markedly enhances the epimerization activity of MslH. Only about 50% conversion of MslA to epi-MslA was observed, implying that MslH generates an equilibrium mixture of the epimers. Since the C-terminal l-Trp derivative has never been detected in the MS-271 producer, the subsequent MslB2 and MslC maturation steps probably recognize epi-MslA as the sole substrate and drive the equilibrium toward the d-Trp-containing precursor peptide (Figure 6b). Furthermore, MslH could epimerize other aromatic residues at this position, as in the W21F and W21Y variants, at considerable levels, as well as chimeric substrates combining the sviceucin N-terminal core peptide sequence with the C-terminal “CFW” (Figure 1b), displaying a broad substrate tolerance [23].
Figure 6. The BGC (a) and proposed biosynthetic pathway (b) of MS-271. MslH epimerizes the C-terminal Trp residue of the precursor peptide with the aid of MslB1, similar to the cooperation of CanE and CanB1.
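The roughly 50% conversion reported for MslH can be read as an equilibrium constant close to one. The following is a back-of-the-envelope estimate rather than a measured value; it assumes that the in vitro reaction had reached equilibrium and that MslA and epi-MslA were the only species present, with x (a symbol introduced here) denoting the fraction of epi-MslA:

```latex
K_{\mathrm{eq}} = \frac{[\text{epi-MslA}]}{[\text{MslA}]} = \frac{x}{1 - x} \approx \frac{0.5}{0.5} = 1,
\qquad
\Delta G^{\circ} = -RT \ln K_{\mathrm{eq}} \approx 0 .
```

With a standard free energy change near zero, the epimerase alone cannot push the conversion much beyond one half, which is consistent with the proposal that selective consumption of epi-MslA by MslB2 and MslC pulls the equilibrium toward the d-Trp product.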
d-Amino acids are rare in RiPPs, and only a few mechanisms for their formation have been verified. For instance, the single radical S-adenosylmethionine (SAM) peptide epimerase PoyD introduces up to 18 d-amino acids in the biosynthesis of polytheonamides [24], and another radical SAM epimerase, YydG, catalyzes the formation of d-Val and d-allo-Ile residues in the biosynthesis of the epipeptide YydF [25]. Additionally, d-Ala and d-amino butyric acid (d-Abu) residues are introduced into lanthipeptides by the hydrogenation of 2,3-didehydroalanine (Dha, dehydrated Ser) and 2,3-didehydrobutyrine (Dhb, dehydrated Thr) via different oxidoreductases, including the zinc-dependent dehydrogenases termed LanJA [26], the flavin oxidoreductases termed LanJB [27], and the F420H2-dependent reductases termed LanJC [28]. The characterization of the metallophosphatase superfamily protein MslH provides a novel biosynthetic mechanism for d-amino acids in RiPPs.
8. Citrullination
Citrullination, the deimination of Arg to produce the non-proteinogenic amino acid citrulline (Cit), had never been reported in RiPPs until the lasso peptide citrulassin A was discovered from S. albulus NRRL B-3066 using the Rapid ORF Description and Evaluation Online (RODEO) genome-mining tool. The conversion of Arg9, which is invariant in the citrulassin family, to Cit was confirmed by in silico analysis of the precursor peptide sequence and nuclear magnetic resonance (NMR) analysis of the mature citrulassin A (Figure 7b). Heterologous expression of the citrulassin A cluster with ~20 kb upstream and downstream regions produced only des-citrulassin A with unmodified Arg9, suggesting that the enzyme responsible for citrulline generation is encoded elsewhere in the genome [29].
Subsequent research revealed that a peptidyl arginine deiminase (PAD) is responsible for the deimination of Arg to generate Cit (Figure 7a): the distantly encoded pad gene was present in the genomes of all citrulassin-producing strains with only one exception, while strains lacking pad correlated with the production of Arg-bearing des-citrulassin. Heterologous expression of the pad gene in the native des-citrulassin D producer (S. katrae NRRL B-16271) resulted in conversion to the deiminated citrulassin D (Figure 7c) [30]. Future work is necessary to unveil the timing of deimination during citrulassin biosynthesis.
Figure 7. The BGC, structure, and conversion of citrulassin A. (a) The BGC of citrulassin A. The pad gene is distantly encoded in the genome. (b) The structures of citrulassin A and des-citrulassin A. The oxygen atom in Cit9 is highlighted in red. (c) PAD catalyzes the deimination of Arg9 to generate Cit9.
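The deimination of Arg to Cit replaces an imine nitrogen with oxygen, so the resulting mass change is less than 1 Da. The sketch below is illustrative rather than taken from the cited work: it computes the shift from the standard residue formulas of Arg and Cit and compares it with the spacing of the natural 13C isotope peak. The names `ARG_RESIDUE`, `CIT_RESIDUE`, and `C13_SPACING` are introduced here for illustration.

```python
# Monoisotopic atomic masses in Da (standard values, rounded).
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}

# Residue formulas (amino acid minus H2O, as found within a peptide chain).
ARG_RESIDUE = {"C": 6, "H": 12, "N": 4, "O": 1}
CIT_RESIDUE = {"C": 6, "H": 11, "N": 3, "O": 2}

# Mass difference between 13C and 12C (Da), i.e. the isotope peak spacing
# for a singly charged ion.
C13_SPACING = 1.00335484


def residue_mass(formula: dict) -> float:
    """Monoisotopic mass (Da) of a residue given its elemental formula."""
    return sum(MONO[element] * count for element, count in formula.items())


def citrullination_shift() -> float:
    """Mass change (Da) when one Arg residue is converted to Cit."""
    return residue_mass(CIT_RESIDUE) - residue_mass(ARG_RESIDUE)


def gap_to_isotope_peak() -> float:
    """Difference (Da) between the 13C spacing and the Arg-to-Cit shift."""
    return C13_SPACING - citrullination_shift()


print(f"Arg -> Cit shift: {citrullination_shift():+.5f} Da")
print(f"Gap to 13C peak : {gap_to_isotope_peak():.5f} Da")
```

Because the shift of roughly +0.98 Da lies close to the +1.003 Da isotope spacing, mass data alone can be ambiguous at modest resolution, which is one reason complementary evidence such as the NMR analysis mentioned above is valuable.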
9. Succinimidation
Protein l-isoaspartyl methyltransferases (PIMTs) usually play a crucial role in protein repair, recognizing abnormal isoaspartate (isoAsp) residues and converting them back to l-Asp through a SAM-dependent methyl esterification reaction [31]. In total, 48 lasso peptide BGCs were uncovered bearing genes annotated as O-methyltransferases that are PIMT homologues, and the highly conserved Asp6 in all the putative precursor peptides suggested that it might be the modification site [32]. Heterologous expression of two clusters, from the actinobacterium Thermobifida cellulosilytica (tce) and the firmicute Lihuaxuella thermophila (lih) (Figure 8a), resulted in the discovery of cellulonodin-2 and lihuanodin, which feature an unconventional succinimide moiety (also known as aspartimide) in the macrolactam ring. In vitro experiments showed that TceM and LihM catalyze the methylation of Asp6 to the corresponding methyl ester, followed by spontaneous nucleophilic attack by the backbone amide nitrogen of the adjacent Thr7 to form a stable succinimide moiety that is not hydrolyzed further. Notably, TceM and LihM act on Asp rather than isoAsp, in stark contrast to canonical PIMTs. In addition, both TceM and LihM recognize only the threaded lasso peptides, not the linear precursors or isopeptide-bonded rings (Figure 8b) [32].
The functions of TceM and LihM are distinct from that of the previously reported PIMT OlvSA, which is involved in the biosynthesis of the lanthipeptide OlvA(BCSA). The succinimide generated by OlvSA undergoes non-enzymatic hydrolysis to either Asp or isoAsp, and this process is reversible because isoAsp can also be recognized by OlvSA to regenerate the succinimide [33].
Figure 8. The BGCs (a) and biosynthetic pathways (b) of cellulonodin-2 and lihuanodin. Zoomed-in images of the Asp6 and Thr7 residues in the threaded lasso peptides are provided for further illustration. Both TceM and LihM could only recognize the threaded lasso peptides as substrates.
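The two-step route to the succinimide, methyl esterification followed by cyclization with loss of methanol, leaves a distinctive mass signature at each stage. The following is a minimal bookkeeping sketch under idealised chemistry (methylation as a net gain of CH2, cyclization as a loss of CH3OH); the function and variable names are illustrative and do not come from the cited study.

```python
# Monoisotopic atomic masses in Da (standard values, rounded).
MONO = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}

CH2 = {"C": 1, "H": 2}            # net gain on methyl ester formation
CH3OH = {"C": 1, "H": 4, "O": 1}  # methanol released on ring closure
H2O = {"H": 2, "O": 1}            # water, for comparison


def formula_mass(formula: dict) -> float:
    """Monoisotopic mass (Da) of an elemental formula."""
    return sum(MONO[element] * count for element, count in formula.items())


def succinimide_pathway_shifts() -> tuple:
    """Return cumulative mass shifts (Da) relative to the unmodified peptide.

    First value: after Asp methyl esterification.
    Second value: after cyclization to the succinimide.
    """
    methyl_ester = formula_mass(CH2)
    succinimide = methyl_ester - formula_mass(CH3OH)
    return methyl_ester, succinimide


ester_shift, succinimide_shift = succinimide_pathway_shifts()
print(f"methyl ester : {ester_shift:+.5f} Da")
print(f"succinimide  : {succinimide_shift:+.5f} Da")
print(f"minus water  : {-formula_mass(H2O):+.5f} Da")
```

The net result equals the loss of one water molecule from the unmodified peptide, so the succinimide product appears as a dehydrated species, while any methyl ester intermediate would appear about 14 Da heavier than the starting peptide.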
10. Linearization
The threaded topology has been shown to be necessary for isopeptidase-catalyzed hydrolysis. Hydrolysis could be detected by retention time changes in HPLC and mass increases in MS2 spectra, but no change was observed when unthreaded astexin-2 was incubated with AtxE2, indicating that the lariat knot configuration is required [34]. The crystal structures of AtxE2 and SpI-IsoP showed that isopeptidases consist of an N-terminal open β-propeller domain and a C-terminal α/β-hydrolase domain [35][36]. The latter features a conserved Ser-His-Glu/Asp catalytic triad typical of serine proteases, and the isopeptide bond is cleaved via nucleophilic attack by the Ser alkoxide [34][35][36]. Cocrystallization of AtxE2 in complex with tail-truncated astexin-3 further demonstrated that the isopeptidase recognizes the lasso peptide by shape complementarity rather than by a specific amino acid sequence: the Ser10-Gln14 loop region of astexin-3 is accommodated in a narrow and slightly acidic pocket of AtxE2, and only a few specific interactions exist at the complex interface [36].