Carbohydrates are crucial constituents of the surface of many bacterial species, where they are mainly present as capsular polysaccharides (CPSs), glycoproteins, and glycolipids (such as lipopolysaccharide, LPS) [1]. These molecules have structural features that are characteristic of specific bacteria and of different bacterial serotypes (STs) [2]. Moreover, they play a fundamental immunomodulatory role in the host after a pathogen attack and have been widely used to elicit immune responses that protect against bacterial infectious diseases [3]. With progress in the immunological evaluation and structural identification of bacterial surface polysaccharides (PSs), various carbohydrate-based vaccines have been developed against infectious diseases [4]. Developing these vaccines requires a fundamental understanding of the immune epitopes of glycan antigens: identifying the key epitopes is essential for producing a robust antibody response against bacterial surface glycans (BSGs) [5]. The epitope characteristics of BSGs are suggested to be substantially associated with the frameshifts, length, terminal sugars, sequence, and side-chain constituents [2]. It is well known that bacterial PSs often include various non-carbohydrate constituents, including phosphate, acetyl, pyruvate ketal, and amino acids, which increase the structural diversity of BSGs [6,7,8]. Although the biomedical activities of most functional groups in BSGs have not yet been identified, they are considered crucial immunological determinants that constitute an essential part of the immunodominant epitopes. Previous studies have shown that some functional groups in BSGs are essential immunological determinants [5,6,9], while others may mask crucial epitopes from the immune system, thus inhibiting the antibody response and facilitating immune evasion [6,10,11]. It is therefore urgent to clarify the biomedical importance of functional groups in more BSGs, which can provide strong guidance for their medical application. The primary reason for the lack of research on the activities of BSG functional groups is the difficulty of obtaining well-defined sugar chains with and without functional groups. Only O-acetyl-modified bacterial glycans have been widely studied, as the O-acetyl groups can be easily removed under alkaline conditions [6]. With advances in structural identification and modern synthetic methods, it has become easier to acquire well-defined bacterial glycans with or without these non-carbohydrate constituents, providing a means to investigate the biomedical role of these functional groups [5,9,12]. Moreover, the development of analytical techniques such as nuclear magnetic resonance (NMR) spectroscopy and X-ray crystallography has significantly improved the discovery and identification of non-carbohydrate functional groups and allows the exploration of structure–activity relationships.
Multiple functional groups in BSGs can be linked to sugars in different ways, mainly as O-modified, N-modified, and carboxy-modified substituents by ester, acetal, ether, and amidic linkages. Identification of the structures and types of these functional groups will allow a better understanding of their biomedical activity.
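As a rough illustration of this classification, the sketch below records a few substituents together with their attachment site (O, N, or carboxyl) and linkage type, then groups them by site. The class name `Substituent`, the helper `group_by_site`, and the choice of entries are hypothetical and serve only this example; the linkage assignments follow general organic-chemistry usage (for instance, a pyruvate ketal counted as an acetal-type linkage) rather than any scheme given in the source.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Substituent:
    name: str      # functional group, named as in the text
    site: str      # attachment site: "O", "N", or "carboxyl"
    linkage: str   # "ester", "acetal", "ether", or "amide"


# Illustrative entries only. The linkage assignments are an assumption of
# this sketch based on standard chemistry, not a table from the source.
EXAMPLES = [
    Substituent("acetyl", "O", "ester"),
    Substituent("phosphate", "O", "ester"),
    Substituent("methyl", "O", "ether"),
    Substituent("1-carboxyethyl", "O", "ether"),
    Substituent("pyruvate ketal", "O", "acetal"),
    Substituent("acetyl", "N", "amide"),
    Substituent("L-alanine", "carboxyl", "amide"),
]


def group_by_site(substituents):
    """Group (name, linkage) pairs by the site they modify."""
    grouped = defaultdict(list)
    for s in substituents:
        grouped[s.site].append((s.name, s.linkage))
    return dict(grouped)


if __name__ == "__main__":
    for site, members in group_by_site(EXAMPLES).items():
        print(site, members)
```

Keeping the site and the linkage as separate fields reflects the point made above: the same group (acetyl, for example) can appear at different sites with a different linkage chemistry.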
2. Bacterial Glycans’ O-Modified Side-Chain Functional Groups
The O-acetyl moieties are frequently observed and have been determined as targets for many PS antigens [6,13] (Table 1). On the other hand, various rare O-linked acyl groups have been found in some BSGs; for instance, in V. anguillarum O-antigen, an O-linked propanoyl group has been identified [14] (Table 1). O-Methylation is also a frequent modification, and the structural and compositional profiles of the parent polymers revealed that the O-methylation is either partial, stoichiometric, or, in some cases, confined to the non-reducing terminal unit [15]. Ethers with (S)- and (R)-lactic acids, generating 1-carboxyethyl derivatives, were observed in O-antigens from multiple enterobacteria, such as 4-O-[(R)-1-carboxyethyl]-D-glucose from Shigella dysenteriae [16], while 2-NAc-3-O-[(S)-1-carboxyethyl]-2-deoxy-D-glucose has been determined as an O-PS component of Proteus penneri [17]. In the structure of the O-antigen of Providencia alcalifaciens, stereoisomeric 2,4-dihydroxypentanoic acids were found to be linked to different monosaccharides via ether linkages [18]. Cyclic (R)- or (S)-pyruvate ketals are common in BSGs, e.g., CPS and LPS. Pyruvate groups are usually present as 4,6-O-ketals, which modify the 4-OH and 6-OH of various monosaccharides [19]. 3,4-O-Pyruvate ketal isomeric forms have been found on 3-OH and 4-OH of both D-galactose and L-rhamnose in some BSGs [20]. However, 2,3-O-pyruvate ketal-containing sugar residues have been identified only in a few cases; for example, a 2,3-O-pyruvate ketal-α-D-galactose was identified in the CPS of Streptococcus pneumoniae ST4 [21]. Phosphoric esters are frequently observed in BSGs, mainly interlinking monosaccharides in the PS chain [22,23]. Conversely, they may also attach as substituents or add an amino alcohol or alcohol to the main chain. Glycerol, 2-aminoethanol, and ribitol are the most frequent phosphate-linked non-sugar parts. There are also some uncommon compounds, such as choline [24] in Morganella morganii O-antigens and arabinitol in H. alvei 1191 [25].
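The repeat units in this entry are written in a condensed notation in which residues are joined by linkage tokens such as -(1→3)- and the modified position carries a placeholder R. A minimal sketch of how a linear unit in this notation could be split into residues and linkages is shown below, using the S. pneumoniae ST4 CPS unit listed in Table 1. The function names and regular expressions are assumptions made for this illustration, and the parser deliberately ignores branched units written with square brackets.

```python
import re

# Linkage token between two residues, e.g. "-(1→3)-"
LINK = re.compile(r"-\((\d)→(\d)\)-")

# S. pneumoniae ST4 CPS repeat unit as written in Table 1 (R = pyruvate ketal)
ST4_UNIT = (
    "→3)-β-D-ManpNAc-(1→3)-α-L-FucpNAc-(1→3)-"
    "α-D-GalpNAc-(1→4)-α-D-Galp2,3R-(1→"
)


def parse_linear_repeat_unit(unit):
    """Split a linear (unbranched) repeat unit into residues and linkages."""
    start = re.match(r"→(\d)\)-", unit)   # leading "→3)-"
    end = re.search(r"-\((\d)→$", unit)   # trailing "-(1→"
    if start is None or end is None:
        raise ValueError("expected a unit of the form '→n)-...-(m→'")
    core = unit[start.end():end.start()]
    parts = LINK.split(core)              # [residue, from, to, residue, ...]
    residues = parts[0::3]
    linkages = [(int(a), int(b)) for a, b in zip(parts[1::3], parts[2::3])]
    # Linkage that joins the last residue to the next repeat unit
    closing = (int(end.group(1)), int(start.group(1)))
    return residues, linkages, closing


def modified_residues(residues):
    """Residues that carry the placeholder R at a numbered position."""
    return [r for r in residues if re.search(r"\d[ON]?R", r)]
```

For the ST4 unit this yields four residues, with the 2,3-pyruvylated galactose as the only one flagged as modified. Residues whose substituent position is not numbered (for example a plain NR) are not detected by this simple pattern.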
Table 1. The known O-modified functional groups in bacterial PS antigens.
O-Modified Groups | Representative Bacterial Polysaccharides | Glycan | References
acetyl | →4)-β-D-GlcpNAc3RA-(1→4)-α-L-FucpNAm3R-(1→3)-α-D-Sugp-(1→ | Flavobacterium columnare ATCC 43622 O-antigen | [13]
propanoyl | β-L-Quip3NAc-(1-[→4)-β-L-Quip3NAc-(1→4)-[α-D-QuipNAc3/4R-(1→2)]-β-L-Quip3NAc-(1-]→ | Vibrio anguillarum O-antigen | [14]
methyl | α-D-Manp3OR-(1-[→3)-β-D-Manp-(1→2)-α-D-Manp-(1→2)-α-D-Manp-(1-]→ | Klebsiella O5 and Escherichia coli O8 O-antigen | [15]
(S)-1-carboxyethyl | →3)-β-D-GlcpNAc6OAc-(1→6)-β-D-GlcpNAc3OR-(1→3)-α-D-Galp-(1→ | Proteus penneri 62 O-antigen | [17]
2,4-dihydroxypentanoic acid 2-ethers | →4)-β-D-GalpNAc-(1→3)-β-D-GalpNAc-(1→3)-[β-D-Manp4OR-(1→4)]-α-D-Galp-(1→ | Providencia alcalifaciens O31 O-antigen | [18]
4,6-O-pyruvate ketal | →3)-α-D-QuipNAc-(1→4)-[α-D-GalpNAc4,6R-(1→6)]-α-D-GalpNAc-(1→4)-α-D-GalpNAcA-(1→ | Acinetobacter baumannii D78 CPS | [19]
3,4-O-pyruvate ketal | →3)-β-D-GlcpNAc-(1→4)-[β-D-Galp3,4R-(1→3)]-β-D-GalpNAc-(1→4)-β-D-GlcpNAc-(1→ | P. mirabilis O24 O-antigen | [20]
2,3-O-pyruvate ketal | →3)-β-D-ManpNAc-(1→3)-α-L-FucpNAc-(1→3)-α-D-GalpNAc-(1→4)-α-D-Galp2,3R-(1→ | Streptococcus pneumoniae ST4 CPS | [21]
phosphoric ester | →6)-α-D-Glcp-(1→2)-β-D-Glcp-(1→3)-β-D-GlcpNAc-(1→3)-[β-L-Rhap-(1→4)]-α-D-GlcpNAc-(1-PO3H→ | E. coli O152 O-antigen | [23]
glycerol-P- and choline-P- | →4)-[α-D-GalpN-(1→3)]-β-D-Galp2PCho-(1→3)-β-D-GalpNAc6OAc-(1→3)-Gro-1-P-(O→ | Morganella morganii O-antigen | [24]
3. Bacterial Glycans’ N-Modified Side-Chain Functional Groups
The structural diversity provided by various amino sugars in BSGs, such as CPS, glycoconjugate, LPS, and other exopolysaccharides, is further increased after encountering diverse N-acyl substituents (Table 2). Identification of numerous amino sugar structures indicates that amino function is mostly linked with various acyl group substituents and is rarely free-form. The 2-acetamido-4-amino-D-fucose was first observed in the O-antigen of S. sonnei as a key residue and always occurs with the free amino group at the C-4 position [26]. The amino sugars’ amino group is usually acetylated and formylated; however, acetimidoyl has also been identified in various bacterial glycans [27]. Moreover, amino acids are one of the most important substituents of amino sugars. They contribute to the PS charge and may promote bacterial PS antigens’ immunospecificity [28]. Several N-linked amino acids have been discovered, including D- and L-alanine in E. coli O161 [29] and Proteus penneri 25 [30], and D- and L-aspartic acids in Treponema medium ATCC 700293 [31] and Proteus spp. [32], glycine in S. dysenteriae D7 [15], L-serine in E. coli O114 [33], L-threonine in Pseudoalteromonas agarivorans KMM 232 [34], and L-allothreonine in V. cholerae O43 [35]. Moreover, some 5-oxoproline derivatives have been revealed as amino sugars’ N-acyl substituents [36,37,38]. Besides the N-linked amino acid derivatives of amino sugars, bacteria utilize different fatty acids to activate the amino sugars for relevant amido group formations (Table 2). Among them, the most abundant are (R)- and (S)-3-hydroxybutanoic acids that exist in various BSGs [39,40], while 3,5-dihydroxyhexanoic acid was found in Flavobacterium psychrophilum O-antigen [41], and 2,3-dihydroxypropionic acid was found in Pragia fontium 97U124 [42]. N-linked dicarboxylic acids have also been observed, such as malonic and L-malic in P. mirabilis [43] and Pseudoalteromonas rubra [44], respectively. In addition to the N-acyl derivatives, the less common N-linked substituent methyl group was also identified in bacterial glycans; for instance, the 2,4-diamino-L-fucose residue that exists in the terminal of Bordetella pertussis LPS contains a methyl moiety as a 4-amine functional group [45].
Table 2. The known N-modified functional groups in bacterial PS antigens.
N-Modified Groups | Representative Bacterial Polysaccharides | Glycan | References
free amino | →4)-α-L-AltpNAcA-(1→3)-β-D-FucpNAc4N-(1→ | Shigella sonnei phase I O-antigen | [26]
acetyl | →4)-α-D-GalpNR-(1→4)-β-D-GlcpNR3NRA-D-(1→3)-α-D-FucpNR-(1→3)-α-D-QuipNR-(1→ | Pseudomonas aeruginosa O1 O-antigen | [46]
formyl | →4)-α-Psep4OAc5NAc7NR-(2→4)-β-D-Xylp-(1→3)-α-D-FucpNAc-(1→ | Pseudomonas aeruginosa O8 O-antigen | [46]
acetimidoyl | →4)-β-D-ManpNAc3NRA-(1→4)-β-D-ManpNAc3NAcA-(1→3)-α-D-FucpNAc-(1→ | Pseudomonas aeruginosa O5 O-antigen | [46]
D-alanyl | →8)-α-Legp5NAc7NR-(2→4)-β-D-GlcpA-(1→3)-β-D-GlcpNAc-(1→ | E. coli O161 O-antigen | [29]
L-alanyl | →4)-β-D-GlcpA-(1→3)-β-D-GlcpNAc-(1→6)-[α-D-GlcpA-(1→4)]-β-D-GlcpNR-(1→ | Proteus penneri 25 O-antigen | [30]
D-aspartyl | →4)-β-D-GlcpNAc3NAcA-(1→4)-β-D-ManpNAc3NA(L-ornithine)-(1→3)-β-D-GlcpNAc-(1→3)-α-D-Fucp4NR(1→ | Treponema medium ATCC 700293 glycoconjugate | [31]
N-acetyl-glycyl | →3)-β-D-Quip4NR-(1→4)-α-D-GalpNAc3OAcAN-(1→4)-α-D-GalpNAcA-(1→3)-α-D-GlcpNAc-(1→ | S. dysenteriae D7 O-antigen | [15]
N-acetyl-D-aspartyl | →6)-α-D-GlcpNAc-(1→4)-α-D-GalpA-(1→3)-α-D-GlcpNAc-(1→3)-β-D-Quip4NR-(1→ | Providencia stuartii O33 O-antigen | [47]
N-acetyl-L-aspartyl | →3)-β-D-GlcpNAc-(1→3)-[β-D-Quip4NR-(1→4)]-β-D-Galp-(1→6)-β-D-GlcpNAc-(1→3)-β-D-Galp-(1→ | Providencia alcalifaciens O4 O-antigen | [32]
N-acetyl-L-seryl | →3)-α-D-GlcpNAc-(1→4)-β-D-Quip3NR-(1→3)-β-D-Ribf-(1→4)-β-D-Galp-(1→ | Escherichia coli O114 O-antigen | [33]
N-acetyl-L-threonyl | →3)-α-D-FucpNR-(1→3)-[β-D-ManpNAcA-(1→4)]-α-D-GalpNAc-(1→3)-α-L-Rhap-(1→ | Pseudoalteromonas agarivorans KMM 232 O-antigen | [34]
N-acetyl-L-allothreonyl | →3)-β-D-Quip4NR-(1→3)-α-D-GalpNAcA-(1→4)-α-D-GalpNAc-(1→3)-α-D-QuipNAc-(1→ | V. cholerae O43 O-antigen | [35]
(2S,4S)-N-[1-carboxyethyl]-alanyl | →4)-[β-D-Quip4NR-(1→6)]-α-D-GalpNAc-(1→6)-α-D-Glcp-(1→4)-β-D-GlcpA-(1→3)-β-D-GalpNAc-(1→ | P. alcalifaciens O35 O-antigen | [48]
N-[(S)-3-hydroxybutyryl]-D-alanyl | →3)-β-D-Quip4NR-(1→6)-α-D-GlcpNAc-(1→3)-α-L-QuipNAc-(1→3)-α-D-GlcpNAc3OAc-(1→ | E. coli O123 O-antigen | [49]
2,4-dihydroxy-3,3,4-trimethylpyroglutamoyl | →3)-α-D-GalpNAcAN-(1→4)-α-D-GalpNFoA-(1→3)-α-D-QuipNAc-(1→3)-β-D-ViopNR-(1→ | Vibrio anguillarum V-123 O-antigen | [36]
3-hydroxy-2,3-dimethyl-5-oxoprolyl | →2)-β-D-Quip3NR-(1→3)-α-L-Rhap2OAc-(1→3)-α-D-FucpNAc-(1→ | P. shigelloides O74 O-antigen | [37]
(R,R)-3-hydroxy-3-methyl-5-oxoprolyl | →3)-β-D-QuipNAc4NAc-(1→4)-[α-D-Fucp3NR-(1→3)]-β-D-ManpNAcA-(1→ | Vibrio cholerae O5 O-antigen | [38]
(R)-3-hydroxybutyryl | →4)-β-D-GlcpNAc3NRA-(1→4)-α-L-FucpNAm3OAc-(1→3)-α-D-QuipNAc-(1→ | P. shigelloides O51 O-antigen | [39]
(S)-3-hydroxybutyryl | →3)-α-L-PnepNAc4OAc-(1→4)-α-L-FucpNAc-(1→4)-α-L-FucpNAc-(1→4)-α-L-FucpNAc-(1→3)-β-D-QuipNAc4NR(1→ | Plesiomonas shigelloides O1 O-antigen | [40]
(3S,5S)-3,5-dihydroxyhexanoyl | →4)-α-L-FucpNAc-(1→3)-α-D-Quip2NAc4NR-(1→2)-α-L-Rhap-(1→ | Flavobacterium psychrophilum 259-93 O-antigen | [41]
D-glyceroyl | →3)-α-L-FucpNAc-(1→3)-α-L-FucpNAc-(1→3)-β-D-QuipNAc4NR-(→ | Pragia fontium 97U124 O-antigen | [42]
L-maloyl | →4)-α-L-GalpNAm3OAcA-(1→3)-α-Sugp-(1→4)-β-D-GlcpNAc3NRA-(1→ | Pseudoalteromonas rubra ATCC 29570T O-antigen | [44]
methyl | α-D-GlcpNAc-(1→4)-β-D-Man2NAc3AcA-(1→3)-β-L-Fucp2NAc4NR-(1→6)-[α-LD-Hepp-(1→4)]-α-D-GlcpNAc-(1→ | Bordetella pertussis LPS | [45]
4. Bacterial Glycans’ Carboxyl-Linked Side-Chain Functional Groups
Various BSGs contain glycuronic acid residues in which the carboxyl groups are linked to the amino group of amino compounds by forming amides [50] (Table 3). The difference in carboxyl-linked substitutes significantly improves the structural variety of the natural carbohydrates. In the simplest examples, these are primary amides (-CONH2), such as the 2-acetamido-2-deoxy-galacturonamide residue in P. aeruginosa O6 O-antigen [51]. The 2-aminopropane-1,3-diol occurs as an amide with the carboxyl group of uronic acids in S. boydii O8 O-antigen [50]. The other known amide linkages are formed with the amino groups of various amino acids, including L-alanine in H. influenzae type d [52], L-lysine in P. mirabilis O27 [53], L-serine in P. mirabilis O28 [54], L-threonine in R. sphaeroides ATCC 17023 [55], D-allothreonine in H. alvei 1206 [23], L-ornithine in T. medium ATCC 700293 [31], glycine in E. coli O91 [56], and L-glutamic acid in Klebsiella K82 CPS [57]. The Nε-[(S)-1-carboxyethyl]-L-lysine, a derivative of L-lysine, has been identified in P. rustigianii O14 O-antigen [58].
Table 3. The known carboxyl-linked functional groups in bacterial PS antigens.
Carboxyl-Modified Groups | Representative Bacterial Polysaccharides | Glycan | References
carboxamide | →3)-α-L-Rhap-(1→4)-α-D-GalpNAc3OAcAR-(1→4)-α-D-GalpN(formyl)A-(1→3)-α-D-QuipNAc-(1→ | P. aeruginosa O6 O-antigen | [51]
2-aminopropane-1,3-diol | →3)-β-D-GlcpNAc-(1→2)-β-D-Galp3OAc4OAcA6NR-(1→3)-β-D-GalpNAc-(1→4)-β-D-GlcpA-(1→ | Shigella boydii O8 O-antigen | [50]
L-alanine/L-serine/L-threonine (2:2:1) | →4)-β-D-GlcpNAc-(1→3)-β-D-ManpNAcA6NR-(1→ | Haemophilus influenzae type d CPS | [52]
R1 = L-lysine, R2 = L-alanine | →3)-[β-D-GlcpNAc-(1→4)]-β-D-GlcpA6NR1-(1→3)-α-D-GalpA6NR2-(1→3)-β-D-GlcpNAc-(1→ | Proteus mirabilis O27 O-antigen | [53]
R1 = L-serine, R2 = L-lysine | →4)-α-D-GalpA6NR2-(1→4)-α-D-Galp-(1→3)-α-D-Galp4OAcA6NR1-β-D-GlcpNAc-(1→ | Proteus mirabilis O28 O-antigen | [54]
L-threonine | →4)-α-D-GlcpA6NR-(1→4)-α-D-GlcpA-(1→4)-α-D-GlcpA-(1→ | Rhodopseudomonas sphaeroides ATCC 17023 LPS | [55]
D-allothreonine | →4)-α-D-GalpA6NR-(1→2)-α-L-Rhap-(1→2)-β-D-Ribf-(1→4)-β-D-Galp-(1→3)-β-D-GalpNAc-(1→ | Hafnia alvei 1206 O-antigen | [23]
L-ornithine | →4)-β-D-GlcpNAc3NAcA-(1→4)-β-D-ManpNAc3NA6NR-(1→3)-β-D-GlcpNAc-(1→3)-α-D-Fucp4NAsp(1→ | Treponema medium ATCC 700293 glycoconjugate | [31]
glycine | →4)-α-D-Quip3NAcyl-(1→4)-β-D-Galp-(1→4)-β-D-GlcpNAc-(1→4)-β-D-GlcpA6NR-(1→3)-β-D-GlcpNAc-(1→ | E. coli O91 O-antigen | [56]
L-glutamic acid | →3)-β-D-Glcp-(1→3)-[β-D-GlcpA6NR-(1→4)]-β-D-Galp2OAc-(1→3)-α-D-Galp-(1→ | Klebsiella K82 CPS | [57]
Nε-[(S)-1-carboxyethyl]-L-lysine | →4)-α-D-GalpNAc-(1→3)-α-D-GlcpNAc-(1→3)-α-D-GalpA6NR-(1→ | Providencia rustigianii O14 O-antigen | [58]