Polysaccharide Sequence Determination

Please note this is a comparison between Version 1 by Huai N. Cheng and Version 2 by Jason Zhu.

NMR analysis combined with statistical modeling offers a useful approach to investigate the microstructures of biobased polymers. In particular, this approach may be used to study the microstructures of polysaccharides. If the polysaccharides are composed of two or three monosaccharide units, and the NMR spectral features are sensitive to the different sequence placements of these units, then a detailed analysis of a polysaccharide (or its fractions) can be made by NMR with the help of statistical modeling.

alginate
biobased polymers
chitosan
NMR
pectin
polysaccharides
carbohydrates

1. Introduction

In recent years, polysaccharides have emerged as possible alternatives to synthetic plastics due to their abundant natural sources, renewability, and biodegradability ^[1][2]. Through appropriate modifications and/or processing techniques ^[3][4][5][6], polysaccharides can be transformed into a range of functional materials with properties suitable for packaging, biomedical applications, and various other industries.

As expected, the structure of polysaccharides can be studied with NMR ^[7][8][9]. The information available includes the composition, the type and degree of substitution, the presence of minor components and impurities, and sometimes even the number-average molecular weight. Moreover, NMR can be used in combination with other analytical techniques, such as methylation, esterification, fractionation, mass spectrometry, and chromatographic methods, to analyze complex polysaccharides or mixtures ^[10][11].

As for the use of NMR for the direct comonomer sequence determination of copolysaccharides, its feasibility depends on the polymers involved ^[7][8]. In some copolysaccharides, one or more NMR peaks of a saccharide residue are split because the chemical shifts are sensitive to the neighboring saccharide units. In this case, the split peaks need to be assigned to the appropriate sequences (e.g., diads, triads, or tetrads), and the percent distribution of the sequences can be determined from the areas of the resolvable peaks. However, if the NMR spectrum of a copolysaccharide does not show resolvable peaks for different sequences, then NMR cannot easily be used for sequence determination for that copolymer. Three examples of polysaccharides where the NMR sequence peaks are resolvable are alginate, pectin, and chitosan. The NMR/statistical modeling of these polymers is described below.
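The step from peak areas to a percent sequence distribution is a simple normalization. The following minimal sketch assumes that each resolved peak has already been assigned to exactly one sequence and that all peaks arise from the same nucleus in the residue; the function name and the peak areas are hypothetical and serve only as an illustration.

```python
def sequence_percentages(peak_areas):
    """Normalize integrated peak areas to a percent distribution."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {seq: 100.0 * area / total for seq, area in peak_areas.items()}


# Hypothetical integrals (arbitrary units) of three resolved diad peaks.
areas = {"AA": 4.2, "AB": 3.1, "BB": 2.7}
percentages = sequence_percentages(areas)
for seq, pct in percentages.items():
    print(f"{seq}: {pct:.1f}%")
```

In practice, overlapping peaks would first need deconvolution, which this sketch does not attempt.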

2. Alginate Mannuronic/Guluronic Sequence Analysis

Alginate is a naturally occurring polysaccharide found in various types of brown seaweed, including kelp and other marine brown algae ^[3][12][13]. It is used as a thickener, gelling agent, controlled release agent, coating, and stabilizing agent in the food industry, pharmaceuticals, and various other applications. Structurally, alginate is a linear copolymer made up of two types of monosaccharides: β-D-mannuronic acid (M) and α-L-guluronic acid (G). The sequence distribution of M and G residues along the polymer chain, as well as the overall composition (M-to-G ratio), determines many of the physical and chemical properties of alginate.

For alginate, ¹H NMR can be used to compute overall composition (% M and % G), diad sequences, and G-centered triads, and ¹³C NMR can provide both M- and G-centered triad intensities. Previously, the NMR data of the whole polymer and fractions of alginate extracted from Laminaria digitata, as published in the literature ^[14][15], were analyzed ^[16], and four structural components were found: two mostly homopolymer blocks, one somewhat alternating copolymer block, and one or more random copolymer blocks.
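Composition, diads, and triads are linked by simple bookkeeping relationships, so a full triad set also yields the other two. The sketch below uses the general relationships for a long binary copolymer chain, not a procedure taken from the cited works. The function name is hypothetical, and the input values are the observed triad percentages quoted in this entry for a commercial alginate sample with a high M/G ratio.

```python
def composition_and_diads(triads):
    """Derive composition and diad fractions from triad intensities.

    `triads` maps the six distinguishable triads to intensities, where
    "MMG" stands for MMG + GMM and "GGM" stands for GGM + MGG.
    """
    total = sum(triads.values())
    t = {name: value / total for name, value in triads.items()}
    f_m = t["MMM"] + t["MMG"] + t["GMG"]  # sum of M-centered triads
    f_g = t["MGM"] + t["GGM"] + t["GGG"]  # sum of G-centered triads
    diads = {
        "MM": t["MMM"] + t["MMG"] / 2,
        "GG": t["GGG"] + t["GGM"] / 2,
        # "MG" is the combined MG + GM diad fraction.
        "MG": t["MMG"] / 2 + t["GMG"] + t["MGM"] + t["GGM"] / 2,
    }
    return f_m, f_g, diads


observed = {"MMM": 39, "MMG": 17, "GMG": 8, "MGM": 10, "GGM": 14, "GGG": 12}
f_m, f_g, diads = composition_and_diads(observed)
print(f"F_M = {f_m:.2f}, F_G = {f_g:.2f}")
print({name: round(value, 3) for name, value in diads.items()})
```

A useful consistency check on experimental data is that MMG/2 + GMG should roughly equal MGM + GGM/2, since every M-to-G junction in a long chain is matched by a G-to-M junction.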

An alternative approach is to use the hyphenated size exclusion chromatography (SEC)-NMR method ^[17][18]. The SEC instrument is connected to the NMR probe, and NMR spectra are obtained by stopping the flow during NMR data acquisition. Three commercial alginate samples were evaluated in this way. The NMR data were satisfactorily treated with two-component first-order Markov statistical models. The results were consistent with the earlier finding ^[16] that these alginate samples are compositionally heterogeneous, consisting of mixtures of components with different microstructures.
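To make the idea of a two-component first-order Markov model concrete, the sketch below computes triad fractions for a single first-order Markov chain and then weight-averages two such components. The parameter names (`p_mg`, `p_gm`), the function names, and all numerical values are assumptions chosen for illustration; they are not the parameters reported in the cited SEC-NMR work.

```python
def markov1_triads(p_mg, p_gm):
    """Triad fractions for a first-order Markov M/G copolymer.

    p_mg: probability that G follows M; p_gm: probability that M follows G.
    "MMG" stands for MMG + GMM and "GGM" for GGM + MGG.
    """
    if not (0 < p_mg <= 1 and 0 < p_gm <= 1):
        raise ValueError("transition probabilities must lie in (0, 1]")
    f_m = p_gm / (p_mg + p_gm)  # stationary fraction of M units
    f_g = 1.0 - f_m
    p_mm, p_gg = 1.0 - p_mg, 1.0 - p_gm
    return {
        "MMM": f_m * p_mm * p_mm,
        "MMG": 2 * f_m * p_mm * p_mg,
        "GMG": f_g * p_gm * p_mg,
        "MGM": f_m * p_mg * p_gm,
        "GGM": 2 * f_g * p_gg * p_gm,
        "GGG": f_g * p_gg * p_gg,
    }


def mix_components(components):
    """Weight-average triad fractions over (weight, triads) pairs."""
    total_weight = sum(weight for weight, _ in components)
    keys = components[0][1].keys()
    return {
        key: sum(weight * triads[key] for weight, triads in components) / total_weight
        for key in keys
    }


# Hypothetical parameters: an M-rich blocky component and a more alternating one.
mixed = mix_components([
    (0.6, markov1_triads(p_mg=0.15, p_gm=0.40)),
    (0.4, markov1_triads(p_mg=0.70, p_gm=0.30)),
])
for name, value in mixed.items():
    print(f"{name}: {100 * value:.1f}%")
```

When `p_mg + p_gm = 1`, the chain has no memory of the preceding unit and the expressions reduce to the Bernoullian case.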

The NMR triad data for two commercial alginate samples with different M/G ratios were reported earlier by Kawarada et al. ^[19]. These data have since been analyzed with both discrete and continuous models ^[20]. As an example, the observed triad data for the sample with a high M/G ratio are shown in column 2 of Table 1, together with the model results. The one-component Bernoullian (B) model clearly did not fit the data well (column 3, with a mean deviation of 5.4). In contrast, both the discrete two-component B/B model (column 4) and the continuous perturbed B model (column 5) gave a mean deviation of 0.6, indicating a much better fit with the observed data. Thus, this alginate sample was heterogeneous in M/G sequence distribution, just like the earlier alginate samples from L. digitata.
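The comparison between the one-component B model and the two-component B/B model can be sketched in a few lines. The code below assumes that the mean deviation is the mean absolute difference between observed and calculated triad percentages, which is an assumption about the cited analysis. The function names are hypothetical; the observed triads and the reaction probabilities are the values quoted in this entry for the high-M/G sample.

```python
def bernoullian_triads(p_m):
    """Triad percentages for a Bernoullian M/G copolymer with P_M = p_m."""
    p_g = 1.0 - p_m
    return {
        "MMM": 100 * p_m ** 3,
        "MMG": 100 * 2 * p_m ** 2 * p_g,  # MMG + GMM
        "GMG": 100 * p_m * p_g ** 2,
        "MGM": 100 * p_g * p_m ** 2,
        "GGM": 100 * 2 * p_g ** 2 * p_m,  # GGM + MGG
        "GGG": 100 * p_g ** 3,
    }


def two_component_triads(w1, p1, w2, p2):
    """Weighted sum of two Bernoullian components (the B/B model)."""
    first, second = bernoullian_triads(p1), bernoullian_triads(p2)
    return {name: w1 * first[name] + w2 * second[name] for name in first}


def mean_deviation(observed, calculated):
    """Mean absolute difference between observed and calculated triads."""
    return sum(abs(observed[k] - calculated[k]) for k in observed) / len(observed)


observed = {"MMM": 39, "MMG": 17, "GMG": 8, "MGM": 10, "GGM": 14, "GGG": 12}
one_component = bernoullian_triads(0.731)
two_component = two_component_triads(0.592, 0.858, 0.408, 0.338)

dev_one = mean_deviation(observed, one_component)
dev_two = mean_deviation(observed, two_component)
print(f"B model mean deviation:   {dev_one:.2f}")
print(f"B/B model mean deviation: {dev_two:.2f}")
```

Under these assumptions, the two deviations should come out close to the 5.4 and 0.6 quoted in the text. The perturbed B model is not sketched here because its functional form is not given in this entry.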

Table 1. The triad sequence intensities (in mole %) of a commercial alginate sample; observed triad data were taken from ref. ^[19].

NMR Triad	Obsd. %	Discrete Models		Continuous Model
NMR Triad	Obsd. %	Calc % (for B)	Calc % (for B/B)	Calc % (for Perturbed B)
MMM	39	39	39	39
MMG	17	29	19	18
GMG	8	5	7	7
MGM	10	14	9	9
GGM	14	11	14	15
GGG	12	2	12	12
Mean dev.		5.4	0.6	0.6
Reaction probabilities		P_M = 0.731	Component 1: w₁ = 0.592 P_M = 0.858 Component 2: w₂ = 0.408 P_M = 0.338	P_M = 0.648 σ = 0.253 τ = −0.004

3. Pectin Galacturonic Acid/Ester Sequence Analysis

Pectin is a well-known commercial product, typically produced from citrus peels and used as a gelling agent (especially in jams and jellies, dessert fillings, and sweets), as a food stabilizer in fruit juices and milk drinks, and as a source of dietary fiber ^[3][21][22]. It is commonly found in the cell walls of terrestrial plants. It has a complex structure, but the major functional unit is galacturonic acid. This acid can exist either as a carboxylic acid or as a methyl ester, and the plant can carry out this conversion enzymatically as needed. The gelling properties are related to the ratio of galacturonic acid to its ester. High-methoxy (HM) pectin is usually mixed with sucrose to form a gel, whereas low-methoxy (LM) pectin is mixed with a calcium salt for gel formation. Thus, the amount and the placement of the acid and the ester along the polymer chain are important information for product development and formulation.

NMR is a good technique to measure the amount of acid/ester present in pectin, as well as the heterogeneity of their placements along the polymer chains ^[7]. It is known that selected peaks in ¹H and ¹³C spectra are split by the acid/ester sequence effects so that the triad sequence distributions can be obtained. Analyses were previously reported for selected pectin samples using statistical modeling ^[23][24], and the data were shown to fit well to both discrete and continuous models. For illustration, the reported triad distribution data for an HM pectin sample ^[24] extracted from lemon peel are shown in column 2 of Table 2, where G and E denote galacturonic acid and the ester, respectively. The data were re-analyzed with three types of Bernoullian (B) models: a simple B model, a two-component (B/B) model, and a continuous perturbed B model. From the analysis shown in Table 2, the simple B model (column 3) gave a mean deviation of 1.2, the two-component B/B model (column 4) 0.6, and the perturbed B model (column 5) 0.4. Thus, the NMR triad sequence data suggest that the particular HM pectin sample was compositionally heterogeneous, and the NMR data could be fitted with either the two-component B/B model or the perturbed B model.

Table 2. The triad sequence analysis of an HM pectin sample; observed triad data were taken from ref. ^[24].

NMR Triad	Obsd. %	Discrete Models		Continuous Model
NMR Triad	Obsd. %	Calc % (for B)	Calc % (for B/B)	Calc % (for Perturbed B)
EEE	32.4	32.4	32.5	32.4
EEG	28.0	29.5	28.0	27.5
GEG	6.4	6.7	7.1	6.7
EGE	13.0	14.8	14.0	13.8
GGE	14.0	13.5	14.0	13.4
GGG	6.2	3.1	4.4	6.2
Mean dev.		1.2	0.6	0.4
Reaction probabilities		P_E = 0.687	Component 1: w₁ = 0.793 P_E = 0.724 Component 2: w₂ = 0.207 P_E = 0.489	P_E = 0.675 σ = 0.118 τ = −0.008

A second example can be given using an LM pectin sample. The reported triad distribution data for such a sample (also from lemon peel) ^[24] are shown in column 2 of Table 3, and the analysis results of these data using the same three models are given in columns 3–5 of Table 3. In this case, the mean deviations for the three models (B, B/B, and perturbed B) were 0.8, 0.1, and 0.1, respectively, also suggesting that the NMR triad sequence data conformed better to the two-component B/B model or the continuous perturbed B model.

Table 3. The triad sequence analysis of an LM pectin sample; observed triad data were taken from ref. ^[24].

NMR Triad	Obsd. %	Discrete Models		Continuous Model
NMR Triad	Obsd. %	Calc % (for B)	Calc % (for B/B)	Calc % (for Perturbed B)
EEE	2.1	1.6	2.4	2.5
EEG	10.6	9.6	10.6	10.6
GEG	13.4	14.1	13.3	13.3
EGE	5.5	4.8	5.3	5.3
GGE	26.7	28.2	26.7	26.6
GGG	41.7	41.7	41.7	41.7
Mean dev.		0.8	0.1	0.1
Reaction probabilities		P_E = 0.253	Component 1: w₁ = 0.337 P_E = 0.586 Component 2: w₂ = 0.159 P_E = 0.414	P_E = 0.264 σ = 0.0928 τ = 0.0003

The above analyses suggest that both the HM and the LM pectin samples extracted from lemon peels are compositionally heterogeneous. In a separate work ^[17], pectin fractions from the same source material were analyzed by NMR, which provides another way to confirm the compositional heterogeneity. Thus, NMR and statistical modeling can be helpful for the analysis of citrus pectin samples.

4. Sequence Analysis of Partially Deacetylated Chitosan

Chitin is abundant in the exoskeletons of crustaceans and insects and in the cell walls of fungi. Structurally, it is a homopolymer of 2-acetamido-2-deoxy-β-D-glucopyranose (GlcNAc). Chitosan is obtained by partially deacetylating chitin and may be regarded as a copolymer of GlcNAc units and 2-amino-2-deoxy-β-D-glucopyranose (GlcN) units ^[3][25][26]. Chitosan is a versatile polymer whose properties, including biocompatibility, biodegradability, and non-toxicity, make it valuable in a wide range of applications. It also has notable antimicrobial properties, making it a possible additive for use in wound healing, tissue engineering, and other medical products. Moreover, it is employed in the food and pharmaceutical industries for its ability to function as a natural preservative and a controlled delivery vehicle.

For partially deacetylated chitosan, the NMR spectra can detect chemical shift differences for the different sequences of acetylated and deacetylated units. Previously, the ¹H and ¹³C NMR spectra of chitosan were published by Varum et al., who also reported the triad sequence intensities of selected samples ^[27][28]. The triad sequence intensities for one sample are shown in column 2 of Table 4, where A and D denote the acetylated (GlcNAc) and deacetylated (GlcN) units, respectively. In an earlier analysis ^[29], the NMR data of chitosan were shown to be compatible with a compositionally heterogeneous polymer.

Table 4. The triad sequence analysis of partially deacetylated chitosan; observed triad data were taken from refs. ^[27][28].

NMR Triad	Obsd. %	Discrete Models		Continuous Model
NMR Triad	Obsd. %	Calc % (for B)	Calc % (for B/B)	Calc % (for Perturbed B)
AAA	15	15	16	15
AAD	28	27	25	28
DAD	10	12	10	9
ADA	14	13	13	14
DDA	16	23	20	17
DDD	17	10	17	17
Mean dev.		3.0	1.5	0.4
Reaction probabilities		P_A = 0.531	Component 1: w₁ = 0.906 P_A = 0.357 Component 2: w₂ = 0.094 P_A = 0.009	P_A = 0.547 σ = 0.083 τ = −0.031

From the analysis of the results in Table 4, the simple B model clearly gave a rather large mean deviation of 3.0 (column 3). In the two-component B/B model (column 4), a second minor component (9.4%) consisting of mostly deacetylated units (P_A = 0.009) was incorporated, and the mean deviation was reduced to 1.5. The best result was found for the perturbed B model (column 5), where the mean deviation was cut down to 0.4. Thus, this chitosan sample was heterogeneous in composition, as shown by the analysis of NMR data with statistical modeling.
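The poor fit of the simple B model can be checked directly from the observed triads. The sketch below assumes that the single Bernoullian probability P_A equals the overall acetylated fraction obtained by summing the A-centered triads, and that the mean deviation is a mean absolute difference; both are assumptions rather than the documented procedure of the cited analysis. The function name is hypothetical, and the observed values are the chitosan triad percentages quoted in this entry.

```python
def bernoullian_triads(p_a):
    """Triad percentages for a Bernoullian A/D copolymer with P_A = p_a."""
    p_d = 1.0 - p_a
    return {
        "AAA": 100 * p_a ** 3,
        "AAD": 100 * 2 * p_a ** 2 * p_d,  # AAD + DAA
        "DAD": 100 * p_a * p_d ** 2,
        "ADA": 100 * p_d * p_a ** 2,
        "DDA": 100 * 2 * p_d ** 2 * p_a,  # DDA + ADD
        "DDD": 100 * p_d ** 3,
    }


observed = {"AAA": 15, "AAD": 28, "DAD": 10, "ADA": 14, "DDA": 16, "DDD": 17}

# Assumption: P_A is taken as the sum of the A-centered triads.
p_a = (observed["AAA"] + observed["AAD"] + observed["DAD"]) / sum(observed.values())
calculated = bernoullian_triads(p_a)
residuals = {name: observed[name] - calculated[name] for name in observed}
mean_dev = sum(abs(r) for r in residuals.values()) / len(residuals)

print(f"P_A = {p_a:.2f}, mean deviation = {mean_dev:.2f}")
for name, residual in residuals.items():
    print(f"{name}: observed - calculated = {residual:+.1f}")
```

The largest residuals are an excess of DDD and a deficit of DDA, which is consistent with the minor, mostly deacetylated component introduced in the B/B model.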

5. Comments

It may be noted that only some polysaccharides are amenable to study by the NMR/statistical methods described herein. First, the polysaccharide in question should have a structure that contains only two or three repeating units, thereby generating a manageable number of structural sequences. (Thus, a polysaccharide with many different types of monosaccharide units may be too complex to be studied; sometimes, fractionation or other separation procedures may be needed prior to NMR analysis.) Second, the different structural sequences should give detectable differences in the NMR spectra, typically through chemical shift differences. If the shift difference in a polymer spectrum is too small to be resolved by NMR, then NMR microstructural studies of this polymer will be difficult. However, as NMR instrumentation continues to improve in sensitivity, resolution, and magnetic field strength ^[30][31], it is possible that small chemical shift differences may be resolvable in the future, thereby rendering more polymer microstructures accessible to NMR sequence analysis.