1. Cellulose
Cellulose is among the most abundant biopolymers on Earth, and it roughly comprises around 30% of all vascular plants
[1]. Cellulose provides mechanical stability and durability to plants. In one or another form, cellulose has been used by humanity for thousands of years. In the recent decades, it has attracted a renewed interest in the scientific community due to its widespread abundance and excellent mechanical properties, which could be further altered by chemical modifications
[2][3].
Cellulose is a polysaccharide made of α-D-Glucopyranose units (AGU) linked by
β(1→4) glucosidic bonds. The polymer chains consist of hundreds to thousands of glucose units depending on the cellulose source—plants or bacteria. The individual cellulose chains are arranged in a precise crystallographic pattern denoted as cellulose I, where two naturally occurring allomorphs exist—cellulose Iα and Iβ
[4]. Up to 36 cellulose chains can form the so-called primitive fibril (cellulose nanofibril, or CNF) in plants, and several such fibrils can form micro- and macrofibrils. Finally, bundles of many macrofibrils form cellulose fibers
[5][6][7]. However, during the industrial processing, cellulose fibers can be broken down to the primitive fibrils and moreover only the crystalline part of the fibrils can be extracted after acid treatment
[8]. These crystals are often referred to as cellulose nanocrystals (CNCs), and have dimensions most often of hundreds of nanometers in length and around 3–5 nm in width
[9]. CNCs have high aspect ratio and excellent mechanical properties. Therefore, CNCs were utilized for many applications in various fields ranging from medicine
[10] to environmental science
[11][12] and energy storage
[13], to name a few. However, one particular characteristic of cellulose can be even as important as its excellent mechanical properties and this is that cellulose is a biobased, and thus biodegradable material. This makes cellulose a promising sustainable material with many possible applications.
A substantial part of the scientific knowledge about cellulose was obtained empirically. During past decades, theoretical modelling, molecular simulations and ab initio calculations became the valuable and often essential tools providing a fundamental insight into materials’ properties. Cellulose has also been studied through the years using theoretical and simulation approaches. However, the modelling of experimentally relevant systems can be hindered by the time and spatial scales which are too large and thus, can be hardly accessible by standard all atomistic molecular simulations. Therefore, many researchers have developed and used coarse-grained (CG) simulations to model cellulose systems on large spatio-temporal scales. In the CG simulations, several atoms are combined in so-called beads, which makes it possible to model larger systems on longer time scales. Many different CG models of polysaccharides were developed during the last few decades, and they will be presented in this review. There have been an increased number of newly developed CG models in the recent years, which to some extent correlates with the renewed interest to cellulose from the wider scientific community.
2. Coarse-Grained Modelling
Coarse-grained (CG) models describe the atomistic structure (fine-grained) of molecules and materials with a reduced representation. The atoms representing a given functional group or a molecule are grouped (mapped) into a single particle, often referred to as the “interaction site”, “bead”, “particle”, or “superatom”. Thus, the number of particles required to describe a given molecular structure are smaller compared to all-atom models. In this way, the simulations can be significantly accelerated and therefore it is possible to reach longer simulation times and bigger spatial domains, compared to the corresponding all-atom simulations. It is noteworthy that a particular choice of the beads in the CG mapping is not unique, and is often guided by the experience and chemical intuition. Therefore, for a particular molecule, many different mapping schemes may exist. The mapping of the atoms into one bead is typically carried out with the help of a mapping operator (
MI) that determines the coordinates of the
I-th CG particle (
RI) from a linear combination of the Cartesian coordinates (
ri) of the atoms belonging to a particular group in the atomistic representation.
The beads’ positions determined in this way are usually chosen as the center of mass or center of geometry (geometrical center) of the particular group of atoms from the atomistic representation. With the advancement of computational techniques and machine learning, new ways to construct and map the atoms into the CG bead have been developed, which might help to remove the human bias and potentially develop better CG models
[14][15][16].
After mapping the atomistic structure into a desired CG representation, one should assign the potential functions determining the bonded and non-bonded interactions of the model, and consequently validate these functions against atomistic simulation or experimental results. Using atomistic simulations to develop the CG potentials is known as a bottom–up approach, while using experimental data to fit the potential is referred to as a top–bottom approach. There are several techniques that are often used to fit a CG potential based on the all-atom simulations, including Boltzmann inversion, inverse Monte Carlo
[17][18], relative entropy
[19], force matching (multiscale coarse-graining)
[20][21], and the realistic extension algorithm via covariance hessian (REACH)
[22]. The purpose of this review, however, is not to provide detailed descriptions of these methods, and we therefore can refer an interested reader to other review papers describing these methods in more detail
[23][24][25].
Undoubtedly, the most widely used CG force field nowadays is the Martini force field
[26]. This force field was parameterized against experimental partition, vaporization and hydration free energies and therefore is considered as a top–down approach. Martini was originally developed for biomolecular systems, such as lipid membranes. However, since then it has been used to model different material systems and many CG molecule models have been built with Martini, including cellulose and other polysaccharides
[27][28][29][30].
3. Coarse-Grained Models of Carbohydrates
Different CG models for mono- and polysaccharides have been developed through the years
[27][28][29][30][31][32][33][34][35][36][37][38][39][40][41][42][43][44][45][46][47][48][49][50][51][52][53][54][55][56][57][58]. Often, these CG models possess different mapping of the anhydroglucose units (AGUs), thus providing a different level of detail of the models. presents the most commonly used mapping schemes (found in more than one publication) for polysaccharides. In many studies, each AGU has been represented by three CG beads
[27][28][29][31][35][36][38][39][42][43][57]. Two widely used mapping schemes with three CG beads are illustrated in a,b.
Figure 1. Different coarse-grained mapping schemes for polysaccharides and cellulose with mapping of (a) 3-to-1 (coarse-grained (CG) beads per 1 glucose unit), (b) Martini force field 3-to-1 (CG beads per 1 glucose unit), (c) 1 CG bead per 1 glucose unit and (d) 1 CG bead for 2 glucose units.
Molinero and Goddard
[31] pioneered the CG modeling of polysaccharides with their M3B model, which uses a mapping scheme presented in a. Originally, the M3B model was derived for
α(1→4) D-glucan oligosaccharides and its extension to other linkage types, such as
α(1→6) and
β(1→4) glucans, was discussed as well. This model used Boltzmann inversion to parameterize the bonded interactions and a Monte Carlo simulated annealing to obtain the non-bonded parameters, which were modeled by the Morse potential. The same mapping as in the M3B model was subsequently adopted in several other studies
[35][36][38][39][57]. Although in these studies, the mapping scheme is similar, the potential functions used to represent the interactions (bonded and non-bonded) are different. Hence, the models are effectively different from each other. Liu et al.
[35] used a multiscale coarse-graining to derive parameters for
α-D-glucopyranose in an aqueous solution. A CG model of cellulose I
β microfibril hydrophobic surface was developed using the same three bead mapping scheme by Bu et al.
[36]. This model reproduced the crystal structure of the I
β fibril well and modeled its interactions with the
Trichoderma reesei cellobiohydrolase I enzyme, which is used in cellulose enzymatic degradation. Two more models utilizing the CG scheme depicted on a were developed for
β-D-glucose, cellobiose and cellotetraose
[38][39]. Markutsya et al. in the same paper
[39] also presented a six particles model, showing that it provides a higher level of structural accuracy than the three beads model for glucose. This is not surprising because of the usage of twice the number of beads for glucose, which also increases the computational cost of the CG model. However, an interesting observation is that placing the CG beads either at the center of mass (COM) or at the center of geometry (COG) of the respective groups of atoms leads to a difference in the radial distribution functions (RDFs). In the three beads model, the usage of COM provides better results, while in the six beads model the COG mapping yields more accurate RDFs. In a follow-up study, an evaluation of different CG mapping schemes in a cellulose microfiber was performed for mapping of one, two, three or four CG beads per glucose unit
[43]. According to this study the one- and two-bead models are suitable for simulating long time-scales properties of microfibers, such as transitions from crystal to amorphous structures. The more detailed models (three and four beads) provide a better level of detail and are able to reproduce the conformational changes in the glucose residues.
The Martini models of mono- and oligosaccharides adopt the mapping schemes presented in a,b, respectively
[27]. It was shown that with these models, a broad palette of polysaccharides can be simulated and their structural and dynamical properties can be well described, although hydrogen bonds are in principle neglected. Wohlert and Berglund
[28] adopted the Martini model with three bead mapping (b) to construct a CG model of a cellulose I
β crystal. The crystalline microfiber model for cellulose was further utilized and extended showing the transformation of cellulose I
β to III
I allomorph
[29]. A recent noteworthy application of the Martini force field is the extension to
N-glycans by Shivgan et al.
[30]. The authors explored various branching patterns in
N-glycans and they used the CG mapping for the glucose monomer unit presented in b.
Although the three bead models described above were successfully used to model cellulose nanocrystals and fibers, they still can be computationally expensive for long fibers typically studied experimentally and found in nature. Therefore, many simulations utilize even more coarse-grained representation for cellulose fibers, such as mapping one or two glucose units (c,d) into a single CG bead
[33][37][40][44][46][47][48][50][51][52][53][59]. The positions of the CG beads are located either at the center of the glucose unit or at the O4 atom, i.e., at the center of the glucosidic bond. With careful parameterization against atomistic molecular dynamics (MD) (mostly utilizing the Boltzmann inversion method), all of these single bead models reproduce the crystallographic structures of the I
α or I
β cellulose allomorphs very well. Recently, Shishehbor and Zavattieri
[53] developed a coarse-grained model that uses one CG bead for two glucose units (d). In their model, the bonded interactions are represented by Morse potential, which enables bond breaking and allows one to describe cellulose fibril dissociation under mechanical stress. The model is thoroughly validated against various mechanical and interfacial properties.
Up to now, we described the most commonly used mapping schemes for the CG modeling of polysaccharides, with a main focus on cellulose models. However, in the recent years, several standalone models with different mapping philosophies than the ones described above, have been developed
[41][54][55][56][60][61]. Wu et al.
[56] developed a CG model similar to the one that they used earlier for
α-1,3-glucans,
[62] to the directional hydrogen bonding of cellulose chains. This model, depicted in a, has one backbone bead (BB) representing the pyranose ring, one “satellite” bead (SB) for the hydroxymethyl group and a linker bead used as a connection point for some of the bonded interactions in model. The hydrogen bonds are accounted for by the so-called “hydrogen bonding beads” (DCX).
Figure 2. Mapping schemes for cellulose with different degree of mapping where one glucose unit is mapped with (
a,
b) several beads or (
c–
e) whole section of the cellulose fibril is mapped with 1 CG bead. Figure adapted with permissions from references (
a)
[56], 2020, American Chemical Society (ACS) (
b)
[41], 2012, American Chemical Society (ACS) (
c)
[54] 2020 Springer, (
d)
[60] 2020 Elsevier and (
e)
[55] 2020, American Chemical Society (ACS). CNC: cellulose nanocrystal, AA: all-atom.
Bellesia et al.
[41] developed a CG model of cellulose to study the conversion of cellulose I
β to III
I. The model, shown on b, is constructed from five beads and successfully predicts the thermodynamical properties and structures of the concerned cellulose allomorphs in the implicit solvent using Langevin dynamics.
Recently, several CG models for cellulose nanofibers were developed where the whole part of the nanofiber including many glucose units are mapped into one CG bead
[54][55][60]. The supra coarse-grained (sCG) model
[54] (c) uses seven beads to map the entire cross-section of the 36 chain-model of the cellulose nanofiber. Each surface bead represents 20 glucose units (five units across nanofiber and four units along the axial direction) and two types of surface beads were designed to reproduce surfaces with different water affinity (hydrophilic and hydrophobic). The central beads describe 24 glucose units and act as the backbone of the nanofiber, being mainly responsible for the mechanical properties (elastic modulus) of the structure. Each central bead is connected to the neighboring central beads and to six surface beads by the harmonic bonds. There are no CG bonds between the surface beads. The bonded interactions of the sCG model were parameterized with the force matching procedure and the non-bonded interactions against the atomistic potential of mean forces. The model was used together with Langevin dynamics; thus, no explicit water molecules are needed, which allows one to study a large system on the scale of hundreds of nanometers in a microsecond range.
The “bead-spring” CNC mesoscopic model
[60][61] developed by Qin and co-workers
[61] and later utilized by Li et al.
[60] (d) maps the whole cross-section of 36 chains with three repeated units into one CG bead. This model was used to investigate the mechanical properties of CNC nanopaper and CNC bulk materials forming a porous network structure.
Another mesoscopic model for CNC was developed by Rolland et al.
[55]. This model is an extension of the Kern–Frenkel patchy particle model
[63] where one CG bead represents a cross-section of 36 chain-model of a CNC with eight glucose units along the axial direction (e). Thus, one CG bead contains around 6000 atoms. The bead particles have an aspherical shape, and each bead is decorated with anisotropic patches, which are different depending on the surface type, with 100, 110, 1–10 surfaces being considered. The patches describe the non-bonded interactions between the CNC particles. The bonded and non-bonded interactions in the model were parameterized against atomistic MD simulations, including the bond and angle distributions and potentials of mean forces, respectively. It should be noted that the patchy particle models have great potential for future use, since one can create different types of patches and interactions, allowing for large flexibility and fine tuning of the interactions.
A scalable CG scheme was recently used to model CNC where the mapping and interaction potentials are scaled on three different levels (Level 1–3) and are based on atomistic simulations
[59]. The different levels represent 1 glucose unit (Level–1), six Level-1 beads are mapped into one Level-2 bead and seven Level-2 chains form one Level-3 bead.
Shishehbor et al.
[64] presented a continuum-based approach to model cellulose nanocrystals, and although this model is not strictly a particle-based coarse-grained one, it is also worth to be mentioned in this review. The individual chains in the CNC are modeled as elastic beams passing through the center of mass of the CNC. The elastic characteristics of the beams are obtained from atomistic simulations of cellulose chains, and each beam element represents one or more glucose monomers.