DNA carries the genetic information required for the synthesis of RNA and proteins and plays an important role in many processes of biological development. Understanding the three-dimensional (3D) structures and dynamics of DNA is crucial for understanding their biological functions and guiding the development of novel materials. Molecular dynamics (MD) simulations can generally reproduce the behavior of DNAs in a computer, providing detailed structural and dynamical insights that enhancing our comprehension of relevant experimental data. In recent years, MD simulations using classical force fields such as AMBER and CHARMM have provided highly detailed and flexible descriptions of DNA dynamics, including structural transformations, stability of non-canonical conformations, salt ion cohesion effects, twist-stretch coupling of stress, flexibility under methylation modifications, and interactions with other macromolecules. It is always fascinating to obtain microscopic insights into DNA dynamics through MD simulations. However, the innumerable degrees of freedom, interconnected in complex ways, can make it practically impossible to detect DNA dynamics on biologically relevant time scales and length scales using currently available computer hardware
1. Introduction
1. Introducton
M
olecular dynamics (MD
) simulations of DNA systems are typically performed by calculating the force on each atom as a function of their positions using all-atomic force fields (such as AMBER, CHARMM, GROMOS, and OPLS) (
Figure 1a). These force fields are parameterized using experiments or quantum chemistry calculations of small systems
[1][2][3][4][5][6][23,24,25,45,46,47]. CHARMM36
[3][25] and AMBER ff99bsc1
[6][47], which have been validated and improved through multiple revisions, are commonly used for DNA simulations. Although these force fields have limitations, such as AMBER potentially overestimating base stacking effects and CHARMM weakening base pairing
[7][8][48,49], they have been successfully employed to simulate DNA systems, providing atomistic resolution and establishing quantitative relationships between structure and conformational energy
[9][10][50,51].
Figure 1. (
a) Schematic diagram of MD simulations. (
b) Diagram of calculations on the flexibility of dsDNA in MD simulations. Left: initial conformation. Right: equilibrium conformation. (
c) MD simulations for short ssDNA in ion solutions. (
d) Distributions of the angle between contiguous P-C4′-P atoms from experimental structures (red) and MD simulated conformations (blue). The
three-dimensional (3D
)f structures are shown with PyMol (
http://www.pymol.org).
2. Structural Dynamics
MD simulations have been effective in accurately probing the atomic motions and structural dynamics of DNAs
[11][12][13][14][15][16][17][18][52,53,54,55,56,57,58,59], enabling
reus
earchers to understand the DNA functions. To address the question of how long an MD simulation of a B-DNA helix needs to be to sample the dominant structural and dynamical features, Galindo-Murillo et al. presented an extensive analysis using multiple μs-length MD simulations of a dsDNA (d(GCACGAACGAACGAACGC)) with Amber 14 and a ff99SB parmbsc0 or CHARMM C36 force field on multiple computer architectures (including Anton, CPU, and GPU). The results showed that despite the underlying differences in hardware, the simulations performed on different architectures exhibited minimal structural variation with respect to one another. These MD simulations, including the longest one at ~44 μs, also suggested that the structure and dynamics of the DNA helix, excluding the terminal base pairs, reach near-full convergence on the ~1–5 μs timescale. This indicates that the current force field is reasonably robust. However, the convergence of the terminal base pair opening events occurs on time scales significantly longer than 10 μs and cannot be fully captured through ensembles of shorter and independent MD simulations
[19][20][60,61]. In a separate study, Yang et al. performed umbrella MD simulations of A-T sequence-rich B-DNA using the Amber force field and reproduced the experimental conformational transition path from Watson–Crick to Hoogsteen base pairs observed in NMR relaxation dispersion spectroscopy
[21][62]. This indicates that MD simulations have the power to describe large-scale structural dynamics at short timescales using an advanced-sampling approach
[22][63].
In addition, MD simulations can also provide detailed insight into DNA structure dynamics. For example, Chakraborty et al. employed the AMBER12 package and Joung/Cheatham ion parameters to explore the transition between B- and Z-dsDNAs. Their study found that the free energy landscape exhibits two distinct funnels, leading to the B-DNA and Z-DNA conformations. This suggests that the reversal of chirality is caused by the stretched DNA structure or mutual competition at the B–Z junction
[23][64].
3. Structural Flexibility
In recent years, MD simulations have been widely used to study the flexibility of DNAs, as DNA structural flexibility is closely associated with many biological processes involving the storage or encoding of genetic information
[24][25][65,66]. Although many results from single-molecule experiments can be well-described by the commonly accepted WLC models
[26][27][28][27,28,29], atomistic MD simulations are extensively used to obtain microscopic descriptions of DNA flexibility, such as the width and depth of the major/minor grooves and the distances/twist angles between neighbor base pairs
[29][30][31][67,68,69]. For example, to explain the experimental results that short DNAs consisting of tens of base pairs (bps) may have seemingly higher flexibility than those of kilobase pairs, Wu et al. performed MD simulations for short dsDNAs with a finite-length of 5–50 bps using the Amber parmbsc0 force field. Their microscopic analyses (the calculation of stretching and bending at the base-pair level) revealed that the apparent high flexibility of short dsDNAs arises from significantly strong bending and stretching flexibilities at each helix end, consisting of ∼6 bps
[32][70]. In addition to the length-dependent flexibility of DNA, Marin-Gonzalez et al. performed over 1μs-long constant-force MD simulations of 18 bp-long dsDNAs (CGCG(NN)
5CGCG, with NN as the AA, AC, AG, AT, CG, and GG). They found that the DNA crookedness (a sequence-dependent deformation of DNA that consists of periodic bends in the chain of base pair centers) and its associated flexibility can regulate DNA mechanical properties at short length scales. This unveiled a one-to-one relation between DNA structure and dynamics
[33][71]. To understand the distinct differences in the flexibility of dsRNA and dsDNA helices, Liebl et al. performed unrestrained/restrained MD simulations for a 16 bp dsDNA or dsRNA using the AMBER12 package with the parmbsc0 force field. Their detailed analysis of helical deformations, coupled with twist, indicated that the interplay of helical rise, base pair inclination, and displacement from the helix axis during twist changes is responsible for the different twist–stretch correlations
[34][72]. Coincidentally, Marin-Gonzalez et al. investigated the difference between dsDNA and dsRNA (16 bp) using microsecond-long MD simulations under constant stretching forces within the range of 1–20 pN. They showed that the opposite twist–stretch coupling of both molecules is due to the markedly different evolution of inter-strand distance with the stretching force, which is directly correlated with the slide base-pair parameter and sugar pucker angle
[35][73]. Recently, Bao et al. also conducted extensive MD simulations for larger dsDNA and dsRNA (40 bp) without applying stretch force, using the AMBER ff99bsc0 force field. Their work provides a more quantitative understanding of the relative flexibility of dsRNA and dsDNA in terms of both stretching and twist–stretch coupling. They noted that the striking difference in twist–stretch coupling between dsRNA and dsDNA is attributed to the apparently stronger base-pair inclination in dsRNA compared to dsDNA (
Figure 1b)
[36][74].
In addition, MD simulations can be used to reproduce the effect of base modifications or base-pair mutations on DNA flexibility
[37][38][75,76]. For example, Aksimentiev et al. combined MD simulations (using the NAMD program with a CHARMM36 force field) with a single-molecule cyclization assay to study how different cytosine modifications influence the physical properties of dsDNA (70 bp). They elucidated the microscopic mechanisms behind the changes in DNA flexibility induced by cytosine modifications: these modifications can promote or dampen structural fluctuations through the competing effects of base polarity and steric hindrance
[39][77]. Given that the appearance of mismatched base pairs (MMs) can result in the development of inherited genetic diseases, cancer, and aging, Rossetti et al. presented the first comprehensive study on the structure of MM-containing DNA duplexes (12 MMs, including A·A, A·C, A·G, C·A, C·C, C·T, G·A, G·G, G·T, T·C, T·G, and T·T, placed in the center of 13 bp duplexes, e.g., d(CCATACXATACGG)). They employed MD simulations (Gromacs v.4.5.5 program with parmbsc0 force field) and NMR spectroscopy and found that the presence of mismatches produced significant local structural alterations due to the flexible MMs (especially in the case of purine transversions). These alterations could be propagated far from the mismatch site, influencing the global structures of DNA
[40][78]. On the other hand, Bouchal et al. also employed MD simulations (Amber 16 package with parmbsc1 force field) to calculate the thermodynamic stabilities of MMs in similar dsDNAs (e.g., d(GGTTAAXTTAACC) with anti/anti, anti/syn, and syn/anti MM combinations) as a function of two geometry parameters of the base pair (opening and shear). However, their detailed analysis showed that there was no clear dissection between the canonical and mismatched base pairs
[41][79]. This discrepancy suggests that MD simulations may be less credible in capturing the local sequence effects on DNA flexibility
[36][74] due to the empirical force field.
4. DNA–Ion Interaction
Since DNA is an anionic polyelectrolyte, the solvent environment plays a significant role in DNA structures
[42][43][44][45][46][80,81,82,83,84]. Pasi et al. performed microsecond MD simulations for 39 dsDNAs (with a length of 18 bp and different sequences) under physiological salt conditions using the parmbsc0 force field with Dang parameters for the ions. They provided a comprehensive state-of-the-art perspective on sequence-dependent potassium ion populations. For example, they observed that potassium ions within the grooves are more likely to accumulate around electronegative base sites rather than the anionic phosphate groups
[47][85]. Considering the experimental results showing that high-valent cation can lead to the opposite effect on the elasticities of DNA and RNA duplexes, Fu et al. used MD simulations for 20 bp dsDNA and dsRNA in trivalent ion solutions (i.e., CoHex
3+). They found that these results were caused by different binding modes of the cations on dsDNA and dsRNA
[48][86]. More recently, Cruz-Leon et al. also combined high-resolution MT experiments with MD simulations (parmbsc1 force field on 33 bp dsDNA) to show that increasing ion concentration leads to a decrease in helical radius and crookedness, an increase in sugar pucker, and ultimately an increase in a twist. This is due to the increased screening of electrostatic repulsion between phosphate groups
[49][87].
Furthermore, MD can provide an atomistic understanding of how DNA–ion interactions vary with different metal ions (
Figure 12b,c). For example, Long et al. performed MD simulations to sample the structures of a 23 bp DNA duplex in various ion solutions (such as Mg
2+, Ca
2+, Sr
2+, or Ba
2+). They demonstrated that these ions exhibit a preference for binding to the phosphate backbone rather than the major groove
[50][88]. To investigate the competitive binding of divalent and monovalent ions to dsDNA, Xi et al. performed all-atom MD simulations for a 24 bp dsDNA in mixed Mg
2+/Na
+ solutions using the Amber parmbsc0 force field with Joung/Cheatham ion model for Na
+/Cl
− and the Aqvist ion model for Mg
2+. Their comprehensive analysis suggested that the global binding of Mg
2+ over Na
+ to nucleic acids is primarily dependent on the surface charge density and Mg
2+/Na
+ concentrations
[51][89].
5. Limitations
In the last 40 years, MD simulations have made significant progress in providing atomistic insights into DNA structures, including dynamics, flexibility, and ion binding. Although recent efforts combining experiments and simulations show promise for improving the accuracy of nucleic acid force fields, MD simulations are not always effective, particularly for ssDNAs
[52][53][90,91]. Recently,
rwe
searchers performed MD simulations for unstructured ssDNA (with a random sequence: 5′-CTGCCACGCCATGCCTGTGACGA-3′ at 1 M [Na
+]) and tried to extract the bonded parameters from the equilibrium conformations. However,
rwe
searchers found that the distributions of several angles in MD conformations deviated from those observed in PDB structures. For example, the P-C4′-P angle showed a deviation of ~11° from its optimal value, as shown in
Figure 12d. In addition, the ion parameters, which are optimized based on a set of experimental solution properties such as solvation-free energies, radial distribution functions, water exchange rates, and activity coefficient derivatives, could be limited in their transferability to quantitatively describe biomolecular systems
[54][55][92,93]. Thus, further investigations of diverse DNA structures (e.g., ssDNA, pseudoknots, G-quadruplexes, i-motifs, and DNA complexes) in ion solutions are needed to further assess the quality of these force fields
[52][53][56][57][90,91,94,95].
Furthermore, MD simulations in equilibrium are not always adequate to sufficiently explore the structural space needed for accurate property estimation
[58][59][96,97]. In MD simulations, the initial conformation is usually established based on an experimentally known structure. If the molecule acquires another stable conformation that is separated by a high free energy barrier, the system’s acquisition of this alternative conformation within a realistic computational cost becomes challenging due to the barrier
[53][91]. Finally, sampling remains an issue in some nucleic acid simulations, thus requiring the extension of simulation time scales and exploration of efficient enhanced sampling methods (e.g., temperature replica exchange, Hamiltonian and multi-dimensional replica exchange, metadynamics, and umbrella sampling). These efforts are important for future advancements
[60][61][62][98,99,100].