Nanomaterials and nanoparticles (NPs) possess unique physicochemical properties (size, shape, chemical composition, physicochemical stability, crystal structure, surface area, surface energy, and surface roughness), which give them beneficial characteristics. Quantitative structure-activity relationship (QSAR) modeling is an area of molecular modeling that studies relationships between structure and activity using mathematical statistics and machine learning methods. QSAR is efficiently used to predict the toxicity of chemical substances.
1. Introduction
Nanomaterials and nanoparticles (NPs) possess unique physicochemical properties (size, shape, chemical composition, physicochemical stability, crystal structure, surface area, surface energy, and surface roughness [1]), which give them beneficial characteristics. For this reason, nanotechnology is a new and rapidly growing field of knowledge that includes the design, development, and use of NPs and nanomaterials. According to the Organisation for Economic Co-operation and Development (OECD), there are 11 types of engineered nanomaterials (ENMs): Cerium oxide, dendrimers, fullerenes, gold nanoparticles, multi-walled carbon nanotubes (MWCNTs), nanoclays, silicon dioxide, silver nanoparticles, single-walled carbon nanotubes (SWCNTs), titanium dioxide, and zinc oxide.
The toxicity of ENMs and their influence on humans and the environment should be carefully evaluated [2][3]. Generally, there are five key mechanisms of ENMs’ toxicity: (1) Direct lesion by ion detachment; (2) oxidative stress induced by reactive oxygen species; (3) adsorption of biologically active molecules; (4) photochemical and redox reactions; and (5) Trojan horse effects (NPs may act as vectors for the transport of toxic compounds into cells) [4][5][6][7][8]. Complete experimental characterization of toxicity for every preparation is extremely laborious, so theoretical models that relate the structure and composition of ENMs to their biological activity are in demand.
Quantitative structure-activity relationship (QSAR) modeling (Figure 1) is an area of molecular modeling that studies relationships between structure and activity using mathematical statistics and machine learning methods. QSAR is efficiently used to predict the toxicity of chemical substances [9][10][11][12][13]. Classical QSAR is the so-called Hansch analysis [14], which rests on the assumption that the bioactivity of compounds is correlated with geometrical and physicochemical descriptors. Generally, a molecular descriptor can be considered a number describing a certain molecular property, which may be experimentally determined (e.g., dipole moment), calculated (e.g., potential energy), or derived from the chemical structure (e.g., number of methyl groups). A molecular descriptor may also be a mathematically obtained property (e.g., the Wiener, Balaban, or Randić indices); chemical graph theory is often used to derive such mathematical descriptors [15]. Three-dimensional (3D) QSAR is another approach, which relates the spatial structure of molecules and their interaction fields to activity. The first 3D QSAR technique was proposed in 1988 by Cramer and co-authors [16], who developed comparative molecular field analysis (CoMFA). CoMFA assumes that differences in bioactivity depend on changes in the strength of the non-covalent interaction fields (electrostatic and van der Waals) around the molecules. Another 3D QSAR method is comparative molecular similarity indices analysis (CoMSIA), which takes into account the same molecular interactions as CoMFA, with the addition of hydrophobic interactions and hydrogen bonding. CoMSIA was developed in 1994 [17]. Three-dimensional QSAR provides multiple benefits to a researcher who studies organic compounds. However, 3D QSAR and classical molecular descriptors are unable to express the specificity of nanoparticles, because their exact structure is usually unknown. This circumstance leads to a lack of molecular descriptors appropriate for nano-QSAR modeling [18].
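To make the idea of a graph-theoretical descriptor concrete, the sketch below computes the Wiener index, the sum of shortest-path distances between all pairs of atoms in a hydrogen-suppressed molecular graph. The adjacency-dictionary representation, the function name, and the two example molecules are assumptions chosen for illustration and are not taken from the cited works.

```python
from collections import deque


def wiener_index(adjacency):
    """Return the sum of shortest-path lengths over all unordered atom pairs.

    `adjacency` maps each atom label to a list of bonded atom labels
    (hydrogen-suppressed graph, all bonds counted with length 1).
    """
    total = 0
    for source in adjacency:
        # Breadth-first search yields shortest paths in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        total += sum(dist.values())
    # Every pair was visited twice, once from each endpoint.
    return total // 2


# Carbon skeletons of two C4H10 isomers (illustrative inputs).
n_butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
isobutane = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1]}

print(wiener_index(n_butane), wiener_index(isobutane))
```

For the four-carbon chain of n-butane the pairwise distances are 1, 1, 1, 2, 2, and 3, giving 10; the branched isomer gives 9, which shows how the index responds to branching.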
Figure 1. A typical workflow of QSAR modeling for nanoparticles (NPs).
Nano-QSAR (Figure 2) allows the efficient study of nanoparticles and the determination of correlations between their structure and activity [19]. Nano-QSAR may use all three approaches: One-dimensional (1D), two-dimensional (2D), and 3D QSAR [20][21][22]. This raises a question: Which technique (nano-Hansch, nano-CoMFA, or nano-CoMSIA) is the best way to study nano-objects? There have been attempts to answer this question. Jagiello and co-authors compared the performance of nano-QSAR and 3D nano-QSAR in a study of the activity of fullerene derivatives [23]. They concluded that nano-QSAR is the more universal approach, which allows gathering general information about the mode of biological activity of nanomaterials: Not only the receptor-based response, but also cell- and organism-based responses. The latter allow efficient prediction of the toxicity of nanoparticles. However, 3D QSAR should be applied to study the receptor-based response, where it helps in understanding such activity in detail [23]. In general, QSAR modeling of nanomaterials can reduce the need for time- and labor-consuming cytotoxicity tests, which is both important and economically advantageous.
Figure 2. A general scheme of nano-(Q)SAR modeling. 0D, zero-dimensional; 1D, one-dimensional; 2D, two-dimensional; 3D, three-dimensional.
Metal oxide NPs are used in renewable energy, wastewater treatment, electronics, cosmetics, textiles, foods, agriculture, medicine, pharmaceutics, and for many other purposes. Metal oxides are probably the most thoroughly studied objects of nano-QSAR research. The pioneering work by Hu et al. investigated seven nano-sized metal oxides: ZnO, CuO, Al2O3, La2O3, Fe2O3, SnO2, and TiO2. They applied the multiple linear regression (MLR) method. The cytotoxicity towards Escherichia coli was found to be highly correlated with the metal cation charge: The higher the cation charge, the lower the cytotoxicity of the nano-sized metal oxide [24]. The cytotoxicity of the metal oxide ENMs was measured in terms of LD50, the dosage of NPs shown to cause the death of 50% of E. coli cells.
The oxidative stress potential of metal oxide NPs could be predicted from their band gap energy [5]. Puzyn and co-authors developed a model describing the cytotoxicity towards Escherichia coli of nanoparticles based on 16 different metal oxides and SiO2 [20]. All quantum-chemical calculations were performed using the PM6 semi-empirical method. They applied the MLR method combined with a genetic algorithm. The resulting model was characterized by R² = 0.862. It reliably predicted the toxicity of all metal oxides and included only one descriptor, ΔHMe+, which is the enthalpy of formation of a gaseous cation. The endpoint of the cytotoxicity measurement was LD50, and log(1/LD50) was used as the dependent variable in the MLR equation.
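As an illustration of a one-descriptor linear model of this kind, the sketch below fits a response such as log(1/LD50) against a single descriptor by ordinary least squares and reports R². It is a minimal sketch and not the authors' procedure: the genetic-algorithm descriptor selection is omitted, the function names are hypothetical, and the numbers are made-up placeholders rather than values from [20].

```python
def fit_simple_linear(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """Coefficient of determination of the fitted line on (x, y)."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Placeholder data only: a scaled descriptor and a toxicity-like response.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
response = [3.1, 2.8, 2.6, 2.1, 2.0]

b0, b1 = fit_simple_linear(descriptor, response)
print(f"intercept={b0:.3f}, slope={b1:.3f}, "
      f"R2={r_squared(descriptor, response, b0, b1):.3f}")
```

With several candidate descriptors, the same fit would be repeated for each subset under consideration, which is the step a genetic algorithm automates.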
The structure–cytotoxicity relationship for the same dataset of 17 metal oxide NPs was further investigated in a succession of papers [18][25][26][27][28][29][30][31]. Density functional theory (DFT)-based descriptors (energy gap, hardness, softness, electronegativity, and electrophilicity index), in conjunction with the MLR statistical method, were used to find a high correlation between experimental and predicted activity values [26]. The absolute electronegativity is defined as half the sum of the ionization potential (IP) and the electron affinity (EA). The absolute hardness is defined as half the difference between the ionization potential and the electron affinity. Within the Koopmans’ theorem approximation (IP ≈ −E_HOMO, EA ≈ −E_LUMO), these parameters can be expressed through the energies of the highest occupied molecular orbital (HOMO) and the lowest unoccupied molecular orbital (LUMO). Thus, electronegativity (χ) is determined according to the equation:
χ = (IP + EA)/2 ≈ −(E_HOMO + E_LUMO)/2
Hardness (η) is determined according to the equation:
η = (IP − EA)/2 ≈ (E_LUMO − E_HOMO)/2
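Under the Koopmans’ approximation described above, both quantities follow directly from the frontier orbital energies. The sketch below computes them assuming IP ≈ −E_HOMO and EA ≈ −E_LUMO; the function names and the example energies (in eV) are illustrative assumptions, not values from the cited studies.

```python
def electronegativity(e_homo, e_lumo):
    """Absolute electronegativity: chi = (IP + EA) / 2 = -(E_HOMO + E_LUMO) / 2."""
    return -(e_homo + e_lumo) / 2.0


def hardness(e_homo, e_lumo):
    """Absolute hardness: eta = (IP - EA) / 2 = (E_LUMO - E_HOMO) / 2."""
    return (e_lumo - e_homo) / 2.0


# Illustrative orbital energies in eV (placeholders, not computed values).
e_homo, e_lumo = -7.0, -3.0
print(electronegativity(e_homo, e_lumo))  # 5.0
print(hardness(e_homo, e_lumo))           # 2.0
```

Both results carry the units of the input energies, and the hardness is half the HOMO-LUMO energy gap.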

In a model by Kar et al., electronegativity (χ) and the charge of the metal cation were used as molecular descriptors to build QSAR models for the prediction of the cytotoxicity of metal oxide NPs (Table 1). They hypothesized that small particles of metal oxides release an electron much more easily than the same particles in the crystal structure; small fragments initiate the formation of reactive oxygen species, which induce oxidative stress in bacteria [27]. A simple QSAR model with high predictive ability (R² = 0.87) was built based on two descriptors: Absolute electronegativity of the metal and electronegativity of the metal oxide [31]. In addition, a high correlation (R² = 0.804) was obtained for the prediction of the photo-toxicity of metal oxide NPs using two descriptors: Molar heat capacity and LUMO energy of the metal oxide [31]. The best model by Mu et al. associated the cytotoxicity of 16 metal oxide NPs towards E. coli with the enthalpy of formation of a gaseous cation (ΔHMe+) and the polarization force (Z/r) [32]:
log(1/EC50) = (4.412 ± 0.165) + (−0.121 ± 0.068) Z/r + (0.001 ± 2.57 × 10^−4) ΔHMe+
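The regression equation above can be evaluated directly once the two descriptors are known. The sketch below uses only the point estimates of the coefficients and ignores their reported uncertainties; the function and argument names are hypothetical, and the descriptor units are assumed to be those of the original paper, which are not stated here.

```python
def log_inv_ec50(z_over_r, delta_h_me):
    """Point prediction of log(1/EC50) from the two-descriptor equation.

    z_over_r   : polarization force Z/r (units as in the original study)
    delta_h_me : enthalpy of formation of the gaseous cation
                 (units as in the original study)
    """
    return 4.412 - 0.121 * z_over_r + 0.001 * delta_h_me


# Arbitrary inputs to show the call; they are not data for any real oxide.
print(log_inv_ec50(z_over_r=1.0, delta_h_me=1000.0))
```

Because the coefficient of ΔHMe+ is small, its contribution only becomes comparable to the other terms when the enthalpy values are numerically large.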
The model by Pan et al. used the same dataset, a simplified molecular input line entry system (SMILES)-based optimal descriptor, and the MLR method, and showed the highest predictive ability for both the training set (R² = 0.89–0.98) and the test set (R² = 0.82–0.87) [18]. Other works [20][26][27][31][32] also used the MLR method.
Table 1. Main features of (Q)SAR models predicting cytotoxicity of metal oxide nanoparticles.
1 Missing R² value means that an SAR model was built instead of QSAR.
2 If software record is missing, then it was not mentioned in the original paper.
3. Other Metal-Containing Nanoparticles
In a pioneering work [55], a support vector machine (SVM) classification model was developed using the experimental data on 44 different NPs from Shaw et al. [56]. The model used four experimentally determined descriptors: Size, zeta potential (which characterizes the surface charge), and the R1 and R2 relaxivities (which characterize the magnetic properties). The authors concluded that QSAR is an appropriate methodology for predicting the cytotoxicity of novel nanomaterials, as well as for the design and manufacture of safer NPs. Fourches and co-authors also analyzed a dataset by Weissleder et al. [57], in which cellular uptake was evaluated. They used both SVM classification and k-nearest neighbors (kNN) regression to build predictive models. The most important descriptors were lipophilicity and the number of double bonds [55]. Yet another nano-QSAR study for the prediction of the cytotoxicity of metal-containing NPs was conducted in [58] using the smooth muscle cell data from Shaw et al. [56]. The model was built on cytotoxicity data for 31 NPs using MLR and a Bayesian regularized artificial neural network. The model predicting smooth muscle apoptosis (SMA) consisted of three descriptors: Core material (IFe2O3), surface coating (Idextran), and surface charge (Isurf.chg):
SMA = 2.26(±0.72) − 10.73(±1.05)IFe2O3 − 5.57(±0.98)Idextran − 3.53(±0.54)Isurf.chg
IFe2O3 was set to 1 for the Fe2O3 core and 0 when the core was Fe3O4. Idextran was equal to 1 in the case of dextran coating and 0 otherwise. The surface charge descriptor Isurf.chg was equal to 1 (basic surface functionality), −1 (acidic), or 0 (neutral). The model had a coefficient of determination of 0.81 for the training set and 0.86 for the test set. Table 2 summarizes the information about nano-(Q)SAR models predicting cytotoxicity of metal-containing nanoparticles.
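The indicator-variable encoding used by the SMA model can be sketched as follows. The code maps categorical descriptions of a particle onto the three indicators and evaluates the linear equation with the point estimates of the coefficients; the function name, argument names, and string labels are assumptions made for this example.

```python
def predict_sma(core, coating, surface):
    """Evaluate the three-indicator SMA equation (point estimates only).

    core    : "Fe2O3" or "Fe3O4"
    coating : any label; only "dextran" sets the coating indicator
    surface : "basic", "acidic", or "neutral"
    """
    if core not in ("Fe2O3", "Fe3O4"):
        raise ValueError("the model is defined only for Fe2O3 and Fe3O4 cores")
    surface_codes = {"basic": 1, "acidic": -1, "neutral": 0}
    if surface not in surface_codes:
        raise ValueError("surface must be 'basic', 'acidic', or 'neutral'")

    i_fe2o3 = 1 if core == "Fe2O3" else 0
    i_dextran = 1 if coating == "dextran" else 0
    i_surf_chg = surface_codes[surface]

    return 2.26 - 10.73 * i_fe2o3 - 5.57 * i_dextran - 3.53 * i_surf_chg


# A Fe3O4 core without dextran and with neutral surface groups leaves
# only the intercept.
print(predict_sma("Fe3O4", "other", "neutral"))
```

Since every descriptor is an indicator, the model can only return a small set of distinct values, one for each combination of core, coating, and surface class.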
Table 2. Main features of (Q)SAR models predicting cytotoxicity of metal-containing nanoparticles.
4. Multi-Walled Carbon Nanotubes (MWCNTs)
Certain MWCNTs display asbestos-like toxic effects. To reduce the need for risk assessment, it has been suggested that the physicochemical characteristics or reactivity of nanomaterials could be used to predict their hazard. Fiber shape and the ability to generate reactive oxygen species (ROS) are important indicators of high-hazard materials. Asbestos is a known ROS generator, while MWCNTs may either produce or scavenge ROS [68].
Table 3 summarizes the information about nano-(Q)SAR models predicting cytotoxicity of MWCNTs.
Table 3. Main features of (Q)SAR models predicting cytotoxicity of multi-walled carbon nanotubes.
5. Fullerenes
Toropov et al. continued to study the toxicity of fullerenes in further publications. The experimental data on the cytotoxicity of C60 NPs towards Salmonella typhimurium were examined [74]. A mathematical model was constructed by means of quasi-SMILES descriptors obtained with the Monte Carlo method. The model was a function of dose, metabolic activation (S9 mix), and illumination (darkness or irradiation). Only one split into training, calibration, and validation sets was made. The statistical parameters of the model were not notably high: R² = 0.755, q² = 0.571 [76]. In the next study, two datasets were used, from the bacterial reverse mutation test performed using either S. typhimurium or E. coli strain WP2 uvrA/pKM101 [74]. Mathematical models were built by means of quasi-SMILES optimal descriptors calculated with the Monte Carlo method (several splits into training, calibration, and validation sets were made). The models were a function of the same experimental conditions as in the previous study: dose, metabolic activation, and illumination [77].
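The difference between a single split and several splits into training, calibration, and validation sets can be sketched as below. This is a generic random partition and not the procedure of the cited studies: the 60/20/20 proportions, the seeds, the function name, and the record labels are all assumptions for illustration.

```python
import random


def three_way_split(items, fractions=(0.6, 0.2, 0.2), seed=0):
    """Randomly partition items into training, calibration, and validation sets."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    shuffled = list(items)
    # A dedicated generator keeps each split reproducible from its seed.
    random.Random(seed).shuffle(shuffled)
    n_train = round(fractions[0] * len(shuffled))
    n_calib = round(fractions[1] * len(shuffled))
    training = shuffled[:n_train]
    calibration = shuffled[n_train:n_train + n_calib]
    validation = shuffled[n_train + n_calib:]
    return training, calibration, validation


# Hypothetical record identifiers standing in for experimental data points.
records = [f"record_{i}" for i in range(20)]

# Several independent splits, one per seed.
splits = [three_way_split(records, seed=s) for s in range(3)]
for training, calibration, validation in splits:
    print(len(training), len(calibration), len(validation))
```

Building a model on each split and comparing the resulting statistics gives an indication of how strongly the reported quality depends on one particular partition of the data.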
Table 4 summarizes the information about nano-(Q)SAR models predicting cytotoxicity of fullerenes.
Table 4. Main features of (Q)SAR models predicting cytotoxicity of fullerenes.
6. Silica Nanomaterials
Silica (SiO2), or silicon dioxide, is one of the most commonly used ENMs. Silica can be divided into two types: Non-crystalline (amorphous) and crystalline. Amorphous SiO2 is also divided into natural amorphous silica and synthetic SiO2. SiO2 has been studied thoroughly, along with metal oxide NPs, which are discussed above. Here, we concentrate exclusively on silica NPs. Table 5 summarizes the information about nano-(Q)SAR models predicting cytotoxicity of silica nanomaterials.
Table 5. Main features of (Q)SAR models predicting cytotoxicity of silica nanomaterials.