Please note this is a comparison between Version 1 by Chao Yang and Version 2 by Sirius Huang.

High-entropy alloys (HEAs) have attracted worldwide interest due to their excellent properties and vast compositional space for design. However, obtaining HEAs with low density and superior properties through experimental trial-and-error methods is inefficient and costly. Although high-throughput calculation (HTC) improves the design efficiency of HEAs, its predictive accuracy is limited owing to the indirect correlation between theoretical calculation values and actual performance. Machine learning (ML), which learns from real data closely related to performance, has attracted increasing attention as an aid to material design.

- artificial intelligence design
- high-entropy alloy
- machine learning

The concept of high-entropy alloys (HEAs) was introduced by Cantor [1] and Yeh [2] in 2004. HEAs usually consist of four or five elements in equal or nearly equal atomic percentages (at.%), with the atomic fraction of each component typically greater than five percent [3]. Their high configurational entropy of mixing favors the formation of solid-solution phases [4]. They mainly possess Face-Centered Cubic (FCC), Body-Centered Cubic (BCC), and Hexagonal Close-Packed (HCP) structures [5]. Unlike conventional alloys, the complex compositions of HEAs lead to exceptional effects. HEAs usually exhibit outstanding physical and chemical properties, e.g., high mechanical properties, superior fatigue and wear resistance, good ferromagnetic and superparamagnetic properties, and excellent irradiation and corrosion resistance [6][7][8][9][10]. Through optimized composition design, HEAs with lower density and better performance can be obtained, achieving the goal of lightweight HEAs [7]. However, due to their flexible compositions and ample performance-tuning space, obtaining HEAs with low density and superior properties solely through experimental trial-and-error methods requires a substantial investment of time and labor, resulting in low efficiency and high costs.

In recent years, computer-assisted design methods have made significant progress in the field of HEAs. High-throughput calculation (HTC) is one promising computer-assisted design method, characterized by concurrent calculations and an automated workflow, which enables multiple tasks to be computed efficiently in parallel rather than sequentially [11]. HTC initially focused on the quantum scale, effectively meeting the demand for expediting the discovery of new materials with exceptional performance. More recently, the concept of HTC has been applied to micro-thermodynamic scales, becoming a rapid method for obtaining phase information in metal structural materials [12]. High-throughput first-principles calculations and thermodynamics calculations are the two main HTC technologies. High-throughput first-principles calculations, which do not rely on empirical parameters, can predict material property data from element types and atomic coordinates alone [13]. They play an indispensable role in understanding and designing target materials from a microscopic perspective and enable the quantitative prediction of composition optimization, phase composition, and the structure–property relationship of materials. High-throughput first-principles calculations offer specific roles and advantages in three aspects of HEA research: (1) the accurate construction of long-range disordered and short-range ordered structures; (2) the precise prediction of the stability of HEA phases; (3) the accurate calculation of the mechanical properties of HEAs [14]. The process of screening HEAs based on high-throughput thermodynamics calculations combines equilibrium calculations with non-equilibrium Scheil solidification calculations [15].
By using high-throughput calculation to predict the melting point, phase composition, and thermodynamic properties of HEAs after processing, the alloy composition space satisfying criteria such as melting point and phase volume fraction can be rapidly obtained [16]. This assists in the quick identification of effective alloy compositions, reducing the frequency of experimental trial and error. High-throughput thermodynamics calculations demonstrate specific functions and advantages in three respects: (1) the accurate acquisition of the phase diagrams and thermodynamic properties of HEAs; (2) the rapid retrieval of key microstructural parameters for HEAs; (3) the implementation of cross-scale analysis [12]. However, HTC technology mainly uses theoretical calculation values such as phases, melting points, and various energies as data sources. Although the amount of data used in HTC is huge, its direct correlation with HEA performance, and hence the accuracy of performance prediction, remains far from satisfactory. Therefore, the current HTC method can only serve as a reference criterion for HEA design, and a certain number of experiments are still needed to verify the accuracy of HTC design results.
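The screening idea described above — enumerating a candidate composition space and filtering it against design criteria — can be sketched in a few lines of Python. This is a minimal illustration, not a published workflow: the element set and grid step are arbitrary, and the filter uses only the 5 at.% minimum-fraction rule mentioned earlier as a stand-in for real criteria such as melting point and phase volume fraction.

```python
from itertools import product

def composition_grid(elements, step=0.25):
    """Enumerate atomic-fraction compositions on a simplex grid
    (each fraction is a multiple of `step`; fractions sum to 1)."""
    n = round(1 / step)
    for parts in product(range(n + 1), repeat=len(elements) - 1):
        if sum(parts) <= n:
            fracs = [p / n for p in (*parts, n - sum(parts))]
            yield dict(zip(elements, fracs))

def is_candidate(comp, min_frac=0.05):
    """Toy screening criterion: every element present above 5 at.%.
    A real HTC workflow would also test melting point and phase
    volume fractions computed thermodynamically."""
    return all(c >= min_frac for c in comp.values())

elements = ["Al", "Cr", "Fe", "Ni"]  # illustrative element set
candidates = [c for c in composition_grid(elements) if is_candidate(c)]
```

With this coarse 25 at.% grid, only the equiatomic composition survives the filter; a finer grid step rapidly enlarges the candidate space, which is exactly why automated concurrent evaluation becomes necessary.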

In the past decade, the rapid ascent of artificial intelligence (AI) has brought a transformative revolution [17]. This revolution has not only fundamentally reshaped various domains of computer science, including computer vision and natural language processing, but has also made a significant impact on numerous scientific fields, including materials science. The success of AI comes from its ability to comprehend complicated patterns, and these AI models and algorithms can be systematically refined through learning from real data, which is closely related to performance [18]. This capability is further enhanced by the availability of computational resources, efficient algorithms, and substantial data collected from experiments or simulations. The exponential increase in relevant publications is indicative of this trend. In essence, with a sufficiently large dataset of high quality, AI can effectively capture intricate atomic interactions through the standard procedures of training, validation, and testing [19]. Additionally, AI models and algorithms can identify non-linear structure–property relationships that are challenging to determine through human observation [20]. These attributes position AI as an effective tool for tackling the challenges associated with the theoretical modeling of materials [21]. Machine learning (ML) is one of the most important technologies used for the AI design of materials [22]. This method, built on comprehensive experimental and theoretical studies, enables rapid data mining, reveals underlying information and patterns, and accurately predicts material properties for target material selection [23]. However, the small size of available datasets remains a key issue in HEA design, placing high demands on the accuracy and generalization ability of ML models and algorithms.

ML is a multidisciplinary field involving probability theory, statistics, approximation theory, and algorithmic complexity theory [24]. The concept of ML, first introduced by Samuel in 1959, has evolved into a cross-disciplinary field spanning computer science, statistics, and other disciplines. Due to its efficient computational and predictive capabilities, ML has gradually been applied in materials science research [25]. In recent years, ML has gained widespread attention and demonstrated outstanding capabilities in the development of new materials and the prediction of material properties [26]. A notable example is the 2016 article published in Nature, titled “Machine-learning-assisted materials discovery using failed experiments”, which successfully predicted chemical reactions and the formation of new compounds by mining a large dataset of failed experimental data, further fueling the momentum of research on applying ML to materials [27].

With an in-depth understanding of the concept of materials engineering, ML has found extensive applications in the design, screening, and performance prediction and optimization of materials [28]. Data-driven methods significantly expedite the research and development process, reducing time and computational costs. Whether on the micro or macro scale, this approach can be applied to new material discovery and the prediction of material properties in the field of materials science.

The discussion of phase-formation rules has always accompanied research on HEAs, and phases play a crucial role in HEA design. In the HEA design strategy, predicting the composition and phase stability of unknown alloy components is an essential aspect [29]. Widely used descriptors for phase prediction include the entropy of mixing, enthalpy of mixing, elastic constants, melting temperature, valence electron concentration, and electronegativity [30]. As research advances, alloy elemental contents, or various combinations of the intrinsic properties of the constituent elements — such as atomic radius difference, valence electron count, configurational entropy, and mixing enthalpy — are increasingly used as model inputs. By modeling these inputs directly, relationships between element combinations and phase formation can be obtained.
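As a concrete illustration, three of the descriptors named above — configurational entropy of mixing, atomic size difference, and valence electron concentration (VEC) — can be computed directly from a composition using the standard definitions ΔS_mix = −R Σ c_i ln c_i, δ = √(Σ c_i (1 − r_i/r̄)²), and VEC = Σ c_i (VEC)_i. The sketch below uses approximate metallic radii and valence electron counts for the elements of the equiatomic Cantor alloy CoCrFeMnNi; in practice these values should be taken from a standard reference.

```python
import math

R = 8.314  # gas constant, J/(mol*K)

# Approximate metallic radii (Å) and valence electron counts.
# Illustrative values -- check against a standard data source.
RADIUS = {"Co": 1.25, "Cr": 1.28, "Fe": 1.26, "Mn": 1.27, "Ni": 1.25}
ELECTRONS = {"Co": 9, "Cr": 6, "Fe": 8, "Mn": 7, "Ni": 10}

def descriptors(comp):
    """comp: dict of element -> atomic fraction (fractions sum to 1).
    Returns (entropy of mixing, atomic size difference, VEC)."""
    s_mix = -R * sum(c * math.log(c) for c in comp.values())
    r_bar = sum(c * RADIUS[e] for e, c in comp.items())
    delta = math.sqrt(sum(c * (1 - RADIUS[e] / r_bar) ** 2
                          for e, c in comp.items()))
    vec = sum(c * ELECTRONS[e] for e, c in comp.items())
    return s_mix, delta, vec

cantor = {e: 0.2 for e in ("Co", "Cr", "Fe", "Mn", "Ni")}
s_mix, delta, vec = descriptors(cantor)
# For any equiatomic five-component alloy, s_mix = R ln 5 ≈ 13.4 J/(mol*K)
```

Vectors of such descriptors, one per alloy, are the typical inputs to the phase-prediction models discussed here.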

Besides phase formation, exploring the relationship between the compositions and properties of HEAs is also an essential task. By establishing a correlation model between feature parameters and properties such as strength, it is possible to achieve the rapid prediction of material performance based on chemical composition. This method, supplemented by a substantial amount of experimental data, offers valuable guidance for alloy composition design.
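A minimal sketch of such a composition–property correlation model is a k-nearest-neighbors regressor, which predicts a property for a new alloy by averaging the measured properties of the most similar known alloys in descriptor space. The training data below are hypothetical placeholder values, not measured properties; descriptor axes (e.g., atomic size difference and VEC) are likewise only an assumption for illustration.

```python
import math

def knn_predict(train_X, train_y, query, k=3):
    """Predict a property for `query` by averaging the targets of the
    k nearest training compositions in descriptor space."""
    neighbors = sorted(
        (math.dist(x, query), y) for x, y in zip(train_X, train_y)
    )
    return sum(y for _, y in neighbors[:k]) / k

# Hypothetical descriptor vectors (atomic size difference, VEC) and
# hypothetical hardness values -- placeholders for experimental data.
train_X = [(0.9, 8.0), (1.1, 7.5), (1.0, 8.2), (4.5, 6.0)]
train_y = [520.0, 540.0, 530.0, 300.0]

prediction = knn_predict(train_X, train_y, (1.0, 8.0), k=3)
```

Real studies substitute a model suited to the dataset size and dimensionality (see the comparison table below) and validate it against held-out experimental data.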

Model | Advantages | Limitations |
---|---|---|
NNs | (1) Powerful for complex, non-linear relationships. (2) Robust to noisy data. (3) Ability to learn from large datasets. | (1) Prone to overfitting, especially with small datasets. (2) Requires careful tuning of parameters. (3) Black-box nature makes interpretation difficult. |
SVM [93] | (1) Effective in high-dimensional spaces. (2) Works well with small to medium-sized datasets. (3) Versatile due to kernel trick for non-linear classification. | (1) Can be slow to train on large datasets. (2) Sensitivity to choice of kernel parameters. (3) Memory-intensive for large-scale problems. |
GP | (1) Provides uncertainty estimates for predictions. (2) Flexible and interpretable modeling. (3) Can handle small datasets effectively. | (1) Computationally expensive to train on large datasets. (2) Performance depends strongly on the choice of kernel. (3) Scales poorly with the number of training points. |
KNN [94] | (1) Simple and easy to understand. (2) No training phase, making it fast to set up. (3) Robust to noisy data and outliers. | (1) Inference can be slow on large datasets, since all training points must be scanned. (2) Sensitive to feature scaling and irrelevant features. (3) Performance degrades in high-dimensional spaces. |
RF | (1) High accuracy and robustness. (2) Works well with high-dimensional data. (3) Handles missing values and maintains accuracy. | (1) Can be slow to predict on large datasets. (2) Lack of interpretability due to ensemble nature. (3) May overfit noisy datasets if not tuned properly. |