A fault is an unexpected, abnormal, and undesirable situation, behavior, or imperfection at the equipment, component, or sub-system level that may cause a failure. Faults reduce the wear and corrosion resistance of the product, lower production quality, and in the worst case yield unusable materials. Such a physical malfunction can lead to unavoidable crashes and stop the system from working properly. Fault prediction is the process of identifying fault-prone components in a specific domain by means of predictive analytics. In other words, it predicts deviations of materials from their expected or normal states. Determining fault types effectively can reduce unexpected waste and maintenance, repair, or replacement costs, and can improve product quality and production efficiency. Fault prediction helps extend equipment lifetime and improve asset utilization in various industrial environments. Moreover, it helps avoid a long-term decline in the total profits of the related system and a loss of customer confidence. The higher the quality level a product requires, the better the fault prediction technique the industry should develop. In this context, intelligent systems derived from research on machine learning have been established to handle this issue correctly and quickly.
The steel industry is one of the primary industries that require fault prediction in order to produce materials as meticulously as possible. From machines to artworks, steel plates are used in a diverse range of applications, namely industrial machinery, building construction, automobile chassis construction, bridge structures, and shipbuilding. Given such widespread applications, high-accuracy control of steel plate surfaces is important for meeting strict quality requirements. However, manufacturing flat steel sheets has long been regarded as difficult because of their tendency to deform, which is often caused by the steel surface coming into contact with different machines during manufacturing steps such as casting, drawing, pressing, cutting, and folding. Consequently, this entry aims to recognize the types of defects that steel plates have. One traditional approach is manual inspection of steel plates by human experts. However, this practice is time-consuming, inaccurate, and costly; it demands considerable human effort, and defects can still be overlooked. Therefore, automation of fault prediction is necessary to reduce costs and minimize the time needed for monitoring. Here, machine learning plays an important role by analyzing past data to find hidden patterns and then constructing models to predict the faults. Machine learning-based fault prediction methods facilitate precautionary maintenance and help avoid material quality problems through more accurate and efficient decisions.
2. Machine Learning-Based Fault Prediction
Machine learning-based fault prediction has been investigated with real-time monitoring in manufacturing environments. In 
, random forest (RF) classification was employed for the prediction of input data issues, and the NoSQL MongoDB as a big data technique was applied to the collected environmental dataset from the Internet of Things (IoT) sensors in an automotive manufacturing production line. Moreover, blockchain technology was utilized for covering system security. In another work 
, the utilization of machine learning models in the battery management system of a lithium-ion battery for the prediction of faults in the remaining useful life (RUL), charge state, and health state was presented by means of a neural network (NN) with a support vector machine (SVM), genetic algorithm back propagation neural network (GA-BPNN), RF, Gaussian process regression (GPR), logistic regression (LR), and long short-term memory recurrent neural network (LSTM-RNN). In another study 
, the authors focused on a bearing fault prediction method for electric motors by applying a medium Gaussian support vector machine (MG-SVM) on a motor bearing dataset.
The application of deep learning methods to predict faults has been investigated in various studies 
. In 
, a deep learning-based fault prediction workflow for seismic data was developed, in which convolutional neural networks (CNNs) for image recognition, the U-Net architecture for image segmentation, random forest for identifying the most important attributes, and a GAN-based reconstruction approach for clarifying fault locations were used. As a result, the “discontinuity along dip” feature was identified as the most important of the seismic attributes, and the prediction accuracy of fault probability maps was improved. Similarly, in another work 
, the authors proposed a structure-based data augmentation framework to boost the variety of the semi-real-semi-synthetic seismic dataset collected from various work areas in the Tarim Basin of China for improving fault prediction and identification on the basis of deep neural networks and U-Net, respectively. In another work 
, fault prediction and cause identification approaches based on deep learning in complex industrial processes were reported. The authors utilized deep learning to predict the fault events, long short-term memory (LSTM) to adapt to the branch structures, and an attention mechanism algorithm for fault detection and cause identification on the sensor-based data in a production line considering various fault types. Yang and Kim 
detected recurrent and accumulative fault situations and calculated the anomaly scores in the data by using the LSTM method.
Fault prediction in wind turbines has been investigated in previous studies 
since it is a critical issue for maintaining the reliability and safety of energy systems. In 
, a novel solution for predictive maintenance of wind turbine generators was developed by means of supervisory control and data acquisition (SCADA) systems that monitor the operating state of the generators. Principal component analysis (PCA), SVM, NN, K-nearest neighbors (KNN), and naive Bayes (NB) classifiers were used to discriminate between the various statuses of wind turbine generators. The synthetic minority oversampling technique (SMOTE) was applied to manage the imbalanced dataset for wind power plants consisting of numerous wind turbines located in China. The presented work aimed at low deployment costs while diagnosing specific types of generator faults with high accuracy. In another study 
, the authors focused on a stacking gearbox fault prediction model for wind turbines on the basis of the SCADA data for wind turbines in a wind farm. The main techniques applied were recursive feature elimination (RFE) for selecting appropriate features, and RF, extreme gradient boosting (XGBoost), and gradient boosting decision tree (GBDT) for describing the usual circumstances of the wind turbines. The results revealed that the RF, GBDT, and XGBoost approaches outperformed KNN, SVM, decision tree (DT), and AdaBoost according to the high R² scores and the low mean absolute error (MAE) and root mean square error (RMSE) metrics for various turbine types.
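The three regression metrics named above (R², MAE, and RMSE) can be computed directly from observed and predicted values. The following is a minimal sketch for illustration only; the function name `regression_metrics` and the sample values are hypothetical and are not taken from the cited study.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R^2) for two equal-length sequences of numbers."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n            # mean absolute error
    rmse = math.sqrt(sum(e * e for e in errors) / n)  # root mean square error
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    r2 = 1.0 - ss_res / ss_tot  # undefined when all targets are identical
    return mae, rmse, r2

# Hypothetical values, for illustration only.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 9.0]
mae, rmse, r2 = regression_metrics(y_true, y_pred)
print(mae, rmse, r2)
```

A model is preferred when its R² is closer to 1 and its MAE and RMSE are closer to 0, which is the sense in which the tree-based ensembles are described as outperforming the other methods.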
Wan et al. 
presented a model based on the Dempster–Shafer (DS) evidence theory and a quantum particle swarm optimization back-propagation (QPSO-BP) neural network for the prediction of rolling bearing fault types under different operating conditions. They found the optimal initial weights and thresholds of the neural network. Using a rolling bearing dataset, the authors achieved higher accuracy with the presented method than with SVM-DS, DT-DS, RF-DS, KNN-DS, and K-means-DS in terms of the macro area under curve (AUC) metric.
Yang and Li 
developed a fault prediction method for wind energy conversion systems to improve the performance of the fault prediction model, shorten the time of fault prediction, and reduce the deviation between the actual fault value and the predicted fault value. The superiority of the presented method was demonstrated on the basis of the kurtosis factor, in comparison with reported results for fault prediction in different wind energy conversion systems. In another work 
, the performances of various machine learning approaches were reported for forecasting heating appliance failures with the aim of predictive maintenance. In that work, the necessary data were collected from sensors installed on boiler appliances in homes. The results indicated that the LSTM models achieved higher accuracy than DT, NN, and weighted NN models based on different metrics for no fault, light fault, and severe fault states. In another study 
, a smart machinery monitoring system based on machine learning was implemented to simulate the operating state of machinery for fault detection with a reduced volume of transmitted information in an industrial IoT. The accuracy obtained with the non-linear SVM algorithm was higher than that of the NB, RF, DT, KNN, and AdaBoost algorithms.
Syafrudin et al. 
introduced a hybrid prediction model that includes a real-time monitoring system for automotive manufacturing on the basis of IoT sensors and big data processing. Various approaches were used in that study, namely Apache Storm as a real-time processing engine, Apache Kafka as a message queue, MongoDB for storage of the sensor data, density-based spatial clustering of applications with noise (DBSCAN) for outlier detection, and RF classification for removing outliers. In another study 
, a fault prediction method was proposed to speed up alarm processing and improve accuracy in the energy management system of microgrids via online monitoring, failure prejudging, and optimized SVM analysis. Their study reported early warning times and a high success rate for the proposed method. In another work 
, fault prediction of in-orbit spacecraft was investigated based on deep machine learning and massive telemetry and fault data. Algorithms such as least squares support vector regression (LS-SVR), auto-regressive integrated moving average (ARIMA), and wavelet NN were utilized to determine the best model in terms of normalized mean square error (NMSE).
Haneef and Venkataraman 
employed LSTM, RNN, and a computation memory and power (CRP) rule-based network policy for predicting fog device faults. They collected related data by running Internet of Things applications on different fog nodes. Their proposed method outperformed the traditional LSTM, SVM, and LR methods in terms of improved accuracy, lower processing time, minimal delay, and faster fault prediction rates. In another work 
, the authors developed a machine learning-enabled method for fault prediction in centrifugal pumps in the gas and oil industry through multi-layer perceptron (MLP) and SVM techniques. They gathered the related data from the process and equipment sensors of centrifugal pumps to generate fault prediction alerts properly in decision support systems for operatives. In another study 
, the authors reported a fault prediction model with the aim of real-time tracking of sensor data in an IoT-enabled cloud environment for a hospital by machine learning. They applied the DT, KNN, NB, and RF techniques for controlling unanticipated losses produced by different faults. In another work 
, a real-time fault prediction recommendation model was developed by machine learning for a sensor-based smart office environment by means of a fault dataset retrieved from the sensors of office appliances. In their study, KNN, DT, NB, and RF were compared, and the RF algorithm achieved the highest accuracy among them.
3. The Application of the LMT Algorithm
The logistic model tree (LMT) is a classification algorithm in the machine learning field that combines decision tree and logistic regression approaches to build a classifier as a special tree, taking advantage of both tree and regression concepts. In other words, it builds a tree with logistic regression models at its leaves. LMT has been considered an effective alternative to other decision tree-based machine learning algorithms. The major benefits of LMT include its ability to handle numeric, binary, and nominal attributes as well as missing data. In addition, LMT helps avoid overfitting by combining regression and classification techniques. Despite these advantages, building a single tree classifier may not be enough and may lead to lower prediction accuracy. Therefore, the current research presents an ensemble method, the logistic model tree forest, which builds many LMTs and combines them to make a final prediction.
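The ensemble idea described above, building many trees and combining their outputs into a final prediction, can be sketched as follows. This is an illustrative sketch only, under two assumptions that the text does not confirm: each tree is trained on a bootstrap resample of the data, and the final prediction is a majority vote. The base learner `ThresholdStump` is a hypothetical stand-in on a single feature and is not an LMT; the names `fit_forest` and `predict_forest` and the sample data are likewise hypothetical.

```python
import random
from collections import Counter

class ThresholdStump:
    """Stand-in base learner (NOT an LMT): one threshold on a 1-D feature."""

    def fit(self, xs, ys):
        best = None
        for t in sorted(set(xs)):
            left = [y for x, y in zip(xs, ys) if x <= t]   # never empty: t is in xs
            right = [y for x, y in zip(xs, ys) if x > t]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = (sum(y != left_label for y in left)
                      + sum(y != right_label for y in right))
            if best is None or errors < best[0]:
                best = (errors, t, left_label, right_label)
        _, self.threshold, self.left_label, self.right_label = best
        return self

    def predict(self, x):
        return self.left_label if x <= self.threshold else self.right_label

def fit_forest(xs, ys, n_trees=15, seed=0):
    """Train n_trees base learners, each on a bootstrap resample (assumed)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(xs)) for _ in xs]  # sample with replacement
        forest.append(ThresholdStump().fit([xs[i] for i in idx],
                                           [ys[i] for i in idx]))
    return forest

def predict_forest(forest, x):
    """Combine the individual predictions by majority vote (assumed)."""
    votes = Counter(tree.predict(x) for tree in forest)
    return votes.most_common(1)[0][0]

# Hypothetical one-feature data, for illustration only.
xs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
ys = ["ok"] * 4 + ["fault"] * 4
forest = fit_forest(xs, ys)
print(predict_forest(forest, 0.0), predict_forest(forest, 1.0))
```

Replacing the stand-in with a real LMT learner would leave `fit_forest` and `predict_forest` unchanged, which is the point of the sketch: the ensemble layer is independent of how each tree is built.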
LMT has been applied in various fields such as health 
, forensic science 
, environmental work 
, earthquake 
, agriculture 
, and transportation 
. For example, in 
, flash flood susceptibility maps were analyzed using different machine learning algorithms, including LMT, multinomial NB, radial basis function classifier (RBFC), and kernel LR, to address the flood problem in Vietnam. The dataset consisted of flash flood features such as river density, land use, flow direction, and so on. The validity of the methods was measured in terms of AUC, and the LMT algorithm achieved the best performance among them. Given the high accuracy of the model in identifying flood-susceptible areas, their work was suggested for flash flood management.
LMT was regarded as the best method among its counterparts in many studies 
. For example, in 
, a landslide susceptibility detection model for the Cameron Highlands of Malaysia was reported, in which RF, LR, and LMT algorithms were applied to various databases such as soil maps, digital elevation models, geological maps, and satellite imagery. The results revealed the superiority of LMT over LR and RF based on the AUC metric. In another study 
, the authors constructed a trustworthy map of shallow landslide susceptibility for Bijar City in Iran using different machine learning algorithms, including LR, LMT, NB, SVM, and NN. The reliability of the models was tested according to various metrics (e.g., MAE and RMSE). LMT was shown to outperform the other algorithms. Thus, the authors recommended the use of LMT for shallow landslide phenomena to reduce the related damage.
The LMT algorithm has been used in various studies to suggest solutions for machine learning-based problems due to its high accuracy in terms of different evaluation metrics. For example, in 
, the biochemical features of oil palm plants were monitored by using a spectroradiometer, machine learning, SMOTE, and unmanned aerial vehicle (UAV) techniques. In addition, three types of imbalanced datasets (leaf-raw band, canopy-VI, and canopy-raw band) were utilized to analyze nutrients in plants optimally and ensure their health and harvest. LMT-SMOTEBoost was reported to outperform the alternatives. In another work 
, LMT was applied in the medical field to predict miRNA-disease associations (LMTRDA) by combining various types of information such as miRNA functional similarity, miRNA sequences, disease semantic similarity, and known miRNA-disease associations. Their model achieved high performance in terms of both sensitivity and AUC on the dataset.
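SMOTE, mentioned above as a way of handling imbalanced datasets, creates synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority-class neighbors. The following is a minimal sketch of that interpolation step only; the function name `smote`, its parameters, and the sample points are hypothetical, and the sketch does not reproduce SMOTEBoost or the setup of any cited study.

```python
import random

def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points from a list of minority-class tuples.

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority neighbors.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # All other minority samples, nearest first (squared Euclidean distance).
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbor)))
    return synthetic

# Hypothetical minority-class points, for illustration only.
minority = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (2.0, 2.0)]
new_points = smote(minority, n_new=5)
print(new_points)
```

Because every synthetic point is a convex combination of two existing minority samples, the new points stay inside the region already occupied by the minority class.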
Edited nearest neighbor (ENN) is a useful under-sampling technique focusing on eliminating noise samples 
. It aims to select a subset of the training instances that belong to the majority class in order to make the classifier more robust and improve computational efficiency 
. The previous studies 
showed that the ENN method improved classification accuracy.
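The editing rule behind ENN can be sketched as follows: a majority-class sample is discarded when most of its k nearest neighbors carry a different label, since such a sample is likely to be noise. This is a minimal sketch under the assumption, following the description above, that only majority-class samples are candidates for removal; the function name `edited_nearest_neighbors`, its parameters, and the sample data are hypothetical.

```python
from collections import Counter

def edited_nearest_neighbors(X, y, majority_label, k=3):
    """Drop majority-class samples that disagree with most of their k neighbors."""
    keep = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        if yi != majority_label:
            keep.append(i)  # minority samples are always retained
            continue
        # Indices of all other samples, nearest first (squared Euclidean distance).
        others = sorted(
            (j for j in range(len(X)) if j != i),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(xi, X[j])),
        )
        votes = Counter(y[j] for j in others[:k])
        if votes.most_common(1)[0][0] == yi:
            keep.append(i)  # neighborhood agrees with the sample's label
    return [X[i] for i in keep], [y[i] for i in keep]

# Hypothetical 1-D data: one "ok" sample sits inside the "fault" cluster.
X = [(0.0,), (0.1,), (0.2,), (0.3,), (5.05,), (5.0,), (5.1,), (5.2,)]
y = ["ok", "ok", "ok", "ok", "ok", "fault", "fault", "fault"]
X_res, y_res = edited_nearest_neighbors(X, y, majority_label="ok")
print(X_res, y_res)
```

In this example the "ok" sample at 5.05 is removed because its three nearest neighbors are all labeled "fault", while the remaining samples are kept. With more than two classes or an even k, ties in the neighbor vote are possible and would need an explicit rule.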