In industry, electric motors such as the squirrel cage induction motor (SCIM) generate motive power and are particularly popular due to their low acquisition cost, strength, and robustness. Along with these benefits, they have minimal maintenance costs and can run for extended periods before requiring repair and/or maintenance. Early fault detection in SCIMs, especially at low-load conditions, further helps minimize maintenance costs and mitigate abrupt equipment failure when loading is increased.
1. Introduction
SCIMs power most industrial equipment because they are robust and, through electromagnetic induction, generate enough torque to drive much larger machinery at an affordable cost. Seawater injection pumps, air conditioner compressor drives, gas circulators in power generation, and oil export pumps in the oil and gas drilling industries are only a few of their well-known applications
[1]. However, SCIMs are prone to faults under prolonged operation and, if left unmonitored, can suffer major damage or breakdown. According to a review by Bhowmik et al.
[2], severe operating environments, insufficient insulation, deliberate overloading of the power supply, and factory defects are the most typical causes of breakdown and failure. The production downtime caused by these faults has frequently resulted in revenue loss, among other consequences. Early fault detection is therefore crucial to avoid these occurrences
[1]. The consequences of these failures have increased the need for SCIM failure diagnosis as a crucial module for overall equipment prognostics and health management (PHM).
State-of-the-art research on SCIM fault detection and isolation (FDI) and prognostics centers on data-driven PHM technologies, and this research is ongoing; these studies show that data-driven, AI-based PHM relies heavily on the quantity and quality of the data used to train predictive models
[1][3]. However, the accuracy of these models also depends on how well the chosen method suits the nature of the available data, which has led various studies to explore numerous data-driven PHM methodologies. Researchers have used advances in artificial intelligence (AI), machine learning (ML), and deep learning to construct models that exploit the current, vibration, and thermal signals that sensors acquire from equipment, examining these signals separately or in combination for FDI
[1][4][5]. Among the available analysis techniques, Fourier analysis has proven effective owing to its convenience and breadth of application, especially for motor current signature analysis (MCSA) and vibration signature analysis (VSA)
[5]. Fourier transforms (FTs) are essentially concerned with the decomposition of signals from their time domain to their frequency domain for analysis in both healthy and faulty motors, providing a superior platform for signal interpretation and feature extraction for FDI
[3][6]. Even though it can decompose signals into their frequency components, an FT still has limitations: it carries no transient information and provides only the time-averaged spectral content, so it gives no detail on how the frequencies in a signal vary over time
[7]. Fast Fourier transforms (FFTs), which offer high computational speed, and short-time Fourier transforms (STFTs), which decompose data into the time–frequency domain, are frequently used to address these limitations
[7][8]. Further, according to
[9], Fourier analysis is a highly efficient analytical tool that is compatible with MCSA for detecting a variety of SCIM faults.
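The difference between a plain FFT and an STFT can be illustrated with a short sketch. The signal below is synthetic and purely hypothetical (a sinusoid that switches from 50 Hz to 60 Hz halfway through, sampled at an assumed 1 kHz); it is not data from the cited studies. The FFT of the whole record shows both components without indicating when each occurs, whereas the STFT localizes them in time.

```python
import numpy as np
from scipy.signal import stft

FS = 1000        # assumed sampling rate in Hz
DURATION = 4.0   # assumed record length in seconds

# Synthetic "current": 50 Hz for the first half, 60 Hz for the second half.
n_samples = int(FS * DURATION)
t = np.arange(n_samples) / FS
signal = np.where(t < DURATION / 2,
                  np.sin(2 * np.pi * 50 * t),
                  np.sin(2 * np.pi * 60 * t))

# Plain FFT: both components appear, with no information on when they occur.
spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(n_samples, d=1 / FS)
top_two = np.sort(freqs[np.argsort(spectrum)[-2:]])

# STFT: one spectrum per 0.5 s frame, so the change in frequency is visible.
f, frame_times, Zxx = stft(signal, fs=FS, nperseg=500)
magnitude = np.abs(Zxx)


def dominant_frequency_at(time_s):
    """Return the strongest frequency in the frame closest to time_s."""
    frame = np.argmin(np.abs(frame_times - time_s))
    return f[np.argmax(magnitude[:, frame])]


early = dominant_frequency_at(1.0)  # frame inside the 50 Hz half
late = dominant_frequency_at(3.0)   # frame inside the 60 Hz half
print("FFT peaks (Hz):", top_two)
print("STFT dominant frequency at 1 s and 3 s (Hz):", early, late)
```

The frame length (`nperseg=500`) is an arbitrary choice for this sketch: longer frames sharpen frequency resolution at the expense of time resolution, and shorter frames do the opposite.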
Although variable frequency drives (VFDs) have recently become more popular in industry than direct-on-line starters because they provide flexible production control and soft motor start-up, their variable frequencies, complex control systems, and the harmonics generated at the drive output remain major concerns
[10]. Harmonics generated by VFDs pose a significant problem for motor bearings and stator windings because they increase the stress on these components. Moreover, they degrade signal quality in terms of signal-to-noise ratio, particularly when stator current signals are used for FDI
[5][10].
2. Review of ML-Based Classification Algorithms
As previously stated, recent AI advancements have improved ML and deep learning models for effective FDI, which typically relies on intelligent models to detect faults at a minimal false alarm rate even amid uncertainties. Even though the efficiency of these methodologies has been reported in numerous studies, their use still raises underlying challenges, such as computational cost and a tendency to deviate from core engineering concepts, which sometimes makes them unsuitable for cost-conscious industrial applications
[11]. Traditional ML algorithms, on the other hand, offer a more cost-effective and dependable platform for adequate FDI because their efficiency is rarely affected by data availability
[11][12]. The effectiveness of FDI algorithms is highly dependent on the nature of the discriminative content of the input signal of the device under monitoring; thus, significant discriminative feature extraction from raw signals is critical
[13]. This motivated the study to investigate various ML algorithms and their efficacy for current-based fault detection after peak-detection-based feature extraction. Accordingly, a handful of popular ML-based classifiers are discussed here to present their theoretical background for FDI.
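Peak-detection-based feature extraction from a current spectrum can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of the study: the helper `spectral_peak_features`, the Hann window, the relative height threshold, and the synthetic signal (a 50 Hz fundamental with hypothetical sidebands at 46 Hz and 54 Hz) are all chosen for the example. SciPy's `find_peaks` locates the local maxima of the normalized magnitude spectrum.

```python
import numpy as np
from scipy.signal import find_peaks, get_window


def spectral_peak_features(signal, fs, n_peaks=3, min_rel_height=0.01):
    """Return (frequencies, relative magnitudes) of the strongest spectral peaks.

    Hypothetical helper for illustration only; thresholds are assumptions.
    """
    window = get_window("hann", len(signal))
    magnitude = np.abs(np.fft.rfft(signal * window))
    magnitude = magnitude / magnitude.max()      # normalize to the largest peak
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)

    # Local maxima above the relative height threshold.
    peak_idx, _ = find_peaks(magnitude, height=min_rel_height)

    # Keep the n_peaks strongest, then report them in order of frequency.
    strongest = peak_idx[np.argsort(magnitude[peak_idx])[-n_peaks:]]
    strongest = np.sort(strongest)
    return freqs[strongest], magnitude[strongest]


FS = 1000                      # assumed sampling rate in Hz
t = np.arange(2 * FS) / FS     # two seconds of samples

# Synthetic stator current: fundamental plus two small sidebands (assumed values).
current = (np.sin(2 * np.pi * 50 * t)
           + 0.05 * np.sin(2 * np.pi * 46 * t)
           + 0.05 * np.sin(2 * np.pi * 54 * t))

peak_freqs, peak_mags = spectral_peak_features(current, FS)
feature_vector = np.concatenate([peak_freqs, peak_mags])
print("Peak frequencies (Hz):", peak_freqs)
print("Relative magnitudes:", peak_mags)
```

The resulting `feature_vector` is the kind of compact, discriminative input that the classifiers discussed in this section would be trained on.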
The decision tree (DT) is one of the most common, cost-efficient, and reliable ML algorithms and has been effectively employed for both regression and classification problems. It uses a tree-like structure of decision rules to split input data into subsets and to make predictions based on these splits
[14][15]. Its two main advantages are its ease of use and its ability to present solutions with various outputs
[15]. However, this model is prone to over-fitting and under-fitting, which can be mitigated with pruning. Even with proper pruning, however, a perfect solution to the problem is not guaranteed
[13]. Random forest (RF), on the other hand, mitigates the major challenges of DT by building a large number of decision trees at once
[13]. RF passes each sample through its many trees, each acting as a separate classifier, records the output of every tree, and then aggregates these outputs, typically by majority vote, to derive the final classification. By tuning its key parameters, this model mitigates the major problem of DT
[13]. One of the well-known disadvantages of RF is its complexity, which can result in high computational cost
[16]. Boosting algorithms such as the AdaBoost classifier (ABC), gradient boosting classifier (GBC), and XGBoost (XGB) have been used to improve the efficiency and predictive accuracy of weak classifiers such as DTs, regressors, and other weak learners. These boosters are ensemble learning algorithms that combine weak learners into a strong learner by minimizing their training errors
[16]. However, boosters have their own challenges, which justifies the continued development of other algorithms. For example, these models become prone to over-fitting as more trees are added to their structure. Each booster nevertheless offers a distinct advantage over the others. GBC is reported to outperform ABC in accuracy because of its flexibility, which allows the algorithm to work with a wide range of differentiable, convex loss functions
[16]. XGB, on the other hand, is distinguished from the other boosters by its scalability, which results from the algorithmic optimizations built into its structure
[17].
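The tree-based learners discussed above can be compared with a few lines of scikit-learn. This is only a sketch under assumptions: the data come from `make_classification` as a stand-in for extracted current features (two classes, e.g., healthy and faulty), and all hyperparameter values are illustrative rather than tuned. XGBoost is omitted because it lives in a separate package (`xgboost`), not in scikit-learn.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic two-class data standing in for peak-based features (assumption).
X, y = make_classification(n_samples=600, n_features=8, n_informative=5,
                           n_redundant=1, class_sep=1.5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

models = {
    # max_depth and ccp_alpha (cost-complexity pruning) limit over-fitting.
    "DT": DecisionTreeClassifier(max_depth=5, ccp_alpha=0.005, random_state=0),
    # Many trees whose votes are aggregated into one prediction.
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    # Boosters: weak learners added sequentially to reduce training error.
    "ABC": AdaBoostClassifier(n_estimators=100, random_state=0),
    "GBC": GradientBoostingClassifier(n_estimators=100, learning_rate=0.1,
                                      random_state=0),
}

tree_scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    tree_scores[name] = model.score(X_test, y_test)  # test-set accuracy
    print(f"{name}: {tree_scores[name]:.3f}")
```

Accuracies on synthetic data say nothing about real motor faults; the sketch only shows how the pruning parameters of DT and the ensemble sizes of RF and the boosters are exposed for tuning.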
Interestingly, some ML algorithms base their predictions on a particular mathematical assumption or theorem. For instance, k-nearest neighbor (KNN) is predicated on the assumption that samples with similar feature values belong to the same group
[18]. As a result, KNN performs better in cases where the datasets are evenly distributed; however, in cases where the datasets differ slightly, the accuracy of KNN may be affected
[13]. For this reason, normalization is critical when feeding datasets to KNN, as it ensures an even representation of all feature values and improves performance. The naive Bayes classifier (NBC) is a popular theorem-based learner; it is built on Bayes’ theorem, which relates two conditional probabilities of an event using the prior information available about that event
[19]. NBC is often preferred over other models based on Bayes’ theorem because it offers a simpler model with a simpler computational procedure
[13].
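The effect of normalization on KNN, and the simplicity of a naive Bayes model, can be sketched with scikit-learn. Everything in this example is an assumption made for illustration: the data are synthetic, and an uninformative feature on a much larger scale is appended deliberately so that it dominates the unscaled distance computation. `GaussianNB` is one common naive Bayes variant; the source does not state which variant was used.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Four informative features plus one large-scale noise feature (assumption).
X, y = make_classification(n_samples=600, n_features=4, n_informative=4,
                           n_redundant=0, n_clusters_per_class=1,
                           class_sep=2.0, random_state=0)
rng = np.random.default_rng(0)
noise = rng.normal(scale=1e4, size=(X.shape[0], 1))
X = np.hstack([X, noise])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# KNN on raw features: distances are dominated by the large-scale column.
knn_raw = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)

# KNN after standardization: every feature contributes on the same scale.
knn_scaled = make_pipeline(StandardScaler(),
                           KNeighborsClassifier(n_neighbors=5))
knn_scaled.fit(X_train, y_train)

# Gaussian naive Bayes: per-class, per-feature Gaussians combined via
# Bayes' theorem, P(class | x) proportional to P(x | class) * P(class).
nbc = GaussianNB().fit(X_train, y_train)

neighbor_scores = {
    "KNN (raw)": knn_raw.score(X_test, y_test),
    "KNN (scaled)": knn_scaled.score(X_test, y_test),
    "NBC": nbc.score(X_test, y_test),
}
for name, score in neighbor_scores.items():
    print(f"{name}: {score:.3f}")
```

Wrapping the scaler and the classifier in one pipeline ensures that the scaling statistics are learned from the training split only.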
Overall, the accuracy of ML algorithms has improved over the years, as many algorithms now employ techniques that handle complex datasets and still deliver outstanding results. The support vector machine (SVM) is an ML algorithm that uses support vectors to construct a hyperplane as its decision boundary. Its performance depends on three user-defined settings: the gamma parameter, which controls how far the influence of a single sample on either side of the boundary reaches; the regularization parameter, which governs the margin between the decision boundary and the separated samples; and the kernel, such as the radial basis function (RBF), which allows nonlinear boundaries
[20]. SVMs are known to be computationally efficient; however, as the parameter values increase, the computational speed significantly drops
[13], which is a major drawback for its use on large datasets. Among ML-based learners, the multi-layer perceptron (MLP) has a relatively high predictive accuracy compared to other methods. The MLP is a feed-forward neural network (FFNN) with three types of layers by default: input, hidden, and output
[21]. It is very efficient for both supervised and unsupervised situations due to its architecture, learning sequence, and flexibility, making it ideal for classification
[13][21]. The difficulty of implementing and interpreting an MLP is among its significant drawbacks
[21].
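The roles of the SVM kernel, regularization parameter, and gamma, and the layered structure of an MLP, can be sketched with scikit-learn. The example is illustrative only: the data come from `make_moons` (a synthetic set with a nonlinear class boundary), and the values of `C`, `gamma`, and `hidden_layer_sizes` are assumptions rather than settings from the cited studies.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic two-class data with a curved boundary (assumption).
X, y = make_moons(n_samples=600, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

classifiers = {
    # Linear kernel: a flat hyperplane, shown for comparison.
    "SVM (linear)": make_pipeline(StandardScaler(),
                                  SVC(kernel="linear", C=1.0)),
    # RBF kernel: C is the regularization parameter, gamma sets the reach
    # of each training sample's influence on the boundary.
    "SVM (RBF)": make_pipeline(StandardScaler(),
                               SVC(kernel="rbf", C=1.0, gamma="scale")),
    # MLP: input layer -> one hidden layer of 32 units -> output layer.
    "MLP": make_pipeline(StandardScaler(),
                         MLPClassifier(hidden_layer_sizes=(32,),
                                       max_iter=2000, random_state=0)),
}

kernel_scores = {}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    kernel_scores[name] = clf.score(X_test, y_test)  # test-set accuracy
    print(f"{name}: {kernel_scores[name]:.3f}")
```

In practice `C` and `gamma` are usually chosen by cross-validated search, which is where the computational cost of SVMs on large datasets tends to show.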