Fault detection/diagnosis has become a crucial function of the battery management system (BMS) for ensuring safe and reliable operation, owing to the increasing use of lithium-ion batteries (LIBs) in sophisticated, high-power applications.
Owing to their high energy density, high power density, long service life, environmental friendliness and low self-discharge rate, lithium-ion batteries (LIBs) have become the prime energy storage system for many applications such as electric vehicles (EVs), grid-level power storage and various consumer electronics. However, the safe and reliable operating area of the LIB is very narrow, which necessitates a battery management system (BMS) for effective operational control, protection and energy management. In addition, because the voltage and storage capacity of a single LIB cell are limited, high-power applications of LIBs such as EVs and grid-tied energy storage systems require hundreds or even thousands of cells. Cell inconsistencies in a LIB pack are a common issue; thus an appropriate BMS is also indispensable for the safe and reliable operation of the LIB pack as well as of every single cell in it. A BMS can be designed to serve many functions including, but not limited to, data acquisition, estimation of the state of charge (SOC) and state of health (SOH), temperature measurement/estimation, cell balancing, fault detection/diagnosis and thermal management. Researchers have recently started paying closer attention to the detection/diagnosis of faults, following several accidents in e-transportation caused by failure of the LIB system. It was found that extreme operating conditions, manufacturing flaws and battery aging were among the prime reasons behind battery system failure.
The performance of a LIB is affected by abusive operating conditions such as overcharge or over-discharge events, startup at low temperature, vibration and excessive heat generation. These conditions result in metallic lithium plating, formation of the solid electrolyte interphase layer and formation of lithium dendrites, which accelerate the aging of the LIB and may eventually lead to catastrophic failure during operation. Therefore, fault detection and diagnosis features in the BMS are critical for LIB-powered systems, especially in high-power applications. Smarsly et al. demonstrated that, without a proper fault diagnosis and defense mechanism, a minor fault could eventually result in dangerous consequences. Relevant discussion on the importance of fault diagnosis and defense mechanisms was also presented by Williard et al. Studies on LIB fault mechanisms, which seek the causes and consequences of LIB faults, have been extensively reported in the literature, and research on LIB fault detection and diagnosis has gained momentum in the last few years. Faults in the LIB system are typically classified as internal or external faults. Some of the most frequently reported external faults are cell wiring faults, faults in the thermal management system and sensor faults (temperature, voltage and current sensor faults), whereas common internal battery faults include overcharge, over-discharge, internal short circuit (ISC), accelerated degradation and thermal runaway. Tran et al. presented a detailed classification of the commonly reported LIB faults. A few other studies classified LIB faults from a control system perspective. They grouped overcharge, over-discharge, overheating, external short circuit (ESC), ISC, electrolyte leakage, battery swelling, accelerated battery degradation and thermal runaway as battery faults. Voltage, current and temperature sensor faults were grouped under sensor faults, while the terminal connector fault, cooling system fault, controller area network (CAN) bus fault, high-voltage contactor fault and fuse fault were classed as actuator faults.
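The control-system grouping described above maps naturally onto a small lookup structure. The following Python sketch is only an illustration: the names `FaultGroup`, `FAULT_GROUPS` and `group_of` are hypothetical and do not come from the cited studies; only the fault names and their grouping are taken from the text.

```python
from enum import Enum


class FaultGroup(Enum):
    """Three fault groups of the control-system perspective (names are illustrative)."""
    BATTERY = "battery fault"
    SENSOR = "sensor fault"
    ACTUATOR = "actuator fault"


# Fault names follow the grouping given in the text; the data structure is an assumption.
FAULT_GROUPS = {
    FaultGroup.BATTERY: [
        "overcharge", "over-discharge", "overheating",
        "external short circuit", "internal short circuit",
        "electrolyte leakage", "battery swelling",
        "accelerated degradation", "thermal runaway",
    ],
    FaultGroup.SENSOR: ["voltage sensor", "current sensor", "temperature sensor"],
    FaultGroup.ACTUATOR: [
        "terminal connector", "cooling system", "CAN bus",
        "high-voltage contactor", "fuse",
    ],
}


def group_of(fault: str) -> FaultGroup:
    """Return the group a named fault belongs to; raise KeyError if it is unknown."""
    for group, faults in FAULT_GROUPS.items():
        if fault in faults:
            return group
    raise KeyError(fault)
```

A diagnosis routine could use such a table to route a detected fault to the appropriate handler, for example isolating a cell for a battery fault or falling back to an estimated value for a sensor fault.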
Given the importance of fault detection/diagnosis for the safe and reliable operation of LIBs, a significant number of studies have aimed to develop accurate, reliable, robust and easy-to-implement fault diagnostic strategies. Lu et al. briefly explained why the development of an effective fault diagnosis system is crucial for the advancement of LIB-powered systems. Xiong et al. focused specifically on sensor fault diagnosis. Lyu et al. presented a detailed discussion of the failure mechanisms of LIBs and their possible solutions through a state-of-the-art review. Fault diagnosis methods reported in the literature can be broadly categorized into model-based and non-model-based methods, although hybrids that fuse the two categories have also been reported. A detailed classification of LIB fault diagnosis methods is presented in Figure 1.
Figure 1. Classification of LIB fault diagnosis methods.
The literature shows that, among the fault diagnosis methods in Figure 1, model-based and signal processing-based methods have been used most extensively for LIB fault detection. Machine learning (ML)-based techniques were adopted only recently, but their use is growing much faster owing to prominent advantages such as high accuracy, compatibility with the highly nonlinear LIB system and reduced dependence on domain experts. The accuracy and reliability of model-based fault diagnostic strategies depend predominantly on the accuracy of the equivalent circuit model (ECM) of the LIB. Obtaining a highly accurate model is challenging because the internal characteristics of the highly nonlinear LIB are still not fully understood. ML-based techniques avoid this limitation because they do not require an explicit battery model. Furthermore, the impact of measurement noise, which limits the application of signal processing-based methods, is reduced to a significant extent with ML-based techniques. In short, ML-based techniques simplify fault diagnosis by removing two complex and time-consuming steps from the development process: collecting accurate physical information about the battery, and working out the nonlinear correlation between the battery's internal parameters and externally measured parameters such as operating current, terminal voltage and temperature. This reduces the domain-specific knowledge, time and cost required for system development.
2. ML-Based Fault Diagnosis Techniques
Currently, ML techniques are extensively used in the BMS of LIBs. A summary of ML approaches in BMS is presented by Reza et al. in a review study. This section provides an overview of the ML techniques currently used in BMS applications, specifically for LIB fault diagnosis. The complete family of ML approaches that have been successfully used in the BMS of LIBs is illustrated in Figure 2. The approaches already employed for LIB fault diagnosis are highlighted in green, whereas the remaining potential approaches are highlighted in yellow for the readers' convenience. A brief description of each ML-based fault diagnosis technique shown in Figure 2 is presented below.
Figure 2. A complete family of ML approaches used in the BMS of LIBs.
2.1. Artificial Neural Network
The Artificial Neural Network (ANN) is one of the most widely used ML frameworks and can perform a wide variety of tasks. It is inspired by the biological neural networks that constitute animal brains. ANNs are typically trained with supervised learning. Like the animal brain, an ANN is self-adaptive and learns to perform tasks from examples, generally without being programmed with task-specific rules. Moreover, an ANN can effectively capture the dynamics of a highly nonlinear system. These features make it well suited to LIBs, whose dynamic characteristics are highly complex and nonlinear. The basic strategy is to build a nonlinear black-box fault diagnosis model by learning implicit rules from known pairs of input and output data, and then to validate the model with test input and output data that it has not seen. Training is typically conducted offline. Provided it is trained with a sufficient amount of data, the ANN model can then effectively distinguish between normal and abnormal conditions of the battery system. The many variants of ANN can be broadly classified into two subgroups. The classic neural networks subgroup includes the wavelet neural network (WNN), back-propagation neural network (BPNN), radial basis function network (RBFN), feed-forward neural network (FFNN) and extreme learning machine (ELM). Modern neural networks, on the other hand, are often referred to as deep NNs and mainly include recurrent neural networks (RNNs) and convolutional neural networks (CNNs). Extensions and combinations of these techniques, such as the long short-term memory network (LSTM) and RNN-LSTM, are also used in BMS applications. Here, LSTM is an extension of the RNN that adds gated memory cells, and RNN-LSTM is a combination of RNN and LSTM. Several other variants of ANN were discussed in detail by S. Walczak. However, among these ANN-based techniques only the basic ANN, LSTM, RNN, RNN-LSTM and a few other hybrid techniques have so far been used for fault detection of LIBs, as shown in Figure 2.
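The train-offline-then-validate strategy described above can be sketched with a very small feed-forward network. This is a minimal illustration under stated assumptions, not a method from the reviewed studies: the data are synthetic, the two input features (a normalized voltage deviation and a normalized temperature rise) are hypothetical, and the names `make_data` and `TinyMLP` are invented for this example.

```python
import math
import random


def make_data(rng, n):
    """Synthetic samples: [voltage deviation, temperature rise] -> 0 (normal) / 1 (faulty)."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        centre = 0.8 if label else 0.2      # assumption: faulty samples show larger deviations
        features = [centre + rng.uniform(-0.15, 0.15) for _ in range(2)]
        data.append((features, label))
    return data


def sigmoid(z):
    z = max(-60.0, min(60.0, z))            # clip to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


class TinyMLP:
    """One hidden tanh layer and a sigmoid output, trained by back-propagation."""

    def __init__(self, rng, n_in=2, n_hidden=3):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        p = sigmoid(sum(w * hi for w, hi in zip(self.w2, h)) + self.b2)
        return h, p

    def train_step(self, x, y, lr):
        h, p = self.forward(x)
        d_out = p - y                       # cross-entropy gradient at the output pre-activation
        for j, hj in enumerate(h):
            d_hid = d_out * self.w2[j] * (1.0 - hj * hj)   # back-propagate through tanh
            self.w2[j] -= lr * d_out * hj
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * d_hid * xi
            self.b1[j] -= lr * d_hid
        self.b2 -= lr * d_out

    def predict(self, x):
        return 1 if self.forward(x)[1] >= 0.5 else 0


rng = random.Random(0)
train_set = make_data(rng, 200)             # known input/output pairs for offline training
test_set = make_data(rng, 50)               # held-out pairs, never seen during training

model = TinyMLP(rng)
for _ in range(200):                        # offline training epochs
    for features, label in train_set:
        model.train_step(features, label, lr=0.2)

accuracy = sum(model.predict(x) == y for x, y in test_set) / len(test_set)
```

The held-out `test_set` plays the role of the data "unknown to the model": the network is judged only on samples it was not trained on. Real battery data are far less cleanly separated than this toy set, so the accuracy obtained here says nothing about performance in practice.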
2.2. Random Forest Classifier
Like ANN, the Random Forest (RF) classifier is a supervised ML approach. It has demonstrated satisfactory performance in various classification problems, such as sleep stage classification from electroencephalography (EEG) data, bearing fault identification from vibration data, facial expression detection from video data, crop type classification from hyperspectral images, lung vessel segmentation from computed tomography (CT) images and many more. RF builds multiple decision trees of slightly different structures and combines their outputs to classify a sample. This collaboration among trees makes the model more robust than any single classifier typically used in other statistical classification problems. RF is a nonlinear ensemble classifier with lower computational complexity than some other popular classifiers, which makes it suitable for lightweight algorithms in real-time operation. Moreover, the input features do not need to be scaled, and the algorithm can run in parallel alongside the primary functions of the system.
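The idea of many slightly different trees voting on a class can be sketched in plain Python. The example below is a simplified illustration, assuming Gini impurity as the split criterion, midpoint thresholds, bootstrap sampling and a random feature subset per split; the synthetic data, the three hypothetical features and all function names are assumptions made for this sketch.

```python
import random
from collections import Counter


def make_data(rng, n):
    """Synthetic samples with two informative features and one pure-noise feature."""
    rows = []
    for _ in range(n):
        label = rng.choice(["normal", "fault"])
        centre = 0.8 if label == "fault" else 0.2
        x = [centre + rng.uniform(-0.15, 0.15),
             centre + rng.uniform(-0.15, 0.15),
             rng.uniform(0.0, 1.0)]             # uninformative feature
        rows.append((x, label))
    return rows


def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, feature_ids):
    """Return (score, feature, threshold) with the lowest weighted Gini impurity."""
    best = None
    for f in feature_ids:
        values = sorted({x[f] for x, _ in rows})
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2.0      # midpoint between neighbouring values
            left = [y for x, y in rows if x[f] <= threshold]
            right = [y for x, y in rows if x[f] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, threshold)
    return best


def build_tree(rows, rng, n_features, depth):
    labels = [y for _, y in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if depth == 0 or len(set(labels)) == 1:
        return majority                         # leaf: a class label
    feature_ids = rng.sample(range(len(rows[0][0])), n_features)
    split = best_split(rows, feature_ids)
    if split is None:
        return majority
    _, f, threshold = split
    left = [r for r in rows if r[0][f] <= threshold]
    right = [r for r in rows if r[0][f] > threshold]
    return (f, threshold,
            build_tree(left, rng, n_features, depth - 1),
            build_tree(right, rng, n_features, depth - 1))


def tree_predict(node, x):
    while isinstance(node, tuple):              # internal nodes are tuples, leaves are labels
        f, threshold, left, right = node
        node = left if x[f] <= threshold else right
    return node


def build_forest(rows, rng, n_trees=25, n_features=2, depth=3):
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]   # bootstrap sample, drawn with replacement
        forest.append(build_tree(sample, rng, n_features, depth))
    return forest


def forest_predict(forest, x):
    votes = Counter(tree_predict(tree, x) for tree in forest)
    return votes.most_common(1)[0][0]           # majority vote across trees


rng = random.Random(1)
train_rows = make_data(rng, 120)
test_rows = make_data(rng, 40)
forest = build_forest(train_rows, rng)
accuracy = sum(forest_predict(forest, x) == y for x, y in test_rows) / len(test_rows)
```

Because each split compares a raw feature value with a threshold, the features are used without any scaling. Each tree is also built independently of the others, which is why forests lend themselves to parallel execution.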
2.3. Support Vector Machine
Despite requiring the solution of a complex quadratic programming problem, the Support Vector Machine (SVM) is increasingly used for solving classification and regression problems. In classification problems, SVM separates the classes of data by constructing hyperplanes in a high-dimensional space. The typical criterion for finding the optimal separation boundary is to maximize the margin, that is, the distance between the hyperplane and the nearest data point of any class. SVM is also becoming a powerful tool for regression analysis in highly nonlinear systems like the LIB. The use of SVM in regression is termed Support Vector Regression (SVR). SVR uses kernel functions to map the data into a higher-dimensional feature space in which a nonlinear relationship can be treated as a linear one for ease of analysis. There is also another variant of SVM, namely the kernel support vector machine (KSVM). Further details of SVM can be found in the literature. Like ANN, SVM-based fault diagnostic techniques do not require an equivalent battery model, and SVM is also a supervised ML approach. So far, only SVR and SVM have been used for fault diagnosis of LIBs in this category.
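The margin-maximization criterion can be illustrated with a linear soft-margin SVM. Note the substitution: instead of the quadratic programming formulation mentioned above, this sketch minimizes the equivalent regularized hinge loss by stochastic sub-gradient descent, and it uses no kernel. The synthetic data, the hyperparameters and the function names are assumptions chosen for illustration.

```python
import random


def make_data(rng, n):
    """Synthetic samples: two hypothetical normalized features, label -1 (normal) / +1 (faulty)."""
    rows = []
    for _ in range(n):
        y = rng.choice([-1, 1])
        centre = 0.8 if y == 1 else 0.2
        rows.append(([centre + rng.uniform(-0.15, 0.15) for _ in range(2)], y))
    return rows


def decision(w, b, x):
    """Signed score of x relative to the hyperplane w.x + b = 0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_linear_svm(rows, rng, epochs=100, lam=0.01, lr=0.05):
    """Minimize lam/2 * ||w||^2 + hinge loss by stochastic sub-gradient descent."""
    w = [0.0] * len(rows[0][0])
    b = 0.0
    order = list(rows)
    for _ in range(epochs):
        rng.shuffle(order)
        for x, y in order:
            margin = y * decision(w, b, x)
            w = [wi * (1.0 - lr * lam) for wi in w]     # shrink w: widens the margin
            if margin < 1.0:                            # point inside the margin or misclassified
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


rng = random.Random(2)
train_rows = make_data(rng, 120)
test_rows = make_data(rng, 40)
w, b = train_linear_svm(train_rows, rng)

predictions = [1 if decision(w, b, x) >= 0 else -1 for x, _ in test_rows]
accuracy = sum(p == y for p, (_, y) in zip(predictions, test_rows)) / len(test_rows)
```

The shrink step on `w` corresponds to the margin term of the objective, since the geometric margin is inversely proportional to the norm of `w`. The hinge update only fires for points that violate the margin, which are the points that end up acting as support vectors.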
2.4. Gaussian Process Regression
Gaussian Process Regression (GPR) is a supervised, non-parametric ML technique based on Bayesian inference. Rather than fitting a fixed set of model parameters, it places a Gaussian process prior over the functions that could explain the data, with a kernel function encoding prior knowledge such as smoothness, and conditions this prior on the observed data to obtain a posterior prediction. This kernel-based formulation allows GPR to support prognostics by leveraging prior knowledge within a Bayesian model. In addition to the mean prediction, GPR provides the variance around that mean, which quantifies the associated uncertainty in the system. So far, very few studies have used GPR for LIB fault diagnosis.
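The way GPR returns both a mean prediction and a variance can be sketched for a one-dimensional input. This is a minimal illustration, assuming a squared-exponential (RBF) kernel with unit length scale, a small fixed noise term and a toy sine-shaped signal. The helper names, the data and the three-sigma outlier rule are assumptions made for this sketch and are not taken from the reviewed studies.

```python
import math


def rbf(a, b, length=1.0, variance=1.0):
    """Squared-exponential kernel: nearby inputs are assumed to have similar outputs."""
    return variance * math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def gp_predict(xs, ys, x_new, noise=1e-4):
    """Posterior mean and variance of a zero-mean GP at x_new, given data (xs, ys)."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k_star = [rbf(a, x_new) for a in xs]
    alpha = solve(K, ys)                    # (K + noise*I)^-1 y
    v = solve(K, k_star)                    # (K + noise*I)^-1 k*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = rbf(x_new, x_new) - sum(k * vi for k, vi in zip(k_star, v))
    return mean, max(var, 0.0)              # guard against tiny negative rounding errors


def is_outlier(measured, mean, var, noise=1e-4, k=3.0):
    """Flag a measurement lying more than k predictive standard deviations from the mean."""
    return abs(measured - mean) > k * math.sqrt(var + noise)


xs = [0.0, 1.0, 2.0, 3.0, 4.0]              # hypothetical operating points
ys = [math.sin(x) for x in xs]              # toy signal standing in for a healthy response

mean_seen, var_seen = gp_predict(xs, ys, 2.0)    # at a training input: low uncertainty
mean_far, var_far = gp_predict(xs, ys, 20.0)     # far from the data: variance returns to the prior
```

The variance is small near observed inputs and grows back to the prior variance far from them. A diagnosis scheme can exploit this by testing whether a new measurement falls outside the predictive interval, as `is_outlier` does.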
2.5. Logistic Regression
Logistic Regression (LR) is a statistical classification technique used to classify observed data based on pre-defined criteria. It is one of the simplest methods for two-class classification: it models the probability of a class as a logistic (sigmoid) function of a weighted sum of the input features. It has shown very good performance on linearly separable classification problems and, with suitable feature transformations, on nonlinear ones. However, so far very few researchers have employed LR for fault detection/diagnosis in the LIB system.
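A two-class LR model is small enough to write out in full. The sketch below is an illustration under stated assumptions: the single input feature (a normalized voltage residual) is hypothetical, the data are synthetic, and the model is fitted by plain batch gradient descent on the cross-entropy loss. The function names are invented for this example.

```python
import math
import random


def make_data(rng, n):
    """Synthetic samples: one hypothetical normalized residual -> 0 (normal) / 1 (faulty)."""
    rows = []
    for _ in range(n):
        y = rng.randint(0, 1)
        centre = 0.8 if y else 0.2
        rows.append(([centre + rng.uniform(-0.15, 0.15)], y))
    return rows


def sigmoid(z):
    z = max(-60.0, min(60.0, z))            # clip to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def fault_probability(w, b, x):
    """Probability of the faulty class: sigmoid of a weighted sum of the features."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train_logistic(rows, epochs=1000, lr=1.0):
    """Fit weights by batch gradient descent on the mean cross-entropy loss."""
    n_features = len(rows[0][0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in rows:
            error = fault_probability(w, b, x) - y
            for i, xi in enumerate(x):
                grad_w[i] += error * xi
            grad_b += error
        w = [wi - lr * gi / len(rows) for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b / len(rows)
    return w, b


rng = random.Random(3)
train_rows = make_data(rng, 100)
test_rows = make_data(rng, 40)
w, b = train_logistic(train_rows)

accuracy = sum((fault_probability(w, b, x) >= 0.5) == bool(y)
               for x, y in test_rows) / len(test_rows)
```

Unlike a hard classifier, the model outputs a probability, so the 0.5 threshold used here can be moved to trade missed faults against false alarms.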