Data-Driven Predictive Maintenance


Please note this is a comparison between Version 1 by Bruno Miguel Veloso and Version 3 by Bruno Miguel Veloso.

Cyber-physical systems in Industry 4.0 are reshaping conventional decision-making processes, mainly by integrating entities and functionalities through telecommunication systems and intelligent data processing. This shift brings new challenges and increases complexity. Nevertheless, these advances may provide new solutions to typical problems, such as system failures, and thus to maintenance. Predictive Maintenance (PdM) is a data-based approach that has emerged as a prominent field of research among the many existing maintenance approaches. PdM has three main categories: model-based prognosis, knowledge-based prognosis, and data-driven prognosis. Data-driven PdM strategies have gained prominence in both industry and academia. They use statistical analysis, Machine Learning (ML) models, and Deep Learning (DL) solutions to model system behaviour. The ultimate goal is to discover trends and predict failures, which improves a system’s reliability.

condition-based maintenance
predictive maintenance
machine learning
deep learning
artificial intelligence
railway industry

1. Introduction

Maintenance is the process of servicing equipment or system components to ensure their normal functioning under any circumstances. Over the years, several maintenance approaches have been developed, each representing a new generation enabled by technological advances. Three main maintenance approaches can be distinguished [7]:

Corrective maintenance: Also known as run-to-failure, this is the simplest and oldest method. The idea is to act only after a machine or piece of equipment fails. It almost always leads to high (unexpected) downtime, in addition to maintenance staff expenditure. This method usually creates a critical situation that is very costly for companies.
Preventive maintenance: This approach plans the regular replacement of components and/or equipment. The Mean Time To Failure (MTTF) is calculated from historical failure data and/or data provided by the equipment manufacturer, and the maintenance team uses it to propose a preventive action plan. Although this approach prevents unexpected shutdowns, it usually incurs additional costs and leaves more of each component’s useful lifetime unexploited.
Predictive Maintenance (PdM): This approach directly monitors the mechanical condition and other parameters to determine the operating conditions over time. Thanks to technological advances, existing tools can process real-time data acquired from different equipment parts to predict signs of failure.
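The trade-off in preventive maintenance can be illustrated with a minimal sketch. It assumes that MTTF is estimated as the plain average of observed operating times before failure and that replacement is planned at a fixed fraction of that value. The function names, the safety factor, and the failure history are hypothetical and chosen only for illustration; they are not taken from the source.

```python
from statistics import mean


def mean_time_to_failure(times_to_failure_h):
    """Estimate MTTF as the average observed operating time before failure."""
    if not times_to_failure_h:
        raise ValueError("at least one observed failure is required")
    return mean(times_to_failure_h)


def preventive_interval(mttf_h, safety_factor=0.8):
    """Hypothetical planning rule: replace after a fixed fraction of the MTTF."""
    if not 0 < safety_factor <= 1:
        raise ValueError("safety_factor must be in (0, 1]")
    return mttf_h * safety_factor


# Hypothetical history: operating hours before each recorded failure.
history_h = [1000.0, 1200.0, 800.0, 1000.0]

mttf_h = mean_time_to_failure(history_h)
interval_h = preventive_interval(mttf_h)

# Lifetime that would have been left unused had each unit been replaced
# at the planned interval (zero when the unit failed before the interval).
unexploited_h = [max(t - interval_h, 0.0) for t in history_h]

print(f"MTTF: {mttf_h:.0f} h, planned replacement every {interval_h:.0f} h")
print("Unexploited lifetime per unit (h):", unexploited_h)
```

Under these assumptions, a shorter interval avoids more unexpected failures but discards more remaining lifetime, which is the cost that predictive maintenance tries to reduce.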

In the last few years, many works have addressed data-driven Predictive Maintenance (PdM) using Machine Learning (ML) and Deep Learning (DL) solutions, especially the latter. Industrial equipment events, such as temporal behaviour and fault events, can be monitored and logged from records generated by sensors installed in different parts of an industrial plant, which frames the task as anomaly detection in time series. However, this progress is still incipient: many challenges remain, and the performance of applications depends on choosing appropriate features and methods to capture the system behaviour.

2. Data-driven Predictive Maintenance

Predictive maintenance attempts to predict failures and proactively avoid system shutdown, which distinguishes it from traditional maintenance techniques (e.g., corrective and preventive). Detecting and preventing failures in industries with high operational risk (e.g., the railway industry) is essential to improve not only system efficiency (e.g., equipment utilisation) but also effectiveness (e.g., the integrity of the environment and human safety). Industries seek to minimise the number of operational failures, minimise operating costs, and increase productivity, which makes maintenance management crucial. Consequently, planning and analysis strategies are necessary to assess the equipment’s operating status and useful life.

Due to the complexity of industrial processes, several automated solutions have been developed to support decision making by projecting the future state of equipment using signal processing techniques. Modern transportation, for example, is highly dependent on these automated solutions to move cargo and passengers. The global increase in production and logistics demands greater use of railways. As a result, damage to the overall structure and its components becomes common due to factors such as weather and degradation. Such damage could lead to accidents of varying severity, which can even cause fatalities [2].


Figure 1 - Classification of automatic industrial maintenance approaches.

Over the years, PdM practices have been developed from several perspectives: Failure Prediction, to predict equipment failure within a time interval; Remaining Useful Life estimation, to estimate the remaining useful lifetime of equipment; and Root Cause Analysis, to identify the causes of a failure. These perspectives are illustrated in Figure 1 and are detailed next.

Failure Prediction is the most generic and direct perspective in Predictive Maintenance practice; its main goal is to predict the approximate moment at which a failure could occur.
Remaining Useful Life is strongly related to prognostics, which estimates the amount of time equipment will remain operational before it requires repair or replacement. Prognostics is directly related to Mean Time to Failure (MTTF) estimation and the likelihood of system failure. It can be regarded as a forecasting process given the current machine condition and its historical record.
Root Cause Analysis is related to diagnosis: the identification of the most probable causes of a failure.
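To make the forecasting view of useful-life estimation concrete, the sketch below fits a straight line to a degradation indicator and extrapolates it to a failure threshold. This is only one simple possibility, not a method prescribed by the source: the linear degradation assumption, the function names, the wear values, and the threshold are all hypothetical.

```python
def fit_line(times, values):
    """Ordinary least-squares fit of values = slope * time + intercept."""
    n = len(times)
    if n < 2 or n != len(values):
        raise ValueError("need two or more paired observations")
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time stamps must not all be identical")
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    slope = sxy / sxx
    return slope, v_mean - slope * t_mean


def estimate_remaining_life(times, values, failure_threshold):
    """Extrapolate the fitted trend to the threshold.

    Returns the time left after the last observation, or None when the
    indicator shows no upward (degrading) trend to extrapolate.
    """
    slope, intercept = fit_line(times, values)
    if slope <= 0:
        return None
    time_at_failure = (failure_threshold - intercept) / slope
    return max(time_at_failure - times[-1], 0.0)


# Hypothetical wear indicator sampled every 100 operating hours.
hours = [0, 100, 200, 300, 400]
wear = [1.0, 1.5, 2.0, 2.5, 3.0]

remaining_h = estimate_remaining_life(hours, wear, failure_threshold=5.0)
print("Estimated remaining operating hours:", remaining_h)
```

Real degradation is rarely linear, so in practice the fitted model would be replaced by whatever ML or DL regressor suits the equipment; the extrapolate-to-threshold structure stays the same.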

In the past decade, many works have addressed data-driven PdM using ML/DL approaches, mainly the latter. Industrial equipment events, such as temporal behaviour and fault events, can be monitored and logged from data and records generated by various sensors installed on the equipment. Specifically, sensors can be deployed for PdM to decrease the failure rate and enhance system reliability [3]. Such sensors can monitor equipment and generate alerts when it needs attention. The progressive development of industrial (wireless) sensor networks and emerging technologies, e.g., IoT [3,4,5], generates massive amounts of data at scale and with higher reliability. From this perspective, ML/DL algorithms are particularly relevant for creating advanced mining methods for PdM.

Recent advances in sensors and computing technology have given rise to PdM, which maximises system utilisation, minimises maintenance costs, and improves safety, reliability, and efficiency. In particular, recent advances in cloud storage, communication, and sensing allow the railway industry to monitor any part of the system more precisely and in real time. Thus, more complex solutions are necessary to analyse data with more scalability, precision, and efficiency.

Research on PdM practices for the railway industry is receiving progressively more attention from industry and academia. A recent literature review of Big Data Analytics in the railway industry can be found in [6], where the level and types of big data models are reviewed and summarised for operations, maintenance, and safety applications. Most works focus on solutions that assess the health state of infrastructure, such as railway points (switches) and interlocking systems. In the case of trains, however, there are many other challenges related to internal conditions, such as the general functioning of wagons (e.g., wheels, compressed-air units, brakes), and external conditions, such as weather and geographical position, among other variables.

The dynamic context of the railway system is exceptionally challenging, and each of these areas requires studying many combinations of analyses. We therefore define a taxonomy specific to the railway industry. Unlike [6], our taxonomy classifies the related works into three areas: infrastructure, scheduling policies, and vehicles. We also organise the works by the type of data analysis method used to address PdM practices, with a classification grounded on ML and DL algorithms that follows the work in [1]. In practice, PdM needs a timely decision-making process, which requires models that process data and adjust themselves in time.

3. Conclusion and Future Research Directions

Although data-driven PdM has been gaining research attention, particularly in the past few years, the number of works designed specifically for the railway industry is quite limited.

Considering the research trends reviewed, we can observe some significant gaps in the literature. As noted, only a few works have treated the data as time series, even though sensors typically gather data in time-series format. We can therefore frame this scenario as a task of anomaly detection in time series. Anomaly detection is the problem of identifying specific patterns or events that differ markedly from the rest of the data and that can arise for many reasons.
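As a minimal illustration of anomaly detection in a sensor time series, the sketch below flags readings that deviate strongly from the preceding window of values. A rolling z-score is used here only because it is easy to follow; it stands in for the ML/DL detectors discussed in the literature and is not a method from the source. The window size, threshold, function name, and readings are hypothetical.

```python
from statistics import mean, stdev


def rolling_zscore_anomalies(series, window=5, threshold=3.0):
    """Return indices whose value lies more than `threshold` sample
    standard deviations from the mean of the preceding `window` values."""
    if window < 2:
        raise ValueError("window must contain two or more values")
    anomalies = []
    for i in range(window, len(series)):
        reference = series[i - window:i]
        mu = mean(reference)
        sigma = stdev(reference)
        if sigma == 0:
            continue  # flat window: the z-score is undefined, so skip
        if abs(series[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical sensor readings with one spike at index 8.
readings = [10.0, 10.1, 9.9, 10.0, 10.2, 10.1, 9.8, 10.0, 15.0, 10.1, 9.9]

anomalies = rolling_zscore_anomalies(readings)
print("Anomalous indices:", anomalies)
```

Note that the spike inflates the statistics of the windows that follow it, so readings just after an anomaly are judged against a distorted reference; more robust detectors address this, for example by using medians or by excluding flagged points.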

In manufacturing systems, reducing downtime is critical, and anomaly detection enables PdM to reduce downtime. Recent works have addressed anomaly detection for PdM supported by learning strategies on sequential data [2,39,106,115,116,117,118]. In the last few years, several papers have applied anomaly detection on time-series data to a wide range of domains [1,109,112,114,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140].

The major challenge is building models that can handle a high volume of time series in real time to perform anomaly prediction. Moreover, currently used metrics are not feasible in this context, so it will be indispensable to look for new alternatives that can evaluate models efficiently.
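One way to picture the real-time constraint is a detector that keeps only constant-size state per time series and updates it with every new reading. The sketch below uses Welford's online algorithm for the running mean and variance; it is an assumption-laden illustration rather than an approach proposed in the source, and the class name, threshold, warm-up length, and stream values are hypothetical.

```python
import math


class StreamingDetector:
    """Constant-memory anomaly scoring for a single sensor stream."""

    def __init__(self, threshold=3.0, warmup=5):
        self.threshold = threshold  # z-score above which a reading is flagged
        self.warmup = warmup        # readings to observe before scoring
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations

    def update(self, x):
        """Score x against the readings seen so far, then absorb it."""
        is_anomaly = False
        if self.count >= max(self.warmup, 2):
            std = math.sqrt(self.m2 / (self.count - 1))
            if std > 0 and abs(x - self.mean) / std > self.threshold:
                is_anomaly = True
        # Welford's update: no past readings need to be stored.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)
        return is_anomaly


# Hypothetical stream with one spike at position 8.
stream = [10.0, 10.1, 9.9, 10.0, 10.2, 10.1, 9.8, 10.0, 15.0, 10.1]

detector = StreamingDetector()
flagged = [i for i, x in enumerate(stream) if detector.update(x)]
print("Flagged positions:", flagged)
```

Because each detector holds three numbers, one instance per sensor scales to many streams. This version absorbs every reading, including flagged ones, into its statistics; whether anomalies should update the model is a design choice that depends on the application.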

The other essential line of action is to explore different DL algorithms and architectures, such as Recurrent Neural Networks (RNN), Generative Adversarial Networks (GAN), Transfer Learning (TL), and Reinforcement Learning (RL). Recent works have proposed DL-based approaches to anomaly detection in time series [28,125,127,139,141,142]. Nevertheless, new proposals in this research line will be necessary.

The last challenge is to achieve the desired synergy between ML/DL methods and Root Cause Analysis (RCA), gaining automatic reasoning power to explain causality.