Regarding the risk of runway overrun for commercial aircraft, existing research can be broadly categorized into two types. The first type focuses on post hoc analysis of runway overrun incidents based on historical accident data [5][6][7][8][9]. These studies mainly employ probabilistic statistical analysis methods, combined with accident investigation reports, to analyze the causes of runway overrun accidents and propose corresponding improvement measures. The main problem with this type of research is that the number of accident samples is limited and that it cannot exploit the valuable information contained in vast QAR data, which leads to significant limitations. The second type focuses on flights in which no runway overrun occurred and uses QAR data to analyze the relative risk levels of different flights. The main issue with this type of research is that the risk assessment metrics are overly simplistic and fail to consider the dynamic runway overrun risk arising from the pilot’s deceleration operation after touchdown.
2. Runway Overrun
As mentioned above, current studies on the risk of runway overrun can be roughly divided into two categories, i.e., studies based on real historical accident data and studies based on QAR data. In this section, a brief overview of the related work in these two categories is given.
For studies based on historical accident data, Kirkland et al. [5] attempted to normalize historical accident data to facilitate future research. They further adopted bivariate analysis to build a probabilistic model for the risk factors of runway overrun incidents [6]. Their model mainly considers factors such as the aircraft weight, tailwind, light conditions, weather, approach speed, and touchdown point. Valdés et al. [7] added factors such as the aircraft type, airport elevation, and safety area length to their probabilistic model. Ayres et al. [8] built a probabilistic model that considers the spatial distribution of accident locations to describe runway overrun and excursion accidents. Wagner et al. [9] studied the severity of accident consequences using logistic regression and Bayesian logistic regression, based on more than 1400 runway overrun and excursion records from an ACRP database covering 1970 to 2009. The records span five accident types: runway overrun during takeoff, runway overrun during landing, runway excursion during takeoff, runway excursion during landing, and undershoot.
For studies based on QAR data, machine learning and risk assessment models are mainly employed to identify the key factors contributing to the runway overrun risk. Wang et al. [10][11] analyzed long landing incidents, which carry a risk of runway overrun, using variance analysis and linear regression (LR) on QAR data to examine the factors that contribute to long landing. Subsequently, Wang et al. [12][13] proposed a runway overrun risk assessment model based on QAR data, defining the long landing risk as the product of the probability of a certain landing distance and the severity of risk corresponding to that landing distance. Kang et al. [14][15] investigated the long landing problem and proposed a deep sequence-to-sequence model to predict the landing speed and distance. Lv et al. [16] defined runway overrun risk as a function of the remaining runway distance and touchdown speed, divided flights into high-risk and low-risk groups based on the magnitude of this risk indicator, and then employed machine learning algorithms to classify high-risk flights. Ayra et al. [17] took the remaining runway distance when the aircraft’s ground speed reaches 80 knots as a measure of runway overrun risk. They used Bayesian networks to analyze the factors influencing this risk, including the crosswind, tailwind, surface contamination, approach mode, autobrake usage, and entry altitude. To address the lack of positive samples for runway overrun, Koppitz et al. [18] employed subset simulation to calculate how the probability of runway overrun incidents changes with the distributions of selected factors, thereby identifying relevant risk factors.
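The product-form risk definition attributed to Wang et al. [12][13] above can be illustrated with a minimal sketch. It assumes that landing distances are grouped into bins, that the probability of each bin is estimated empirically from a sample of flights, and that a severity score is supplied per bin. The function name, the bin edges, the severity values, and the summation into a single total are illustrative assumptions, not the cited authors' implementation.

```python
from bisect import bisect_right


def long_landing_risk(landing_distances_m, bin_edges_m, severities):
    """Per-bin risk = empirical P(bin) * severity(bin), plus the total.

    bin_edges_m must be sorted; it splits distances into
    len(bin_edges_m) + 1 bins, one severity score per bin.
    All numeric choices are hypothetical (not from the cited papers).
    """
    if len(severities) != len(bin_edges_m) + 1:
        raise ValueError("need exactly one severity per distance bin")
    if not landing_distances_m:
        raise ValueError("need at least one landing distance")

    # Count how many landings fall into each distance bin.
    counts = [0] * len(severities)
    for distance in landing_distances_m:
        counts[bisect_right(bin_edges_m, distance)] += 1

    # Risk of each bin: probability of that bin times its severity.
    n = len(landing_distances_m)
    per_bin = [count / n * sev for count, sev in zip(counts, severities)]
    return per_bin, sum(per_bin)


# Hypothetical sample: four landings, bins split at 600 m and 800 m.
per_bin, total = long_landing_risk(
    [400.0, 500.0, 650.0, 900.0],
    bin_edges_m=[600.0, 800.0],
    severities=[0.0, 0.5, 1.0],
)
print(per_bin, total)
```

Summing the per-bin products is only one way to aggregate them; a study could equally report the per-bin values or the maximum.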
The limitations of the existing studies are as follows. Studies based on historical accident data can only provide coarse-grained information, such as the weather conditions, aircraft weight, and aircraft age. Compared to QAR data, the information available from historical accident data is therefore very limited, and it is difficult to uncover valuable in-flight information to support flight safety analysis. For studies based on QAR data, the current risk assessment metrics are usually overly simplistic and fail to consider the dynamic runway overrun risk arising from the pilot’s deceleration operation after touchdown.
3. Other QAR-Based Flight Safety Studies
In addition to runway overrun, other flight safety issues have been studied based on QAR data, the main category being exceedance events. Qi et al. [19] proposed a new method for partitioning the risk subspaces of exceedance events based on rough set theory and a multiobjective particle swarm optimization algorithm. Liu et al. [20] developed a risk assessment model for exceedance events, defining exceedance risk as the product of the probability of exceedance events and the severity of the events. They developed a quality assessment system for pilot operations based on their model. Wang et al. [21] investigated the relationship between the risk cognition of pilots and exceedance events based on QAR data. Li et al. [22] investigated the tail strike risk and proposed an unsupervised learning method to discover patterns of unsafe pilot stick operations during the landing stage.
Hard landing is another typical flight safety incident that has attracted research attention. Hu et al. [23] designed a prediction model for hard landing based on a support vector machine (SVM). Qiao et al. [24] employed RBF neural networks and the K-means clustering algorithm to predict hard landing. Tong et al. [25] addressed the problem of hard landing using a deep learning framework. Considering the temporal characteristics of QAR data, they proposed a hard landing prediction framework based on long short-term memory (LSTM) networks. Additionally, they applied LSTM networks to predict aircraft landing speeds [26]. Chen et al. [27] used scalar measurements and aggregated QAR data to detect the influential features of hard landing. Recently, Li et al. [28][29] automatically classified and identified the causes of hard landing using the K-means clustering algorithm. Chen et al. [30] proposed a deep neural network model with time-aware attention for interpretable hard landing prediction. Jin et al. [31] developed transfer learning methods for high-dimensional quantile regression and applied them to assess the hard landing risk for flight safety.
In terms of abnormal flight analysis, Li et al. [32][33][34] applied clustering and outlier detection methods to identify abnormal flights from massive QAR data. Later, Li et al. [35] improved their model to detect the specific flight phase of abnormal flights in which the QAR parameters deviate from their normal behavior.