Safety in Traffic Management Systems

Traffic management systems play a vital role in ensuring safe and efficient transportation on roads. However, the use of advanced technologies in traffic management systems has introduced new safety challenges. Therefore, it is important to ensure the safety of these systems to prevent accidents and minimize their impact on road users.

  • traffic safety
  • proactive safety methods
  • safety analysis

1. Introduction

Traffic safety is of paramount importance, particularly in the era of emerging technologies like automated vehicles and connected vehicles [1]. As these technologies continue to evolve and become more prevalent on the roads, the potential for safer transportation increases significantly. Automated vehicles have the potential to minimize human error, which is responsible for the majority of traffic accidents. With their advanced sensors and algorithms, they can detect and respond to potential hazards more swiftly and effectively than human drivers. Similarly, connected vehicles enable real-time communication between vehicles and infrastructure, allowing for enhanced awareness and coordination on the road. This connectivity facilitates the exchange of critical information, such as traffic conditions, weather updates, and road hazards, thereby enabling drivers to make informed decisions and avoid potential dangers.

2. Safety in Traffic Management Systems

The analysis of safety in traffic management systems involves evaluating and assessing the safety aspects of various components and processes within a transportation system. It aims to identify the potential hazards, assess the risks, and implement measures to mitigate those risks, ultimately ensuring the safety of road users and minimizing the occurrence of accidents. In addition to risk analysis, researchers also strive to analyze injuries with the goal of minimizing their occurrence and severity to the lowest possible level. The analysis of safety in traffic management systems is a multidisciplinary field that combines expertise from transportation engineering, data analysis, human factors, and policy making to ensure safer road environments and reduce the likelihood and severity of accidents and injuries.

2.1. Method

Matched-pair Analysis. Matched-pair analysis, also known as paired analysis or paired comparison, is a statistical method used to compare two related sets of data or observations. It is particularly useful when it is difficult to establish a direct cause-and-effect relationship between variables or when the data exhibit a high degree of variability. In matched-pair analysis, each observation in one group or condition is paired with a corresponding observation in the other group or condition. Pairs are formed based on shared or relevant characteristics, such as age, sex, or another relevant factor, so that each pair of observations is as similar as possible except for the variable being investigated. Pairing observations helps to control for individual differences or confounding variables that could affect the outcome being measured, which increases precision and reduces the potential biases associated with unpaired comparisons. Matched-pair analysis was applied in [6] to analyze the relative crash risk during various types of precipitation (rain, snow, sleet, and freezing rain).
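As a rough illustration of the idea (not the analysis in [6]), the sketch below applies a paired t-test with SciPy to hypothetical crash counts for road segments, each observed once during precipitation and once during a matched non-precipitation control period; the data and pairing are invented for illustration.

```python
import numpy as np
from scipy import stats

# Hypothetical crash counts for road segments, each observed once during
# precipitation and once during a matched non-precipitation control period.
crashes_precip  = np.array([4, 7, 3, 6, 5, 9, 2, 8])
crashes_control = np.array([2, 5, 3, 4, 3, 6, 1, 5])

# Paired t-test: within-pair differences cancel segment-level confounders
# (traffic volume, geometry) that affect both observations of a pair.
t_stat, p_value = stats.ttest_rel(crashes_precip, crashes_control)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
```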
Mutual Information Theory. Mutual information theory is a concept in information theory that measures the amount of information that is shared or transmitted between two random variables. It quantifies the degree of dependence or association between the variables and provides a measure of the reduction in uncertainty about one variable given knowledge of the other variable. Entropy is a fundamental concept in information theory that characterizes the uncertainty or randomness of a random variable. It measures the average amount of information needed to specify the outcome of a random variable. Higher entropy indicates higher uncertainty. In addition, mutual information measures the amount of information that two random variables share. It quantifies the reduction in uncertainty about one variable by knowing the value of the other variable. Mathematically, it is the difference between the entropy of the individual variables and the joint entropy of the two variables. If the mutual information is high, it indicates a strong relationship between the variables, suggesting that knowledge of one variable provides substantial information about the other variable.
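A minimal numerical sketch of these quantities follows, using SciPy and scikit-learn on hypothetical discrete observations (weather condition versus crash severity class); the variable names and data are assumptions made for illustration.

```python
import numpy as np
from scipy.stats import entropy
from sklearn.metrics import mutual_info_score

# Hypothetical joint observations of two discrete variables, e.g. weather
# condition (0 = clear, 1 = rain) and crash severity class (0/1/2).
weather  = np.array([0, 0, 1, 1, 1, 0, 1, 0, 1, 1])
severity = np.array([0, 0, 1, 2, 1, 0, 2, 1, 1, 2])

# Marginal entropy H(X): average information (in nats) needed to specify
# the outcome of a single variable; higher entropy means more uncertainty.
def marginal_entropy(x):
    _, counts = np.unique(x, return_counts=True)
    return entropy(counts / counts.sum())

h_weather, h_severity = marginal_entropy(weather), marginal_entropy(severity)

# Mutual information I(X; Y) = H(X) + H(Y) - H(X, Y): the reduction in
# uncertainty about severity obtained by observing the weather.
mi = mutual_info_score(weather, severity)
print(f"H(weather)={h_weather:.3f}, H(severity)={h_severity:.3f}, I={mi:.3f}")
```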
Matched Case-Control. The matched case-control approach can be applied to analyze crash occurrences during special scenarios such as evacuations [7,28]. This approach allows for a thorough investigation of the potential risk factors or exposures that contribute to crashes in a special scenario while controlling for confounding variables. For example, the authors in [7] studied the factors contributing to increased crash occurrences during evacuations, particularly in the context of hurricanes. They adopted a matched case-control approach and analyzed traffic data collected shortly before each crash, considering data from upstream and downstream detectors surrounding the crash location.
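Matched case-control data are commonly analyzed with conditional logistic regression, which conditions out stratum-level confounders. The sketch below is a simplified stand-in for such an analysis (not the model in [7]), assuming statsmodels' ConditionalLogit is available; the strata, features, and effect sizes are synthetic.

```python
import numpy as np
import pandas as pd
from statsmodels.discrete.conditional_models import ConditionalLogit

# Hypothetical matched case-control data: each stratum pairs one crash case
# with controls taken from the same location and time of day.
rng = np.random.default_rng(0)
n_strata, controls_per_case = 50, 2
rows = []
for s in range(n_strata):
    for is_case in [1] + [0] * controls_per_case:
        rows.append({
            "stratum": s,
            "crash": is_case,
            # pre-crash traffic features from upstream/downstream detectors
            "speed_variation": rng.normal(8 + 3 * is_case, 2),
            "volume": rng.normal(1200 + 150 * is_case, 100),
        })
df = pd.DataFrame(rows)

# Conditional logistic regression estimates within-stratum effects, so
# stratum-level confounders (location, time of day) cancel out of the likelihood.
model = ConditionalLogit(df["crash"], df[["speed_variation", "volume"]],
                         groups=df["stratum"])
print(model.fit().summary())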
Structural Equation Modeling (SEM). Structural Equation Modeling (SEM) is a statistical modeling technique used to analyze complex relationships among observed and latent (unobserved) variables. It allows researchers to test and estimate the relationships between variables, examine causal relationships, and assess the overall fit of the model to the data. The researchers first specify a theoretical model that represents the hypothesized relationship between the observed variables and the latent variables. The model is typically represented as a set of equations that describe the relationships between the variables. Then, the SEM distinguishes between the observed variables and the latent variables. The relationship between variables is often depicted using a path diagram indicating the hypothesized direction and strength of the relationships. Overall, SEM is a versatile and powerful technique that can handle complex models with multiple variables, assess both the measurement and structural aspects of the model, and provide insights into the underlying relationships. 
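A minimal SEM sketch is shown below, assuming the semopy package and its lavaan-style model syntax; the latent "risky driving" construct, its indicators, and the simulated data are hypothetical and serve only to show the measurement (=~) and structural (~) parts of a model.

```python
import numpy as np
import pandas as pd
import semopy  # assumption: the semopy package is used for SEM in Python

# Hypothetical indicators of a latent "risky driving" construct and an
# observed crash-frequency outcome.
rng = np.random.default_rng(1)
n = 300
risky = rng.normal(size=n)                        # unobserved latent factor
df = pd.DataFrame({
    "speeding":     risky + rng.normal(scale=0.5, size=n),
    "tailgating":   0.8 * risky + rng.normal(scale=0.5, size=n),
    "hard_braking": 0.6 * risky + rng.normal(scale=0.5, size=n),
    "crashes":      0.5 * risky + rng.normal(scale=0.5, size=n),
})

# Model description: measurement part (=~) links the latent variable to its
# observed indicators; structural part (~) specifies the causal path.
desc = """
risky_driving =~ speeding + tailgating + hard_braking
crashes ~ risky_driving
"""
model = semopy.Model(desc)
model.fit(df)
print(model.inspect())   # estimated loadings and path coefficients
```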
Logistic Regression. Logistic regression is a statistical modeling technique commonly used to analyze the risk factors associated with road safety. It allows researchers to understand the relationship between various risk factors and the likelihood of a specific outcome, such as road accidents or crash occurrences.
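A brief sketch with statsmodels on simulated data follows; the risk factors (speed limit, wet surface) and coefficients are assumptions chosen for illustration, and the exponentiated coefficients are read as odds ratios.

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical crash/no-crash outcomes with two risk factors.
rng = np.random.default_rng(2)
n = 500
speed_limit = rng.uniform(30, 70, n)          # mph
wet_surface = rng.integers(0, 2, n)
logit = -6 + 0.08 * speed_limit + 0.9 * wet_surface
crash = rng.binomial(1, 1 / (1 + np.exp(-logit)))

# Fit the logistic model; exponentiated coefficients are odds ratios,
# i.e. the multiplicative change in crash odds per unit of each factor.
X = sm.add_constant(np.column_stack([speed_limit, wet_surface]))
result = sm.Logit(crash, X).fit(disp=0)
print(np.exp(result.params))   # odds ratios: intercept, speed limit, wet surface
```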
Negative Binomial Regression. Negative binomial regression is a statistical method used to analyze count data with overdispersion, which occurs when the observed variance is greater than the mean. It is a generalized linear regression model that is particularly suited for modeling count outcomes, such as the number of events or occurrences. The estimation of negative binomial regression is typically conducted using maximum likelihood estimation. The model provides estimates of the regression coefficients, which indicate the direction and magnitude of the relationship between each independent variable and the count outcome. Additionally, the model provides information on the dispersion parameter, which indicates the degree of overdispersion in the data. The authors in [35] utilized negative binomial regression models, using these indicators to anticipate the risk level of horizontal curve segments. Negative binomial regression was also used in [36] to examine the impact of risk factors independent of exposure when analyzing the risk of cycling crashes. 
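The sketch below fits a negative binomial count model with statsmodels on simulated, overdispersed crash counts (not the models in [35,36]); the covariates and parameter values are hypothetical, and the fitted summary reports both the coefficients and the dispersion parameter alpha.

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical crash counts per curve segment with overdispersion.
rng = np.random.default_rng(3)
n = 400
curvature = rng.uniform(0.5, 3.0, n)          # degree of curvature
aadt = rng.uniform(2, 20, n)                  # traffic volume (thousands of veh/day)
mu = np.exp(-1.0 + 0.4 * curvature + 0.08 * aadt)
crashes = rng.negative_binomial(n=2.0, p=2.0 / (2.0 + mu))

# The negative binomial model estimates the coefficients and the dispersion
# parameter (alpha) jointly by maximum likelihood; alpha > 0 indicates
# overdispersion relative to a Poisson model.
X = sm.add_constant(np.column_stack([curvature, aadt]))
result = sm.NegativeBinomial(crashes, X).fit(disp=0)
print(result.summary())
```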
ANOVA (Analysis of Variance). ANOVA is a statistical technique used to compare the means of two or more groups to determine whether there are any significant differences between them. It allows for the examination of variation within groups as well as between groups. ANOVA tests the null hypothesis that the means of all groups are equal, and if the observed differences between the groups are larger than what would be expected by chance, the null hypothesis is rejected. ANOVA provides valuable insights into the significance of group differences and is widely used in various fields, including psychology, biology, and the social sciences. The study [38] used ANOVA to examine accident severity but highlighted limitations due to incomplete or unclear data in the national census and statistics yearbooks.
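A one-way ANOVA can be run in a few lines with SciPy; the injury-severity scores and road-type groups below are invented for illustration.

```python
from scipy import stats

# Hypothetical injury-severity scores grouped by road type.
urban    = [2.1, 3.0, 2.5, 2.8, 3.2, 2.6]
rural    = [3.8, 4.1, 3.5, 4.4, 3.9, 4.0]
motorway = [2.9, 3.1, 2.7, 3.3, 3.0, 2.8]

# One-way ANOVA tests the null hypothesis that all group means are equal;
# a small p-value suggests at least one road type differs in mean severity.
f_stat, p_value = stats.f_oneway(urban, rural, motorway)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```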
Association Rule Mining. Association Rule Mining (ARM) is a data mining technique that aims to discover interesting relationships or patterns within a dataset. It focuses on identifying associations or correlations between different items or variables in large datasets. The ARM works by analyzing transactions or records to find frequent itemsets, which are sets of items that often appear together. From these frequent itemsets, association rules are generated, which describe relationships between items based on their co-occurrence. These rules consist of an antecedent (if) and a consequent (then) and can be used to uncover valuable insights, make predictions, or support decision making in transportation fields.
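A small Apriori-based sketch follows, assuming the mlxtend package; the one-hot encoded crash attributes and thresholds are hypothetical and only demonstrate the frequent-itemset and rule-generation steps.

```python
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

# Hypothetical one-hot encoded crash records (each row is one crash).
crashes = pd.DataFrame({
    "night":         [1, 1, 0, 1, 0, 1, 1, 0],
    "wet_road":      [1, 0, 0, 1, 1, 1, 0, 0],
    "speeding":      [1, 1, 0, 1, 0, 1, 1, 0],
    "severe_injury": [1, 1, 0, 1, 0, 1, 0, 0],
}).astype(bool)

# Frequent itemsets: combinations of conditions that co-occur often enough.
itemsets = apriori(crashes, min_support=0.3, use_colnames=True)

# Rules of the form "if antecedent then consequent", filtered by confidence.
rules = association_rules(itemsets, metric="confidence", min_threshold=0.7)
print(rules[["antecedents", "consequents", "support", "confidence", "lift"]])
```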
Autoencoder. An autoencoder is an unsupervised neural network architecture that aims to learn efficient representations of input data by reconstructing them from a compressed latent space. It includes an encoder network that converts the input data into a lower-dimensional representation, as well as a decoder network that reconstructs the input based on the encoded representation. Both the encoder and decoder are trained jointly to minimize the disparity between the initial input and the reconstructed output. 
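A minimal PyTorch autoencoder sketch is shown below; the 20-dimensional inputs, layer sizes, and training loop are placeholder assumptions used to illustrate joint training of the encoder and decoder on the reconstruction loss.

```python
import torch
import torch.nn as nn

# Minimal autoencoder: the encoder compresses 20-dimensional inputs to a
# 4-dimensional latent code, and the decoder reconstructs the original vector.
class AutoEncoder(nn.Module):
    def __init__(self, n_features=20, latent_dim=4):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_features, 12), nn.ReLU(),
                                     nn.Linear(12, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 12), nn.ReLU(),
                                     nn.Linear(12, n_features))

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = AutoEncoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()                      # reconstruction error

x = torch.randn(256, 20)                    # placeholder input batch
for _ in range(100):                        # encoder and decoder trained jointly
    optimizer.zero_grad()
    loss = loss_fn(model(x), x)
    loss.backward()
    optimizer.step()
print(f"final reconstruction loss: {loss.item():.4f}")
```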
Propensity Score Weighting Approach. The propensity score weighting approach is a statistical method used to estimate causal effects in observational studies. It addresses the issue of confounding variables by creating a weighted sample that equalizes the distribution of covariates across treatment groups, mimicking a randomized controlled trial (RCT) design. The propensity score is typically estimated using a logistic regression model. It summarizes the covariate information into a single value for each observation and provides a way to create a pseudo-randomization by balancing the covariate distribution between different groups. Once the propensity scores are estimated, each observation is assigned a weight based on its propensity score. The weight reflects the inverse of the probability of receiving the condition (treatment or control) that the observation actually received.
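The three steps (estimate propensity scores, form inverse probability weights, compare weighted outcomes) are sketched below on simulated data with scikit-learn; the "safety intervention" treatment, covariates, and effect size are all hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical observational data: "treated" road segments received a safety
# intervention; covariates influence both treatment assignment and the outcome.
rng = np.random.default_rng(4)
n = 1000
X = rng.normal(size=(n, 3))                             # covariates
treated = rng.binomial(1, 1 / (1 + np.exp(-(X @ [0.8, -0.5, 0.3]))))
outcome = 2.0 - 1.0 * treated + X @ [0.5, 0.2, -0.4] + rng.normal(size=n)

# Step 1: estimate propensity scores e(x) = P(treated | covariates).
ps = LogisticRegression().fit(X, treated).predict_proba(X)[:, 1]

# Step 2: inverse probability weights -- 1/e(x) for treated units and
# 1/(1 - e(x)) for controls -- balance covariates across the two groups.
w = np.where(treated == 1, 1 / ps, 1 / (1 - ps))

# Step 3: the weighted difference in mean outcomes approximates the average
# treatment effect, mimicking a randomized comparison.
ate = (np.average(outcome[treated == 1], weights=w[treated == 1])
       - np.average(outcome[treated == 0], weights=w[treated == 0]))
print(f"estimated treatment effect: {ate:.3f}")
```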
SHapley Additive exPlanations (SHAP). SHAP is a method used to explain the predictions of machine learning models. It provides an interpretation of how each input feature contributes to the model’s output prediction. SHAP is based on the concept of cooperative game theory and assigns values to each feature based on their contribution to the prediction in a fair and consistent manner. A Shapley value is calculated for each input feature, indicating its contribution to the model’s prediction. SHAP calculates feature importance by considering all possible coalitions of features and measuring their contributions to the prediction. For each coalition, the Shapley value is calculated by averaging the marginal contributions of the features. This process accounts for the interactions and dependencies between features.
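A short sketch with the shap package follows, assuming a tree-based classifier; the crash-severity features, data, and model are placeholders used only to show how per-feature Shapley values are obtained for individual predictions.

```python
import numpy as np
import shap                                   # assumption: the shap package is available
from sklearn.ensemble import RandomForestClassifier

# Hypothetical crash-severity classifier trained on three traffic features.
rng = np.random.default_rng(5)
X = rng.normal(size=(500, 3))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# TreeExplainer computes Shapley values efficiently for tree ensembles;
# each value is a feature's additive contribution to one prediction.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:10])
print(np.shape(shap_values))                  # per-sample, per-feature contributions
```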
T-test. A t-test is a statistical method used to determine whether there is a significant difference between the means of two groups or conditions. It assesses whether an observed difference in sample means reflects a true difference in the population or is simply due to random variation. Son et al. [43] utilized a t-test in a connected vehicle setting to assess the efficacy of an in-vehicle advanced warning information service for mitigating secondary crashes.
Mann–Whitney Test. The Mann–Whitney test, also known as the Mann–Whitney U test (i.e., the Wilcoxon rank-sum test), is a nonparametric statistical test used to compare the distributions of two independent samples. It is used when the data do not meet the assumptions of normality required for parametric tests such as the t-test. The Mann–Whitney test compares the ranks of the observations between the two groups rather than the actual values.
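The sketch below illustrates both tests with SciPy on hypothetical driver reaction-time samples with and without an in-vehicle warning (invented data, not the measurements in [43]); the Mann–Whitney U test is the rank-based fallback when the normality assumption of the t-test is doubtful.

```python
import numpy as np
from scipy import stats

# Hypothetical reaction-time samples (seconds): drivers with and without an
# in-vehicle advance warning system.
with_warning    = np.array([1.9, 2.1, 1.8, 2.0, 2.2, 1.7, 1.9, 2.0])
without_warning = np.array([2.5, 2.8, 2.3, 2.6, 3.0, 2.4, 2.7, 2.5])

# Independent-samples t-test: compares group means, assuming roughly normal
# data (Welch's variant drops the equal-variance assumption).
t_stat, p_t = stats.ttest_ind(with_warning, without_warning, equal_var=False)

# Mann-Whitney U test: rank-based alternative when normality cannot be assumed.
u_stat, p_u = stats.mannwhitneyu(with_warning, without_warning)

print(f"t-test: t = {t_stat:.2f}, p = {p_t:.4f}")
print(f"Mann-Whitney: U = {u_stat:.1f}, p = {p_u:.4f}")
```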

2.2. Research Outcomes

Analysis of Roadway Safety. Tobin et al. [6] highlighted that the relative crash risk is significantly higher during periods of precipitation compared to non-precipitation periods. In a similar vein, Wen et al. [41] demonstrated that the importance of risk factors varies across different crash types. For rear-end (RE) crashes, the speed limit emerged as a more crucial risk factor than lane width, right/left shoulder width, and median width. Conversely, for run-off-road (ROR) crashes, the opposite relationship was observed. 

Analysis of Freeway Safety. Several research findings have contributed to our understanding of factors influencing crash occurrence and risk mitigation. Rahman et al. [7] identified a high variation in speed at a downstream station and high traffic volume at an upstream station as factors increasing the likelihood of crash occurrence. Zheng et al. [17] further supported this by demonstrating that the downstream average speed was the best crash precursor variable across different segment types. The effectiveness of warning information systems in preventing secondary crash risks was shown by Son et al. [43], with Jang et al. [45] reporting a significant reduction in the crash potential through the provision of warning information. 

Analysis of Intersection Safety. Several research findings have shed light on various aspects related to crash risk and intersection safety. Kwon et al. [47] emphasized the significance of intersection characteristics, such as the proportional area of sky and roadway, in influencing the perceived crash risk among school-aged children. Mitra et al. [48] identified multiple factors that significantly influenced both the frequency and severity of crashes, including blocked carriageways, approach traffic volume, traffic configuration, type of minor road, presence of protected right turning phase, tram stops, all-red time, visibility of road markings, and non-motorized traffic. 

Analysis of Vehicle, Pedestrian, and Cyclist Safety. Research findings in the field of road safety have provided insights into various factors affecting the crash risk for different road users. For vehicle-related factors, Baikejuli et al. [27] highlighted the contribution of multifactor interaction, such as environmental and vehicular factors, in increasing the crash probability for heavy-truck fatal crashes. 

3. Operation or Control

3.1. Method

Convolutional Neural Networks (CNNs). Convolutional Neural Networks (CNNs) are deep learning models specifically designed for analyzing visual data like images. They use layers of filters to extract meaningful features from the input images. These filters perform convolution operations, highlighting patterns and structures. The network then learns to recognize complex features by stacking multiple convolutional layers. Pooling layers downsample the feature maps to capture important information. Fully connected layers process the extracted features and make predictions. CNNs are trained using labeled data, optimizing their parameters to minimize prediction errors. CNNs have revolutionized computer vision tasks by automatically learning relevant features directly from images, enabling them to achieve high accuracy in tasks like image classification and object detection. 
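A minimal PyTorch CNN sketch follows; the 64x64 grayscale "traffic-scene" inputs, three risk classes, and layer sizes are assumptions used only to show the convolution, pooling, and fully connected stages described above.

```python
import torch
import torch.nn as nn

# Minimal CNN for classifying 64x64 grayscale images into 3 hypothetical classes.
class SmallCNN(nn.Module):
    def __init__(self, n_classes=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                       # 64x64 -> 32x32
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                       # 32x32 -> 16x16
        )
        self.classifier = nn.Linear(32 * 16 * 16, n_classes)

    def forward(self, x):
        x = self.features(x)                       # convolution + pooling
        return self.classifier(x.flatten(1))       # fully connected prediction

model = SmallCNN()
logits = model(torch.randn(8, 1, 64, 64))          # batch of 8 placeholder images
print(logits.shape)                                # torch.Size([8, 3])
```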

LSTM. LSTM (Long Short-Term Memory) is a type of neural network that excels at processing sequential data, like text or time series. Unlike traditional recurrent neural networks (RNNs), LSTMs can remember long-term dependencies in the data by selectively retaining or discarding information using memory cells and specialized gates. This allows them to capture patterns and make accurate predictions in tasks involving sequences. Previous research often combines the convolutional layers with the long short-term memory (LSTM) neural network in a unified deep learning framework, utilizing LSTM to extract temporal features.
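The sketch below shows a bare LSTM classifier in PyTorch over a sequence of per-minute traffic features; the feature set, sequence length, and two risk classes are hypothetical, and in practice convolutional layers are often stacked in front of the LSTM as noted above.

```python
import torch
import torch.nn as nn

# Minimal LSTM: classify a sequence of per-minute traffic features
# (speed, volume, occupancy) as high or low crash risk.
class RiskLSTM(nn.Module):
    def __init__(self, n_features=3, hidden_size=32, n_classes=2):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, n_classes)

    def forward(self, x):                     # x: (batch, time steps, features)
        out, _ = self.lstm(x)                 # memory cells retain long-range context
        return self.head(out[:, -1])          # classify from the last time step

model = RiskLSTM()
x = torch.randn(16, 30, 3)                    # 16 sequences of 30 one-minute steps
print(model(x).shape)                         # torch.Size([16, 2])
```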

XGBoost. XGBoost, short for Extreme Gradient Boosting, is a powerful machine learning algorithm known for its exceptional performance in various predictive modeling tasks. It belongs to the gradient boosting family of algorithms and combines the strengths of decision trees with a boosting approach. XGBoost iteratively builds an ensemble of weak decision tree models, optimizing a specific objective function by minimizing the residuals of the previous model. It incorporates regularization techniques to prevent overfitting and employs parallel processing to accelerate training. XGBoost is highly efficient, scalable, and capable of handling large-scale datasets, making it a popular choice for tasks such as regression, classification, and ranking. 
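A short sketch using the xgboost package on simulated tabular data follows; the features, labels, and hyperparameters are placeholder assumptions, with the regularization term (reg_lambda) included to echo the overfitting control mentioned above.

```python
import numpy as np
from xgboost import XGBClassifier   # assumption: the xgboost package is installed
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Hypothetical tabular crash-risk data.
rng = np.random.default_rng(6)
X = rng.normal(size=(2000, 5))
y = (X[:, 0] + 0.7 * X[:, 1] - 0.5 * X[:, 3] + rng.normal(size=2000) > 0).astype(int)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Gradient-boosted trees: each new tree fits the residual errors of the
# current ensemble; the regularization term helps prevent overfitting.
model = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1,
                      reg_lambda=1.0, n_jobs=-1)
model.fit(X_train, y_train)
print("AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))
```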

Reinforcement Learning (RL). Reinforcement learning is a branch of machine learning, where an agent learns to make sequential decisions in an environment to maximize a cumulative reward signal. It involves an agent interacting with an environment, taking actions, receiving feedback in the form of rewards, and adjusting its behavior over time through trial and error. Reinforcement learning is applicable to traffic management because it can optimize traffic flow, reduce congestion, and improve overall efficiency. By treating traffic management as a sequential decision-making problem, reinforcement learning algorithms can learn to control traffic signals, dynamically adjust the traffic flow, and optimize the traffic patterns based on real-time feedback.
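A toy tabular Q-learning sketch is given below: the agent decides whether to keep or switch a signal phase given a coarse queue-length state. The environment dynamics, reward shape, and state discretization are invented for illustration and are far simpler than real signal-control formulations.

```python
import numpy as np

# Toy Q-learning: the agent chooses whether to keep or switch a signal phase;
# the state is a coarse queue-length level on the main approach.
rng = np.random.default_rng(7)
n_states, n_actions = 5, 2          # queue levels 0-4; actions: 0 = keep, 1 = switch
Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.95, 0.1

def step(state, action):
    """Hypothetical environment: switching clears the queue but costs lost time."""
    next_state = max(0, state - 2) if action == 1 else min(n_states - 1, state + 1)
    reward = -next_state - (0.5 if action == 1 else 0.0)   # penalize queues and switching
    return next_state, reward

state = 0
for _ in range(5000):
    # epsilon-greedy exploration
    action = rng.integers(n_actions) if rng.random() < epsilon else int(Q[state].argmax())
    next_state, reward = step(state, action)
    # temporal-difference update toward the reward plus discounted future value
    Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
    state = next_state

print(Q)   # learned action values per queue level
```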

Generative Adversarial Network. GAN stands for Generative Adversarial Network, which is a type of deep learning model consisting of two main components: a generator and a discriminator. The generator is trained to produce synthetic data samples that exhibit similarity to the real data, while the discriminator is trained to differentiate between the real and synthetic data. They are trained simultaneously in a competitive manner, with the goal of the generator producing data that can fool the discriminator.
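A minimal PyTorch GAN sketch follows, with a generator producing synthetic 2-D samples (e.g. speed/headway pairs) and a discriminator scoring real versus generated data; the network sizes, data distribution, and training schedule are hypothetical.

```python
import torch
import torch.nn as nn

# Generator maps 8-D noise to synthetic 2-D samples; discriminator outputs a
# probability that a sample is real.
G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 2))
D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

# Hypothetical "real" data: speed (~60) and headway (~2 s) pairs.
real_data = torch.randn(512, 2) * torch.tensor([5.0, 1.0]) + torch.tensor([60.0, 2.0])

for _ in range(200):
    batch = real_data[torch.randint(0, 512, (64,))]
    fake = G(torch.randn(64, 8))

    # Discriminator step: label real samples 1 and generated samples 0.
    opt_d.zero_grad()
    loss_d = bce(D(batch), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    loss_d.backward()
    opt_d.step()

    # Generator step: try to make the discriminator label fakes as real.
    opt_g.zero_grad()
    loss_g = bce(D(G(torch.randn(64, 8))), torch.ones(64, 1))
    loss_g.backward()
    opt_g.step()

print(f"final losses  D: {loss_d.item():.3f}  G: {loss_g.item():.3f}")
```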

Graph Neural Network. Graph Neural Networks (GNNs) are a type of neural network specifically designed to handle data with graph structures. GNNs can effectively capture and model the relationships between entities represented as nodes and edges in a graph. In the context of traffic management, GNNs can be applied to various safety-related tasks. For instance, GNNs can analyze road networks and capture the complex dependencies between different road segments, traffic intersections, and their associated attributes (e.g., traffic volume and speed limits). By leveraging this information, GNNs can predict traffic congestion, identify accident-prone areas, and even optimize traffic signal timings for improved safety. GNNs enable a holistic view of the traffic system by considering the spatial relationships and interactions between road elements, leading to more accurate and context-aware safety predictions and interventions in traffic management.
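To make the idea concrete, the sketch below implements a single graph convolution layer in plain PyTorch (rather than a dedicated GNN library) on a toy road network; the adjacency matrix, node features, and risk head are hypothetical.

```python
import torch
import torch.nn as nn

# Minimal graph convolution on a toy road network: nodes are road segments,
# edges connect adjacent segments, node features are traffic statistics.
class GCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        # Symmetrically normalized adjacency with self-loops: D^-1/2 (A + I) D^-1/2
        a_hat = adj + torch.eye(adj.size(0))
        d_inv_sqrt = torch.diag(a_hat.sum(1).pow(-0.5))
        return torch.relu(self.linear(d_inv_sqrt @ a_hat @ d_inv_sqrt @ x))

# 4 road segments in a chain (edges 0-1, 1-2, 2-3); 3 features per segment
# (e.g. speed, volume, speed limit).
adj = torch.tensor([[0., 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]])
x = torch.randn(4, 3)

layer = GCNLayer(3, 8)
risk_head = nn.Linear(8, 1)                    # per-segment crash-risk score
embeddings = layer(x, adj)                     # each node aggregates its neighbors
print(torch.sigmoid(risk_head(embeddings)).squeeze())
```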

Transformer. The Transformer model is a powerful neural network architecture primarily used for sequence-to-sequence tasks, such as machine translation and natural language processing. Unlike traditional recurrent neural networks (RNNs), Transformers rely on self-attention mechanisms, enabling them to capture global dependencies between input elements efficiently. The model comprises an encoder and a decoder, each consisting of multiple layers of self-attention and feed-forward neural networks. The Transformer’s attention mechanism allows it to attend to relevant parts of the input sequence, facilitating parallel processing and capturing long-range dependencies effectively.
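A minimal sketch using PyTorch's built-in Transformer encoder is shown below: self-attention runs over a sequence of per-interval traffic feature vectors, followed by a small classification head. The feature dimension, sequence length, and risk labels are assumptions for illustration.

```python
import torch
import torch.nn as nn

# Self-attention over a sequence of per-interval traffic feature vectors,
# followed by a classification head for a hypothetical risk label.
d_model = 32
encoder_layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4,
                                           dim_feedforward=64, batch_first=True)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)

embed = nn.Linear(6, d_model)      # project 6 raw traffic features per interval
head = nn.Linear(d_model, 2)       # low-risk / high-risk classes

x = torch.randn(8, 20, 6)          # batch of 8 sequences, 20 time intervals each
h = encoder(embed(x))              # each position attends to every other position
logits = head(h.mean(dim=1))       # pool over time and classify
print(logits.shape)                # torch.Size([8, 2])
```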

3.2. Research Outcomes

Control Methods for Highway, Roadway, and Urban Arterials. In general, researchers have explored strategies for enhancing deep learning models to attain improved performance, while simultaneously addressing the challenge of data imbalance and devising optimization techniques for it. Less complex deep models were found to achieve better performance according to Huang et al. [10], while Li et al. [56] demonstrated that their LSTM-CNN model achieved superior performance in terms of the AUC value, false alarm rate, and sensitivity compared to other models. 

Control Methods for Intersections. Researchers have made notable contributions in optimizing safety and efficiency at signalized intersections. Ghoul et al. [14] proposed a signal-vehicle coupled control system that effectively enhanced safety with low computational intensity. Hu et al. [42] demonstrated that deep learning models, particularly the CNN model with an accuracy of 93.8%, are recommended for predicting risk levels at intersections by combining CV data and deep learning networks. Even with low CV penetration rates, this approach showed promise in determining crash risks.

Control Methods for Road Regions. In recent studies, several approaches have been proposed to address different aspects of traffic risk prediction. Zhou et al. [62] devised a dynamic graph neural network to capture real-time traffic variations and correlations among subregions, showcasing the efficacy of multitask learning. Wang et al. [63] combined a spatial–temporal geographical module with Graph Convolutional Networks (GCN) and Gated Recurrent Units (GRU), showcasing improved performance. 

4. Crash Risk Prediction

Zheng et al. [68] introduced a method for forecasting the real-time probability of crashes at signalized intersections, focusing on individual signal cycles. Their approach relied on extracting traffic conflicts from informative vehicle trajectories as a foundation for crash prediction. To address the challenges posed by nonstationarity and unobserved heterogeneity in their model, they established a Bayesian hierarchical structure. The main contribution of their research lay in their novel measurement of traffic conflicts. Specifically, they employed computer vision techniques to extract traffic conflicts, quantified as modified time to collision, along with three cycle-level traffic parameters (shock wave area, traffic volume, and platoon ratio) from video data. In a subsequent study, Gu et al. [69] further explored the realm of intersection safety by highlighting the advantages of connected vehicle technology. This innovation offers abundant vehicle motion data, establishing a stronger link between crash occurrence and driving behaviors. The authors also addressed the challenge of spatial dependence in crash frequency and the multitude of driving features involved in the prediction process. The novelty of their research lay in the introduction of a new artificial intelligence technique known as Geographical Random Forest (GRF). This technique effectively handled spatial heterogeneity and incorporated all the potential predictors.

In addition to crash risk prediction on highways, class imbalances are also present in the prediction of driving safety risks. To address this issue, Chen et al. [75] introduced a novel approach using a deep autoencoder network with L1/L2-nonnegativity constraints and cost sensitivity. This method effectively handled class imbalances and enhanced the prediction performance by determining the optimal sliding window size and automatically extracting hidden features from driving behaviors. The limitation of previous studies in analyzing driver factors and driving maneuvers, due to the absence of disaggregated driving or accident data, was highlighted by Mahajan et al. [76]. To overcome this issue, they introduced a method that considered both the likelihood and potential severity of a collision. This study focused on estimating the rear-end crash risk in specific traffic states and emphasized the significance of comprehending the evolution of crash risk under diverse traffic conditions for real-time crash prediction. Taking advantage of the progress made in deep learning technology, Li et al. [77] proposed an attention-based LSTM model for predicting lane change behavior. Their objective was to enhance both the accuracy and interpretability. Their approach involved two components: a prejudgment model utilizing a C4.5 decision tree and bagging ensemble learning and an LSTM model with an attention mechanism for multistep lane change prediction.

A rear-end collision prediction method for smart cities was proposed by Wang et al. [82]. A CNN model was used for prediction on real trajectory data, while synthetic oversampling based on the genetic theory of inheritance was employed to address the class imbalance. Trirat et al. [8] introduced a deep fusion network for predicting traffic accident risk across urban areas, integrating hazardous driving statistics collected from in-vehicle sensors. The study examined the correlation between dangerous driving offenses and traffic accidents, revealing a strong correlation in terms of the location and time. The work by Elassad et al. [83] addressed the importance of real-time crash prediction and the development of fusion frameworks for intelligent transportation systems. They explored the use of machine learning models and fusion techniques to improve crash predictions by considering diverse data sources, including driver inputs, physiological signals, vehicle kinematics, and weather conditions. The paper highlighted the significance of addressing the class imbalance and presented the effectiveness of boosting in combination with k-NN, Naïve Bayes, Bayesian networks, and SVM with MLP as the meta-classifier.


This entry is adapted from the peer-reviewed paper 10.3390/designs7040100
