In the last few years, transportation researchers have extensively studied the occurrence of traffic congestion in road transportation and the prediction of traffic flow on various road networks. However, few have conducted structured research into traffic flow prediction at signalized road intersections using hybrid and heuristic predictive models. D’Andrea and Marcelloni created an expert system for detecting traffic congestion on road networks using traffic data comprising past and current vehicular speeds 
. Related research to 
was proposed by 
, in which a method called “scalable” was used to predict the traffic congestion of vehicles in a grid framework. Anwar and co-workers applied a spectral clustering-based method to monitor traffic congestion 
. Considering traffic flow density and different road types, Liang and co-workers developed a prediction model that estimates the next-time-step traffic volume of a single road segment and uses it to identify congestion, based on traffic flow variables such as the current inflow, outflow, and traffic volume. 
Xiangjie and co-workers, however, improved the model of 
by using a support vector machine (SVM) to predict the next-time-step traffic speed and traffic volume, which they then used to estimate congestion on road segments 
. Researchers such as 
proposed a specialized variant of the density-based spatial clustering of applications with noise (DBSCAN) algorithm, developed to detect and analyze consistently congested clusters of grid cells. They also investigated a deep-learning-based prediction model using a restricted Boltzmann machine and a recurrent neural network to predict traffic flow on congested roads 
. A practical traffic flow parameter prediction model was created to estimate traffic flow conditions; it combined an autoregressive model with other predictive models 
. In their research, 
developed a model that combines artificial neural networks with root mean squared error as the evaluation metric, applying singular point probabilities. Traffic congestion has become a global problem, and transportation researchers are racing against time to improve the effectiveness of intelligent transportation systems. Some researchers have achieved good results in traffic flow prediction. Traffic flow prediction techniques fall into three categories: traditional statistical techniques, traditional machine learning techniques, and deep learning techniques.
Traditional statistical techniques include the historical average (HA) method and the Autoregressive Integrated Moving Average (ARIMA) model 
. Several variants build on ARIMA, such as the combination of Kohonen maps with ARIMA time series models (KARIMA) 
and the Seasonal Autoregressive Integrated Moving Average (SARIMA) 
. However, the major disadvantage of these models is their limited capacity to handle non-linear and complex traffic flow data.
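As a rough illustration of the historical average method mentioned above, the sketch below forecasts the flow for a time slot as the mean of past observations in the same slot. The function name, the `(time_slot, flow)` data layout, and the vehicle counts are hypothetical and chosen only for illustration; they are not taken from the cited studies.

```python
from collections import defaultdict
from statistics import mean


def historical_average(history):
    """Return the mean observed flow for each time slot.

    history: iterable of (time_slot, flow) pairs, where time_slot is any
    hashable label (here, the hour of day) and flow is a vehicle count.
    """
    by_slot = defaultdict(list)
    for slot, flow in history:
        by_slot[slot].append(flow)
    # The HA forecast for a slot is simply the average of its past values.
    return {slot: mean(flows) for slot, flows in by_slot.items()}


# Hypothetical hourly counts for the 08:00 and 09:00 slots on three days.
history = [(8, 420), (9, 380), (8, 450), (9, 400), (8, 390), (9, 360)]
ha_forecast = historical_average(history)
print(ha_forecast[8], ha_forecast[9])
```

Because the forecast depends only on past averages, it cannot react to an unusual day, which is one way to see the limitation with non-linear and complex traffic data noted above.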
Compared with the above traditional models, traditional machine learning techniques can efficiently model complex non-linear traffic data. Typical examples are SVM 
and support vector regression (SVR) 
. These models use kernel functions to map low-dimensional non-linear data into a high-dimensional space, where traffic data characteristics can be evaluated for prediction. However, the choice of kernel function strongly affects the performance of the predictive model. In addition, Bayesian models 
, K nearest neighbours 
and Artificial Neural Networks (ANN) 
have been applied to traffic flow prediction. A significant drawback of traditional machine learning techniques is their reliance on feature engineering and the experience of experts 
. Moreover, it is difficult to improve the efficiency of these traditional methods when processing and evaluating complex, highly non-linear data 
. Currently, deep learning techniques in transportation have yielded good results, especially in image processing and natural language processing.
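The kernel mapping described above for SVM and SVR can be sketched with a small example. A degree-2 polynomial kernel, evaluated directly on two-dimensional inputs, gives the same value as an inner product in an explicit three-dimensional feature space. The input values (imagined here as normalized speed and volume) and all function names are assumptions made for illustration; an RBF kernel is included only to show that a different kernel yields a different similarity measure.

```python
import math


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))


def poly2_kernel(x, z):
    """Degree-2 polynomial kernel k(x, z) = (x . z)^2."""
    return dot(x, z) ** 2


def poly2_features(x):
    """Explicit feature map whose inner product equals poly2_kernel (2-D input)."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)


def rbf_kernel(x, z, gamma=1.0):
    """Gaussian (RBF) kernel k(x, z) = exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((p - q) ** 2 for p, q in zip(x, z))
    return math.exp(-gamma * sq_dist)


# Hypothetical normalized (speed, volume) observations.
x = (0.6, 0.3)
z = (0.2, 0.9)

k_implicit = poly2_kernel(x, z)                             # computed in 2-D
k_explicit = dot(poly2_features(x), poly2_features(z))      # computed in 3-D
print(k_implicit, k_explicit, rbf_kernel(x, z))
```

The two polynomial values agree up to floating-point rounding, which is why a kernel method never needs to construct the high-dimensional features explicitly.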
Transportation researchers are now applying deep learning methods to traffic data mining, exploiting temporal and spatial correlation. Previous research by 
applied Deep Belief Networks (DBN) and Stacked Autoencoders (SAEs), extending and deepening the network layers to learn the features of traffic flow data. Subsequently, researchers such as 
combined traffic flow and weather information to enhance the predictive performance of the DBN model. Models such as Long Short-Term Memory (LSTM) 
, Gated Recurrent Unit Network (GRU) 
, and Nonlinear Autoregressive with External Input (NARX) 
were applied to capture the temporal correlation of traffic flow data and thereby improve traffic flow prediction. However, these predictive models do not consider the spatial relationships in the structure of the traffic network. Convolutional Neural Networks (CNNs) 
have made significant headway in computer vision, and transportation researchers have gone further by applying CNNs to traffic flow prediction to capture local spatial characteristics. For example, 
proposed Deep Spatio-Temporal Residual Networks (STResNet) to predict the flow of people in a transportation system.
Few recent surveys offer comprehensive literature reviews of traffic flow prediction in specific road transportation contexts, especially the traffic flow of vehicles at road intersections. For example, 
reviewed the techniques and applications of the past decade and explained in detail ten challenges and issues experienced by pedestrians and motorists in terms of traffic flow. The investigations carried out by 
focused on short-term traffic flow prediction, and the literature reviewed relied primarily on conventional prediction methods. Another study by 
also addressed short-term traffic flow prediction, summarizing the methods applied and making suggestions for future research.
Furthermore, research carried out by 
explained in detail how to acquire traffic data and focused on conventional machine learning techniques. In addition, 
indicated the contributions and research frameworks of traffic flow prediction. The research carried out by 
summarized the applicable models that depend on conventional techniques and some early deep learning techniques. Alexander et al. 
presented a comprehensive survey of deep neural networks for predicting the traffic flow of vehicles. They discussed three well-known deep neural architectures: convolutional, recurrent, and feed-forward neural networks. However, some recent advances in graph-based deep learning were not discussed in their research 
. Likewise, researchers such as 
presented a detailed survey of graph-based deep learning architectures, including their applications to traffic flow. Furthermore, 
surveyed the application of deep learning models to the evaluation and analysis of traffic flow data. However, their survey covers only the prediction of traffic flow and neglects other areas of road transportation. In general, other research on traffic flow prediction in road transportation shares common features, and it is advantageous to consider all areas of traffic flow. There is therefore still insufficient research on traffic flow prediction, especially on prediction using heuristics and nature-inspired algorithms.
Comparing different model specifications shows that test results are important in demonstrating the usefulness of a proposed prediction model. For example, 
examined the usefulness and effectiveness of recent comparative research on short-term traffic flow forecasting. They stated that not all model comparisons are meaningful, especially comparisons between a complex non-linear model and a simple linear model. In addition, the accuracy of a model must be balanced against its simplicity and suitability (Occam’s razor). In their research, 
recommend that, although model accuracy is important, it should not be the only yardstick for choosing a methodology for predicting the traffic flow of vehicles. Other factors should also be considered, such as the time, effort, techniques, and expertise required to develop the model, as well as its transferability and its adaptability to changes in the temporal behaviour of traffic flow.
Even though choosing the “best” model from a group of baseline models through testing and comparison is important, a practical alternative is to combine traffic flow predictions using a heuristic or metaheuristic approach. Combining forecasts is useful when no single well-specified model is likely to be found, as is often the case when forecasting complex traffic datasets. Several researchers in traffic flow forecasting have combined predictive models: 
offered statistical guidelines for traffic flow forecasting by dynamically switching between different models; however, they did not provide combined forecasts of traffic flow. Furthermore, 
combined traffic flow forecasts from two neural networks by applying Bayes' rule. In their research, 
investigated the combination of traffic flow predictions from various types of predictive models, while 
applied a fuzzy logic model to combine traffic flow forecasts. The research of 
was based on combining forecasts from three models by applying neural networks.
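To make the idea of combining forecasts concrete, the sketch below weights each model's forecast by the inverse of its past mean squared error. This is a simple, commonly used combination heuristic and is not the Bayesian, fuzzy logic, or neural network scheme used in the studies cited above. The model names, error values, and forecasts are hypothetical, and the sketch assumes every model has a non-zero past error.

```python
def inverse_mse_weights(errors_by_model):
    """Weight each model by 1 / MSE of its past errors, normalized to sum to 1.

    errors_by_model: dict mapping a model name to a list of past forecast
    errors (forecast minus observation). Assumes each MSE is non-zero.
    """
    inverse = {}
    for name, errors in errors_by_model.items():
        mse = sum(e * e for e in errors) / len(errors)
        inverse[name] = 1.0 / mse
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def combine_forecasts(forecasts, weights):
    """Weighted average of the individual model forecasts."""
    return sum(weights[name] * forecasts[name] for name in forecasts)


# Hypothetical past errors (vehicles per hour) for two baseline models.
past_errors = {"arima": [10.0, -10.0], "ann": [5.0, -5.0]}
weights = inverse_mse_weights(past_errors)

# Hypothetical next-step forecasts from the same two models.
combined = combine_forecasts({"arima": 400.0, "ann": 420.0}, weights)
print(weights, combined)
```

In this example the model with the smaller past error receives the larger weight, so the combined forecast lies closer to its prediction.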
Traffic Flow Patterns at a Signalized Road Intersection
This subsection describes the use of a time-space diagram (Figure 1) to explain the traffic flow patterns at a four-way signalized road intersection.
Figure 1. Fundamental concepts of traffic flow at a signalized road intersection.
When drivers arrive at a signalized road intersection, their response to the traffic lights is key to understanding the traffic flow patterns at the intersection: how drivers respond when the traffic lights turn red, what happens at the beginning of the green interval, and how the queue of vehicles clears the intersection without further traffic control delay. This cycle repeats as the traffic lights turn from red to green, then to yellow, and back to red. These are the basic concepts behind the traffic flow of vehicles at signalized road intersections. To explain them, we use a time-space diagram together with several assumptions. The assumption diagrams can be found in the book written by 
Let us assume that three vehicles are traveling at a uniform speed and are approaching a signalized road intersection. The “space” between the vehicles and the road intersection is shown on the y-axis, while the time is on the x-axis. The three circles display the traffic lights. These traffic lights can be either green, yellow, or red, depending on real-time traffic flow.
These three vehicles have been traveling at a uniform speed. These vehicles’ trajectories are parallel and linear. The traffic lights turn red as these vehicles reach the road intersection.
As the traffic lights turn red, the three vehicles approaching the intersection have to stop, and their speed drops sharply. Two incoming vehicles join the three vehicles at the road intersection, making a queue of five vehicles. The vehicles have decelerated, and their speed is zero. In Assumption 3, as the speed of the vehicles drops because the traffic lights have turned red, the time spent at the road intersection increases.
As the traffic lights turn green, the vehicles already waiting in a queue at the road intersection start accelerating and driving into the intersection.
The vehicles arriving at the road intersection after the queue has cleared will not be delayed, as the traffic lights are still green.
This is when vehicles arrive at the road intersection as the traffic lights turn yellow. They gradually reduce their speed as they approach the intersection, since the traffic lights may turn red at any moment.
Now that the traffic lights have turned red, the incoming vehicles must stop, incurring a traffic control delay, and form a new queue.
This is the traffic shockwave produced by the queue of vehicles forming at a road intersection when the traffic lights turn red.
This is a traffic shockwave of vehicles when the traffic lights turn green.
This is the traffic control delay for each vehicle at the intersection. It is measured between the time a vehicle arrives at the road intersection and the time it leaves the intersection.
This is the time between two successive vehicles departing from the queue at the road intersection. It is called the “saturation headway”.
This is the speed of the vehicles as they arrive at and depart from the road intersection.
This is called the time gap. It usually occurs between the departing vehicle and the arriving vehicle.
The driver responses at signalized road intersections are illustrated in the Assumption 9 diagrams.
This is the driver stopping because the traffic light is red.
This is the driver driving through the intersection when the traffic light is green.
This is the driver driving through the intersection when the queue is cleared and no vehicles are waiting at the road intersection.
This is the driver reducing their speed because the traffic light has turned yellow.
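The quantities read from the time-space diagram above can be sketched numerically. The example below takes the arrival and departure times of a queue of five vehicles and computes each vehicle's delay at the intersection and the headways between successive departures. The times are hypothetical, and the delay is simplified to departure time minus arrival time, ignoring the time lost while decelerating and accelerating.

```python
def control_delays(arrivals, departures):
    """Delay of each vehicle: departure time minus arrival time (seconds)."""
    return [leave - arrive for arrive, leave in zip(arrivals, departures)]


def departure_headways(departures):
    """Time between each pair of successive departures (seconds)."""
    return [later - earlier for earlier, later in zip(departures, departures[1:])]


# Hypothetical times in seconds for five queued vehicles.
# The light is assumed to turn green shortly before the first departure.
arrivals = [0.0, 4.0, 8.0, 12.0, 16.0]
departures = [30.0, 32.0, 34.0, 36.0, 38.0]

delays = control_delays(arrivals, departures)
headways = departure_headways(departures)

# A steady headway between queued departures approximates the saturation headway.
mean_headway = sum(headways) / len(headways)
print(delays, headways, mean_headway)
```

Vehicles that join the back of the queue later wait less, which matches the shrinking horizontal gaps between arrival and departure in the time-space diagram.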