1. Introduction
The growth of cities calls for the optimisation of urban logistics systems to enhance efficiency and accommodate the needs of a growing population. Hence, digitalising last-mile delivery and implementing smart city logistics have emerged as prominent and widely discussed subjects. Smart city logistics requires interdisciplinary knowledge, innovative advances in logistics business practices, and the integration of new technologies [1][2]. Moreover, to assess and manage logistics systems effectively, more comprehensive metrics need to be developed and applied.
Last-mile modelling comprises a comprehensive range of elements, spanning from operations modelling and routing to clustering and structural analysis.
2. Modelling Freight Last Mile
Various freight last-mile models have been created, including mathematical models, computer simulation models, and GIS-based models.
2.1. Vehicle Routing Problem
Freight routing is commonly formulated as a vehicle routing problem (VRP) and solved with established mathematical techniques. Variants of the VRP include the node routing problem (NRP) [3][4], travelling salesman problem (TSP) [5][6][7], arc routing problem (ARP) [8][9][10], rural postman problem (RPP) [11][12], and Chinese postman problem (CPP) [13][14][15]. Locations and routes are simplified as vertices, arcs (directed routes), and edges (undirected routes). These mathematical models have a prescribed objective function and constraints [16]. They are used to optimise the route in terms of travel time, travel distance, and transportation costs.
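As an illustration of a prescribed objective function and constraint, the following minimal sketch solves a toy TSP by exhaustive search. It is not taken from any of the cited studies: the coordinates are hypothetical, travel distance is assumed to be Euclidean, and the function names are invented for this example. Exhaustive search is only feasible for a handful of stops, which is one reason practical models rely on simplifications.

```python
from itertools import permutations
from math import dist


def tour_length(depot, order):
    """Objective function: total Euclidean length of depot -> stops -> depot."""
    path = [depot, *order, depot]
    return sum(dist(a, b) for a, b in zip(path, path[1:]))


def solve_tsp_brute_force(depot, stops):
    """Enumerate every visiting order and keep the shortest closed tour.

    Constraint: each stop is visited exactly once, which the use of
    permutations guarantees. The search grows factorially with the
    number of stops, so this is only a teaching example.
    """
    best = min(permutations(stops), key=lambda order: tour_length(depot, order))
    return list(best), tour_length(depot, best)


# Hypothetical coordinates (km), for illustration only.
depot = (0.0, 0.0)
stops = [(0.0, 3.0), (4.0, 3.0), (4.0, 0.0)]
order, length = solve_tsp_brute_force(depot, stops)
print(order, length)
```

Replacing `tour_length` with a travel-time or cost function would change the objective without changing the search.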
The benefit of mathematical models is that they are easy to formulate. However, to keep the computation tractable, they need to make simplifying assumptions [17][18]. Furthermore, stochastic parameters and the variety of systems and operations are difficult to incorporate into these models [19], so it can be challenging to represent real logistics cases.
2.2. Computer Simulation Techniques
Computer simulation techniques are also applied to solve logistics problems. Typical techniques are discrete-event simulation (DES) and agent-based simulation (ABS). The main difference between them is that ABS takes into account individual behaviour, whereas DES structures the entire system [20]. These techniques can handle randomness by using probability distributions and the Monte Carlo method, and can capture the detailed operational behaviour of systems. Therefore, the results can be more realistic and can reflect the variability of the system [21]. Hence, these simulation models tend to be more versatile than conventional mathematical models for describing complex systems [16].
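The role of probability distributions and the Monte Carlo method can be sketched as follows. This is not a full DES engine and does not reproduce any cited model: it only samples the duration of one delivery round many times. The triangular travel-time and exponential service-time distributions, their parameters, and the function names are all assumptions made for illustration.

```python
import random
import statistics


def simulate_route_minutes(n_stops, rng):
    """One Monte Carlo replication of a delivery round (values in minutes)."""
    total = 0.0
    for _ in range(n_stops):
        # Assumed distributions, not taken from the source.
        travel = rng.triangular(2.0, 10.0, 4.0)  # low, high, mode
        service = rng.expovariate(1.0 / 3.0)     # mean of 3 minutes per drop
        total += travel + service
    return total


def monte_carlo(n_stops, replications, seed=42):
    """Repeat the replication and summarise the spread of round durations."""
    rng = random.Random(seed)
    samples = [simulate_route_minutes(n_stops, rng) for _ in range(replications)]
    return statistics.mean(samples), statistics.stdev(samples)


mean, sd = monte_carlo(n_stops=20, replications=2000)
print(f"mean {mean:.1f} min, std dev {sd:.1f} min")
```

The standard deviation is the point of the exercise: a deterministic model would report a single duration, whereas the sampled output shows how much the round can vary from day to day.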
DES has been applied to various areas of logistics. Marinov and Viegas designed a DES model of a rail network in SIMUL8 [22], decomposing the network into flat-shunted rail yards, rail freight terminals, double-track railway lines, and rail passenger stations. Automated guided vehicles (AGVs) in a production logistics system were evaluated using DES [23], with the number, speed, and load capacity of AGVs and the logistics buffers optimised with respect to resource allocation. DES was also used in service network design to include stochastic transportation times [24]; in that work, a vehicle routing problem was incorporated to evaluate intermodal and unimodal transport in terms of costs and delays.
One approach to simulating last-mile delivery across multiple clusters is to develop a DES model based on intersections [25]. However, because of the effort involved in developing such a model, it is not well suited to representing the intricate activities that occur within urban regions. A two-tier architecture incorporating cluster analysis was proposed to simulate freight operations for a large cluster [26]. That study outlines the methodology for modelling freight operations and how the cluster approach can be used to construct a DES model.
While computer simulation techniques are powerful, they have the limitation of being unable to conduct any geographical analysis. They need the delivery network to be predefined before building the simulation model; hence, they are unable to optimise geographic parameters.
2.3. Geographic Information System
A geographic information system (GIS) is a tool for managing and analysing large sets of spatial data. GIS has aided in solving various logistics problems, including hub location problems [27], emergency logistics distribution [28], urban logistics system design [29], reduction of CO2 emissions from distribution [30], and logistics process monitoring [31]. Clustering analyses have also been conducted with GIS to display large spatial datasets [32][33][34]. Specifically, GIS has been used to develop a last-mile delivery model by solving the TSP [25].
However, GIS-based last-mile delivery models have some limitations, particularly their requirement for deterministic input data: they cannot accommodate variability or stochasticity. Although the method can show routes, it is difficult to include the variability of freight operations, such as random customer locations and freight demands, because this potentially results in different routes each day. While such daily changes do reflect how real drivers adapt, they are unhelpful when considering routes over a longer period of time, such as a year’s operation. In addition, other important freight operations, including freight consolidation and pickup-and-delivery (PUD) activities, cannot be incorporated into the model.
3. Clustering Methods
Clustering analysis is important to freight transport modelling because the models involve the distribution of customer locations. To analyse a group of homogeneous locations and simplify the network, clustering methods are used to partition and differentiate locations. Clustering has been used to group data points in disciplines such as biology [35], medicine [36], chemistry [37], and computer science [38].
Clustering methods have been applied to logistics problems to group locations. The most common methods are density-based clustering and K-means clustering. Density-based clustering works by detecting areas where points are concentrated and separating them from areas where points are sparse [39]. Outlier points that do not belong to any cluster are labelled as noise. Additionally, the timestamps of the points can be used to find groups of points that cluster in both space and time. This approach relies on unsupervised machine learning algorithms, which discern patterns autonomously by considering the spatial proximity of neighbouring points. Several algorithms implement this method, including density-based spatial clustering of applications with noise (DBSCAN) [40], hierarchical density-based spatial clustering of applications with noise (HDBSCAN) [41], and ordering points to identify the clustering structure (OPTICS) [42]. Density-based clustering has been applied to select providers for healthcare manufacturers [43] and to cluster areas for crowdsourcing [44].
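The density-based idea of concentrated areas, sparse areas, and noise points can be sketched with a compact DBSCAN written in plain Python. This is a teaching sketch, not the implementation used in any cited study: the parameter names `eps` and `min_pts`, the sample coordinates, and the brute-force neighbour search are assumptions chosen for readability rather than speed.

```python
from math import dist

NOISE = -1


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)

    def neighbours(i):
        # Indices of all points within eps of point i (includes i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE          # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point: reachable but not core
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:   # j is a core point, keep expanding
                queue.extend(nbrs)
    return labels


# Hypothetical customer coordinates: two dense groups and one isolated address.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10),
          (5, 5)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

The number of clusters is not specified in advance; it follows from `eps` and `min_pts`, and the isolated address is reported as noise rather than being forced into a cluster.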
The Euclidean clustering method is similar to the density-based clustering method, as both consider Euclidean distances. However, there are notable distinctions between the two approaches. While the density-based clustering method delineates distances based on circular regions, the Euclidean clustering method calculates distances between individual data points [45]. Consequently, the density-based clustering method yields clusters of varying sizes, while the Euclidean clustering method produces clusters of more uniform sizes with reduced noise. Nevertheless, the presence of noise impacts the formation of clusters.
Consequently, the density-based clustering method is particularly valuable in scenarios characterised by fluctuating cluster densities, irregular geometries, and the need for robust noise point identification. In contrast, Euclidean methods find applicability in situations where clusters are anticipated to exhibit consistent sizes and spherical shapes, especially when the number of clusters is either predetermined or can be reliably estimated. In freight logistics, the considerable diversity in customer locations results in varying cluster densities, shapes, and sizes. In such cases, the employment of the density-based clustering method proves to be more advantageous and appropriate.
The K-means clustering method is a centroid-based clustering method. Centroids are first initialised, and each data point is then assigned to its nearest centroid so that the sum of squared distances between points and their centroids is minimised [46]. The goal of K-means is to group similar data points and assign them to clusters, where each cluster is represented by its centroid. Applications of the K-means algorithm include clustering restaurants around distribution centres [47], hub location optimisation [48], urban freight loading bay management [49], and locating urban facilities [50]. One significant disadvantage of the K-means clustering method is that similarity is not considered in the development of clusters [51].
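The centroid-based procedure can be sketched with Lloyd's algorithm, the standard iteration usually meant by K-means. The code below is a minimal illustration under stated assumptions: initial centroids are drawn at random from the data, distances are Euclidean, and the sample coordinates and function name are hypothetical.

```python
import random
from math import dist


def kmeans(points, k, iterations=100, seed=0):
    """Lloyd's algorithm: returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumed initialisation: k random points
    labels = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break                      # assignments are stable: converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids, labels


# Hypothetical customer coordinates forming two compact groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, labels = kmeans(points, k=2)
print(centroids, labels)
```

Unlike density-based clustering, the number of clusters `k` has to be chosen beforehand, and every point is assigned to some cluster, so there is no notion of noise.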
In comparison with the density-based clustering method, the K-means clustering technique strives to minimise the distances between data points and their cluster centroids, and is relevant in situations where clusters are expected to display consistent sizes and shapes. Therefore, within the context of freight logistics, the density-based clustering method emerges as a more appropriate choice for accommodating data variations.
4. Hub and Spoke Architecture
The hub-and-spoke (H&S) architecture has been used to describe urban freight transport. It is occasionally combined with clustering analysis. The benefit of applying a clustering algorithm to freight delivery models is that customer locations can be represented as clusters in an H&S architecture, with corresponding truck allocation plans. In an H&S architecture, hubs refer to warehouse points and are connected by travel routes, which are simplified as spokes. The H&S architecture has been applied to structure logistics systems, for example, freight line-haul transport [52][53], intra-city metro logistics [54], retail distribution systems [55], and parcel mail delivery [56][57]. A genetic-based fuzzy C-means clustering method was applied to develop an H&S model for an underground logistics system [58]. These are mathematical models that were used to optimise systems in terms of lead times, truck utilisation rates, and transportation costs. Last-mile routes formulated by these models are deterministic, whereas in practical freight operations last-mile routes are stochastic.
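A deterministic H&S structure of the kind described above can be sketched by attaching each customer cluster to its nearest hub. The hub names, cluster centroids, Euclidean spoke lengths, and the function name are all hypothetical; the cited models use their own formulations and objectives.

```python
from math import dist


def build_spokes(hubs, clusters):
    """Attach each cluster centroid to its nearest hub.

    Returns {hub_name: [(cluster_name, spoke_length), ...]}.
    """
    network = {name: [] for name in hubs}
    for cluster_name, centroid in clusters.items():
        nearest = min(hubs, key=lambda h: dist(hubs[h], centroid))
        network[nearest].append((cluster_name, dist(hubs[nearest], centroid)))
    return network


# Hypothetical hub (warehouse) and cluster centroid coordinates, in km.
hubs = {"H1": (0.0, 0.0), "H2": (10.0, 0.0)}
clusters = {"C1": (3.0, 4.0), "C2": (10.0, 5.0), "C3": (1.0, 0.0)}
network = build_spokes(hubs, clusters)
print(network)
```

Because the centroids are fixed inputs, the resulting spokes are the same on every run, which mirrors the deterministic character of these models.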
There is, however, little to no use of this structure in simulating the last-mile route. This is probably because many last-mile delivery problems have an approximately homogeneous distribution of addresses, so it is difficult to distinguish any structure. Also, last-mile delivery is notorious for the day-to-day variability of addresses and of consignment attributes such as weight and volume. Hence, there is a need to formulate a last-mile delivery model with stochastic parameters.
5. Gaps in Modelling Last-Mile Delivery
Modelling urban last-mile delivery operations has historically been challenging. The existing H&S-based mathematical models and GIS-based models have been designed to address specific delivery scenarios with a large number of deterministic values; they are easy to construct and useful in specific cases. However, real-world freight operations are far more complex, and these simplified models prove inadequate for informed decision making by freight companies [59]. The complexity arises from the large number and unpredictable nature of customer locations, with high day-to-day variability of both addresses and the weight (or volume) of consignments.
Although computer simulation techniques such as DES have been employed to simulate freight logistics for decision making, the geographic question, namely the variability in routing, remains problematic in these techniques. Consequently, there is a need for a method that can support computer simulation techniques in running last-mile simulations while incorporating stochastic routing variables, so as to facilitate more effective decision making about truck allocation.