System Architecture for Autonomous Vehicles

System Architecture for Autonomous Vehicles: Comparison

Please note this is a comparison between Version 2 by Conner Chen and Version 1 by Qasim Zeeshan Ahmed.

Technology facilitates humans, improves productivity and leads to a better quality of life. Technological developments and automation in vehicular networks will lead to better road safety and lower congestion in present urban areas, where the traditional transport system is becoming increasingly disorganised and inefficient. Therefore, the intelligent transport systems (ITS) concept has been proposed, with the aim of improving traffic safety and providing different services to its users. There has been considerable research in ITS, resulting in significant contributions.

Keywords: autonomous vehicles; Bluetooth; dedicated short-range communications

1. System Architecture for Autonomous Vehicles

An ordinary vehicle can be converted into an autonomous one by adding components, including sensors, that allow the vehicle to make its own decisions by sensing the environment and controlling its own mobility.

Figure 4 illustrates the overall communication process/protocol in AVs and also lists the sensors, actuators, hardware and the software control required. The protocol architecture, explained below, is composed of four main stages and enables a Level 5 fully autonomous vehicle where all the users are passengers.


Figure 4. System Architecture for AVs.

Perception: This stage involves sensing the AV's surroundings through various sensors and also detecting its own position with respect to the surroundings. In this stage, some of the sensors used by the AV are RADAR, LiDAR, camera, real-time kinematic (RTK), etc. The information from these sensors is then passed to the recognition modules, which process it. Generally, the AV consists of an adaptive detection and recognition framework (ADAF), a control system, a lane departure warning system (LDWS), traffic sign recognition (TSR), unknown obstacles recognition (UOR), a vehicle positioning and localisation (VPL) module, etc. This processed information is fused and passed to the decision and planning stage.
Decision and Planning: Utilising the data gathered in the perception process, this stage decides, plans and controls the motion and behaviour of the AV. This stage is analogous to the brain and makes decisions such as path planning, action prediction, obstacle avoidance, etc. The decision is based on current as well as past information, including real-time map information, traffic details and patterns, information provided by the user, etc. There may be a data log module that records errors and information for future reference.
Control: The control module receives information from the decision and planning module and performs functions/actions related to the physical control of the AV, such as steering, braking and accelerating.
Chassis: The final stage includes the interface with the mechanical components mounted on the chassis, such as the accelerator pedal motor, brake pedal motor, steering wheel motor and gear motor. All these components are signalled to and controlled by the control module.
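
The four stages above can be read as a simple data flow. The following is a minimal sketch, not the architecture's actual software: every class, function, threshold and gain in it is a hypothetical name or value chosen for illustration, and the "fusion" is reduced to taking the nearest reported obstacle.

```python
from dataclasses import dataclass


@dataclass
class Perception:
    """Fused view of the surroundings (hypothetical fields)."""
    obstacle_distance_m: float
    speed_mps: float


@dataclass
class Plan:
    target_speed_mps: float


@dataclass
class ActuatorCommand:
    throttle: float  # 0..1
    brake: float     # 0..1


def perceive(radar_range_m: float, lidar_range_m: float, speed_mps: float) -> Perception:
    # Perception stage: conservative fusion, trust the nearest reported obstacle.
    return Perception(min(radar_range_m, lidar_range_m), speed_mps)


def decide(p: Perception, cruise_speed_mps: float = 15.0, safe_gap_m: float = 20.0) -> Plan:
    # Decision and planning stage: stop if an obstacle is inside the
    # (assumed) safety gap, otherwise hold the cruise speed.
    target = 0.0 if p.obstacle_distance_m < safe_gap_m else cruise_speed_mps
    return Plan(target)


def control(p: Perception, plan: Plan, gain: float = 0.1) -> ActuatorCommand:
    # Control stage: proportional controller on the speed error,
    # clamped to [-1, 1] and split into throttle and brake.
    error = plan.target_speed_mps - p.speed_mps
    effort = max(-1.0, min(1.0, gain * error))
    return ActuatorCommand(throttle=max(effort, 0.0), brake=max(-effort, 0.0))


def chassis(cmd: ActuatorCommand) -> str:
    # Chassis stage: here the pedal motors are only represented by a string.
    return f"throttle={cmd.throttle:.2f} brake={cmd.brake:.2f}"


if __name__ == "__main__":
    p = perceive(radar_range_m=50.0, lidar_range_m=12.0, speed_mps=10.0)
    print(chassis(control(p, decide(p))))
```

Each function consumes only the output of the previous stage, which mirrors the one-way hand-off from perception to chassis described in the list.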

After discussing the overall communication and sensor architecture of an AV, we discuss the design, functionality and utilisation of some main sensors.

1.1. Ultrasonic Sensors

These sensors use ultrasonic waves and operate in the range of 20–40 kHz. The waves are generated by a magneto-resistive membrane and are used to measure the distance to an object. The distance is obtained from the time-of-flight (ToF) between the emitted wave and its echo. Ultrasonic sensors have a very limited range, generally less than 3 m. The sensor output is updated every 20 ms, which does not comply with the strict QoS constraints of an ITS. These sensors are directional and provide a very narrow beam detection range. Therefore, multiple sensors are needed to get a full-field view. However, multiple sensors will influence each other and can cause extreme ranging errors. The general solution is to give each sensor a unique signature or identification code, which is used to discard the echoes of other ultrasonic sensors operating nearby. In AVs, these sensors are utilised to measure short distances at low speeds. For example, they are used for SPA and LDWS. Moreover, these sensors work satisfactorily with any material (independent of color), in bad weather conditions and even in dusty environments.
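
The ToF calculation mentioned above amounts to halving the round-trip path of the echo. The sketch below illustrates this under a stated assumption: the speed of sound is taken as 343 m/s (dry air at about 20 °C), a value that does not come from the text, and the function names are hypothetical.

```python
SPEED_OF_SOUND_MPS = 343.0  # assumption: dry air at roughly 20 degrees C


def tof_distance_m(round_trip_s: float, wave_speed_mps: float = SPEED_OF_SOUND_MPS) -> float:
    """Distance to the object from the emit-to-echo time.

    The wave travels to the object and back, so the one-way
    distance is half of speed * time.
    """
    if round_trip_s < 0:
        raise ValueError("round-trip time must be non-negative")
    return wave_speed_mps * round_trip_s / 2.0


def round_trip_time_s(distance_m: float, wave_speed_mps: float = SPEED_OF_SOUND_MPS) -> float:
    """Inverse relation: how long an echo from a given distance takes."""
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    return 2.0 * distance_m / wave_speed_mps


if __name__ == "__main__":
    print(tof_distance_m(0.010))      # echo after 10 ms
    print(round_trip_time_s(3.0))     # echo delay for an object at 3 m
```

Under the assumed speed of sound, an echo from an object 3 m away takes roughly 17.5 ms to return, which is of the same order as the 20 ms update interval quoted above.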

1.2. RADAR: Radio Detection and Ranging

RADARs, in AVs, are used to scan the surroundings to detect the presence and location of cars and objects. RADARs operate in the millimetre-wave (mm-Wave) spectrum and are typically used in military and civil applications such as airports or meteorological systems. In modern vehicles, different frequency bands such as 24, 60, 77 and 79 GHz are employed, and they can measure a range from 5 to 200 m. The distance between the AV and the object is calculated by measuring the ToF between the emitted signal and the received echo. In AVs, the RADARs use an array of micro-antennas that generate a set of lobes to improve the range resolution as well as the detection of multiple targets. mm-Wave RADAR has higher penetrability and a wider bandwidth, and it can accurately measure short-range targets in any direction by utilising the variation in Doppler shift. Owing to their longer wavelength, mm-Wave radars have an anti-blocking and anti-pollution capability that allows them to cope with rain, snow, fog and low light. Furthermore, mm-Wave radars can measure the relative velocity using the Doppler shift. This ability makes them suitable for a wide range of AV applications such as obstacle detection, pedestrian recognition and vehicle recognition. Some applications of RADARs in AVs are forward cross traffic alert (FCTA), lane change assistance (LCA), blind spot detection (BSD), rear cross traffic alert (RCTA), etc. mm-Wave RADAR also has some disadvantages, such as a reduced field-of-view (FoV), lower precision and more false alarms caused by emitted signals bouncing off the surroundings.
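
The two measurements described above, range from ToF and relative velocity from the Doppler shift, can be illustrated with the standard monostatic radar relations R = c·t/2 and v = c·f_d/(2·f_c). This is a sketch only: the function names are hypothetical, and the relation assumes the target speed is far below the speed of light and that only the radial velocity component is of interest.

```python
SPEED_OF_LIGHT_MPS = 299_792_458.0


def radar_range_m(round_trip_s: float) -> float:
    """Range from the delay between the emitted signal and the received echo."""
    return SPEED_OF_LIGHT_MPS * round_trip_s / 2.0


def doppler_shift_hz(relative_velocity_mps: float, carrier_hz: float) -> float:
    """Two-way Doppler shift for a target closing at the given radial speed."""
    return 2.0 * relative_velocity_mps * carrier_hz / SPEED_OF_LIGHT_MPS


def relative_velocity_mps(doppler_hz: float, carrier_hz: float) -> float:
    """Radial relative velocity recovered from a measured Doppler shift."""
    return doppler_hz * SPEED_OF_LIGHT_MPS / (2.0 * carrier_hz)


if __name__ == "__main__":
    carrier = 77e9  # one of the automotive bands listed above
    shift = doppler_shift_hz(10.0, carrier)
    print(f"Doppler shift at 10 m/s: {shift:.1f} Hz")
    print(f"Recovered velocity: {relative_velocity_mps(shift, carrier):.3f} m/s")
    print(f"Echo delay for 200 m: {2 * 200.0 / SPEED_OF_LIGHT_MPS * 1e6:.3f} us")
```

At a 77 GHz carrier, a closing speed of 10 m/s corresponds to a shift of a few kilohertz, which is why relative velocity can be read directly from the received spectrum.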

1.3. LiDAR: Light Detection and Ranging

LiDAR utilises the 905 and 1550 nm spectra. The 905 nm spectrum may cause retinal damage to the human eye; therefore, modern LiDAR is operated in the 1550 nm spectrum to minimise this risk. The maximum working distance of LiDAR is up to 200 m. LiDAR can be categorised into 2D, 3D and solid-state LiDAR. A 2D LiDAR uses a single laser beam diffused over a mirror that rotates at high speed. A 3D LiDAR obtains a 3D image of the surroundings by locating multiple lasers on the pod. At present, 3D LiDAR can produce reliable results with an accuracy of a few centimetres by integrating 4–128 lasers with a horizontal movement of 360 degrees and a vertical movement of 20–45 degrees. The solid-state LiDAR uses a micro-electromechanical system (MEMS) circuit with micro-mirrors to synchronise the laser beam to scan the horizontal FoV several times. The laser light is diffused with the help of a micro-mirror to create the vertical projection of the object. The received signal is captured by a photo-detector, and the process repeats until the complete image of the object is created. LiDAR is used for positioning, obstacle detection and environmental reconstruction. 3D LiDAR sensors are playing an increasingly significant role in the AV system. As a result, LiDARs can be used for adaptive cruise control (ACC), 2D or 3D maps and object identification and avoidance. A roadside LiDAR system has been shown to reduce vehicle-to-pedestrian (V2P) crashes at both intersections and non-intersection areas. One study employs a 16-line, real-time, computationally efficient LiDAR system and proposes a deep auto-encoder artificial neural network (DA-ANN), which achieves an accuracy of 95% within a range of 30 m. In another, a 64-line 3D LiDAR utilising a support vector machine (SVM)-based algorithm is shown to improve pedestrian detection. Although LiDAR is superior to mm-Wave radar in measurement accuracy and 3D perception, its performance suffers under severe weather conditions such as fog, snow and rain. In addition, its detection range depends on the reflectiveness of the object.

1.4. Cameras

Cameras in AVs can be classified as either visible-light based or infrared-based, depending on the wavelength of the device. Cameras use image sensors built with one of two technologies: charge-coupled device (CCD) or complementary metal-oxide-semiconductor (CMOS). The maximum range of a camera is around 250 m, depending on the quality of the lens. Visible (VIS) cameras use the same wavelengths as the human eye, i.e., 400–780 nm, divided into three bands: Red, Green and Blue (RGB). To obtain stereoscopic vision, two VIS cameras with known focal length are combined to generate a new channel with depth (D) information. This feature allows the camera (RGBD) to obtain a 3D image of the scene around the vehicle.
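
The depth channel described above is commonly derived from the standard pinhole stereo relation Z = f·B/d, where f is the focal length in pixels, B the distance between the two cameras (baseline) and d the disparity, i.e. the horizontal pixel shift of the same point between the two images. The sketch below assumes rectified, parallel cameras; the function name and the numeric values are illustrative and do not come from the text.

```python
def depth_from_disparity_m(focal_length_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth of a point seen by two rectified, parallel cameras.

    Assumes the pinhole model: depth = focal length * baseline / disparity.
    A larger disparity means the point is closer to the cameras.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive (zero means the point is at infinity)")
    return focal_length_px * baseline_m / disparity_px


if __name__ == "__main__":
    # Hypothetical rig: 700 px focal length, 12 cm baseline.
    for d in (42.0, 8.4, 2.1):
        print(f"disparity {d:5.1f} px -> depth {depth_from_disparity_m(700.0, 0.12, d):6.2f} m")
```

Because depth is inversely proportional to disparity, a one-pixel matching error matters far more for distant objects than for nearby ones, which is one reason stereo depth degrades with range.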

The infrared (IR) camera uses passive sensors with a wavelength between 780 nm and 1 mm. The IR sensors in AVs provide vision control in peak illumination. This camera assists AVs in BSD, side view control, accident recording and object identification. Nevertheless, the performance of the camera changes in bad weather conditions such as snow, fog and moment-of-light variation.

The main advantage of a camera is that it can accurately gather and record the texture, color distribution and contour of the surroundings. However, the angle of observation is limited by the narrow view of the camera lens. Therefore, multiple cameras have been adopted in AVs to monitor the surrounding environment. A three-stage RGBD architecture using deep learning and convolutional neural networks was proposed by Ferraz et al. for vehicle and pedestrian detection. However, this requires the AV to process a huge amount of data. Currently, AVs do not possess such computational resources; therefore, computational offloading may be an appropriate solution.

Table 4 summarises the challenges of the discussed sensor technologies. It can be observed in Table 4 that the detection capability and reliability of the various sensors are limited in different environments. This limitation can be overcome, and both the accuracy and the reliability of target detection improved, through multi-sensor fusion. Radar–camera (RC), Camera–LiDAR (CL), Radar–LiDAR (RL) and Radar–Camera–LiDAR (RCL) combinations have been proposed, in which different sensors are combined to improve the perception of the environment. Furthermore, one study develops three different sensor plans based on range, cost and balance function, each combining several different sensors. In the first plan, four cameras, a mm-Wave RADAR, 32- and 4-layer LiDAR and a GPS+IMU are employed. In the second plan, four cameras, three mm-Wave RADARs, a four-layer LiDAR and a GPS+IMU are utilised. Finally, in the third plan, two regular cameras, three mm-Wave RADARs, a surrounding camera and a twelve-unit ultrasonic sensor are utilised.
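
One simple way to see why multi-sensor fusion improves accuracy is inverse-variance weighting, which coincides with the measurement update of a Kalman filter for a static quantity. The sketch below is an illustration under stated assumptions, not a method taken from the cited fusion schemes: each sensor is assumed to report an unbiased range with independent Gaussian noise of known variance, and the function name and numbers are hypothetical.

```python
from typing import Iterable, Tuple


def fuse_estimates(estimates: Iterable[Tuple[float, float]]) -> Tuple[float, float]:
    """Fuse (value, variance) pairs by inverse-variance weighting.

    Returns the fused value and its variance. Under the independence
    assumption, the fused variance is never larger than the smallest
    input variance.
    """
    pairs = list(estimates)
    if not pairs:
        raise ValueError("at least one estimate is required")
    if any(var <= 0 for _, var in pairs):
        raise ValueError("variances must be positive")
    weights = [1.0 / var for _, var in pairs]
    total = sum(weights)
    value = sum(w * v for w, (v, _) in zip(weights, pairs)) / total
    return value, 1.0 / total


if __name__ == "__main__":
    radar = (10.0, 4.0)   # hypothetical range in m, variance in m^2
    lidar = (12.0, 1.0)
    value, variance = fuse_estimates([radar, lidar])
    print(f"fused range = {value:.2f} m, variance = {variance:.2f} m^2")
```

The fused estimate is pulled towards the more precise sensor, and its variance is lower than that of either input, which is the basic benefit that the RC, CL, RL and RCL combinations aim for.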

Table 4. Comparison of sensors and their challenges.

Sensor	Challenges
Ultrasonic Sensors	Cannot be used at high speed; maximum range is 2 m [15]; very low resolution compared to RADAR.
RADAR	Generates a large number of false alarms due to surrounding metal objects; range is between 5 m and 200 m; generated images are of low resolution compared to cameras and LiDARs.
LiDAR	Performance suffers severely in bad weather; maximum range is 200 m [15]; very expensive.
Cameras	Computation overheads, which burden time-critical applications [49]; maximum range is 250 m depending on the lens [15]; a big range-resolution gap exists between cameras and RADARs or LiDARs.

1.5. GNSS and GPS, IMU: Global Navigation Satellite System and Global Positioning System, Inertial Measurement Unit

This technology can determine the exact position of the AV and helps it navigate. GNSS utilises a set of satellites orbiting the earth to localise the vehicle. The system provides the AV's position, speed and the exact time. It operates by calculating the ToF between the satellite-emitted signal and the receiver. The AV position is usually extracted from the Global Positioning System (GPS) coordinates. The coordinates extracted by GPS are not always accurate; they usually introduce a position error with a mean value of 3 m and a standard deviation of 1 m. The performance is further degraded in urban environments, where the position error can increase up to 20 m, and in some extreme cases the GPS position error is around 100 m. In addition, the RTK system can be used in AVs to precisely calculate the position of the vehicle. Furthermore, dead reckoning (DR) and the inertial position can be used in AVs to determine the position and the direction of the vehicle. A technique known as odometry can be used to measure the position of the vehicle by fixing rotary sensors to its wheels. To make the AV capable of detecting slippage or lateral movements, the inertial measurement unit (IMU) is used; it detects these using data from accelerometers, gyroscopes and magnetometers. The IMU combined with all units can rectify errors and increase the sampling speed of the measuring system. Although the IMU cannot provide the position error unless it is accompanied by the GNSS system, AVs can get information from different sources such as RADAR, LiDAR, IMU, GNSS, UWB and camera to minimise the possibility of error and perform reliable position measurement. GPS can be combined with IMU techniques such as DR and the inertial position to confirm and improve the position estimate of the AV.

1.6. Sensor Fusion

Real-time and accurate knowledge of vehicle position, state and other vehicle parameters such as weight, stability and velocity is important for vehicle handling and safety and, thus, needs to be acquired by the AV using various sensors. The process of sensor fusion is used to obtain coherent information by combining the data obtained from different sensors. The process allows the synthesis of raw data obtained from complementary sources. Therefore, sensor fusion allows the AV to precisely understand its surroundings by combining all the beneficial information obtained from different sensors. The fusion process in AVs is carried out using different types of algorithms, such as Kalman filters and Bayesian filters. The Kalman filter is considered very important for a vehicle to drive independently because it is utilised in different applications such as RADAR tracking, satellite navigation systems and visual odometry.

2. Vehicular Ad-Hoc Networks (VANETs)

VANETs are an emerging sub-class of mobile ad-hoc networks capable of spontaneously creating a network of mobile devices/vehicles [123]. VANETs can be used for vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) communication [124]. The main purpose of this technology is to improve safety on the roads; for example, during hazardous conditions such as accidents and traffic jams, the vehicles can communicate with each other and with the network to share vital information [125,126]. The main components of VANET technology are:

On-board unit (OBU): The OBU is a GPS-based tracking device embedded in every vehicle that communicates with other OBUs and with roadside units (RSUs) [124,126]. To retrieve vital information, the OBU is equipped with many electronic components such as a resource command processor (RCP), sensor devices and user interfaces. Its main goal is to communicate between different RSUs and OBUs via a wireless link [44].

Roadside Unit (RSU): The RSU is a computing unit fixed at specific locations on roads, parking areas and intersections [127]. Its main goal is to provide connectivity between the autonomous vehicle and the infrastructure, and it also assists in vehicle localisation [44,127]. It can also be used to connect vehicles with other RSUs using different network topologies [44]. RSUs have also been powered using ambient energy sources such as solar power [128].

Trusted Authority (TA): The TA is an authority which manages the entire process for VANETs, so that only valid RSUs and vehicle OBUs can register and communicate [129]. It provides security by verifying the OBU ID and authenticating the vehicle. It also detects malicious messages or suspicious behaviour [44].

VANETs have some unique properties which are very different from other ad-hoc technologies.

VANETs have very low discovery latency; as a result, vehicles, even at high speeds, connect to the RSU quickly and rarely face network outage [130,131].

The OBUs move along predictable and regular paths, which helps to determine the actual trajectory of the vehicle at any point in time [131]. The RSUs in VANETs can localise the vehicle, log its path and predict its trajectory to avoid any hazard.

The vehicle sensors and other nodes do not face any energy restrictions because they can extract energy from the vehicle engine.

The use of multicast broadcasting in VANETs allows different vehicles to communicate with each other simultaneously [132].
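
The trajectory-prediction property listed above can be illustrated with the simplest possible predictor: constant-velocity extrapolation from the last two logged positions. This is only a sketch of the idea; the log format, the function name and the constant-velocity assumption are all hypothetical, and the source does not state how RSUs actually predict trajectories.

```python
from typing import List, Tuple

Fix = Tuple[float, float, float]  # (time in s, x in m, y in m), a hypothetical log entry


def predict_position(path_log: List[Fix], horizon_s: float) -> Tuple[float, float]:
    """Extrapolate the vehicle position `horizon_s` seconds past the last fix.

    Uses the velocity implied by the last two fixes and assumes it
    stays constant over the prediction horizon.
    """
    if len(path_log) < 2:
        raise ValueError("need at least two fixes to estimate velocity")
    (t0, x0, y0), (t1, x1, y1) = path_log[-2], path_log[-1]
    dt = t1 - t0
    if dt <= 0:
        raise ValueError("fixes must be in strictly increasing time order")
    vx, vy = (x1 - x0) / dt, (y1 - y0) / dt
    return x1 + vx * horizon_s, y1 + vy * horizon_s


if __name__ == "__main__":
    log = [(0.0, 0.0, 0.0), (1.0, 10.0, 0.0), (2.0, 20.0, 5.0)]
    print(predict_position(log, horizon_s=2.0))
```

A predictor of this kind works only because vehicle paths are regular and constrained by the road, which is the property the text highlights.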

Vehicular communication, utilising VANETs, includes V2V communication, V2I communication and V2X communication, as illustrated in Figure 5. The details are given below.

Figure 5. Vehicular communication (VC) system.

2.1. Vehicle-To-Vehicle (V2V) Communication

V2V communication, also called inter-vehicle communication (IVC), allows vehicles to communicate with each other and share necessary information about traffic congestion, accidents and speed limits. V2V communication can form a network by connecting different nodes (vehicles) using a mesh (partial or full) topology. Depending on the number of hops used for inter-vehicle communication, these systems are classified as single-hop (SIVC) or multi-hop (MIVC). SIVC can be used for short-range applications such as lane merging, ACC, etc., whereas MIVC can be used for long-range communication such as traffic monitoring. V2V communication enables several features such as BSD, FCWS, automatic emergency braking (AEB) and LDWS.

2.2. Vehicle-To-Infrastructure (V2I) Communication

V2I communication, also known as roadside-to-vehicle communication (RVC), allows vehicles to interact with the RSUs. It helps to detect traffic lights, cameras, lane markers and parking meters. The communication of vehicles with the infrastructure is ad-hoc, wireless and bidirectional. The data collected from the infrastructure are used for traffic supervision and management. They are used to set different speed variables, allowing the vehicles to maximise fuel efficiency as well as control the traffic flow. Depending on the infrastructure, the RVC system can be divided into Sparse RVC (SRVC) and Ubiquitous RVC (URVC). The SRVC system provides communication services at hotspots only, for example to detect available parking spaces or gas stations, whereas the URVC system provides coverage throughout the road, even at high speeds. Therefore, the URVC system requires a large investment to ensure network coverage.

2.3. Vehicle-To-Everything (V2X) Communication

V2X communication allows the vehicle to communicate with other entities such as pedestrians (V2P), roadside (V2R), devices (V2D) and the Grid (V2G) [133]. This communication is used to prevent road accidents involving vulnerable pedestrians, cyclists and motorcyclists. V2X communication allows the Pedestrian Collision Warning (PCW) mechanism to alert the roadside passenger before any serious accident takes place. The PCW can access the Bluetooth or Near Field Communication (NFC) of the smartphone and may use beacon stuffing to deliver critical messages to the pedestrian.