Autonomous Vehicle Localization: Comparison
Please note this is a comparison between Version 1 by Ernő Horváth and Version 2 by Mona Zou.

In a vehicle, wheel speed sensors and inertial measurement units (IMUs) are present onboard, and their raw data can be used for localization estimation. Both wheel sensors and IMUs encounter challenges such as bias and measurement noise, which accumulate as errors over time. Even a slight inaccuracy or minor error can render the localization system unreliable and unusable in a matter of seconds. Traditional algorithms, such as the extended Kalman filter (EKF), have been applied for a long time in non-linear systems. These systems have white noise in both the system and in the estimation model. These approaches require deep knowledge of the non-linear noise characteristics of the sensors. On the other hand, as a subset of artificial intelligence (AI), neural network-based (NN) algorithms do not necessarily have these strict requirements.

  • autonomous vehicle
  • open source
  • localization
  • GNSS
  • IMU
  • odometry
  • self-driving

1. Introduction

A crucial aspect of contemporary autonomous vehicle research involves enhancing localization methods [1][2][3][4][1,2,3,4]. Errors accumulate over time during position estimation: even a small error can rapidly ruin the estimation quality within a fraction of a second. Traditional methods like the extended Kalman filter (EKF), unscented Kalman filter (UKF) and particle filter (PF) have served as stalwart approaches in handling non-linear systems, affected by noise in both the system and estimation models. However, these approaches necessitate an in-depth understanding of the non-linear model noise characteristics of the sensors. To properly tune these kinds of systems, a substantial investment of time and effort is needed. For example, to tune various Kalman filter model parameters, e.g., process noise covariance, pre-whitening filter models for non-white noise, and frequently even optimization methods are applied [5][6][7][8][5,6,7,8]. In contrast, neural network-based algorithms offer a promising alternative that circumvents these stringent requirements. It is important to distinguish between two neural networks (NN) involved approaches. The first one uses neural networks as an optimization approach, and only uses the network to obtain the proper parameters for the Kalman filter. The second one perceives the problem as a black box without a pre-existent filter.

2. Robot and Vehicle Position Estimation

A significant challenge both in robotics and autonomous vehicle research is to determine the robot’s current position [9][10][11][9,10,11] often referred to as localization. Robots use their sensor data to extract the position information. The sensor data may include speed, acceleration, image, depth information and distance measurements, with different levels of accuracy. By combining information from diverse sensors, a comprehensive position estimation could be achieved. Generally, the accuracy of the robot’s position estimation improves with the increasing number of available sensors, especially when the raw sensor data have a certain level of accuracy. In robotics one of the most widely used localization techniques is simultaneous localization and mapping (SLAM) [12]. SLAM is a sensor fusion approach and as the name suggests it provides not only the vehicle’s location but also an estimate of the environment. In recent years this approach has become much more robust in exchange for efficiency [12]. SLAM usually involves camera or LiDAR (light detection and ranging) data processing, which means a large amount of sensor data. On the other hand, more efficient and sometimes less robust solutions [2][3][13][2,3,13] only rely on simple measurements such as velocity, acceleration or magnetic field. The most used estimators in this field are EKF, UKF and PF [14]. In the [15] an invariant extended Kalman filter (IEKF) based, loosely coupled IMU-GNSS integration scheme was introduced. The IEKF as a navigation filter using quaternion rotation for the INS-GNSS integration scheme. The observable states of the Kalman filter are made to converge within an area of attraction. There are solutions which fuse IMU and LIDAR data for positioning, while there are solutions that use only LIDAR data for estimation. What both approaches have in common is that LIDAR provides a much larger amount of data than conventional in-vehicle sensor systems. As a result, their computational requirements are also more significant. This approach makes data processing and data optimization one of the main challenges, and the ability to estimate as accurately as possible from the reduced data volume. Such odometry estimation solution was presented in [16] which approach relies on point-to-point ICP (iterative closest point) combined with adaptive thresholding.
Compared to the previously introduced localization, a much broader problem set is called system state estimation. This is a fundamental concept in dynamic systems, which helps to understand and predict the behavior of complex, dynamic processes. Dynamic systems are prevalent in numerous fields, spanning from engineering and physics to economics and environmental sciences, and often involve evolving states influenced by various inputs, disturbances and inherent uncertainties. The primary objective of state estimation is to ascertain the unobservable or partially observable states of a dynamic system based on available measurements and a mathematical model that describes the system. Some prevalent approaches for this problem set involve probabilistic and statistical methods such as the Bayes filter, Kalman filter [13], (extended and unscented variations), Gaussian approaches [17].
The paper [18] suggests enhancing the Kalman filter by incorporating the distance attributes of both LIDAR and radar sensors. An analysis of the sensor properties of LIDAR and radar with respect to distance was conducted, leading to the development of a reliability function to expand the Kalman filter’s capability in capturing distance-related features. In this study real vehicles were used as an experiment, and comparative measurements were performed that integrated sensor fusion with a fuzzy, adaptive noise measurement and the Kalman filter. The experimental findings demonstrated that the approach employed in the study produced reliable position estimates. Similarly, in [19] a localization approach that integrates an inertial dead reckoning model with 3D LIDAR-based map matching is suggested and empirically validated in a dynamic urban setting encompassing diverse environmental conditions.
In the study presented in [20], the unique challenges faced by UAVs in outdoor wilderness scenarios are addressed with a significant focus on enhancing received signal strength (RSS) for accurate positioning and collision avoidance. The presence of colored noise in RSS data disrupts traditional least squares (LS) algorithms. The extended Kalman filter (EKF) emerges as a solution, wherein a colored noise model is ingeniously incorporated, sharpening its distance estimation accuracy. Alongside, an adaptive algorithm is introduced to account for fluctuating path-loss factors during UAV flights. This comprehensive EKF-based approach not only ensures precise UAV positioning but also incorporates an orthogonal rule for advanced collision detection and trajectory modification. Experimental findings validate the robustness of this EKF-centric methodology, endorsing its value for collaborative UAV operations.
Increasing the positioning accuracy can be based not only on internal sensor and LIDAR data, but also on increasingly complex and widespread vehicle-to-vehicle or vehicle-to-infrastructure communication networks. If a vehicle within a network loses its GNSS signal, it can obtain positioning data from other nodes in the network. As an example, methods can be based on Vehicular ad hoc networks (VANETs). VANETs communicate additional position data and attempt to achieve localization using the received signal strength indicator (RSSI) analytical model [21][22][21,22].
The disadvantage of these methods is that RSSI is significantly affected by the weather and objects in the environment. Several correction methods have been developed to deal with this, including weighted centroid localization (WCL) and signal to interference noise ratio (SINR) methods [23].
There are procedures, similar to the fusion of IMU and GNSS data, which are based on Kalman filter and use RSSI, SINR calculations together with WCL procedure [14].

3. Knowledge-Based Versus Black-Box Modeling

The mentioned knowledge-based solutions (most notably the EKF) have obtained a good reputation in signal filtering. The main drawback of knowledge-based methods is their dependency on the underlying model accuracy. Kalman filters typically use kinematic models in lower velocities, but in higher velocities, dynamic models are a better fit. These have several parameters that depend on external factors and may vary over the life of the vehicle, for example vehicle mass, weight distribution and tire stiffness. The changes can be neglected, but still the noise characteristics of the model and the sensors must be known and considered in the modelling. A further difficulty is that there are sensor signals, such as GNSS data, where the noise characteristics are not deterministic and can change very quickly. For example, when communication is suddenly lost with the remote base-towers required for real-time kinematics (RTK) based solutions [24], the accuracy and hence the noise characteristics of the GNSS signals can change significantly within seconds. This also shows that with the number of measured data and internal state variables, not only does the computational complexity is increasing significantly due to the significant matrix operation running in the background, but also the complexity of the models makes them difficult to handle. A further problem is that while data from IMU and odometry data can be taken into account in a dynamical or kinematic model directly or through physical relationships, this is more difficult to do for a camera signal or LIDAR point cloud.
In recent years, black-box modelling-based technologies have gained significant traction too. In the field of vehicle localization a numerous papers [24][25][26][27][28][29][30][31][24,25,26,27,28,29,30,31] have been published. Maybe even more attention is on smartphone-based pedestrian tracking [32][33][34][35][36][32,33,34,35,36]. In this field similar technologies can be applied, but the quality and type of usable sensor set is divergent. From the mentioned approaches [31] is the closest to ours. The main difference is that instead of convolutional neural networks (CNN), our solution uses LSTM. The most known advantage is the ability to accurately predict outcomes and behavior without requiring a deep understanding of the underlying internal mechanisms or complexities. One of the main drawbacks of black-box methods is to provide the right results in all situations. Moreover, they require a large amount of measurement data to teach them, and these measurements must be recorded under the most varied conditions possible.