Cite
Xie, T.; Yin, M.; Zhu, X.; Sun, J.; Meng, C.; Bei, S. Lane Detection. Encyclopedia. Available online: https://encyclopedia.pub/entry/50462 (accessed on 08 July 2024).
Lane Detection

Lane detection is a vital component of intelligent driving systems, offering indispensable functionality to keep the vehicle within its designated lane, thereby reducing the risk of lane departure.

Keywords: lane detection; re-parameterization; attention mechanism

1. Introduction

In recent years, intelligent transport systems (ITS) have developed rapidly and now play a key role in traffic safety [1]. Among their features, lane detection technology has received widespread attention as an important component of assisted driving. Lane lines clearly delineate driving zones for different types of vehicles, which reduces road congestion and aids collision avoidance, thereby ensuring road safety [2].
In practical driving, the complexity and diversity of traffic scenarios pose serious challenges for lane detection. For example, capturing the full shape of lane lines is difficult under dazzle or insufficient lighting, and their thin, elongated appearance makes them susceptible to occlusion by surrounding vehicles. Because drivers need timely feedback about road conditions, driver assistance systems must locate lane markings quickly. A formidable challenge in this area is therefore to balance lane detection accuracy against real-time responsiveness.
Lane detection technologies can be classified into two main categories: those relying on conventional image processing techniques and those based on deep learning. Traditional lane detection algorithms primarily use computer vision and image processing methods to discern the color [3][4], texture [5][6], and other features that distinguish lane lines from the surrounding road surface. Operators such as Sobel [7] and Canny [8] are employed to extract lane-line boundaries, and incorporating methods such as the Hough Transform [9][10] or Random Sample Consensus (RANSAC) [11][12] can further refine the detection results. For instance, Cai et al. [13] used a Gaussian Statistical Color Model (G-SCM) to extract regions of interest based on lane-line color characteristics, then applied an improved Hough Transform for lane detection within the extracted region. Guo et al. [14] combined an improved RANSAC variant with the Least Squares method to optimize model parameters, achieving better lane-fitting results. However, traditional lane detection methods require manual feature selection and extraction, and in intricate driving scenarios they often struggle to discern clear lane lines, especially in the absence of structured lane markings or under variable lighting conditions.
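
For concreteness, the following is a minimal sketch of this classical edge-plus-Hough pipeline in Python with OpenCV. The thresholds, region-of-interest geometry, and Hough parameters are illustrative assumptions, not values taken from the cited works.

```python
import cv2
import numpy as np

def detect_lanes_classical(image_path):
    """Classical pipeline: Canny edge extraction + probabilistic Hough transform.

    All thresholds below are illustrative and would be tuned per camera setup.
    """
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)

    # Edge extraction: Canny picks out lane-line boundaries from intensity gradients.
    edges = cv2.Canny(blurred, 50, 150)

    # Keep only a trapezoidal region of interest ahead of the vehicle.
    h, w = edges.shape
    mask = np.zeros_like(edges)
    roi = np.array([[(0, h), (w // 2 - 50, h // 2), (w // 2 + 50, h // 2), (w, h)]],
                   dtype=np.int32)
    cv2.fillPoly(mask, roi, 255)
    edges = cv2.bitwise_and(edges, mask)

    # The probabilistic Hough transform groups edge pixels into line segments.
    return cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=50,
                           minLineLength=40, maxLineGap=20)
```

The returned segments (x1, y1, x2, y2) would then be fitted or filtered, e.g., with RANSAC or Least Squares as in [12][14].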
In contrast to conventional lane detection algorithms, deep learning techniques can automatically extract and learn features, continually updating model parameters through training on large-scale datasets [15]. This narrows the gap between predicted and actual results and addresses the challenges of lane feature extraction in complex scenarios. Nevertheless, deep learning demands large volumes of training data and substantial computational resources, so model complexity must be weighed carefully in practical applications.
Currently, deep-learning-based methods for lane detection fall into three categories: those founded on segmentation [16][17][18][19][20][21], parameter regression [22][23][24], and anchors [25][26][27][28]. Segmentation-based methods can be further divided into semantic segmentation and instance segmentation. Semantic segmentation classifies each pixel, separating lanes from background; instance segmentation additionally distinguishes between different object instances, making it useful for detecting multiple lane lines, especially when their number varies. However, segmentation typically involves extensive computation, posing challenges for the real-time requirements of driver assistance systems. Parameter-regression-based methods use a neural network to regress the parameters of a curve equation representing the lane lines. While such algorithms can identify lane lines with varying shapes, their predictions depend heavily on the regressed parameters, which hurts model generalization. Row-anchor-based methods exploit prior knowledge of lane-line shape, dividing the image into a grid of row-oriented locations; a classifier then selects the grid cells containing lanes. Although this approach offers fast inference, its accuracy is not always optimal.
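
To make the row-anchor formulation concrete, the sketch below mirrors the idea used in [27]: for each row anchor and each lane, a classifier picks which horizontal grid cell contains the lane, with one extra "no lane" class. The feature dimension, grid size, and lane count here are illustrative assumptions.

```python
import torch
import torch.nn as nn

class RowAnchorHead(nn.Module):
    """Row-anchor lane head in the spirit of [27] (dimensions are illustrative).

    For each of `num_rows` row anchors and each of `num_lanes` lanes, the head
    classifies which of `num_cols` horizontal grid cells contains the lane,
    plus one extra class meaning "no lane in this row".
    """
    def __init__(self, feat_dim=512, num_rows=18, num_cols=100, num_lanes=4):
        super().__init__()
        self.num_rows, self.num_cols, self.num_lanes = num_rows, num_cols, num_lanes
        self.fc = nn.Linear(feat_dim, num_rows * (num_cols + 1) * num_lanes)

    def forward(self, feat):  # feat: (B, feat_dim) pooled backbone features
        logits = self.fc(feat)
        # (B, num_cols + 1, num_rows, num_lanes): classify grid cells per row
        return logits.view(-1, self.num_cols + 1, self.num_rows, self.num_lanes)

# Picking the most likely cell per row yields each lane's x-position, turning
# dense pixel segmentation into a much cheaper classification problem.
head = RowAnchorHead()
logits = head(torch.randn(2, 512))
cells = logits.argmax(dim=1)  # (2, 18, 4) grid-cell indices
```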

2. Lane Detection Based on Deep Learning

To cope with complex and ever-changing driving scenarios, researchers have applied deep-learning-based feature extraction methods to lane detection. Neven et al. [17] presented the LaneNet model, which consists of an embedding-vector branch and a semantic segmentation branch. The model uses an encoding–decoding structure to transform input images into high-dimensional feature maps and back to the original resolution, determining for each pixel whether it belongs to a lane line. Seeking stronger semantic information extraction, Pan et al. [18] introduced a novel network architecture, SCNN, which incorporates a spatial convolution layer to propagate information both vertically and horizontally. The layer passes messages in four directions (left, right, up, and down), strengthening the correlation of long-distance spatial information. However, the model's overall structure is complex and requires substantial computational resources, so both training and inference are time-consuming. Hou et al. [19] incorporated Self-Attention Distillation (SAD) into convolutional neural networks (CNNs). This method performs knowledge distillation between different layers, efficiently exploiting information from varying depths to capture critical features. Notably, SAD is involved only in the training phase and does not increase inference time, although it inevitably raises the computational cost of training. Tabelini et al. [22] designed a parameter-based lane detection model, PolyLaneNet, which represents lane-line shapes through polynomial curves. As a regression model, it detects faster than segmentation models, but its refinement ability is limited and its precision falls short. Qin et al. [27] proposed a row-anchor-based method that transforms pixel-level classification into global row-selection classification, reducing the computational load at inference; however, because the network architecture is simple, the detection results can be somewhat deficient. Tabelini et al. [25] proposed an anchor-based method that extracts features for each anchor from the feature maps produced by the backbone and combines them with global features produced by an attention module. As a result, the model can connect information across multiple lanes, improving detection accuracy over other anchor-based lane detection methods.
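
To make SCNN's spatial convolution concrete, here is a simplified sketch of a single (downward) message-passing direction; SCNN [18] runs analogous passes in all four directions. The kernel width of 9 follows the paper, while the channel count is an illustrative assumption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialMessagePassing(nn.Module):
    """Downward pass of SCNN-style spatial convolution (a simplified sketch).

    Feature-map rows are updated sequentially: each row receives a convolved
    copy of the row above it, letting information flow across the full image
    height — useful for long, thin structures such as lane lines.
    """
    def __init__(self, channels=128, kernel_width=9):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, (1, kernel_width),
                              padding=(0, kernel_width // 2), bias=False)

    def forward(self, x):                      # x: (B, C, H, W)
        rows = list(torch.split(x, 1, dim=2))  # H slices of shape (B, C, 1, W)
        for i in range(1, len(rows)):
            rows[i] = rows[i] + F.relu(self.conv(rows[i - 1]))
        return torch.cat(rows, dim=2)
```

The sequential per-row update is exactly what makes SCNN effective on elongated structures, and also what makes it slow: the H slices cannot be processed in parallel.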

3. Re-Parameterization

With the continuous development of CNNs, a series of high-precision models have emerged. These models often have deeper layers and more complex modules to achieve better prediction and recognition capabilities. However, their complexity frequently leads to significant computational resource consumption, making real-time inference challenging. To achieve faster inference while maintaining high precision, strategies based on structural re-parameterization have been widely adopted. For example, ACNet [29] uses asymmetric convolution to construct the network, improving the model's robustness to rotational distortion without increasing the computational cost of deployment. The RepVGG [30] model uses different structures in its training and inference phases: during training, a multi-branch topology captures information at multiple scales, while during inference a single-branch architecture reminiscent of VGG [31], consisting of 3 × 3 convolutions and ReLU, ensures efficiency. The Diverse Branch Block (DBB) [32], an Inception-like structure, likewise adopts a multi-branch design that allows any K × K convolution in the model to be replaced during training, capturing multi-scale features and enriching the extracted image information.
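
The core trick behind these methods is that parallel linear branches can be folded into a single convolution after training. Below is a minimal sketch of RepVGG-style branch fusion [30]; BatchNorm folding, which the real model performs first, is omitted here for brevity.

```python
import torch
import torch.nn.functional as F

def fuse_repvgg_branches(w3x3, b3x3, w1x1, b1x1, channels):
    """Fold a 3x3 branch, a 1x1 branch, and an identity branch into one
    3x3 convolution, as in structural re-parameterization [30].

    BatchNorm folding is omitted for brevity; RepVGG first absorbs each
    branch's BN into its conv weights, then merges the branches as below.
    """
    # Pad the 1x1 kernel to 3x3 so the kernels can be summed directly.
    w1x1_padded = F.pad(w1x1, [1, 1, 1, 1])

    # The identity branch is a 3x3 kernel with a 1 at each channel's centre.
    w_id = torch.zeros(channels, channels, 3, 3)
    for c in range(channels):
        w_id[c, c, 1, 1] = 1.0

    # Convolution is linear, so parallel branches merge by adding kernels.
    return w3x3 + w1x1_padded + w_id, b3x3 + b1x1

# Sanity check: the fused conv reproduces the three-branch output exactly.
C = 8
x = torch.randn(1, C, 16, 16)
w3, b3 = torch.randn(C, C, 3, 3), torch.randn(C)
w1, b1 = torch.randn(C, C, 1, 1), torch.randn(C)
multi = F.conv2d(x, w3, b3, padding=1) + F.conv2d(x, w1, b1) + x
wf, bf = fuse_repvgg_branches(w3, b3, w1, b1, C)
single = F.conv2d(x, wf, bf, padding=1)
assert torch.allclose(multi, single, atol=1e-4)
```

Because the fusion is exact, the deployed single-branch model pays none of the multi-branch memory or latency cost while keeping the accuracy gained during training.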

4. Attention Mechanisms

The attention mechanism dynamically reweights each feature in the image, mimicking the selective perception of the human visual system: it focuses on critical areas and suppresses irrelevant information. SENet [33] was the first to introduce attention along the channel dimension. It establishes dependencies between convolutional feature channels through squeeze and excitation operations, letting the model learn to allocate weights to different channels and use important features more efficiently. ECANet [34] is an adaptive channel attention mechanism that avoids fully connected operations and considers only cross-channel interaction among neighboring channels, reducing computational cost and memory consumption. To further strengthen feature extraction, researchers have considered both channel and spatial dependencies and designed fusions of different attention mechanisms. For example, CBAM [35] incorporates information from both the channel and spatial dimensions, empowering the network to extract more comprehensive features. DANet [36] designs parallel position-attention and channel-attention modules, enabling local features to establish rich contextual dependencies and effectively improving detection results.
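
As a concrete example of channel attention, the following is a minimal sketch of an SE block in the spirit of [33]; the reduction ratio r=16 follows the paper's default, and the rest is illustrative.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation channel attention [33] (a minimal sketch).

    Squeeze: global average pooling summarizes each channel as one scalar.
    Excitation: a two-layer bottleneck MLP learns per-channel weights in (0, 1).
    """
    def __init__(self, channels, r=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // r),
            nn.ReLU(inplace=True),
            nn.Linear(channels // r, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):                # x: (B, C, H, W)
        b, c, _, _ = x.shape
        w = self.fc(x.mean(dim=(2, 3)))  # squeeze -> (B, C) channel weights
        return x * w.view(b, c, 1, 1)    # excitation: rescale each channel

# Usage: drop the block after any conv stage to re-weight its channels.
feat = torch.randn(2, 64, 28, 28)
out = SEBlock(64)(feat)                  # same shape, channels re-scaled
```

ECANet [34] replaces the bottleneck MLP with a lightweight 1D convolution over neighboring channels, which is where its cost savings come from.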

References

  1. Lamssaggad, A.; Benamar, N.; Hafid, A.S.; Msahli, M. A survey on the current security landscape of intelligent transportation systems. IEEE Access 2021, 9, 9180–9208.
  2. Kumar, S.; Jailia, M.; Varshney, S. A Comparative Study of Deep Learning based Lane Detection Methods. In Proceedings of the 2022 9th International Conference on Computing for Sustainable Global Development (INDIACom), New Delhi, India, 23–25 March 2022; pp. 579–584.
  3. He, Y.; Wang, H.; Zhang, B. Color-based road detection in urban traffic scenes. IEEE Trans. Intell. Transp. Syst. 2004, 5, 309–318.
  4. Chiu, K.; Lin, S. Lane detection using color-based segmentation. In Proceedings of the IEEE Intelligent Vehicles Symposium, Las Vegas, NV, USA, 6–8 June 2005; pp. 706–711.
  5. Tapia-Espinoza, R.; Torres-Torriti, M. A comparison of gradient versus color and texture analysis for lane detection and tracking. In Proceedings of the 2009 6th Latin American Robotics Symposium (LARS 2009), Valparaiso, Chile, 29–30 October 2009; pp. 1–6.
  6. Li, Z.; Ma, H.; Liu, Z. Road lane detection with gabor filters. In Proceedings of the 2016 International Conference on Information System and Artificial Intelligence (ISAI), Hong Kong, China, 24–26 June 2016; pp. 436–440.
  7. Gao, W.; Zhang, X.; Yang, L.; Liu, H. An improved Sobel edge detection. In Proceedings of the 2010 3rd International Conference on Computer Science and Information Technology, Chengdu, China, 9–11 July 2010; pp. 67–71.
  8. Xuan, L.; Hong, Z. An improved canny edge detection algorithm. In Proceedings of the 2017 8th IEEE International Conference on Software Engineering and Service Science (ICSESS), Beijing, China, 24–26 November 2017; pp. 275–278.
  9. Luo, S.; Zhang, X.; Hu, J.; Xu, J. Multiple lane detection via combining complementary structural constraints. IEEE Trans. Intell. Transp. Syst. 2020, 22, 7597–7606.
  10. Bisht, S.; Sukumar, N.; Sumathi, P. Integration of Hough Transform and Inter-Frame Clustering for Road Lane Detection and Tracking. In Proceedings of the 2022 IEEE International Instrumentation and Measurement Technology Conference (I2MTC), Ottawa, ON, Canada, 16–19 May 2022; pp. 1–6.
  11. Kim, J.; Lee, M. Robust lane detection based on convolutional neural network and random sample consensus. In Neural Information Processing, Proceedings of the 21st International Conference, ICONIP 2014, Kuching, Malaysia, 3–6 November 2014; Proceedings, Part I 21; Springer: Berlin/Heidelberg, Germany, 2014; pp. 454–461.
  12. Sukumar, N.; Sumathi, P. A Robust Vision-based Lane Detection using RANSAC Algorithm. In Proceedings of the 2022 IEEE Global Conference on Computing, Power and Communication Technologies (GlobConPT), New Delhi, India, 23–25 September 2022; pp. 1–5.
  13. Cai, H.; Hu, Z.; Huang, G.; Zhu, D. Robust road lane detection from shape and color feature fusion for vehicle self-localization. In Proceedings of the 2017 4th International Conference on Transportation Information and Safety (ICTIS), Banff, AB, Canada, 8–10 August 2017; pp. 1009–1014.
  14. Guo, J.; Wei, Z.; Miao, D. Lane detection method based on improved RANSAC algorithm. In Proceedings of the 2015 IEEE Twelfth International Symposium on Autonomous Decentralized Systems, Taichung, Taiwan, 25–27 March 2015; pp. 285–288.
  15. Lecun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
  16. Lee, S.; Kim, J.; Shin Yoon, J.; Shin, S.; Bailo, O.; Kim, N.; Lee, T.; Seok Hong, H.; Han, S.; So Kweon, I. Vpgnet: Vanishing point guided network for lane and road marking detection and recognition. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 1947–1955.
  17. Neven, D.; De Brabandere, B.; Georgoulis, S.; Proesmans, M.; Van Gool, L. Towards end-to-end lane detection: An instance segmentation approach. In Proceedings of the 2018 IEEE Intelligent Vehicles Symposium (IV), Changshu, China, 26–30 June 2018; pp. 286–291.
  18. Pan, X.; Shi, J.; Luo, P.; Wang, X.; Tang, X. Spatial as deep: Spatial cnn for traffic scene understanding. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018.
  19. Hou, Y.; Ma, Z.; Liu, C.; Loy, C.C. Learning lightweight lane detection cnns by self attention distillation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 1013–1021.
  20. Xu, H.; Wang, S.; Cai, X.; Zhang, W.; Liang, X.; Li, Z. Curvelane-nas: Unifying lane-sensitive architecture search and adaptive point blending. In Computer Vision-ECCV 2020, Proceedings of the 16th European Conference, Glasgow, UK, 23–28 August 2020; Proceedings, Part XV 16; Springer: Berlin/Heidelberg, Germany, 2020; pp. 689–704.
  21. Ko, Y.; Lee, Y.; Azam, S.; Munir, F.; Jeon, M.; Pedrycz, W. Key points estimation and point instance segmentation approach for lane detection. IEEE Trans. Intell. Transp. Syst. 2021, 23, 8949–8958.
  22. Tabelini, L.; Berriel, R.; Paixao, T.M.; Badue, C.; De Souza, A.F.; Oliveira-Santos, T. Polylanenet: Lane estimation via deep polynomial regression. In Proceedings of the 2020 25th International Conference on Pattern Recognition (ICPR), Milan, Italy, 10–15 January 2021; pp. 6150–6156.
  23. Feng, Z.; Guo, S.; Tan, X.; Xu, K.; Wang, M.; Ma, L. Rethinking efficient lane detection via curve modeling. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 21–24 June 2022; pp. 17062–17070.
  24. Wang, J.; Ma, Y.; Huang, S.; Hui, T.; Wang, F.; Qian, C.; Zhang, T. A keypoint-based global association network for lane detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 21–24 June 2022; pp. 1392–1401.
  25. Tabelini, L.; Berriel, R.; Paixao, T.M.; Badue, C.; De Souza, A.F.; Oliveira-Santos, T. Keep your eyes on the lane: Real-time attention-guided lane detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual, 19–25 June 2021; pp. 294–302.
  26. Liu, L.; Chen, X.; Zhu, S.; Tan, P. Condlanenet: A top-to-down lane detection framework based on conditional convolution. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Virtual, 11–17 October 2021; pp. 3773–3782.
  27. Qin, Z.; Zhang, P.; Li, X. Ultra fast structure-aware deep lane detection. In Proceedings of the Computer Vision—ECCV 2020: 16th European Conference, Glasgow, UK, 23–28 August 2020; pp. 276–291.
  28. Zheng, T.; Huang, Y.; Liu, Y.; Tang, W.; Yang, Z.; Cai, D.; He, X. Clrnet: Cross layer refinement network for lane detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 21–24 June 2022; pp. 898–907.
  29. Ding, X.; Guo, Y.; Ding, G.; Han, J. Acnet: Strengthening the kernel skeletons for powerful cnn via asymmetric convolution blocks. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 1911–1920.
  30. Ding, X.; Zhang, X.; Ma, N.; Han, J.; Ding, G.; Sun, J. Repvgg: Making vgg-style convnets great again. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual, 19–25 June 2021; pp. 13733–13742.
  31. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556.
  32. Ding, X.; Zhang, X.; Han, J.; Ding, G. Diverse branch block: Building a convolution as an inception-like unit. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual, 19–25 June 2021; pp. 10886–10895.
  33. Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 22 June 2018; pp. 7132–7141.
  34. Wang, Q.; Wu, B.; Zhu, P.; Li, P.; Zuo, W.; Hu, Q. ECA-Net: Efficient channel attention for deep convolutional neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 16–18 June 2020; pp. 11534–11542.
  35. Woo, S.; Park, J.; Lee, J.; Kweon, I.S. Cbam: Convolutional block attention module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19.
  36. Fu, J.; Liu, J.; Tian, H.; Li, Y.; Bao, Y.; Fang, Z.; Lu, H. Dual attention network for scene segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–19 June 2019; pp. 3146–3154.