Visible Light Communications: Comparison
Please note this is a comparison between Version 1 by Jianyang Shi and Version 2 by Rita Xu.

Visible light communication (VLC) is a highly promising complement to conventional wireless communication for local-area networking in future 6G. The extra electro-optical and photoelectric conversions in VLC systems usually introduce excessive complexity to communication channels, in particular severe nonlinearities. Artificial intelligence (AI) techniques are being investigated to overcome the unique challenges in VLC, although considerable obstacles remain when intelligent learning approaches are applied in practical VLC systems.

  • visible light communication
  • artificial intelligence
  • machine learning

1. Introduction

As 5G’s commercialization progresses, the number of 5G base stations worldwide has surpassed one million. This marks the beginning of globally competitive, future-oriented research on 6G networks. According to several research reports [1][2][3], it is widely assumed that 6G communication will go beyond the current wireless spectrum and shift towards higher frequencies. The millimeter-wave and terahertz spectra have long been the focus of academic and industrial research, but the equipment is extremely costly. Recently, the spectrum of light, i.e., visible and infrared light, has emerged as a potential supplement for 6G. During the last decade, visible light communication has been cast in the spotlight by 6G researchers as a green, energy-efficient, high-speed communication method [4].
Visible light communication (VLC) transmits signals in the 400–800 THz spectrum range, which has very different physical properties from both conventional wireless transmission and optical communication. Communication with visible light offers resistance to electromagnetic interference, vast spectrum resources, and high-speed transmission capabilities. Moreover, it can be integrated with common lighting systems to allow simultaneous illumination and communication. Furthermore, the short wavelength of the light source allows for the creation of super-compact cells, which are ideal for 6G communication. Nevertheless, signal communication at such a small wavelength poses critical challenges to transmitting and receiving devices. Semiconductor materials with wide bandgaps must be employed to generate such high-frequency photons [5]. The extra electro-optical and photoelectric conversions, compared to wireless communications, introduce undesirable nonlinear distortions and hinder high-speed transmission in visible light communications [6][7]. Traditional algorithms and strategies can help to mitigate the specific negative influence of visible light on communication performance [8][9]. However, these algorithms cannot close the performance gap between VLC applications and their existing counterparts. Artificial intelligence (AI), which has become a critical component of the 6G network [10], is expected to be the optimal solution for enabling visible light communication.
Machine learning (ML) has emerged as the most popular technique for prediction, classification, and pattern identification, and has shown great success in data mining, image recognition, and other areas in the last decade. The recent development of AI processing units further accelerates the advancement of more powerful deep neural networks (DNN). Many machine learning techniques have been successfully implemented in the fields of optical communication [11] and wireless communication [12]. However, machine learning algorithms also have their own set of drawbacks, such as high computational complexity, long training times, and poor generalization. In the more complicated visible light communications, these issues will be amplified. Therefore, machine learning should be adopted judiciously in visible light communication scenarios, as it may not always be a viable solution.
Nowadays, wireless networks have progressed from software-defined radio (SDR) and cognitive radio (CR) [13] to AI-powered intelligent radio (IR) [10]. Visible light communication, as a communication method sprouting from 6G, aims to skip the first two stages and go directly to the IR stage. To accomplish this leap, researchers need to build the framework of intelligent visible light communication (IVLC). IVLC will be a broad concept covering both the intelligent physical layer and the intelligent network layer (including the traditional data link layer and network layer). As we have seen, 6G is still in its early stages of development, and 6G-based IVLC is in an even more preliminary stage. Therefore, the intelligent physical layer, which differs more from traditional wireless, could be the core breakthrough point in the forthcoming years.

2. Machine Learning in Physical Layer of IVLC

2.1. Channel Emulator

The end-to-end channel for visible light communication is exceptionally complex. In atmospheric environments, for example, gas molecules and aerosol particles absorb and scatter light radiation in the near-infrared band, resulting in a loss of received signal power. In addition, changes in atmospheric turbulence cause severe distortion of the optical signals. In the underwater environment, as another example, the attenuation of light depends on the wavelength, and the attenuation of the signal increases with frequency. Moreover, there are other propagation effects such as temperature fluctuations, salinity, scattering, dispersion, and beam steering. For underwater VLC applications whose bandwidth is not too high (tens of MHz), the power attenuation with frequency can be approximately modeled as a linear relationship, allowing underwater VLC multipath channels to be modeled using the compressive sensing (CS) method [14]. Traditional methods for high-speed point-to-point VLC cannot support accurate end-to-end channel modeling, but machine learning is able to simulate the complicated nonlinear dynamics of VLC channels [15]. In massive multiple-input multiple-output (m-MIMO) VLC, machine learning-based methods enable accurate estimation of the channel matrix [16].

2.1.1. TTHNet

Conducting an experimental transmission test in an underwater environment is costly, and there is no accurate analytical model to serve as a reference for underwater high-speed VLC. In order to reduce the cost of testing underwater VLC systems, a machine learning method is needed to model the underwater channel. The two-tributaries heterogeneous neural network (TTHnet) uses a convolutional neural network (CNN) to model the linearity of the underwater VLC channel and a two-layer MLP with a hollow layer to model its nonlinearity [15]. The two-branch heterogeneous structure makes full use of the CNN’s shared parameters, thus reducing the system complexity. At the same time, it exploits the MLP’s strong nonlinear fitting capability to fit the nonlinearity in the channel. Experiments show that the channel modeled by TTHnet is extremely close to the real channel, and its average spectrum mismatch is only 36.2% of that of the MLP-based channel emulator and 44.3% of that of the CNN-based channel emulator.

2.1.2. FFDNet

Since the modulation bandwidth of a single LED is limited, the use of m-MIMO LED and PD arrays is expected to substantially increase the capacity and transmission rate of VLC systems. However, due to the complexity of VLC channels, it is extremely difficult to estimate the m-MIMO channel matrix, which calls for deep learning methods. The fast and flexible denoising convolutional neural network (FFDnet) has recently been used for channel estimation in millimeter-wave communication [17][18], and it is also applicable in VLC [16]. As an image denoising tool based on machine learning, FFDnet is able to recover an almost noiseless channel matrix from the noisy input channel matrix. Compared with the minimum mean square error (MMSE) method, FFDnet has a stronger denoising effect, which increases the peak signal-to-noise ratio (PSNR) of the recovered channel matrix image. Unlike the nonlinear channel modeling in point-to-point high-speed VLC links, here the channel matrix is treated as an image and processed using machine learning methods from image processing, which is of great importance for channel estimation in m-MIMO-VLC.
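The PSNR figure of merit mentioned above can be illustrated with a minimal sketch. The matrices below are hypothetical toy values, not data from [16], and taking the peak from the reference matrix is an assumption made here for illustration; image-processing conventions often use a fixed peak instead.

```python
import math

def psnr(reference, estimate, peak=None):
    """PSNR in dB between two equally sized matrices given as lists of rows."""
    ref = [v for row in reference for v in row]
    est = [v for row in estimate for v in row]
    if len(ref) != len(est):
        raise ValueError("matrices must have the same size")
    mse = sum((r - e) ** 2 for r, e in zip(ref, est)) / len(ref)
    if mse == 0:
        return math.inf
    if peak is None:
        # Assumption: the peak is the largest magnitude in the reference matrix.
        peak = max(abs(v) for v in ref)
    return 10 * math.log10(peak ** 2 / mse)

# Hypothetical 2x2 channel matrices, chosen only to show the metric.
H_true = [[1.0, 0.5], [0.5, 1.0]]
H_noisy = [[1.1, 0.4], [0.6, 0.9]]
H_denoised = [[1.01, 0.49], [0.51, 0.99]]

print(round(psnr(H_true, H_noisy), 2), round(psnr(H_true, H_denoised), 2))
```

A denoiser that shrinks every entry error by a factor of ten raises the PSNR by 20 dB, which is how an improvement over a baseline estimator would show up in this metric.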

2.1.3. Conclusions

The channel capacity determines the upper bound of the communication system rate, and the accuracy of the channel estimation therefore determines how closely the actual system can approach that bound. With the widespread use of powerful ML techniques in channel estimation, complex VLC channels can be predicted more accurately. ML algorithms are expected to help IVLC break through its own bottlenecks and integrate high-speed communication with large-scale heterogeneous networking, providing technical solutions for next-generation communication.

2.2. Channel Equalization

Channel equalization techniques generally estimate the transfer function of the communication channel and try to remove the channel distortion with an adaptive filter [19]. However, common equalizers with linear adaptive algorithms become ineffective in high-speed VLC, because of the intrinsically limited modulation bandwidth of LEDs [20] and the nonlinear distortion introduced by photoelectric devices and VLC channels. Recently, ML-based equalizers, such as artificial neural networks (ANN) [21], have been developed for VLC systems. By adopting neural-network-based algorithms, ML-based equalizers have shown outstanding equalization performance, especially in modeling nonlinear phenomena. Despite this, challenges such as massive computational complexity, slow convergence speed, and relatively poor generalization still prevent the further practical application of ML-based equalizers in VLC systems. Therefore, researchers have developed many variants, as presented next, to overcome those challenges.
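As a baseline for the ML-based variants that follow, a linear adaptive equalizer can be sketched with the least-mean-squares (LMS) update. This is a generic textbook illustration rather than the method of [19]; the two-tap channel, tap count, and step size are hypothetical values chosen so that the example converges.

```python
import random

def lms_equalize(received, desired, num_taps=11, mu=0.01):
    """Train a linear FIR equalizer with LMS; return (taps, per-sample errors)."""
    taps = [0.0] * num_taps
    errors = []
    for n in range(len(received)):
        # Most recent sample first; zero-pad before the start of the record.
        window = [received[n - k] if n - k >= 0 else 0.0 for k in range(num_taps)]
        output = sum(w * x for w, x in zip(taps, window))
        error = desired[n] - output
        # LMS update: move each tap along the instantaneous error gradient.
        taps = [w + mu * error * x for w, x in zip(taps, window)]
        errors.append(error)
    return taps, errors

random.seed(0)
symbols = [random.choice((-1.0, 1.0)) for _ in range(5000)]
# Hypothetical purely linear channel: y[n] = s[n] + 0.4 * s[n-1].
received = [symbols[n] + (0.4 * symbols[n - 1] if n > 0 else 0.0)
            for n in range(len(symbols))]

taps, errors = lms_equalize(received, symbols)
mse_start = sum(e * e for e in errors[:200]) / 200
mse_end = sum(e * e for e in errors[-200:]) / 200
print(mse_start > mse_end)
```

Such a filter can only invert the linear part of a channel. A distortion that depends nonlinearly on the signal amplitude leaves a residual error that no choice of taps removes, which is the gap the neural equalizers below aim to close.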

2.2.1. Pre-Equalization GK-DNN

Conventionally, one would replace postequalization with pre-equalization to reduce the computational complexity and power consumption at the receiver side. Methods such as the weighted lookup table (WLUT) have been proposed to mitigate the nonlinear distortion in VLC systems [22]. However, LUT-based pre-equalization methods suffer from a massive increase in computational complexity when dealing with high-order and high-ISI communication scenarios. Therefore, researchers have come up with ML-based pre-equalization methods for VLC systems to provide a new way of solving the computational problems of LUTs. In [23], a pre-equalization method, namely Gaussian kernel-aided deep neural network pre-distortion (GK-DNN-PD), is proposed for a high-order modulated high-speed VLC system. GK-DNN-PD outperforms LUT-PD in terms of memory depth (MD) and the required training dataset, which leads to lower computational complexity. The experimental results show a 1.56 dB Q-factor gain compared with LUT-PD. The proposed GK-DNN-PD method consists of two phases: the training phase and the communication testing phase. In the training phase, the received signal, which is not pre-distorted, is linearly equalized, giving researchers the label sets of the GK-DNN channel estimator. The clean transmitted signal with a certain MD forms the feature sets. The GK-DNN channel estimator is then trained to obtain its weights and biases. Next, in the communication testing phase, the weights and biases obtained in the first phase are used to pre-distort the clean signal that is to be transmitted. Specifically, in addition to the weights and biases, the difference between the clean signal and the output of the GK-DNN channel estimator is also considered during the pre-distortion process. Additionally, a clipping operation is adopted to reduce the peak-to-average power ratio (PAPR), which consequently reduces the nonlinear degradation.
Moreover, an NN-based pre-equalizer is proposed in [24] to mitigate the semiconductor optical amplifier (SOA) pattern effect for 50G PON, confirming the feasibility of NN-based pre-equalizers in intensity modulation and direct detection (IM/DD) systems.
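The effect of the clipping step on the PAPR can be shown with a small numerical sketch. The waveform samples and the clipping limit are hypothetical, and hard symmetric clipping is assumed here; the exact clipping rule used in [23] is not specified in the text.

```python
import math

def papr_db(signal):
    """Peak-to-average power ratio of a real-valued signal, in dB."""
    peak_power = max(x * x for x in signal)
    mean_power = sum(x * x for x in signal) / len(signal)
    return 10 * math.log10(peak_power / mean_power)

def clip(signal, limit):
    """Hard-clip every sample to the interval [-limit, +limit]."""
    return [max(-limit, min(limit, x)) for x in signal]

# Hypothetical pre-distorted waveform with two large peaks.
waveform = [0.2, -0.5, 1.8, 0.3, -2.4, 0.7, -0.1, 0.9]
clipped = clip(waveform, limit=1.0)

print(round(papr_db(waveform), 2), round(papr_db(clipped), 2))
```

Clipping lowers the peak power faster than it lowers the mean power, so the PAPR drops and the LED is driven less far into its nonlinear region, at the cost of some clipping distortion.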

2.2.2. Postequalization GK-DNN

Since conventional nonlinear postequalization methods based on the Volterra series suffer from a massive increase in computational complexity when dealing with high-order nonlinearity, researchers have turned to ML for new inspiration. However, the time-consuming training process of most ML-based postequalizers limits their practical application. To accelerate training and greatly relieve the computational complexity of the equalizer at the receiver side, researchers have proposed the Gaussian kernel-aided deep neural network (GK-DNN) [25] for VLC systems. Compared to the classical MLP, the major unique feature of GK-DNN is that the input data go through a functional mapping based on the Gaussian function, namely the Gaussian kernel, which maps the windowed input data to a nonlinear space to reduce the number of iterations and the time consumed by the fitting process. The researchers assume that the influence of adjacent symbols on the central (or current) one follows a Gaussian distribution, so the mapping operation accelerates training. The expression of the Gaussian kernel is given in [25]. It should be noted that the scope-controlling parameter of the Gaussian kernel greatly affects the equalization performance of GK-DNN. Generally, the larger the parameter is, the faster the training process is. However, there is a trade-off between training acceleration and equalization performance. Therefore, the selection of the Gaussian kernel parameter is vital to obtain the best performance. Moreover, the selection of the number of hidden layer nodes is equally important, as it directly determines the computational complexity of the equalizer.
According to the experimental results in [25], the GK-DNN equalizer can efficiently realize postequalization in the VLC system with the aid of the Gaussian kernel, which reduces the number of training epochs of the neural network by 47.06%.
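The idea of weighting a window of received samples by a Gaussian centered on the current symbol can be sketched as follows. The exact kernel expression is given in [25] and is not reproduced here; the form exp(-(k - c)^2 / (2 * sigma^2)) and the parameter name `sigma` are assumptions made only for illustration, and an odd window length is assumed.

```python
import math

def gaussian_window_weights(window_len, sigma):
    """Assumed kernel: weight 1 at the window center, decaying with distance."""
    center = (window_len - 1) / 2
    return [math.exp(-((k - center) ** 2) / (2 * sigma ** 2))
            for k in range(window_len)]

def gaussian_mapped_windows(received, window_len, sigma):
    """Slide a window over the signal and weight each tap by the Gaussian kernel."""
    weights = gaussian_window_weights(window_len, sigma)
    half = window_len // 2
    padded = [0.0] * half + list(received) + [0.0] * half
    return [[w * padded[n + k] for k, w in enumerate(weights)]
            for n in range(len(received))]

received = [0.9, -1.1, 1.2, 0.8, -0.7]  # hypothetical received samples
features = gaussian_mapped_windows(received, window_len=5, sigma=1.0)
print([round(v, 3) for v in features[2]])
```

Each row of `features` would be one input vector to the network. The current symbol passes through unchanged while its neighbours are attenuated according to their distance, so the network starts from an input that already reflects the assumed decay of inter-symbol influence.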

2.2.3. Postequalization FSDNN

The frequency-slicing deep neural network (FSDNN) is a DNN variant that can be used in high-speed VLC systems [26]. It processes the high-frequency and low-frequency parts of the signal separately, which decreases the computational complexity by 11.15% compared to the traditional MLP at comparable equalization performance in a VLC system. To address the nonlinear spectral fading that the received signal suffers after passing through the VLC channel, a DNN is introduced as a postequalizer for both linear and nonlinear distortion. However, the DNN structure must be complex enough to handle complicated linear and nonlinear distortions, which means that more layers and nodes are needed and the computational complexity increases. To relieve this burden on the DNN, it is worth noting that the high-frequency and low-frequency parts of the spectrum suffer different degrees of fading. In the received signal of a VLC system, the high-frequency spectrum suffers more serious amplitude attenuation, while the low-frequency spectrum suffers less fading, so a complex MLP structure is unnecessary for the low-frequency domain. Therefore, the received signal can be separated into high-frequency and low-frequency parts, each processed by a DNN equalizer of different complexity. The received wide-band signal is split into two narrow-band parts in the frequency domain, using a low-pass filter and a high-pass filter. Then, the two sub-band signals are fed into two MLPs that are trained individually. The main hyperparameters of the two-MLP network, including the number of layers, the nodes in every layer, the taps, and the epochs, must be tuned manually to their optimal values. Once the MLPs have finished training and the weight values are fixed, the sum of the output signals of the two MLPs is the equalized, recovered signal.
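The band-splitting step can be sketched with a pair of complementary masks in the DFT domain. Ideal brick-wall filters, the cutoff bin, and the two-tone test signal are assumptions chosen for illustration; [26] may use different filter designs. A naive DFT is used only to keep the example free of dependencies.

```python
import cmath
import math

def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def slice_bands(signal, cutoff_bin):
    """Split a real signal into low and high sub-bands with complementary masks."""
    n = len(signal)
    low, high = [], []
    for k, value in enumerate(dft(signal)):
        freq_index = min(k, n - k)  # fold negative frequencies onto positive ones
        if freq_index <= cutoff_bin:
            low.append(value)
            high.append(0.0)
        else:
            low.append(0.0)
            high.append(value)
    return [v.real for v in idft(low)], [v.real for v in idft(high)]

n = 32
low_tone = [math.sin(2 * math.pi * 2 * t / n) for t in range(n)]
high_tone = [0.5 * math.sin(2 * math.pi * 10 * t / n) for t in range(n)]
received = [a + b for a, b in zip(low_tone, high_tone)]

low_band, high_band = slice_bands(received, cutoff_bin=5)
recombined = [a + b for a, b in zip(low_band, high_band)]
print(max(abs(a - b) for a, b in zip(recombined, received)) < 1e-9)
```

Because the two masks are complementary, the sub-bands sum back to the original signal. In the scheme described above, each sub-band would pass through its own MLP before the summation, with the smaller network assigned to the less distorted low band.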

2.2.4. Postequalization TFDNet

The commonly used ML-based equalizers in VLC systems often aim at fitting the waveform of the transmitted signal, which is a time-domain serial signal. It is expected that a well-equalized received signal should have the same spectrum as the transmitted one. However, waveform-fitting ML equalizers sometimes produce a spectral difference between the equalized signal and the original one. This suggests that researchers should take both time- and frequency-domain information into consideration to obtain better equalization performance. A novel postequalizer, namely the joint time-frequency deep neural network (TFDNet), is reported in [27] to compensate for the nonlinear distortions in the VLC system. TFDNet can reveal comprehensive information about the nonstationary signals received in the VLC system by considering time- and frequency-domain information simultaneously. TFDNet can be divided into three main procedures: (1) the received one-dimensional (1D, time-domain) signal goes through a short-time Fourier transform (STFT) and is converted into a two-dimensional (2D, time-frequency domain) signal, which is a matrix denoted as Y; (2) the obtained STFT matrix Y is then fed into the NN for training. The labels can be obtained by applying the same transform to the original transmitted signal. If researchers assume that each row of Y represents a certain frequency component, then Y is fed into the network column by column; (3) finally, after the NN finishes training, the reconstructed transmitted signal is obtained by carrying out the inverse STFT (ISTFT), where the analysis window must satisfy the constant overlap-add (COLA) constraint [28]. Experimental results in [27] also confirm that the proposed TFDNet can resist severe nonlinear distortions and achieve data rate gains of 0.1 Gbps and 0.2 Gbps for the VLC system compared to other nonlinear compensators such as Volterra and DNN.
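The STFT and ISTFT steps that surround the network, and the role of the COLA constraint, can be sketched as follows. The frame length, the periodic Hann window with 50% overlap, and the zero padding at the edges are assumptions made for illustration; the network itself is omitted, so the frames are passed through unchanged.

```python
import cmath
import math

def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def hann(n_fft):
    """Periodic Hann window; shifted copies at hop n_fft/2 sum to exactly 1."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n_fft) for i in range(n_fft)]

def stft(signal, n_fft=8):
    """Return a list of columns of Y; each column is the spectrum of one frame."""
    hop = n_fft // 2
    padded = [0.0] * hop + list(signal) + [0.0] * hop
    while len(padded) % hop:
        padded.append(0.0)  # make the frames tile the padded record exactly
    window = hann(n_fft)
    return [dft([w * padded[start + i] for i, w in enumerate(window)])
            for start in range(0, len(padded) - n_fft + 1, hop)]

def istft(columns, length, n_fft=8):
    """Overlap-add the inverse DFT of every column, then drop the padding."""
    hop = n_fft // 2
    out = [0.0] * (hop * (len(columns) - 1) + n_fft)
    for t, spectrum in enumerate(columns):
        for i, value in enumerate(idft(spectrum)):
            out[t * hop + i] += value.real
    return out[hop:hop + length]

signal = [math.sin(0.7 * t) + 0.3 * math.cos(2.1 * t) for t in range(20)]
Y = stft(signal)
reconstructed = istft(Y, len(signal))
print(max(abs(a - b) for a, b in zip(signal, reconstructed)) < 1e-9)
```

The reconstruction is exact only because the overlapping windows sum to a constant. With a window and hop that violate COLA, the overlap-add output would carry a periodic amplitude ripple even if the network reproduced Y perfectly.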

2.2.5. Postequalization DBMLP

To further improve the utility of NN equalizers, researchers have proposed a modified double-branch multilayer perceptron (DBMLP) postequalization algorithm [15] to further reduce the consumption of energy and computational resources. DBMLP reconstructs the MLP postequalization algorithm using the structure of the Volterra series postequalization algorithm as a template. DBMLP combines the advantages of linear adaptive filters and the MLP, which improves the BER performance of the algorithm while reducing its complexity by 74.1%. The core structure of DBMLP consists of two branches, one linear and one nonlinear. The first branch is a CNN with a convolutional layer and a dense layer, which models the linear distortion within the signal bandwidth. The second branch is a hollow MLP with a hollow layer and two dense layers, which models the nonlinear distortion outside the signal bandwidth. The nonlinearity in the output of the first branch is corrected by the output of the second branch, and the hollow layer can ignore the effect of the intermediate signal on the signals on both sides. To further reduce power consumption and complexity, a pruning algorithm based on DBMLP is proposed in [29]. The algorithm prunes connections by setting the weights with the smallest absolute values to 0, according to a target sparsity. The weights of the linear branch are not prunable, while those of the nonlinear branch are. The experimental results confirm the superiority of this approach.
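The pruning rule described above, in which the smallest-magnitude weights of the nonlinear branch are set to zero while the linear branch is left untouched, can be sketched as follows. The weight values and the sparsity level are hypothetical, and a single global threshold per matrix is an assumption; [29] may prune per layer or iteratively.

```python
def prune_by_magnitude(weights, sparsity):
    """Zero the `sparsity` fraction of weights with the smallest absolute value."""
    magnitudes = sorted(abs(w) for row in weights for w in row)
    num_pruned = int(sparsity * len(magnitudes))
    if num_pruned == 0:
        return [row[:] for row in weights]
    threshold = magnitudes[num_pruned - 1]
    # Ties at the threshold are pruned too, so slightly more may be removed.
    return [[0.0 if abs(w) <= threshold else w for w in row] for row in weights]

# Hypothetical weights: only the nonlinear branch is passed to the pruner.
linear_branch = [[0.9, 0.05, -0.02]]
nonlinear_branch = [[0.8, -0.05, 0.3], [0.02, -0.6, 0.1]]

pruned_nonlinear = prune_by_magnitude(nonlinear_branch, sparsity=0.5)
print(pruned_nonlinear)
```

Zeroed connections need neither storage nor multiplications at inference time, which is where the reduction in power consumption and complexity would come from.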

2.2.6. Postequalization PCVNN

To improve the SNR in an underwater visible light communication (UVLC) system, high LED power is required because of the LED’s incoherent characteristic and the considerable attenuation of the water medium. The nonlinearity grows more severe as the signal amplitude increases. Consequently, symbols on the outside of the constellation sustain more nonlinear distortion than those on the inside. Based on the complex-valued neural network (CVNN) [30], an adaptive partition equalizer (PCVNN) [31] has been presented, which reduces the complexity and has superior performance. In PCVNN, the constellation is segmented into two areas by a proper threshold to distinguish between large-amplitude and small-amplitude signals. Then, the large- and small-amplitude signals are fed into two complex-valued neural networks. Finally, a fully connected neural network is used to combine the signals into a complete one. Since large and small signals experience different nonlinear impairments, such a network structure can recover the signal more accurately and can greatly reduce the complexity of the model for small signals. The experimental results verify this conjecture [31]. PCVNN achieves up to 56.1% computational complexity reduction compared with the standard CVNN at the same performance.
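The partition step can be sketched by splitting complex symbols on their magnitude. The ideal 16-QAM grid and the threshold value are hypothetical; in [31] the threshold is chosen adaptively and the inputs are distorted received symbols rather than ideal points.

```python
def partition_by_amplitude(symbols, threshold):
    """Split complex symbols into small- and large-amplitude groups.

    Each entry keeps its original index so that the outputs of the two
    networks can later be merged back into the original order.
    """
    inner, outer = [], []
    for index, symbol in enumerate(symbols):
        if abs(symbol) > threshold:
            outer.append((index, symbol))
        else:
            inner.append((index, symbol))
    return inner, outer

# Ideal 16-QAM grid as a stand-in for received symbols.
levels = (-3, -1, 1, 3)
constellation = [complex(i, q) for i in levels for q in levels]

inner, outer = partition_by_amplitude(constellation, threshold=2.0)
print(len(inner), len(outer))
```

With this threshold, the four innermost points form the small-amplitude group and the remaining twelve form the large-amplitude group, so the network handling the inner points can be made much smaller.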

2.2.7. Postequalization LSTM-Equalizer

High-speed VLC is limited by inherent nonlinear effects. Linear equalizers with limited taps are ineffective, and Volterra series schemes suffer from high computational complexity when high-order taps are required. With the rise of ML in solving nonlinear problems, long short-term memory (LSTM) networks have been studied for VLC systems. In [32], researchers proposed a memory-controlled LSTM NN equalizer for both linear and nonlinear compensation, which outperforms conventional Volterra-based and FIR-based equalizers. The LSTM carries out channel equalization as a pattern classifier in which the output of the LSTM cells is activated by a specially designed function. The LSTM assigns high priority to the training data in the latest training sequence. The proposed LSTM equalizer in [32] contains an input layer, a logical hidden layer with long and short-term memory, a classification layer, and an output layer with a merge node. A standard LSTM cell structure is used for the long/short-term memory links. Moreover, a batch random resequencing procedure is adopted to control the memory effect. Recently, variants of LSTM have also drawn the attention of researchers because simple LSTMs converge slowly, as the inner parameters of the LSTM unit prolong the training period. A convolution-enhanced LSTM (CE-LSTM) equalizer, which extracts features using a convolutional layer, is proposed in [33] to shrink the complexity of the LSTM network and speed up convergence. The experimental results also confirm the feasibility of the proposed CE-LSTM equalizer.

2.2.8. Postequalization MPANN

Although ML-based equalizers for mitigating both linear and nonlinear distortions in VLC systems have been booming recently, their computational complexity remains a problem. Therefore, an ML-based equalizer that offers near-optimal equalization performance while maintaining low complexity is needed in the field of VLC. One promising way is to greatly relieve the equalizer’s complexity by moderately sacrificing some performance. Researchers have developed a simplified ML-based equalizer, namely the memory-polynomial artificial neural network (MPANN) [34], to prune the network structure while maintaining equalization performance similar to that of the MLP or other NNs. As with the other equalizers, the input data fed into MPANN are obtained by windowing the received time-serial signal. The length of the window is usually called the memory length, which also represents the dimension of the features. The major characteristic of MPANN is that its input layer, namely the memory-polynomial layer (MP layer), expands the input features with a specific function, the memory polynomial expansion. Gaussian, Fourier-basis, and other polynomial expansions (e.g., Legendre, Chebyshev) could also serve as the function in the input layer. It is believed that the number of nodes required by the modified NN structure can be significantly decreased if prior knowledge of the nonlinear model is provided. Therefore, the memory polynomial expansion is adopted to map the input features to a higher-dimensional data space. The output of the MP layer is then multiplied by the corresponding weights and fed into the following hidden layer of the NN. A regular activation (ReLU) and weighting process is conducted in the hidden layer, and the back propagation (BP) algorithm is used to update the parameters. Finally, the output layer produces the equalized symbol.
The experimental results confirm that MPANN can achieve the same equalization performance as regular MLPs while requiring less than a quarter of the complexity [34].
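The expansion performed by the MP layer can be sketched with the common memory-polynomial form, in which each delayed sample x contributes the terms x * |x|^(p-1) for p = 1 up to a chosen order. This particular form, the order, and the sample values are assumptions for illustration; the exact expansion used in [34] may differ.

```python
def memory_polynomial_features(received, memory_length, order):
    """Expand each window of delayed samples into polynomial terms.

    Row n holds, for every delay k < memory_length, the terms
    x * |x|**(p - 1) for p = 1..order, where x = received[n - k].
    """
    features = []
    for n in range(len(received)):
        row = []
        for k in range(memory_length):
            x = received[n - k] if n - k >= 0 else 0.0  # zero before the record starts
            for p in range(1, order + 1):
                row.append(x * abs(x) ** (p - 1))
        features.append(row)
    return features

received = [1.0, -2.0, 0.5]  # hypothetical received samples
features = memory_polynomial_features(received, memory_length=2, order=3)
print(features[1])
```

Each row has memory_length times order entries. Because the nonlinear terms are supplied explicitly, the hidden layer that follows mainly has to learn how to weight them, which is consistent with the reduced node count reported above.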

2.2.9. Conclusions

As can be seen from the above presentation, neural networks in channel equalization have moved beyond straightforward application, and their integration with communication systems is starting to emerge. Different neural network architectures are appearing, and many of them extract communication-specific features from the input data. Beyond that, the fast development of computational resources makes it promising to implement ML-based modules in the field of VLC. ML-based methods, with their powerful ability to model nonlinear phenomena, open a new way of solving the inherent nonlinear problems in VLC systems. However, further optimization and improvement are needed for these ML-based equalizers in terms of computational complexity, convergence speed, and generalization. Table 1 compares the equalizers mentioned above.
Table 1. Summary of machine learning algorithms for channel equalization.
Number of hidden layers 2 1 1 1 1 1 1
Activation function ReLU ReLU ReLU ReLU Tanh ReLU Tanh, Sigmoid
Optimizer Adagrad Adam Adam Adam Adam Adam Adam
Complexity Moderate Low High Low High Low High
Convergence speed Fast Moderate Moderate Moderate Slow Slow Slow
Deployment location Waveform Waveform Waveform Waveform Waveform Symbol Symbol