2.2. Forecasting
Energy production forecasting has been an important problem, even in traditional systems, and it has been tackled with different techniques, as can be seen in Figure 2.
Figure 2. Taxonomy of the most used AI methods for forecasting.
In ^{[47]}, the researchers present a system that uses Support Vector Machines (SVM) ^{[48]}, applied here to regression. The model uses two inputs, solar irradiance and environmental temperature, with energy production as the output. This work included a parameter to tune the number of support vectors during training. The results show a low error, with a Mean Absolute Percentage Error (MAPE) of 0.1143, but the model was highly sensitive to errors in the input data. The method was implemented using MATLAB. Another SVM-based approach is found in ^{[49]}. The authors propose a multi-input support vector machine. Three different input vectors were tested: solar power alone; solar power combined with solar irradiance; and solar power, temperature and irradiance together. The best predictions were obtained when the third vector was used for training. The model showed better results than analytical methods, with a MAPE of 36%, but it proved weak against changes in the climate. The method was implemented using MATLAB.
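The regression setting described above can be sketched with a support vector regressor. The following is a minimal illustration on synthetic irradiance/temperature data, not the implementation of ^{[47]} (which used MATLAB); the toy production model and all parameter values are assumptions:

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-ins for real measurements:
# irradiance in W/m^2, ambient temperature in deg C.
irradiance = rng.uniform(100, 1000, 200)
temperature = rng.uniform(5, 35, 200)
# Toy production model: output grows with irradiance, drops slightly with heat.
power = 0.2 * irradiance - 0.5 * temperature + rng.normal(0, 5, 200)

X = np.column_stack([irradiance, temperature])
# Scaling the inputs matters for the RBF kernel; C and epsilon are assumed values.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=100, epsilon=0.5))
model.fit(X, power)

# Forecast energy production for one weather condition.
pred = model.predict([[800.0, 25.0]])
```

The pipeline keeps the scaler and the regressor together, so new weather samples are standardized with the training statistics before prediction.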
In ^{[50]}, a Neural Network was used for short-term forecasting. The input data were composed of the deviation of load power and temperature of the 30 days before the forecast day, together with the same data from 60 days before and after the forecast day in the previous year. If the forecast day changes, the neural network needs to be retrained. The network is composed of 9 input nodes, 20 hidden nodes and one output neuron. The results show a Mean Absolute Percentage Error (MAPE) of 1.63% on average.
The work of ^{[51]} goes further, presenting a neural network with 2 hidden layers, one of 6 nodes and the second of 4. This model has nine inputs (day, time, cloud cover index, air temperature, wind speed, air humidity, UV index, precipitation and air pressure) and is trained using a hybrid metaheuristic that combines PSO and GA ^{[52]}. This hybrid is faster and more robust than backpropagation for this problem.
Neural Networks have been found to be sensitive to many factors, including the architecture and the initialization of the weights. Combining different NNs in an ensemble is a strategy to reduce these problems. The work of ^{[53]} tested different combinations using temperature and solar irradiance as inputs. Every combination was found to be better than using a single NN. The data were composed of 7300 samples from 365 different days. The best architecture for forecasting was the one that uses an iterative methodology to find the outputs, forecasting one at a time, with a Mean Absolute Error (MAE) of 51.48% and a Mean Relative Error (MRE) of 17.24%.
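The ensemble idea can be sketched simply: several networks with the same architecture but different random initializations are trained, and their predictions averaged, which damps the sensitivity to any single initialization. This is an illustration on synthetic data, not the models of ^{[53]}:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)
# Synthetic stand-in for (temperature, irradiance) -> power samples.
X = rng.uniform(0, 1, (300, 2))
y = 3 * X[:, 0] + 2 * X[:, 1] ** 2 + rng.normal(0, 0.05, 300)

# Same architecture, different weight initializations (one per random_state);
# averaging the members reduces initialization-dependent variance.
members = [
    MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000, random_state=seed).fit(X, y)
    for seed in range(5)
]
ensemble_pred = np.mean([m.predict(X) for m in members], axis=0)
```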
The work in ^{[54]} used a fixed methodology, changing activation functions, learning rules and architecture in order to find the best neural network for their dataset. The data were acquired over a period of 70 days, obtaining 11,200 examples. The best network had 1 hidden layer with a linear sigmoid activation function and Conjugate Gradient ^{[55]} as the learning rule, which uses second derivatives to determine the weight update; it takes temperature and photovoltaic power as inputs and outputs the next-day forecast of PV power. The validation study indicates that this network is simple and versatile and can forecast precisely, with a minimum MAPE of 0.8655. The experiments were implemented using NeuroSolutions ^{[56]}.
Another problem of NNs is that training can be slow, since backpropagation is highly demanding. To solve this problem, the work in ^{[57]} used the Extreme Learning Machine (ELM) technique to train the network. ELM ^{[58]} has a faster learning speed while obtaining better generalization performance, and it also optimizes the number of hidden neurons. The system is composed of three networks, one for each kind of weather. Each network is trained with the PV output history and the weather history data. Based on the weather report for the next day, the corresponding model is chosen to forecast the day-ahead PV output. The results show that ELM networks outperformed BP networks, with a MAPE of 2.78% in the best case. The experiments were implemented using MATLAB.
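The core of ELM can be sketched in a few lines: the hidden-layer weights are drawn at random and frozen, so no backpropagation is needed, and only the output weights are computed in closed form by least squares. A minimal numpy sketch on synthetic data (the original work used MATLAB; the hidden size and target function are assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)

def elm_train(X, y, n_hidden=50):
    """Extreme learning machine: hidden weights are random and fixed;
    only the output weights are solved, in closed form, by least squares."""
    W = rng.normal(size=(X.shape[1], n_hidden))   # random input weights, never trained
    b = rng.normal(size=n_hidden)                 # random biases
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))        # hidden activations (sigmoid)
    beta = np.linalg.pinv(H) @ y                  # output weights via pseudo-inverse
    return W, b, beta

def elm_predict(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta

# Synthetic stand-in for weather/PV history -> next-day output.
X = rng.uniform(-1, 1, (400, 3))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]
W, b, beta = elm_train(X, y)
pred = elm_predict(X, W, b, beta)
```

The single least-squares solve is what gives ELM its speed advantage over iterative backpropagation.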
Another improvement can be found in ^{[59]}, where the neural network is aided by a technique known as the Wavelet Transform (WT) ^{[60]}. This algorithm specializes in isolating the spikes produced by continuous fluctuations of the PV data. It has two stages: decomposition of the input signal, performed before the neural network, and reconstruction, performed on the output of the NN. The model used is a Radial Basis Neural Network (RBNN) ^{[61]}, which needs less computation time and is more effective than backpropagation neural networks. It takes as inputs the PV power, solar irradiance and temperature of the current hour, twelve hours before and twenty hours before in order to predict the one-hour-ahead power output. The results show that the proposed model outperformed an RBNN without WT for hourly PV forecasts over a 12 h horizon, with a MAPE of 2.38% in the best case.
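The decompose–process–reconstruct pattern can be illustrated with the simplest wavelet, the Haar wavelet. This is a one-level sketch on a synthetic signal, not the transform configuration of ^{[59]}; it shows how the detail band isolates a spike that can then be suppressed:

```python
import numpy as np

def haar_decompose(signal):
    """One-level Haar wavelet transform: split a signal into a smooth
    approximation band and a detail band that isolates spikes/fluctuations."""
    s = np.asarray(signal, dtype=float)
    approx = (s[0::2] + s[1::2]) / np.sqrt(2)
    detail = (s[0::2] - s[1::2]) / np.sqrt(2)
    return approx, detail

def haar_reconstruct(approx, detail):
    """Inverse of the one-level Haar transform (exact reconstruction)."""
    s = np.empty(2 * len(approx))
    s[0::2] = (approx + detail) / np.sqrt(2)
    s[1::2] = (approx - detail) / np.sqrt(2)
    return s

# A smooth PV-like curve with a sharp spike superimposed.
t = np.linspace(0, 1, 64)
pv = np.sin(np.pi * t)
pv[30] += 2.0                                               # transient fluctuation
approx, detail = haar_decompose(pv)
denoised = haar_reconstruct(approx, np.zeros_like(detail))  # drop the spike band
```

In the reviewed systems, the forecasting network operates on the decomposed bands and the final forecast is obtained by reconstruction.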
WT is used alongside other architectures, as in ^{[62]}. RNNs have proved useful for predicting from time series, and WT deals with the fluctuations in the meteorological time series, obtained by sampling at 10 min intervals. This combination proved able to forecast 2 days ahead more accurately than other neural networks.
A recent use of WT is found in ^{[63]}. This work presents a hybrid algorithm composed of WT, PSO and an RBFNN, used to forecast from 1 to 6 h ahead. The inputs of the model are the actual PV power, irradiance and temperature. The WT is used to filter the data of the 15 days before the forecast day. The RBFNN is optimized by the PSO algorithm. The network performed better than the compared methods, with an MAE of 4.22% on average for a 1-hour-ahead forecast, 7.04% for a 3-hour-ahead one and 9.13% for a 6-hour-ahead one.
Recurrent Neural Networks are also used in ^{[64]}, where Deep Recurrent Neural Networks (DRNN), RNNs with many hidden layers, are used to forecast. These networks can represent complex functions more efficiently than shallow RNNs. The input data are high-resolution time series, preprocessed and normalized into a dataset covering four different days. The architecture used was a DRNN with Long Short-Term Memory (LSTM) ^{[65]} units and two hidden layers of 35 neurons. Other models showed lower accuracy and more bias error than the proposed method, which obtained an RMSE of 0.086. The experiments were implemented using MATLAB and the Keras library (now part of TensorFlow) in Python.
Another RNN method is found in ^{[66]}. The authors compared 5 different RNN architectures: a basic LSTM, an LSTM with the window technique, an LSTM with time steps, an LSTM with memory between batches and stacked LSTMs with memory between batches. Two datasets from different cities were used to test the models. The results show that the third proposal, which uses prior time steps in the PV series as inputs, is the most accurate and reliable, even compared with other methods such as ANNs, with an RMSE of 82.15 on the first dataset and 136.87 on the second. The experiments were implemented using Keras.
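The "window" and "time steps" variants above rest on the same preprocessing step: reshaping the PV series into supervised pairs in which a fixed number of prior observations predicts the next value. A minimal sketch (the window length is an assumption):

```python
import numpy as np

def make_windows(series, n_steps):
    """Turn a 1-D PV series into (X, y) pairs where each sample holds
    n_steps prior observations and y is the value that follows them."""
    X = np.array([series[i:i + n_steps] for i in range(len(series) - n_steps)])
    y = series[n_steps:]
    return X, y

pv = np.arange(10.0)              # stand-in for an hourly PV series
X, y = make_windows(pv, n_steps=3)
# X[0] is [0, 1, 2] and y[0] is 3: three prior time steps predict the next.
# For an LSTM, X would then be reshaped to (samples, time_steps, features).
```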
The authors of ^{[67]} present an interesting modification of the RNN. This work used the network known as the Echo State Network (ESN) ^{[68]}. ESNs replace the traditional hidden layers of an RNN with a dynamical reservoir. Their main advantage is that only the output weights need to be trained, since the reservoir and input weights are random. These networks can obtain better results than typical RNNs. A Restricted Boltzmann Machine (RBM) ^{[69]} and Principal Component Analysis (PCA) ^{[70]} are used to determine the number of reservoirs and inputs. The network parameters are found by a DFP Quasi-Newton algorithm ^{[71]}. Compared with other PV forecasting methods, the results show that the proposed model could outperform other forecasting systems, with a MAPE of 0.00195%.
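The ESN principle, a fixed random reservoir with a trained linear readout, can be sketched as follows. The reservoir size, spectral radius, ridge term and toy signal are illustrative assumptions, not the configuration of ^{[67]}:

```python
import numpy as np

rng = np.random.default_rng(3)

n_res, n_in = 100, 1
W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))             # random, never trained
W_res = rng.uniform(-0.5, 0.5, (n_res, n_res))
W_res *= 0.9 / np.max(np.abs(np.linalg.eigvals(W_res)))  # spectral radius 0.9

def run_reservoir(u):
    """Drive the fixed reservoir with input sequence u; collect its states."""
    x = np.zeros(n_res)
    states = []
    for u_t in u:
        x = np.tanh(W_in @ np.atleast_1d(u_t) + W_res @ x)
        states.append(x.copy())
    return np.array(states)

# One-step-ahead prediction on a toy periodic "PV" signal.
u = np.sin(np.linspace(0, 8 * np.pi, 400))
target = np.roll(u, -1)[:-1]              # next value of the series
S = run_reservoir(u)[:-1]
# Only the readout is trained, via ridge regression (closed form).
ridge = 1e-6
W_out = np.linalg.solve(S.T @ S + ridge * np.eye(n_res), S.T @ target)
pred = S @ W_out
```

Because the recurrent weights stay fixed, training reduces to one linear solve, which is why ESNs avoid the cost of backpropagation through time.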
A complex hybrid is found in ^{[72]}. This system uses an NN aided by different algorithms, trained on data obtained during a year. Random Forest (RF) ^{[73]} is used to rank the different factors that affect PV output in order to eliminate the less important ones. The importance degrees computed by RF are transferred as weights to the Improved Gray Ideal Value Approximation (IGIVA) ^{[74]} to determine the similar days of different climate types, with the objective of improving the quality of the datasets. After that, the original sequence is decomposed by Complementary Ensemble Empirical Mode Decomposition (CEEMD) ^{[75]} to reduce the fluctuation of the original data. Finally, the neural network is optimized by a modification of PSO known as Dynamic Inertial Factor Particle Swarm Optimization (DIFPSO) ^{[76]}^{[77]}. The proposed model reduced training time and improved forecasting accuracy, with an MAE of 2.84 on sunny days, 10.12 on cloudy days and 13.01 on rainy or snowy days.
Another interesting approach is the Neuro-Fuzzy hybrid found in ^{[78]}. Fuzzy logic is applied as a filter to the input data, obtained from the energy production and weather forecasts of 12 months (day, irradiance, temperature, humidity, pressure, wind speed and cloud cover), in order to speed up the system. The neural structure is composed of 7 inputs, 2 hidden layers of 9 and 5 nodes, respectively, and one output. The network is trained by BP aided by a combination of PSO and GA known as Genetic Swarm Optimization ^{[79]}. This method improved the convergence speed and the predictive performance over other hourly forecast methods. The experiments were implemented using MATLAB.
Convolutional Neural Networks have also been applied to time-series data, since they are able to learn filters that represent repeated patterns in the data without needing any prior knowledge, and they work well with noisy data. In ^{[80]}, CNNs are applied to forecasting PV power using solar data and electricity data as inputs. The CNNs used the ReLU activation function, the Adam optimizer and dropout to avoid overfitting. The parameters were selected by testing different architectures and choosing the most promising one. The models were compared with an FFNN and an RNN of 128 hidden nodes. The results show that the CNN performed similarly to the LSTM and better than the MLP, with an MAE of 114.38.
An interesting approach mixing Big Data and Deep Learning is found in ^{[81]}. This method was used for next-day forecasts at 30 min intervals. It used a multistep methodology that decomposes the forecasting problem into different subproblems. Apache Spark was used for the Big Data processing. The neural network parameters were searched using the grid search strategy. The best structure found had 3 hidden layers with between 12 and 32 neurons. The method demonstrated that DL is suitable for big solar data, since its training time increases linearly and it performs better than other methods.
The work of ^{[82]} makes use of a new kind of neural network, the Dendritic Neuron Network ^{[83]}, to forecast PV power. These neurons have 4 types of layers: the synaptic layer, branch layer, membrane layer and cell-body layer. The input data (temperature and irradiance of the current and previous time step) are transferred to the synaptic layers, where they are converted by the sigmoid function and summarized to the branch layer. The results are transported to the cell-body layer for numerical judgment; this layer transmits the data through the axon to other neurons when the data exceed a given threshold. This new kind of network provides a higher convergence speed and enhanced fitting ability. The network is also aided by WT. The results show that the model outperformed typical feed-forward models, with an average MAPE of 10.9 under strong fluctuations and 4.55 under weak fluctuations. The experiments were run using MATLAB.
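A forward pass through the four layers described above can be sketched as follows. The layer equations follow the common formulation of the dendritic neuron model (sigmoid synapses, multiplicative branches, summing membrane, sigmoid soma); the constants, shapes and inputs are assumptions, not the precise model of ^{[82]}:

```python
import numpy as np

rng = np.random.default_rng(4)

def dendritic_forward(x, w, theta, k=5.0, soma_theta=0.5):
    """Forward pass of a dendritic neuron model (sketch of the four layers):
    synaptic layer  -> sigmoid of each weighted input, per branch
    branch layer    -> product of the synaptic outputs on each branch
    membrane layer  -> sum over branches
    cell-body layer -> sigmoid threshold on the membrane potential
    w and theta have shape (n_branches, n_inputs)."""
    synaptic = 1.0 / (1.0 + np.exp(-k * (w * x - theta)))      # synaptic layer
    branch = np.prod(synaptic, axis=1)                         # branch layer
    membrane = np.sum(branch)                                  # membrane layer
    return 1.0 / (1.0 + np.exp(-k * (membrane - soma_theta)))  # cell body

# Two inputs (e.g. temperature and irradiance, scaled to [0, 1]), 3 branches.
x = np.array([0.7, 0.9])
w = rng.uniform(-1, 1, (3, 2))
theta = rng.uniform(-1, 1, (3, 2))
out = dendritic_forward(x, w, theta)
```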
In Table 4, a summary of the reviewed models is presented.
Table 4. Models for forecasting.
2.3. Parameter Estimation
Finding the parameters of PV models is vital to simulate their behavior and to optimize their production. The problem is usually posed as finding the unknown parameters that optimize the output power. Different techniques, most of them metaheuristics, have been used to solve this problem, as can be seen in Figure 3.
Figure 3. Taxonomy of the most used metaheuristics for parameter estimation.
Metaheuristics are the most used techniques to estimate PV parameters, and different kinds of algorithms have been evaluated in recent years. The work in ^{[85]} compares different evolutionary algorithms: Genetic Algorithms ^{[86]}, Particle Swarm Optimization ^{[87]} and Differential Evolution ^{[88]}. DE is an evolutionary algorithm similar to GA, but it encodes the problem with real numbers, which addresses the slow convergence of GA. The fitness function was computed as the sum of the absolute errors in current and voltage. The findings showed that the best results were given by DE and the worst by GA. The authors also implemented different hybrids: Tabu Search ^{[89]}-assisted DE, to avoid falling into local minima; PSO-assisted DE, in which PSO is activated after 5 generations of DE; and DE-assisted TS, where DE searches for the optimal solution in a subset of the whole search space while TS moves the local search within the global space. These hybrids performed better than the original algorithms and provided more stability. The DE-assisted TS hybrid provided the best results and was the fastest.
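As an illustration of DE-based parameter estimation, the sketch below fits a simplified ideal single-diode model (series and shunt resistance omitted for brevity, an assumption) to a synthetic I–V curve, using the sum of absolute current errors as the fitness, in the spirit of the compared studies. All parameter values and bounds are illustrative:

```python
import numpy as np
from scipy.optimize import differential_evolution

Vt = 0.0258  # thermal voltage at ~300 K (assumed)

def diode_current(V, Iph, I0, n):
    """Simplified ideal single-diode model: photocurrent minus diode current."""
    return Iph - I0 * (np.exp(V / (n * Vt)) - 1.0)

# Synthetic I-V curve generated from known "true" parameters
# (photocurrent, saturation current, ideality factor).
true = (5.0, 1e-6, 1.3)
V = np.linspace(0, 0.6, 40)
I_meas = diode_current(V, *true)

def fitness(p):
    # Sum of absolute current errors over the measured curve.
    return np.sum(np.abs(diode_current(V, *p) - I_meas))

bounds = [(0, 10), (1e-9, 1e-4), (1.0, 2.0)]
result = differential_evolution(fitness, bounds, seed=0, tol=1e-10, maxiter=500)
Iph_est, I0_est, n_est = result.x
```

On real data, the implicit single-diode equation (with series and shunt resistance) is used instead, but the DE loop is identical: only the model inside the fitness function changes.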
2.4. Defects Detection
Finding defects on the surface of PV cells is a problem closely related to computer vision. As observed in the bibliography, the most used technique for capturing the images is electroluminescence (EL). The datasets are usually private, but there are some exceptions. The most used techniques for detecting defects can be seen in Figure 4.
Figure 4. Most used AI methods for defect detection.
A classical approach is found in ^{[90]}, which tried to detect defects in solar modules using image processing techniques. In order to segment the different modules, the authors used the first derivative of the statistic curve to find the division line between each chip. After that, they used the Otsu method to obtain a binary image. Finally, the algorithm tries to identify the state of the module using the geometry of the resulting image. This algorithm produced interesting results, with a recognition rate of 80% on cracked modules, 95% on fragmented ones and 99% on modules in good state. The recognition was also quite fast. The algorithms were implemented and applied via MATLAB.
Another approach is found in ^{[91]}. This method combines image processing techniques with Support Vector Machines. The dataset featured 13,392 EL images of solar cells. The images are preprocessed in order to reduce spatial noise and to accurately highlight crack pixels; after that, binary processing is performed, and finally, features are extracted from the image. These features are used by different SVMs to classify the cells. The results show that the SVM with penalty parameter weighting is the best SVM, with a correct detection rate of 91% and specificity and accuracy of more than 97%. The experiments were run in MATLAB.
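One way to realize "penalty parameter weighting" is to raise the misclassification penalty of the rare defect class. The sketch below does so with scikit-learn's class_weight parameter on synthetic, imbalanced features (an illustrative stand-in for the extracted image features, not the MATLAB setup of ^{[91]}):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(6)

# Imbalanced synthetic features: many intact cells, few defective ones.
X_ok = rng.normal(0.0, 1.0, (300, 2))
X_bad = rng.normal(2.5, 1.0, (30, 2))
X = np.vstack([X_ok, X_bad])
y = np.array([0] * 300 + [1] * 30)

# class_weight scales the penalty C per class, so missing a rare defective
# cell costs 10x more than misclassifying an intact one (weights assumed).
clf = SVC(kernel="rbf", C=1.0, class_weight={0: 1.0, 1: 10.0}).fit(X, y)
recall_bad = clf.predict(X_bad).mean()   # fraction of defective cells caught
```

Without the weighting, the optimizer can satisfy most of the objective by favoring the majority class; the weighting pushes the decision boundary back toward it.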
In ^{[92]}, the authors compare Convolutional Neural Networks with SVMs. The SVM is trained with data from the ELPV dataset, composed of 2624 EL images of solar cells, using features obtained with different feature descriptors. The CNN used was a pretrained VGG19 with the upper layers replaced and trained on the examples. The models were tested on both monocrystalline and polycrystalline modules. The results show that both classifiers were useful for visual inspection, both with an average accuracy of 82.4%. The algorithms were implemented in Python, using Keras for the neural network.
The work in ^{[93]} presented a similar approach using an SVM and a CNN. The CNN was composed of two convolutional layers with leaky ReLU and max-pooling, followed by two leaky ReLU dense layers and the output layer. The SVM was trained with different features extracted from the images. The dataset was built from 90 images of full-sized commercial modules that were segmented afterward, obtaining 540 cells. The results show similar behavior for both methods, with an accuracy of 98%. The article also tackled unsupervised learning, trying to cluster the images by two features; the resulting model was able to assign the correct label in 66% of cases. The algorithms were implemented in Python, using TensorFlow and OpenCV.
The work found in ^{[94]} presents a CNN with 13 convolutional layers, an adaptation of the VGG16 architecture. The dataset was obtained by photographing solar modules of 6 × 12 cells with an EL camera. The network was trained using oversampling and data augmentation in order to reduce the error. The results show that the network performed best when both oversampling and data augmentation were applied, with a Balanced Error Rate of 7.73% on the binary classification problem and quick convergence. The method was implemented with Keras; the preprocessing was performed with OpenCV.
The authors of ^{[95]} present new models that are trained not only on images with cracks but also on images with corrosion. The images were obtained by photographing modules with the EL technique and segmenting them afterward, obtaining 5400 images. The models are an SVM and a CNN. The CNN is composed of two convolutional layers; the SVM parameters are optimized by a grid search. The results show a precision of 99%, an improvement over other methods. The experiments were conducted via Keras and TensorFlow.
A variation of convolutional networks is found in ^{[96]}, where a multichannel CNN is presented. This network has different convolutional layers for each kind of input and can use inputs of different sizes. After each convolutional layer, a dense layer is applied; finally, a last dense layer combines all the previous data in order to classify the image. This multichannel CNN improves the feature extraction of single-channel CNNs. The dataset was made up of 8301 different EL images of cells. The results show a 96.76% accuracy, well above the 86% presented by single-channel CNNs. The algorithms were implemented in Python using Keras.
The model presented in ^{[97]} is composed of six convolutional layers and uses different regularization techniques such as batch normalization. The dataset used was the ELPV dataset, with 2624 images. The resulting network is a light architecture that achieved high performance with few parameters, reaching an accuracy of 93%. The experiments were run on TensorFlow.
To further improve the results, a new approach is presented in ^{[98]}. The authors use a Fully Convolutional Neural Network (FCNN), a CNN without any dense layers. The model used is U-Net, composed of 21 convolutional layers of different sizes, which has previously been used in biomedical imaging problems with little data. The dataset was composed of 542 EL images. The results show that it was better to accept a slight decrease in performance in order to improve the speed of the system. The algorithms were implemented in Python using Keras and TensorFlow.
The Wavelet Transform is used in ^{[99]}. This method combines two kinds of WT, the Discrete WT and the Stationary WT, to extract textural and edge features from solar cells that have been previously preprocessed. The dataset was composed of 2300 EL images. Finally, two different classifiers are used: an SVM and an FFNN. The best model was the FFNN, with 93.6% accuracy, over the 92.6% presented by the SVM.
Another neural network used is the Complementary Attention Network (CAN) in ^{[100]}. The CAN is composed of a channel-wise attention subnetwork connected with a spatial attention subnetwork, and it can be combined with any CNN, Fast R-CNN ^{[101]} being the one chosen by the authors. Two datasets were used, one composed of 2029 images and another of 2129 EL images. The network was used for classification and detection, obtaining an accuracy of 99.17% for classification and a mean average precision of 87.38% for detection. The network was faster than, and had a similar number of parameters to, other commercial methods. The algorithms were implemented using Python.
A very interesting approach is presented in ^{[102]}. This method is deep-feature-based, extracting features through convolutional neural networks that are classified afterward by classification algorithms such as SVM, KNN or FNN. The particularity of this system is that it uses features from different networks, combined using minimum redundancy maximum relevance feature selection. The dataset used was the ELPV dataset, with 2624 images. The CNNs selected for feature extraction are ResNet50, VGG16, VGG19 and DarkNet19. The best results were found with the SVM, selecting 2000 features, with an accuracy of 94.52% in two-class classification and 89.63% in four-class classification.
In Table 5, a summary of the reviewed models is presented.
Table 5. Models for detection of faults.
| Type | Features | Accuracy | Dataset Size |
|---|---|---|---|
| Image Processing Techniques | Segmentation + obtention of binary image + classification ^{[90]} | from 80% to 99% | — |
| SVM + Image Processing Techniques | Images are preprocessed and features are extracted; the features are used in an SVM with penalty parameter weighting ^{[91]} | 97% | 13,392 |
| SVM and CNN | Pretrained VGG19 using different feature descriptors; similar results for both methods ^{[92]} | 82.4% | 2624 |
| SVM and CNN | CNN composed of 2 layers using leaky ReLU; SVM trained with different features extracted from the images; similar behavior in both models ^{[93]} | 98% | 540 |
| SVM and CNN | CNN composed of 2 convolutional layers; SVM parameters optimized by grid search ^{[95]} | 96% | 2840 |
| CNN | Thirteen convolutional layers, an adaptation of VGG16; uses oversampling and data augmentation ^{[94]} | Uses a different measurement | 5400 |
| CNN | Multichannel CNN; accepts inputs of different sizes; improves the feature extraction of single-channel CNNs ^{[96]} | 96.76% | 8301 |
| CNN | Six convolutional layers; regularization techniques such as batch normalization ^{[97]} | 93% | 2624 |
| CNN | Fully Convolutional Neural Network; pretrained U-Net composed of 21 convolutional layers ^{[98]} | Uses a different measurement | 542 |
| CNN | CNN aided by a Complementary Attention Network, composed of a channel-wise attention subnetwork connected with a spatial attention subnetwork; usable with different CNNs ^{[100]} | 99.17% | 2300 |
| WT + SVM and FFNN | Combines Discrete WT and Stationary WT to extract features; SVM and FFNN classify them ^{[99]} | 93.6% | 2029 |
| CNN + SVM, KNN, etc. | Extracts features from different networks, combining them with minimum redundancy maximum relevance feature selection; uses ResNet50, VGG16, VGG19 and DarkNet19 ^{[102]} | 94.52% | 2624 |