Wind Energy Harvesting and Conversion Systems


Please note this is a comparison between Version 1 by Ghanim Putrus and Version 2 by Jason Zhu.

Wind energy harvesting for electricity generation plays a significant role in addressing the challenges of climate change and the pressure on energy resources caused by population growth and political unrest. Wind energy capacity has grown substantially worldwide, with turbine capacity also growing significantly, and this confidence is echoed in the wind power market and in global wind energy statistics. However, wind energy capture and utilisation have always been challenging. The wind is a difficult resource to model, and energy production is sensitive to how the wind resource maps to turbine output. The energy harvesting opportunity therefore depends on several system parameters, namely the wind as a resource, the technology, and the system synergies needed to realize an optimal wind energy harvest.

wind energy harvesting
forecasting techniques
turbine technology

1. Introduction

Electricity generation from renewable energy (RE) technologies such as solar, wind, hydro, biomass and geothermal offers sustainable sources of energy and mitigates detrimental environmental issues such as global warming and climate change [1]. RE also has economic importance, as the economy benefits from the reduced cost of electricity generation, with RE generation derived from natural, renewable resources [2]. According to the International Energy Agency (IEA) global energy review in 2021 ^[3][4], total renewable energy usage has increased significantly, from 4098 TWh in 2010 to 7627 TWh in 2020. Although hydropower contributes the largest portion of renewable energy capacity for electricity generation around the world, wind energy generation also shows a significant increasing trend. The 2021 International Renewable Energy Agency (IRENA) report ^[4][5] presents the economics of renewable energy in 2021. The report indicates that the global weighted average levelized cost of energy (LCOE) of new onshore wind projects added in 2021 fell by 15% year-on-year (to USD 0.033/kWh), whereas the cost of electricity for new onshore wind projects excluding China fell by a more modest 12% year-on-year, to USD 0.037/kWh. The offshore wind market saw unprecedented expansion in 2021 (21 GW added) as China increased its new capacity additions, and the global weighted average cost of electricity fell by 13% year-on-year (to USD 0.075/kWh). Over a ten-year period, the LCOE has dropped by 68% and 60% for onshore and offshore wind energy, respectively.

Wind energy harvesting for electricity generation, which was first introduced in 1970, has become widely popular as the world moves to a carbon-neutral, renewable energy focus. Wind energy has a significant role in overcoming the challenges associated with fossil fuel depletion and the environmental concerns that fossil fuels create, as well as the ever-rising energy demand owing to population growth, economic growth, urbanization, lifestyle changes, and technological development ^[5][6][6,7]. It is abundant, widely distributed, and eco-friendly. As explained by Ang et al. [1], the majority of wind energy capture opportunities are situated onshore. However, there is significant growth in offshore wind energy capacity, especially in Europe, with turbine capacity growing at a constant rate of 16% since 2016 ^[7][8]. The location selected for wind turbines depends on the localized wind resource, which should be sufficient to generate the estimated power. For example, open land at high elevation or on the seashore is ideal for onshore wind energy harvesting [1]. The wind power market, which is resilient and cost-competitive, has quadrupled in the last decade ^[8][9]. According to the 2021 wind energy report by the Global Wind Energy Council (GWEC), newly added wind energy capacity exceeded 93 GW (onshore 86.9 GW and offshore 6.1 GW), a 53% year-over-year increase compared to 2019, which brings the global cumulative wind power capacity to 743 GW ^[9][10][10,11].

2. Wind Speed and Wind Power Forecasting

Feasibility analysis accompanied by an accurate wind resource assessment is critical for wind farm (or turbine) construction, because it allows the location with the highest potential profit to be identified. During the resource assessment phase, one or more anemometric towers are typically installed at the candidate location and wind data are gathered over a period of a year or more. The collected on-site data and wind data obtained from nearby meteorological stations for the last few decades are then used for wind forecasting. The traditional method of linear correlation between the on-site data and meteorological data lacks accuracy; hence, multivariate analyses such as multivariate regression analysis and factor analysis are employed in wind forecasting, which involves complex systems with many meteorological factors ^[11][20]. A vast number of forecasting techniques are reported in the literature. Published forecasts are mainly categorized as long-term or short-term. Long-term forecasts predict wind several days or months into the future; this is a relatively difficult and lengthy process because many variables influence the weather, and such forecasts are essential primarily for wind energy resource assessment. In contrast, short-term forecasts predict wind over minutes or hours, which is generally more stable and practical in applications ^[12][13][12,21]. Wind forecasting methods published in the literature can also typically be classified into four groups: physical methods, statistical methods, intelligent learning methods, and hybrid methods ^[13][21]. Statistical methods are relatively simple and less expensive. Physical methods are more suitable for long-term forecasting, whereas statistical methods are more applicable to short-term forecasting.
The intelligent learning methods, which characterize non-linear correlations between input data and harvested wind turbine power, are suitable for short-term forecasting. Hybrid methods, which combine the superior features of more than one model, are versatile and have comparatively superior prediction accuracies and capabilities. Though both intelligent learning methods and hybrid methods have superior forecasting capabilities, they have computational limitations ^[13][21]. On some occasions, wind forecasting models are classified into two categories: data-driven models and physical-driven models. The former map the input variables to the target variables, while the latter are based on a numerical weather prediction (NWP) system [14]. The current trend is to use hybrid models, which typically have superior prediction accuracies and capabilities compared to traditional single methods [12]. Some authors have classified wind forecasting methods into two groups, namely direct methods and indirect methods. The former determine correlations between the related inputs (historical wind power data) and the future power target, whereas the latter forecast the wind speed first and then derive the power forecast using a wind power curve. The indirect forecasting methods can be used to assess the power generation potential of a planned wind turbine installation ^[8][9].
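
The indirect approach can be illustrated with a minimal Python sketch that maps a wind speed forecast to power through an idealized power curve. The curve shape (a cubic ramp between cut-in and rated speed) and every parameter value are illustrative assumptions, not data from a real turbine or from the cited studies.

```python
import numpy as np

def power_curve(v, cut_in=3.0, rated_speed=12.0, cut_out=25.0, rated_power=2000.0):
    """Map wind speed (m/s) to turbine power (kW) with an idealized curve.

    All default parameters are hypothetical values chosen for illustration.
    """
    v = np.asarray(v, dtype=float)
    # Cubic ramp between cut-in and rated speed (power scales with v**3).
    ramp = rated_power * (v**3 - cut_in**3) / (rated_speed**3 - cut_in**3)
    power = np.where((v >= cut_in) & (v < rated_speed), ramp, 0.0)
    # Constant rated output between rated and cut-out speed; zero elsewhere.
    power = np.where((v >= rated_speed) & (v <= cut_out), rated_power, power)
    return power

# Hypothetical wind speed forecast (m/s) for the next five time steps.
speed_forecast = np.array([2.0, 6.0, 12.0, 20.0, 30.0])
power_forecast = power_curve(speed_forecast)
print(power_forecast)
```

In practice the curve would come from the turbine manufacturer or be fitted to measured data; the sketch only shows where the speed forecast enters the power estimate.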

2.1. Statistical Methods

Statistical methods are typically based on time series models which characterize and analyze the trend of wind power data on the basis of maximum likelihood estimation and the least squares method. They typically use historical wind speed data as inputs without considering the meteorological information. Some of the frequently reported forecasting models based on statistical methods are autoregressive (AR), persistence model (PM), autoregressive moving average (ARMA), Markov chain model, autoregressive integrated moving average (ARIMA), autoregressive simple moving, vector autoregression moving average, fractional ARIMA, and generalized autoregressive conditional heteroskedasticity (GARCH) ^[8][15][9,18]. Though statistical models deliver highly accurate predictions for the linear component in data, predictions lack accuracy when the data are strongly nonlinear ^[16][17]. In comparison to physical methods and intelligent learning methods, statistical methods are computationally efficient. Some major drawbacks of statistical methods include the data stationarity hypothesis, strict distribution assumptions, possible statistically biased estimators due to outliers, and a lack of non-linear learning ability ^[15][18].
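
Two of the simplest models listed above, the persistence model and an AR(p) model fitted by least squares, can be sketched as follows. This is a minimal illustration with NumPy; the function names and the synthetic series are assumptions made for the example, and the series is generated from a known AR(1) recursion so that the fitted coefficients can be checked.

```python
import numpy as np

def persistence_forecast(series):
    """Persistence model: the next value equals the last observed value."""
    return float(series[-1])

def fit_ar(series, p):
    """Fit AR(p) coefficients (with intercept) by ordinary least squares."""
    y = np.asarray(series, dtype=float)
    # Column k holds the lag-(k+1) values aligned with the targets y[p:].
    lags = np.column_stack([y[p - k - 1 : len(y) - k - 1] for k in range(p)])
    X = np.column_stack([np.ones(len(lags)), lags])
    coef, *_ = np.linalg.lstsq(X, y[p:], rcond=None)
    return coef  # [intercept, phi_1, ..., phi_p]

def ar_forecast(series, coef):
    """One-step-ahead forecast from fitted AR coefficients."""
    p = len(coef) - 1
    recent = np.asarray(series, dtype=float)[-1 : -p - 1 : -1]  # most recent first
    return float(coef[0] + coef[1:] @ recent)

# Synthetic series from a known AR(1) recursion: y[t] = 2 + 0.5 * y[t-1].
series = [10.0]
for _ in range(19):
    series.append(2.0 + 0.5 * series[-1])

coef = fit_ar(series, p=1)
next_ar = ar_forecast(series, coef)
next_pm = persistence_forecast(series)
print(coef, next_ar, next_pm)
```

Real wind speed series are noisy and often non-stationary, so the fit would not be exact as it is here.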

2.2. Physical Methods

Physical methods are fundamental analyses developed from physical theorems and related assumptions. In physical methods, the atmospheric evolution of meteorological phenomena and the associated physical processes are modeled by a set of mathematical formulae, which are solved numerically with the pertinent initial and boundary conditions to simulate wind behavior at spatial and temporal scales ^[15][18]. Meteorological information such as humidity, temperature, pressure, surface roughness, and obstacles is required for physical methods ^[8][9]. One of the popular techniques in physical methods is numerical weather prediction (NWP), which is capable of producing accurate results for long-term wind forecasting ^[9][10]. Furthermore, it can directly make power predictions from real-time data. However, it demands a large amount of historical data ^[17][19]. Spatial correlation models predict wind speed at different locations using spatial relationships of wind speed, and they provide higher accuracies on certain occasions ^[18][34]. Most of the physical methods are based on computational fluid dynamics (CFD) models. Some of the existing mainstream models based on physical methods are the regional ocean model system, the weather research and forecasting model (WRF), the community earth system model, the fifth-generation mesoscale model (MM5), the European Centre for Medium-Range Weather Forecasts (ECMWF) model, and the global/regional assimilation and prediction system ^[8][9]. These models are capable of accurate long-term forecasting and have higher spatial and temporal resolutions as well as better space–time continuum performance. Moreover, physical methods generally do not demand an abundance of historical data, and data is usually needed only for model validation. However, they have heavy computational burdens, making them time-consuming and susceptible to weak local predictability ^[8][15][9,18].

2.3. Intelligent Learning Methods and Hybrid Methods

Artificial intelligence learning methods have recently gained wide attention in wind forecasting, primarily owing to their excellent nonlinear learning ability and high efficiency. Compared to the statistical methods, the intelligent methods typically have more parameters and hence can effectively model nonlinearities within data via iterative optimization ^[16][17]. These methods can be used either as single models or as hybrid models. One-dimensional convolutional neural networks and radial basis function neural networks are two frequently used single models in wind forecasting ^[15][18]. Combined methods and hybrid methods have become popular for wind forecasting due to their superior accuracies compared to single models ^[15][18]. Combined methods are based on ensemble learning, in which bagging, boosting, or stacking strategies are employed to build individual predictors. Here, model building depends on the integration weights of the individual predictors. The combined methods usually have a higher computational burden because they involve multiple predictors. In contrast, hybrid methods use only one type of predictor, and all other components are employed to enhance the performance of that predictor ^[16][17]. Intelligent learning models can be broadly categorized as shallow learning models and deep learning models. Some of the popular shallow learning models are the support vector machine (SVM), back-propagation (BP) neural network, general regression neural network (GRNN), extreme learning machine (ELM), radial basis function neural network (RBFNN), echo state network (ESN), and Elman neural network (ENN) ^[8][14][9,14]. These models are prone to issues such as overfitting, poor convergence, and falling into local optima. On the other hand, deep learning models have gained popularity in many fields, including wind forecasting, due to their superior attributes such as strong generalization ability, big data training, and unsupervised feature learning [14].
They are excellent at capturing features from the original data and modeling interdependencies between historical data and targets. As a result, they are widely used for temporal dependence modeling and feature extraction. Some popular feature extraction models are deep belief network (DBN), stacked autoencoder (SAE), and convolutional neural network (CNN) [14]. These models do not have connections between neurons in the same layer. For most real wind forecasting applications, single methods may not satisfy the expected higher accuracy levels despite their superior prediction performances. This is mainly because wind speed and power generation are affected by various natural and operational factors such as wind speed and direction, air pressure, wind turbine friction, weather conditions, etc. Consequently, time series data which comprise both linear and nonlinear information cannot be accurately modeled by merely using either statistical models or intelligent models. Hence, hybrid models are the best candidates for real-world wind forecasting applications ^[17][19]. The framework of hybrid models is typically integrated with data preprocessing strategies such as feature extraction and selection, denoising, and decomposition ^[16][17]. One of the most popular decomposition strategies is wavelet transform (WT), which decomposes the signal into a low-frequency component and a high-frequency component, thus making further analysis more insightful ^[17][19]. In addition, the mode decomposition techniques such as empirical mode decomposition (EMD), ensemble empirical mode decomposition (EEMD), complementary ensemble empirical mode decomposition (CEEMD), complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN), and variational mode decomposition (VMD) are commonly used for data preprocessing ^[16][17]. 
Although the number of decomposed modes in VMD is selected by experience, VMD has been shown to perform well in wind forecasting applications ^[16][17]. Nevertheless, experience-based selection can be unreliable in some situations.
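
The split of a signal into a low-frequency and a high-frequency component by a wavelet transform can be illustrated with a one-level Haar transform. The Haar wavelet is chosen here only because it is the simplest case and is an assumption of this sketch; the cited studies do not state which wavelet they use, and the wind speed values are hypothetical.

```python
import numpy as np

def haar_decompose(signal):
    """One-level Haar wavelet transform of an even-length signal.

    Returns (approximation, detail): the low- and high-frequency components.
    """
    x = np.asarray(signal, dtype=float)
    if len(x) % 2:
        raise ValueError("signal length must be even")
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2.0)  # low-frequency (smoothed) part
    detail = (even - odd) / np.sqrt(2.0)  # high-frequency (fluctuation) part
    return approx, detail

def haar_reconstruct(approx, detail):
    """Invert haar_decompose exactly."""
    x = np.empty(2 * len(approx))
    x[0::2] = (approx + detail) / np.sqrt(2.0)
    x[1::2] = (approx - detail) / np.sqrt(2.0)
    return x

# Hypothetical wind speed samples (m/s).
wind_speed = np.array([5.0, 7.0, 6.0, 6.0, 9.0, 3.0, 4.0, 8.0])
approx, detail = haar_decompose(wind_speed)
restored = haar_reconstruct(approx, detail)
print(approx, detail)
```

In a hybrid forecasting framework, each component would typically be forecast separately and the component forecasts recombined.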

2.3.1. Wind Speed Forecasting

In hybrid models, additional parameter optimization components are employed to enhance the forecast accuracy. Hybrid models fall into two classes according to how the objective function is set: single-objective optimization and multi-objective optimization. In the former, the objective function is typically based on the prediction error (sum of squared errors (SSE) or mean squared error (MSE)), and on some occasions it does not account for overfitting. Particle swarm optimization (PSO), the firefly optimization algorithm (FA), and the genetic algorithm are three commonly used intelligent single-objective optimization algorithms ^[16][17]. On the other hand, the objective of the latter is based on both prediction accuracy (mean squared error (MSE), mean absolute error (MAE), and mean absolute percentage error (MAPE)) and prediction stability (variance of error). Further, the two objective functions need to be conflicting so that a Pareto optimum exists between the two. However, such a conflict of metrics lacks theoretical rationality and needs to be verified ^[16][17]. For this purpose, Gao et al. ^[16][17] proposed a multi-component ensemble hybrid model for ultra-short-term onshore wind speed forecasting. The framework of the model includes a decomposition and integration strategy (mode discrimination degree, adaptive variational mode decomposition), a k-point modified multi-objective golden eagle optimizer (k-MoGE), and a weight hybrid kernel extreme learning machine ^[16][17]. Here, k-MoGE, based on bootstrap bias–variance tradeoff theory, addresses the unproven conflict between the objective functions. The proposed model can produce both single-step and multi-step ahead forecasts with higher accuracy compared to most existing conventional statistical models, shallow neural network models, and deep neural network models. In addition, it has superior direction accuracy and out-of-sample prediction ability.
Moreover, the proposed model is able to overcome the decline in accuracy that results from an increasing forecast horizon, and it has an enhanced generalization ability.

Support vector machine (SVM) is a supervised learning method used for regression, classification, and outlier detection [12]. In SVM regression, the aim is to find a function that fits the data within a specified error tolerance so that the target values are accurately predicted. The Jaya optimization algorithm developed by Rao ^[19][35] solves both constrained and unconstrained optimization problems. It is a simple yet powerful algorithm that aims to move towards the optimal solution while avoiding inferior solutions. More importantly, this algorithm does not require any algorithm-specific control parameters beyond common control parameters such as the population size and the number of iterations. Liu et al. [12] proposed a Jaya algorithm-based SVM (Jaya-SVM) model for multistep short-term wind speed forecasting and compared its performance against seven other forecasting models: the extreme gradient boosting model, stacked sparse autoencoder, multi-layer perceptron regression model, least absolute shrinkage and selection operator, deep belief network, granular computing method, and Gaussian process regression. The performance assessment showed that the proposed Jaya-SVM model had the lowest MSE and the highest coefficient of determination (R²) of the eight models. Further, the proposed model was reported to have higher reliability. However, the higher the number of prediction steps, the lower the accuracy and reliability of the Jaya-SVM model.

Point forecasting (PF) is the most common approach in wind forecasting studies, but it does not provide information on the real data distribution ^[9][10]. Developing optimal strategies based only on PF is difficult and increases the decision-making risk.
As a result, interval forecasting (IF) is important to minimize the uncertainty associated with PF. Xing et al. ^[9][10] developed a wind speed forecasting system with a novel multi-objective optimizer, the multi-objective Aquila optimizer (MOAO), for PF and IF. Data pre-processing (fuzzy information granulation (FIG)) and optimal benchmark model selection (OBMS) are also included in the proposed system. The study used six intervals of wind speed from two sites and employed 14 benchmark models and 3 combined forecasting models. Here, FIG uses fuzzy windows to extract effective wind speed data, which significantly enhances the forecasting effectiveness. OBMS, in turn, chooses the five models that perform best in a specific situation, which significantly improves the combined model performance. The optimal solutions were effectively achieved by MOAO, and the proposed model was reported to have better stability and superior forecasting effectiveness.

The long short-term memory (LSTM) network is appealing for time series prediction problems owing to its superior ability to handle long-term dependency problems ^[20][36]. It has gained wide attention due to its ability to address the gradient explosion problem in conventional neural networks. Moreover, it learns and remembers both short-term and long-term information, making it suitable for time-series predictions ^[17][19]. Chen et al. ^[20][36] constructed a novel hybrid model based on the LSTM network and the BP neural network for short-term wind speed forecasting. Here, data de-noising is performed using SSA, whereas CEEMDAN is used to decompose the de-noised data into intrinsic mode function (IMF) components, which significantly improves the signal-to-noise ratio. Error accumulation and computational redundancy are minimized by determining the time complexity and the correlation of each IMF component using the fuzzy entropy value and the Spearman correlation, respectively.
Training and prediction are carried out by the LSTM network and the BP neural network, both optimized by the sparrow search algorithm, and the final results are obtained by superimposing the results of the two schemes.

Though quantile-based probabilistic forecasting models are capable of producing satisfactory prediction intervals, the resulting intervals can cross and may violate the monotonicity of different conditional quantiles. Moreover, the forecasting performance of the models is affected by the completeness and quality of the features. Consequently, mining adequate information from limited data is crucial. For this purpose, Zou et al. ^[8][9] developed a hybrid probabilistic wind forecasting model based on deep learning, multi-scale feature (MSF) extraction, kernel density estimation (KDE), and non-crossing quantile loss. Here, a multilayer CNN is employed to extract MSFs with simple patterns. Further extraction and encoding of the temporal information in the features are conducted using an attention-based LSTM, which reduces the computational cost. The positive differences between adjacent conditional quantiles were derived from the final feature, which was formed by concatenating all the encoded feature vectors. The monotonicity of different conditional quantiles was ensured by the non-crossing quantile loss. The forecasting uncertainty was comprehensively evaluated by estimating probability density functions (PDFs) for the prediction intervals using KDE. The study used wind data from four different locations in South Dakota in 2012 as four datasets; each dataset has 8760 data points and is divided into training (first 8 months' data) and test (last 4 months' data) sub-datasets. It has been reported that MSFs improve forecasting performance and that the crossing problem associated with quantile-based models can be effectively solved by the non-crossing quantile loss. The proposed model generates highly accurate one-step-ahead wind speed forecasts.
This model can further be upgraded for multi-step ahead probabilistic forecasting.

The convolutional neural network (CNN) is a popular deep neural network. It is multilayered and feed-forward (information flows in one direction). CNN has gained attention due to its superior ability to extract hidden spatial features. It requires fewer parameters than other deep neural network models, which makes it converge faster. In addition, connectivity and weight sharing are comparatively lower in CNN ^[21][13]. During the training phase, regular recurrent neural networks only consider the previous correlation of sequential data. Bi-directional long short-term memory (Bi-LSTM) is a modification of LSTM which considers both forward and backward layers of LSTM. The forward LSTM extracts the past information of the input sequence whereas the backward LSTM obtains the future features of the sequential data. It uses both connections before and after updating the sequential neurons’ weights to simultaneously analyze the past and future information of time series data ^[21][13]. In the quaternion convolutional neural network (QCNN), interior relationships are encoded using quaternion algebra while exterior relationships are trained by the convolutional method.

Accurate long-term wind forecasting is indispensable since the wind turbine installation location is mainly determined by the long-term wind energy potential. As a result, the economic feasibility analysis for a potential wind turbine, as well as the selection of wind power equipment attributes, depends upon the long-term wind forecast. Neshat et al. ^[21][13] constructed a hybrid wind forecasting model based on QCNN and Bi-LSTM for long-term wind forecasting. The model can predict the wind speed highly accurately for up to 1 day into the future. An adaptive decomposition technique including VMD and the arithmetic optimization algorithm (AOA) was used to obtain IMFs.
Here, the VMD decomposes the wind data into optimal signal components while the AOA optimizes the parameters of the VMD. Wind data from 2011 to 2020 from the Greek islands of Lesvos and Samothraki were used to construct and validate the model. The performance of the proposed hybrid model was compared with five popular machine learning models, including Bi-LSTM and standard LSTM, and two other hybrid models; the proposed model has superior accuracy and stability. The applicability of the proposed model to other wind datasets from different regions needs to be evaluated.

An ensemble multi-feature complementary prediction model for wind speed forecasting was constructed by Wang et al. ^[22][37]. In this model, the SAE algorithm is employed to reconstruct wind speed subsequences decomposed by VMD. Multiple features of the wind speed sequence can be fully explored by SAE. In addition, it minimizes the complexity of the algorithm and avoids redundancy. The complementary prediction algorithm used in this model consists of support vector regression (SVR) and bidirectional LSTM, which are integrated and weighted by the linear weighted sum method (LWSM). The proposed algorithm eliminates the local minimum problem. To optimize the prediction results, the model uses the bidirectional gated recurrent unit (BiGRU), which applies cuckoo linear integration to integrate the results of LWSM. The proposed model is claimed to have a better generalization ability and higher accuracy compared to other models.

A one-day-ahead wind speed forecasting model based on a deep learning gated recurrent unit (GRU) network was developed by Wu et al. ^[23][38]. The performance of the GRU network was enhanced by selecting the necessary input variables according to the correlation coefficients with large values based on the Pearson correlation, the partial correlation, and the maximum information coefficient analyses.
Further, hyperparameters of the network were determined by auto-correlation and partial auto-correlation analyses. The performance of the proposed model was evaluated by using the single error evaluation criteria (run-time, MAPE, and MSE error evaluation criteria) as well as the average accuracy evaluation technique based on the Friedman and Nemenyi hypothesis tests. The proposed model has better accuracy compared to three other popular models (the persistence model, SVR, and LSTM). This strategy of selecting input variables and setting hyperparameters can be used in other similar models to improve accuracy.
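
The accuracy criteria (MSE, MAE, MAPE) and the stability criterion (variance of the error) referred to in this subsection can be computed as in the following minimal sketch. The function name and the sample values are hypothetical, and MAPE is undefined when an actual value is zero, which the sketch does not guard against.

```python
import numpy as np

def forecast_metrics(actual, predicted):
    """Return common point-forecast error metrics as a dictionary."""
    a = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    err = p - a
    return {
        "MSE": float(np.mean(err**2)),                  # mean squared error
        "MAE": float(np.mean(np.abs(err))),             # mean absolute error
        "MAPE": float(100.0 * np.mean(np.abs(err / a))),  # percent; needs a != 0
        "error_variance": float(np.var(err)),           # stability of the errors
    }

# Hypothetical observed and forecast wind speeds (m/s).
actual = [4.0, 5.0, 8.0, 10.0]
predicted = [5.0, 5.0, 6.0, 11.0]
metrics = forecast_metrics(actual, predicted)
print(metrics)
```

A multi-objective optimizer would treat an accuracy metric and the error variance as separate objectives rather than folding them into one score.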

2.3.2. Wind Power Forecasting

Multi-step ahead time series forecasting models are usually of two types, namely direct forecasting and recursive forecasting. However, in some instances, the two approaches are combined as direct-recursive forecasting to minimize the forecasting error. Kernel-based machine learning is commonly used in wind forecasting, and the least-squares support vector machine (LSSVM) is one such method ^[13][21]. Here, the kernel function separates originally complex sample data with different dimensions and maps the corresponding data to a higher-dimensional space. The local kernel is sensitive to data distance characteristics whereas the global kernel is not affected by data distance. The hybrid kernel methods, which combine features of multiple kernels, are more efficient. Time series forecasting models are simple and effective. However, traditional time series forecasting models are prone to inaccuracies in multi-step time series forecasting ^[13][21]. A multi-step time series forecasting model based on a hybrid-kernel LSSVM was developed by Ding et al. ^[13][21] for short-term wind forecasting. This model is based on three processes: decomposition, classification, and reconstruction. The wind power time series was decomposed and the decomposed components were classified into three amplitude-frequency classes using the maximal wavelet decomposition and fuzzy C-means (MWD-FCM) algorithm. Time series models were developed for each class separately based on the LSSVM with three different kernels, and they were optimized by the non-dominated sorting genetic algorithm II (NSGA-II). Two data sets were used to develop the model: real wind farm data sampled at an interval of 15 min in Shanxi Province, China, in May 2016, and historical wind power generation data sampled at an interval of 10 min in Sotavento in October 2017.
The proposed model was compared with two benchmark models (empirical-mode-decomposition-LSSVM model and wavelet-decomposition-LSSVM model) and performance was analyzed by root mean square error (RMSE) and MAE. The proposed model was reported to have better accuracy in 5-step, 10-step, and 15-step ahead wind forecasting. Nonetheless, the model needs to be upgraded with further information for medium-term and long-term forecasting.

In wind forecasting systems, data pre-processing is crucial for the effective extraction of original data features and for improving the signal-to-noise ratio. Some of the commonly used signal decomposition algorithms in wind data pre-processing are empirical wavelet transform (EWT), wavelet decomposition (WD), EMD and derivative algorithms, VMD, and singular spectral analysis (SSA) [14]. The basis function and threshold can impact the effect of WD. EMD and associated derivative algorithms are prone to the “endpoint effect”. In contrast, VMD is less sensitive to noise and is able to decompose components with similar frequencies [14]. Most wind forecasting models adopt linear weighted combinations. A hybrid wind forecasting model consisting of a decomposition strategy, a nonlinear weighted combination, and two deep learning models was proposed by Jiandong et al. [14] for short-term forecasting. Here, the VMD technique was employed to decompose the original power series to enhance predictability. Sub-series (IMF) prediction models were constructed by using the long short-term memory (LSTM) network and a deep belief network combined with particle swarm optimization (PSO-DBN). The final prediction value was obtained by the proposed non-linear combination mechanism based on the PSO-DBN model. The nonlinear combination strategy is reported to be more effective than a linear combination strategy.
Despite being time-consuming, the proposed hybrid model outperforms single models (the LSTM model, the PSO-DBN model, the BP model, and the Elman model) overall. However, this model can be further improved by minimizing the delay characteristics in the algorithm and reducing the complexity. Moreover, the impact of non-Gaussian noise on the model performance needs to be investigated.

A two-step wind power forecasting method based on an improved residual-based CNN was developed by Yildiz et al. ^[24][39] for very short-term forecasting. In this model, VMD-based processes extract features and convert them into images. Then an improved residual-based deep CNN with the stochastic gradient descent (SGD) optimization algorithm is employed to predict wind power. The combined dataset of wind speed, wind direction, and wind power from a wind farm in Turkey between January and December 2018 was used to develop the model. The proposed model was compared with other deep learning architectures such as SqueezeNet, VGG-16, GoogLeNet, AlexNet, and ResNet-18, and its forecasting accuracy was superior.

Attention mechanisms that assign different weights to different input features to quantify their relative importance in forecasting can greatly improve the accuracy and generalization of forecasting models ^[15][18]. Only a few studies have been conducted on attention mechanisms, and most of them have focused on the attention design of input features without much focus on information extraction from hidden layers ^[15][18]. Tian et al. ^[15][18] developed a novel single-step-ahead wind power forecasting model which includes feature decomposition, self-attention, forecasting, optimization, and performance evaluation modules.
The feature decomposition module employs improved complete ensemble empirical mode decomposition with adaptive noise (ICEEMDAN) to decompose wind power data into several intrinsic mode functions (IMFs) with different frequencies, and high-frequency IMFs are removed as redundant noise. The self-attention module is based on a dual-stage self-attention mechanism (DSSAM), which assigns attention weights to features in the input layers as well as the hidden layers. In the forecasting module, a gated recurrent unit (GRU) combined with the DSSAM predicts wind power. Adaptive moment estimation (Adam) is used to optimize the parameters of the forecasting module by minimizing the MSE. The performance of the proposed model was compared with existing linear and nonlinear benchmark models, with and without other feature decomposition methods. The wind power produced by turbines is influenced by meteorological factors such as temperature, humidity, pressure, wind direction, and wind speed. Al-qaness et al. ^[25][40] used a time series forecasting approach based on a modified adaptive neuro-fuzzy inference system (ANFIS) for wind power forecasting. Four wind power datasets from wind turbines in France, from 2017 to 2020, were used for the model development. A modified marine predator algorithm (MPA), which improves the search ability of the standard MPA, was used to enhance the conventional ANFIS so that the ANFIS parameters are optimized for improved prediction accuracy. The proposed model was compared with other modified ANFIS models and some benchmark models (LSTM, NN, and SVM) and was shown to have significantly better performance and accuracy. In developing or enhancing wind power forecasting models, the majority of studies have focused on point forecasting and only a few have considered interval forecasting. Compared to interval forecasting, point forecasting provides little information beyond the predicted value.
Prediction errors are inevitable in wind power forecasting due to the impact of local topography, nonlinear dynamics of wind turbines, and unplanned downtime. Interval prediction is more useful in this regard, as it yields a range of predicted values with an associated probability, thus making decision-making and risk analysis easier. An ideal prediction interval should have valid coverage in finite samples and should be as narrow as possible across the input space. The broader the interval, the higher the uncertainty of the prediction. Some of the popular and simple interval forecasting methods are mean-variance estimation, Bayesian methods, and bootstrap-based methods ^[26][41]. However, Bayesian and bootstrap-based methods usually have a high computational burden, making them time-consuming, while the mean-variance estimation method suffers from low empirical coverage probability. In addition, these methods require predefined assumptions or other prior hypotheses on the probability distribution of wind data. On the other hand, methods such as quantile regression, kernel density estimation, and ensemble simulations require no such prior assumptions ^[26][41]. However, valid coverage in finite samples is not always guaranteed with these methods. In contrast, conformal prediction can construct prediction intervals with valid coverage without prior distributional assumptions. Full conformal prediction and split conformal prediction are the two most common conformal prediction methods ^[26][41]. A wind power interval forecasting model based on a temporal convolutional network (TCN) combined with the conformalized quantile regression (CQR) algorithm was developed by Hu et al. ^[26][41]. The proposed model can control the mis-coverage rate without depending on the choice or the accuracy of the underlying estimators while adapting to heteroscedasticity within the wind data, resulting in narrow prediction intervals that lead to higher accuracy.
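The calibration step that gives conformalized quantile regression its finite-sample coverage can be sketched in a few lines of Python. This is a minimal illustration of the standard split-conformal CQR correction, assuming that lower and upper quantile forecasts have already been produced by some underlying model on a held-out calibration set; the TCN of the cited study is not reproduced, and the function names and sample values are hypothetical.

```python
import math


def cqr_correction(y_cal, lower_cal, upper_cal, alpha):
    """Compute the conformal correction from a calibration set.

    Each conformity score measures how far an observation falls outside
    its predicted quantile interval (negative if it lies inside).
    """
    scores = sorted(
        max(lo - y, y - hi) for y, lo, hi in zip(y_cal, lower_cal, upper_cal)
    )
    n = len(scores)
    if n == 0:
        raise ValueError("calibration set must not be empty")
    # Rank of the empirical quantile with the finite-sample adjustment.
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this alpha: interval is unbounded.
        return math.inf
    return scores[rank - 1]


def conformalize(lower, upper, correction):
    """Widen (or shrink) a predicted quantile interval by the correction."""
    return lower - correction, upper + correction


if __name__ == "__main__":
    # Made-up calibration data: observed power and predicted quantiles (MW).
    y_cal = [10.0, 14.0, 7.0]
    lower_cal = [8.0, 8.0, 8.0]
    upper_cal = [12.0, 12.0, 12.0]
    q = cqr_correction(y_cal, lower_cal, upper_cal, alpha=0.5)
    print(conformalize(8.0, 12.0, q))
```

Because the correction is computed from the calibration errors alone, the coverage guarantee does not depend on how accurate the underlying quantile estimator is, which matches the property described above.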
The proposed model was compared with some benchmark models (BPNN, RNN, LSTM, and GRU) and was shown to perform better in both point prediction and interval prediction, with valid coverage and narrower interval width. More importantly, drawbacks such as iterative propagation and gradient explosion associated with conventional RNN-based models are not present in the proposed model, and it is capable of processing very long sequences in parallel. This model needs to be further improved to overcome the crossing problem associated with quantile prediction. In addition, the structure of the model can be further modified to achieve joint multi-interval prediction and mitigate limitations related to the network depth and the dilation factor. Zang et al. ^[17][19] constructed a hybrid model based on ARIMA with an additional seasonal component and an LSTM network for short-term wind power forecasting of an offshore wind turbine. Because the proposed model combines linear and nonlinear techniques, neither a linear nor a nonlinear assumption about the data was required. The discrete wavelet transform (DWT) decomposition technique was used to improve the prediction accuracy of the model. In addition, data preprocessing techniques such as re-sampling, isolation forest (IF), and interpolation were used to enhance the quality of the datasets used in the study.
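As an illustration of the interpolation step mentioned among the preprocessing techniques, the sketch below fills gaps in a wind power series by linear interpolation. It is a simplified example under stated assumptions: missing samples are marked as `None`, gaps at either end are filled with the nearest known value, and the function name `interpolate_gaps` is hypothetical. The cited study's actual preprocessing pipeline is not reproduced.

```python
def interpolate_gaps(series):
    """Fill None entries by linear interpolation between known neighbours.

    Leading and trailing gaps are filled with the nearest known value
    (an assumption made here for simplicity).
    """
    known = [i for i, value in enumerate(series) if value is not None]
    if not known:
        raise ValueError("series contains no known values")

    filled = list(series)

    # Pad gaps before the first and after the last known sample.
    for i in range(known[0]):
        filled[i] = series[known[0]]
    for i in range(known[-1] + 1, len(series)):
        filled[i] = series[known[-1]]

    # Interpolate linearly inside each gap between known samples.
    for left, right in zip(known, known[1:]):
        step = (series[right] - series[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = series[left] + step * (i - left)

    return filled


if __name__ == "__main__":
    # Made-up power readings (MW) with missing samples.
    readings = [None, 2.0, None, None, 5.0, None]
    print(interpolate_gaps(readings))
```

Outlier removal, for example with an isolation forest, would typically run before this step so that anomalous readings are not used as interpolation endpoints.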