Accurate forecasting is one of the keys to the efficient use of limited energy resources and plays an important role in sustainable development. The existing research related to power forecasting can be grouped into three categories: (1) statistical models, (2) deep learning models, and (3) hybrid models. Statistical approaches are often based on autoregressive linear regression models. Deep learning models are based on various types of neural networks that employ a fixed lookback window to produce forecasts. Hybrid models combine two or more methods to create a single forecast.
1. Introduction
Limited natural resources, coupled with an incessantly growing global demand for energy, have given rise to an unsustainable situation. As the world’s population continues to expand, the need for fuel and energy has skyrocketed. Of particular concern is the rapid rise in demand for electricity, in part driven by the widespread adoption of electric cars and other environmentally friendly alternatives. The limited capacity for electricity generation necessitates careful planning and efficient production to meet these escalating demands.
Efficiency in power distribution and utilization hinges upon the accurate forecasting of future supply and demand. The ability to forecast electricity generation accurately and in a timely manner enables utilities to maximize the utilization of available resources. As a result, forecasting electricity generation has become an imperative task in modern industrialized societies. However, the increasing integration of renewable energy sources into today’s power systems has introduced greater volatility into electricity generation. Despite an extensive body of literature dedicated to forecasting electricity prices, few recent studies have addressed the specific challenge of forecasting electricity generation. This research aims to fill this gap by proposing a novel approach to forecasting electric power generation based on transfer learning.
The existing approaches to energy forecasting can be grouped into three main categories: statistical, deep learning, and hybrid methods. Statistical methods employ autoregressive models to construct forecasts. While they are capable of producing accurate forecasts, they require significant execution time. Deep learning approaches utilize a variety of neural network models to perform forecasts. While deep learning models have produced accurate results in some cases, they are not well-suited for energy-related forecasts due to the lack of data. Deep learning thrives on big data, and the time series available in energy domains are typically too limited to fully train deep models. Finally, hybrid models combine two or more methods to design a multi-step forecasting model. While these approaches may work in special cases, they are difficult to generalize. Recently, transfer learning has shown immense potential to accelerate the training process and improve accuracy.
Transfer learning, a powerful technique for training deep learning models, has proven successful in various domains, such as computer vision, speech recognition, and large language models. However, its application within the context of electric power forecasting has been limited. In this research, the researchers propose a forecasting model based on transfer learning principles. In particular, they employ a hitherto unexplored approach based on zero-shot learning in order to achieve minimum execution time. Unlike traditional transfer learning paradigms, zero-shot learning does not require additional fine-tuning of the pretrained base model, which leads to a more efficient and flexible model. To implement this strategy, the researchers utilize the Neural Basis Expansion Analysis for Time Series (N-BEATS) model, trained on a vast dataset comprising diverse time series data. The trained model is then applied to forecast electric power generation using zero-shot learning.
Transfer learning has emerged to tackle the issue of effectively training deep learning models in the regime of limited data availability. The core idea of transfer learning is to pretrain a model on an extremely large dataset and then fine-tune the trained model on the specific task using an additional, smaller dataset. Since recurrent models that process data in a sequential manner are notoriously slow to train, they are not well suited for transfer learning. On the other hand, the N-BEATS model consists of fully connected blocks with residual connections that are fast to train. Consequently, it is better suited for transfer learning than recurrent models.
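The doubly residual stacking idea behind N-BEATS can be sketched with toy linear blocks: each block emits a "backcast" that is subtracted from its input before the next block sees it, and a "forecast" that is summed across blocks. The block definitions below are illustrative placeholders, not the actual N-BEATS basis expansions.

```python
def make_block(backcast_weight, forecast_weight):
    """A toy block: scales its input for the backcast and forecasts
    a weighted mean of its input as a one-step-ahead prediction."""
    def block(x):
        backcast = [backcast_weight * v for v in x]
        forecast = forecast_weight * sum(x) / len(x)
        return backcast, forecast
    return block

def nbeats_stack(x, blocks):
    """Doubly residual stacking: backward links remove what each block
    explained; forward links sum the partial forecasts."""
    residual = list(x)
    total_forecast = 0.0
    for block in blocks:
        backcast, forecast = block(residual)
        residual = [r - b for r, b in zip(residual, backcast)]  # backward link
        total_forecast += forecast                              # forward link
    return total_forecast

# Two identical toy blocks applied to a short history.
blocks = [make_block(0.5, 1.0), make_block(0.5, 1.0)]
prediction = nbeats_stack([1.0, 2.0, 3.0], blocks)
```

Because every block only sees the residual left by its predecessors, each one specializes in whatever structure remains unexplained, which is the property that makes the real architecture interpretable.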
2. Statistical Models
Traditional statistical techniques remain popular approaches to electricity forecasting. In most cases, the time series is modeled via an autoregressive linear regression model: the series’ past values (input) are employed to predict its future values (output). Model parameters are commonly estimated via least squares linear regression, maximum likelihood estimation, or autoregressive moving average methods. Recently, linear regression models with a large number of input features that employ regularization techniques have risen to prominence. The most commonly used regularization method is the least absolute shrinkage and selection operator (LASSO), which penalizes the L1 norm of the coefficient vector and has been shown to improve the model’s performance ^{[1]}^{[2]}. Another development has been the application of additional preprocessing techniques such as variance-stabilizing transformations ^{[3]} and long-term seasonal components ^{[4]}. In addition, ensemble methods that combine multiple forecasts of the same model calibrated on different windows have also become popular ^{[5]}. Ensemble algorithms based on a statistical analysis of the forecast error, signal shape, and difference of derivatives in joint points have been used to forecast current and voltage time series ^{[6]}. Finally, the researchers note that unlike financial time series forecasting, where generalized autoregressive conditional heteroskedastic (GARCH) models are ubiquitous, electricity forecasting is more accurate using the basic autoregressive (AR) and autoregressive integrated moving average (ARIMA) models ^{[7]}^{[8]}.
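For illustration, the least squares estimation of a simple AR(2) model can be sketched in a few lines of stdlib Python; a statistical package would solve this more robustly, but the sketch makes explicit that the past values are the regressors for the next value.

```python
def fit_ar2(series):
    """Estimate y_t = a*y_{t-1} + b*y_{t-2} by solving the 2x2 normal equations."""
    X = [(series[t - 1], series[t - 2]) for t in range(2, len(series))]
    y = [series[t] for t in range(2, len(series))]
    # Normal equations (X'X) theta = X'y, written out for two coefficients.
    s11 = sum(x1 * x1 for x1, _ in X)
    s12 = sum(x1 * x2 for x1, x2 in X)
    s22 = sum(x2 * x2 for _, x2 in X)
    r1 = sum(x1 * yi for (x1, _), yi in zip(X, y))
    r2 = sum(x2 * yi for (_, x2), yi in zip(X, y))
    det = s11 * s22 - s12 * s12
    a = (s22 * r1 - s12 * r2) / det
    b = (s11 * r2 - s12 * r1) / det
    return a, b

def predict_next(series, a, b):
    """One-step-ahead forecast from the two most recent observations."""
    return a * series[-1] + b * series[-2]
```

On data generated by an exact AR(2) recursion, the fit recovers the generating coefficients up to floating-point error; on real series, of course, the residual noise term makes the estimates approximate.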
3. Deep Learning Models
Deep learning is used extensively in time series forecasting. The main workhorse has been the recurrent neural network (RNN) and its extensions, such as gated recurrent units (GRUs) and long short-term memory (LSTM) networks. The RNN model is designed specifically for sequence-to-sequence modeling, where the output at the next time step is based on the current input as well as the previous output ^{[9]}. A large number of studies have used LSTM, whose architecture is designed to mitigate the vanishing gradient problem, as the core forecasting model with varying degrees of success ^{[10]}^{[11]}^{[12]}.
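The recurrence at the heart of these models can be illustrated with a single-unit Elman-style cell; the weights below are arbitrary toy values, not a trained model.

```python
import math

def rnn_step(x_t, h_prev, w_x=0.5, w_h=0.8, b=0.0):
    """One RNN step: the new hidden state mixes the current input with the
    previous hidden state through a squashing nonlinearity."""
    return math.tanh(w_x * x_t + w_h * h_prev + b)

def run_rnn(inputs):
    """Process a sequence one element at a time, carrying the hidden state."""
    h = 0.0  # initial hidden state
    for x_t in inputs:
        h = rnn_step(x_t, h)
    return h  # a readout layer would map this state to the forecast
```

Note that the same `rnn_step` is applied at every position, so the hidden state depends on the entire history in order; this sequential dependency is also what makes recurrent models slow to train on long series.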
Convolutional neural networks (CNNs) have also been employed in time series forecasting. Notably, the temporal convolutional network (TCN) has been a popular forecasting model in various applications, including electricity forecasting ^{[13]}. The TCN model consists of three main components: causal convolutions, dilated convolutions, and residual connections. More recently, pure deep learning models based on stacked fully connected layers have shown impressive results. The N-BEATS model showed state-of-the-art performance on several well-known datasets, including the M3, M4, and TOURISM competition datasets containing time series from diverse domains ^{[14]}. Its architecture employs backward and forward residual links and a very deep stack of fully connected layers. It has several desirable properties: it is interpretable, applicable without modification to a wide array of target domains, and fast to train. A similar model, Neural Hierarchical Interpolation for Time Series Forecasting (N-HiTS), showed even greater accuracy on a wide range of forecasting tasks, including electricity forecasting ^{[15]}.
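The first two TCN components can be sketched together as a causal dilated convolution over a univariate series: the output at time t depends only on inputs at t, t−d, t−2d, and so on, never on the future. This is a bare-bones illustration; a real TCN stacks many such layers (with learned kernels and residual connections) so that the receptive field grows exponentially with depth.

```python
def causal_dilated_conv(x, kernel, dilation=1):
    """Causal dilated 1-D convolution: kernel[0] multiplies the most recent
    sample, kernel[k] the sample k*dilation steps in the past.
    Missing past values are implicitly zero-padded."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # ignore positions before the start of the series
                acc += w * x[idx]
        out.append(acc)
    return out
```

With `dilation=2`, each output mixes the current sample with the one two steps back, skipping the intermediate sample, which is how stacked dilations cover long histories cheaply.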
4. Hybrid Models
Over the last several years, there has been an explosion of hybrid forecasting models, which comprise at least two of the following five modules ^{[16]}:
- Decomposing data ^{[17]}^{[18]};
- Feature selection ^{[19]}^{[20]};
- Clustering data ^{[21]};
- One or more forecasting models whose predictions are combined ^{[22]}^{[23]};
- A heuristic optimization algorithm to estimate either the models or their hyperparameters ^{[24]}.
By combining two or more different techniques, researchers aim to overcome the deficiencies of individual components. The drawbacks of hybrid approaches include increased computing time and increased model complexity, which hinders model interpretability.
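As an illustration of how two such modules might be combined, the toy pipeline below decomposes a series into a moving-average trend plus a residual, forecasts each component with a simple rule, and sums the component forecasts. Both the decomposition and the per-component rules are placeholders for whatever methods a given hybrid model actually uses.

```python
def moving_average(series, window=3):
    """Trailing moving average; early positions use a shorter window."""
    out = []
    for t in range(len(series)):
        chunk = series[max(0, t - window + 1): t + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def hybrid_forecast(series, window=3):
    """Decompose into trend + residual, forecast each part, recombine."""
    trend = moving_average(series, window)
    residual = [s - tr for s, tr in zip(series, trend)]
    trend_forecast = trend[-1] + (trend[-1] - trend[-2])  # linear extrapolation
    residual_forecast = residual[-1]                      # naive persistence
    return trend_forecast + residual_forecast
```

Even this toy version exhibits the trade-off noted above: the forecast is the sum of two sub-forecasts, so explaining any single prediction requires inspecting every module in the chain.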
5. Transfer Learning
There are several studies that employ transfer learning in energy forecasting. Existing methods use traditional transfer learning techniques, where a model is pretrained on external (source) data that are similar to the target data. Then, the target data are used to fine-tune the model ^{[25]}^{[26]}^{[27]}^{[28]}^{[29]}. In ^{[25]}, the authors employed a conventional approach to transfer learning: first, a CNN model is pretrained on public data that are similar to the target data; then, the last layers of the model are retrained using the target training data. A similar approach is employed in ^{[26]}, where a model is pretrained using external data that are similar to the target data. Then, the target data are doubled via a generative adversarial network to increase the size of the target training data, and the last layers of the model are retrained using the enlarged target data. Furthermore, in ^{[27]}, the authors explored three different transfer learning strategies: (1) retraining only the last hidden layer, (2) using the pretrained model to initialize the weights of the final model, and (3) retraining the last hidden layer from scratch. Sequential transfer learning, where the time frame of the source data precedes that of the target data, was explored in ^{[30]}.
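The flavor of these strategies can be sketched with a toy linear model: pretrain both parameters on plentiful source data, then update only the final (offset) parameter on scarce target data while freezing the rest. This is only an illustrative analogue of retraining the last layer; the model, data, and learning rate are all made up.

```python
def sgd_fit(xs, ys, a=0.0, b=0.0, lr=0.01, steps=2000, freeze_a=False):
    """Per-sample gradient descent on (a*x + b - y)^2; optionally freeze a."""
    for _ in range(steps):
        for x, y in zip(xs, ys):
            err = (a * x + b) - y
            if not freeze_a:
                a -= lr * err * x
            b -= lr * err
    return a, b

# Pretrain on plentiful source data drawn from y = 2x + 1.
src_x = [0.0, 1.0, 2.0, 3.0, 4.0]
src_y = [2 * x + 1 for x in src_x]
a, b = sgd_fit(src_x, src_y)

# Fine-tune only the offset on scarce target data from y = 2x + 3,
# keeping the "feature extractor" parameter a frozen.
tgt_x = [1.0, 2.0]
tgt_y = [2 * x + 3 for x in tgt_x]
a, b = sgd_fit(tgt_x, tgt_y, a=a, b=b, freeze_a=True)
```

The point of freezing is visible even here: two target samples are enough to adapt the offset, whereas refitting both parameters from scratch on two points would be far less stable.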
Another approach to transferring knowledge from external data is based on data augmentation. In ^{[31]}, the authors propose a transfer learning approach called Hephaestus designed for cross-building energy forecasting. In this approach, the information from other buildings (source) is used together with the target building to train a forecasting model. Since both the source and target information is used simultaneously to train the model, this approach is akin to augmenting the training data. Other innovative approaches include chain transfer learning, where multiple models are trained in sequence. In ^{[32]}, the authors use chain-based transfer learning to forecast several energy meters. In this approach, a model for the first meter is trained conventionally using an RNN. The model for the next meter starts the training process from the pretrained model of the first meter. The process continues in a chain-like manner.
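A minimal sketch of the chain idea, using a toy one-parameter model per meter: each fit starts from the weight learned for the previous meter rather than from scratch. The meter data below are placeholders.

```python
def fit(xs, ys, a=0.0, lr=0.05, steps=500):
    """Per-sample gradient descent on (a*x - y)^2 for a one-parameter model."""
    for _ in range(steps):
        for x, y in zip(xs, ys):
            a -= lr * ((a * x) - y) * x
    return a

# Two meters with similar (but not identical) behavior.
meters = [
    ([1.0, 2.0], [1.5, 3.0]),  # meter 1: slope 1.5
    ([1.0, 2.0], [1.6, 3.2]),  # meter 2: slope 1.6
]

a = 0.0
for xs, ys in meters:
    # Each meter's model is initialized with the previous meter's weight.
    a = fit(xs, ys, a=a)
```

Because consecutive meters behave similarly, each model in the chain starts close to its optimum, which is the rationale for chaining rather than training every meter independently.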