3. Big Data Analytics in a Power System Context
BDA refers to a class of techniques for processing data, encompassing methods such as K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), Deep Neural Networks (DNN), and Long Short-Term Memory (LSTM). BDA has been widely applied to complex problems including classification, clustering, pattern recognition, predictive analysis, data forecasting, statistical analysis, and natural language processing.
3.1. Big Data Analytics
3.1.1. Artificial Neural Network (ANN)
ANN is among the most popular Artificial Intelligence (AI) approaches; it was inspired by biological nervous systems, which consist of neurons and a web of interconnections between them. Simple processing elements, each emulating a typical biological neuron, and a network of those interconnected elements allow complex patterns to be identified within an input dataset, and allow those identified patterns to be stored and quickly applied in the future. An ANN is usually made of three types of layers: the input layer, the hidden layer, and the output layer. The input layer receives the data and is connected to the hidden layer, which in turn connects to the output layer. Based on the interconnection of nodes and the movement of data between layers, ANNs can be classified into two basic categories: feedforward ANNs and recurrent ANNs. In a feedforward ANN, information moves from the input layer to the output layer in one direction only; a recurrent ANN also allows some of the information to move in the opposite direction.
Figure 1 presents a simplified structure of an ANN consisting of hidden and output layers. Each layer has a set of weights (W) applied to its input vector (p); a bias vector (b) is then added to the weighted input before it is passed through a transfer function (f), i.e., a = f(Wp + b). The output of the transfer function in the hidden layer serves as the input to the output layer, and the output of the transfer function in the output layer is the final output of the ANN. The ANN parameters, namely the weights and biases, are determined by training: a dataset containing both inputs and the expected outputs is fed to the ANN, and the parameters are adjusted until the ANN outputs match the outputs of the training dataset to a certain desired accuracy [5].
Figure 1. Illustration of the structure of an ANN.
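The layer computation described above can be sketched as follows. This is a minimal illustration, not taken from the text: the two-layer sizes, the tanh transfer function, and the variable names are assumptions made for the example.

```python
import numpy as np

def forward(p, W1, b1, W2, b2, f=np.tanh):
    """Forward pass of a two-layer ANN: hidden layer, then output layer.

    Each layer computes f(W @ input + b), matching a = f(Wp + b).
    """
    a1 = f(W1 @ p + b1)   # hidden-layer output
    a2 = f(W2 @ a1 + b2)  # final ANN output
    return a2

# Illustrative 2-input, 3-hidden-neuron, 1-output network with random weights.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 2)), np.zeros(3)
W2, b2 = rng.normal(size=(1, 3)), np.zeros(1)
y = forward(np.array([0.5, -0.2]), W1, b1, W2, b2)
```

Training would repeatedly adjust W1, b1, W2, and b2 until y matches the expected output for each training input.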
ANN has been widely used in power system applications that require identifying a hidden relationship between known input–output datasets and then applying that relationship to an input dataset with unknown outputs to quickly predict those outputs. Owing to its powerful data analytics capabilities, it has also been investigated for load forecasting, economic dispatch, fault diagnosis, harmonic analysis, and system security assessment
[11].
The ANN structure, such as the network size (e.g., the numbers of input/output neurons) or the choice of transfer function, is not universal across applications; it should be selected and optimized for the needs of the specific application. Owing to its prediction accuracy and high adaptability, ANN has also attracted great attention beyond power systems, in a wide range of applications in prediction, clustering, curve fitting, etc.
3.1.2. Deep Learning Techniques
Deep learning techniques have attracted a great deal of attention in the past decade
[12] due to their promising results and great accuracy in large-scale pattern recognition, especially in solving visual recognition problems
[6]. Deep learning techniques include Multilayer Perceptrons (MLPs), autoencoders, Convolutional Neural Networks (CNNs), LSTM, Recurrent Neural Networks (RNNs), etc.
MLP, one of the earliest deep learning techniques, is a feedforward neural network with multiple perceptron layers equipped with activation functions; its layers are Fully Connected (FC). MLPs have been widely applied to image and speech recognition problems.
Autoencoders consist mainly of three components: an encoder, a code (the compressed representation), and a decoder. They are a specific type of feedforward neural network trained to reproduce their inputs at their outputs. They have mainly been used for image processing, popularity prediction, etc.
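This encoder/code/decoder structure can be sketched as below. The dimensions (an 8-dimensional input compressed to a 3-dimensional code) and the random, untrained weights are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 8-dimensional input compressed to a 3-dimensional code.
n_in, n_code = 8, 3
W_enc = rng.normal(size=(n_code, n_in))  # encoder weights
W_dec = rng.normal(size=(n_in, n_code))  # decoder weights

def encode(x):
    return np.tanh(W_enc @ x)            # code: compressed representation

def decode(z):
    return W_dec @ z                     # reconstruction of the input

x = rng.normal(size=n_in)
x_hat = decode(encode(x))
# Training would adjust W_enc and W_dec to minimize this reconstruction
# error, so that the outputs reproduce the inputs.
error = float(np.mean((x - x_hat) ** 2))
```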
CNNs consist of convolution layers containing sets of learnable filters/kernels that slide across the input data to extract patterns. The convolution layers are followed by an FC neural network used for classification. In both the convolutional and FC layers, multiple layers are typically stacked to model the complexity of patterns in large datasets. As one of the most popular deep learning techniques, CNN is widely used for image recognition. RNNs, whose connections form directed cycles, feed outputs back into the hidden state as inputs, allowing previous inputs to be memorized. RNNs are commonly used for time-series analysis and natural language processing.
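Both mechanisms can be sketched minimally as follows; the array sizes, stride-1 "valid" convolution, and tanh recurrence are assumptions chosen for illustration.

```python
import numpy as np

def conv2d_valid(x, kernel):
    """Slide a kernel over the 2-D input (stride 1, no padding),
    extracting one local feature per position."""
    kh, kw = kernel.shape
    H = x.shape[0] - kh + 1
    W = x.shape[1] - kw + 1
    out = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel)
    return out

def rnn_step(h_prev, x, W_h, W_x, f=np.tanh):
    """One RNN step: the previous hidden state h_prev is fed back in,
    which is how earlier inputs are memorized."""
    return f(W_h @ h_prev + W_x @ x)

# A 2x2 all-ones kernel sliding over a 4x4 all-ones input: every position
# sums four ones, so each extracted feature equals 4.0.
features = conv2d_valid(np.ones((4, 4)), np.ones((2, 2)))
```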
It has been found that deep learning techniques typically perform better than general neural networks, especially on problems with multiple data classes and in machine learning applications with complex data structures. To cope with large amounts of data and to facilitate the extraction of more informative features
[13], deep learning techniques usually require a relatively large number of hierarchical layers, implying substantial computational effort
[14]. Thanks to developments in high-performance hardware, sometimes designed specifically for faster execution of these techniques, deep learning is becoming increasingly popular in various practical applications.
With its powerful pattern recognition capabilities and great accuracy in extracting information from complex data structures (formed by measurements collected from various locations and devices with different sampling frequencies), deep learning has great potential for solving various power system problems. For instance, the autoencoder, an unsupervised deep learning approach, has been applied to load profile classification
[15]. CNNs have been used to estimate the state–action value function for residential load control
[16], although their application at a system level remains limited. Further exploration is needed to fully exploit the powerful capabilities of CNNs in pattern recognition and estimation applications.
3.2. Approaches to Integrate BDA in a Power System Context
Due to the large scale of actual power systems/networks, it is impractical, if not impossible, or cost-ineffective to obtain measurements at all desired locations. On the other hand, not all measurement data are useful for achieving a given application's objective. Variables/parameters in BDA should therefore be carefully selected and processed to ensure that the collected data are useful for the selected or prevailing system scenario.
Furthermore, power system performance highly depends on the topologies and operating conditions that vary constantly. It is desirable that the topology/configuration of power systems can be embedded in the input matrices of the BDA learning mechanisms. Per
[17], the System Area Mapping process of feature extraction from the input data matrix was analyzed from a power system configuration perspective. The input matrix was arranged in such a way that each patch summarizes the topology information of the corresponding area of the considered power system. Taking a 24-bus test network (given in
Figure 2) as an example, different square patches in the input matrix map to corresponding areas of the considered power system. By sliding kernels across these square patches, the features/characteristics of each local area can be extracted and integrated into a higher level of the feature map. Usually, many different kernels are used to extract information from different aspects, capturing the varied set of characteristics and patterns that exist in the considered power system. Through this large set of kernels and a number of feature extraction layers, useful information captured at the local level is eventually summarized and integrated at the global level. These approaches should be tailored to the power system context, particularly because the performance of a technique within the BDA greatly depends on the considered power system, its configuration/topology, and its operating conditions.
Figure 2. Illustration of system area mapping.
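The patch-to-area mapping idea can be sketched as below, under illustrative assumptions: a hypothetical 6x6 input matrix in which each non-overlapping 3x3 patch summarizes the measurements of one area (so 2x2 = 4 areas, not the 24-bus case in Figure 2), and random kernel values as placeholders for learned ones.

```python
import numpy as np

rng = np.random.default_rng(1)

PATCH = 3                    # each 3x3 patch maps one area of the network
X = rng.normal(size=(6, 6))  # input matrix covering 2x2 = 4 areas

def area_features(X, kernel):
    """Apply one kernel to each area patch, producing one local feature
    per area of the considered power system."""
    n = X.shape[0] // PATCH
    feat = np.zeros((n, n))
    for a in range(n):
        for b in range(n):
            block = X[a * PATCH:(a + 1) * PATCH, b * PATCH:(b + 1) * PATCH]
            feat[a, b] = np.sum(block * kernel)  # local feature of area (a, b)
    return feat

# Many different kernels extract different aspects of each area's pattern;
# stacked feature-extraction layers would then integrate these local
# features toward a global summary.
kernels = [rng.normal(size=(PATCH, PATCH)) for _ in range(4)]
feature_maps = [area_features(X, k) for k in kernels]  # one 2x2 map per kernel
```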