1. Introduction
Current information and communication technologies (ICTs) have achieved a high degree of penetration in all critical infrastructure (CI) systems, owing to the ever-increasing capabilities of their services in terms of coverage, throughput capacity, latency, scalability, and privacy
[1,2,3,4]. In power systems, the massive introduction of telecommunication devices accelerated the shift toward smart grids (SGs)
[5] that come with a whole new package of functionalities such as automated control, smart sensing and metering, high-power converters, and modern energy management techniques based on the optimization of demand, energy, and network availability
[6]. The high-performance smart grid thereby allows new applications to be integrated into the network, such as distributed generation, the Industrial Internet of Things (IIoT), and electric vehicles
[7]. This comes, however, at the expense of increased complexity, which brings new vulnerabilities and broadens the attack surface
[8]. Recent extreme events such as natural disasters, cyber-attacks, and man-made errors, which we refer to as high-impact, low-probability (HILP) events, have shown that SGs are susceptible to strong disruptions given the large-scale networks they represent and the attendant interdependencies
[9]. Some recent examples are the power disruptions in the US in 2017, caused by hurricanes and wildfires
[10], which resulted in cumulative damage of $306.2 billion and affected a total of 47 million people, nearly 15 percent of the nation’s population. For instance, at the peak of Hurricane Irma, more than 6.7 million electricity customers were without power
[11], and Hurricane Maria severely damaged the Puerto Rico power grid, leaving 1.5 million people without power
[12]. China’s severe ice storm in 2008 resulted in the service disruption of 2000 power substations and 8500 towers leading to power interruptions in 13 provinces and 170 cities
[13], and over 4 million customers were without power for more than seven days during the Great East Japan Earthquake in 2011
[14]. During the Ukraine power grid cyber-attack in 2015, 30 power substations were turned off, and hundreds of thousands of people were without electricity for a period from 1 to 6 h
[15,16].
Events like these reveal the need for strategies that are able to cope with such harsh impacts, especially given that the capacity to operate resiliently against attacks and natural disasters is one of the multiple smart grid attributes
[17]. Resilience is defined as the ability to “anticipate, absorb, adapt to and/or rapidly recover from a disruptive event”
[18]. In line with this definition, the U.S. Presidential Policy Directive 21 (PPD-21) introduces resilience as “the ability to prepare for and adapt to changing conditions and withstand and recover rapidly from disruptions”
[19]. This same directive involves the “fail-safe” paradigm in system engineering through recommendations for cyber-physical security, while highlighting the shift toward the “safe-to-fail” paradigm brought by cyber-physical resilience. Many conceptual frameworks have been proposed for understanding and evaluating resilience. The time dimension is central to them, as various facets (anticipation, absorption, robustness, survivability, mitigation, flexibility, adaptability, restoration, and recovery) are linked to different temporal phases that describe system performance during an extreme event
[20,21,22,23,24,25]. Resilience moves from traditional risk assessment, which relies on probabilistic analysis of likely failures, toward dealing with unexpected events, requiring mitigation and healing strategies. The main difference is that risk assessment aims to achieve situational awareness and diagnosis, while resilience moves one step further by incorporating reactive actions against the contingency and launching restoration operations, which maintain the functionality of most critical loads and/or make them rapidly recoverable
[26].
Within the growing literature on power system resilience
[27,28,29,30,31], utilities are particularly interested in quantitative assessments of resilience, which propose relevant indicators to guide cost-benefit studies before planning investments. In this context, the multi-dimensional characteristics of resilience are a considerable challenge
[32,33,34]. Ouyang and Dueñas-Osorio
[35] tackled technical, organizational, and social dimensions of resilience, while providing an alternative to evaluate the economic dimension by estimation of economic losses. Only the technical dimension of the power network is widely investigated in the literature
[36], which reveals the need to examine all other dimensions for a comprehensive analysis of resilience
[33,37,38]. Technical and organizational dimensions are the most suitable in the case of power grids, as they are applicable at the level of individual systems, while social and economic dimensions are better suited to the community level (interdependent systems), toward which resilience studies should converge in the future
[38]. Temporal multi-phase resilience quantification is a well-adopted technique that can embed other dimensions by linking them to technical and organizational dimensions through the implementation of enhancement strategies. Unlike
[35], most proposed metrics in literature exclude pre-event and post-recovery phases, suggesting that quantification is conducted for a single scenario and not for a sequence of disruptive events, which corroborates the relevance of resilience for HILP disruptions. Work in
[39] introduced resilience-based component importance measures centered on the recovery phase: on the one hand, by establishing a ranking for load restoration using optimal repair time, and on the other hand, by quantifying the potential loss in optimal system resilience due to a delay in the component repair process (computed through the resilience reduction worth metric). Likewise,
[40] focuses on the recovery stage of resilience, with the goal of comparing different restoration strategies and selecting an appropriate performance measure. Authors in
[41] proposed a multi-phase framework to assess the resilience of the UK power transmission network under a windstorm. The framework considered both infrastructural and operational aspects, introducing four simple metrics to describe the degree and speed of degradation, duration of the disruption, and recovery speed. Grid connectivity and operational metrics can jointly describe the whole span of post-event analysis, and be used for planning short-term mitigation and recovery, or long-term hardening
[42]. Resilience strategies to minimize system performance loss can be further analyzed under budget constraint by a tri-level planner-attacker-defender model
[43], where a planner optimizes long-term transmission network expansion before an attack hits the system. Short-term switching operations are then applied in reaction. Resilience is quantified using customer demand not supplied, which includes both mitigation and recovery capabilities in the system. Many other optimization models and performance measures are adopted in related studies
[44,45].
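To make the multi-phase quantification discussed above more concrete, the following minimal Python sketch computes four phase-based quantities (degree and speed of degradation, duration of the degraded state, and recovery speed) together with an estimate of demand not supplied from a sampled performance trace. It is an illustration under stated assumptions and not the formulation used in [41] or [43]: the names `phase_metrics` and `PhaseMetrics`, the use of supplied load as the performance measure, the single degradation and recovery cycle, and the trapezoidal integration are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass
class PhaseMetrics:
    degradation_depth: float    # pre-event level minus lowest level (e.g., MW)
    degradation_speed: float    # depth / time from onset to lowest level
    degraded_duration: float    # time spent at the lowest level
    recovery_speed: float       # depth / time from end of trough to full recovery
    demand_not_supplied: float  # area between pre-event level and the trace


def phase_metrics(times, performance):
    """Compute phase-based metrics from one sampled performance trace.

    Assumptions (hypothetical, for illustration only): `times` is strictly
    increasing, the first sample is the pre-event level, and the trace
    contains a single degradation followed by a full recovery.
    """
    times = list(times)
    performance = list(performance)
    if len(times) != len(performance) or len(times) < 2:
        raise ValueError("times and performance must have equal length >= 2")

    nominal = performance[0]
    lowest = min(performance)
    if lowest >= nominal:
        raise ValueError("trace shows no degradation")

    # Onset: last sample still at the pre-event level before the first drop.
    first_drop = next(i for i, p in enumerate(performance) if p < nominal)
    onset = first_drop - 1

    # Trough: first and last samples at the lowest performance level.
    trough_start = performance.index(lowest)
    trough_end = len(performance) - 1 - performance[::-1].index(lowest)

    # Recovery: first sample after the trough back at the pre-event level.
    recovered = next(
        (i for i in range(trough_end + 1, len(performance))
         if performance[i] >= nominal),
        None,
    )
    if recovered is None:
        raise ValueError("trace does not return to the pre-event level")

    depth = nominal - lowest

    # Trapezoidal estimate of the unserved demand over the whole trace.
    shortfall = [max(nominal - p, 0.0) for p in performance]
    unserved = sum(
        (shortfall[i] + shortfall[i + 1]) / 2 * (times[i + 1] - times[i])
        for i in range(len(times) - 1)
    )

    return PhaseMetrics(
        degradation_depth=depth,
        degradation_speed=depth / (times[trough_start] - times[onset]),
        degraded_duration=times[trough_end] - times[trough_start],
        recovery_speed=depth / (times[recovered] - times[trough_end]),
        demand_not_supplied=unserved,
    )
```

For an assumed trace sampled at hours 0, 2, 4, 7, 10, and 12 with supplied load of 100, 100, 40, 40, 100, and 100 MW, this sketch gives a degradation depth of 60 MW, a degradation speed of 30 MW/h, a degraded duration of 3 h, a recovery speed of 20 MW/h, and 330 MWh of demand not supplied. Real assessments would replace the sampled trace with the output of a system model or with empirical outage data.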
Given how widely power networks are stretched, resilience studies conducted at the system level for generation and transmission do not include distribution grid components, or include them only negligibly
[35,39,40,41,42,43]. In 2010, only 15% to 20% of feeders implemented distribution automation in the North American grid, one of the most advanced electrical systems
[46]. This illustrates that the power distribution network (PDN) is the most fragile level of electrical systems due to legacy “blindness” and manual operations along with electromechanical components
[47], especially given that an estimated 90% of customer outages in the US are related to this part of the system
[48].
The advent of smart grids renewed interest in enhancing the PDN performance
[49], as nearly all SG-provided capabilities (self-healing, high reliability, energy management, and real-time pricing) are enabled by technologies introduced at the distribution level, such as advanced metering, automation, distributed generation, and distributed storage
[50]. ICTs are the main enabler of this new portfolio of applications
[7], by transforming a traditionally one-way, limited-control, and radial PDN into a two-way power flow, intelligent, and mesh-networked grid capable of guaranteeing improved service for all connected loads
[51]. In this regard, the expected high-performance capabilities of the smart distribution grid can succeed in coping with most failures in the system
[49,50]. Nevertheless, the smart PDN remains susceptible to HILP events, and in some cases is even more prone to them, due to increased uncertainty (in events, load, distributed generation, market prices)
[52,53,54] and a strong dependency on telecommunications that widens the attack surface
[55] and may cause undesirable cascade effects
[56]. Consequently, the resilience of smart PDN becomes a concern from both electric and communication domain perspectives, as a failure in the telecommunication service may affect the electric service
[57] and vice-versa
[58]. Recent publications recommend a joint handling of smart PDN resilience quantification, as the robustness and adaptation ability of a coupled system are even lower than those of a single system
[56,59,60,61]. However, such an approach needs to build upon a solid understanding of resilience assessment of the electric and communication domains when considered separately.
The present paper aims to set the ground for future joint evaluation of PDN resilience by reviewing relevant works, centered thus far on electric service and, to a lesser extent, on ICT service. Essentially, the type of HILP event is identified from each selected contribution, with details on the method used for contingency characterization. Also, the measure of performance, recognized as an enabler for resilience quantification
[62], is tracked through this work to explain how it is defined and computed, relying usually on system modeling, or on empirical and surrogate models in some cases. In addition, a classification based on the temporal phase in which resilience evaluation takes place is proposed, which allows for addressing practical requirements of utility companies. The resilience phase-based approach is linked with different objectives of the assessment, from simple metrics evaluation to either planning or response for survivability and recovery, achieved through a variety of improvement strategies whose allocation is optimized under the constraint of a limited budget. This bridges resilience studies and economic considerations in order to help stakeholders in investment plan elaboration and crisis decision-making. Aspects of cost, critical load, microgrids, and uncertainty of hazards, load, and distributed generation are discussed to show their importance, together with the tools available to date for including them in resilience studies.
With this work, we extend the wide spectrum of subjects associated with resilience quantification in power networks (modeling and simulation, enhancement strategies, metrics, and extreme events) covered in recent reviews
[28,36,44,45,63,64,65]. The main contributions and novelty of this paper can be summarized as follows: (a) a focus on resilience assessment of both the electric and telecommunications domains of smart power distribution networks; (b) a detailed analysis and classification of performance calculation techniques; and (c) a fine-grained categorization of quantitative resilience works based on time of evaluation and target objective.
Finally, despite the considerable number of works analyzed and relatively deep examination of reviewed methods for resilience quantification in smart PDNs, this paper does not claim to be comprehensive in the issues addressed (and related references), but remains complete enough to give a good overall perspective of the research trends and understanding of challenges and opportunities.