Reinforcement Learning in Home Energy Management System: History

The steep rise of reinforcement learning in various energy applications, together with the penetration of home automation in recent years, is the motivation for this article. It surveys the use of reinforcement learning in various home energy management system (HEMS) applications, which have been classified into a few major categories. The survey also considers which reinforcement learning algorithms are used in each HEMS application. This investigation reveals that research on the use of reinforcement learning in smart homes is still in its infancy.

  • home energy management systems (HEMS)
  • reinforcement learning (RL)
  • deep neural network (DNN)
  • Q-value
  • policy gradient
  • natural gradient
  • actor–critic
  • residential
  • commercial
  • academic

1. Introduction

The largest group of consumers of electricity in the US are residential units. In the year 2020, this sector alone accounted for approximately 40% of all electricity usage [1]. The average daily residential consumption of electricity is 12 kWh per person [2]. Therefore, effectively managing the usage of electricity in homes, while maintaining acceptable comfort levels, is vital to address the global challenges of dwindling natural resources and climate change. Rapid technological advances have now made home energy management systems (HEMS) an attainable goal that is worth pursuing. HEMS consist of automation technologies that can respond to continuously or periodically changing conditions in the home environment, as well as relevant external conditions, without human intervention [3,4]. In this review, the term ‘home’ is taken in a broad context to also include all residential units, classrooms, apartments, office complexes, and other buildings in the smart grid [5,6,7,8].
Artificial Intelligence (AI), more specifically machine learning, is one of the key contributing factors that have helped realize HEMS today [9,10,11]. Reinforcement learning (RL) is a class of machine learning algorithms that is making deep inroads in various applications in HEMS. This learning paradigm incorporates the twin capabilities of learning from experience and learning at higher levels of abstraction. It allows algorithmic agents to replace human beings in the real world, including in homes and buildings, in applications that had hitherto been considered out of reach.
RL allows an algorithmic entity to make sequences of decisions and implement actions from experience in the same manner as a human being [12,13,14,15,16,17]. Deep neural networks (DNN) have proven to be a powerful tool in RL, endowing the RL agent with the capability to adapt to a wide variety of complex real-world applications [18,19]. Moreover, it has been proposed in [20] that RL can attain the ultimate goal of artificial general intelligence [21].
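To make the learning loop concrete, below is a minimal tabular Q-learning sketch for a toy two-state heating problem. The environment, states, actions, and rewards are entirely hypothetical, chosen only to illustrate the experience-driven update at the heart of RL.

```python
import random

# Minimal tabular Q-learning sketch. The tiny "thermostat" environment
# (states, actions, rewards) is hypothetical, purely for illustration.
STATES = ["cold", "comfortable"]
ACTIONS = ["heat_on", "heat_off"]

def step(state, action):
    # Heating a cold room makes it comfortable but costs energy;
    # idling while comfortable is the ideal (free) outcome.
    if state == "cold":
        return ("comfortable", -1.0) if action == "heat_on" else ("cold", -2.0)
    return ("comfortable", 0.0) if action == "heat_off" else ("comfortable", -1.0)

def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    random.seed(seed)
    Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(episodes):
        state = "cold"
        for _ in range(10):  # short fixed-length episode
            # Epsilon-greedy action selection: mostly exploit, sometimes explore.
            if random.random() < epsilon:
                action = random.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: Q[(state, a)])
            nxt, reward = step(state, action)
            # Q-learning update from experience (no model of the environment).
            best_next = max(Q[(nxt, a)] for a in ACTIONS)
            Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
            state = nxt
    return Q

Q = train()
# Greedy policy extracted from the learned Q-values.
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in STATES}
print(policy)
```

After training, the greedy policy heats when cold and idles when comfortable, which is the optimal behavior for this toy environment. DNN-based RL replaces the table `Q` with a neural network so that much larger state spaces can be handled.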
Consequently, RL has found its way into many application domains today. It has been applied extensively to robotics [22]. Specific applications in this area include robotic manipulation with many degrees of freedom [23,24] and the navigation and path planning of mobile robots and UAVs [25,26,27]. RL finds widespread applications in communications and networking [28,29,30]. It has been used in 5G-enabled UAVs with cognitive capabilities [31], cybersecurity [32,33,34], and edge computing [35]. In intelligent transportation systems, RL is used in a range of applications such as vehicle dispatching in online ride-hailing platforms [36].
Other domains where RL has been used include hospital decision making [37], precision agriculture [38], and fluid mechanics [39]. The financial industry is another important sector where RL has been adopted for several scenarios [40,41,42]. It is of little surprise that RL has been extensively used to solve various problems in energy systems [43,44,45,46,47]. Another review article on the use of RL [47] considers three application areas: frequency control, voltage control, and energy management.
RL is increasingly being used in HEMS applications and several review papers have already been published. The review article in [48] focuses on RL for HVAC and water heaters. The paper in [49] is based on research published between 1997 and 2019. The survey observes that only 11% of published research reports the deployment of RL in actual HEMS. The article in [50] specifically focuses on occupant comfort in residences and offices. A more recent review on building energy management [51] focuses on deep neural network-based RL. A recent article [52] considers RL along with model predictive control in smart building applications. The article in [53] is a survey of RL in demand response.

2. Home Energy Management Systems

HEMS refers to a slew of automation techniques that can respond to continuously or periodically changing internal conditions of the home/building, as well as relevant external conditions, without the need for human intervention. This section addresses the enabling technologies that make this an attainable goal.

2.1. Networking and Communication

All HEMS devices must be able to exchange data with each other using the same communication protocol. HEMS provides the occupants with tools that allow them to monitor, manage, and control all the activities within the system. Advancements in technology, more specifically in IoT-enabled devices and wireless communication protocols such as ZigBee, Wi-Fi, and Z-Wave, have made HEMS feasible [54,55]. These smart devices are connected through a home area network (HAN) and/or to the internet, i.e., a wide area network (WAN).
The choice of communication protocol for home automation is an open question. To a large extent, it depends on the user’s personal requirements. If it is desired to automate a smaller set of home appliances with ease of installation, operating in a plug-and-play manner, Wi-Fi is the appropriate choice. However, with more extensive automation requirements involving tens to hundreds of smart devices, Wi-Fi is no longer optimal, owing to issues of scalability and signal interference. More importantly, due to its relatively high energy consumption, Wi-Fi is not appropriate for battery-powered devices.
Under these circumstances, ZigBee and Z-Wave are more appropriate [56]. These communication protocols dominate today’s home automation market. There are many common features shared between the two protocols. Both use RF communication and offer two-way communication. Both ZigBee and Z-Wave enjoy well-established commercial relationships with various companies, with hundreds of smart devices using one of these protocols.
Z-Wave is superior to ZigBee in terms of transmission range (120 m with three devices as repeaters vs. 60 m with two devices working as repeaters). In terms of inter-brand operability, Z-Wave again holds the advantage. However, ZigBee is more competitive in terms of data transmission rate as well as in the number of connected devices. Z-Wave was created specifically for home automation applications, while ZigBee is used in a wider range of settings such as industry, research, health care, and home automation [57]. A study conducted in [58] foresees ZigBee as the most likely standard communication protocol for HEMS. However, given the numerous factors at play, it is difficult to say with certainty whether this forecast will materialize; it is also possible that an alternative communication protocol will emerge in the future.
HEMS requires this level of connectivity to be able to access electricity price from the smart grid through the smart meter and control all the system’s elements accordingly (e.g., turn on/off the TV, control the thermostat settings, determine the charge/discharge battery timings, etc.). In some scenarios, HEMS uses the forecasted electricity prices to schedule shiftable loads (e.g., washing machine, dryer, electric vehicle charging) [54].
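Scheduling a shiftable load against forecast prices can be sketched as a search for the cheapest contiguous run window. The price vector and the 2-hour cycle length below are hypothetical examples, not data from any surveyed system.

```python
# Sketch: pick the cheapest contiguous start hour for a shiftable load
# (e.g., a 2-hour washing-machine cycle) given forecast hourly prices.
# The price values and cycle duration are hypothetical.

def best_start_hour(prices, duration):
    """Return (start_hour, total_price) minimizing the summed price
    over any contiguous window of `duration` hours."""
    costs = [sum(prices[t:t + duration]) for t in range(len(prices) - duration + 1)]
    start = min(range(len(costs)), key=costs.__getitem__)
    return start, costs[start]

# Hypothetical day-ahead prices ($/kWh) over a 24-hour horizon.
forecast = [0.10, 0.09, 0.08, 0.08, 0.09, 0.11, 0.15, 0.20,
            0.22, 0.21, 0.19, 0.18, 0.17, 0.17, 0.18, 0.20,
            0.24, 0.28, 0.30, 0.26, 0.20, 0.15, 0.12, 0.10]
start, cost = best_start_hour(forecast, duration=2)
print(f"Cheapest 2-hour window starts at hour {start} (price sum ${cost:.2f})")
```

Real HEMS schedulers must additionally respect deadlines (e.g., laundry done by morning) and appliance power profiles, but the price-window search above is the core idea.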

2.2. Sensors and Controller Platforms

HEMS consists of smart appliances with sensors; these IoT-enabled devices communicate with the controller by sending and receiving data. They collect information from the environment and/or about their electricity usage using built-in sensors. The smart meter gathers information regarding the consumer’s total consumption from the appliances, the peak load period, and the electricity price from the smart grid.
The controller can be a physical computer located on the premises that is equipped with the ability to run complex algorithms. An alternative approach is to leverage any of the cloud services available to consumers through cloud computing firms.
The controller gathers information from the following sources: (i) the energy grid through the smart meter, which includes the power supply status and electricity price, (ii) the status of renewable energy and the energy storage systems, (iii) the electricity usage of each smart device at home, and (iv) the outside environment. Then it processes all the data through a computational algorithm to take specific action for each device in the whole system separately [5].

2.3. Control Algorithms

AI and machine learning methods are increasingly being adopted in HEMS [10,59]. HEMS algorithms incorporated into the controller might be in the form of simple knowledge-based systems. These approaches embody a set of if-then-else rules, which may be crisp or fuzzy. However, due to their reliance on a fixed set of rules, such methods may not be of much practical use with real-time controllers. Moreover, they cannot effectively leverage the large amount of data available today [5]. Although it is possible to impart a certain degree of trainability to fuzzy systems, the structural bottleneck of consolidating all inputs using only conjunctions (and) and disjunctions (or) persists.
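A knowledge-based controller of this kind can be sketched as a fixed set of crisp if-then-else rules. The setpoints, price threshold, and rule priorities below are hypothetical choices made for illustration.

```python
# A knowledge-based HVAC controller: a fixed set of crisp if-then-else
# rules. All thresholds and setpoints are hypothetical.

COMFORT_LOW, COMFORT_HIGH = 20.0, 24.0   # comfort band, degrees Celsius
PEAK_PRICE = 0.25                        # $/kWh peak-price threshold

def rule_based_hvac(room_temp, price):
    # Rule 1: stay off during peak prices unless far outside the comfort band.
    if price >= PEAK_PRICE and COMFORT_LOW - 2 <= room_temp <= COMFORT_HIGH + 2:
        return "off"
    # Rule 2: heat when below the comfort band.
    if room_temp < COMFORT_LOW:
        return "heat"
    # Rule 3: cool when above the comfort band.
    if room_temp > COMFORT_HIGH:
        return "cool"
    # Default: stay off inside the comfort band.
    return "off"

print(rule_based_hvac(room_temp=18.5, price=0.10))  # cold room, cheap power
print(rule_based_hvac(room_temp=25.0, price=0.30))  # warm room, peak price
```

Because the thresholds are hard-coded, such a controller cannot adapt when prices, occupancy patterns, or building characteristics change, which is precisely the limitation of fixed rule sets noted above.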
Numerical optimization methods comprise another class of computational methods for the smart home controller. These methods entail an objective function that is to be either minimized (e.g., cost) or maximized (e.g., occupant comfort), as well as a set of constraints imposed by the underlying physical HEMS appliances and their limitations. Due to its simplicity, linear programming is a popular choice for this class of algorithms. More recently, game-theoretic approaches have emerged as an alternative for various HEMS optimization problems [5].
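As an illustration, a flexible-demand scheduling problem can be posed as a small linear program: minimize the total energy cost subject to an energy requirement and a per-hour power cap. The prices, energy requirement, and cap are hypothetical, and SciPy's `linprog` is used only as one convenient solver.

```python
from scipy.optimize import linprog

# Toy linear program: schedule 6 kWh of flexible demand across 4 hours
# to minimize cost, with a 3 kWh-per-hour cap. All numbers are hypothetical.
prices = [0.30, 0.10, 0.20, 0.15]   # $/kWh per hour (objective coefficients)
energy_needed = 6.0                  # total kWh that must be delivered
per_hour_cap = 3.0                   # kWh limit in any single hour

res = linprog(
    c=prices,                                     # minimize sum(price_t * x_t)
    A_eq=[[1.0] * len(prices)], b_eq=[energy_needed],   # sum(x_t) == 6
    bounds=[(0.0, per_hour_cap)] * len(prices),   # 0 <= x_t <= cap
    method="highs",
)
schedule = [round(x, 6) for x in res.x]
print(schedule, round(res.fun, 4))
```

The solver fills the two cheapest hours to their caps, which matches the intuition that a linear objective with box constraints is optimized greedily by price.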
In recent years, artificial intelligence and machine learning, more specifically deep learning techniques, have become popular for HEMS applications. Deep learning takes advantage of all the available data for training the neural network to predict the output and control the connected devices. It is very helpful for forecasting the weather, load, and electricity price. Furthermore, it handles non-linearities without resorting to explicit mathematical models. Since 2013, there have been significant efforts directed at using deep neural networks within an RL framework [60,61], and these have met with much success.

3. Use of Reinforcement Learning in Home Energy Management Systems

This section presents the survey of RL approaches for various HEMS applications. All articles in this survey were published in established technical journals, in print or online, within the past five years.

3.1. Application Classes

In this study, all applications were divided into five classes as in Figure 1 below.
Figure 1. HEMS Applications. All applications of reinforcement learning in home energy management systems are classified into the five categories shown.
(i)
Heating, Ventilation and Air Conditioning, Fans and Water Heaters: Heating, ventilation, and air conditioning (HVAC) systems alone are responsible for about half of the total electricity consumption [48,101,102,103,104]. In this survey, HVAC, fans and water heaters (WH) have been placed under a single category. Effective control of these loads is a major research topic in HEMS.
(ii)
Electric Vehicles, Energy Storage, and Renewable Generation: The charging of electric vehicles (EVs) and energy storage (ES) devices, i.e., batteries, is studied in the literature as in [105,106]. Wherever applicable, EV and ES must be charged in coordination with renewable generation (RG) such as solar panels and wind turbines. The aim is to make decisions in order to save energy costs, while addressing comfort and other consumer requirements. Thus, EV, ES, and RG have been placed under a single class for the purpose of this survey.
(iii)
Other Loads: Suitable scheduling of several home appliances such as dishwasher, washing machine, etc., can be achieved through HEMS to save energy usage or cost. Lighting schedules are important in buildings with large occupancy. These loads have been lumped into a single class.
(iv)
Demand Response: With the rapid proliferation of green energy sources in homes and buildings, and their integration into the grid, demand response (DR) has acquired much research significance in HEMS. DR programs help in load balancing by scheduling and/or controlling shiftable loads and by incentivizing participants [107,108] to do so through HEMS. RL for DR is one of the classes in this survey.
(v)
Peer-to-Peer Trading: Home energy management has been used to maximize profit for prosumers by trading electricity with each other directly in peer-to-peer (P2P) trading or indirectly through a third party as in [109]. Currently, theoretical research on automated trading is receiving significant attention. P2P trading is the fifth and final application category to have been considered in this survey.
Each application class is associated with an objective function and a building type that are discussed in subsequent paragraphs. The schematic in Figure 2 shows all links that have been covered by the articles in this survey.
Figure 2. Building Types and Objectives. The building type and the RL’s objective of each application class. Note that the links are based on the existing literature covered in the survey. The absence of a link does not necessarily imply that the building type/objective cannot be used for the application class.
Figure 3 shows the number of research articles that applied RL to each class. Note that a significant proportion of these papers addressed more than one class. More than a third of the papers we reviewed focused only on HVAC, fans, and water heaters. Just above 10% of the papers studied RL control for energy storage (ES) systems, and only 7% focused on energy trading. However, most of the papers (46%) targeted more than one class.
Figure 3. Application Classes. The total number of articles in each application class (left), as well as their corresponding proportions (right).

3.2. Objectives and Building Types

Within these HEMS applications, RL has been applied in several ways. It has been used to reduce energy consumption within residential units and buildings [110]. It has also been used to achieve a higher comfort level for the occupants [111]. In operations at the interface between the residential units and the energy grid, RL has been applied to maximize prosumers profit in energy trading as well as for load balancing.
For this purpose, we break down the objectives into three different types as listed below.
(i)
Energy Cost: the cost incurred by the consumer for using any electrical device; in most cases it is proportional to the energy consumption. In this paper we use the terms ‘cost’ and ‘consumption’ interchangeably.
(ii)
Occupant Comfort: the main factor affecting occupant comfort is thermal comfort, which depends mainly on the room temperature and humidity.
(iii)
Load Balance: Power supply companies try to achieve load balance by reducing the power consumption of consumers at peak periods to match the station power supply. The consumers are motivated to participate in such programs by price incentives.
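These objectives are often combined into a single scalar reward for the RL agent. The weighted-sum sketch below is one hypothetical formulation: the weights, temperature setpoint, and peak threshold are illustrative choices, and the surveyed papers formulate their rewards in many different ways.

```python
# Hypothetical weighted reward combining the three objectives above:
# energy cost, occupant (thermal) comfort, and load balance. All weights
# and thresholds are illustrative, not taken from any surveyed paper.

def reward(energy_kwh, price, room_temp, grid_load_kw,
           setpoint=22.0, peak_threshold=5.0,
           w_cost=1.0, w_comfort=0.5, w_balance=0.3):
    cost_term = -price * energy_kwh                          # energy cost
    comfort_term = -abs(room_temp - setpoint)                # thermal discomfort
    balance_term = -max(0.0, grid_load_kw - peak_threshold)  # peak-load penalty
    return (w_cost * cost_term
            + w_comfort * comfort_term
            + w_balance * balance_term)

# Off-peak and comfortable: only a small cost penalty.
print(round(reward(energy_kwh=1.0, price=0.10, room_temp=22.0, grid_load_kw=3.0), 3))
# Peak period, too warm, heavy load: all three penalties apply.
print(round(reward(energy_kwh=2.0, price=0.30, room_temp=25.0, grid_load_kw=7.0), 3))
```

Tuning the weights shifts the agent's behavior along the cost-comfort trade-off discussed throughout this survey; a cost-only objective corresponds to setting the comfort and balance weights to zero.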
Figure 2 illustrates the RL objectives that were used in each application class.
Next, all buildings and complexes were categorized into the following three types.
(i)
Residential: for the purpose of this survey, individual homes, residential communities, as well as apartment complexes fall under this type of building.
(ii)
Commercial: these buildings include offices, office complexes, shops, malls, hotels, as well as industrial buildings.
(iii)
Academic: academic buildings range from schools, university classrooms, buildings, research laboratories, up to entire campuses.
The research literature in this survey revealed that for residential buildings, RL was applied in all five application classes. However, in the case of commercial and academic buildings, RL was typically applied to the first three categories, i.e., to HVAC, fans and WH, to EVs, ESs and RGs, as well as to other loads. This is shown in Figure 2.
Figure 4 illustrates the outcome of this survey. It may be noted that in the largest proportion of articles (42%) the RL algorithm took into account both cost and comfort. About 27% of all articles addressed cost as the only objective, thereby defining the second largest proportion.
Figure 4. Objectives and Building Types. Proportions of articles in each objective (left) and building type (right).

3.3. Deployment, Multi-Agents, and Discretization

The proportion of research articles where RL was actually deployed in the real world was studied. It was found that only 12% of research articles report results where RL was used with real HEMS. The results are consistent with an earlier survey [49] where this proportion was 11%. The results are shown in Figure 5.
Figure 5. Real-World, Multi-Agents, and Discretization. Proportions of articles deployed in real world HEMS (left), using multi-agents (middle), and whether the states/actions are discrete or continuous (right).

This entry is adapted from the peer-reviewed paper 10.3390/en15176392
