Emerging applications of IoT (the Internet of Things), such as smart transportation, health, and energy, are envisioned to greatly enhance the societal infrastructure and quality of life of individuals. In such innovative IoT applications, cost-efficient real-time decision-making is critical to facilitate, for example, effective transportation management and healthcare.
1. Introduction
The Internet of Things (IoT) envisions enabling many innovative applications, such as smart transportation, healthcare, and emergency response [1][2][3][4]. In IoT, timely decision-making using real-time sensor data is essential. For example, drivers in New York, Chicago, and Philadelphia lost 102, 104, and 90 h on average in 2021, despite a 27–37% decrease since 2019 due to the reduced traffic during the COVID-19 pandemic [5]. Real-time decision-making for efficient traffic routing based on sensor data streams from roadside sensors (if any) or dashboard-mounted smartphones can greatly alleviate traffic congestion [6][7]. Also, an agent for real-time decision-making needs to find an available route among several alternative routes to send an ambulance to a patient when some of them are unavailable because of construction, a social/political event, or a disaster [8]. As another example, patients in an emergency department or intensive care unit with abnormal shock index values have much higher mortality rates [9] and higher risks of hyperlactatemia [10] and cardiac arrest [11]. Thus, making real-time triage decisions based on the analysis of physiological sensor data from wearable devices within decision-making deadlines is desirable.
In the presence of alternative actions, a real-time decision-maker needs to select one that is currently feasible, within decision-making deadlines, using fresh sensor data that represent the current real-world status, in order to minimize, for example, traffic congestion or mortality in an emergency department. Furthermore, a real-time decision-maker should require IoT devices to provide only the minimal sensor data necessary for decision-making, to avoid possible network congestion and the significant energy consumption incurred by transmitting redundant sensor data wirelessly. Logic predicates, also called Boolean queries, can effectively evaluate alternative courses of action in IoT [8][12][13]. For example, an ambulance may try to find an available route among several alternative routes to a patient where some of them are unavailable due to construction, a social/political event, or a disaster. Let us suppose that there are two alternative routes, A-B-C and D-E-F, which are expressed as the predicate (A ∧ B ∧ C) ∨ (D ∧ E ∧ F), where ∧ and ∨ represent the logical AND and OR operators, respectively. If road segment A of the route A-B-C is unavailable, the data indicating the status of road segments B and C do not have to be retrieved from the sensors and analyzed for real-time decision making; their evaluation can be short-circuited to reduce the latency and resource consumption [8][12][13]. Similarly, an effective treatment can be selected among alternative treatments by efficiently analyzing the logic predicate in a timely manner using fresh data that represent the current status of patients in an emergency department or intensive care unit. Emergency vehicle routing and triage/treatment are used as running examples for real-time decision support.
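As a rough illustration (not taken from the cited works), the following minimal Python sketch evaluates such a DNF route predicate with short-circuiting; the segment-status checks are hypothetical placeholders for pulling and analyzing sensor data.

```python
# Minimal sketch of short-circuit evaluation of a DNF route predicate.
# Each literal is a callable that pulls/analyzes one road segment's status;
# the status table below is a hypothetical placeholder for real sensor reads.

def route_available(segments):
    """A route (conjunction) is available only if every segment is available.
    Evaluation stops at the first unavailable segment (short-circuit on AND)."""
    for check_segment in segments:
        if not check_segment():          # segment unavailable -> whole route fails
            return False                 # remaining sensors are never queried
    return True

def any_route_available(routes):
    """The DNF predicate (disjunction of routes) is true as soon as one
    route evaluates to true (short-circuit on OR)."""
    for segments in routes:
        if route_available(segments):
            return True                  # remaining routes are never evaluated
    return False

# Example: two alternative routes, A-B-C and D-E-F.
# In a real system, each lambda would retrieve and analyze fresh sensor data.
status = {"A": False, "B": True, "C": True, "D": True, "E": True, "F": True}
routes = [
    [lambda s=s: status[s] for s in ("A", "B", "C")],
    [lambda s=s: status[s] for s in ("D", "E", "F")],
]
print(any_route_available(routes))  # True: A fails, so B and C are skipped; D-E-F succeeds
```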
2. Pull Model and Data Freshness
In [8][12][13], the real-time decision-maker employs the pull model, in which it pulls (retrieves) data from sensors over a single wireless connection upon an event of interest to analyze, for example, the availabilities of alternative routes. To make decisions based on fresh data representing the current real-world status, the real-time decision-maker in [8][12][13] periodically retrieves sensor data based on their validity intervals, a notion that originated in real-time databases (RTDBs) [14][15]. A sensor data object is fresh within its predefined validity interval; however, the real-time decision-making system considers it stale after the validity interval expires. By doing this, the system ensures that it makes real-time decisions based on fresh data representing the current real-world status.
Although managing data freshness (data temporal consistency) via validity intervals can be effective in an RTDB with its own sensors, it can be too strict and expensive in IoT. First, sensor data, such as indoor temperature readings, may not change significantly over a short time period. Thus, the data could still be valid even after its validity interval expires. Periodic updates in the absence of any noteworthy change may incur unnecessary consumption of the precious wireless bandwidth and energy of IoT devices without enhancing real-time decision making.
Moreover, if a decision-making task uses several sensor data items with different validity intervals, the real-time decision-maker may have to retrieve the data repeatedly to ensure that all of them remain fresh until the decision task completes. The system should also undo and redo any analysis performed using stale data. Hu et al. [8] investigate this problem for a single decision task that uses sensor data pulled over a wireless connection. Their algorithm, called LVF (Least Volatile First), pulls the data with the longest validity interval first. By doing this, LVF minimizes repeated data retrievals for one decision task that pulls sensor data with different validity intervals.
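For illustration only, a minimal sketch of the LVF ordering is given below; the `pull` callback is a hypothetical stand-in for retrieving one data item over the shared wireless connection.

```python
# Minimal LVF sketch: within one decision task, pull the least volatile item
# (the one with the longest validity interval) first, so that items pulled
# earlier are the last to expire while the remaining items are being fetched.
# `pull` is a hypothetical blocking retrieval over the shared wireless link.

def lvf_pull_all(items, pull):
    """items: list of (name, validity_interval_in_seconds) pairs."""
    ordered = sorted(items, key=lambda item: item[1], reverse=True)
    return {name: pull(name) for name, _interval in ordered}

# Example usage with a stub reader standing in for the wireless connection.
readings = lvf_pull_all(
    [("road_A", 30.0), ("road_B", 5.0)],   # road_A is least volatile, so it is pulled first
    pull=lambda name: f"<latest {name} reading>",
)
```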
Kim et al. [16][17] extend LVF to schedule multiple real-time decision-making tasks with potentially different deadlines using fresh data. Their algorithm, called EDEF-LVF (Earliest Deadline or Expiration First-Least Volatile First), schedules the real-time task with the earliest deadline or the shortest time to the expiration of a validity interval first. Within each task, the least volatile sensor data is retrieved first, similar to [8]. They assume that there is a single bottleneck resource, such as a wireless connection, and that real-time tasks do not share any data. Under these assumptions, EDEF-LVF is optimal in the sense that it can schedule real-time decision-making tasks to meet their deadlines and data validity constraints if such a schedule exists. In addition, Kim et al. [18] devise several suboptimal heuristics to efficiently schedule real-time decision-making tasks that share sensor data with each other.
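The following is a rough sketch of the EDEF-LVF selection rule under the stated assumptions (single bottleneck resource, no shared data); the task representation is hypothetical and not taken from [16][17].

```python
# Rough EDEF-LVF sketch (hypothetical task representation).
# EDEF: pick the task whose deadline or earliest validity-interval expiration
# (over the data it has already pulled) comes first.
# LVF: within the chosen task, pull the least volatile remaining item first.

import math

def edef_key(task):
    earliest_expiration = min(
        (pulled_at + interval for _name, interval, pulled_at in task["pulled"]),
        default=math.inf,
    )
    return min(task["deadline"], earliest_expiration)

def pick_next(tasks):
    task = min(tasks, key=edef_key)                               # EDEF across tasks
    name, _interval = max(task["remaining"], key=lambda d: d[1])  # LVF within the task
    return task, name

# Example: one task needing two items, nothing pulled yet.
tasks = [{"deadline": 40.0, "pulled": [], "remaining": [("road_A", 30.0), ("road_B", 5.0)]}]
print(pick_next(tasks)[1])  # "road_A": the least volatile item is pulled first
```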
However, none of these approaches [8][12][13][16][17][18] is free of repeated sensor data retrievals and re-executions of data analytics upon the expiration of any validity interval. As a result, the precious wireless bandwidth and energy of IoT devices can be wasted, and many deadlines for real-time decision making can be missed. In an extreme case, it may become impossible to run a task using fresh data under the strict notion of validity intervals. For the sake of simplicity, let us suppose that there is only one real-time task, which needs to pull data A and B from sensors deployed in a wide area over a wireless connection with relatively low bandwidth. Using LVF, the task pulls A, which has the longer validity interval, first. When it tries to pull B, however, the wireless connection may become unstable. As a result, the sensor may have to retransmit B several times. Meanwhile, the validity interval of A expires. By the time a new version of A arrives, the validity interval of B may expire, and the whole process may repeat indefinitely. Finally, the system misses the deadline of the real-time decision-making task, wasting bandwidth and energy. If there are multiple real-time decision-making tasks in the system, the problem may become worse. In addition to the situations described above, a real-time task can be preempted by a higher-priority task, such as a task with an earlier deadline under the EDF (Earliest Deadline First) scheduling algorithm. When all higher-priority tasks are completed, the preempted task may have to pull certain sensor data again if their validity intervals have already expired.
The root cause of the problem is the rigid freshness requirement based on fixed data validity intervals. Surprisingly little work has been done to address this critical issue for cost-efficient real-time decision-making in IoT. A viable way to address the problem is an adaptive update policy based on flexible validity intervals [19][20][21]. Instead of using fixed validity intervals, the validity intervals of sensor data are dynamically adapted based on their access-to-update ratios in RTDBs, such that the validity intervals of data that are updated frequently but accessed infrequently are extended, if necessary, to reduce the update workload under overload [19][20][21]. The notion of flexible validity intervals can be extended to efficiently manage data freshness for real-time decision-making in IoT. Instead of requiring the real-time decision-maker to pull data from IoT devices, sensors start to push data to the decision-maker when they detect an event of interest, e.g., a moving object in surveillance or traffic congestion in transportation management. After sending the first sensor readings to the decision-maker upon an event, the sensors only send new data if they differ from the previous version by more than a specified threshold. They periodically send a heartbeat message to the real-time decision-maker to indicate that they are still alive and monitoring the event of interest, even though they have not transferred new data to the decision-maker because the values have changed little. When the decision-maker receives a heartbeat message from a device, it extends the flexible validity interval to the next heartbeat period. On the other hand, when the sensor data changes by more than the threshold, the device sends new data to the decision-maker. By doing this, the decision-maker can avoid significantly wasting network bandwidth, computational resources, and energy to repeatedly pull sensor data from IoT devices due to the expiration of strict validity intervals even when the actual data values hardly change.
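A minimal sketch of the decision-maker-side bookkeeping for this push model is shown below; the message kinds and the heartbeat period are illustrative assumptions rather than a prescribed protocol.

```python
# Minimal sketch of flexible validity intervals on the decision-maker side.
# A sensor pushes a DATA message only when its reading changes by more than a
# threshold, and otherwise sends periodic HEARTBEAT messages; either message
# extends the data item's validity to the next heartbeat period.
# The message layout and heartbeat period are illustrative assumptions.

import time

HEARTBEAT_PERIOD = 10.0  # seconds (assumed)

class FreshnessTable:
    def __init__(self):
        self.value = {}    # sensor_id -> latest pushed value
        self.expires = {}  # sensor_id -> time after which the value is stale

    def on_message(self, sensor_id, kind, value=None, now=None):
        now = time.monotonic() if now is None else now
        if kind == "DATA":                      # reading changed beyond the threshold
            self.value[sensor_id] = value
        # Both DATA and HEARTBEAT confirm the sensor is alive and its value is
        # still representative, so extend the flexible validity interval.
        self.expires[sensor_id] = now + HEARTBEAT_PERIOD

    def is_fresh(self, sensor_id, now=None):
        now = time.monotonic() if now is None else now
        return sensor_id in self.expires and now <= self.expires[sensor_id]
```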
3. Sensor Data Analytics via Machine Learning for Real-Time Decision Making
Machine learning is effective for analyzing sensor data. For example, the availability of a bridge or a road segment can be analyzed by a CNN (Convolutional Neural Network) [22], which is very effective for image processing and computer vision. Thus, machine learning is useful for evaluating the literals of a DNF predicate for real-time decision support. Sequence models are also useful for real-time decision support in IoT. For example, Markov decision processes [23] and partially observable Markov decision processes [23] are leveraged for near real-time health monitoring, treatments, and interventions in various medical applications [24]. More recently, long short-term memory (LSTM), an artificial recurrent neural network (RNN) architecture effective for sequence modeling, has been applied to detect emotion [25], to predict cardiovascular disease risk factors [26], and to predict healthcare trajectories [27]. Machine learning has also been applied to smart homes [28][29][30]. Guo et al. have designed a graph CNN optimized for traffic prediction [31]. In [32][33][34][35], a GRNN (General Regression Neural Network) and a GRNN-SGTM (GRNN-Successive Geometric Transformation Model) are used to recover missing IoT data. Wang et al. [36] devise a GRNN and a multivariate polynomial regression model to estimate unmeasurable water quality parameters from measurable parameters. In addition, Tien [37] gives a high-level view of IoT, (near) real-time decision making, and artificial intelligence instead of focusing on technical approaches for real-time decision support in IoT.
Although it is effective for data analytics, machine learning is resource-hungry. A complex machine learning model often consumes a significant amount of memory and computational resources, such as CPU cycles and GPU (Graphics Processing Unit) thread blocks, that may not be available in IoT devices with relatively few resources. Thus, it is hard to run sophisticated prediction models on IoT devices in a timely manner to meet stringent timing constraints. A naive approach to address this challenge is transferring all sensor data from IoT devices to the cloud, which has virtually infinite resources. However, this approach is unsustainable, as described before. Therefore, the question of “where to analyze sensor data?” is as important as the question of “how to analyze them efficiently?”. Ultimately, it is desirable to optimize the tradeoff between the timeliness and bandwidth conservation of real-time data analytics near IoT devices and the scalability of data analytics in the cloud. In this regard, Table 1 summarizes the relative advantages and disadvantages of sensor data analytics in IoT devices, at the network edge, and in the cloud; they are discussed in the following.
Table 1. Comparisons of real-time decision-making at different places.

| | Cloud | Edge | IoT End-Devices |
|---|---|---|---|
| Latency | High | Medium | Low |
| Bandwidth consumption | High | Medium | Low |
| Energy consumption | High | Medium | Low |
The first category is centralized analytics of sensor data in the cloud. A cloud has abundant computational resources and provides rich functionalities, such as supporting very deep neural networks with many layers and training complex machine learning models on big datasets. Another advantage of real-time analytics in the cloud is that it can support real-time data analytics over a wider geographic area. However, centralized data analytics for real-time decision making in the cloud has several serious drawbacks:
- It requires IoT devices to transmit all sensor data to the cloud for analytics, incurring long, unpredictable latency and many deadline misses in real-time decision making. (The Internet backbone latency is relatively long and varies significantly, from tens to hundreds of milliseconds [38].) Tardy decisions may lead to undesirable results, such as severe traffic congestion or chaos in an emergency department.
- Such a naive approach may saturate the core network, which has limited bandwidth, as the number of sensors and IoT devices increases rapidly [39][40]. It may substantially impair the performance, scalability, and availability of the Internet.
- In addition, IoT devices may consume a lot of precious energy and wireless bandwidth to transfer all their sensor data to the cloud for centralized data analytics. Typically, IoT devices communicate wirelessly for ease of deployment over a distributed area. Wireless networking consumes a significant fraction of the energy in an IoT device [41][42], and wireless IoT networks, such as LPWANs (Low-Power Wide-Area Networks) [43][44], often have stringent bandwidth constraints. Thus, centralized analytics of sensor data in IoT is unsustainable.

To address these problems, a system designer can consider the other extreme: on-device analytics, where all data analytics occur in IoT end-devices. By supporting distributed analytics of sensor data, this approach can significantly reduce the latency and bandwidth consumption compared to centralized analytics in the cloud. However, this approach also has challenges:

- IoT devices with limited resources may not be able to support sophisticated machine learning models or extensive model training. Instead, they typically use simplified models trained in the cloud to analyze local sensor data in a timely fashion [45][46]; however, the stripped-down models may suffer from lower predictive performance.
By analyzing sensor data at the network edge near IoT devices and sensors, edge analytics [47][48][49][50] aims to integrate the advantages of cloud and on-device analytics while mitigating their shortcomings. Edge computing brings more computational resources to the network edge near the data sources, and it can be supported at different places. First, IoT end-devices can preprocess sensor data and perform lightweight analytics [51][52][53]. Second, an edge node, such as an IoT gateway, access point, cellular base station, or software-defined router/switch, can collect and analyze data from IoT devices [52][54][55][56]. Third, edge servers deployed at the network edge can be leveraged for more sophisticated data analytics [49].
Thus, edge analytics for real-time decision support can be performed in a hierarchical and event-driven manner. An IoT device preprocesses sensor data and performs a lightweight analysis of them to detect any event of interest while filtering irrelevant data out. An IoT gateway, if any, further analyzes the data received from the devices connected to it and forwards important information, if any, to one or more relevant edge servers. For example, traffic cameras can send images to the edge server in charge of monitoring traffic flows in a specific area of a big city. In Li et al. [57], on-camera filtering is performed for efficient real-time video analytics. In [58], an IoT camera analyzes the traffic flow using a low-resolution image; the edge server also analyzes the image, identifies any part of the image that is important for data analytics, and requests that part in high resolution from the device. IoT devices in a smart building can transfer their sensor readings to the IoT gateway on the same floor for efficient HVAC (Heating, Ventilation, and Air Conditioning) control. In these examples, IoT devices can perform relatively simple data analytics to drop redundant or low-quality data, such as blurry images [57]. Edge servers analyze real-time sensor data from multiple IoT devices/gateways to derive a more comprehensive view of the real-world status essential for real-time decision making. They can also communicate with each other to exchange information for a global view of real-world situations, such as the overall traffic flow in a city or hurricane paths across a nation. Edge computing and analytics are a booming area of research and industrial adoption due to their significant potential. Leveraging emerging edge computing for cost-efficient real-time decision support is at an early stage of research with ample room to grow.
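As a rough sketch of the device-side filtering step described above, the snippet below drops blurry frames before they are forwarded to an edge server; the variance-of-Laplacian sharpness measure, the threshold, and `send_to_edge` are assumptions for illustration, not the specific methods of [57] or [58].

```python
# Minimal sketch of on-device filtering before edge analytics: drop frames that
# are too blurry to be useful and forward only the rest to the edge server.
# The sharpness measure and threshold are assumptions; send_to_edge() stands in
# for the real uplink to the edge server.

import cv2  # OpenCV

SHARPNESS_THRESHOLD = 100.0  # assumed; tune per camera and scene

def filter_and_forward(frame, send_to_edge):
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    sharpness = cv2.Laplacian(gray, cv2.CV_64F).var()  # low variance -> blurry frame
    if sharpness >= SHARPNESS_THRESHOLD:
        send_to_edge(frame)      # only non-blurry frames consume uplink bandwidth
        return True
    return False                 # blurry frame dropped locally
```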
Overall, efficient evaluations of predicates are important across IoT devices, gateways, and edge/cloud servers to significantly reduce latency as well as energy and bandwidth consumption. The efficiency of real-time decision-making can also be further enhanced by effectively exploiting cloud, on-device, and edge analytics frameworks and synthesizing them to optimize timing, predictive performance, bandwidth, and other resource consumption. Relatively little work, however, has been done for real-time decision-making in IoT from this holistic, overarching perspective.
Another promising direction for real-time analytics of sensor data on IoT devices is model compression [59][60][61][62]. The key idea of model compression is to compact a machine learning model to minimize its resource requirements without significantly reducing its predictive performance. In particular, deep learning has been very successful and has outperformed other machine learning techniques in killer applications, such as computer vision and natural language processing. DNNs (Deep Neural Networks) with many hidden layers and parameters, however, consume a lot of memory, computation time, and energy. They are too big and too expensive to learn on low-end IoT devices. The motivation for model compression is to significantly reduce the memory consumption and computational complexity of DNNs without significantly compromising their accuracy. Effective approaches for model compression include (1) compact models, (2) tensor decomposition, (3) data quantization, and (4) network sparsification [59]:
- Compact CNNs (Convolutional Neural Networks) are created by leveraging the spatial correlation within a convolutional layer to convolve feature maps with multiple weight kernels. (Compact RNNs (Recurrent Neural Networks) for sequence data analysis have also received significant attention from researchers [59].) They also leverage the intra-layer and inter-layer channel correlations to aggregate feature maps with different topologies. In addition, network architecture search (NAS) aims to automatically optimize the DNN architecture.
- Tensor/matrix operations are the basic computations in neural networks. Thus, compressing tensors, typically via matrix decomposition, is an effective way to shrink and accelerate DNNs.
- Data quantization decreases the bit width of the data that flow through a DNN model to reduce the model size and save memory while simplifying the operations for computational acceleration.
- Network sparsification attempts to make neural networks sparse, via weight pruning and neuron pruning, instead of simplifying the arithmetic via data quantization (a minimal sketch of pruning and quantization follows this list).
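To make the last two ideas concrete, the toy numpy sketch below applies magnitude-based weight pruning and uniform 8-bit quantization to a single weight matrix; it is an illustrative simplification, not a production compression pipeline.

```python
# Toy illustration of two model-compression ideas on a single weight matrix:
# (1) network sparsification via magnitude-based weight pruning, and
# (2) data quantization of float weights to 8-bit integers with a per-tensor scale.
# Real frameworks apply these per layer, often followed by fine-tuning.

import numpy as np

def prune_by_magnitude(weights, sparsity=0.9):
    """Zero out the smallest-magnitude weights so that `sparsity` of them are removed."""
    threshold = np.quantile(np.abs(weights), sparsity)
    return np.where(np.abs(weights) < threshold, 0.0, weights)

def quantize_uint8(weights):
    """Uniform affine quantization of float weights to 8-bit integers."""
    w_min, w_max = float(weights.min()), float(weights.max())
    scale = (w_max - w_min) / 255.0 or 1.0
    q = np.round((weights - w_min) / scale).astype(np.uint8)
    return q, scale, w_min  # dequantize with: q * scale + w_min

w = np.random.randn(256, 256).astype(np.float32)
w_sparse = prune_by_magnitude(w, sparsity=0.9)   # ~90% zeros, storable in a sparse format
q, scale, zero = quantize_uint8(w_sparse)        # 4x smaller than float32 per weight
```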
Model compression in hardware, as well as hardware and algorithm co-design, is also effective. Good surveys are given in [59][60].