Data Gathering Techniques in WSN: A Cross-Layer View: Comparison
Please note this is a comparison between Version 1 by Mark Shifrin and Version 4 by Jason Zhu.

Wireless sensor networks (WSNs) have taken a giant leap in scale, expanding their applicability to a large variety of technological domains and applications, ranging from the Internet of things (IoT) for smart cities and smart homes to wearable healthcare technology, underwater, agricultural and environmental monitoring and many more. This expansion continues rapidly in terms of the variety, heterogeneity and number of devices that such applications support. Data collection is commonly the core application in WSN and IoT networks, which are typically composed of a large variety of devices, some constrained by their resources (e.g., processing, storage, energy) and some by highly diverse demands. Many challenges span all the conceptual communication layers, from the Physical to the Application layer. In addition, the integrated unit architecture and the platform design can be subject to various stringent constraints. For example, size requirements can impose a strict constraint on the device design; low power consumption, low production cost, and self-operation can represent additional constraints. Accordingly, the device architecture is fundamental and affects many other factors in the system. For example, power supply affects the life span; it also affects transmission range, memory, and processing unit, which in turn can affect the algorithms that can be executed on the device, etc.

  • wireless sensor networks (WSNs)
  • Internet of things (IoT)
  • Architecture

1. Introduction

Wireless sensor networks (WSN) are data measurement and gathering networks based on small hardware (HW) units capable of sensing, monitoring, or measuring their surroundings. The sensed data are transmitted directly or by relay via other sensors to some sink or server or a base station. The ultimate objective of such a configuration is to provide control or exploration capabilities over an area where the network is deployed. WSN characteristics can vary substantially: they can be composed of a few to hundreds of thousands of sensors; the monitored terrain can range from a small coverage area (e.g., the human body) to a vast realm (e.g., a forest area for fire detection); the sensed variables of interest of the surroundings are diverse (e.g., weather or health parameters, acceleration, pollution); and the sensors can have different characteristics (e.g., size, computational power, energy source).
The Internet of things (IoT) aims to improve day-to-day life. The concept includes smart cities, smart homes, pervasive health care, assisted living, environmental monitoring, surveillance, and so on. The IoT paradigm relies on interconnecting a large number of devices (things) linked by the Internet via heterogeneous access networks through which they can exchange information with one or more Internet gateways that can process the data, take action, and forward them to another destination if needed. Since many IoT devices are expected to be wireless, and since sensing is one of the main tasks and tools utilized by the IoT paradigm, IoT systems will rely extensively on WSN technology. The scale of scenarios where WSN are deployed nowadays is vast. Traditionally, WSN were classified based on their placement (e.g., terrestrial, underground, multimedia) [1]. Since WSNs are closely associated with IoT, contemporary classification tends to re-attribute the notions of the WSN domain to the IoT domain [2] and classify them based on their primary objectives, such as smart cities [3,4], healthcare [5], retail and leisure [6], utilities (e.g., smart home energy control, water metering and leak detection, and other general infrastructure monitoring networks) [7], agriculture and environmental safety (e.g., smart farming and harvesting, pest control [8,9,10], seismology monitoring [11,12], oceanology [13]), and more.
As previously explained, one of the main tasks of both WSN and IoT systems is data collection and dissemination. Reports are collected from the devices, and updates and operational assignments are distributed. Maintenance and functional assessments are also collected and disseminated. Data collection and dissemination in very dense networks such as WSNs and IoT networks is challenging and draws significant attention from both the industrial and academic communities: these networks span heterogeneous devices, a significant percentage of which are expected to be small, with very constrained processing, storage, and energy resources and with minimal network capabilities. Some of these challenges include: (i) Information management: the amount of information collected or needing to be disseminated to the relevant entities is enormous, and some is expected to be redundant, both in terms of the information sent by each device, which can be compressed, and in terms of the same information received by different entities. Accordingly, innovative data compression techniques are required to reduce the data transmitted over wireless channels, together with aggregation techniques that exploit the redundancy between information sent by the different entities. (ii) Data analysis and reaction: the expected vast data exchange and the low latency requirement (at least for some of the information collected) require processing and analysis of data in real-time or near real-time, to enable timely decision making and instantaneous action-taking.
The ability to successfully transmit and gather vast streams of data incoming from an enormous number of devices and sensors and finally to successfully analyze them, in order to automatically control a much larger scope of everyday life systems, directly couples the process of data gathering with Big Data related challenges (e.g., [14,15,16,17]). Furthermore, leveraging Cloud Computing platforms offers significant advances in data analytical abilities (e.g., [18,19,20]). It provides new horizons to further develop and increase the size of WSN/IoT networks both in the sense of the number of sensing units and in the sense of the amount of the acquired data (e.g., [21,22,23]). (iii) Connectivity: collecting and disseminating data from and to many devices, potentially through vast, dense, heterogeneous networks, will be one of the biggest challenges of the future of IoT; accordingly, novel MAC protocols and coding schemes should be devised to meet this challenge. In this respect, air time utilization and energy efficiency are of primary importance for the MAC layer protocol design. Any MAC layer protocol should ensure that devices utilize the wireless channel frugally and with minimum energy consumption. (iv) Security and Privacy: connecting enormous numbers of devices to the Internet exposes the IoT network to serious security vulnerabilities, all the more so since the relevant entities have limited resources. Accordingly, issues such as authenticity, data encryption, and vulnerability to attacks (e.g., device impersonation) are critical for the IoT paradigm’s continuous growth (e.g., [24]). In addition, since the information transmitted over the WSN and IoT networks can be highly confidential (e.g., health reports, device tracking), the collection and dissemination of this information create significant challenges related to data protection and privacy.
This survey will explore the state of the art of the data collection and dissemination aspects in WSN and IoT environments mentioned above. We will review essential milestones yet mainly focus on recent publications and present the new trends and research directions. Our resources included mainly Google Scholar, IEEE Xplore and our university’s library databases, searched with the keywords of this paper. We also used important references from the bibliography of the initial papers and ones that cited them. Data collection spans all the networking layers, from the physical implementation of transmitting bits across a communication medium to the application layer. Due to the wide-ranging scope of the topic, we will not be able to cover all its aspects (for example, in this paper, we will not discuss the critical topic of security and privacy). Some of the issues will be covered more thoroughly than others. However, since some of the topics we discuss rely on general wireless communication technology and on broad setup protocols which are not data-gathering oriented per se, for some of the topics we will provide a more comprehensive background and describe protocols that are aimed at a broader domain than data gathering. For example, many medium access control (MAC) and wireless routing protocols are designed for a wide range of topologies, traffic patterns, quality of service requirements, etc. Even though they can be applied to data gathering, they are not explicitly designed for it. We will include some more general yet essential studies in our survey. To grasp the whole picture and to better understand some data-gathering-related issues, in some cases, we will delve into the pertinent background and stray into some peripheral topics. We will cover topics related to all layers of the protocol stack. Sometimes classification by layer is not clear-cut, as some of the issues involve multiple layers.

2. Application-Oriented

Many sensor platforms are application-oriented. Occasionally, their suggested architecture can be applied to other applications; however, their design and evaluation are typically aimed at a specific one. Hence, in many cases, both hardware and software technological developments are introduced for effective functioning. One of the most common tasks of WSN is the obvious one of monitoring a terrain. There are many variants of WSN monitoring. For example, the requirement can be to monitor every point in the Field of Interest (FoI) vs. monitoring a limited number of specific locations or targets (aka target coverage) vs. just monitoring a border of a region to detect intruders (aka barrier coverage). The coverage problem typically involves selecting a subset of sensors that fulfill the monitoring objective while maintaining network connectivity. The sensors’ capabilities and the monitoring objective determine the network topology.
We present several recent examples that mainly concentrate on connectivity and data gathering under the constraints of the monitoring objective. Biswas et al. [25] focus on energy-efficient data gathering in the target coverage problem, in which an n-sensor WSN needs to monitor T specific targets, and there exists a (multi-hop) route from each source to the sink. The paper assumes that the source nodes that sense the targets and initiate data packets into the network are known, and deals with the forwarding of these packets to the sink. The paper proposes a distributed data gathering algorithm in which, after each node discovers its neighbors and their hop count to the sink, it forwards data packets (when required) to the neighbor with the maximum remaining energy among those with a lower hop count to the sink (the remaining energy is assumed to be known). Ammari [26] focuses on the k-coverage problem, in which each point in the FoI is required to be covered by at least k sensors at any time, and each active sensor participating in the monitoring task is required to be connected to the sink (possibly via a multi-hop route). The paper assumes that the sensors are heterogeneous (they do not have the same characteristics) and mobile; hence, the sensors can move toward any region of interest in the deployment field to participate in any deficient k-coverage area and can also act as mobile proxy sinks that collect sensed data from the sensors and deliver them to the sink. Ammari [26] partitions the problem into two problems which are solved sequentially: the mobile k-coverage problem, which selects a minimum subset of active sensors that solves the k-coverage problem, and the data gathering problem, which devises a forwarding scheme from the active sensors to the sink such that the energy consumption due to sensor mobility and communication is minimized.
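The next-hop rule described above for Biswas et al. [25] can be illustrated with a minimal sketch. The `Neighbor` record and the `choose_next_hop` function are hypothetical names introduced here for illustration only; the sketch assumes, as the paper does, that each node already knows its neighbors' hop counts and remaining energy, and it is not the authors' implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Neighbor:
    node_id: str
    hop_count: int           # hops from this neighbor to the sink
    remaining_energy: float  # assumed known to the forwarding node


def choose_next_hop(my_hop_count: int, neighbors: List[Neighbor]) -> Optional[Neighbor]:
    """Pick the neighbor with the most remaining energy among those closer to the sink."""
    closer = [n for n in neighbors if n.hop_count < my_hop_count]
    if not closer:
        return None  # no neighbor makes progress toward the sink
    return max(closer, key=lambda n: n.remaining_energy)


if __name__ == "__main__":
    nbrs = [
        Neighbor("a", 2, 0.9),
        Neighbor("b", 1, 0.4),
        Neighbor("c", 1, 0.7),
        Neighbor("d", 3, 1.0),
    ]
    print(choose_next_hop(2, nbrs))
```

Restricting the choice to neighbors with a lower hop count keeps packets moving toward the sink, while the energy criterion spreads the forwarding load across those neighbors.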
Mdemaya and Bomgni [27] utilize mobile sensors to achieve area coverage. These mobile sensors can be moved and relocated to cover holes after the random deployment. The authors suggest a two-phase approach. In the first phase, the area monitored after the initial random deployment is identified (by the base station, BS), and mobile nodes are relocated to cover the monitoring holes detected after the initial deployment, trying to ensure full coverage of the area of interest (AoI) by the static and relocated sensors. In the second phase, the proposed algorithm schedules the sensors’ activity (awakening and transmission times) so as to minimize the energy consumption of the nodes while collecting and sending data to the base station. To this end, the paper distinguishes between “normal” nodes and cluster heads. A survey that reviews algorithms and techniques related to the connectivity-coverage issues in WSN can be found in Boukerche and Sun [28].
Occasionally, WSN architectures and designs are more application-oriented. For example, Cerchecci et al. [29] propose a sensor node topology that uses low-cost and low-power components for energy-efficient waste management in the context of smart cities. The architecture described in [29] suggests a node architecture for measuring the filling level of trash bins and utilizes LoRa LPWAN (low-power wide-area network) technology for real-time data transmission to collect the measured data in a remote data collection center. The design of a sensor node that can detect the presence of water on home floors and provide early warning of water leaks is suggested by Teixidó et al. [30]. The paper presents and deploys both hardware and software of the network components (flood sensing nodes, actuator nodes, and a control center); communication within the sensor network relies on the IEEE 802.15.4 standard. Borrero and Zabalo [31] present a low-cost agriculture-oriented system. The suggested system is based on LoRa technology and can collect various measurements, such as humidity, ambient temperature, soil moisture, and temperature, and enables a farmer to access all of the information necessary to achieve efficient irrigation management of crops in real time. The developed wireless sensor node has been optimized both in hardware and software and exhibits very low power consumption.

3. Energy-Harvesting (EH)

One of the main concerns of the sensor platform’s design is the source of energy. Typically, the energy source is a battery attached to the sensor platform. It is utilized to provide power to all the required operations, e.g., wireless transmission, computation, memory, etc. The battery properties (e.g., technology used and size) can determine its lifespan as well as several other properties, e.g., transmission range. In many systems, the battery is a burden, as it increases the cost of the system, constrains the platform size, and most importantly, must be replaced occasionally. The challenge of saving power spans the entire protocol stack; energy considerations show up in each part of this survey. As with the other layers, PHY layer innovations have also been suggested for utilizing battery power efficiently.
An alternative approach to overcome the battery hurdle is to embed a mechanism that harvests energy. Such a mechanism can be embedded alongside the battery to extend its lifespan, or more commonly, it can completely replace the battery so that all the functions rely on it. Batteryless WSNs that rely solely on energy harvesting (EH-WSNs) can compromise performance; for example, their transmission range can be shorter, the available energy can constrain their awake time, and so on. One of the main challenges is to locate the ambient resource from which the energy can be harvested. Many studies have explored different energy sources that can supplement energy, such as solar, vibration, wind, motion, electromagnetic, and more. Numerous comprehensive technological overviews with their advantages and limitations, energy harvesting modeling, challenge expectations, and prospects can be found in, for example, Refs. [32,33,34,35,36,37]. A more recent system design review on battery-free and energy-aware WSNs, which utilize ambient energy or wireless energy transmission, is given in [38]. It addresses energy supply strategies and provides insight into energy management methods and possibilities for energy saving at the node and network levels.
Khalid et al. [39] suggest a zero-power wireless sensor architecture that consists of a capacitive sensor (a sensor that associates the parameter of interest with the change in the capacitance), an RFID chip, a circulator (which allows power flow between three defined ports), and an antenna (batteryless). The conceptual idea is that the sensor reflects the signal received from the RFID, with a change in phase, which is relative to the sensed value. Design and implementation of an energetically autonomous WSN platform for ambient monitoring in indoor environments are suggested by Abella et al. [40]. The proposed self-powered autonomous sensor node platform relies on embedded photovoltaic (PV) panels to harvest the energy, a microcontroller and an RF transceiver with an attached antenna. The suggested architecture was prototyped and validated experimentally. Lee et al. [41] propose a floating wireless device with energy harvesting capability. The floating device is energetically self-sustaining for extended operational hours. It supports long-range communication between wireless sensor nodes and a gateway relying on the LoRa technology while deployed over a water surface. The floating device can be used as an environmental monitoring station to remotely collect weather and water quality information. Ref. [42] presents the design of a wireless sensor node, powered by solar energy, that collects environmental data and can transmit it across vast distances (directly to the cloud). The architecture presented therein relies on low-power wide-area network (LPWAN) protocols that provide a long-range communication system with limited data to transmit and high energy efficiency. The authors utilize Sigfox technology in their proof-of-concept design.

4. Topology

Throughout the survey, the interaction of WSN and IoT will arise in multiple contexts. While this survey mainly deals with data gathering by means of wireless units, an IoT unit presumes a more high-level entity for localized data gathering. To assess the connection between these two concepts, the reader is advised to refer, for example, to the recent work by Devadas et al. [43], where the authors enumerate the IoT data management frameworks, challenges and issues. The chapter focuses on three layers of data management in IoT networks: communication, storage and processing. In addition, the deployment of IoT data management for smart homes and smart cities is described.
It is essential to distinguish between a one-directional WSN platform, where sensors merely gather the data and activate a specific infrastructure and set of technologies to further send it to a sink, and a bi-directional WSN platform, where the sensors are expected to be able to act according to control messages received from a sink. In the latter case, the sink might be a higher-level entity (e.g., a cloud-based server). While the general data-gathering techniques are usually agnostic of the control direction, additional constraints might be imposed. Delay of the responses, latency, bandwidth (BW) usage efficiency, security, and privacy are some of the demands to consider. Another example of a bi-directional platform can be seen in social sensor clouds (SSC), which connect a social network with a sensor network via a cloud infrastructure. See, for example, Zhu et al. [44], which presents a scenario of a smart village and provides discussion on various aspects including green planning, energy concerns, and speed of data gathering and sharing. In Dinh and Kim [45], an on-demand WSN platform is designed. The authors suggest a data-gathering protocol that addresses bandwidth consumption and delivery latency and minimizes the number of requests to save resources. An infrastructure where sensors form groups belonging to private owners constitutes a special case. This may be the case in a smart city environment, where privacy and/or security considerations should be prioritized. This is the topic addressed by Zhu et al. [46]. The authors provide a trust-assisted cloud for WSN while also keeping throughput issues in mind. Kuo et al. [47] suggest a WSN-based IoT platform that provides a reliable connection between sensors in the field and the database on the Internet. The proposed platform is based on the IEEE 802.15.4e time-slotted channel-hopping protocol with resource-constrained devices supporting heterogeneous applications. The paper suggests a scheme that compensates the clock drift for every timeslot to maintain the clock synchronization required for the time-slotted channel-hopping protocol.
Edge computing, as discussed by Satyanarayanan [48], allows distributing the data gathering burden across multiple cloudlets, which might be highly beneficial for large WSN. This platform paradigm aims to improve many important aspects: reduced latency of data delivery, increased bandwidth, scalability, resilience to possible cloud outages, and privacy control. However, the platform presumes an initial capital investment and later maintenance.
A virtual sensor network was proposed by Abdelwahab et al. [49]. Once a user-initiated sensing request is dispatched to a cloud, a suitable set of sensors is found for the task. The decision is made according to a cost function, which depends on the specific (e.g., monetary) cost of using sensors from the designated set, the benefit that can be received from using these sensors, and their effectiveness in terms of distances and delays (calculated, e.g., in number of hops from sensor to a sink/gateway), also expressed as virtual links. The cost can be customized; a general virtualization problem is formulated and an algorithm is provided.
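To make the idea of a cost-driven sensor choice concrete, the following is a minimal sketch under strong assumptions. The linear form of `net_cost`, its weights, and the greedy `select_sensors` ranking are all hypothetical and introduced here for illustration; they are not the formulation or the algorithm of Abdelwahab et al. [49], which solves a more general virtualization problem.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    sensor_id: str
    monetary_cost: float   # price of using this sensor (assumed unit)
    benefit: float         # value of its measurements to the request (assumed unit)
    hops_to_gateway: int   # proxy for distance and delay


def net_cost(c: Candidate, w_money: float = 1.0, w_hops: float = 0.5,
             w_benefit: float = 1.0) -> float:
    """Hypothetical linear cost: lower is better."""
    return w_money * c.monetary_cost + w_hops * c.hops_to_gateway - w_benefit * c.benefit


def select_sensors(candidates: List[Candidate], k: int, **weights: float) -> List[str]:
    """Greedily keep the k candidates with the lowest net cost."""
    ranked = sorted(candidates, key=lambda c: net_cost(c, **weights))
    return [c.sensor_id for c in ranked[:k]]


if __name__ == "__main__":
    pool = [
        Candidate("s1", 2.0, 5.0, 2),
        Candidate("s2", 1.0, 1.0, 1),
        Candidate("s3", 3.0, 6.0, 4),
    ]
    print(select_sensors(pool, k=2))
```

Changing the weights shifts the trade-off, for instance a larger `w_hops` favors sensors close to the gateway even when their benefit is lower.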
Integration of unmanned aerial vehicles (UAVs) and WSN for crop monitoring in precision agriculture is described by Popescu et al. [50]. The authors suggest a bottom-up scheme, where the collected data is hierarchically processed from the ground level to the cluster head (CH) level, then collected at the UAV level and finally delivered to the cloud for analysis and possible feedback. Particular emphasis is put on outlying measurements from specific sensors, as they can indicate either a possible sensor failure or an upcoming unusual event inside the agricultural field. The measured data were processed through a consensus algorithm. At the same time, the algorithm suppressed outlier values, which were left for further examination in the cloud-based analysis. In addition, this study focused on the UAV trajectory planning to collect the data observed by the WSN. An actual deployment with several tens of sensors and several CHs is provided and analyzed.
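The separation of agreed-upon readings from outliers at a cluster head can be sketched as follows. This is a simple stand-in based on the median and the median absolute deviation (MAD), chosen here for illustration; it is not the consensus algorithm used by Popescu et al. [50], and the function name and threshold are assumptions.

```python
from statistics import mean, median
from typing import Dict, Tuple


def aggregate_with_outliers(readings: Dict[str, float],
                            threshold: float = 3.0) -> Tuple[float, Dict[str, float]]:
    """Return (value agreed on by inliers, readings flagged as outliers).

    A reading is flagged when it deviates from the median by more than
    threshold * MAD. Flagged readings are kept aside, not discarded, so
    they can be forwarded for later analysis.
    """
    values = list(readings.values())
    med = median(values)
    mad = median([abs(v - med) for v in values])
    limit = threshold * mad
    outliers = {k: v for k, v in readings.items() if abs(v - med) > limit}
    inliers = [v for k, v in readings.items() if k not in outliers]
    return mean(inliers), outliers


if __name__ == "__main__":
    soil_temp = {"n1": 20.1, "n2": 19.9, "n3": 20.0, "n4": 35.0, "n5": 20.2}
    value, flagged = aggregate_with_outliers(soil_temp)
    print(value, flagged)
```

Keeping the flagged readings alongside the aggregated value mirrors the idea that an outlier may signal either a faulty sensor or a genuine unusual event, so the decision is deferred to a later stage.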
An implementation of a ubiquitous consumer data service for transmitting short messages to any computing platform is provided by Datta et al. [51]. The authors demonstrate a data cycle model that allows any device with sensor(s) to report data encoded in short messages. The raw data reaches a central or distributed computing platform, where it undergoes transformation and evolves into rich and structured valuable information for higher-layer applications. The proposed data cycle model and DataTweet architecture are aimed at smart city and large-scale crowd-sensing-based IoT scenarios.

5. Application-Oriented Network Architecture

We continue by covering special types of WSN platforms for data gathering and specialized application-driven architecture types. Ayele et al. [52] suggest an IoT network architecture for wildlife monitoring systems (WMS) for scenarios in which animals exhibit sparse mobility, which results in sporadic wireless links. In addition, they suggest a data forwarding enhancement that adopts the flood-store-carry-and-forward paradigm suggested in the seminal ZebraNet study by Juang et al. [53], in which, in order to send data to the sink, the nodes disseminate it among themselves until it reaches the sink. Specifically, each node stores the data needing to be conveyed, waits for connectivity with other nodes, and distributes the data to them, and they repeat the same process. Accordingly, the data is spread throughout the entire network (i.e., flooding) and will eventually be received by the sink. The authors in [52] suggest leveraging locally available routing parameters to improve opportunistic data forwarding algorithms by managing the data replication decision.
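The flood-store-carry-and-forward behavior described above can be illustrated with a small simulation. The contact trace, node names, and the `simulate_flooding` function are hypothetical; the sketch replicates every stored message at every contact, i.e., plain flooding without the replication management proposed in [52].

```python
from typing import Dict, List, Optional, Set, Tuple


def simulate_flooding(contacts: List[Tuple[str, str]], source: str, sink: str,
                      message: str = "m1") -> Optional[int]:
    """Return the 1-based contact index at which the sink first holds the message.

    Each node stores what it has received; whenever two nodes meet, they
    exchange everything in their buffers. Returns None if the message
    never reaches the sink within the given contact trace.
    """
    buffers: Dict[str, Set[str]] = {source: {message}}
    for step, (a, b) in enumerate(contacts, start=1):
        merged = buffers.get(a, set()) | buffers.get(b, set())
        buffers[a] = set(merged)  # store
        buffers[b] = set(merged)  # forward (replicate) on contact
        if message in buffers.get(sink, set()):
            return step
    return None


if __name__ == "__main__":
    trace = [("z1", "z2"), ("z3", "sink"), ("z2", "z3"), ("z3", "sink")]
    print(simulate_flooding(trace, source="z1", sink="sink"))
```

In this trace the second contact between `z3` and `sink` happens before `z3` has received anything, so delivery has to wait for a later encounter, which is the delay that carrying data across sporadic links introduces.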
Saleh et al. [54] suggest extending the lifetime of a wireless sensor network used in mobile healthcare applications by increasing the number of bits transmitted per symbol, and specifically by relying on a quaternary interconnect scheme in which each transmitted symbol modulates two bits. A complementary neural network, static RAM-based architecture is suggested to reduce energy consumption in storage and transmissions during the data dissemination process. A WSN dedicated to home deployment for elderly healthcare and early health emergency alarm is discussed by Alsina-Pagès et al. [55]. The authors first raise privacy concerns related to the monitoring, and accordingly, advocate that only sound-based surveillance aimed merely at indicating alarming situations is appropriate. In order to further conform to the privacy demands, they focus on a distributed architecture (rather than on a centralized one), where each of the WSN sensors sends encrypted identifiers of its measurements. The identification of events is built on feature extraction. This is done in the frequency domain by first dividing the incoming signal into blocks with a Hamming sliding window, then transforming each block into the frequency domain using the Discrete Fourier Transform (DFT) to evaluate the contribution of every band of the spectrum. The final coefficients are obtained after a Discrete Cosine Transform (DCT). The concluding part of the proposed algorithm feeds the coefficients into a support vector machine, which classifies the estimated audio event. The authors assert that the classification results could be further improved by incorporating a deep artificial neural network (ANN) into their system.
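The feature-extraction chain described for [55] (Hamming-windowed blocks, DFT, per-band contribution, DCT) can be sketched in plain Python as below. The frame length, hop size, number of bands, the equal-width band layout, and the logarithm applied before the DCT are assumptions made for this illustration and are not taken from the paper; the direct DFT is written for clarity, not speed.

```python
import cmath
import math
from typing import List


def hamming(n: int) -> List[float]:
    """Hamming window of length n."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def dft_power(frame: List[float]) -> List[float]:
    """Power spectrum of a real frame for bins 0..n/2 (direct O(n^2) DFT)."""
    n = len(frame)
    power = []
    for k in range(n // 2 + 1):
        s = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power.append(abs(s) ** 2)
    return power


def band_energies(power: List[float], n_bands: int) -> List[float]:
    """Sum power over equal-width bands (assumed layout; leftover top bins are dropped)."""
    size = len(power) // n_bands
    return [sum(power[b * size:(b + 1) * size]) for b in range(n_bands)]


def dct2(x: List[float]) -> List[float]:
    """Unnormalized DCT-II."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi * k * (i + 0.5) / n) for i in range(n))
            for k in range(n)]


def frame_features(signal: List[float], frame_len: int = 64, hop: int = 32,
                   n_bands: int = 8) -> List[List[float]]:
    """One coefficient vector per sliding-window block."""
    window = hamming(frame_len)
    features = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        energies = band_energies(dft_power(frame), n_bands)
        log_e = [math.log(e + 1e-12) for e in energies]  # log step is an assumption
        features.append(dct2(log_e))
    return features


if __name__ == "__main__":
    tone = [math.sin(2 * math.pi * 8 * t / 64) for t in range(256)]
    feats = frame_features(tone)
    print(len(feats), len(feats[0]))
```

Each resulting vector would then serve as the input to a classifier such as the support vector machine mentioned above; the classifier itself is outside the scope of this sketch.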
In Abeßer et al. [56], a similar method was implemented for urban noise monitoring. Namely, while the short-time Fourier transform (STFT) was utilized for the noise preprocessing, the classification of noise levels and events was performed by convolutional neural networks (CNNs). The authors used several previously published networks; see references therein. Similar methods for noise monitoring WSN were introduced by Siamwala et al. [57]. A frequency-domain analysis was performed, followed by classification with statistical methods (a Gaussian mixture model was used). In addition, the authors in [57] provide an elaborate WSN architecture, where energy-harvesting solar panels augment the sensors’ lifetime and the sensors’ state-of-charge is transmitted and tracked by central, more powerful nodes.