Khan, M.N.U.; Cao, W.; Tang, Z.; Ullah, A.; Pan, W. Latest Existing Schemes on Data Aggregation. Encyclopedia. Available online: (accessed on 18 April 2024).
Khan, Muhammad Nafees Ulfat, Weiping Cao, Zhiling Tang, Ata Ullah, Wanghua Pan. "Latest Existing Schemes on Data Aggregation" Encyclopedia, (accessed April 18, 2024).
Khan, M.N.U., Cao, W., Tang, Z., Ullah, A., & Pan, W. (2024, March 12). Latest Existing Schemes on Data Aggregation. In Encyclopedia.
Latest Existing Schemes on Data Aggregation

The rapid development of the Internet of Things (IoT) has opened the way for transformative advances in numerous fields, including healthcare. IoT-based healthcare systems provide unprecedented opportunities to gather patients’ real-time data and make appropriate decisions at the right time. 

Keywords: healthcare; duplicated data; aggregation; cluster head; Internet of Things

1. Introduction

The Internet of Things (IoT) is an innovative technology [1] that brings ease to our lives [2] and reshapes the industrial sector [3]. It comprises innumerable sensors that sense quantities such as temperature and humidity, detect events such as fire, and much more. Transmitting the collected data to a central server in real time makes it possible to examine the data and act immediately on current circumstances. A variety of IoT-enabled wearable devices can be embedded in clothing, implanted in the body, or attached to a patient’s skin. These devices are used to examine patients’ health conditions remotely [4]. They frequently sample patients’ vital health-related data and perform lightweight computations before transmitting the data to a central entity [5]. Doctors, nursing staff, and other authorized medical professionals can check the data at any time to analyze it and make suitable decisions at the right time [6].
Data aggregation is a mechanism in which data are collected, combined, and summarized at an aggregator node (AN) and then forwarded onward [7]. In IoT-enabled wireless sensor networks (WSNs), the environment can be homogeneous or heterogeneous. In a homogeneous network, all nodes are of the same type and generate the same type of data, which makes the data comparatively easy to handle. A heterogeneous environment is more complex because the nodes differ and produce data in different formats [8]. The choice of data aggregation approach depends on network size. Centralized data aggregation (CDA) and in-network data aggregation (IDA) are the most commonly used techniques. CDA is typically used for small networks, in which a single aggregator gathers data from all nodes of the network. IDA addresses the drawbacks of CDA by distributing multiple ANs across the network, which collect data from nodes and send it to the Base Station (BS) [9].
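The contrast between the two approaches can be illustrated with a minimal Python sketch. It assumes that an aggregator forwards a simple summary (count, sum, minimum, maximum) rather than raw readings; the `Summary` class, the function names, and the sample readings are hypothetical and are not taken from the cited schemes.

```python
from dataclasses import dataclass


@dataclass
class Summary:
    """Compact summary forwarded instead of raw readings (assumed format)."""
    count: int
    total: float
    minimum: float
    maximum: float

    @property
    def mean(self) -> float:
        return self.total / self.count


def summarize(readings):
    """Combine raw readings into one summary, as an aggregator node would."""
    return Summary(len(readings), sum(readings), min(readings), max(readings))


def merge(summaries):
    """Merge partial summaries, as the base station would in the IDA case."""
    return Summary(
        count=sum(s.count for s in summaries),
        total=sum(s.total for s in summaries),
        minimum=min(s.minimum for s in summaries),
        maximum=max(s.maximum for s in summaries),
    )


# Hypothetical temperature readings from two groups of sensor nodes.
group_a = [36.5, 36.7, 36.6]
group_b = [37.0, 37.2]

# CDA: a single aggregator receives every raw reading.
cda = summarize(group_a + group_b)

# IDA: each aggregator node summarizes its own group and the BS merges them.
ida = merge([summarize(group_a), summarize(group_b)])
```

Both paths yield the same statistics here; in the IDA case only two small summaries travel to the BS instead of five raw readings.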
De-duplication is a mechanism that identifies and removes duplicate copies of data, and it is a key approach for coping efficiently with growing data volumes [10]. It follows two main strategies. Inline de-duplication runs in real time: data are checked for duplicates before being written to the storage system. Background de-duplication runs as a post-process after the data have been written to the storage system, periodically scanning the stored data to detect and remove duplicates [11]. In the healthcare sector, the vast collection of medical records, diagnostic reports, and patient histories can lead to data redundancy, which consumes storage space and increases energy consumption. Network lifetime falls as energy consumption rises, and unnecessarily repeated data wastes the resources of sensor devices [12].
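The two de-duplication strategies can be sketched as follows. This is a simplified illustration that assumes exact duplicates are recognized by a SHA-256 content hash; the class and function names and the sample records are hypothetical.

```python
import hashlib


def fingerprint(record: bytes) -> str:
    """Content hash used to recognize identical copies (assumed: SHA-256)."""
    return hashlib.sha256(record).hexdigest()


class InlineStore:
    """Inline de-duplication: each record is checked before it is written."""

    def __init__(self):
        self._seen = set()
        self.records = []

    def write(self, record: bytes) -> bool:
        digest = fingerprint(record)
        if digest in self._seen:
            return False  # duplicate: never reaches storage
        self._seen.add(digest)
        self.records.append(record)
        return True


def background_dedup(records):
    """Background de-duplication: scan stored data and keep first copies only."""
    seen = set()
    unique = []
    for record in records:
        digest = fingerprint(record)
        if digest not in seen:
            seen.add(digest)
            unique.append(record)
    return unique


# Hypothetical heart-rate records containing repeats.
readings = [b"hr=72", b"hr=72", b"hr=75", b"hr=72"]

store = InlineStore()
accepted = [store.write(r) for r in readings]  # inline path

cleaned = background_dedup(readings)           # post-process path
```

Both paths end with the same unique records; the difference is whether the duplicates are ever written to storage first.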
A new technique is needed in healthcare because medical monitoring systems require continuous improvement and innovation. Existing healthcare schemes suffer from ineffective data aggregation, inadequate handling of redundant data, high energy consumption, and high storage costs in heterogeneous environments [13]. In the healthcare sector, sensor devices collect detailed data and transmit it frequently to the BS. The resulting unnecessary replication of data drains the limited resources of sensor nodes [14]. These resources need to be used wisely, especially in healthcare, where precise information is required for better treatment. The proposed technique is developed to address these challenges comprehensively.

2. Clustering-Based Data Aggregation Schemes

In clustering-based data aggregation techniques, sensor nodes are grouped into clusters. Each cluster has a Cluster Head (CH), which is selected according to certain conditions. The CH collects data from the Cluster Members (CMs) and transmits it to the base station. For clustering, the K-means algorithm is employed, and the number of clusters is determined with the elbow method, based on the sum of squared errors given in Equation (1). To enhance security and address the cost issue of aggregated healthcare data, clusters are further divided into sub-clusters.
$$\mathrm{SSE} = \sum_{k=1}^{K} \sum_{x_i \in s_k} \lVert x_i - c_k \rVert^{2} \qquad (1)$$
SSE represents the sum of squared errors, x_i ranges over the sensors in cluster s_k, and c_k is the centroid of the kth cluster. Data are aggregated at the sensor level by considering only extreme points, so duplicated data is not transmitted to the CH. The Extrema Point (EP) mechanism is not very flexible, however, as it may not apply to all kinds of data [13][14]. The CH uses a Bayesian-fusion algorithm to calculate trust scores for the CMs and transmits the aggregated data to the BS. In the same context, an anonymity-based clustering method has been used to enhance the security of healthcare data. Using a client-server model, the data is anonymized before it is transmitted to the aggregator node [15].
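The sum of squared errors used by the elbow method can be computed as in the sketch below. It assumes that sensors are described by 2-D coordinates and that the cluster assignments are already known; the helper names and the sample positions are hypothetical, and no full K-means loop is shown.

```python
def centroid(points):
    """Mean of a list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def sse(clusters):
    """Sum of squared Euclidean distances from each point to its cluster centroid."""
    total = 0.0
    for points in clusters:
        c = centroid(points)
        for p in points:
            total += sum((pi - ci) ** 2 for pi, ci in zip(p, c))
    return total


# Hypothetical sensor positions forming two well-separated groups.
sensors = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]

sse_k1 = sse([sensors])                   # all sensors in one cluster
sse_k2 = sse([sensors[:2], sensors[2:]])  # sensors split into two clusters
```

The elbow method plots SSE against the number of clusters and selects the value after which further clusters bring only a small reduction; in this toy example the SSE falls sharply when moving from one cluster to two.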
Ahmed et al. [16] proposed a scheme in which IoT devices gather data and transmit it to the BS using a fuzzy matrix. The collected data are then sent to the edge server. The cloud server validates the edge server, and blockchain technology is used to prevent malicious attacks. For security, sensor nodes transmit encrypted data to Aggregator Nodes (ANs). To reduce energy consumption and communication and storage costs, the ANs then transmit compressed data to fog servers [17]. Ananth et al. [18] performed clustering with the glow swarm optimization method. The proposed scheme has lower latency and is suitable for medical applications, but multi-layer clustering may increase complexity [19]. Basha et al. proposed a technique to enhance efficiency and security in WSNs while addressing energy consumption. Conditional Tug of War Optimization is used to calculate node trust. Energy optimization was achieved through cluster-based data aggregation, yet the method considered only node energy levels and neglected major factors such as node distance and degree [20]. Abid et al. employed multi-clustering in which the aggregator node’s energy level is checked periodically. If it is greater than the threshold, the Candidate Flag bit is set to 1. If the energy is low, the aggregator node is replaced with the nearest node. The scheme has a better load-balancing strategy, but if the nearest node chosen as aggregator also has low energy, the whole network is affected [21].
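The periodic aggregator check described for the multi-clustering scheme can be sketched as follows. This is only an illustration of the rule as summarized above, not the authors' implementation; the `Node` fields, the energy units, and the threshold value are assumptions.

```python
import math
from dataclasses import dataclass


@dataclass
class Node:
    node_id: str
    x: float
    y: float
    energy: float            # residual energy in arbitrary units (assumed)
    candidate_flag: int = 0


def periodic_check(aggregator, members, threshold):
    """Return the node that acts as aggregator after one periodic energy check."""
    if aggregator.energy > threshold:
        aggregator.candidate_flag = 1  # enough energy: keep the current aggregator
        return aggregator
    aggregator.candidate_flag = 0
    # Low energy: hand over to the nearest member. Energy of the replacement
    # is not examined here, which mirrors the weakness noted in the text.
    return min(
        members,
        key=lambda n: math.hypot(n.x - aggregator.x, n.y - aggregator.y),
    )


members = [Node("n1", 1.0, 0.0, 0.1), Node("n2", 5.0, 5.0, 0.9)]

weak = Node("AN", 0.0, 0.0, 0.2)
chosen = periodic_check(weak, members, threshold=0.5)

strong = Node("AN2", 0.0, 0.0, 0.8)
kept = periodic_check(strong, members, threshold=0.5)
```

In this example the nearest member `n1` takes over even though its own energy is below the threshold, which is the situation the text identifies as a risk for the whole network.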
To reduce energy consumption in the healthcare sector, a fuzzy-based data aggregation scheme has been introduced. Taking the heterogeneous environment into account, an appropriate parent node is selected for each node; once the data is aggregated, it is checked for duplicate values. Any duplicates are replaced with the Boolean digit 0 at Level 2. This significantly reduces data size, saves storage space, and increases the aggregation factor [22]. Randhawa et al. [23] employed K-means clustering and fuzzy logic for aggregation. The aggregation rate, energy utilization, and data persistence are taken as fuzzy inputs, and network lifetime is obtained as the output. The scheme reduces the duplicate ratio. To monitor patient data in the healthcare system, Yang et al. [24] presented a centralized approach that reduces energy consumption and enhances efficiency. The CH is selected by the BS, and energy is preserved by switching idle sensors to a sleep state. The scheme has a better network lifetime but a high storage cost, and additional parameters should be considered for CH selection [24]. Dwivedi et al. [25] introduced an energy improvement scheme for homogeneous WSNs. To select an appropriate CH, a rank is calculated for each node; a higher rank increases the likelihood of a node being selected as CH. The CMs select clusters using a fuzzy system. Once clustering is finalized, the CMs transmit data to the CH, which aggregates it and forwards it to the BS. The proposed scheme achieves better energy utilization and reduces the chance of hotspot issues. Additional parameters should be considered for the intelligent selection of the CH to reduce latency [25].
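The idea of replacing duplicate values with the Boolean digit 0 can be illustrated with the sketch below. It assumes that a "duplicate" is a reading equal to the last value actually forwarded and that genuine readings are never 0; both are simplifying assumptions, and the function names are hypothetical rather than part of the scheme in [22].

```python
def compress_duplicates(readings):
    """Replace each reading that repeats the last forwarded value with 0."""
    compressed = []
    last = None
    for value in readings:
        if value == last:
            compressed.append(0)   # duplicate: send a single Boolean digit
        else:
            compressed.append(value)
            last = value
    return compressed


def restore(compressed):
    """Rebuild the original sequence (assumes genuine readings are never 0)."""
    restored = []
    last = None
    for value in compressed:
        if value == 0:
            restored.append(last)
        else:
            restored.append(value)
            last = value
    return restored


# Hypothetical body-temperature readings with repeated values.
readings = [98.6, 98.6, 98.6, 99.1, 99.1, 98.6]
packet = compress_duplicates(readings)
```

Half of the values in this packet shrink to a single digit, and `restore(packet)` recovers the original sequence at the receiver.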
To improve energy utilization and reduce congestion, Mohseni et al. presented a cluster-based strategy with two crucial stages. The first stage organizes the sensors into clusters. For transmitting data from sensors to the CH, and from the CH to the BS, the shortest path is selected with the Capuchin Search algorithm, which helps reduce energy consumption. The scheme is simulated in MATLAB and shows a better network lifetime, lower delay, and a higher packet delivery rate. However, the Capuchin Search algorithm is not very efficient for large networks [26]. To enhance the network lifetime of sensors in the medical field, multi-hop routing has been examined. The scheme has a better lifetime, but too few parameters are considered for hop selection and the security aspect is neglected [27][28]. To transmit healthcare data securely, active sensors are selected by the Archimedes algorithm and the shortest path is selected by an attribute-centered binary scheme [29]. Similarly, clustering and data aggregation have been performed for underwater WSNs [30][31]. In a fog-based model, the first layer contains medical sensors that capture the patient’s condition, and the data is transferred to a fog server. The fog server prioritizes the data based on the health state and the intensity of disease; this task is performed by fog clusters. If the computing cost exceeds the available resources, the data is offloaded to the cloud. Finally, reports are created so that medical staff can take better steps in treatment [32].
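The prioritization and offloading step of the fog-based model can be sketched as follows. The severity scores, compute costs, capacity figure, and function name are hypothetical; the sketch only illustrates the rule that higher-priority data is served first and that work exceeding the fog resources is sent to the cloud.

```python
import heapq


def schedule(tasks, fog_capacity):
    """Split tasks between fog and cloud.

    tasks: list of (severity, cost, name); higher severity is served first.
    Returns (names handled at the fog, names offloaded to the cloud).
    """
    # heapq is a min-heap, so severity is negated; the index keeps ties stable.
    heap = [(-severity, index, cost, name)
            for index, (severity, cost, name) in enumerate(tasks)]
    heapq.heapify(heap)

    fog, cloud = [], []
    remaining = fog_capacity
    while heap:
        _, _, cost, name = heapq.heappop(heap)
        if cost <= remaining:
            fog.append(name)
            remaining -= cost
        else:
            cloud.append(name)  # computing cost exceeds available fog resources
    return fog, cloud


# Hypothetical workload: (severity, compute cost, label).
tasks = [(1, 4, "routine"), (3, 5, "critical"), (2, 3, "elevated")]
fog, cloud = schedule(tasks, fog_capacity=8)
```

With a capacity of 8 units, the critical and elevated cases are processed at the fog layer and the routine case is offloaded.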

3. Tree-Based Data Aggregation Schemes

In this category, child sensor nodes transmit data to parent nodes, and these parent nodes send the aggregated data onward, level by level, to the base station, creating a hierarchy. To aggregate data efficiently, a tree-based structure is employed. For fuzzification, min-max normalization is performed. A node with a smaller sum of weights and a direct connection is elected as the parent. The scheme accounts for the complexity of the heterogeneous environment, but no procedure is used to control energy utilization as the number of attributes increases [33].
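Min-max normalization and the parent election rule can be sketched as below. The attributes used here (`distance` and `load`, where lower is better) and the candidate data are hypothetical; the scheme in [33] may weight different attributes.

```python
def min_max(values):
    """Scale values to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def elect_parent(candidates, attributes):
    """Pick the directly connected candidate with the smallest sum of weights."""
    normalized = {a: min_max([c[a] for c in candidates]) for a in attributes}
    best_id, best_score = None, None
    for i, candidate in enumerate(candidates):
        if not candidate["direct"]:
            continue  # only nodes with a direct connection are eligible
        score = sum(normalized[a][i] for a in attributes)
        if best_score is None or score < best_score:
            best_id, best_score = candidate["id"], score
    return best_id


# Hypothetical candidate parents for one child node.
candidates = [
    {"id": "A", "distance": 10.0, "load": 5.0, "direct": True},
    {"id": "B", "distance": 20.0, "load": 1.0, "direct": True},
    {"id": "C", "distance": 5.0, "load": 0.0, "direct": False},
]
parent = elect_parent(candidates, ["distance", "load"])
```

Node C has the lowest weights but no direct connection, so the election falls to B, whose normalized weights sum to less than those of A.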
Wang et al. [34] introduced a scheme in which sensor nodes are arranged in a grid-based structure and the BS is the root node to which aggregated data is transmitted. Cell heads are added to the tree as children, using minimum energy, until all cell heads have been added. The energy consumed by a cell head is given by Equations (2) and (3).
$$C_{ij}(k) = 2E_{elec}\,k + E_{amp}\,k\,d_{ij}^{2} \qquad (2)$$
$$C_{iB}(k) = E_{elec}\,k + E_{amp}\,k\,d_{iB}^{2} \qquad (3)$$
C_ij is the energy consumed in sending data packets from the i-th cell to the j-th cell of the tree, d_ij is the distance between the two cells, and C_iB is the energy used in transmitting data packets from the i-th cell to the base station. The simulation showed that nodes have a longer lifetime. The shortcoming of the scheme is that a large number of child nodes creates a deeper tree, which increases the energy consumption of the parent node.
To overcome the drawbacks of the previous scheme, the scheme in [35] passes through the phases of grid construction, tree construction, and data transmission. The whole network is divided into a grid of M × N area. Every sensor has a Global Positioning System (GPS) receiver to learn its location and calculate the coordinates of the grid cell in which it lies. The number of child nodes is limited to avoid uneven energy consumption and hotspot problems. The scheme has better data aggregation and manages network load efficiently, but no fault tolerance policy is provided for node failure [35].
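Equations (2) and (3) can be evaluated directly, as in the sketch below. It assumes that k is the packet size in bits and uses illustrative radio constants (50 nJ/bit for the electronics and 100 pJ/bit/m² for the amplifier); these values and the function names are assumptions and are not stated in the source.

```python
E_ELEC = 50e-9    # J per bit, electronics energy (assumed illustrative value)
E_AMP = 100e-12   # J per bit per m^2, amplifier energy (assumed illustrative value)


def cost_cell_to_cell(k_bits, d_ij):
    """Equation (2): energy for sending k bits from cell i to cell j."""
    return 2 * E_ELEC * k_bits + E_AMP * k_bits * d_ij ** 2


def cost_cell_to_bs(k_bits, d_ib):
    """Equation (3): energy for sending k bits from cell i to the base station."""
    return E_ELEC * k_bits + E_AMP * k_bits * d_ib ** 2


packet_bits = 2000  # hypothetical packet size
to_neighbor = cost_cell_to_cell(packet_bits, 50.0)
to_bs = cost_cell_to_bs(packet_bits, 50.0)
```

Because the distance enters squared, the amplifier term dominates for long links, which is why tree construction favors short hops between neighboring cells.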
In another scheme, the whole network area is divided into grids, and an aggregating node is elected using fuzzy logic. The inputs are the CH distance and link quality metrics such as Neighboring Overlap and Algebraic Connectivity; the output indicates whether a node is selected as the aggregating node. For relocation, the Fruit Fly Optimization Algorithm (FOA) is used, which involves the relocation conditions and the path along which the sink is moved. The scheme has a lower packet loss ratio, but FOA is not very efficient for large networks [36]. A three-layered scheme comprises smart meters (SMs), fog nodes, and cloud servers. During data transmission, SMs reduce the data size and send the data to local fog nodes. The fog nodes gather this data after checking its integrity and then transmit it to the last layer, the cloud. The cloud extracts the data and computes its hash to check whether it is original. If the data has not been altered, it is saved; otherwise, it is removed [37].
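The size reduction and hash check in the three-layered scheme can be sketched as follows. The use of zlib compression and a SHA-256 digest is an assumption made for illustration; the scheme in [37] may use different primitives, and the function names and sample reading are hypothetical.

```python
import hashlib
import zlib


def pack(reading: bytes):
    """Smart-meter side: reduce the data size and attach a digest of the original."""
    return zlib.compress(reading), hashlib.sha256(reading).hexdigest()


def accept(payload: bytes, digest: str):
    """Cloud side: extract the data, recompute the hash, keep the data only if it matches."""
    try:
        data = zlib.decompress(payload)
    except zlib.error:
        return None  # payload is corrupted and cannot be extracted
    if hashlib.sha256(data).hexdigest() != digest:
        return None  # altered data is removed
    return data      # unaltered data is saved


payload, digest = pack(b"meter-17:4.2kWh")
stored = accept(payload, digest)

# A tampered payload carrying the original digest is rejected.
rejected = accept(zlib.compress(b"meter-17:9.9kWh"), digest)
```

A plain hash detects alteration only if the digest itself is delivered over a trusted path; a keyed construction would be needed otherwise.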
After a detailed analysis of the existing schemes, the major gaps identified are as follows. Various schemes focus primarily on energy optimization, since it is crucial for prolonging network lifetime and reducing overall energy consumption. Even so, energy consumption has not decreased as much as it could have. Many techniques aggregate data efficiently using various strategies, but limited work has addressed redundant data specifically in the healthcare sector. More research is needed to improve the handling of redundancy, particularly in the healthcare domain.
Compared with existing data aggregation techniques, the proposed Energy-Efficient Fuzzy Data Aggregation System (EE-FDAS) shows distinct strengths, particularly in addressing redundancy and energy consumption in a healthcare environment. Using Boolean digits for normal-range data reduces the data packet size and thus the overall data size, enabling efficient transmission with lower energy utilization than [35], which exhibits higher energy consumption. The smaller packet size of data readings also contributes to lower latency, in contrast to the higher latency observed in [25]. Additionally, EE-FDAS reduces system complexity relative to [33], enhancing scalability in large systems, and incurs lower communication and storage costs than [17]. In summary, EE-FDAS is a comprehensive and efficient solution that addresses key issues in healthcare data aggregation and offers notable advantages over existing techniques for healthcare applications.


  1. Tu, Y.; Chen, H.; Yan, L.; Zhou, X. Task Offloading Based on LSTM Prediction and Deep Reinforcement Learning for Efficient Edge Computing in IoT. Future Internet 2022, 14, 30.
  2. Stoyanova, M.; Nikoloudakis, Y.; Panagiotakis, S.; Pallis, E.; Markakis, E.K. A Survey on the Internet of Things (IoT) Forensics: Challenges, Approaches, and Open Issues. IEEE Commun. Surv. Tutor. 2020, 22, 1191–1221.
  3. Bin Zikria, Y.; Afzal, M.K.; Kim, S.W.; Marin, A.; Guizani, M. Deep learning for intelligent IoT: Opportunities, challenges and solutions. Comput. Commun. 2020, 164, 50–53.
  4. Aouedi, O.; Sacco, A.; Piamrat, K.; Marchetto, G. Handling Privacy-Sensitive Medical Data with Federated Learning: Challenges and Future Directions. IEEE J. Biomed. Health Inform. 2023, 27, 790–803.
  5. Arora, S. IoMT (Internet of Medical Things): Reducing Cost While Improving Patient Care. IEEE Pulse 2020, 11, 24–27.
  6. Saeedi, I.D.I.; Al-Qurabat, A.K.M. A Systematic Review of Data Aggregation Techniques in Wireless Sensor Networks. J. Phys. Conf. Ser. 2021, 1818, 012194.
  7. Zeb, A.; Islam, A.K.M.M.; Zareei, M.; Al Mamoon, I.; Mansoor, N.; Baharun, S.; Katayama, Y.; Komaki, S. Clustering Analysis in Wireless Sensor Networks: The Ambit of Performance Metrics and Schemes Taxonomy. Int. J. Distrib. Sens. Netw. 2016, 12, 4979142.
  8. Rani, A.; Kumar, S. A survey of security in wireless sensor networks. In Proceedings of the 2017 3rd International Conference on Computational Intelligence & Communication Technology (CICT), Ghaziabad, India, 9–10 February 2017; pp. 1–5.
  9. Zhang, J.; Yin, H.; Wang, J.; Luan, S.; Liu, C. Severe Major Depression Disorders Detection Using AdaBoost-Collaborative Representation Classification Method. In Proceedings of the 2018 International Conference on Sensing, Diagnostics, Prognostics, and Control (SDPC), Xi’an, China, 15–17 August 2018; pp. 584–588.
  10. Baligodugula, V.V.; Amsaad, F.; Tadepalli, V.V.; Radhika, V.; Sanjana, Y.; Shiva, S.; Meduri, S.; Maabreh, M.; Alsaadi, N.; Tashtoush, Y.; et al. A Comparative Study of Secure and Efficient Data Duplication Mechanisms for Cloud-Based IoT Applications. In Proceedings of the 2023 International Conference on Advances in Computing Research (ACR’23), Orlando, FL, USA, 8–10 May 2023; pp. 569–586.
  11. Pragash, K.; Jayabharathy, J. A survey on DE–Duplication schemes in cloud servers for secured data analysis in various applications. Measurement: Sensors 2022, 24, 100463.
  12. Aher, C.N. Trust Calculation for Improving Reliability of Routing and Data Aggregation in WSN. Int. J. Electron. Eng. 2019, 11, 386–392.
  13. Yousefpoor, M.S.; Yousefpoor, E.; Barati, H.; Barati, A.; Movaghar, A.; Hosseinzadeh, M. Secure data aggregation methods and countermeasures against various attacks in wireless sensor networks: A comprehensive review. J. Netw. Comput. Appl. 2021, 190, 103118.
  14. Kadiravan, G.; Sujatha, P.; Asvany, T.; Punithavathi, R.; Elhoseny, M.; Pustokhina, I.V.; Pustokhin, D.A.; Shankar, K. Metaheuristic Clustering Protocol for Healthcare Data Collection in Mobile Wireless Multimedia Sensor Networks. Comput. Mater. Contin. 2021, 66, 3215–3231.
  15. Onesimu, J.A.; Karthikeyan, J.; Sei, Y. An efficient clustering-based anonymization scheme for privacy-preserving data collection in IoT based healthcare services. Peer-to-Peer Netw. Appl. 2021, 14, 1629–1649.
  16. Ahmed, A.; Abdullah, S.; Bukhsh, M.; Ahmad, I.; Mushtaq, Z. An Energy-Efficient Data Aggregation Mechanism for IoT Secured by Blockchain. IEEE Access 2022, 10, 11404–11419.
  17. Ullah, A.; Said, G.; Sher, M.; Ning, H. Fog-assisted secure healthcare data aggregation scheme in IoT-enabled WSN. Peer-to-Peer Netw. Appl. 2020, 13, 163–174.
  18. Ny, S.R.; Ananth, A.G.; Reddy, L.S. Optimal Cluster-Based Data Aggregation in WSN for Healthcare Application. Adv. Dyn. Syst. Appl. 2021, 16, 683–701.
  19. Ranjani, N.Y.S.; Ananth, A.; Reddy, L.S. A Firebug Optimal Cluster based Data Aggregation for Healthcare Application. IOP Conf. Ser. Earth Environ. Sci. 2022, 1057, 012006.
  20. Basha, A.R. Energy efficient aggregation technique-based realisable secure aware routing protocol for wireless sensor network. IET Wirel. Sens. Syst. 2020, 10, 166–174.
  21. Abid, B.; Nguyen, T.T.; Seba, H. New data aggregation approach for time-constrained wireless sensor networks. J. Supercomput. 2015, 71, 1678–1693.
  22. Khan, M.N.U.; Tang, Z.; Cao, W.; Abid, Y.A.; Pan, W.; Ullah, A. Fuzzy based Efficient Healthcare Data Collection and Analysis Mechanism using Edge Nodes in IoMT. Sensors 2023, 23, 7799.
  23. Randhawa, S. Sukhchandan Jain, Data Aggregation in Wireless Sensor Networks; Springer: Singapore, 2020.
  24. Yang, G.; Jan, M.A.; Menon, V.G.; Shynu, P.G.; Aimal, M.M.; Alshehri, M.D. A Centralized Cluster-Based Hierarchical Approach for Green Communication in a Smart Healthcare System. IEEE Access 2020, 8, 101464–101475.
  25. Dwivedi, A.K.; Sharma, A.K. EE-LEACH: Energy Enhancement in LEACH using Fuzzy Logic for Homogeneous WSN. Wirel. Pers. Commun. 2021, 120, 3035–3055.
  26. Mohseni, M.; Amirghafouri, F.; Pourghebleh, B. CEDAR: A cluster-based energy-aware data aggregation routing protocol in the internet of things using capuchin search algorithm and fuzzy logic. Peer-to-Peer Netw. Appl. 2023, 16, 189–209.
  27. Sert, S.A.; Alchihabi, A.; Yazici, A. A Two-Tier Distributed Fuzzy Logic Based Protocol for Efficient Data Aggregation in Multihop Wireless Sensor Networks. IEEE Trans. Fuzzy Syst. 2018, 26, 3615–3629.
  28. Chavva, S.R.; Sangam, R.S. An energy-efficient multi-hop routing protocol for health monitoring in wireless body area networks. Netw. Model. Anal. Health Inform. Bioinform. 2019, 8, 21.
  29. Singh, S.; Kumar, D. Energy-efficient secure data fusion scheme for IoT based healthcare system. Futur. Gener. Comput. Syst. 2023, 143, 15–29.
  30. Joshi, S.; Anithaashri, T.; Rastogi, R.; Choudhary, G.; Dragoni, N. IEDA-HGEO: Improved Energy Efficient with Clustering-Based Data Aggregation and Transmission Protocol for Underwater Wireless Sensor Networks. Energies 2022, 16, 353.
  31. Omeke, K.G.; Mollel, M.; Shah, S.T.; Arshad, K.; Zhang, L.; Abbasi, Q.H.; Imran, M.A. Dynamic Clustering and Data Aggregation for the Internet-of-Underwater-Things Networks. In Proceedings of the 2022 14th International Conference on Computational Intelligence and Communication Networks (CICN), Al-Khobar, Saudi Arabia, 4–6 December 2022; pp. 322–328.
  32. Benila, S.; Usha Bhanu, N. Fog Managed Data Model for IoT based Healthcare Systems. J. Internet Technol. 2022, 23, 217–226.
  33. Bhushan, S.; Kumar, M.; Kumar, P.; Stephan, T.; Shankar, A.; Liu, P. FAJIT: A fuzzy-based data aggregation technique for energy efficiency in wireless sensor network. Complex Intell. Syst. 2021, 7, 997–1007.
  34. Wang, N.-C.; Chen, Y.-L.; Huang, Y.-F.; Chen, C.-M.; Lin, W.-C.; Lee, C.-Y. An Energy Aware Grid-Based Clustering Power Efficient Data Aggregation Protocol for Wireless Sensor Networks. Appl. Sci. 2022, 12, 9877.
  35. Wang, N.-C.; Lee, C.-Y.; Chen, Y.-L.; Chen, C.-M.; Chen, Z.-Z. An Energy Efficient Load Balancing Tree-Based Data Aggregation Scheme for Grid-Based Wireless Sensor Networks. Sensors 2022, 22, 9303.
  36. Gandhi, G.S.; Vikas, K.; Ratnam, V.; Babu, K.S. Grid clustering and fuzzy reinforcement-learning based energy-efficient data aggregation scheme for distributed WSN. IET Commun. 2020, 14, 2840–2848.
  37. Shruti; Rani, S.; Singh, A.; Alkanhel, R.; Hassan, D.S.M. SDAFA: Secure Data Aggregation in Fog-Assisted Smart Grid Environment. Sustainability 2023, 15, 5071.
Subjects: Telecommunications
Update Date: 13 Mar 2024