Federated Learning Algorithms for IoT

Please note this is a comparison between Version 2 by Nora Tang and Version 1 by Mehreen Tahir.

Federated Learning (FL) is a state-of-the-art technique for building machine learning (ML) models from distributed data sets. It enables In-Edge AI, preserves data locality, protects user data, and allows data ownership. These characteristics make FL a suitable choice for IoT networks, whose infrastructure is intrinsically distributed. However, FL presents a few unique challenges; the most noteworthy is training over largely heterogeneous data samples on IoT devices. The heterogeneity of devices and models in complex IoT networks greatly influences the FL training process and makes traditional FL unsuitable for direct deployment. Many recent research works claim to mitigate the negative impact of heterogeneity in FL networks, but the effectiveness of these proposed solutions has never been studied and quantified.

federated learning
distributed machine learning
Internet of Things

1. Heterogeneity in Federated IoT Networks

IoT networks are intrinsically heterogeneous. In real-life scenarios, FL is deployed over an IoT network whose devices differ in data samples, capabilities, availability, network quality, and battery levels. As a result, heterogeneity is evident and affects the performance of a federated network. This section breaks down the heterogeneity and briefly discusses its two main categories: statistical heterogeneity and system heterogeneity.

1.1. Statistical Heterogeneity

Distributed optimization problems are often modeled under the assumption that data is Independent and Identically Distributed (IID). However, IoT devices generate and collect data in a highly dependent and inconsistent fashion, so this assumption rarely holds. The number of data points also varies significantly across devices, which adds complexity to problem modeling, solution formulation, analysis, and optimization. Moreover, the devices may be related to one another, and there might be an underlying statistical structure capturing their relationship. When the aim is to learn a single, globally shared model, statistical heterogeneity makes it difficult to achieve optimal performance.
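As an illustration of statistical heterogeneity, the following minimal sketch splits a labeled data set across devices with skewed label mixes and unequal sample counts. It uses a Dirichlet-based partition, which is a common way to simulate non-IID data in FL experiments and is an assumption here, not a method taken from this entry. The function name `dirichlet_partition` and the parameters `alpha`, `num_devices`, and `seed` are hypothetical.

```python
import numpy as np

def dirichlet_partition(labels, num_devices, alpha, seed=0):
    """Split sample indices across devices with label skew.

    Smaller alpha gives more skewed (less IID) per-device label mixes.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    device_indices = [[] for _ in range(num_devices)]
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        rng.shuffle(idx)
        # Share of this class that each device receives.
        shares = rng.dirichlet(alpha * np.ones(num_devices))
        # Convert the shares into cut points in the shuffled index array.
        cuts = (np.cumsum(shares)[:-1] * len(idx)).astype(int)
        for device, chunk in enumerate(np.split(idx, cuts)):
            device_indices[device].extend(chunk.tolist())
    return device_indices

# Hypothetical data set: 10 classes with 100 samples each.
labels = np.repeat(np.arange(10), 100)
parts = dirichlet_partition(labels, num_devices=5, alpha=0.1)
sizes = [len(p) for p in parts]  # per-device sample counts, typically unequal
```

With a small `alpha`, most devices hold only a few classes and the entries of `sizes` differ widely, which mirrors the two effects described above: skewed local distributions and varying numbers of data points.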

1.2. System Heterogeneity

IoT devices in a network are very likely to have different underlying hardware (CPU, memory). These devices might also operate at different battery levels and use different communication protocols (WiFi, LTE, etc.). Consequently, the computational, storage, and communication capabilities differ for each device in the network. Moreover, IoT networks have to cope with stragglers as well. Low-level IoT devices operate on low battery power and bandwidth and can become unavailable at any given time.

The aforementioned system-level characteristics can introduce many challenges when training ML models over the edge. For example, a federated network may consist of hundreds of thousands of low-level IoT devices, but only a handful of active devices might take part in the training. Such situations can bias trained models towards the active devices. Moreover, low participation can result in a long convergence time when training. For these reasons, heterogeneity is one of the main challenges for federated IoT networks, and federated algorithms must be robust, heterogeneity-aware, and fault-tolerant. Recently, a few studies have claimed to address the challenge of heterogeneity.
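The bias towards active devices can be illustrated with a small sketch. It assumes that the server combines local models by sample-size-weighted averaging, a common aggregation rule in FL, and that devices which drop out simply contribute nothing to the round. The function `aggregate` and all numbers below are hypothetical.

```python
import numpy as np

def aggregate(updates, num_samples, active):
    """Sample-size-weighted average over the devices that reported back."""
    updates = np.asarray(updates, dtype=float)
    weights = np.asarray(num_samples, dtype=float) * np.asarray(active, dtype=float)
    if weights.sum() == 0:
        raise ValueError("no active devices in this round")
    return (weights[:, None] * updates).sum(axis=0) / weights.sum()

# Hypothetical round: four devices, each holding a one-parameter local model.
local_models = [[1.0], [1.2], [5.0], [5.4]]
num_samples = [100, 100, 100, 100]

# All devices participate.
full = aggregate(local_models, num_samples, active=[1, 1, 1, 1])
# The last two devices are stragglers and drop out.
partial = aggregate(local_models, num_samples, active=[1, 1, 0, 0])
```

Here `full` is about 3.15, while `partial` is about 1.1: when the last two devices drop out, the global model reflects only the data of the first two.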

2. Background and Related Work

In the past few years, there has been a paradigm shift in the way ML is applied, and FL has emerged as a leading approach in systems driven by privacy concerns and deep learning [1,2,3]. FL is being widely adopted due to its compatibility with GDPR requirements, and it can be said to be laying the foundation for next-generation ML applications. Although FL shows promising results, it also brings unique challenges, such as communication efficiency, heterogeneity, and privacy, which are thoroughly discussed in [4,5,6,7,8]. To mitigate these challenges, various techniques have been presented over the last few years. For example, [9] presented an adaptive averaging strategy, and the authors in [10] presented an In-Edge AI framework to tackle the communication bottleneck in federated networks. To deal with the resource optimization problem, [11] focused on the design aspects of enabling FL at the network edge. In contrast, [12] presented the Dispersed Federated Learning (DFL) framework to provide resource optimization for FL networks.

Heterogeneity is one of the major challenges faced by federated IoT networks. However, early FL approaches neither consider system and statistical heterogeneity in their design [13,14] nor are straggler-aware. Instead, they assume uniform participation from all clients and sample a fixed number of data parties in each learning epoch to ensure performance and fair contribution from all clients. Due to these unrealistic assumptions, such FL approaches suffer significant performance loss and often lead to model divergence under heterogeneous network conditions.

Previously, many research works have tried to mitigate heterogeneity problems in distributed systems via system-level and algorithmic solutions [15,16,17,18]. In this context, heterogeneity results from the different hardware capabilities of devices (system heterogeneity) and leads to performance degradation due to stragglers. However, these conventional methods cannot handle the scale of federated networks. Moreover, heterogeneity in FL settings is not limited to hardware and device capabilities. Various other system artifacts, such as data distribution [19], client sampling [20], and user behavior, also introduce heterogeneity (known as statistical heterogeneity) in the network.

Recently, various techniques have been presented to tackle heterogeneity in a federated network. In [21], the authors proposed to tackle heterogeneity via client sampling. Their method uses a deadline to filter out all the stragglers. However, it does not consider how excluding the straggler parties affects model training. Similarly, [22] proposed to reduce the total training time via adaptive client sampling while ignoring the model bias. FedProx [23] allows client devices to perform a variable amount of work depending on their available system resources and also adds a proximal term to the objective to account for the associated heterogeneity. A few other works in this area proposed reinforcement learning-based techniques to mitigate the negative effects of heterogeneity [24,25]. Furthermore, algorithmic solutions have also been proposed that mainly focus on tackling statistical heterogeneity in the federated network. In [26], the authors proposed a variance reduction technique to tackle data heterogeneity. Similarly, [27] proposed a new design strategy from a primal-dual optimization perspective to achieve communication efficiency and adaptivity to the level of heterogeneity among the local data. However, these techniques do not consider the communication capabilities of the participating devices. Furthermore, they have not been tested in real-life scenarios, so their actual performance relative to the reported performance remains unknown. Comparing the conventional and the newly emerging federated systems in terms of heterogeneity and distribution helps us understand the open challenges as well as track the progress of federated systems [28].
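The proximal term used by FedProx [23] can be sketched as follows. The sketch assumes the commonly stated form of the local objective, in which a device minimizes its local loss plus (mu / 2) * ||w - w_global||^2, where w_global is the current global model. The least-squares loss, the plain gradient-descent solver, and the names `fedprox_local_update`, `lr`, and `num_steps` are illustrative assumptions, not details taken from this entry.

```python
import numpy as np

def fedprox_local_update(w_global, X, y, mu, lr, num_steps):
    """Gradient descent on a local least-squares loss plus a proximal term.

    Local objective (assumed for illustration):
        (1 / 2n) * ||X w - y||^2 + (mu / 2) * ||w - w_global||^2
    """
    w = w_global.copy()
    n = len(y)
    for _ in range(num_steps):
        grad_loss = X.T @ (X @ w - y) / n   # gradient of the local loss
        grad_prox = mu * (w - w_global)     # pulls w back towards the global model
        w -= lr * (grad_loss + grad_prox)
    return w

# Hypothetical device data.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([2.0, -1.0, 0.5])
w_global = np.zeros(3)

# mu = 0 recovers plain local training; mu > 0 limits drift from w_global.
w_free = fedprox_local_update(w_global, X, y, mu=0.0, lr=0.1, num_steps=20)
w_prox = fedprox_local_update(w_global, X, y, mu=1.0, lr=0.1, num_steps=20)
```

In this sketch, `num_steps` stands in for the variable amount of local work: a resource-constrained device could run fewer steps than a capable one, while the proximal term keeps each local model from drifting too far from the global model.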

A few studies have also been presented to understand the impact of heterogeneity in FL training. In [29], the authors demonstrated the potential impact of system heterogeneity by allocating varied CPU resources to the participants. However, they only focused on training time and did not consider the impact on model performance. In [30], the authors characterized the impact of heterogeneity on FL training, but they mainly focused on system heterogeneity while ignoring the other types of heterogeneity in the systems. Similarly, in [31], the authors used large-scale smartphone data to understand the impact of heterogeneity but did not account for stragglers. Moreover, none of the studies mentioned above analyzed the effectiveness of state-of-the-art FL algorithms under heterogeneous network conditions.

References

Abdulrahman, S.; Tout, H.; Ould-Slimane, H.; Mourad, A.; Talhi, C.; Guizani, M. A Survey on Federated Learning: The Journey from Centralized to Distributed On-Site Learning and Beyond. IEEE Internet Things J. 2021, 8, 5476–5497.
Zhang, P.; Sun, H.; Situ, J.; Jiang, C.; Xie, D. Federated Transfer Learning for IIoT Devices with Low Computing Power Based on Blockchain and Edge Computing. IEEE Access 2021, 9, 98630–98638.
Zhang, P.; Wang, C.; Jiang, C.; Han, Z. Deep Reinforcement Learning Assisted Federated Learning Algorithm for Data Management of IIoT. IEEE Trans. Ind. Inform. 2021, 17, 8475–8484.
Bonawitz, K.; Eichner, H.; Grieskamp, W.; Huba, D.; Ingerman, A.; Ivanov, V.; Kiddon, C.; Konečný, J.; Mazzocchi, S.; McMahan, H.B.; et al. Towards Federated Learning at Scale: System Design. arXiv 2019, arXiv:1902.01046.
Kairouz, P.; McMahan, H.B.; Avent, B.; Bellet, A.; Bennis, M.; Bhagoji, A.; Bonawitz, K.; Charles, Z.; Cormode, G.; Cummings, R.; et al. Advances and Open Problems in Federated Learning. arXiv 2019, arXiv:1912.04977.
Khan, L.U.; Saad, W.; Han, Z.; Hossain, E.; Hong, C.S. Federated Learning for Internet of Things: Recent Advances, Taxonomy, and Open Challenges. arXiv 2021, arXiv:2009.13012.
Li, T.; Sahu, A.K.; Talwalkar, A.; Smith, V. Federated Learning: Challenges, Methods, and Future Directions. arXiv 2019, arXiv:1908.07873.
Aledhari, M.; Razzak, R.; Parizi, R.M.; Saeed, F. Federated Learning: A Survey on Enabling Technologies, Protocols, and Applications. IEEE Access 2020, 8, 140699–140725.
Leroy, D.; Coucke, A.; Lavril, T.; Gisselbrecht, T.; Dureau, J. Federated Learning for Keyword Spotting. arXiv 2019, arXiv:1810.05512.
Wang, X.; Han, Y.; Wang, C.; Zhao, Q.; Chen, X.; Chen, M. In-Edge AI: Intelligentizing Mobile Edge Computing, Caching and Communication by Federated Learning. IEEE Netw. 2019, 33, 156–165.
Khan, L.U.; Pandey, S.R.; Tran, N.H.; Saad, W.; Han, Z.; Nguyen, M.N.H.; Hong, C.S. Federated Learning for Edge Networks: Resource Optimization and Incentive Mechanism. IEEE Commun. Mag. 2020, 58, 88–93.
Khan, L.U.; Alsenwi, M.; Yaqoob, I.; Imran, M.; Han, Z.; Hong, C.S. Resource Optimized Federated Learning-Enabled Cognitive Internet of Things for Smart Industries. IEEE Access 2020, 8, 168854–168864.
Federated Learning: Collaborative Machine Learning without Centralized Training Data, 2017. Available online: https://bigmedium.com/ideas/links/federated-learning.html (accessed on 18 April 2022).
McMahan, H.B.; Moore, E.; Ramage, D.; Hampson, S.; Agüera y Arcas, B. Communication-Efficient Learning of Deep Networks from Decentralized Data. arXiv 2016, arXiv:1602.05629.
Chen, C.Y.; Choi, J.; Brand, D.; Agrawal, A.; Zhang, W.; Gopalakrishnan, K. AdaComp: Adaptive Residual Gradient Compression for Data-Parallel Distributed Training. arXiv 2017, arXiv:1712.02679.
Jiang, J.; Cui, B.; Zhang, C.; Yu, L. Heterogeneity-aware Distributed Parameter Servers. In Proceedings of the 2017 ACM International Conference on Management of Data, Chicago, IL, USA, 14–19 May 2017.
Schäfer, D.; Edinger, J.; VanSyckel, S.; Paluska, J.M.; Becker, C. Tasklets: Overcoming Heterogeneity in Distributed Computing Systems. In Proceedings of the 2016 IEEE 36th International Conference on Distributed Computing Systems Workshops (ICDCSW), Nara, Japan, 27–30 June 2016.
Thomas, J.; Sycara, K. Heterogeneity, stability, and efficiency in distributed systems. In Proceedings of the International Conference on Multi Agent Systems (Cat. No.98EX160), Paris, France, 3–7 July 1998.
Zawad, S.; Ali, A.; Chen, P.Y.; Anwar, A.; Zhou, Y.; Baracaldo, N.; Tian, Y.; Yan, F. Curse or Redemption? How Data Heterogeneity Affects the Robustness of Federated Learning. arXiv 2021, arXiv:2102.00655.
Cho, Y.J.; Wang, J.; Joshi, G. Client Selection in Federated Learning: Convergence Analysis and Power-of-Choice Selection Strategies. arXiv 2020, arXiv:2010.01243.
Nishio, T.; Yonetani, R. Client Selection for Federated Learning with Heterogeneous Resources in Mobile Edge. In Proceedings of the ICC 2019—2019 IEEE International Conference on Communications (ICC), Shanghai, China, 20–24 May 2019.
Luo, B.; Xiao, W.; Wang, S.; Huang, J.; Tassiulas, L. Tackling System and Statistical Heterogeneity for Federated Learning with Adaptive Client Sampling. arXiv 2021, arXiv:2112.11256.
Li, T.; Sahu, A.K.; Zaheer, M.; Sanjabi, M.; Talwalkar, A.; Smith, V. Federated Optimization in Heterogeneous Networks. arXiv 2020, arXiv:1812.06127.
Pang, J.; Huang, Y.; Xie, Z.; Han, Q.; Cai, Z. Realizing the Heterogeneity: A Self-Organized Federated Learning Framework for IoT. IEEE Internet Things J. 2021, 8, 3088–3098.
Wu, Q.; He, K.; Chen, X. Personalized Federated Learning for Intelligent IoT Applications: A Cloud-Edge based Framework. IEEE Comput. Graph. Appl. 2020, 1, 35–44.
Karimireddy, S.P.; Kale, S.; Mohri, M.; Reddi, S.J.; Stich, S.U.; Suresh, A.T. SCAFFOLD: Stochastic Controlled Averaging for Federated Learning. arXiv 2021, arXiv:1910.06378.
Zhang, X.; Hong, M.; Dhople, S.; Yin, W.; Liu, Y. FedPD: A Federated Learning Framework with Optimal Rates and Adaptivity to Non-IID Data. arXiv 2020, arXiv:2005.11418.
Li, Q.; Wen, Z.; Wu, Z.; Hu, S.; Wang, N.; Li, Y.; Liu, X.; He, B. A Survey on Federated Learning Systems: Vision, Hype and Reality for Data Privacy and Protection. arXiv 2021, arXiv:1907.09693.
Chai, Z.; Anwar, A.; Zhou, Y.; Baracaldo, N.; Ludwig, H.; Fayyaz, H.; Fayyaz, Z.; Cheng, Y. Towards Taming the Resource and Data Heterogeneity in Federated Learning, 2019. Available online: https://mason-leap-lab.github.io/docs/opml19-fl.pdf (accessed on 18 April 2022).
Abdelmoniem, A.M.; Ho, C.Y.; Papageorgiou, P.; Bilal, M.; Canini, M. On the Impact of Device and Behavioral Heterogeneity in Federated Learning. arXiv 2021, arXiv:2102.07500.
Yang, C.; Wang, Q.; Xu, M.; Chen, Z.; Bian, K.; Liu, Y.; Liu, X. Characterizing Impacts of Heterogeneity in Federated Learning upon Large-Scale Smartphone Data. In Proceedings of the Web Conference 2021, Ljubljana, Slovenia, 19–23 April 2021.