Federated Learning Models Based on DAG Blockchain: History

With the development of the power internet of things (IoT), the traditional centralized computing pattern has become difficult to apply to many power business scenarios, including power load forecasting, substation defect detection, and demand-side response. How to perform efficient and reliable machine learning tasks while ensuring that user data privacy is not violated has attracted the attention of the industry. Blockchain-based federated learning (FL), proposed as a new decentralized and distributed learning framework for building privacy-enhanced IoT systems, is attracting growing attention from scholars.

  • federated learning
  • DAG
  • communication overhead

1. Introduction

With the deep integration of internet of things (IoT) technology and the power grid, the intelligent development of the power IoT has gradually attracted attention. Coordinated scheduling between the power generation side, the user side, and the distribution network is key to the development of power systems. Typical scenarios include intelligent inspection, power load forecasting, and demand-side response. These tasks require a power system with trusted data sharing and big data mining capabilities. However, the development of new power systems faces several challenges. First, the traditional centralized computing framework is vulnerable to attacks by third parties, and the data transmission process is at risk of data leakage and tampering [1]. Second, with the development of artificial intelligence technology, the number of model parameters has grown significantly, and the limited resources of IoT devices make it challenging to keep pace with large models. Third, public awareness of and concern about privacy are growing, and governments have enacted data privacy legislation, such as the European Commission’s General Data Protection Regulation (GDPR) [2] and the U.S. Consumer Privacy Bill of Rights [3].
In recent years, federated learning (FL) has been proposed as a distributed learning framework for building a data privacy-enhanced power system. The authors in [4] suggested that FL can solve the privacy problem among data owners. Lu et al. [5] proposed a decentralized and secure FL model based on blockchain. This model integrates FL into the consensus process of blockchain, which improves the system’s security without the need for centralized trust. However, the traditional consensus mechanism causes extreme resource consumption.
To avoid the extra resource consumption caused by blockchain, Li et al. [6] proposed DAG consensus, a consensus mechanism designed around the structure of a directed acyclic graph (DAG). A blockchain system using such a consensus mechanism is called a DAG blockchain. Compared with proof-of-work (PoW) and proof-of-stake (PoS), which have been widely used in blockchain, a DAG-based consensus mechanism can overcome shortcomings such as high resource consumption, high transaction fees, low transaction throughput, and long confirmation delay. An important component of DAG-based consensus mechanisms is the tip selection algorithm [7][8], which determines the tips that should be approved when the next transaction is issued, i.e., the parent nodes to which the newly published transaction connects. In DAG consensus-based schemes, the traditional tip selection algorithm always chooses the tips with the highest weight, where a transaction’s weight is computed by counting the number of transactions it approves. Cao et al. [9] first combined the DAG blockchain with FL (DAG-FL). DAG-FL adopts asynchronous FL: when approving tips, nodes verify the accuracy of the attached local models, and only local models with sufficient accuracy are selected to construct the global model.
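To make the weight-based tip selection rule concrete, the following minimal Python sketch selects tips by counting how many transactions each candidate approves. The class and function names here are hypothetical illustrations, not taken from the cited papers:

```python
class Transaction:
    """A node in the DAG ledger; `parents` are the transactions it approves."""
    def __init__(self, tx_id, parents=()):
        self.tx_id = tx_id
        self.parents = list(parents)

def weight(tx):
    """Traditional weight: the number of transactions `tx` approves,
    directly or indirectly (the size of its past cone)."""
    seen, stack = set(), list(tx.parents)
    while stack:
        node = stack.pop()
        if node.tx_id not in seen:
            seen.add(node.tx_id)
            stack.extend(node.parents)
    return len(seen)

def select_tips(transactions, k=2):
    """Return the k unapproved transactions (tips) with the highest weight."""
    approved = {p.tx_id for t in transactions for p in t.parents}
    tips = [t for t in transactions if t.tx_id not in approved]
    return sorted(tips, key=weight, reverse=True)[:k]
```

A newly issued transaction would then list the returned tips as its parents; in DAG-FL, approval additionally requires verifying the accuracy of the local models attached to the candidate tips.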
However, there are two main problems in the current DAG blockchain-based federated learning framework: first, the system communication overhead increases as the number of federated learning model participants increases. Second, the model transmission process is vulnerable to gradient leakage attacks [10][11][12]. Therefore, how to realize an efficient and trustworthy federated learning framework with balanced learning accuracy has become an urgent problem to be solved.

2. Convergence Framework for DAG Blockchain and Federated Learning

Scholars have researched and proposed frameworks for converging the DAG blockchain and FL. The earliest one is DAG-FL, proposed by Cao et al. [9] to solve the problems of device asynchrony and anomaly detection in an FL framework while avoiding the extra resource consumption brought by blockchain. It proposes an FL framework built on a blockchain based on a directed acyclic graph (DAG), which achieves better training efficiency and model accuracy than existing typical on-device FL systems. However, it does not address the problem of communication overhead. Building on this, Beilharz et al. [7] proposed a framework called specializing directed acyclic graph federated learning (SDAG-FL). It not only overcomes the challenges of device heterogeneity, single points of failure, and poisoning attacks, but also provides a unified solution for decentralized and personalized FL. Again, however, it does not consider the communication overhead.
However, in IoT scenarios, the computing nodes have limited computing and communication resources and strict energy constraints. To optimize the SDAG-FL system for the IoT, Xue et al. [13] proposed an energy-efficient SDAG-FL framework based on an event-triggered communication mechanism, i.e., ESDAG-FL. ESDAG-FL achieves a reasonable balance between the training accuracy and the specialization of the model and reduces energy consumption by nearly half. Inspired by this, this research proposes a new communication-efficient SDAG-FL scheme, called CDAG-FL. The researchers design an adaptive model compression method based on the k-means mechanism and an improved tip selection algorithm for the CDAG-FL system in the power IoT. The relevant research analysis of blockchain and federated learning architectures is shown in Table 1.
Table 1. Comparing the proposed scheme with existing methods.
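As a rough illustration of the idea behind k-means-based model compression, the model weights can be clustered into a small codebook of centroids so that only the centroids and per-weight indices need to be transmitted. This is a minimal sketch under assumed details; the actual adaptive method designed for CDAG-FL is more involved:

```python
def kmeans_1d(values, k, iters=20):
    """Cluster scalar weights into k centroids (Lloyd's algorithm in 1-D)."""
    step = max(1, len(values) // k)
    centroids = sorted(values)[::step][:k]  # spread initial centroids over the range
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids

def compress(weights, k=4):
    """Encode each weight as the index of its nearest centroid."""
    centroids = kmeans_1d(weights, k)
    indices = [min(range(len(centroids)), key=lambda i: abs(w - centroids[i]))
               for w in weights]
    return centroids, indices

def decompress(centroids, indices):
    """Reconstruct an approximate weight vector from the codebook."""
    return [centroids[i] for i in indices]
```

With k centroids, each transmitted weight costs only about log2(k) bits of index instead of a 32-bit float, which is where the communication saving comes from.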

3. Communication Overhead Issues

To reduce the communication overhead in FL, Chen et al. [14] used stochastic gradient quantization to compress the local gradients. They optimized the quantization level of each device under multi-access channel capacity constraints to minimize the optimality gap, which reduces the communication overhead of FL. However, researchers must also consider the slowdown in convergence that model compression brings to FL. Yang et al. [15] analyzed the effect of a fixed compression rate on the number of iterations and the training error during training, showed that a suitable compression rate allows the compression algorithm to perform better, and proposed an adaptive gradient compression algorithm that assigns each client a unique compression rate according to its actual characteristics, improving communication performance. However, it does not consider the influence of the client training process. Luo et al. [16] proposed the ProbComp-LPAC algorithm, which selects gradients with a probability equation and uses different compression rates in different layers of a deep neural network: the more parameters a layer has, the lower its compression rate and the higher the resulting accuracy. ProbComp-LPAC not only trains faster but also achieves high accuracy. However, the compression rate of each layer must be adjusted manually, which limits its effect.
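As a plain illustration of gradient sparsification in general (a generic top-k sketch, not the ProbComp-LPAC algorithm itself), each client can transmit only the largest-magnitude fraction of its gradient entries:

```python
def topk_sparsify(grad, ratio):
    """Keep only the largest-magnitude `ratio` fraction of gradient entries,
    returning a sparse {index: value} representation for transmission."""
    k = max(1, int(len(grad) * ratio))  # always keep at least one entry
    idx = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)[:k]
    return {i: grad[i] for i in idx}
```

A layer-adaptive scheme in the spirit of ProbComp-LPAC would then choose a different `ratio` for each layer of the network rather than one global value.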

This entry is adapted from the peer-reviewed paper 10.3390/electronics12173712

References

  1. Cui, L.; Yang, S.; Chen, Z.; Pan, Y.; Xu, M.; Xu, K. An Efficient and Compacted DAG-Based Blockchain Protocol for Industrial Internet of Things. IEEE Trans. Ind. Inform. 2020, 16, 4134–4145.
  2. Custers, B.; Sears, A.M.; Dechesne, F.; Georgieva, I.; Tani, T.; Van der Hof, S. EU Personal Data Protection in Policy and Practice; T.M.C. Asser Press: The Hague, The Netherlands, 2019; Volume 29.
  3. Gaff, B.M.; Sussman, H.E.; Geetter, J. Privacy and Big Data. Computer 2014, 47, 7–9.
  4. Qi, Y.; Hossain, M.S.; Nie, J.; Li, X. Privacy-preserving blockchain-based federated learning for traffic flow prediction. Future Gener. Comput. Syst. 2021, 117, 328–337.
  5. Lu, Y.; Huang, X.; Dai, Y.; Maharjan, S.; Zhang, Y. Blockchain and Federated Learning for Privacy-Preserved Data Sharing in Industrial IoT. IEEE Trans. Ind. Inform. 2020, 16, 4177–4186.
  6. Li, Y.; Cao, B.; Peng, M.; Zhang, L.; Zhang, L.; Feng, D.; Yu, J. Direct Acyclic Graph-Based Ledger for Internet of Things: Performance and Security Analysis. IEEE/ACM Trans. Netw. 2020, 28, 1643–1656.
  7. Beilharz, J.; Pfitzner, B.; Schmid, R.; Geppert, P.; Arnrich, B.; Polze, A. Implicit Model Specialization through DAG-Based Decentralized Federated Learning. In Proceedings of the Middleware’21: 22nd International Middleware Conference, Québec City, QC, Canada, 6–10 December 2021; pp. 310–322.
  8. Schmid, R.; Pfitzner, B.; Beilharz, J.; Arnrich, B.; Polze, A. Tangle Ledger for Decentralized Learning. In Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), New Orleans, LA, USA, 18–22 May 2020; pp. 852–859.
  9. Cao, M.; Zhang, L.; Cao, B. Toward On-Device Federated Learning: A Direct Acyclic Graph-Based Blockchain Approach. IEEE Trans. Neural Networks Learn. Syst. 2023, 34, 2028–2042.
  10. Zhu, L.; Liu, Z.; Han, S. Deep Leakage from Gradients. arXiv 2019, arXiv:1906.08935.
  11. Zhao, B.; Mopuri, K.R.; Bilen, H. iDLG: Improved Deep Leakage from Gradients. arXiv 2020, arXiv:2001.02610.
  12. Geiping, J.; Bauermeister, H.; Dröge, H.; Moeller, M. Inverting Gradients—How Easy Is It to Break Privacy in Federated Learning? arXiv 2020, arXiv:2003.14053.
  13. Xue, X.; Mao, H.; Li, Q.; Huang, F.; Abd El-Latif, A.A. An Energy Efficient Specializing DAG Federated Learning Based on Event-Triggered Communication. Mathematics 2022, 10, 4388.
  14. Chen, M.; Yang, Z.; Saad, W.; Yin, C.; Poor, H.V.; Cui, S. A Joint Learning and Communications Framework for Federated Learning Over Wireless Networks. IEEE Trans. Wirel. Commun. 2021, 20, 269–283.
  15. Yang, W.; Yang, Y.; Dang, X.; Jiang, H.; Zhang, Y.; Xiang, W. A Novel Adaptive Gradient Compression Approach for Communication-Efficient Federated Learning. In Proceedings of the 2021 China Automation Congress (CAC), Beijing, China, 22–24 October 2021; pp. 674–678.
  16. Luo, P.; Yu, F.R.; Chen, J.; Li, J.; Leung, V.C.M. A Novel Adaptive Gradient Compression Scheme: Reducing the Communication Overhead for Distributed Deep Learning in the Internet of Things. IEEE Internet Things J. 2021, 8, 11476–11486.