Federated Learning Models Based on DAG Blockchain

With the development of the power internet of things, the traditional centralized computing pattern has become difficult to apply to many power business scenarios, including power load forecasting, substation defect detection, and demand-side response. How to perform machine learning tasks efficiently and reliably while ensuring that user data privacy is not violated has attracted wide attention in the industry. Blockchain-based federated learning (FL), proposed as a new decentralized and distributed learning framework for building privacy-enhanced IoT systems, is receiving increasing attention from scholars.

  • federated learning
  • DAG
  • communication overhead

1. Introduction

With the deep integration of internet of things (IoT) technology and the power grid, the intelligent development of the power IoT has gradually attracted attention. Coordinated scheduling among the power generation side, the user side, and the distribution network is key to the development of power systems, with scenarios including intelligent inspection, power load forecasting, and demand-side response. These tasks require a power system with trusted data sharing and big data mining capabilities. However, the development of new power systems faces several challenges. First, the traditional centralized computing framework is vulnerable to third-party attacks, and the data transmission process is exposed to the risks of data leakage and tampering [1]. Second, with the development of artificial intelligence technology, the number of model parameters has increased significantly, and the limited resources of IoT devices make it challenging to accommodate large models. Third, awareness of and concern about privacy are growing, and governments have implemented data privacy legislation such as the European Commission's General Data Protection Regulation (GDPR) [2] and the U.S. Consumer Privacy Bill of Rights [3].
In recent years, federated learning (FL) has been proposed as a distributed learning framework for building a data-privacy-enhanced power system. The authors in [4] suggested that FL can solve the privacy problem among data owners. Lu et al. [5] proposed a decentralized and secure FL model based on blockchain. This model integrates FL into the consensus process of the blockchain, which improves the system's security without the need for centralized trust. However, the traditional consensus mechanism causes extreme resource consumption.
To avoid the extra resource consumption caused by blockchain, Li et al. [6] proposed DAG consensus, a consensus mechanism designed around the structure of a directed acyclic graph (DAG). A blockchain system using such a consensus mechanism is called a DAG blockchain. Compared with proof-of-work (PoW) and proof-of-stake (PoS), which have been widely used in blockchain, a consensus mechanism built on the DAG can overcome shortcomings such as high resource consumption, high transaction fees, low transaction throughput, and long confirmation delays. An important aspect of DAG-based consensus mechanisms is the tip selection algorithm [7][8]. This algorithm determines which tips should be approved when the next transaction is issued, i.e., the parent nodes to which the newly published transaction is attached. In DAG consensus-based schemes, the traditional tip selection algorithm always chooses the tips with the highest weight, where the traditional transaction weight is computed by counting the number of approved transactions. Cao et al. [9] were the first to combine the DAG blockchain with FL (DAG-FL). DAG-FL adopts asynchronous FL: nodes are approved by verifying the accuracy of the tips' local models, and local models with comparable accuracy are selected to construct the global model.
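A minimal Python sketch of this weight-based selection rule is given below. It assumes the networkx library, orients edges from the approving transaction to the approved one, and reads a transaction's weight as the number of transactions it directly or indirectly approves; this is one possible interpretation of the rule described above, not the exact DAG-FL implementation, and the function names are illustrative.

```python
import networkx as nx  # assumed dependency used to represent the transaction DAG

def approval_weight(dag: nx.DiGraph, tx) -> int:
    """Weight of `tx`: how many transactions it directly or indirectly approves.

    Edges point from the approving transaction to the approved one, so every
    node reachable from `tx` has been approved by it.
    """
    return len(nx.descendants(dag, tx))

def select_tips(dag: nx.DiGraph, num_parents: int = 2):
    """Return the highest-weight tips (transactions that nobody has approved yet)."""
    tips = [n for n in dag.nodes if dag.in_degree(n) == 0]
    tips.sort(key=lambda t: approval_weight(dag, t), reverse=True)
    return tips[:num_parents]

# Toy tangle: tx2 approves tx1; tx1 and tx3 approve the genesis transaction.
dag = nx.DiGraph([("tx1", "genesis"), ("tx2", "tx1"), ("tx3", "genesis")])
print(select_tips(dag))  # ['tx2', 'tx3']: tx2 approves 2 transactions, tx3 approves 1
```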
However, there are two main problems in current DAG blockchain-based federated learning frameworks. First, the system communication overhead grows as the number of federated learning participants increases. Second, the model transmission process is vulnerable to gradient leakage attacks [10][11][12]. Therefore, how to realize an efficient and trustworthy federated learning framework with balanced learning accuracy has become an urgent problem.

2. Convergence Framework for DAG Blockchain and Federated Learning

Scholars have researched and proposed frameworks that converge the DAG blockchain and FL. The earliest one is DAG-FL, proposed by Mingrui Cao et al. [9] to solve the problems of device asynchrony and anomaly detection in an FL framework while avoiding the extra resource consumption brought by the blockchain. It is an FL framework using a blockchain based on a directed acyclic graph (DAG), and it achieves better training efficiency and modeling accuracy than existing typical on-device FL systems. However, it does not address the problem of communication overhead. Building on this, Beilharz et al. [7] proposed a framework called specializing directed acyclic graph federated learning (SDAG-FL). It not only overcomes challenges such as device heterogeneity, single points of failure, and poisoning attacks, but also provides a unified solution for decentralized and personalized FL. Again, however, it does not consider the communication overhead. Moreover, in IoT scenarios, computing nodes have limited computing and communication resources and strict energy constraints. To optimize the SDAG-FL system for the IoT, Xiaofeng Xue et al. [13] proposed an energy-efficient SDAG-FL framework based on an event-triggered communication mechanism, ESDAG-FL, which reasonably balances the training accuracy and specialization of the model and reduces energy consumption by nearly half. Inspired by this, the researchers propose a new communication-efficient SDAG-FL scheme, called CDAG-FL, and design an adaptive model compression method based on the k-means mechanism and an improved tip selection algorithm for the CDAG-FL system in the power IoT. An analysis of related research on blockchain and federated learning architectures is shown in Table 1.
Table 1. Comparison of the proposed scheme with existing methods.
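The core of such a k-means-based compression step can be sketched as follows. This is a minimal, generic illustration of weight clustering (quantization), assuming NumPy and scikit-learn are available; the adaptive part of the proposed CDAG-FL method (how the number of clusters is chosen per round) is not reproduced here, and the function names `kmeans_compress`/`kmeans_decompress` are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans  # assumed dependency

def kmeans_compress(weights: np.ndarray, k: int = 16):
    """Cluster the weight values into k centroids so that only the k centroid
    values plus one small integer index per weight need to be transmitted."""
    flat = weights.reshape(-1, 1)
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(flat)
    centroids = km.cluster_centers_.ravel()
    indices = km.labels_.astype(np.uint8)   # k <= 256, so one byte per weight
    return centroids, indices, weights.shape

def kmeans_decompress(centroids: np.ndarray, indices: np.ndarray, shape):
    """Rebuild an approximate weight tensor from the centroids and indices."""
    return centroids[indices].reshape(shape)
```

With k = 16, a client uploads one byte per parameter plus sixteen centroid floats instead of one 32-bit float per parameter, roughly a fourfold reduction in traffic at the cost of quantization error.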

3. Communication Overhead Issues

To address the problem of reducing the communication overhead in FL, Mingzhe Chen et al. [14] used stochastic gradient quantization to compress the local gradients. They optimized the quantization level of each device under multi-access channel capacity constraints to minimize the optimality gap, thereby reducing the communication overhead of FL. However, the reduction in convergence speed that model compression brings to FL must also be considered. Wei Yang et al. [15] analyzed the effect of a fixed compression rate on the number of iterations and the training error during training, showed that a suitable compression rate allows the compression algorithm to perform better, and proposed an adaptive gradient compression algorithm that assigns each client a unique compression rate according to its actual characteristics in order to improve communication performance. However, it does not consider the influence of the client training process. Peng Luo et al. [16] proposed a new ProbComp-LPAC algorithm, which selects gradients with a probability equation and uses different compression rates in different layers of a deep neural network; layers with more parameters are given a lower compression rate, yielding higher accuracy. ProbComp-LPAC is not only faster in training but also achieves high accuracy. However, the compression rate of each layer needs to be adjusted manually, which limits its effectiveness.
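To make the gradient-compression idea concrete, the sketch below keeps each gradient entry with a per-layer probability and rescales the kept entries so the compressed gradient remains an unbiased estimate. It is a generic illustration rather than the ProbComp-LPAC algorithm of [16], whose probability equation and per-layer rate assignment follow the cited paper; the helper names are hypothetical.

```python
import numpy as np

def prob_compress_layer(grad: np.ndarray, keep_prob: float, rng=None):
    """Keep each gradient entry independently with probability `keep_prob` and
    rescale the survivors by 1/keep_prob so the result stays an unbiased
    estimate of the original gradient; only the kept values and their
    positions would be transmitted."""
    rng = np.random.default_rng() if rng is None else rng
    mask = rng.random(grad.shape) < keep_prob
    return np.where(mask, grad / keep_prob, 0.0), mask

def prob_compress_model(grads_by_layer: dict, keep_probs: dict):
    """Apply a per-layer keep probability to every layer's gradient."""
    return {name: prob_compress_layer(g, keep_probs[name])[0]
            for name, g in grads_by_layer.items()}
```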

References

  1. Cui, L.; Yang, S.; Chen, Z.; Pan, Y.; Xu, M.; Xu, K. An Efficient and Compacted DAG-Based Blockchain Protocol for Industrial Internet of Things. IEEE Trans. Ind. Inform. 2020, 16, 4134–4145.
  2. Custers, B.; Sears, A.M.; Dechesne, F.; Georgieva, I.; Tani, T.; Van der Hof, S. EU Personal Data Protection in Policy and Practice; T.M.C. Asser Press: The Hague, The Netherlands, 2019; Volume 29.
  3. Gaff, B.M.; Sussman, H.E.; Geetter, J. Privacy and Big Data. Computer 2014, 47, 7–9.
  4. Qi, Y.; Hossain, M.S.; Nie, J.; Li, X. Privacy-preserving blockchain-based federated learning for traffic flow prediction. Future Gener. Comput. Syst. 2021, 117, 328–337.
  5. Lu, Y.; Huang, X.; Dai, Y.; Maharjan, S.; Zhang, Y. Blockchain and Federated Learning for Privacy-Preserved Data Sharing in Industrial IoT. IEEE Trans. Ind. Inform. 2020, 16, 4177–4186.
  6. Li, Y.; Cao, B.; Peng, M.; Zhang, L.; Zhang, L.; Feng, D.; Yu, J. Direct Acyclic Graph-Based Ledger for Internet of Things: Performance and Security Analysis. IEEE/ACM Trans. Netw. 2020, 28, 1643–1656.
  7. Beilharz, J.; Pfitzner, B.; Schmid, R.; Geppert, P.; Arnrich, B.; Polze, A. Implicit Model Specialization through DAG-Based Decentralized Federated Learning. In Proceedings of the Middleware’21: 22nd International Middleware Conference, Québec City, QC, Canada, 6–10 December 2021; pp. 310–322.
  8. Schmid, R.; Pfitzner, B.; Beilharz, J.; Arnrich, B.; Polze, A. Tangle Ledger for Decentralized Learning. In Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), New Orleans, LA, USA, 18–22 May 2020; pp. 852–859.
  9. Cao, M.; Zhang, L.; Cao, B. Toward On-Device Federated Learning: A Direct Acyclic Graph-Based Blockchain Approach. IEEE Trans. Neural Networks Learn. Syst. 2023, 34, 2028–2042.
  10. Zhu, L.; Liu, Z.; Han, S. Deep Leakage from Gradients. arXiv 2019, arXiv:1906.08935.
  11. Zhao, B.; Mopuri, K.R.; Bilen, H. iDLG: Improved Deep Leakage from Gradients. arXiv 2020, arXiv:2001.02610.
  12. Geiping, J.; Bauermeister, H.; Dröge, H.; Moeller, M. Inverting Gradients—How Easy Is It to Break Privacy in Federated Learning? arXiv 2020, arXiv:2003.14053.
  13. Xue, X.; Mao, H.; Li, Q.; Huang, F.; Abd El-Latif, A.A. An Energy Efficient Specializing DAG Federated Learning Based on Event-Triggered Communication. Mathematics 2022, 10, 4388.
  14. Chen, M.; Yang, Z.; Saad, W.; Yin, C.; Poor, H.V.; Cui, S. A Joint Learning and Communications Framework for Federated Learning Over Wireless Networks. IEEE Trans. Wirel. Commun. 2021, 20, 269–283.
  15. Yang, W.; Yang, Y.; Dang, X.; Jiang, H.; Zhang, Y.; Xiang, W. A Novel Adaptive Gradient Compression Approach for Communication-Efficient Federated Learning. In Proceedings of the 2021 China Automation Congress (CAC), Beijing, China, 22–24 October 2021; pp. 674–678.
  16. Luo, P.; Yu, F.R.; Chen, J.; Li, J.; Leung, V.C.M. A Novel Adaptive Gradient Compression Scheme: Reducing the Communication Overhead for Distributed Deep Learning in the Internet of Things. IEEE Internet Things J. 2021, 8, 11476–11486.