Federated Learning Based on Deep Reinforcement Learning

Federated learning (FL) is a distributed machine learning paradigm that enables a large number of clients to collaboratively train models without sharing data.

  • federated learning
  • deep reinforcement learning
  • client selection

1. Introduction

The application of deep learning technology in the Internet of Things (IoT) is very common, with uses in smart healthcare, smart transportation, and smart cities [1]. However, the massive amounts of data in the IoT impose limitations on traditional centralized machine learning in terms of network resources, data privacy, etc. The proposal of federated learning (FL) provides an effective solution to deep learning problems involving data privacy issues. Clients can collaborate with other clients in training a global model without the need to share their local data [2]. FL has been successfully applied in many domains [3][4][5][6][7]. However, the presence of data heterogeneity among clients adversely affects the model convergence and training accuracy of FL.

In real-world scenarios, the local datasets of different clients exhibit heterogeneity, meaning that their local data distributions differ from the global data distribution of the entire FL system. Several studies have demonstrated that the heterogeneity of data among clients significantly impacts the effectiveness of FL methods, leading to a substantial reduction in model accuracy [8][9]. Specifically, heterogeneity among clients can lead to inconsistent convergence targets for their local training. Aggregating local models with biased convergence targets will naturally result in a global model with biased convergence targets as well. Therefore, the divergence of the global model obtained from non-IID datasets, as opposed to IID datasets, continues to grow, which may lead to slower convergence and poorer learning performance [10]. Effectively mitigating the adverse effects of data heterogeneity on FL system models remains one of the central challenges in current federated learning optimization. This research focuses mainly on the data heterogeneity problem in horizontal federated learning.
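
For reference, the basic FL training loop that the discussion above assumes can be summarized by a FedAvg-style round: the server broadcasts the global model, selected clients train locally, and the server aggregates the returned models weighted by local dataset size. The following is a minimal NumPy sketch under these assumptions; the function names (local_update, fedavg_round) and the toy linear-regression setup are illustrative, not taken from any cited paper.

```python
import numpy as np

def local_update(global_w, X, y, lr=0.1, epochs=1):
    """One client's local training (toy linear model with MSE loss)."""
    w = global_w.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)  # gradient of the local MSE loss
        w -= lr * grad
    return w

def fedavg_round(global_w, clients):
    """One server round: broadcast, local training, size-weighted averaging."""
    local_ws = [local_update(global_w, X, y) for X, y in clients]
    sizes = np.array([len(y) for _, y in clients], dtype=float)
    return np.average(local_ws, axis=0, weights=sizes)

# toy usage: two clients with different local datasets (non-IID in general)
rng = np.random.default_rng(0)
clients = [(rng.normal(size=(50, 3)), rng.normal(size=50)) for _ in range(2)]
w = np.zeros(3)
for _ in range(10):
    w = fedavg_round(w, clients)
```
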
Some researchers consider only a single category of non-IID environment and do not provide stable performance improvements across different categories of non-IID environments [11][12][13][14][15][16]. Furthermore, the authors of [17] restrain local model updates and mitigate the degree of client drift by introducing a variety of proximal terms. Their approach of introducing proximal terms is effective, but it also inherently limits the potential for local model convergence while incurring considerable communication, computation, and storage overhead on the clients, which is intolerable for real-world distributed systems. Moreover, most previous work [11][12][13][14][15][16][17][18][19] assumes that all clients in the FL system can participate in each round of iteration. Such full participation is feasible only when the number of clients is small, whereas in practice the number of clients is typically large. The scenario with a multitude of participants, where all clients participate in each FL round, is not feasible due to differences in communication or computational power, among other factors. In common approaches, the server typically employs a random selection policy to choose participants and uses a model aggregation algorithm to update the weights of the global model. The random selection of clients for participation in the model aggregation process can increase the bias of the global model and exacerbate the negative effects of data heterogeneity. Therefore, it is crucial to design an optimal client selection strategy that is robust for FL.
Numerous studies focus on devising client selection strategies to alleviate the issue of data heterogeneity in FL. Some authors measure the degree of local dataset skew by utilizing the discrepancy between local and global model parameters for the development of client selection strategies [20][21][22]. These methods either rely on a globally shared dataset or cause a huge waste of resources. The training loss generated during local training naturally reflects the degree of skew in the local data distribution and the training progress of different clients. In addition, the calculation and upload of the loss value do not generate new computational or storage burdens. Some studies use biased selection based on clients' local training loss values and achieve good results [23][24][25]. They argue that favoring clients with higher local loss values can accelerate the convergence of FL in heterogeneous environments. Intuitively, in the early stages of FL, clients with high loss values help the global model converge faster. However, choosing clients with high loss values may negatively impact accuracy improvement when the global model is close to convergence. It has also been pointed out that always selecting the prioritized clients tends to lead to suboptimal performance; there is a trade-off between selecting prioritized clients and selecting more diverse clients. Designing an FL client selection mechanism that balances exploitation and exploration is therefore challenging.
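
To make the loss-biased selection idea concrete, the sketch below simply picks the K clients reporting the largest local training losses, in the spirit of the loss-based strategies cited above [23][24][25]; the function name and interface are hypothetical.

```python
import numpy as np

def select_by_loss(local_losses, k):
    """Pick the k clients with the highest reported local training loss.

    Biasing toward high-loss clients can speed up early convergence, but
    pure exploitation may hurt late-stage accuracy, which is why mixing in
    some random (exploration) picks is often considered.
    """
    return np.argsort(np.asarray(local_losses))[-k:][::-1]

# usage: 8 clients report their latest local loss; the server picks 3
losses = [0.9, 0.2, 1.4, 0.7, 0.3, 1.1, 0.5, 0.8]
print(select_by_loss(losses, 3))  # -> [2 5 0]
```
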
Deep reinforcement learning (DRL) excels at making optimal decisions in complex dynamic environments, where an agent repeatedly observes the environment, performs actions to maximize its goals, and receives rewards from the environment. By constructing an agent for the server in FL and designing a suitable reward function, the agent can adaptively select clients with high or low loss values to participate in the global model aggregation process, thus alleviating the difficulty of formulating client selection strategies in dynamic environments.
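
As a heavily simplified illustration of this idea, the sketch below maintains one learnable preference per client and updates a softmax selection policy with a REINFORCE-style rule, where the reward would in practice be derived from the change in global model performance. Real DRL selectors use much richer states (e.g., model weights or loss values) and full DRL algorithms; everything here, including the class name and the toy reward, is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

class SelectionAgent:
    """Toy policy-gradient agent with one preference score per client."""

    def __init__(self, n_clients, lr=0.5):
        self.theta = np.zeros(n_clients)  # learnable client preferences
        self.lr = lr

    def probs(self):
        e = np.exp(self.theta - self.theta.max())  # numerically stable softmax
        return e / e.sum()

    def select(self, k):
        # sample k distinct clients according to the current policy
        return rng.choice(len(self.theta), size=k, replace=False, p=self.probs())

    def update(self, chosen, reward):
        # approximate score-function (REINFORCE) update for the chosen set
        grad = -self.probs()
        grad[chosen] += 1.0 / len(chosen)
        self.theta += self.lr * reward * grad

# toy usage: pretend clients 2 and 5 always yield higher accuracy gains
agent = SelectionAgent(n_clients=8)
for _ in range(200):
    chosen = agent.select(k=3)
    reward = sum(0.5 for c in chosen if c in (2, 5))
    agent.update(chosen, reward)
print(np.round(agent.probs(), 2))  # probability mass shifts toward 2 and 5
```
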

2. Data-Based Approaches

Several studies have attempted to alleviate the non-IID issue among clients. Zhao et al. [26] improved training on non-IID data by constructing a small, globally shared, uniformly distributed data subset for all clients. Similarly, Seo et al. [27] mitigated the quality degradation problem in FL via data sharing, using an auction approach to effectively reduce the cost while satisfying system requirements for maximizing model quality and resource efficiency. In [28], the authors assume that a small segment of clients are willing to share their datasets, and the server collects data from these clients in a centralized manner to aid in updating the global model. Although such data-sharing-based methods obtain significant performance improvements, they go against the original intention of FL and pose a threat to privacy. Moreover, in the absence of the clients' original data, the server cannot obtain the global data distribution information needed to build a globally shared IID dataset.
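
For intuition, the sketch below shows the shared-subset idea in the spirit of [26]: a small, class-balanced pool (here drawn from data that some clients are assumed willing to share, as in [28]) is distributed to clients and mixed into their local training data. The function names and the mixing fraction alpha are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

def build_shared_subset(X, y, per_class):
    """Draw a small, class-balanced (roughly IID) subset from a shared pool."""
    idx = np.concatenate([
        rng.choice(np.flatnonzero(y == c), per_class, replace=False)
        for c in np.unique(y)
    ])
    return X[idx], y[idx]

def augment_client(Xc, yc, X_shared, y_shared, alpha=0.5):
    """Mix a fraction alpha of the shared subset into a client's local data."""
    n = max(1, int(alpha * len(y_shared)))
    take = rng.choice(len(y_shared), n, replace=False)
    return np.vstack([Xc, X_shared[take]]), np.concatenate([yc, y_shared[take]])

# usage: build a 5-per-class subset from a shared pool with 3 classes
X_pool = rng.normal(size=(300, 4))
y_pool = rng.integers(0, 3, size=300)
X_s, y_s = build_shared_subset(X_pool, y_pool, per_class=5)
```
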

3. Algorithm-Based Approaches

Another research direction focuses on addressing the negative impact of heterogeneous data by designing algorithms that enhance the local training phase or improve the global aggregation process. In [11], the authors introduce a new algorithm called SCAFFOLD. The algorithm uses control variates to correct local updates, preventing "client drift", and leverages the similarity in client data to accelerate the convergence of FL. Li et al. [12] balance the optimization differences between the global and local objectives using a regularization term. In addition, the authors of [13] introduced a normalized averaging algorithm called FedNova. This algorithm normalizes local updates by the number of local training iterations per client, ensuring rapid error convergence while maintaining objective consistency. The authors of [14] propose the FedRS method, which constrains the updates of missing-category weights during local training via the classification layer of a neural network. MOON [15] is proposed as model-contrastive federated learning. It introduces a contrastive loss for the clients, utilizing the representations of the global model and historical local models for learning, to correct the local model updates of each client. Similarly, the authors of [16] proposed FedProc, a prototypical contrastive federated learning approach. The authors design a global prototypical contrastive loss for local network training and use prototypes as global knowledge to correct the local training of each client. The authors of [18] present a contribution-dependent weighting design, named FedAdp. It calculates the association between each client's local objective and the global objective of the overall FL system based on gradient information during training, assigning different weights to each participating client. Zhang et al. [19] address the challenge of direct model aggregation by transferring knowledge from the local models to the global model through data-free distillation. Long et al. [29] propose FedCD, which removes the classifier bias caused by non-IID data by introducing hierarchical prototype contrastive learning, global information distillation, and other methods to understand the class distributions of clients.
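
Among these algorithmic fixes, the proximal/regularization idea (as in the regularization term of [12] and the proximal terms discussed in [17]) is perhaps the simplest to state: local training minimizes the local loss plus a penalty on the distance between the local and global weights. A minimal sketch, assuming a toy linear model and MSE loss, with an illustrative function name:

```python
import numpy as np

def local_update_prox(global_w, X, y, mu=0.1, lr=0.1, steps=20):
    """Local SGD with a proximal penalty (mu / 2) * ||w - w_global||^2.

    The extra gradient term mu * (w - global_w) pulls the local model back
    toward the global model, limiting client drift on non-IID data at the
    cost of slowing purely local convergence.
    """
    w = global_w.copy()
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)  # gradient of the local MSE loss
        grad += mu * (w - global_w)        # gradient of the proximal penalty
        w -= lr * grad
    return w
```

A larger mu keeps local models closer to the global model (less drift, slower local progress), while mu = 0 recovers plain local SGD.
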

4. System-Based Approaches

In addition, several studies have attempted to design client selection policies for the server. In [20], the authors determine the degree of non-IID data among clients by analyzing differences in local model weights. They assign a higher selection probability to clients with lower degrees of non-IID data, ensuring their more frequent participation in FL training. However, the assumption of an accessible IID public dataset is challenging to meet in the real world. Wu et al. [21] use the inner product of the local model gradient and the global model gradient as a measure to determine the subset of clients participating in model aggregation, ensuring that clients that contribute more to reducing the global loss have a higher probability of being selected. Some studies design client selection strategies by considering local training loss values. Goetz et al. [25] evaluate the contribution of each client's data in every FL round according to the local loss value, calculate a corresponding evaluation score, and select an optimized subset of clients according to the evaluation values. Cho et al. [23] theoretically demonstrate that favoring clients with larger local loss values can improve the convergence rate compared with random client selection. Other studies employ reinforcement learning to select clients for the server. Chen et al. [30] use a UCB approach to heuristically select participating clients during each round of optimization, utilizing the cosine distance weights (CDW) between the historical global model and the current local model to measure each client's contribution and assign rewards. Moreover, the authors of [22] propose an experience-driven control framework that uses a deep reinforcement learning algorithm to intelligently select clients in each FL round, reducing the dimensionality of clients' local model weights and using them as states to enhance the performance of the global model. Xiao et al. [31] propose a client selection strategy based on clustering and bi-level sampling: first, a subset of candidate clients is constructed using MD sampling, and then a WPCS mechanism is proposed to collect the weighted per-label mean class scores of the clients, perform clustering, and select the final clients.
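
To illustrate the bandit-style selection in [30], the sketch below scores clients with a standard upper-confidence-bound rule, where the running mean reward would in practice be a contribution measure such as the CDW value; the interface and names are our own assumptions.

```python
import numpy as np

def ucb_select(mean_reward, counts, t, k, c=1.0):
    """Select k clients by upper confidence bound.

    mean_reward: running average contribution per client (e.g., CDW reward)
    counts:      how many times each client has been selected so far
    t:           current round index (1-based)
    Never-selected clients get an infinite bonus, so each is tried once.
    """
    bonus = np.where(counts > 0,
                     c * np.sqrt(np.log(t) / np.maximum(counts, 1)),
                     np.inf)
    return np.argsort(mean_reward + bonus)[-k:]

# usage: 6 clients at round 10; client 3 has never been selected
mean_r = np.array([0.3, 0.8, 0.5, 0.1, 0.6, 0.4])
counts = np.array([3, 2, 4, 0, 1, 2])
print(ucb_select(mean_r, counts, t=10, k=2))  # includes client 3
```
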