<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.1 20151215//EN" "http://jats.nlm.nih.gov/publishing/1.1/JATS-journalpublishing1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xml:lang="en" article-type="research-article" dtd-version="1.1">
<front>
<journal-meta>
<journal-id journal-id-type="pmc">CMC</journal-id>
<journal-id journal-id-type="nlm-ta">CMC</journal-id>
<journal-id journal-id-type="publisher-id">CMC</journal-id>
<journal-title-group>
<journal-title>Computers, Materials &#x0026; Continua</journal-title>
</journal-title-group>
<issn pub-type="epub">1546-2226</issn>
<issn pub-type="ppub">1546-2218</issn>
<publisher>
<publisher-name>Tech Science Press</publisher-name>
<publisher-loc>USA</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">65978</article-id>
<article-id pub-id-type="doi">10.32604/cmc.2025.065978</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Differential Privacy Integrated Federated Learning for Power Systems: An Explainability-Driven Approach</article-title>
<alt-title alt-title-type="left-running-head">Differential Privacy Integrated Federated Learning for Power Systems: An Explainability-Driven Approach</alt-title>
<alt-title alt-title-type="right-running-head">Differential Privacy Integrated Federated Learning for Power Systems: An Explainability-Driven Approach</alt-title>
</title-group>
<contrib-group>
<contrib id="author-1" contrib-type="author">
<name name-style="western"><surname>Liu</surname><given-names>Zekun</given-names></name><xref ref-type="aff" rid="aff-1">1</xref></contrib>
<contrib id="author-2" contrib-type="author" corresp="yes">
<name name-style="western"><surname>Ma</surname><given-names>Junwei</given-names></name><xref ref-type="aff" rid="aff-1">1</xref><xref ref-type="aff" rid="aff-2">2</xref><email>majw_sgit@163.com</email></contrib>
<contrib id="author-3" contrib-type="author">
<name name-style="western"><surname>Gong</surname><given-names>Xin</given-names></name><xref ref-type="aff" rid="aff-1">1</xref></contrib>
<contrib id="author-4" contrib-type="author">
<name name-style="western"><surname>Liu</surname><given-names>Xiu</given-names></name><xref ref-type="aff" rid="aff-1">1</xref></contrib>
<contrib id="author-5" contrib-type="author">
<name name-style="western"><surname>Liu</surname><given-names>Bingbing</given-names></name><xref ref-type="aff" rid="aff-1">1</xref></contrib>
<contrib id="author-6" contrib-type="author">
<name name-style="western"><surname>An</surname><given-names>Long</given-names></name><xref ref-type="aff" rid="aff-1">1</xref></contrib>
<aff id="aff-1"><label>1</label><institution>State Grid Information &#x0026; Telecommunication Co of SEPC</institution>, <addr-line>Taiyuan, 030021</addr-line>, <country>China</country></aff>
<aff id="aff-2"><label>2</label><institution>Department of Computer Science, Technische Universit&#x00E4;t Dortmund</institution>, <addr-line>Dortmund, 44227</addr-line>, <country>Germany</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>&#x002A;</label>Corresponding Author: Junwei Ma. Email: <email>majw_sgit@163.com</email></corresp>
</author-notes>
<pub-date date-type="collection" publication-format="electronic">
<year>2025</year>
</pub-date>
<pub-date date-type="pub" publication-format="electronic">
<day>29</day><month>08</month><year>2025</year>
</pub-date>
<volume>85</volume>
<issue>1</issue>
<fpage>983</fpage>
<lpage>999</lpage>
<history>
<date date-type="received">
<day>26</day>
<month>3</month>
<year>2025</year>
</date>
<date date-type="accepted">
<day>24</day>
<month>6</month>
<year>2025</year>
</date>
</history>
<permissions>
<copyright-statement>&#x00A9; 2025 The Authors.</copyright-statement>
<copyright-year>2025</copyright-year>
<copyright-holder>Published by Tech Science Press.</copyright-holder>
<license xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>This work is licensed under a <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</ext-link>, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<self-uri content-type="pdf" xlink:href="TSP_CMC_65978.pdf"></self-uri>
<abstract>
<p>With the ongoing digitalization and intelligence of power systems, there is an increasing reliance on large-scale data-driven intelligent technologies for tasks such as scheduling optimization and load forecasting. Nevertheless, power data often contains sensitive information, making it a critical industry challenge to efficiently utilize this data while ensuring privacy. Traditional Federated Learning (FL) methods can mitigate data leakage by training models locally instead of transmitting raw data. Despite this, FL still has privacy concerns, especially gradient leakage, which might expose users&#x2019; sensitive information. Therefore, integrating Differential Privacy (DP) techniques is essential for stronger privacy protection. Even so, the noise from DP may reduce the performance of federated learning models. To address this challenge, this paper presents an explainability-driven power data privacy federated learning framework. It incorporates DP technology and, based on model explainability, adaptively adjusts privacy budget allocation and model aggregation, thus balancing privacy protection and model performance. The key innovations of this paper are as follows: (1) We propose an explainability-driven power data privacy federated learning framework. (2) We detail a privacy budget allocation strategy: assigning budgets per training round by gradient effectiveness and at model granularity by layer importance. (3) We design a weighted aggregation strategy that considers the SHAP value and model accuracy for quality knowledge sharing. (4) Experiments show the proposed framework outperforms traditional methods in balancing privacy protection and model performance in power load forecasting tasks.</p>
</abstract>
<kwd-group kwd-group-type="author">
<kwd>Power data</kwd>
<kwd>federated learning</kwd>
<kwd>differential privacy</kwd>
<kwd>explainability</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec id="s1">
<label>1</label>
<title>Introduction</title>
<p>The rapid development of digital technologies has driven a profound transformation in the power industry [<xref ref-type="bibr" rid="ref-1">1</xref>,<xref ref-type="bibr" rid="ref-2">2</xref>]. As a key component of smart grids, the power system is increasingly reliant on data-driven intelligent technologies for tasks such as scheduling optimization, load forecasting, and demand-side management [<xref ref-type="bibr" rid="ref-3">3</xref>&#x2013;<xref ref-type="bibr" rid="ref-5">5</xref>]. Large-scale data processing and analysis provide robust support for the optimization and automation of power systems, making their operation more efficient, flexible, and intelligent [<xref ref-type="bibr" rid="ref-6">6</xref>]. However, with the widespread use of power data, privacy issues have become increasingly prominent. Power data contains a significant amount of sensitive user information, such as electricity consumption patterns, geographic locations, living habits, and consumption preferences. If misused or leaked, this data could pose serious threats to user privacy. Therefore, as the power system becomes more intelligent, ensuring the protection of user privacy has become a critical issue that must be addressed during the data-driven transformation of the power industry [<xref ref-type="bibr" rid="ref-7">7</xref>].</p>
<p>Traditional privacy protection methods, such as data encryption and anonymization, often suffer from high computational overhead and low processing efficiency, making them inadequate for the real-time processing demands of large-scale power data. Federated Learning (FL), an emerging distributed machine learning method, offers an innovative solution to privacy concerns [<xref ref-type="bibr" rid="ref-8">8</xref>]. By training models locally on devices and only sharing model parameters or gradients, FL avoids the central storage and transmission of raw data, thereby reducing the risk of data leakage. FL not only improves model accuracy but also reduces bandwidth consumption during data transmission, offering significant advantages in multi-party collaborative learning.</p>
<p>However, although FL keeps the central server from accessing user data directly, the server can still mount gradient attacks that reconstruct a user&#x2019;s local training data from the exchanged parameter gradients. Such an attack first generates random dummy training data and computes the corresponding dummy gradients, then iteratively updates the dummy data by gradient descent to narrow the gap between the dummy gradients and the real gradients, until the dummy data approximates the user&#x2019;s private data. Therefore, relying solely on FL does not fully ensure privacy protection, especially in high-sensitivity scenarios like power data, where further enhancement of privacy protection is crucial [<xref ref-type="bibr" rid="ref-9">9</xref>].</p>
<p>To address this challenge, Differential Privacy (DP) has emerged as an effective privacy-preserving technique [<xref ref-type="bibr" rid="ref-10">10</xref>]. DP adds noise to the output results, ensuring that even if an attacker gains access to the model&#x2019;s output, they cannot infer individual sensitive information. Integrating DP within the FL framework can effectively protect data privacy and prevent the leakage of sensitive information, and several works have applied this combination in the power field. However, existing DP methods usually rely on static noise addition strategies, which degrade model performance in two ways. First, every training round receives the same privacy budget, so the injected noise has a non-negligible impact on convergence. Second, the traditional strategy allocates the same privacy budget to every layer of a multi-layer neural network, even though some layers contribute more to the expressiveness of the model than others, so a uniform allocation is poorly matched to their differing importance.</p>
<p>In this context, this paper proposes an explainability-driven and privacy-preserving FL framework. Unlike traditional privacy-preserving methods, this framework combines DP with model explainability analysis. Explainability is used in two ways. First, after each client completes local training, it calculates the importance of each layer of the model. We allocate larger privacy budgets to important layers to retain more of the model&#x2019;s knowledge, and smaller privacy budgets to unimportant layers to ensure the privacy of the model. Second, because noise distorts gradients, the central server calculates the SHapley Additive exPlanations (SHAP) value vector for each client&#x2019;s local model and prioritizes the aggregation of local models whose SHAP value vectors are highly similar. During aggregation, we also use indicators such as model accuracy as a reference so that low-quality models do not degrade the aggregated model.</p>
<p>The main innovations of this paper are as follows. First, we propose a dynamic privacy optimization framework that combines explainability analysis and DP to balance privacy protection and model performance. Second, we design a strategy for allocating privacy budgets across training rounds and across the layers of the model. Third, we design an aggregation strategy based on SHAP values and model performance indicators. Finally, through theoretical analysis and experimental verification on a power load forecasting task, we demonstrate that the proposed framework not only enhances privacy protection but also improves model performance, outperforming traditional methods. Compared with static privacy protection methods, the proposed framework better addresses the trade-off between model performance and privacy protection in practical applications of power data.</p>
</sec>
<sec id="s2">
<label>2</label>
<title>Related Work</title>
<p>To tackle the &#x201C;data silos&#x201D; issue in the power sector, researchers have introduced FL, a distributed learning approach that enables local model training without centralized data storage, effectively protecting user privacy. Chen et al. [<xref ref-type="bibr" rid="ref-11">11</xref>] first proposed the FL framework, demonstrating how deep neural networks can be collaboratively trained through efficient communication and local computation. This framework theoretically supports user privacy protection and reduces data transmission costs. Subsequent studies focused on FL model optimization, heterogeneity, and communication efficiency [<xref ref-type="bibr" rid="ref-12">12</xref>&#x2013;<xref ref-type="bibr" rid="ref-14">14</xref>].</p>
<p>FL&#x2019;s application in the power sector has grown, especially in smart grid management and load forecasting [<xref ref-type="bibr" rid="ref-15">15</xref>&#x2013;<xref ref-type="bibr" rid="ref-18">18</xref>]. Smart grid data is often dispersed across devices and sensors, involving large amounts of private information. Thus, ensuring data privacy while achieving effective modeling and forecasting is crucial. Cheng et al. [<xref ref-type="bibr" rid="ref-19">19</xref>] discussed FL&#x2019;s potential in energy systems for addressing privacy and data security concerns, highlighting its ability to enhance system performance without compromising data confidentiality. Zheng et al. [<xref ref-type="bibr" rid="ref-20">20</xref>] explored how privacy-preserving FL could improve power system services like distributed energy resource management, fault detection, and load forecasting. Husnoo et al. [<xref ref-type="bibr" rid="ref-21">21</xref>] proposed the FedDiSC framework for distinguishing disturbances and cyber-attacks in power systems, reducing computational load and enhancing resilience. Wen et al. [<xref ref-type="bibr" rid="ref-22">22</xref>] proposed a novel privacy-preserving federated learning framework for energy theft detection. Their system consists of a data center (DC), a control center (CC), and multiple detection stations, and uses a secure protocol in which the detection stations send encrypted training parameters to the CC and DC, which then calculate the aggregate parameters using homomorphic encryption and return the updated model parameters to the detection stations. Mahmoud et al. [<xref ref-type="bibr" rid="ref-23">23</xref>] repurposed an efficient inner-product functional encryption (IPFE) scheme to implement secure data aggregation, preserving customers&#x2019; privacy by encrypting their models&#x2019; parameters during FL training.</p>
<p>Despite FL&#x2019;s privacy benefits through local data processing, it still faces data leakage risks during model parameter upload and aggregation. Malicious participants can infer sensitive data by analyzing model updates like gradients or weights. Zhu et al. [<xref ref-type="bibr" rid="ref-24">24</xref>] showed that gradient updates in FL could leak participant data, demonstrating how training data could be partially recovered through gradient reverse-engineering. Shokri et al. [<xref ref-type="bibr" rid="ref-25">25</xref>] proposed membership inference attacks that allow attackers to determine if specific data was used in training by analyzing model gradient changes.</p>
<p>To enhance privacy protection, DP has become essential in FL [<xref ref-type="bibr" rid="ref-26">26</xref>&#x2013;<xref ref-type="bibr" rid="ref-28">28</xref>]. Truex et al. [<xref ref-type="bibr" rid="ref-26">26</xref>] introduced the &#x201C;LDP-Fed&#x201D; framework, combining Local Differential Privacy (LDP) with FL to ensure data privacy without centralized sensitive data storage, balancing privacy protection and computational efficiency. DP adds noise to model updates to prevent individual data inference from model outputs. El et al. [<xref ref-type="bibr" rid="ref-29">29</xref>] discussed DP&#x2019;s application in deep learning and FL, focusing on privacy protection technologies. Wei et al. [<xref ref-type="bibr" rid="ref-30">30</xref>] proposed a DP-based FL algorithm, analyzing the trade-off between privacy protection and convergence performance.</p>
</sec>
<sec id="s3">
<label>3</label>
<title>Method</title>
<p>To tackle the trade-off between model performance and privacy protection in Federated Learning (FL), and the trust issues stemming from the model&#x2019;s black-box nature, we propose an explainable privacy-preserving FL framework, as shown in <xref ref-type="fig" rid="fig-1">Fig. 1</xref>. This framework dynamically adjusts privacy protection strength based on layer importance and uses SHAP values and model performance for aggregation.</p>
<fig id="fig-1">
<label>Figure 1</label>
<caption>
<title>Explainability-driven and privacy-preserving FL framework for power data</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_65978-fig-1.tif"/>
</fig>
<p>The explainability-driven and privacy-preserving FL framework is based on a machine learning paradigm that ensures privacy protection. It enables multiple clients (such as institutions and mobile devices) to collaboratively train a global model while keeping data localized, thereby safeguarding data privacy.</p>
<p>The key principle of FL is as follows:</p>
<p>Clients download the global model parameters <inline-formula id="ieqn-1"><mml:math id="mml-ieqn-1"><mml:msub><mml:mi>w</mml:mi><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>, train on their local data starting from <inline-formula id="ieqn-2"><mml:math id="mml-ieqn-2"><mml:msub><mml:mi>w</mml:mi><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>, and update local parameters via Stochastic Gradient Descent (SGD):
<disp-formula id="eqn-1"><label>(1)</label><mml:math id="mml-eqn-1" display="block"><mml:msubsup><mml:mi>w</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:msub><mml:mi>w</mml:mi><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>&#x2212;</mml:mo><mml:mi>&#x03B7;</mml:mi><mml:mo>&#x22C5;</mml:mo><mml:mi mathvariant="normal">&#x2207;</mml:mi><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>w</mml:mi><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula>where <inline-formula id="ieqn-3"><mml:math id="mml-ieqn-3"><mml:mi>&#x03B7;</mml:mi></mml:math></inline-formula> denotes the learning rate, and <inline-formula id="ieqn-4"><mml:math id="mml-ieqn-4"><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>w</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub></mml:mfrac><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:munderover><mml:mi>&#x2113;</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msubsup><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:msubsup><mml:mi>y</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msubsup><mml:mo>;</mml:mo><mml:mi>w</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> represents the local loss function at client <inline-formula id="ieqn-5"><mml:math id="mml-ieqn-5"><mml:mi>k</mml:mi></mml:math></inline-formula>, with <inline-formula 
id="ieqn-6"><mml:math id="mml-ieqn-6"><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> being the local data size.</p>
<p>The server collects the parameter updates from each client <inline-formula id="ieqn-7"><mml:math id="mml-ieqn-7"><mml:mo fence="false" stretchy="false">{</mml:mo><mml:msubsup><mml:mi>w</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msubsup><mml:mo fence="false" stretchy="false">}</mml:mo></mml:math></inline-formula>, and computes a weighted average based on the data size to obtain the new global model:
<disp-formula id="eqn-2"><label>(2)</label><mml:math id="mml-eqn-2" display="block"><mml:msub><mml:mi>w</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>K</mml:mi></mml:mrow></mml:munderover><mml:mfrac><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mi>n</mml:mi></mml:mfrac><mml:mo>&#x22C5;</mml:mo><mml:msubsup><mml:mi>w</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msubsup></mml:math></disp-formula>where <inline-formula id="ieqn-8"><mml:math id="mml-ieqn-8"><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>K</mml:mi></mml:mrow></mml:munderover><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> represents the total data size. This process is referred to as Federated Averaging (FedAvg).</p>
<p>FL aims to minimize the weighted sum of the losses from all clients:
<disp-formula id="eqn-3"><label>(3)</label><mml:math id="mml-eqn-3" display="block"><mml:munder><mml:mo movablelimits="true" form="prefix">min</mml:mo><mml:mrow><mml:mi>w</mml:mi></mml:mrow></mml:munder><mml:mi>L</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>w</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>K</mml:mi></mml:mrow></mml:munderover><mml:mfrac><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mi>n</mml:mi></mml:mfrac><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>w</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula></p>
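<p>As an illustration only, Eqs. (1)&#x2013;(3) can be sketched in a few lines of Python with NumPy. The linear model, the mean-squared-error local loss, the synthetic data, and all function names below are assumptions made for this sketch and are not part of the proposed framework; each client takes a single full-batch gradient step per round, matching the form of Eq. (1).</p>
<code language="python" xml:space="preserve"><![CDATA[
```python
import numpy as np

def local_loss(w, X, y):
    """Mean squared error of a linear model, standing in for L_k(w)."""
    residual = X @ w - y
    return float(np.mean(residual ** 2))

def local_update(w_global, X, y, lr):
    """One gradient step from the global parameters, as in Eq. (1)."""
    grad = 2.0 * X.T @ (X @ w_global - y) / len(y)
    return w_global - lr * grad

def fedavg(client_weights, client_sizes):
    """Data-size-weighted average of client parameters, as in Eq. (2)."""
    sizes = np.asarray(client_sizes, dtype=float)
    stacked = np.stack(client_weights)
    return (sizes[:, None] * stacked).sum(axis=0) / sizes.sum()

def global_loss(w, clients):
    """Weighted sum of the local losses, as in Eq. (3)."""
    n = sum(len(y) for _, y in clients)
    return sum(len(y) / n * local_loss(w, X, y) for X, y in clients)

def run_demo(rounds=50, lr=0.1, seed=0):
    """Train on three synthetic clients with unequal data sizes."""
    rng = np.random.default_rng(seed)
    w_true = np.array([1.5, -2.0, 0.5])  # hypothetical ground truth
    clients = []
    for n_k in (40, 80, 120):
        X = rng.normal(size=(n_k, 3))
        y = X @ w_true + 0.01 * rng.normal(size=n_k)
        clients.append((X, y))
    w = np.zeros(3)
    initial = global_loss(w, clients)
    for _ in range(rounds):
        updates = [local_update(w, X, y, lr) for X, y in clients]
        w = fedavg(updates, [len(y) for _, y in clients])
    return initial, global_loss(w, clients), w

initial_loss, final_loss, w_final = run_demo()
```
]]></code>
<p>In this toy setting the weighted global loss decreases across rounds, and clients with more samples contribute proportionally more to the aggregated parameters.</p>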
<p>The framework consists of participating clients and a central server. Each client performs local training, calculates the privacy budget allocation, adds noise to the gradient, and uploads the noisy gradient to the central server. The central server aggregates the global model with reference to the explainability and performance of the local models.</p>
<p>The training process of the framework includes the following steps:
<list list-type="order">
<list-item>
<p>Each participant performs local training based on the global model.</p></list-item>
<list-item>
<p>Each participant calculates the privacy budget for this round of local training.</p></list-item>
<list-item>
<p>Each participant calculates the privacy budget of each layer of the model based on the layer importance.</p></list-item>
<list-item>
<p>Each participant uploads the noisy gradient to the central server.</p></list-item>
<list-item>
<p>The central server performs aggregation with reference to the model explainability and model performance to generate a global model.</p></list-item>
<list-item>
<p>The central server distributes the global model to all participants.</p></list-item>
</list></p>
<p>First, we introduce the training process of neural networks to guide privacy budget allocation. Neural networks are usually composed of the following components:</p>
<p>Input Layer: Receives raw data (e.g., image pixels, text vectors).</p>
<p>Hidden Layer Computation: Each neuron performs a linear transformation on the input (weight matrix multiplication &#x002B; bias term), followed by a nonlinear activation function (e.g., ReLU, Sigmoid).</p>
<p>Output Layer: Generates the final prediction (e.g., classification probability, regression value).</p>
<p>In a neural network, forward propagation refers to the process of passing data from the input layer to the output layer to compute the model&#x2019;s predictions. Through multiple layers of nonlinear transformations, the original input is mapped to the target output space.</p>
<p>Backward propagation refers to the process of computing gradients layer by layer from the output layer to the input layer, based on prediction errors, to update network parameters. The gradients guide the adjustment of network parameters in the direction of minimizing prediction errors.</p>
<p>The backward propagation process includes the following steps:</p>
<p>Loss Calculation: The error between predicted values and true values is quantified using a loss function (e.g., cross-entropy, mean squared error).
<disp-formula id="eqn-4"><label>(4)</label><mml:math id="mml-eqn-4" display="block"><mml:mrow><mml:mi>&#x02112;</mml:mi></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mi>N</mml:mi></mml:mfrac><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:munderover><mml:mrow><mml:mtext>Loss</mml:mtext></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mover><mml:mi>y</mml:mi><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula></p>
<p>Gradient Calculation for Backpropagation:</p>
<p>Firstly, compute the derivative of the loss with respect to the input of the output layer:
<disp-formula id="eqn-5"><label>(5)</label><mml:math id="mml-eqn-5" display="block"><mml:msup><mml:mi>&#x03B4;</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>L</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi mathvariant="normal">&#x2202;</mml:mi><mml:mrow><mml:mi>&#x02112;</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi mathvariant="normal">&#x2202;</mml:mi><mml:msup><mml:mi>z</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>L</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup></mml:mrow></mml:mfrac><mml:mo>=</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mover><mml:mi>y</mml:mi><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mi>y</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2299;</mml:mo><mml:msup><mml:mi>&#x03C3;</mml:mi><mml:mrow><mml:mi mathvariant="normal">&#x2032;</mml:mi></mml:mrow></mml:msup><mml:mrow><mml:mo>(</mml:mo><mml:msup><mml:mi>z</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>L</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula></p>
<p>Then, using the chain rule, the gradient for each layer is computed (<inline-formula id="ieqn-9"><mml:math id="mml-ieqn-9"><mml:mi>l</mml:mi><mml:mo>=</mml:mo><mml:mi>L</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo></mml:math></inline-formula> &#x2026; , 1):
<disp-formula id="eqn-6"><label>(6)</label><mml:math id="mml-eqn-6" display="block"><mml:msup><mml:mi>&#x03B4;</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mo>(</mml:mo><mml:msup><mml:mi>W</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:mrow><mml:mi mathvariant="normal">&#x22A4;</mml:mi></mml:mrow></mml:mrow></mml:msup><mml:msup><mml:mi>&#x03B4;</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup><mml:mo>&#x2299;</mml:mo><mml:msup><mml:mi>&#x03C3;</mml:mi><mml:mrow><mml:mi mathvariant="normal">&#x2032;</mml:mi></mml:mrow></mml:msup><mml:mrow><mml:mo>(</mml:mo><mml:msup><mml:mi>z</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula></p>
<p>Next, the gradients of the weights and biases are computed based on <inline-formula id="ieqn-10"><mml:math id="mml-ieqn-10"><mml:msup><mml:mi>&#x03B4;</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup></mml:math></inline-formula>:
<disp-formula id="eqn-7"><label>(7)</label><mml:math id="mml-eqn-7" display="block"><mml:mfrac><mml:mrow><mml:mi mathvariant="normal">&#x2202;</mml:mi><mml:mrow><mml:mi>&#x02112;</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi mathvariant="normal">&#x2202;</mml:mi><mml:msup><mml:mi>W</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup></mml:mrow></mml:mfrac><mml:mo>=</mml:mo><mml:msup><mml:mi>&#x03B4;</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup><mml:msup><mml:mrow><mml:mo>(</mml:mo><mml:msup><mml:mi>a</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:mrow><mml:mi mathvariant="normal">&#x22A4;</mml:mi></mml:mrow></mml:mrow></mml:msup><mml:mo>,</mml:mo><mml:mfrac><mml:mrow><mml:mi mathvariant="normal">&#x2202;</mml:mi><mml:mrow><mml:mi>&#x02112;</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi mathvariant="normal">&#x2202;</mml:mi><mml:msup><mml:mi>b</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup></mml:mrow></mml:mfrac><mml:mo>=</mml:mo><mml:msup><mml:mi>&#x03B4;</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup></mml:math></disp-formula></p>
<p>The parameters are updated using optimization algorithms such as gradient descent, where <inline-formula id="ieqn-11"><mml:math id="mml-ieqn-11"><mml:mi>&#x03B7;</mml:mi></mml:math></inline-formula> denotes the learning rate:
<disp-formula id="eqn-8"><label>(8)</label><mml:math id="mml-eqn-8" display="block"><mml:msup><mml:mi>W</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup><mml:mo stretchy="false">&#x2190;</mml:mo><mml:msup><mml:mi>W</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup><mml:mo>&#x2212;</mml:mo><mml:mi>&#x03B7;</mml:mi><mml:mfrac><mml:mrow><mml:mi mathvariant="normal">&#x2202;</mml:mi><mml:mrow><mml:mi>&#x02112;</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi mathvariant="normal">&#x2202;</mml:mi><mml:msup><mml:mi>W</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup></mml:mrow></mml:mfrac></mml:math></disp-formula></p>
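<p>The backward pass in Eqs. (5)&#x2013;(8) can be sketched as follows for a single sample. This is a minimal illustration under stated assumptions: a sigmoid activation in every layer, a half-squared-error loss (chosen so that Eq. (5) holds exactly), and hypothetical function names. The finite-difference helper is included only as a way to check the analytic gradients.</p>
<code language="python" xml:space="preserve"><![CDATA[
```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def forward(weights, biases, x):
    """Return pre-activations z^(l) and activations a^(l), with a^(0) = x."""
    activations, pre_activations = [x], []
    for W, b in zip(weights, biases):
        z = W @ activations[-1] + b
        pre_activations.append(z)
        activations.append(sigmoid(z))
    return pre_activations, activations

def loss(weights, biases, x, y):
    """Half squared error between the prediction and the target."""
    _, activations = forward(weights, biases, x)
    return 0.5 * float(np.sum((activations[-1] - y) ** 2))

def backward(weights, biases, x, y):
    """Gradients w.r.t. every weight matrix and bias vector, Eqs. (5)-(7)."""
    zs, acts = forward(weights, biases, x)
    grads_W = [None] * len(weights)
    grads_b = [None] * len(weights)
    # Eq. (5): error signal at the output layer.
    delta = (acts[-1] - y) * sigmoid_prime(zs[-1])
    for l in range(len(weights) - 1, -1, -1):
        # Eq. (7): weights[l] is W^(l+1) in the paper's 1-based notation.
        grads_W[l] = np.outer(delta, acts[l])
        grads_b[l] = delta
        if l > 0:
            # Eq. (6): propagate the error signal one layer back.
            delta = (weights[l].T @ delta) * sigmoid_prime(zs[l - 1])
    return grads_W, grads_b

def sgd_step(weights, biases, grads_W, grads_b, lr):
    """Eq. (8), applied to the weights and analogously to the biases."""
    new_W = [W - lr * g for W, g in zip(weights, grads_W)]
    new_b = [b - lr * g for b, g in zip(biases, grads_b)]
    return new_W, new_b

def numerical_grad_W(weights, biases, x, y, layer, eps=1e-6):
    """Central finite differences, used only to verify backward()."""
    grad = np.zeros_like(weights[layer])
    for idx in np.ndindex(grad.shape):
        plus = [W.copy() for W in weights]
        minus = [W.copy() for W in weights]
        plus[layer][idx] += eps
        minus[layer][idx] -= eps
        diff = loss(plus, biases, x, y) - loss(minus, biases, x, y)
        grad[idx] = diff / (2 * eps)
    return grad

rng = np.random.default_rng(0)
weights = [rng.normal(size=(4, 3)), rng.normal(size=(2, 4))]
biases = [rng.normal(size=4), rng.normal(size=2)]
x, y = rng.normal(size=3), np.array([0.0, 1.0])
grads_W, grads_b = backward(weights, biases, x, y)
new_weights, new_biases = sgd_step(weights, biases, grads_W, grads_b, lr=0.01)
```
]]></code>
<p>The per-layer gradients returned by the hypothetical <monospace>backward</monospace> function are the quantities that a layer-wise privacy budget would later perturb.</p>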
<p>When FL is trained with DP, the conventional approach allocates the privacy budget in a fixed or uniform manner; that is, the total privacy budget <inline-formula id="ieqn-12"><mml:math id="mml-ieqn-12"><mml:msub><mml:mi>&#x03B5;</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mi>o</mml:mi><mml:mi>t</mml:mi><mml:mi>a</mml:mi><mml:mi>l</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> is divided evenly across the training rounds. This has two drawbacks. In the initial stage of training, the gradients are still unstable and their direction is imprecise, so a small per-round budget can inject excessive noise and make it difficult for the model to learn effective features. As training proceeds, the gradient gradually stabilizes and the model approaches the optimal solution, but a fixed per-round allocation may exhaust the budget during the convergence phase and degrade the final model performance. Therefore, in this paper, we dynamically adjust the privacy budget allocation so that training is more efficient and more accurate while privacy is preserved.</p>
<p>Our method is based on the following idea. When the gradient descent direction is not yet stable, the privacy budget is increased appropriately to reduce the impact of noise and make the gradient update more reliable. Once the gradient direction has stabilized, privacy budget consumption is reduced so that the budget can support more rounds of training.</p>
<p>Taking Laplace noise as an example, when the client performs local training in round <italic>t</italic>, the true gradient <inline-formula id="ieqn-13"><mml:math id="mml-ieqn-13"><mml:msub><mml:mi>g</mml:mi><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> is calculated from the current model parameters <inline-formula id="ieqn-14"><mml:math id="mml-ieqn-14"><mml:msub><mml:mi>&#x03B8;</mml:mi><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>, and noise is added to the gradient according to the currently allocated privacy budget <inline-formula id="ieqn-15"><mml:math id="mml-ieqn-15"><mml:msub><mml:mo>&#x003B5;</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>:
<disp-formula id="eqn-9"><label>(9)</label><mml:math id="mml-eqn-9" display="block"><mml:msub><mml:mrow><mml:mover><mml:mi>g</mml:mi><mml:mo stretchy="false">&#x007E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mi>g</mml:mi><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:mrow><mml:mtext>Lap</mml:mtext></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mfrac><mml:mrow><mml:mrow><mml:mi mathvariant="normal">&#x0394;</mml:mi></mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:msub><mml:mo>&#x003B5;</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mfrac><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula></p>
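<p>The noise addition in Eq. (9) can be sketched as follows. The helper names, the sensitivity value, and the toy gradient are illustrative assumptions; the Laplace sample is drawn with the standard library by taking the difference of two exponential variates, which has a Laplace distribution with the same scale.</p>
<code language="python"><![CDATA[
```python
import random


def laplace_sample(scale, rng):
    # Difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def add_laplace_noise(grad, sensitivity, epsilon, rng):
    """Eq. (9): add Lap(sensitivity / epsilon) noise to every gradient entry."""
    scale = sensitivity / epsilon
    return [g + laplace_sample(scale, rng) for g in grad]


rng = random.Random(0)               # fixed seed so the sketch is repeatable
g_t = [0.2, -0.1, 0.05]              # toy "true" gradient
noisy_small_eps = add_laplace_noise(g_t, sensitivity=1.0, epsilon=0.5, rng=rng)
noisy_large_eps = add_laplace_noise(g_t, sensitivity=1.0, epsilon=5.0, rng=rng)
print(noisy_small_eps)               # scale 2.0: heavily perturbed
print(noisy_large_eps)               # scale 0.2: closer to g_t on average
```
]]></code>
<p>A larger privacy budget gives a smaller noise scale, which is the property the dynamic allocation strategy relies on.</p>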
<p>The Armijo condition is then used to check whether the updated objective function value meets the &#x201C;sufficient decrease&#x201D; requirement [<xref ref-type="bibr" rid="ref-31">31</xref>]. Let the objective function be <inline-formula id="ieqn-16"><mml:math id="mml-ieqn-16"><mml:mi>f</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03B8;</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. The condition is satisfied when:
<disp-formula id="eqn-10"><label>(10)</label><mml:math id="mml-eqn-10" display="block"><mml:mi>f</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>&#x03B8;</mml:mi><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>&#x2212;</mml:mo><mml:mi>&#x03B7;</mml:mi><mml:msub><mml:mrow><mml:mover><mml:mi>g</mml:mi><mml:mo stretchy="false">&#x007E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2264;</mml:mo><mml:mi>f</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>&#x03B8;</mml:mi><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mi>c</mml:mi><mml:mi>&#x03B7;</mml:mi><mml:mo>&#x2225;</mml:mo><mml:msub><mml:mrow><mml:mover><mml:mi>g</mml:mi><mml:mo stretchy="false">&#x007E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:msup><mml:mo>&#x2225;</mml:mo><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mo>,</mml:mo><mml:mi>c</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula>where <inline-formula id="ieqn-17"><mml:math id="mml-ieqn-17"><mml:mi>c</mml:mi></mml:math></inline-formula> is a preset constant and <inline-formula id="ieqn-18"><mml:math id="mml-ieqn-18"><mml:mi>&#x03B7;</mml:mi></mml:math></inline-formula> is the learning rate, indicating that the update direction under the current noise level is sufficient to reduce the objective function value, and the noisy gradient can be sent to the central server.</p>
<p>Otherwise, it is considered that the current noise is too large, causing the update direction to &#x201C;deviate&#x201D; from the ideal descent direction, and the privacy budget needs to be adjusted.</p>
<p>The privacy budget adjustment formula is:
<disp-formula id="eqn-11"><label>(11)</label><mml:math id="mml-eqn-11" display="block"><mml:msubsup><mml:mo>&#x003B5;</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="normal">&#x2032;</mml:mi></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mi>&#x03BB;</mml:mi><mml:msub><mml:mo>&#x003B5;</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03BB;</mml:mi><mml:mo>&#x003E;</mml:mo><mml:mn>1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula>where <inline-formula id="ieqn-19"><mml:math id="mml-ieqn-19"><mml:mi>&#x03BB;</mml:mi></mml:math></inline-formula> is the coefficient for adjusting the privacy budget. Because this coefficient is greater than 1, the adjustment increases the privacy budget and therefore lowers the noise level. The noisy gradient is then recomputed with the new privacy budget <inline-formula id="ieqn-20"><mml:math id="mml-ieqn-20"><mml:msubsup><mml:mo>&#x003B5;</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="normal">&#x2032;</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula>:
<disp-formula id="eqn-12"><label>(12)</label><mml:math id="mml-eqn-12" display="block"><mml:msubsup><mml:mrow><mml:mover><mml:mi>g</mml:mi><mml:mo stretchy="false">&#x007E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="normal">&#x2032;</mml:mi></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:msub><mml:mi>g</mml:mi><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:mrow><mml:mtext>Lap</mml:mtext></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mfrac><mml:mrow><mml:mrow><mml:mi mathvariant="normal">&#x0394;</mml:mi></mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:msubsup><mml:mo>&#x003B5;</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="normal">&#x2032;</mml:mi></mml:mrow></mml:msubsup></mml:mfrac><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula></p>
<p>Check again whether the noisy gradient meets the Armijo condition. If it still does not meet the condition, continue to increase the privacy budget and repeat the above steps until an update direction that meets the descent condition is found or the preset budget limit is reached.</p>
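<p>The retry loop described above can be sketched as follows, under stated assumptions: the objective is a toy quadratic, all function and parameter names (including the budget cap <monospace>eps_max</monospace>) are hypothetical, and the sketch does not model how the budget spent on rejected attempts is accounted for.</p>
<code language="python"><![CDATA[
```python
import random


def laplace_sample(scale, rng):
    # Difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def armijo_ok(f, theta, g_noisy, eta, c):
    """Eq. (10): sufficient decrease along the noisy gradient."""
    stepped = [t - eta * g for t, g in zip(theta, g_noisy)]
    sq_norm = sum(g * g for g in g_noisy)
    return f(stepped) <= f(theta) - c * eta * sq_norm


def adaptive_noisy_gradient(f, grad_f, theta, eps_t, eps_max,
                            sensitivity, eta, c, lam, rng):
    """Retry with eps <- lam * eps (Eq. 11) until Eq. (10) holds or the cap is hit."""
    g = grad_f(theta)
    eps = eps_t
    while True:
        g_noisy = [gi + laplace_sample(sensitivity / eps, rng) for gi in g]
        ok = armijo_ok(f, theta, g_noisy, eta, c)
        if ok or eps * lam > eps_max:
            return g_noisy, eps, ok
        eps *= lam


# Toy objective f(theta) = 0.5 * ||theta||^2, whose gradient is theta itself.
def f(theta):
    return 0.5 * sum(t * t for t in theta)


def grad_f(theta):
    return list(theta)


theta0 = [1.0, -1.0]
g_tilde, eps_used, accepted = adaptive_noisy_gradient(
    f, grad_f, theta0, eps_t=0.1, eps_max=10.0,
    sensitivity=0.2, eta=0.1, c=0.5, lam=2.0, rng=random.Random(0))
print(eps_used, accepted)
```
]]></code>
<p>The returned flag tells the caller whether the loop stopped because the descent condition was met or because the budget cap was reached.</p>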
<p>To make full use of both noisy gradients, the GRADAVG algorithm can be used to fuse them:
<disp-formula id="eqn-13"><label>(13)</label><mml:math id="mml-eqn-13" display="block"><mml:msubsup><mml:mrow><mml:mover><mml:mi>g</mml:mi><mml:mo stretchy="false">&#x007E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>&#x2217;</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mi>&#x03B1;</mml:mi><mml:msub><mml:mrow><mml:mover><mml:mi>g</mml:mi><mml:mo stretchy="false">&#x007E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x2212;</mml:mo><mml:mi>&#x03B1;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:msubsup><mml:mrow><mml:mover><mml:mi>g</mml:mi><mml:mo stretchy="false">&#x007E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="normal">&#x2032;</mml:mi></mml:mrow></mml:msubsup></mml:math></disp-formula>where <inline-formula id="ieqn-21"><mml:math id="mml-ieqn-21"><mml:mi>&#x03B1;</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula> is the fusion weight.</p>
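<p>A minimal sketch of the fusion in Eq. (13) is given below. The function name and the toy vectors are illustrative, and how the fusion weight is chosen is not specified here, so it is simply passed in.</p>
<code language="python"><![CDATA[
```python
def fuse_gradients(g_first, g_second, alpha):
    """Eq. (13): convex combination of the two noisy gradients."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(g_first, g_second)]


g_noisy_first = [1.0, 2.0]    # gradient noised with the original budget (toy values)
g_noisy_second = [3.0, 6.0]   # gradient noised with the increased budget (toy values)
fused = fuse_gradients(g_noisy_first, g_noisy_second, alpha=0.25)
print(fused)
```
]]></code>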
<p>In a deep neural network (DNN), different layers contribute differently to the final model decision. Some layers are critical to the core reasoning of the model, while others have relatively little impact. If all layers apply the same privacy budget allocation strategy, it may lead to unnecessary information loss, thereby reducing the accuracy of the model. Therefore, we introduce the concept of layer importance as a reference for privacy budget allocation. Layer importance is one perspective of model explainability.</p>
<p>For high-importance layers, a larger privacy budget is allocated to reduce the impact of noise and ensure that the information in these key layers is retained to the greatest extent.</p>
<p>For low-importance layers, a smaller privacy budget is allocated and more noise is applied, which weakens their impact on the final model output while strengthening privacy protection.</p>
<p>Based on this inter-layer privacy budget allocation strategy, the learning effect of the model can be improved as much as possible while ensuring privacy.</p>
<p>Next, we use layer importance to allocate privacy budgets between layers.</p>
<p>To calculate layer importance, each client first receives the global model sent by the central server and trains it on local data. The goal of this stage is to let the model adapt to the local data distribution, laying the foundation for the subsequent layer importance calculation.</p>
<p>After local training is complete, the client uses the layer importance evaluation data to evaluate the importance of each layer of the model. Take the Sigmoid activation function as an example: its output lies between 0 and 1. As its input approaches negative infinity, the activated value of the neuron tends to 0, and a neuron whose output is close to 0 contributes little to the output of subsequent layers and to the final result. Layer importance is calculated as follows:
<disp-formula id="eqn-14"><label>(14)</label><mml:math id="mml-eqn-14" display="block"><mml:msub><mml:mi>s</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>m</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>M</mml:mi></mml:mrow></mml:munderover><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:munderover><mml:mi>f</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msubsup><mml:mi>y</mml:mi><mml:mrow><mml:mi>j</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo>(</mml:mo><mml:mi>m</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>N</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>M</mml:mi></mml:mrow></mml:mfrac></mml:math></disp-formula>
<disp-formula id="eqn-15"><label>(15)</label><mml:math id="mml-eqn-15" display="block"><mml:mi>f</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mtable columnalign="left left" rowspacing=".2em" columnspacing="1em" displaystyle="false"><mml:mtr><mml:mtd><mml:mn>1</mml:mn><mml:mo>,</mml:mo></mml:mtd><mml:mtd><mml:mi>i</mml:mi><mml:mi>f</mml:mi><mml:mspace width="thinmathspace" /><mml:mi>x</mml:mi><mml:mo>&#x003E;</mml:mo><mml:mo>&#x2212;</mml:mo><mml:mn>2</mml:mn></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mn>0</mml:mn><mml:mo>,</mml:mo></mml:mtd><mml:mtd><mml:mi>i</mml:mi><mml:mi>f</mml:mi><mml:mspace width="thinmathspace" /><mml:mi>x</mml:mi><mml:mo>&#x2264;</mml:mo><mml:mo>&#x2212;</mml:mo><mml:mn>2</mml:mn></mml:mtd></mml:mtr></mml:mtable><mml:mo fence="true" stretchy="true" symmetric="true"></mml:mo></mml:mrow></mml:math></disp-formula></p>
<p>For the <inline-formula id="ieqn-22"><mml:math id="mml-ieqn-22"><mml:mi>m</mml:mi><mml:mrow><mml:mtext>th</mml:mtext></mml:mrow></mml:math></inline-formula> validation sample, <inline-formula id="ieqn-23"><mml:math id="mml-ieqn-23"><mml:msubsup><mml:mi>y</mml:mi><mml:mrow><mml:mi>j</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo>(</mml:mo><mml:mi>m</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is the input value of the Sigmoid activation function of the <inline-formula id="ieqn-24"><mml:math id="mml-ieqn-24"><mml:mi>j</mml:mi><mml:mrow><mml:mtext>th</mml:mtext></mml:mrow></mml:math></inline-formula> neuron in the <inline-formula id="ieqn-25"><mml:math id="mml-ieqn-25"><mml:mi>i</mml:mi><mml:mrow><mml:mtext>th</mml:mtext></mml:mrow></mml:math></inline-formula> layer of the model. <inline-formula id="ieqn-26"><mml:math id="mml-ieqn-26"><mml:mi>M</mml:mi></mml:math></inline-formula> is the number of validation samples, and <inline-formula id="ieqn-27"><mml:math id="mml-ieqn-27"><mml:mi>N</mml:mi></mml:math></inline-formula> is the number of neurons in the <inline-formula id="ieqn-28"><mml:math id="mml-ieqn-28"><mml:mi>i</mml:mi><mml:mrow><mml:mtext>th</mml:mtext></mml:mrow></mml:math></inline-formula> layer. Empirically, we regard a neuron as successfully activated when the input to its Sigmoid activation function is greater than &#x2212;2, at which point the Sigmoid output is about 0.12. The more frequently the neurons in a layer are activated, the more important we consider that layer to be.</p>
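<p>Eqs. (14) and (15) amount to the fraction of (sample, neuron) pairs in a layer whose Sigmoid input exceeds &#x2212;2. A minimal sketch follows; the function names and the nested-list layout of the inputs are assumptions made for illustration.</p>
<code language="python"><![CDATA[
```python
def activation_indicator(x, threshold=-2.0):
    """Eq. (15): 1 if the Sigmoid input exceeds the threshold, else 0."""
    return 1 if x > threshold else 0


def layer_importance(pre_activations, threshold=-2.0):
    """Eq. (14): pre_activations[m][j] is the Sigmoid input of neuron j
    in this layer for validation sample m."""
    M = len(pre_activations)
    N = len(pre_activations[0])
    active = sum(activation_indicator(y, threshold)
                 for sample in pre_activations for y in sample)
    return active / (N * M)


# Two validation samples, three neurons (toy values).
layer_inputs = [[0.3, -2.5, 1.2],
                [-3.0, -1.9, 0.0]]
s_i = layer_importance(layer_inputs)
print(s_i)  # 4 of the 6 inputs are above -2
```
]]></code>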
<p>After the layer importance has been calculated, we use it to allocate the privacy budget across layers: layers with large importance values receive larger privacy budgets and therefore less noise, whereas layers with small importance values receive smaller privacy budgets and more noise.</p>
<p>The privacy budget of each layer is allocated according to the following formula:
<disp-formula id="eqn-16"><label>(16)</label><mml:math id="mml-eqn-16" display="block"><mml:msubsup><mml:mi>&#x03B5;</mml:mi><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mfrac><mml:msub><mml:mi>s</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>L</mml:mi></mml:mrow></mml:munderover><mml:msub><mml:mi>s</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac><mml:mo>&#x00D7;</mml:mo><mml:msub><mml:mi>&#x03B5;</mml:mi><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:math></disp-formula></p>
<p>In the formula, <inline-formula id="ieqn-29"><mml:math id="mml-ieqn-29"><mml:mi>L</mml:mi></mml:math></inline-formula> represents the total number of layers in the model, <inline-formula id="ieqn-30"><mml:math id="mml-ieqn-30"><mml:mi>i</mml:mi></mml:math></inline-formula> represents the <inline-formula id="ieqn-31"><mml:math id="mml-ieqn-31"><mml:mi>i</mml:mi><mml:mrow><mml:mtext>th</mml:mtext></mml:mrow></mml:math></inline-formula> layer in the model, and <inline-formula id="ieqn-32"><mml:math id="mml-ieqn-32"><mml:msub><mml:mi>&#x03B5;</mml:mi><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> represents the privacy budget of the current round.</p>
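<p>The proportional split in Eq. (16) can be sketched as follows. The function name and the importance scores are illustrative assumptions; the per-layer budgets sum to the budget of the current round.</p>
<code language="python"><![CDATA[
```python
def allocate_layer_budgets(importances, eps_round):
    """Eq. (16): split the round budget in proportion to layer importance."""
    total = sum(importances)
    if total <= 0:
        raise ValueError("at least one layer must have positive importance")
    return [s / total * eps_round for s in importances]


importances = [0.9, 0.6, 0.5]      # toy layer importance scores s_i
budgets = allocate_layer_budgets(importances, eps_round=1.0)
print(budgets)
```
]]></code>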
<p>When DP is combined with FL, gradient clipping is a key operation. Its main purpose is to bound the norm of the gradient uploaded by each client, which limits the sensitivity, ensures that the added noise is sufficient to protect privacy, and prevents abnormal gradients from affecting model training. Gradient clipping is performed as follows:
<disp-formula id="eqn-17"><label>(17)</label><mml:math id="mml-eqn-17" display="block"><mml:msub><mml:mi>g</mml:mi><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">&#x2190;</mml:mo><mml:mfrac><mml:msub><mml:mi>g</mml:mi><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo movablelimits="true" form="prefix">max</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mfrac><mml:msub><mml:mrow><mml:mo>|</mml:mo><mml:mrow><mml:mo>|</mml:mo><mml:msub><mml:mi>g</mml:mi><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo></mml:mrow><mml:mo>|</mml:mo></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mi>C</mml:mi></mml:mfrac><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:math></disp-formula></p>
<p><italic>C</italic> represents the clipping value, and the sensitivity is usually set to:
<disp-formula id="eqn-18"><label>(18)</label><mml:math id="mml-eqn-18" display="block"><mml:mrow><mml:mi mathvariant="normal">&#x0394;</mml:mi></mml:mrow><mml:mi>f</mml:mi><mml:mo>=</mml:mo><mml:mn>2</mml:mn><mml:mo>&#x00D7;</mml:mo><mml:mi>C</mml:mi></mml:math></disp-formula></p>
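<p>A minimal sketch of the clipping rule in Eq. (17) and the sensitivity in Eq. (18) is shown below, with illustrative function names and toy gradients. A gradient whose norm exceeds the clipping value is rescaled to have norm equal to that value, and a gradient within the bound is left unchanged.</p>
<code language="python"><![CDATA[
```python
import math


def clip_gradient(grad, C):
    """Eq. (17): divide by max(1, ||g||_2 / C) so that ||g||_2 <= C."""
    norm = math.sqrt(sum(g * g for g in grad))
    factor = max(1.0, norm / C)
    return [g / factor for g in grad]


def sensitivity(C):
    """Eq. (18): two clipped gradients differ by at most 2C in norm."""
    return 2.0 * C


clipped = clip_gradient([3.0, 4.0], C=1.0)     # norm 5 is scaled down to norm 1
unchanged = clip_gradient([0.3, 0.4], C=1.0)   # norm 0.5 is already within the bound
print(clipped, unchanged, sensitivity(1.0))
```
]]></code>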
<p>In some existing works, the clipping value <inline-formula id="ieqn-33"><mml:math id="mml-ieqn-33"><mml:mi>C</mml:mi></mml:math></inline-formula> is fixed. However, when <inline-formula id="ieqn-34"><mml:math id="mml-ieqn-34"><mml:mi>C</mml:mi></mml:math></inline-formula> is set too large, the gradient is barely clipped, so the sensitivity and the injected noise become large relative to the gradient, which reduces model accuracy; when <inline-formula id="ieqn-35"><mml:math id="mml-ieqn-35"><mml:mi>C</mml:mi></mml:math></inline-formula> is set too small, the gradient is over-truncated, which slows convergence. Both situations lead to reduced model accuracy. In this paper, we take into account the different importance of each layer in the neural network model and set a separate gradient clipping value <inline-formula id="ieqn-36"><mml:math id="mml-ieqn-36"><mml:mi>C</mml:mi></mml:math></inline-formula> for each layer.</p>
<p>We calculated the statistical information about the gradient of each layer obtained by the client during local training, including the mean, variance, and maximum value. After experimental screening, we selected the average gradient as the gradient clipping value.</p>
<p>Assuming that the mean value of the gradient of the <inline-formula id="ieqn-37"><mml:math id="mml-ieqn-37"><mml:mi>i</mml:mi><mml:mrow><mml:mtext>th</mml:mtext></mml:mrow></mml:math></inline-formula> layer obtained by training on the local dataset on the client is <inline-formula id="ieqn-38"><mml:math id="mml-ieqn-38"><mml:mover><mml:mrow><mml:mi mathvariant="normal">&#x2207;</mml:mi><mml:msup><mml:mi>g</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msup></mml:mrow><mml:mo accent="false">&#x00AF;</mml:mo></mml:mover></mml:math></inline-formula>, the gradient clipping value of the <italic>i</italic>th layer is set to:
<disp-formula id="eqn-19"><label>(19)</label><mml:math id="mml-eqn-19" display="block"><mml:msup><mml:mi>C</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:mrow><mml:mo>|</mml:mo><mml:mover><mml:mrow><mml:mi mathvariant="normal">&#x2207;</mml:mi><mml:msup><mml:mi>g</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msup></mml:mrow><mml:mo accent="false">&#x00AF;</mml:mo></mml:mover><mml:mo>|</mml:mo></mml:mrow></mml:math></disp-formula></p>
<p>Adaptive per-layer clipping retains more of the useful gradient values and prevents excessive noise from being added to some gradients, thereby improving the accuracy of the model.</p>
<p>After completing the gradient clipping, the sensitivity of the gradient of the <inline-formula id="ieqn-39"><mml:math id="mml-ieqn-39"><mml:mi>i</mml:mi><mml:mrow><mml:mtext>th</mml:mtext></mml:mrow></mml:math></inline-formula> layer model on the adjacent data set can be calculated by the following formula:
<disp-formula id="eqn-20"><label>(20)</label><mml:math id="mml-eqn-20" display="block"><mml:mi mathvariant="normal">&#x0394;</mml:mi><mml:msub><mml:mi>f</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>2</mml:mn><mml:mo>&#x00D7;</mml:mo><mml:msup><mml:mi>C</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msup><mml:mo>&#x00D7;</mml:mo><mml:mi>&#x03B7;</mml:mi></mml:math></disp-formula></p>
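<p>The per-layer quantities in Eqs. (19) and (20) can be combined with the per-layer budgets of Eq. (16) to obtain a Laplace noise scale for each layer. The sketch below rests on several assumptions: the mean gradient is read as the mean over the entries of one layer's gradient, the layer names and values are invented, and a layer whose mean gradient is exactly zero would need separate handling.</p>
<code language="python"><![CDATA[
```python
def layer_clip_value(layer_grad):
    """Eq. (19): C^i = |mean of the layer's gradient entries| (one reading)."""
    return abs(sum(layer_grad) / len(layer_grad))


def layer_sensitivity(clip_value, eta):
    """Eq. (20): sensitivity of layer i is 2 * C^i * eta."""
    return 2.0 * clip_value * eta


def layer_noise_scales(layer_grads, layer_budgets, eta):
    """Laplace scale per layer: sensitivity of the layer / budget of the layer."""
    scales = {}
    for name, grad in layer_grads.items():
        c_i = layer_clip_value(grad)
        scales[name] = layer_sensitivity(c_i, eta) / layer_budgets[name]
    return scales


layer_grads = {"fc1": [0.2, -0.4, 0.8], "fc2": [0.01, 0.03]}  # toy gradients
layer_budgets = {"fc1": 0.6, "fc2": 0.4}                      # toy per-layer budgets
scales = layer_noise_scales(layer_grads, layer_budgets, eta=0.1)
print(scales)
```
]]></code>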
<p>Because the added noise distorts the gradients, we designed an aggregation strategy for the central server that is based on both model explainability and model performance. Specifically, an evaluation dataset is maintained at the central server. We assume that client data is (approximately) independent and identically distributed (i.i.d.) and that the distribution of the evaluation dataset resembles that of the client datasets. After receiving the gradients from the clients, the central server applies each client&#x2019;s gradient to the global model from the previous aggregation round to obtain that client&#x2019;s local model; these local models form the initial set of models to be aggregated.</p>
<p>The model aggregation process is shown in <xref ref-type="fig" rid="fig-2">Fig. 2</xref>. In each round of aggregation, the two models with the highest SHAP value vector similarity in the models to be aggregated are selected to aggregate and generate a temporary model to be added to the set of models to be aggregated. The above iterative process is repeated until the final model is obtained.</p>
<fig id="fig-2">
<label>Figure 2</label>
<caption>
<title>Model aggregation method based on SHAP vector and accuracy</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_65978-fig-2.tif"/>
</fig>
<p>Based on the evaluation data set, the central server calculates the SHAP value vector and a performance evaluation index (taking accuracy as an example) for the temporary models of all clients.</p>
<p>SHAP is a model explainability method based on the Shapley value from game theory. It reveals the impact of each feature by quantifying the feature&#x2019;s marginal contribution to the model prediction. The core idea is to decompose the model&#x2019;s predicted value into the sum of all feature contributions. The SHAP value is calculated as:
<disp-formula id="eqn-21"><label>(21)</label><mml:math id="mml-eqn-21" display="block"><mml:msub><mml:mi>&#x03D5;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:munder><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>S</mml:mi><mml:mo>&#x2286;</mml:mo><mml:mi>N</mml:mi><mml:mo>&#x2216;</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mi>i</mml:mi><mml:mo>}</mml:mo></mml:mrow></mml:mrow></mml:munder><mml:mfrac><mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:mi>S</mml:mi><mml:mo>|</mml:mo></mml:mrow><mml:mo>!</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>n</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mrow><mml:mo>|</mml:mo><mml:mi>S</mml:mi><mml:mo>|</mml:mo></mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn><mml:mo>)</mml:mo></mml:mrow><mml:mo>!</mml:mo></mml:mrow><mml:mrow><mml:mi>n</mml:mi><mml:mo>!</mml:mo></mml:mrow></mml:mfrac><mml:mrow><mml:mo>[</mml:mo><mml:mi>f</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>S</mml:mi><mml:mo>&#x222A;</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mi>i</mml:mi><mml:mo>}</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mi>f</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>S</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:math></disp-formula></p>
<p>Here, <inline-formula id="ieqn-40"><mml:math id="mml-ieqn-40"><mml:msub><mml:mi>&#x03D5;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> is the SHAP value of feature <inline-formula id="ieqn-41"><mml:math id="mml-ieqn-41"><mml:mi>i</mml:mi></mml:math></inline-formula>, <inline-formula id="ieqn-42"><mml:math id="mml-ieqn-42"><mml:mi>N</mml:mi></mml:math></inline-formula> is the full set of features, <italic>n</italic> is the number of features in that set, and <inline-formula id="ieqn-43"><mml:math id="mml-ieqn-43"><mml:mi>f</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>S</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> represents the model prediction value when only the feature subset <inline-formula id="ieqn-44"><mml:math id="mml-ieqn-44"><mml:mi>S</mml:mi></mml:math></inline-formula> is used.</p>
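<p>For a small number of features, Eq. (21) can be evaluated exactly by enumerating every subset, as in the sketch below. The value functions are toy stand-ins for the model prediction on a feature subset, and the function names are illustrative. The enumeration grows exponentially with the number of features, so it serves only to illustrate the formula.</p>
<code language="python"><![CDATA[
```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n):
    """Eq. (21): exact Shapley values by enumerating every subset S
    that excludes feature i. value_fn maps a frozenset of feature
    indices to the prediction obtained with only those features."""
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                S = frozenset(subset)
                phi[i] += weight * (value_fn(S | {i}) - value_fn(S))
    return phi


# Toy additive model: each feature contributes a fixed amount.
contributions = [1.0, 2.0, -0.5]


def additive_value(S):
    return sum(contributions[j] for j in S)


# Toy interaction: the prediction is 1 only when features 0 and 1 are both present.
def pair_value(S):
    return 1.0 if {0, 1} <= S else 0.0


phi_additive = shapley_values(additive_value, 3)
phi_pair = shapley_values(pair_value, 3)
print(phi_additive, phi_pair)
```
]]></code>
<p>In the additive case each feature recovers its own contribution, and in the interaction case the two cooperating features share the credit equally while the third receives none.</p>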
<p>All temporary models are regarded as a set, and the two models with the highest SHAP vector cosine similarity are selected for aggregation each time. In addition to model explainability, model performance is also taken into account during aggregation.</p>
<p>Suppose that the accuracy of the temporary model of client <inline-formula id="ieqn-45"><mml:math id="mml-ieqn-45"><mml:mi>k</mml:mi></mml:math></inline-formula> is <inline-formula id="ieqn-46"><mml:math id="mml-ieqn-46"><mml:msub><mml:mi>p</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>, the accuracy of the temporary model of client <inline-formula id="ieqn-47"><mml:math id="mml-ieqn-47"><mml:mi>j</mml:mi></mml:math></inline-formula> is <inline-formula id="ieqn-48"><mml:math id="mml-ieqn-48"><mml:msub><mml:mi>p</mml:mi><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>, the temporary model parameters of client <inline-formula id="ieqn-49"><mml:math id="mml-ieqn-49"><mml:mi>k</mml:mi></mml:math></inline-formula> are <inline-formula id="ieqn-50"><mml:math id="mml-ieqn-50"><mml:msup><mml:mi mathvariant="bold-italic">W</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula>, and the temporary model parameters of client <inline-formula id="ieqn-51"><mml:math id="mml-ieqn-51"><mml:mi>j</mml:mi></mml:math></inline-formula> are <inline-formula id="ieqn-52"><mml:math id="mml-ieqn-52"><mml:msup><mml:mi mathvariant="bold-italic">W</mml:mi><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula>. The temporary models of clients <italic>k</italic> and <italic>j</italic> are then aggregated according to the following formula:
<disp-formula id="eqn-22"><label>(22)</label><mml:math id="mml-eqn-22" display="block"><mml:msup><mml:mi mathvariant="bold-italic">W</mml:mi><mml:mrow><mml:mi mathvariant="normal">&#x2032;</mml:mi></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mo>&#x00D7;</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">W</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msup><mml:mo>+</mml:mo><mml:msub><mml:mi>p</mml:mi><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>&#x00D7;</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">W</mml:mi><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msup></mml:mrow><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>p</mml:mi><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac></mml:math></disp-formula></p>
<p>The aggregated model <inline-formula id="ieqn-53"><mml:math id="mml-ieqn-53"><mml:msup><mml:mi mathvariant="bold-italic">W</mml:mi><mml:mrow><mml:mi mathvariant="normal">&#x2032;</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> is added back to the set of temporary models. This process is repeated until all models have been aggregated into one, which is then distributed as the global model for the next round.</p>
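<p>The pairwise aggregation loop can be sketched as follows. Several points are assumptions rather than details stated above: the two merged models are removed from the set once their aggregate is added, model parameters are flattened into plain lists, and <monospace>toy_evaluate</monospace> is a hypothetical stand-in for computing the SHAP value vector and accuracy on the server&#x2019;s evaluation dataset.</p>
<code language="python"><![CDATA[
```python
import math


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def merge_pair(W_k, p_k, W_j, p_j):
    """Eq. (22): accuracy-weighted average of two parameter vectors."""
    return [(p_k * a + p_j * b) / (p_k + p_j) for a, b in zip(W_k, W_j)]


def aggregate(models, evaluate):
    """Repeatedly merge the two models with the most similar SHAP vectors.
    evaluate(W) returns (shap_vector, accuracy) for parameters W."""
    pool = [(W, *evaluate(W)) for W in models]
    while len(pool) > 1:
        pairs = [(i, j) for i in range(len(pool)) for j in range(i + 1, len(pool))]
        i, j = max(pairs, key=lambda ij: cosine_similarity(pool[ij[0]][1],
                                                           pool[ij[1]][1]))
        W_k, _, p_k = pool[i]
        W_j, _, p_j = pool[j]
        merged = merge_pair(W_k, p_k, W_j, p_j)
        # Assumption: the two parents leave the pool once merged.
        pool = [m for idx, m in enumerate(pool) if idx not in (i, j)]
        pool.append((merged, *evaluate(merged)))
    return pool[0][0]


def toy_evaluate(W):
    # Hypothetical stand-in: the parameters double as the "SHAP vector"
    # and the accuracy is a made-up function of the first parameter.
    return list(W), 0.5 + 0.5 * W[0]


client_models = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
global_W = aggregate(client_models, toy_evaluate)
print(global_W)
```
]]></code>
<p>In this toy run the first two models are merged first because their vectors are the most similar, and the result is then merged with the remaining model.</p>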
<p>The overall framework is shown in Algorithm 1.</p>
<fig id="fig-8">
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_65978-fig-8.tif"/>
</fig>
<p>Next, we theoretically prove the privacy protection ability of the proposed method.</p>
<p>For client <inline-formula id="ieqn-72"><mml:math id="mml-ieqn-72"><mml:mi>k</mml:mi></mml:math></inline-formula>, on adjacent datasets <inline-formula id="ieqn-73"><mml:math id="mml-ieqn-73"><mml:mi>D</mml:mi></mml:math></inline-formula> and <inline-formula id="ieqn-74"><mml:math id="mml-ieqn-74"><mml:msup><mml:mi>D</mml:mi><mml:mrow><mml:mi mathvariant="normal">&#x2032;</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula>, let <inline-formula id="ieqn-75"><mml:math id="mml-ieqn-75"><mml:mo movablelimits="true" form="prefix">Pr</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="normal">&#x2207;</mml:mi><mml:msubsup><mml:mi>w</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>D</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mi>Y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> and <inline-formula id="ieqn-76"><mml:math id="mml-ieqn-76"><mml:mo movablelimits="true" form="prefix">Pr</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="normal">&#x2207;</mml:mi><mml:msubsup><mml:mi>w</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:msup><mml:mi>D</mml:mi><mml:mrow><mml:mi mathvariant="normal">&#x2032;</mml:mi></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mi>Y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> represent the probability that the output of the local model layer <inline-formula id="ieqn-77"><mml:math id="mml-ieqn-77"><mml:mi>i</mml:mi></mml:math></inline-formula> is <inline-formula id="ieqn-78"><mml:math id="mml-ieqn-78"><mml:mi>Y</mml:mi></mml:math></inline-formula>. 
<inline-formula id="ieqn-79"><mml:math id="mml-ieqn-79"><mml:msub><mml:mi>&#x03B5;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> represents the privacy budget of the <inline-formula id="ieqn-80"><mml:math id="mml-ieqn-80"><mml:mi>i</mml:mi></mml:math></inline-formula>th layer of the model.</p>
<p>Eq. (23) below indicates that the <inline-formula id="ieqn-81"><mml:math id="mml-ieqn-81"><mml:mi>i</mml:mi></mml:math></inline-formula>th layer of the model of client <inline-formula id="ieqn-82"><mml:math id="mml-ieqn-82"><mml:mi>k</mml:mi></mml:math></inline-formula> satisfies <inline-formula id="ieqn-83"><mml:math id="mml-ieqn-83"><mml:msub><mml:mi>&#x03B5;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>-differential privacy. According to the sequential composition property of differential privacy, the model of client <inline-formula id="ieqn-84"><mml:math id="mml-ieqn-84"><mml:mi>k</mml:mi></mml:math></inline-formula> satisfies <inline-formula id="ieqn-85"><mml:math id="mml-ieqn-85"><mml:msup><mml:mi>&#x03B5;</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula>-differential privacy, where <inline-formula id="ieqn-86"><mml:math id="mml-ieqn-86"><mml:msup><mml:mi>&#x03B5;</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>l</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>L</mml:mi></mml:mrow></mml:munderover><mml:msub><mml:mi>&#x03B5;</mml:mi><mml:mrow><mml:mi>l</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>. Therefore, the proposed method provides a mathematically rigorous privacy guarantee, and its privacy protection level can be controlled by adjusting the privacy budget. The smaller the privacy budget, the stronger the privacy protection.
<disp-formula id="eqn-23"><label>(23)</label><mml:math id="mml-eqn-23" display="block"><mml:mtable columnalign="right left right left right left right left right left right left" rowspacing="3pt" columnspacing="0em 2em 0em 2em 0em 2em 0em 2em 0em 2em 0em" displaystyle="true"><mml:mtr><mml:mtd><mml:mfrac><mml:mrow><mml:mo movablelimits="true" form="prefix">Pr</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="normal">&#x2207;</mml:mi><mml:msubsup><mml:mi>w</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>D</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mi>Y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mo movablelimits="true" form="prefix">Pr</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="normal">&#x2207;</mml:mi><mml:msubsup><mml:mi>w</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:msup><mml:mi>D</mml:mi><mml:mrow><mml:mi mathvariant="normal">&#x2032;</mml:mi></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mi>Y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mfrac><mml:mo>=</mml:mo></mml:mtd><mml:mtd><mml:mfrac><mml:mrow><mml:mo movablelimits="true" form="prefix">Pr</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="normal">&#x2207;</mml:mi><mml:msubsup><mml:mi>w</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>D</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:mi>L</mml:mi><mml:mi>a</mml:mi><mml:mi>p</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mfrac><mml:mrow><mml:mrow><mml:mi 
mathvariant="normal">&#x0394;</mml:mi></mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:msub><mml:mi>&#x03B5;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mfrac><mml:mo>)</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mi>Y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mo movablelimits="true" form="prefix">Pr</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="normal">&#x2207;</mml:mi><mml:msubsup><mml:mi>w</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:msup><mml:mi>D</mml:mi><mml:mrow><mml:mi mathvariant="normal">&#x2032;</mml:mi></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:mi>L</mml:mi><mml:mi>a</mml:mi><mml:mi>p</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mfrac><mml:mrow><mml:mrow><mml:mi mathvariant="normal">&#x0394;</mml:mi></mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:msub><mml:mi>&#x03B5;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mfrac><mml:mo>)</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mi>Y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mo>=</mml:mo></mml:mtd><mml:mtd><mml:mfrac><mml:mrow><mml:mo movablelimits="true" form="prefix">Pr</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>L</mml:mi><mml:mi>a</mml:mi><mml:mi>p</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mfrac><mml:mrow><mml:mrow><mml:mi mathvariant="normal">&#x0394;</mml:mi></mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:msub><mml:mi>&#x03B5;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mfrac><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mi>Y</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mi 
mathvariant="normal">&#x2207;</mml:mi><mml:msubsup><mml:mi>w</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>D</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mo movablelimits="true" form="prefix">Pr</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>L</mml:mi><mml:mi>a</mml:mi><mml:mi>p</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mfrac><mml:mrow><mml:mrow><mml:mi mathvariant="normal">&#x0394;</mml:mi></mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:msub><mml:mi>&#x03B5;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mfrac><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mi>Y</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mi mathvariant="normal">&#x2207;</mml:mi><mml:msubsup><mml:mi>w</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:msup><mml:mi>D</mml:mi><mml:mrow><mml:mi mathvariant="normal">&#x2032;</mml:mi></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mo>=</mml:mo></mml:mtd><mml:mtd><mml:mfrac><mml:mrow><mml:mfrac><mml:msub><mml:mi>&#x03B5;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mn>2</mml:mn><mml:mo>&#x00D7;</mml:mo><mml:mrow><mml:mi mathvariant="normal">&#x0394;</mml:mi></mml:mrow><mml:mi>f</mml:mi></mml:mrow></mml:mfrac><mml:mo>&#x00D7;</mml:mo><mml:msup><mml:mi>e</mml:mi><mml:mrow><mml:mfrac><mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mrow><mml:mo>|</mml:mo><mml:mi>Y</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mi 
mathvariant="normal">&#x2207;</mml:mi><mml:msubsup><mml:mi>w</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>D</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>|</mml:mo></mml:mrow><mml:mo>&#x00D7;</mml:mo><mml:msub><mml:mi>&#x03B5;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mrow><mml:mi mathvariant="normal">&#x0394;</mml:mi></mml:mrow><mml:mi>f</mml:mi></mml:mrow></mml:mfrac></mml:mrow></mml:msup></mml:mrow><mml:mrow><mml:mfrac><mml:msub><mml:mi>&#x03B5;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mn>2</mml:mn><mml:mo>&#x00D7;</mml:mo><mml:mrow><mml:mi mathvariant="normal">&#x0394;</mml:mi></mml:mrow><mml:mi>f</mml:mi></mml:mrow></mml:mfrac><mml:mo>&#x00D7;</mml:mo><mml:msup><mml:mi>e</mml:mi><mml:mrow><mml:mfrac><mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mrow><mml:mo>|</mml:mo><mml:mi>Y</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mi mathvariant="normal">&#x2207;</mml:mi><mml:msubsup><mml:mi>w</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:msup><mml:mi>D</mml:mi><mml:mrow><mml:mi mathvariant="normal">&#x2032;</mml:mi></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow><mml:mo>|</mml:mo></mml:mrow><mml:mo>&#x00D7;</mml:mo><mml:msub><mml:mi>&#x03B5;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mrow><mml:mi mathvariant="normal">&#x0394;</mml:mi></mml:mrow><mml:mi>f</mml:mi></mml:mrow></mml:mfrac></mml:mrow></mml:msup></mml:mrow></mml:mfrac></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mo>=</mml:mo></mml:mtd><mml:mtd><mml:msup><mml:mi>e</mml:mi><mml:mrow><mml:mfrac><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mo>&#x2212;</mml:mo><mml:mrow><mml:mo>|</mml:mo><mml:mi>Y</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mi 
mathvariant="normal">&#x2207;</mml:mi><mml:msubsup><mml:mi>w</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>D</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>|</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:mrow><mml:mo>|</mml:mo><mml:mi>Y</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mi mathvariant="normal">&#x2207;</mml:mi><mml:msubsup><mml:mi>w</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:msup><mml:mi>D</mml:mi><mml:mrow><mml:mi mathvariant="normal">&#x2032;</mml:mi></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow><mml:mo>|</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x00D7;</mml:mo><mml:msub><mml:mi>&#x03B5;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mrow><mml:mi mathvariant="normal">&#x0394;</mml:mi></mml:mrow><mml:mi>f</mml:mi></mml:mrow></mml:mfrac></mml:mrow></mml:msup></mml:mtd></mml:mtr><mml:mtr><mml:mtd /><mml:mtd><mml:mi></mml:mi><mml:mo>&#x2264;</mml:mo><mml:msup><mml:mi>e</mml:mi><mml:mrow><mml:mfrac><mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:mi mathvariant="normal">&#x2207;</mml:mi><mml:msubsup><mml:mi>w</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>D</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mi mathvariant="normal">&#x2207;</mml:mi><mml:msubsup><mml:mi>w</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:msup><mml:mi>D</mml:mi><mml:mrow><mml:mi 
mathvariant="normal">&#x2032;</mml:mi></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow><mml:mo>|</mml:mo></mml:mrow><mml:mo>&#x00D7;</mml:mo><mml:msub><mml:mi>&#x03B5;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mrow><mml:mi mathvariant="normal">&#x0394;</mml:mi></mml:mrow><mml:mi>f</mml:mi></mml:mrow></mml:mfrac></mml:mrow></mml:msup><mml:mo>&#x2264;</mml:mo><mml:msup><mml:mi>e</mml:mi><mml:mrow><mml:mrow><mml:msub><mml:mi>&#x03B5;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mrow></mml:msup></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula></p>
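<p>The bound in <xref ref-type="disp-formula" rid="eqn-23">Eq. (23)</xref> and the composition rule can be illustrated with a minimal Python sketch. It is not the implementation used in this paper. It assumes NumPy, hypothetical helper names (<monospace>laplace_scale</monospace>, <monospace>add_layer_noise</monospace>, <monospace>total_budget</monospace>), a known sensitivity &#x0394;<italic>f</italic>, and illustrative per-layer budgets. Each layer update is perturbed with Laplace noise of scale &#x0394;<italic>f</italic>/&#x03B5;<sub><italic>i</italic></sub>, the per-layer budgets are summed, and the Laplace density ratio for two updates that differ by at most &#x0394;<italic>f</italic> is checked numerically against exp(&#x03B5;<sub><italic>i</italic></sub>).</p>
<code language="python"><![CDATA[
```python
import numpy as np


def laplace_scale(sensitivity: float, epsilon: float) -> float:
    """Scale b = sensitivity / epsilon of the Laplace mechanism."""
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    return sensitivity / epsilon


def add_layer_noise(update, sensitivity, epsilon, rng):
    """Return update + Lap(sensitivity / epsilon), drawn element-wise."""
    update = np.asarray(update, dtype=float)
    scale = laplace_scale(sensitivity, epsilon)
    return update + rng.laplace(loc=0.0, scale=scale, size=update.shape)


def laplace_pdf(x, loc, scale):
    """Density of the Laplace distribution centred at loc."""
    return np.exp(-np.abs(x - loc) / scale) / (2.0 * scale)


def total_budget(layer_epsilons):
    """Sequential composition: the client budget is the sum over layers."""
    return float(sum(layer_epsilons))


# Illustrative values only (assumptions, not taken from the paper).
layer_epsilons = [2.0, 1.0, 0.5]
sensitivity = 1.0
rng = np.random.default_rng(0)

noisy_updates = [
    add_layer_noise(np.zeros(4), sensitivity, eps, rng)
    for eps in layer_epsilons
]
client_epsilon = total_budget(layer_epsilons)

# Density ratio for two scalar updates whose distance equals the sensitivity.
g_d, g_d_prime = 0.3, -0.7
ys = np.linspace(-5.0, 5.0, 1001)
eps_i = layer_epsilons[0]
b = laplace_scale(sensitivity, eps_i)
ratios = laplace_pdf(ys, g_d, b) / laplace_pdf(ys, g_d_prime, b)
max_ratio = float(np.max(ratios))
```
]]></code>
<p>In this sketch the largest density ratio equals exp(&#x03B5;<sub><italic>i</italic></sub>) up to rounding, because the two updates are exactly &#x0394;<italic>f</italic> apart; a smaller gap gives a smaller ratio.</p>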
</sec>
<sec id="s4">
<label>4</label>
<title>Experimental Results</title>
<p>The method proposed in this paper aims to enhance the privacy of FL on power data in the electricity sector while preserving the accuracy of the released models. In the experiments, we use the GEFC2012 dataset, which contains electricity consumption data from 20 power stations spanning January 2004 to July 2008, a total of 1650 days. The prediction target is the power load of each power station, and each power station is treated as a separate client. The simulation parameters are summarized in <xref ref-type="table" rid="table-1">Table 1</xref>.</p>
<table-wrap id="table-1">
<label>Table 1</label>
<caption>
<title>Experimental parameters</title>
</caption>
<table>
<colgroup>
<col/>
<col/>
<col/>
</colgroup>
<thead>
<tr>
<th>Parameter</th>
<th>Symbol</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<td>Number of clients</td>
<td><inline-formula id="ieqn-87"><mml:math id="mml-ieqn-87"><mml:mi>n</mml:mi></mml:math></inline-formula></td>
<td>20</td>
</tr>
<tr>
<td>Number of federated training rounds</td>
<td><inline-formula id="ieqn-88"><mml:math id="mml-ieqn-88"><mml:mi>G</mml:mi><mml:mi>E</mml:mi></mml:math></inline-formula></td>
<td>20</td>
</tr>
<tr>
<td>Number of local training rounds</td>
<td><inline-formula id="ieqn-89"><mml:math id="mml-ieqn-89"><mml:mi>L</mml:mi><mml:mi>E</mml:mi></mml:math></inline-formula></td>
<td>10</td>
</tr>
<tr>
<td>Learning rate</td>
<td><inline-formula id="ieqn-90"><mml:math id="mml-ieqn-90"><mml:mi>&#x03B3;</mml:mi></mml:math></inline-formula></td>
<td>0.05</td>
</tr>
<tr>
<td>Client selection ratio</td>
<td><inline-formula id="ieqn-91"><mml:math id="mml-ieqn-91"><mml:mi>F</mml:mi><mml:mi>r</mml:mi></mml:math></inline-formula></td>
<td>1</td>
</tr>
<tr>
<td>Total privacy budget</td>
<td><inline-formula id="ieqn-92"><mml:math id="mml-ieqn-92"><mml:msub><mml:mi>&#x03B5;</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mi>o</mml:mi><mml:mi>t</mml:mi><mml:mi>a</mml:mi><mml:mi>l</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula></td>
<td>25</td>
</tr>
<tr>
<td>Batch size</td>
<td><italic>B</italic></td>
<td>32</td>
</tr>
<tr>
<td>Momentum</td>
<td><italic>/</italic></td>
<td>0.9</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>In the experiments, we use a Long Short-Term Memory (LSTM) network as the base model and compare against the classic FL algorithm FedAvg and Differentially Private FL (DP-FL) [<xref ref-type="bibr" rid="ref-32">32</xref>]. Laplace noise is used whenever noise is added.</p>
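<p>For reference, the FedAvg baseline aggregates client models by a sample-weighted average of their parameters. The following is a minimal Python sketch of that rule, assuming NumPy arrays keyed by a hypothetical layer name and hypothetical client sample counts; it is not the code used in these experiments.</p>
<code language="python"><![CDATA[
```python
import numpy as np


def fedavg(client_weights, client_sizes):
    """Sample-weighted average of per-layer parameters across clients.

    client_weights: list of dicts mapping layer name -> np.ndarray
    client_sizes:   list of local sample counts, one per client
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = float(sum(client_sizes))
    if total <= 0:
        raise ValueError("total sample count must be positive")
    aggregated = {}
    for name in client_weights[0]:
        aggregated[name] = sum(
            (n / total) * np.asarray(w[name], dtype=float)
            for w, n in zip(client_weights, client_sizes)
        )
    return aggregated


# Two hypothetical clients with a single layer named "lstm.weight".
clients = [
    {"lstm.weight": np.array([1.0, 2.0])},
    {"lstm.weight": np.array([3.0, 6.0])},
]
global_weights = fedavg(clients, client_sizes=[100, 300])
```
]]></code>
<p>With these toy inputs the second client holds three quarters of the samples, so the aggregated layer is 0.25 times the first client's weights plus 0.75 times the second client's weights.</p>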
<p>We first compare the mean squared error (MSE) of the global models trained with the three methods as a measure of training effectiveness. The MSE is the average squared difference between predicted and observed values; a smaller MSE indicates a closer fit between the model&#x2019;s predictions and the experimental data. The parameter settings are kept identical across all three methods, and the results are shown in <xref ref-type="fig" rid="fig-3">Fig. 3</xref>.</p>
<fig id="fig-3">
<label>Figure 3</label>
<caption>
<title>Comparison of maximum, minimum, and average MSE under different algorithms</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_65978-fig-3.tif"/>
</fig>
<p>Analysis of the global model MSE post-training reveals that, compared to FedAvg, our algorithm shows minimal differences in maximum, minimum, and average MSE values. This indicates our algorithm effectively reduces noise impact on training, preserving model performance while maintaining power data privacy. Conversely, the DP-FL method demonstrates significantly inferior performance across these MSE metrics, with substantial MSE variance, highlighting the considerable negative effect of its added noise on model performance.</p>
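<p>The maximum, minimum, and average MSE statistics can be computed as in the following minimal Python sketch. It assumes NumPy and that the statistics are taken over a set of evaluations (for example, one per client or one per run; the text does not specify which). The function names and toy values are hypothetical.</p>
<code language="python"><![CDATA[
```python
import numpy as np


def mse(y_true, y_pred):
    """Mean squared error between observed and predicted load values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if y_true.shape != y_pred.shape:
        raise ValueError("shapes must match")
    return float(np.mean((y_true - y_pred) ** 2))


def summarize_mse(pairs):
    """Return (max, min, mean) MSE over (y_true, y_pred) pairs."""
    scores = [mse(t, p) for t, p in pairs]
    return max(scores), min(scores), float(np.mean(scores))


# Hypothetical toy values for three evaluations.
pairs = [
    ([1.0, 2.0], [1.0, 2.0]),  # MSE 0.0
    ([1.0, 2.0], [2.0, 3.0]),  # MSE 1.0
    ([0.0, 0.0], [2.0, 2.0]),  # MSE 4.0
]
mse_max, mse_min, mse_avg = summarize_mse(pairs)
```
]]></code>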
<p>In the next step, we compare the performance of DP-FL and our algorithm under varying total privacy budgets. Because GEFC2012 is a heterogeneous dataset, we run our method with client selection ratios of 1 and 0.8 to observe its behavior under different ratios. We set the total privacy budget to 10, 15, 20, 25, and 30 in turn and record the average MSE of the global model under both algorithms.</p>
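<p>The experimental grid described above can be enumerated as in the following minimal Python sketch. It assumes that clients are drawn uniformly at random without replacement, which the text does not state, and the helper name <monospace>select_clients</monospace> and the seed are hypothetical.</p>
<code language="python"><![CDATA[
```python
import random


def select_clients(client_ids, ratio, seed=None):
    """Sample round(ratio * n) distinct clients, at least one."""
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")
    ids = list(client_ids)
    k = max(1, round(ratio * len(ids)))
    return sorted(random.Random(seed).sample(ids, k))


client_ids = range(20)
budgets = [10, 15, 20, 25, 30]
ratios = [1.0, 0.8]

# One entry per (total privacy budget, selection ratio) setting.
grid = [
    (eps, ratio, select_clients(client_ids, ratio, seed=0))
    for eps in budgets
    for ratio in ratios
]
```
]]></code>
<p>With 20 clients, a ratio of 0.8 selects 16 clients and a ratio of 1 selects all 20, giving ten settings in total.</p>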
<p>As shown in <xref ref-type="fig" rid="fig-4">Fig. 4</xref>, our method significantly outperforms DP-FL in mean MSE, with this advantage growing more pronounced as the total privacy budget decreases. This confirms our method&#x2019;s effectiveness in minimizing model performance loss while ensuring FL privacy. Through dynamic privacy budget allocation and an aggregation strategy based on explainability and model performance, we achieve an effective balance between model utility and privacy protection. In addition, adjusting the client selection ratio has no significant effect on the performance after convergence, but our method shows faster convergence when the client selection ratio is higher.</p>
<fig id="fig-4">
<label>Figure 4</label>
<caption>
<title>Comparison of average MSE under different overall privacy budgets</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_65978-fig-4.tif"/>
</fig>
<p>We then set the client selection ratio to 1, so that all 20 clients participate in the FL process, record the loss values over 20 global training rounds for the three methods, and compare their model performance and training efficiency.</p>
<p><xref ref-type="fig" rid="fig-5">Fig. 5</xref> shows how the loss of the three FL algorithms (FedAvg, DP-FL, and our algorithm) changes over the training epochs. For all three methods the loss drops rapidly at first, then more slowly, and finally stabilizes at similar levels, indicating comparable global model performance. Although DP-FL and our algorithm add noise during training, its impact fades in later stages and both models converge. DP-FL reduces the loss most slowly because the noise in its model updates hinders training, and its initial loss is also higher than that of the other two methods because early noise has a greater impact. FedAvg converges fastest throughout training. The loss curve of our algorithm is close to that of FedAvg, and its training efficiency is much higher than that of DP-FL. This is consistent with our algorithm adding less noise to the updates of more important layers, which keeps those updates close to their noise-free values while balancing privacy protection and training efficiency.</p>
<fig id="fig-5">
<label>Figure 5</label>
<caption>
<title>Loss reduction of different algorithms</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_65978-fig-5.tif"/>
</fig>
<p>Next, we adjust the client selection ratio to change the number of clients in federated training. By comparing loss values over 20 global training rounds with different client numbers, we assess the impact of client count variations on model performance and training efficiency.</p>
<p><xref ref-type="fig" rid="fig-6">Fig. 6</xref> presents the loss variations of three client configurations across different training epochs. The five-client setup shows the poorest training performance, with consistently higher loss and more noticeable curve fluctuations compared to the ten- and twenty-client setups. The loss reduction trends for ten and twenty clients are similar, with the latter slightly outperforming the former. At convergence, the five-client case has a slightly higher loss than the ten- and twenty-client cases.</p>
<fig id="fig-6">
<label>Figure 6</label>
<caption>
<title>Loss reduction with different number of clients</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_65978-fig-6.tif"/>
</fig>
<p>This can be explained by the small number of clients in the five-client setup: the noise in each client&#x2019;s model update has a larger effect on the aggregate, which introduces training uncertainty. Fewer clients may also bias the global model updates toward specific clients, affecting the overall training. As the number of clients increases, the FL process becomes more stable. In our algorithm, aggregation based on the explainability and performance of the reference model brings the FL process closer to the noise-free scenario.</p>
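<p>The stabilizing effect of a larger client count can be illustrated with a minimal Python sketch under simplifying assumptions that differ from our algorithm: every client adds independent Laplace noise with the same hypothetical scale, and the server takes an unweighted mean. Since the variance of Lap(<italic>b</italic>) is 2<italic>b</italic><sup>2</sup>, the standard deviation of the averaged noise over <italic>n</italic> clients is <italic>b</italic> times the square root of 2/<italic>n</italic>. The function name and the trial count are illustrative.</p>
<code language="python"><![CDATA[
```python
import numpy as np


def averaged_noise_std(num_clients, scale, trials=20000, seed=0):
    """Empirical std of the mean of num_clients independent Lap(scale) draws."""
    rng = np.random.default_rng(seed)
    noise = rng.laplace(loc=0.0, scale=scale, size=(trials, num_clients))
    return float(noise.mean(axis=1).std())


scale = 1.0  # hypothetical Laplace scale, identical for every client
client_counts = (5, 10, 20)

empirical = {n: averaged_noise_std(n, scale) for n in client_counts}
# Theory: Var[Lap(b)] = 2 b^2, so the mean of n draws has std sqrt(2 / n) * b.
theoretical = {n: float(np.sqrt(2.0 / n) * scale) for n in client_counts}
```
]]></code>
<p>Under these assumptions, going from five to twenty clients halves the standard deviation of the noise that reaches the aggregated update.</p>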
<p>Next, we compare how well the proposed method defends against gradient leakage attacks under different privacy budgets. We use two indicators: the sum of squared gradient differences between the fake data restored by the attacker and the real data (data Loss), and the mean squared error between the fake data and the real data (data MSE). Both quantify resistance to gradient leakage attacks: the larger they are, the harder it is for the attacker to recover the original data accurately. We use FedAvg without privacy protection as the baseline and set the batch size of the client&#x2019;s local model update to 1.</p>
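<p>The two indicators can be computed as in the following minimal Python sketch. It assumes NumPy, that gradients are given as lists of per-layer arrays, and hypothetical function names and toy values; it does not implement the attack itself.</p>
<code language="python"><![CDATA[
```python
import numpy as np


def gradient_matching_loss(fake_grads, real_grads):
    """Sum of squared differences between two lists of per-layer gradients."""
    if len(fake_grads) != len(real_grads):
        raise ValueError("both gradient lists need the same number of layers")
    return float(sum(
        np.sum((np.asarray(f, dtype=float) - np.asarray(r, dtype=float)) ** 2)
        for f, r in zip(fake_grads, real_grads)
    ))


def reconstruction_mse(fake_data, real_data):
    """Mean squared error between reconstructed and original samples."""
    fake = np.asarray(fake_data, dtype=float)
    real = np.asarray(real_data, dtype=float)
    if fake.shape != real.shape:
        raise ValueError("shapes must match")
    return float(np.mean((fake - real) ** 2))


# Hypothetical toy gradients (two layers) and one data sample.
real_grads = [np.array([0.5, -0.5]), np.array([1.0])]
fake_grads = [np.array([0.0, -0.5]), np.array([2.0])]
data_loss = gradient_matching_loss(fake_grads, real_grads)  # 0.25 + 1.0
data_mse = reconstruction_mse([1.0, 3.0], [1.0, 1.0])       # (0 + 4) / 2
```
]]></code>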
<p>As <xref ref-type="fig" rid="fig-7">Fig. 7</xref> shows, with the non-private FedAvg baseline, data Loss and data MSE converge quickly to values close to 0, which means that the attacker can restore the original data with high quality. With the method proposed in this paper, data Loss and data MSE converge to higher levels, which means that the data restored by the attacker are inaccurate and that the method resists gradient leakage attacks effectively. As the privacy budget decreases, both indicators converge to still higher levels, so the privacy protection of the method becomes stronger.</p>
<fig id="fig-7">
<label>Figure 7</label>
<caption>
<title>Data loss and MSE under different privacy budgets</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_65978-fig-7.tif"/>
</fig>
</sec>
<sec id="s5">
<label>5</label>
<title>Conclusion</title>
<p>This paper proposes an explainability-driven, privacy-preserving FL framework to address the trade-off between data privacy protection and model performance in practical power industry applications. The framework accounts for layer importance when training on power data and dynamically adjusts the noise addition process in combination with DP techniques. This approach ensures strong privacy protection while improving model performance. In addition, model aggregation is performed based on the explainability and performance of the reference model, which reduces the damage that low-quality models cause to the aggregation process. Unlike traditional static privacy protection methods, the framework uses explainability analysis tools to assist FL.</p>
<p>Experiments on a power load forecasting task show that, compared with DP-FL, the proposed framework improves prediction accuracy and training efficiency while maintaining data privacy, and that its performance stays close to that of FedAvg. The proposed method therefore balances model utility and privacy protection better than traditional methods. This study provides new insights, together with theoretical and practical guidance, for privacy protection and intelligent applications in the power industry.</p>
</sec>
</body>
<back>
<ack>
<p>The authors declare that there are no individuals or organizations to acknowledge for their contributions to this research.</p>
</ack>
<sec>
<title>Funding Statement</title>
<p>The authors received no specific funding for this study.</p>
</sec>
<sec>
<title>Author Contributions</title>
<p>Zekun Liu and Junwei Ma conceived the study and developed the framework. Xin Gong designed the privacy budget allocation strategy. Xiu Liu implemented the weighted aggregation strategy. Long An performed the experiments and analyzed the data. Bingbing Liu contributed to resource provision and supervision. All authors reviewed the results and approved the final version of the manuscript.</p>
</sec>
<sec sec-type="data-availability">
<title>Availability of Data and Materials</title>
<p>The data that support the findings of this study are available from the corresponding author, Junwei Ma, upon reasonable request.</p>
</sec>
<sec>
<title>Ethics Approval</title>
<p>Not applicable.</p>
</sec>
<sec sec-type="COI-statement">
<title>Conflicts of Interest</title>
<p>The authors declare no conflicts of interest to report regarding the present study.</p>
</sec>
<ref-list content-type="authoryear">
<title>References</title>
<ref id="ref-1"><label>[1]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Miao</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Yan</surname> <given-names>X</given-names></string-name>, <string-name><surname>Li</surname> <given-names>X</given-names></string-name>, <string-name><surname>Xu</surname> <given-names>S</given-names></string-name>, <string-name><surname>Liu</surname> <given-names>X</given-names></string-name>, <string-name><surname>Li</surname> <given-names>H</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>RFed: robustness-enhanced privacy-preserving federated learning against poisoning attack</article-title>. <source>IEEE Trans Inform Forensic Secur</source>. <year>2024</year>;<volume>19</volume>:<fpage>5814</fpage>&#x2013;<lpage>27</lpage>. doi:<pub-id pub-id-type="doi">10.1109/tifs.2024.3402113</pub-id>.</mixed-citation></ref>
<ref id="ref-2"><label>[2]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Alamer</surname> <given-names>A</given-names></string-name>, <string-name><surname>Basudan</surname> <given-names>S</given-names></string-name></person-group>. <article-title>A privacy-preserving federated learning with a feature of detecting forged and duplicated gradient model in autonomous vehicle</article-title>. <source>IEEE Access</source>. <year>2025</year>;<volume>13</volume>:<fpage>38484</fpage>&#x2013;<lpage>501</lpage>. doi:<pub-id pub-id-type="doi">10.1109/access.2025.3545786</pub-id>.</mixed-citation></ref>
<ref id="ref-3"><label>[3]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Formery</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Mendiboure</surname> <given-names>L</given-names></string-name>, <string-name><surname>Villain</surname> <given-names>J</given-names></string-name>, <string-name><surname>Deniau</surname> <given-names>V</given-names></string-name>, <string-name><surname>Gransart</surname> <given-names>C</given-names></string-name></person-group>. <article-title>A framework to design efficient blockchain-based decentralized federated learning architectures</article-title>. <source>IEEE Open J Comput Soc</source>. <year>2024</year>;<volume>5</volume>(<issue>37</issue>):<fpage>705</fpage>&#x2013;<lpage>23</lpage>. doi:<pub-id pub-id-type="doi">10.1109/ojcs.2024.3488512</pub-id>.</mixed-citation></ref>
<ref id="ref-4"><label>[4]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Gong</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Shen</surname> <given-names>L</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>LY</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>J</given-names></string-name>, <string-name><surname>Bai</surname> <given-names>G</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>AgrAmplifier: defending federated learning against poisoning attacks through local update amplification</article-title>. <source>IEEE Trans Inform Forensic Secur</source>. <year>2024</year>;<volume>19</volume>:<fpage>1241</fpage>&#x2013;<lpage>50</lpage>. doi:<pub-id pub-id-type="doi">10.1109/tifs.2023.3333555</pub-id>.</mixed-citation></ref>
<ref id="ref-5"><label>[5]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Xu</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Yu</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>H</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>J</given-names></string-name>, <string-name><surname>Gu</surname> <given-names>J</given-names></string-name>, <string-name><surname>Lukasiewicz</surname> <given-names>T</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>PhaCIA-TCNs: short-term load forecasting using temporal convolutional networks with parallel hybrid activated convolution and input attention</article-title>. <source>IEEE Trans Netw Sci Eng</source>. <year>2024</year>;<volume>11</volume>(<issue>1</issue>):<fpage>427</fpage>&#x2013;<lpage>38</lpage>. doi:<pub-id pub-id-type="doi">10.1109/tnse.2023.3300744</pub-id>.</mixed-citation></ref>
<ref id="ref-6"><label>[6]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Li</surname> <given-names>S</given-names></string-name>, <string-name><surname>Ngai</surname> <given-names>ECH</given-names></string-name>, <string-name><surname>Voigt</surname> <given-names>T</given-names></string-name></person-group>. <article-title>An experimental study of Byzantine-robust aggregation schemes in federated learning</article-title>. <source>IEEE Trans Big Data</source>. <year>2024</year>;<volume>10</volume>(<issue>6</issue>):<fpage>975</fpage>&#x2013;<lpage>88</lpage>. doi:<pub-id pub-id-type="doi">10.1109/TBDATA.2023.3237397</pub-id>.</mixed-citation></ref>
<ref id="ref-7"><label>[7]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Li</surname> <given-names>X</given-names></string-name>, <string-name><surname>Qu</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Zhao</surname> <given-names>S</given-names></string-name>, <string-name><surname>Tang</surname> <given-names>B</given-names></string-name>, <string-name><surname>Lu</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Liu</surname> <given-names>Y</given-names></string-name></person-group>. <article-title>LoMar: a local defense against poisoning attack on federated learning</article-title>. <source>IEEE Trans Dependable Secure Comput</source>. <year>2023</year>;<volume>20</volume>(<issue>1</issue>):<fpage>437</fpage>&#x2013;<lpage>50</lpage>. doi:<pub-id pub-id-type="doi">10.1109/TDSC.2021.3135422</pub-id>.</mixed-citation></ref>
<ref id="ref-8"><label>[8]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Geng</surname> <given-names>G</given-names></string-name>, <string-name><surname>Cai</surname> <given-names>T</given-names></string-name>, <string-name><surname>Yang</surname> <given-names>Z</given-names></string-name></person-group>. <article-title>Better safe than sorry: constructing Byzantine-robust federated learning with synthesized trust</article-title>. <source>Electronics</source>. <year>2023</year>;<volume>12</volume>(<issue>13</issue>):<fpage>2926</fpage>. doi:<pub-id pub-id-type="doi">10.3390/electronics12132926</pub-id>.</mixed-citation></ref>
<ref id="ref-9"><label>[9]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Xing</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>ZA</given-names></string-name>, <string-name><surname>Liu</surname> <given-names>J</given-names></string-name>, <string-name><surname>Zhu</surname> <given-names>L</given-names></string-name>, <string-name><surname>Russello</surname> <given-names>G</given-names></string-name></person-group>. <article-title>No vandalism: privacy-preserving and byzantine-robust federated learning</article-title>. In: <conf-name>Proceedings of the 2024 ACM SIGSAC Conference on Computer and Communications Security (CCS); 2024 Oct 14&#x2013;18</conf-name>; <publisher-loc>Salt Lake City, USA</publisher-loc>.</mixed-citation></ref>
<ref id="ref-10"><label>[10]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Yan</surname> <given-names>X</given-names></string-name>, <string-name><surname>Miao</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Li</surname> <given-names>X</given-names></string-name>, <string-name><surname>Choo</surname> <given-names>KR</given-names></string-name>, <string-name><surname>Meng</surname> <given-names>X</given-names></string-name>, <string-name><surname>Deng</surname> <given-names>RH</given-names></string-name></person-group>. <article-title>Privacy-preserving asynchronous federated learning framework in distributed IoT</article-title>. <source>IEEE Internet Things J</source>. <year>2023</year>;<volume>10</volume>(<issue>15</issue>):<fpage>13281</fpage>&#x2013;<lpage>91</lpage>. doi:<pub-id pub-id-type="doi">10.1109/JIOT.2023.3262546</pub-id>.</mixed-citation></ref>
<ref id="ref-11"><label>[11]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Chen</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Yan</surname> <given-names>L</given-names></string-name>, <string-name><surname>Ai</surname> <given-names>D</given-names></string-name></person-group>. <article-title>An robust secure blockchain-based hierarchical asynchronous federated learning scheme for Internet of Things</article-title>. <source>IEEE Access</source>. <year>2024</year>;<volume>12</volume>:<fpage>165280</fpage>&#x2013;<lpage>97</lpage>. doi:<pub-id pub-id-type="doi">10.1109/access.2024.3493112</pub-id>.</mixed-citation></ref>
<ref id="ref-12"><label>[12]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Vyas</surname> <given-names>A</given-names></string-name>, <string-name><surname>Lin</surname> <given-names>PC</given-names></string-name>, <string-name><surname>Hwang</surname> <given-names>RH</given-names></string-name>, <string-name><surname>Tripathi</surname> <given-names>M</given-names></string-name></person-group>. <article-title>Privacy-preserving federated learning for intrusion detection in IoT environments: a survey</article-title>. <source>IEEE Access</source>. <year>2024</year>;<volume>12</volume>:<fpage>127018</fpage>&#x2013;<lpage>50</lpage>. doi:<pub-id pub-id-type="doi">10.1109/access.2024.3454211</pub-id>.</mixed-citation></ref>
<ref id="ref-13"><label>[13]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Alebouyeh</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Bidgoly</surname> <given-names>AJ</given-names></string-name></person-group>. <article-title>Benchmarking robustness and privacy-preserving methods in federated learning</article-title>. <source>Future Gener Comput Syst</source>. <year>2024</year>;<volume>155</volume>(<issue>1</issue>):<fpage>18</fpage>&#x2013;<lpage>38</lpage>. doi:<pub-id pub-id-type="doi">10.1016/j.future.2024.01.009</pub-id>.</mixed-citation></ref>
<ref id="ref-14"><label>[14]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><surname>McMahan</surname> <given-names>B</given-names></string-name>, <string-name><surname>Moore</surname> <given-names>E</given-names></string-name>, <string-name><surname>Ramage</surname> <given-names>D</given-names></string-name>, <string-name><surname>Hampson</surname> <given-names>S</given-names></string-name>, <string-name><surname>Arcas</surname> <given-names>AB</given-names></string-name></person-group>. <article-title>Communication-efficient learning of deep networks from decentralized data</article-title>. <comment>arXiv:1602.05629. 2016</comment>. doi:<pub-id pub-id-type="doi">10.48550/arXiv.1602.05629</pub-id>.</mixed-citation></ref>
<ref id="ref-15"><label>[15]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Taik</surname> <given-names>A</given-names></string-name>, <string-name><surname>Cherkaoui</surname> <given-names>S</given-names></string-name></person-group>. <article-title>Electrical load forecasting using edge computing and federated learning</article-title>. In: <conf-name>Proceedings of the ICC 2020&#x2013;2020 IEEE International Conference on Communications (ICC); 2020 Jun 7&#x2013;11</conf-name>; <publisher-loc>Dublin, Ireland</publisher-loc>. doi:<pub-id pub-id-type="doi">10.1109/icc40277.2020.9148937</pub-id>.</mixed-citation></ref>
<ref id="ref-16"><label>[16]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Briggs</surname> <given-names>C</given-names></string-name>, <string-name><surname>Fan</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Andras</surname> <given-names>P</given-names></string-name></person-group>. <article-title>Federated learning for short-term residential load forecasting</article-title>. <source>IEEE Open Access J Power Energy</source>. <year>2022</year>;<volume>9</volume>:<fpage>573</fpage>&#x2013;<lpage>83</lpage>. doi:<pub-id pub-id-type="doi">10.1109/oajpe.2022.3206220</pub-id>.</mixed-citation></ref>
<ref id="ref-17"><label>[17]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Su</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Luan</surname> <given-names>TH</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>N</given-names></string-name>, <string-name><surname>Li</surname> <given-names>F</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>T</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Secure and efficient federated learning for smart grid with edge-cloud collaboration</article-title>. <source>IEEE Trans Ind Inf</source>. <year>2022</year>;<volume>18</volume>(<issue>2</issue>):<fpage>1333</fpage>&#x2013;<lpage>44</lpage>. doi:<pub-id pub-id-type="doi">10.1109/tii.2021.3095506</pub-id>.</mixed-citation></ref>
<ref id="ref-18"><label>[18]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><surname>Liu</surname> <given-names>H</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>X</given-names></string-name>, <string-name><surname>Shen</surname> <given-names>X</given-names></string-name>, <string-name><surname>Sun</surname> <given-names>H</given-names></string-name></person-group>. <article-title>A federated learning framework for smart grids: securing power traces in collaborative learning</article-title>. <comment>arXiv:2103.11870. 2021</comment>. doi:<pub-id pub-id-type="doi">10.48550/arXiv.2103.11870</pub-id>.</mixed-citation></ref>
<ref id="ref-19"><label>[19]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Cheng</surname> <given-names>X</given-names></string-name>, <string-name><surname>Li</surname> <given-names>C</given-names></string-name>, <string-name><surname>Liu</surname> <given-names>X</given-names></string-name></person-group>. <article-title>A review of federated learning in energy systems</article-title>. In: <conf-name>Proceedings of the 2022 IEEE/IAS Industrial and Commercial Power System Asia (I&#x0026;CPS Asia); 2022 Jul 8&#x2013;11</conf-name>; <publisher-loc>Shanghai, China</publisher-loc>. doi:<pub-id pub-id-type="doi">10.1109/ICPSAsia55496.2022.9949863</pub-id>.</mixed-citation></ref>
<ref id="ref-20"><label>[20]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zheng</surname> <given-names>R</given-names></string-name>, <string-name><surname>Sumper</surname> <given-names>A</given-names></string-name>, <string-name><surname>Arag&#x00FC;&#x00E9;s-Pe&#x00F1;alba</surname> <given-names>M</given-names></string-name>, <string-name><surname>Galceran-Arellano</surname> <given-names>S</given-names></string-name></person-group>. <article-title>Advancing power system services with privacy-preserving federated learning techniques: a review</article-title>. <source>IEEE Access</source>. <year>2024</year>;<volume>12</volume>:<fpage>76753</fpage>&#x2013;<lpage>80</lpage>. doi:<pub-id pub-id-type="doi">10.1109/access.2024.3407121</pub-id>.</mixed-citation></ref>
<ref id="ref-21"><label>[21]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Husnoo</surname> <given-names>MA</given-names></string-name>, <string-name><surname>Anwar</surname> <given-names>A</given-names></string-name>, <string-name><surname>Reda</surname> <given-names>HT</given-names></string-name>, <string-name><surname>Hosseinzadeh</surname> <given-names>N</given-names></string-name>, <string-name><surname>Islam</surname> <given-names>SN</given-names></string-name>, <string-name><surname>Mahmood</surname> <given-names>AN</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>FedDiSC: a computation-efficient federated learning framework for power systems disturbance and cyber attack discrimination</article-title>. <source>Energy AI</source>. <year>2023</year>;<volume>14</volume>(<issue>2</issue>):<fpage>100271</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.egyai.2023.100271</pub-id>.</mixed-citation></ref>
<ref id="ref-22"><label>[22]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Wen</surname> <given-names>M</given-names></string-name>, <string-name><surname>Xie</surname> <given-names>R</given-names></string-name>, <string-name><surname>Lu</surname> <given-names>K</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>L</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>K</given-names></string-name></person-group>. <article-title>FedDetect: a novel privacy-preserving federated learning framework for energy theft detection in smart grid</article-title>. <source>IEEE Internet Things J</source>. <year>2022</year>;<volume>9</volume>(<issue>8</issue>):<fpage>6069</fpage>&#x2013;<lpage>80</lpage>. doi:<pub-id pub-id-type="doi">10.1109/JIOT.2021.3110784</pub-id>.</mixed-citation></ref>
<ref id="ref-23"><label>[23]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Badr</surname> <given-names>MM</given-names></string-name>, <string-name><surname>Mahmoud</surname> <given-names>MMEA</given-names></string-name>, <string-name><surname>Fang</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Abdulaal</surname> <given-names>M</given-names></string-name>, <string-name><surname>Aljohani</surname> <given-names>AJ</given-names></string-name>, <string-name><surname>Alasmary</surname> <given-names>W</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Privacy-preserving and communication-efficient energy prediction scheme based on federated learning for smart grids</article-title>. <source>IEEE Internet Things J</source>. <year>2023</year>;<volume>10</volume>(<issue>9</issue>):<fpage>7719</fpage>&#x2013;<lpage>36</lpage>. doi:<pub-id pub-id-type="doi">10.1109/jiot.2022.3230586</pub-id>.</mixed-citation></ref>
<ref id="ref-24"><label>[24]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><surname>Zhu</surname> <given-names>L</given-names></string-name>, <string-name><surname>Liu</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Han</surname> <given-names>S</given-names></string-name></person-group>. <chapter-title>Deep leakage from gradients</chapter-title>. In: <person-group person-group-type="editor"><string-name><surname>Wallach</surname> <given-names>H</given-names></string-name>, <string-name><surname>Larochelle</surname> <given-names>H</given-names></string-name>, <string-name><surname>Beygelzimer</surname> <given-names>A</given-names></string-name>, <string-name><surname>d&#x0027;Alch&#x00E9;-Buc</surname> <given-names>F</given-names></string-name>, <string-name><surname>Fox</surname> <given-names>E</given-names></string-name>, <string-name><surname>Garnett</surname> <given-names>R</given-names></string-name></person-group>, editors. <source>Advances in Neural Information Processing Systems 32. Proceedings of the 33rd Conference on Neural Information Processing Systems (NeurIPS 2019); 2019 Dec 8&#x2013;14</source>. <publisher-loc>Vancouver, BC, Canada; Houston, TX, USA</publisher-loc>: <publisher-name>NeurIPS</publisher-name>; <year>2019</year>.</mixed-citation></ref>
<ref id="ref-25"><label>[25]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Shokri</surname> <given-names>R</given-names></string-name>, <string-name><surname>Stronati</surname> <given-names>M</given-names></string-name>, <string-name><surname>Song</surname> <given-names>C</given-names></string-name>, <string-name><surname>Shmatikov</surname> <given-names>V</given-names></string-name></person-group>. <article-title>Membership inference attacks against machine learning models</article-title>. In: <conf-name>Proceedings of the 2017 IEEE Symposium on Security and Privacy (SP); 2017 May 22&#x2013;26</conf-name>; <publisher-loc>San Jose, CA, USA</publisher-loc>. doi:<pub-id pub-id-type="doi">10.1109/SP.2017.41</pub-id>.</mixed-citation></ref>
<ref id="ref-26"><label>[26]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Truex</surname> <given-names>S</given-names></string-name>, <string-name><surname>Liu</surname> <given-names>L</given-names></string-name>, <string-name><surname>Chow</surname> <given-names>KH</given-names></string-name>, <string-name><surname>Gursoy</surname> <given-names>ME</given-names></string-name>, <string-name><surname>Wei</surname> <given-names>W</given-names></string-name></person-group>. <article-title>LDP-Fed: federated learning with local differential privacy</article-title>. In: <conf-name>Proceedings of the Third ACM International Workshop on Edge Systems, Analytics and Networking; 2020 Apr 27</conf-name>; <publisher-loc>Heraklion, Greece</publisher-loc>. doi:<pub-id pub-id-type="doi">10.1145/3378679.3394533</pub-id>.</mixed-citation></ref>
<ref id="ref-27"><label>[27]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Triastcyn</surname> <given-names>A</given-names></string-name>, <string-name><surname>Faltings</surname> <given-names>B</given-names></string-name></person-group>. <article-title>Federated learning with Bayesian differential privacy</article-title>. In: <conf-name>Proceedings of the 2019 IEEE International Conference on Big Data (Big Data); 2019 Dec 9&#x2013;12</conf-name>; <publisher-loc>Los Angeles, CA, USA</publisher-loc>. doi:<pub-id pub-id-type="doi">10.1109/bigdata47090.2019.9005465</pub-id>.</mixed-citation></ref>
<ref id="ref-28"><label>[28]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Wu</surname> <given-names>X</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Shi</surname> <given-names>M</given-names></string-name>, <string-name><surname>Li</surname> <given-names>P</given-names></string-name>, <string-name><surname>Li</surname> <given-names>R</given-names></string-name>, <string-name><surname>Xiong</surname> <given-names>NN</given-names></string-name></person-group>. <article-title>An adaptive federated learning scheme with differential privacy preserving</article-title>. <source>Future Gener Comput Syst</source>. <year>2022</year>;<volume>127</volume>(<issue>8</issue>):<fpage>362</fpage>&#x2013;<lpage>72</lpage>. doi:<pub-id pub-id-type="doi">10.1016/j.future.2021.09.015</pub-id>.</mixed-citation></ref>
<ref id="ref-29"><label>[29]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>El Ouadrhiri</surname> <given-names>A</given-names></string-name>, <string-name><surname>Abdelhadi</surname> <given-names>A</given-names></string-name></person-group>. <article-title>Differential privacy for deep and federated learning: a survey</article-title>. <source>IEEE Access</source>. <year>2022</year>;<volume>10</volume>(<issue>2</issue>):<fpage>22359</fpage>&#x2013;<lpage>80</lpage>. doi:<pub-id pub-id-type="doi">10.1109/access.2022.3151670</pub-id>.</mixed-citation></ref>
<ref id="ref-30"><label>[30]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Wei</surname> <given-names>K</given-names></string-name>, <string-name><surname>Li</surname> <given-names>J</given-names></string-name>, <string-name><surname>Ding</surname> <given-names>M</given-names></string-name>, <string-name><surname>Ma</surname> <given-names>C</given-names></string-name>, <string-name><surname>Yang</surname> <given-names>HH</given-names></string-name>, <string-name><surname>Farokhi</surname> <given-names>F</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Federated learning with differential privacy: algorithms and performance analysis</article-title>. <source>IEEE Trans Inform Forensic Secur</source>. <year>2020</year>;<volume>15</volume>:<fpage>3454</fpage>&#x2013;<lpage>69</lpage>. doi:<pub-id pub-id-type="doi">10.1109/tifs.2020.2988575</pub-id>.</mixed-citation></ref>
<ref id="ref-31"><label>[31]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Armijo</surname> <given-names>L</given-names></string-name></person-group>. <article-title>Minimization of functions having Lipschitz continuous first partial derivatives</article-title>. <source>Pacific J Math</source>. <year>1966</year>;<volume>16</volume>(<issue>1</issue>):<fpage>1</fpage>&#x2013;<lpage>3</lpage>. doi:<pub-id pub-id-type="doi">10.2140/pjm.1966.16.1</pub-id>.</mixed-citation></ref>
<ref id="ref-32"><label>[32]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><surname>Geyer</surname> <given-names>RC</given-names></string-name>, <string-name><surname>Klein</surname> <given-names>T</given-names></string-name>, <string-name><surname>Nabi</surname> <given-names>M</given-names></string-name></person-group>. <article-title>Differentially private federated learning: a client level perspective</article-title>. <comment>arXiv:1712.07557. 2017</comment>. doi:<pub-id pub-id-type="doi">10.48550/arXiv.1712.07557</pub-id>.</mixed-citation></ref>
</ref-list>
</back></article>