<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.1 20151215//EN" "http://jats.nlm.nih.gov/publishing/1.1/JATS-journalpublishing1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xml:lang="en" article-type="research-article" dtd-version="1.1">
<front>
<journal-meta>
<journal-id journal-id-type="pmc">CMC</journal-id>
<journal-id journal-id-type="nlm-ta">CMC</journal-id>
<journal-id journal-id-type="publisher-id">CMC</journal-id>
<journal-title-group>
<journal-title>Computers, Materials &#x0026; Continua</journal-title>
</journal-title-group>
<issn pub-type="epub">1546-2226</issn>
<issn pub-type="ppub">1546-2218</issn>
<publisher>
<publisher-name>Tech Science Press</publisher-name>
<publisher-loc>USA</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">73584</article-id>
<article-id pub-id-type="doi">10.32604/cmc.2026.073584</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>EdgeTrustX: A Privacy-Aware Federated Transformer Framework for Scalable and Explainable IoT Threat Detection</article-title>
<alt-title alt-title-type="left-running-head">EdgeTrustX: A Privacy-Aware Federated Transformer Framework for Scalable and Explainable IoT Threat Detection</alt-title>
<alt-title alt-title-type="right-running-head">EdgeTrustX: A Privacy-Aware Federated Transformer Framework for Scalable and Explainable IoT Threat Detection</alt-title>
</title-group>
<contrib-group>
<contrib id="author-1" contrib-type="author" corresp="yes">
<name name-style="western"><surname>Alharbi</surname><given-names>Saleh</given-names></name><email>saleh@su.edu.sa</email></contrib>
<aff id="aff-1"><institution>Information Technology Department, College of Computing and Information Technology, Shaqra University</institution>, <addr-line>Shaqra</addr-line>, <country>Saudi Arabia</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>&#x002A;</label>Corresponding Author: Saleh Alharbi. Email: <email>saleh@su.edu.sa</email></corresp>
</author-notes>
<pub-date date-type="collection" publication-format="electronic">
<year>2026</year>
</pub-date>
<pub-date date-type="pub" publication-format="electronic">
<day>9</day><month>4</month><year>2026</year>
</pub-date>
<volume>87</volume>
<issue>3</issue>
<elocation-id>15</elocation-id>
<history>
<date date-type="received">
<day>21</day>
<month>9</month>
<year>2025</year>
</date>
<date date-type="accepted">
<day>13</day>
<month>11</month>
<year>2025</year>
</date>
</history>
<permissions>
<copyright-statement>&#x00A9; 2026 The Author. Published by Tech Science Press.</copyright-statement>
<copyright-year>2026</copyright-year>
<copyright-holder>The Author</copyright-holder>
<license xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>This work is licensed under a <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</ext-link>, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<self-uri content-type="pdf" xlink:href="TSP_CMC_73584.pdf"></self-uri>
<abstract>
<p>Real-time threat detection in Internet of Things (IoT) networks requires scalable, privacy-preserving, and interpretable models capable of operating under strict latency constraints. This paper presents EdgeTrustX, a privacy-aware federated transformer framework that addresses these challenges by combining transformer-based representation learning with federated optimisation, differential privacy, and homomorphic encryption. The framework enables collaborative model training across heterogeneous IoT devices without exposing sensitive local data while maintaining computational feasibility for edge deployment. A multi-head attention mechanism integrated with a secure aggregation protocol supports adaptive feature weighting and privacy-protected parameter exchange. To enhance transparency, an explainability module that combines attention visualisation and SHAP analysis provides interpretable insights into attack patterns and decision boundaries. Extensive experiments on four public IoT benchmark datasets (IoT23, NBaIoT, UNSWNB15, and CICIDS2017) demonstrate that EdgeTrustX achieves an average detection accuracy of 94.7%, closely approaching the centralised transformer baseline of 95.3%, while preserving strong privacy guarantees under a strict differential privacy budget of &#x03B5; = 0.1. The system reduces membership inference attack success to 52.1%, achieves a 23% improvement in scalability, and maintains an average per-round latency of 449.2 ms, confirming its suitability for real-time operation in large-scale edge networks. The main contributions include (1) a privacy-preserving federated transformer architecture for IoT threat detection, (2) a scalable differential privacy-driven secure aggregation protocol, (3) an explainable AI component enabling transparent threat analysis, and (4) a comprehensive empirical evaluation validating accuracy, scalability, privacy preservation, and interpretability in diverse IoT scenarios.</p>
</abstract>
<kwd-group kwd-group-type="author">
<kwd>Federated learning</kwd>
<kwd>transformer networks</kwd>
<kwd>IoT security</kwd>
<kwd>privacy preservation</kwd>
<kwd>explainable AI</kwd>
<kwd>threat detection</kwd>
<kwd>edge computing</kwd>
<kwd>differential privacy</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec id="s1">
<label>1</label>
<title>Introduction</title>
<p>The Internet of Things (IoT) ecosystem is expanding at an unprecedented scale, with projections indicating more than 75 billion connected devices by 2025 [<xref ref-type="bibr" rid="ref-1">1</xref>]. This explosive growth has reshaped key sectors such as smart healthcare, industrial automation, and smart cities, offering improved efficiency, real-time monitoring, and enhanced decision-making capabilities [<xref ref-type="bibr" rid="ref-2">2</xref>]. However, the rapid increase in interconnected devices has also intensified the security challenges within IoT networks, resulting in a surge of complex and evolving cyber threats targeting distributed and resource-constrained systems [<xref ref-type="bibr" rid="ref-3">3</xref>].</p>
<p>Traditional centralized security architectures are insufficient for the IoT landscape due to bandwidth limitations, latency concerns, and privacy risks associated with transmitting sensitive data across networks [<xref ref-type="bibr" rid="ref-4">4</xref>]. Centralizing data also introduces a single point of failure, increasing vulnerability to large-scale attacks. Furthermore, the heterogeneity in device capabilities, communication standards, and power constraints necessitates lightweight security mechanisms that can operate effectively at the edge [<xref ref-type="bibr" rid="ref-5">5</xref>].</p>
<p><xref ref-type="fig" rid="fig-1">Fig. 1</xref> presents a conceptual overview of the EdgeTrustX architecture, illustrating the hierarchical interaction between IoT devices, the privacy-preserving aggregation layer, and the global coordination unit. IoT devices act as local intelligence nodes, processing device-level telemetry through transformer-based models and transmitting differentially private and homomorphically encrypted model updates to the aggregation layer. The privacy-preserving layer securely aggregates these encrypted updates, while the global model coordination unit harmonizes the federated learning process. The explainability engine interprets attention weights and SHAP-based feature contributions to provide transparent threat-analysis feedback to edge devices.</p>
<fig id="fig-1">
<label>Figure 1</label>
<caption>
<title>Overview of the EdgeTrustX architecture illustrating the data flow among IoT devices, the privacy-preserving aggregation layer, and the global coordination unit. Each local transformer module processes device-level telemetry from the IoT ecosystem and threat-detection interface, protects model updates with differential privacy and homomorphic encryption, and transmits them to the aggregation layer. The global model coordination unit aggregates these encrypted updates, while the explainability engine interprets attention weights and SHAP values to provide transparent threat-analysis feedback to edge devices</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_73584-fig-1.tif"/>
</fig>
<p>Federated Learning (FL) has emerged as a promising approach to IoT security by enabling collaborative training without exposing raw device data, preserving locality and mitigating privacy concerns [<xref ref-type="bibr" rid="ref-6">6</xref>]. Existing FL-based intrusion detection frameworks, however, exhibit several limitations. These include insufficient modeling of complex threat patterns, vulnerability to advanced adversarial attacks, lack of integrated interpretability, and poor scalability in large heterogeneous IoT deployments [<xref ref-type="bibr" rid="ref-7">7</xref>]. Studies such as Abd Elaziz et al. [<xref ref-type="bibr" rid="ref-8">8</xref>], Albogami [<xref ref-type="bibr" rid="ref-9">9</xref>], and Khan et al. [<xref ref-type="bibr" rid="ref-10">10</xref>] have demonstrated the potential of FL for distributed threat detection, yet they still struggle with communication overhead, adversarial robustness, and convergence inefficiencies under non-IID data. Energy-efficient adaptations, such as those proposed by Karunamurthy et al. [<xref ref-type="bibr" rid="ref-11">11</xref>], address computational constraints but do not fully resolve scalability or explainability concerns.</p>
<p>Transformer architectures have shown remarkable success in domains such as natural language processing and computer vision due to their ability to learn long-range dependencies through self-attention mechanisms [<xref ref-type="bibr" rid="ref-12">12</xref>]. Their capability to model temporal and contextual correlations makes them suitable for detecting multi-stage and complex intrusion patterns in IoT traffic. Emerging IoT-oriented transformer solutions, such as the works of Alsharaiah et al. [<xref ref-type="bibr" rid="ref-13">13</xref>], Song and Ma [<xref ref-type="bibr" rid="ref-14">14</xref>], and Saghir et al. [<xref ref-type="bibr" rid="ref-15">15</xref>], have demonstrated improved detection accuracy and interpretability. However, deploying transformers in federated IoT environments remains challenging due to device-level computational constraints, communication limitations, and the need for formal privacy guarantees [<xref ref-type="bibr" rid="ref-16">16</xref>].</p>
<p>Explainability has become a critical requirement for IoT security operations, where analysts must understand model decisions to validate threat classifications and enforce timely countermeasures. SHAP-based and attention-visualization techniques have been successfully applied to enhance visibility in threat detection frameworks [<xref ref-type="bibr" rid="ref-17">17</xref>]. Nonetheless, current explainable models often function as post-hoc mechanisms rather than being embedded directly into federated optimization pipelines, raising concerns about potential information leakage through interpretability outputs [<xref ref-type="bibr" rid="ref-18">18</xref>].</p>
<p>The importance of privacy-preserving learning mechanisms has been emphasized in recent IoT and 5G security surveys, which highlight the necessity of maintaining confidentiality in large-scale, heterogeneous device environments [<xref ref-type="bibr" rid="ref-19">19</xref>]. Communication-efficient decentralized approaches such as DIGEST improve scalability through optimized local updates and reduced synchronization overhead [<xref ref-type="bibr" rid="ref-20">20</xref>]. In parallel, transformer-driven and large language model (LLM)-based intrusion detection systems have demonstrated 4%&#x2013;6% improvements in detection accuracy by capturing long-range attack dependencies and threat semantics [<xref ref-type="bibr" rid="ref-21">21</xref>]. Additional studies focusing on cross-domain and heterogeneous IoT environments reinforce the value of attention-based deep networks for identifying high-impact anomalies [<xref ref-type="bibr" rid="ref-22">22</xref>,<xref ref-type="bibr" rid="ref-23">23</xref>].</p>
<p>Building on these advances, this paper introduces EdgeTrustX, a privacy-aware federated transformer architecture explicitly designed for scalable and explainable IoT threat detection. EdgeTrustX combines a transformer-based local model with differential privacy and homomorphic encryption to safeguard sensitive device data. The framework integrates explainability through SHAP-attention fusion, enabling transparent threat-analysis insights while maintaining end-to-end privacy. By combining federated learning, privacy-preserving computation, and explainable AI, EdgeTrustX addresses long-standing challenges related to security, privacy, scalability, and interpretability in distributed IoT environments.</p>
<p>EdgeTrustX incorporates a multi-head attention mechanism adapted for federated environments, in which attention weights are computed locally on individual devices and integrated with privacy-preserving protocols. Privacy protection is strengthened by applying differential privacy to model updates and homomorphic encryption to secure aggregation. In addition, an explainability module combines attention visualisation with SHAP-based feature importance analysis to improve the interpretability of threat detection decisions.</p>
<p>Large-scale experiments on four real-world IoT datasets show that EdgeTrustX achieves a threat detection accuracy of 94.7%, exceeding privacy-preserving baselines by up to 8%, and offers 23% better scalability than centralised transformer-based models.</p>
<p>Existing approaches such as FedAvg and DP-Fed offer only basic privacy protection and scale poorly in heterogeneous IoT systems, whereas HE-Fed incurs a high computational cost. Transformer-XAI is highly accurate but centralised, which exposes sensitive device data. To overcome these weaknesses, EdgeTrustX combines transformer-based models with formal privacy guarantees, scalability improvements, and explainability.</p>
<p>Existing federated learning frameworks often struggle with communication inefficiency, privacy leakage, and reduced convergence when working with non-IID data. Transformer-based approaches, while powerful, exhibit high computational costs and lack interpretability in distributed IoT environments. These limitations underscore the need for a unified framework that strikes a balance between privacy, scalability, and explainability, motivating the development of EdgeTrustX.</p>
<p>The main contributions of this study are summarised as follows:
<list list-type="simple">
<list-item><label>(1)</label><p><bold>Privacy-Aware Federated Transformer Architecture:</bold> A novel transformer-based neural network architecture is presented for federated IoT environments, incorporating privacy-preserving mechanisms at multiple levels while maintaining high detection accuracy.</p></list-item>
<list-item><label>(2)</label><p><bold>Scalable Privacy-Preserving Aggregation Protocol:</bold> A sophisticated aggregation protocol is designed by combining differential privacy with homomorphic encryption, providing formal privacy guarantees and ensuring scalability across heterogeneous IoT networks.</p></list-item>
<list-item><label>(3)</label><p><bold>Explainable Threat Detection Module:</bold> A comprehensive explainability framework is integrated, combining attention visualisation, SHAP analysis, and feature-importance ranking to produce interpretable insights into threat detection decisions.</p></list-item>
<list-item><label>(4)</label><p><bold>Comprehensive Evaluation and Analysis:</bold> Extensive experiments are conducted on real-world IoT datasets, demonstrating superior performance in terms of accuracy, privacy preservation, scalability, and explainability compared with existing approaches.</p></list-item>
</list></p>
<p>The proposed framework is particularly relevant to industrial IoT, smart healthcare, and critical infrastructure monitoring, where privacy-preserving and interpretable threat detection is vital for operational trust and regulatory compliance.</p>
<p>The remainder of this paper is organised as follows: <xref ref-type="sec" rid="s2">Section 2</xref> provides a comprehensive review of related work in federated learning, transformer networks, and IoT security. <xref ref-type="sec" rid="s3">Section 3</xref> presents the EdgeTrustX framework, including the problem formulation, architecture design, and algorithmic details. <xref ref-type="sec" rid="s4">Section 4</xref> presents experimental results and comparative analysis. Finally, <xref ref-type="sec" rid="s6">Section 6</xref> concludes the paper and discusses future research directions.</p>
</sec>
<sec id="s2">
<label>2</label>
<title>Literature Review</title>
<sec id="s2_1">
<label>2.1</label>
<title>IoT Security Landscape and Emerging Threats</title>
<p>The Internet of Things (IoT) ecosystem continues to expand rapidly, with billions of interconnected devices deployed across healthcare, transportation, industrial automation, smart grids, and consumer environments. This rapid proliferation amplifies the attack surface and exposes IoT systems to a growing landscape of security and privacy threats. Recent studies highlight that the distributed architecture, device heterogeneity, limited computational capability, and lack of standardized security protocols contribute to widespread vulnerabilities in IoT deployments [<xref ref-type="bibr" rid="ref-12">12</xref>]. Traditional centralized threat detection mechanisms are unsuitable for modern IoT environments due to latency constraints, communication overhead, and privacy concerns arising from centralized data aggregation [<xref ref-type="bibr" rid="ref-19">19</xref>]. Consequently, emerging security solutions must ensure accuracy, low latency, scalability, privacy preservation, and explainability to secure next-generation IoT ecosystems [<xref ref-type="bibr" rid="ref-20">20</xref>].</p>
</sec>
<sec id="s2_2">
<label>2.2</label>
<title>Federated Learning for Distributed IoT Threat Detection</title>
<p>Federated Learning (FL) provides a decentralized learning mechanism that preserves data locality and secures privacy by enabling collaborative model training across heterogeneous devices without transferring raw data [<xref ref-type="bibr" rid="ref-4">4</xref>]. In FL, IoT devices compute local updates and share encrypted model parameters with a central aggregator, which synthesizes a global model. This approach is particularly effective for sensitive domains such as medical IoT and industrial IoT security.</p>
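<p>The aggregation step described above can be illustrated with a minimal sketch of sample-size-weighted federated averaging (FedAvg). It is an illustration under stated assumptions, not the EdgeTrustX aggregation protocol: the function name <monospace>fedavg</monospace>, the flat-list representation of model parameters, and the example values are hypothetical, and the encryption of updates is deliberately omitted so that the averaging logic stays visible.</p>
<preformat preformat-type="code">
```python
def fedavg(client_updates, client_sizes):
    """Return the sample-size-weighted average of client parameter vectors.

    client_updates: list of equal-length lists of floats (one per device).
    client_sizes:   number of local training samples held by each device.
    Both names are hypothetical; updates are plaintext in this sketch.
    """
    if not client_updates or len(client_updates) != len(client_sizes):
        raise ValueError("need one sample count per client update")
    dim = len(client_updates[0])
    if any(len(u) != dim for u in client_updates):
        raise ValueError("all client updates must have the same length")
    total = sum(client_sizes)
    if not total > 0:
        raise ValueError("total sample count must be positive")

    global_params = [0.0] * dim
    for params, n_samples in zip(client_updates, client_sizes):
        weight = n_samples / total  # devices with more data count more
        for k, value in enumerate(params):
            global_params[k] += weight * value
    return global_params


# Two devices: the second holds three times as much data as the first.
updates = [[1.0, 2.0], [3.0, 4.0]]
sizes = [1, 3]
print(fedavg(updates, sizes))  # [2.5, 3.5]
```
</preformat>
<p>Only parameter vectors and sample counts reach the aggregator in this sketch; raw device data never leave the device, which is the data-locality property that FL relies on.</p>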
<p>Several recent works demonstrate the potential of FL while revealing important limitations. Abd Elaziz et al. [<xref ref-type="bibr" rid="ref-1">1</xref>] introduced a TabTransformer-driven federated intrusion detection framework optimized via nature-inspired hyperparameter search, but it suffered from high communication overhead caused by frequent synchronization. Albogami [<xref ref-type="bibr" rid="ref-2">2</xref>] implemented a deep FL model with integrated privacy mechanisms suited for edge-IoT systems; however, its resilience against adversarial attacks was limited. Khan et al. [<xref ref-type="bibr" rid="ref-4">4</xref>] proposed a reinforcement-based federated fusion model that improved convergence speed and robustness under non-IID data settings&#x2014;an intrinsic challenge in real-world IoT networks. Karunamurthy et al. [<xref ref-type="bibr" rid="ref-8">8</xref>] focused on reducing energy consumption by optimizing computation and communication loads on low-power IoT devices.</p>
<p>Among the core FL paradigms, horizontal federated learning remains the most suitable for IoT security scenarios because feature spaces are typically consistent across IoT nodes while datasets vary by device. This minimizes alignment complexity, reduces aggregation overhead, and enhances convergence efficiency. Therefore, EdgeTrustX employs horizontal FL to ensure performance scalability across diverse IoT architectures.</p>
</sec>
<sec id="s2_3">
<label>2.3</label>
<title>Transformer-Based Architectures for Threat Analysis</title>
<p>Transformers have demonstrated superior capability in sequence modeling and capturing long-range dependencies through multi-head self-attention mechanisms. These characteristics make them highly effective for analyzing complex network traffic and identifying multi-stage attack patterns [<xref ref-type="bibr" rid="ref-11">11</xref>].</p>
<p>Alsharaiah et al. [<xref ref-type="bibr" rid="ref-9">9</xref>] developed an explainable transformer model for spoofing attack detection in IoMT systems, employing SHAP-based interpretability to highlight critical features. Song and Ma [<xref ref-type="bibr" rid="ref-6">6</xref>] proposed a federated attention-based neural network that integrates attention mechanisms with federated optimization strategies, resulting in improved detection accuracy and reduced communication cost. Saghir et al. [<xref ref-type="bibr" rid="ref-11">11</xref>] employed transformer-based anomaly detection with explainable SHAP-driven transparency to interpret threat classification.</p>
<p>These studies validate that transformers can effectively capture complex threat semantics; however, integrating transformer architectures into federated environments requires solutions that address privacy, resource limitations, and real-time communication efficiency. Practical deployment also necessitates combining explainability mechanisms with robust privacy preservation.</p>
</sec>
<sec id="s2_4">
<label>2.4</label>
<title>Privacy Preservation in Federated IoT Systems</title>
<p>With IoT applications dealing with highly sensitive information&#x2014;including health records, environmental telemetry, and industrial operational data&#x2014;preserving privacy during model training is non-negotiable. Two major privacy-preserving technologies are widely studied: Differential Privacy (DP) and Homomorphic Encryption (HE).</p>
<p>Yuan et al. [<xref ref-type="bibr" rid="ref-21">21</xref>] provided a comprehensive survey on approximate homomorphic encryption for privacy-preserving machine learning, highlighting the trade-offs between computational cost and security strength. Han [<xref ref-type="bibr" rid="ref-20">20</xref>] introduced a blockchain-enhanced privacy-preserving FL system, where HE and optimization techniques jointly secure IoT communication and aggregation processes. Gholami and Seferoglu [<xref ref-type="bibr" rid="ref-13">13</xref>] proposed DIGEST, a decentralized learning method that enhances communication efficiency using local-update strategies, contributing to reduced aggregation overhead in privacy-sensitive environments.</p>
<p>Although both HE and DP offer strong privacy protection, integrating them simultaneously presents challenges in terms of computational efficiency, communication overhead, and model convergence. As highlighted in contemporary IoT FL systems [<xref ref-type="bibr" rid="ref-15">15</xref>&#x2013;<xref ref-type="bibr" rid="ref-18">18</xref>], HE can slow down gradient exchange due to encryption complexity, while DP may reduce model utility when noise levels are high. Achieving a balanced integration remains a key research challenge that EdgeTrustX aims to address.</p>
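<p>The utility cost of DP noted above can be made concrete with a short sketch of the classic Gaussian mechanism applied to a clipped model update. It is an illustration under assumptions, not the calibration used by EdgeTrustX: the value &#x03B4; = 1e-5, the clipping norm of 1.0, and the choice of a sensitivity equal to the clipping norm (add-or-remove adjacency) are assumptions, and only the budget &#x03B5; = 0.1 is taken from this paper. The noise scale follows the standard bound &#x03C3; = C &#x00B7; sqrt(2 ln(1.25/&#x03B4;)) / &#x03B5;, which holds for &#x03B5; strictly between 0 and 1.</p>
<preformat preformat-type="code">
```python
import math
import random


def clip_by_l2_norm(update, clip_norm):
    """Scale the update down so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(v * v for v in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [v * scale for v in update]


def gaussian_sigma(epsilon, delta, sensitivity):
    """Noise scale of the classic Gaussian mechanism (epsilon in (0, 1))."""
    if not (1 > epsilon > 0) or not (1 > delta > 0):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def privatize(update, clip_norm, epsilon, delta, rng):
    """Clip the update, then add independent Gaussian noise per coordinate."""
    # Assumption: L2 sensitivity equals the clipping norm.
    sigma = gaussian_sigma(epsilon, delta, clip_norm)
    clipped = clip_by_l2_norm(update, clip_norm)
    return [v + rng.gauss(0.0, sigma) for v in clipped]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
update = [3.0, 4.0]     # hypothetical local update with L2 norm 5
for eps in (0.1, 0.9):
    sigma = gaussian_sigma(eps, 1e-5, 1.0)
    noisy = privatize(update, 1.0, eps, 1e-5, rng)
    print(f"epsilon={eps}: sigma={sigma:.2f}, noisy update={noisy}")
```
</preformat>
<p>Under these assumptions, &#x03B5; = 0.1 gives a noise scale of roughly 48 times the clipping norm, so a single update is dominated by noise. Averaging independent noisy updates from n devices reduces the noise standard deviation in the mean by a factor of sqrt(n), which is one reason the privacy-utility balance depends on the number of participating devices.</p>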
</sec>
<sec id="s2_5">
<label>2.5</label>
<title>Explainable Artificial Intelligence for Security Operations</title>
<p>Explainable AI (XAI) enables visibility into threat detection decision-making processes, significantly improving trust, analyst interpretability, and operational transparency in IoT security systems. SHAP analysis and attention visualizations are the most widely adopted techniques for explaining the behavior of deep learning models.</p>
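<p>SHAP attributions are based on Shapley values from cooperative game theory. The sketch below computes exact Shapley values by enumerating feature coalitions for a toy linear scorer. It does not use the SHAP library, and the weights, inputs, and baseline are hypothetical. Exhaustive enumeration grows exponentially with the number of features, so it is practical only for small examples such as this one.</p>
<preformat preformat-type="code">
```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values for a set function over feature indices."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = (
                factorial(size)
                * factorial(n_features - size - 1)
                / factorial(n_features)
            )
            for subset in combinations(others, size):
                coalition = frozenset(subset)
                gain = value_fn(coalition | {i}) - value_fn(coalition)
                phi[i] += weight * gain
    return phi


# Hypothetical linear anomaly scorer: features outside the coalition
# are replaced by their baseline (reference) values.
weights = [2.0, -1.0, 0.5]
x = [1.0, 3.0, 4.0]
baseline = [0.0, 1.0, 2.0]


def score_with(coalition):
    return sum(
        w * (x[j] if j in coalition else baseline[j])
        for j, w in enumerate(weights)
    )


phi = shapley_values(score_with, len(weights))
print([round(p, 6) for p in phi])  # [2.0, -2.0, 1.0]
```
</preformat>
<p>For a linear scorer, each attribution equals the weight multiplied by the deviation of the feature from its baseline, and the attributions sum to the difference between the score of the input and the score of the baseline.</p>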
<p>Alabbadi and Bajaber [<xref ref-type="bibr" rid="ref-14">14</xref>] used SHAP-based interpretability in an IoT intrusion detection system, improving analyst situational awareness. Rampone et al. [<xref ref-type="bibr" rid="ref-19">19</xref>] integrated explainability features into a hybrid FL framework to support distributed, privacy-preserving intrusion detection with transparent anomaly interpretation. In addition, Alsharaiah et al. [<xref ref-type="bibr" rid="ref-9">9</xref>] and Saghir et al. [<xref ref-type="bibr" rid="ref-11">11</xref>] demonstrated successful use of attention-driven models that highlight significant features involved in threat prediction.</p>
<p>Despite these contributions, current XAI techniques often operate as post-hoc add-ons rather than being architecturally integrated. Furthermore, explanation outputs may expose sensitive information, posing a risk of privacy leakage in federated environments, an issue that remains largely unaddressed in the existing literature. Achieving privacy preservation and meaningful interpretability simultaneously therefore remains an open research need.</p>
</sec>
<sec id="s2_6">
<label>2.6</label>
<title>Research Gaps and Motivations</title>
<p>Despite advancements across IoT security, federated learning, transformers, and explainable AI, several key gaps remain unaddressed:
<list list-type="simple">
<list-item><label>1.</label><p><bold>Underutilization of transformer architectures in federated IoT environments:</bold> Existing studies lack a unified approach to leveraging self-attention for distributed sequence modeling under privacy constraints.</p></list-item>
<list-item><label>2.</label><p><bold>Insufficient formal privacy guarantees:</bold> Most frameworks fail to provide differential privacy proofs or effective defenses against advanced threats such as gradient inversion or membership inference.</p></list-item>
<list-item><label>3.</label><p><bold>Lack of integrated explainability mechanisms:</bold> Current approaches apply XAI as post-processing rather than embedding it directly within federated optimization pipelines, leaving risks of information leakage.</p></list-item>
<list-item><label>4.</label><p><bold>Limited scalability evaluation:</bold> Most FL-based IDS solutions are tested on small-scale IoT networks with limited device diversity, overlooking performance variability in large heterogeneous deployments.</p></list-item>
</list></p>
<p>EdgeTrustX addresses these gaps by integrating transformer-based local models with homomorphic encryption and formal differential privacy guarantees, alongside a SHAP-attention interpretability module. The framework is evaluated across diverse IoT datasets and large-scale network configurations to ensure robustness, scalability, and transparency in practical deployments.</p>
<p><xref ref-type="table" rid="table-1">Table 1</xref> summarises findings derived from an extensive review of more than 20 recent research contributions (2022&#x2013;2025) on IoT threat detection, privacy-preserving federated learning, and explainable AI frameworks. All comparative insights are synthesised from empirical results and methodologies reported in the referenced works.</p>
<table-wrap id="table-1">
<label>Table 1</label>
<caption>
<title>Comparison of related works on federated IoT threat detection (Synthesised from literature, 2022&#x2013;2025)</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
</colgroup>
<thead>
<tr>
<th>Method</th>
<th>Model type</th>
<th>Privacy technique</th>
<th>Explainability</th>
<th>Scalability</th>
<th>Limitations/Remarks</th>
</tr>
</thead>
<tbody>
<tr>
<td>Abd Elaziz et al. [<xref ref-type="bibr" rid="ref-1">1</xref>]</td>
<td>TabTransformer &#x002B; FL</td>
<td>None</td>
<td>No</td>
<td>&#x007E;50 devices</td>
<td>High communication overhead; limited adversarial robustness</td>
</tr>
<tr>
<td>Albogami [<xref ref-type="bibr" rid="ref-2">2</xref>]</td>
<td>Deep FL</td>
<td>Differential Privacy (DP)</td>
<td>No</td>
<td>&#x007E;100 devices</td>
<td>Vulnerable to advanced adversarial attacks</td>
</tr>
<tr>
<td>Saraladeve et al. [<xref ref-type="bibr" rid="ref-3">3</xref>]</td>
<td>Hybrid deep model (CNN&#x2013;BiLSTM)</td>
<td>None</td>
<td>No</td>
<td>&#x007E;50 devices</td>
<td>No privacy-preserving mechanism</td>
</tr>
<tr>
<td>Khan et al. [<xref ref-type="bibr" rid="ref-4">4</xref>]</td>
<td>Federated reinforcement fusion</td>
<td>FL-based secure aggregation</td>
<td>No</td>
<td>&#x007E;150 devices</td>
<td>Complex tuning under non-IID data</td>
</tr>
<tr>
<td>Park et al. [<xref ref-type="bibr" rid="ref-5">5</xref>]</td>
<td>PoAh-enabled FL</td>
<td>FL-based secure consensus</td>
<td>No</td>
<td>&#x007E;120 devices</td>
<td>High synchronization dependency</td>
</tr>
<tr>
<td>Song &#x0026; Ma [<xref ref-type="bibr" rid="ref-6">6</xref>]</td>
<td>Federated attention neural network</td>
<td>DP</td>
<td>Attention Visualization</td>
<td>&#x007E;100 devices</td>
<td>Lacks formal explainability methods</td>
</tr>
<tr>
<td>Hamdi [<xref ref-type="bibr" rid="ref-7">7</xref>]</td>
<td>Federated IDS</td>
<td>None</td>
<td>No</td>
<td>&#x007E;80 devices</td>
<td>High false positives in heterogeneous networks</td>
</tr>
<tr>
<td>Karunamurthy et al. [<xref ref-type="bibr" rid="ref-8">8</xref>]</td>
<td>Lightweight FL model</td>
<td>None</td>
<td>No</td>
<td>&#x007E;100 devices</td>
<td>Energy-efficient but limited modeling capability</td>
</tr>
<tr>
<td>Alsharaiah et al. [<xref ref-type="bibr" rid="ref-9">9</xref>]</td>
<td>Transformer</td>
<td>DP</td>
<td>SHAP &#x002B; Attention</td>
<td>&#x007E;150 devices</td>
<td>Explainability limited to IoMT context</td>
</tr>
<tr>
<td>Al-Halboosi et al. [<xref ref-type="bibr" rid="ref-10">10</xref>]</td>
<td>Hybrid transformer</td>
<td>None</td>
<td>No</td>
<td>&#x007E;120 devices</td>
<td>High computation on local devices</td>
</tr>
<tr>
<td>Saghir et al. [<xref ref-type="bibr" rid="ref-11">11</xref>]</td>
<td>Explainable transformer-based IDS</td>
<td>None</td>
<td>SHAP</td>
<td>&#x007E;130 devices</td>
<td>Post-hoc XAI may leak sensitive features</td>
</tr>
<tr>
<td>Ahmed et al. [<xref ref-type="bibr" rid="ref-12">12</xref>]</td>
<td>Survey on IoT security</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td>
<td>Not an implementation&#x2014;survey only</td>
</tr>
<tr>
<td>Gholami &#x0026; Seferoglu [<xref ref-type="bibr" rid="ref-13">13</xref>]</td>
<td>DIGEST decentralized FL</td>
<td>Local-Update Mechanism</td>
<td>No</td>
<td>200&#x002B; devices</td>
<td>Not specifically intrusion-focused</td>
</tr>
<tr>
<td>Alabbadi &#x0026; Bajaber [<xref ref-type="bibr" rid="ref-14">14</xref>]</td>
<td>XAI-based IDS</td>
<td>None</td>
<td>SHAP</td>
<td>&#x007E;90 devices</td>
<td>No integrated privacy-preserving technique</td>
</tr>
<tr>
<td>Sorour et al. [<xref ref-type="bibr" rid="ref-15">15</xref>]</td>
<td>LSTM-JSO Federated Framework</td>
<td>DP</td>
<td>No</td>
<td>&#x007E;100 devices</td>
<td>Requires heavy hyperparameter tuning</td>
</tr>
<tr>
<td>Agbor et al. [<xref ref-type="bibr" rid="ref-16">16</xref>]</td>
<td>CNN&#x2013;BiLSTM&#x2013;DNN Hybrid</td>
<td>None</td>
<td>No</td>
<td>&#x007E;80 devices</td>
<td>No privacy; high computational cost</td>
</tr>
<tr>
<td>Danquah et al. [<xref ref-type="bibr" rid="ref-17">17</xref>]</td>
<td>Optimized deep FL</td>
<td>None</td>
<td>No</td>
<td>&#x007E;150 devices</td>
<td>Feature-selection overhead</td>
</tr>
<tr>
<td>Wen et al. [<xref ref-type="bibr" rid="ref-18">18</xref>]</td>
<td>Dynamic weighted asynchronous FL</td>
<td>None</td>
<td>No</td>
<td>&#x007E;180 devices</td>
<td>Asynchronous updates introduce convergence complexity</td>
</tr>
<tr>
<td>Rampone et al. [<xref ref-type="bibr" rid="ref-19">19</xref>]</td>
<td>Hybrid FL framework</td>
<td>HE &#x002B; DP</td>
<td>SHAP</td>
<td>&#x007E;180 devices</td>
<td>Requires large-scale tuning</td>
</tr>
<tr>
<td>Han [<xref ref-type="bibr" rid="ref-20">20</xref>]</td>
<td>Blockchain-based FL</td>
<td>HE</td>
<td>No</td>
<td>&#x007E;120 devices</td>
<td>Blockchain overhead increases latency</td>
</tr>
<tr>
<td>Yuan et al. [<xref ref-type="bibr" rid="ref-21">21</xref>]</td>
<td>HE survey</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td>
<td>Survey only&#x2014;no IDS implementation</td>
</tr>
<tr>
<td>Kheddar [<xref ref-type="bibr" rid="ref-22">22</xref>]</td>
<td>Transformer/LLM IDS Survey</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td>
<td>Survey&#x2014;no FL integration</td>
</tr>
<tr>
<td>Gueriani et al. [<xref ref-type="bibr" rid="ref-23">23</xref>]</td>
<td>BiGRU&#x2013;LSTM&#x2013;Attention IDS</td>
<td>None</td>
<td>Attention</td>
<td>&#x007E;140 devices</td>
<td>No privacy or FL mechanisms</td>
</tr>
<tr>
<td><bold>EdgeTrustX (Ours)</bold></td>
<td>Transformer &#x002B; Federated learning</td>
<td>DP &#x002B; HE</td>
<td>SHAP &#x002B; Attention Fusion</td>
<td><bold>200&#x002B; devices</bold></td>
<td>Minimal encryption overhead; high interpretability</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="table-1fn1" fn-type="other">
<p>Note: &#x201C;Explainability&#x201D; is evaluated qualitatively based on the integration of interpretability modules such as attention visualisation and SHAP analysis. &#x201C;Scalability&#x201D; is assessed quantitatively by the number of supported clients and measured communication overhead during training. Bold values indicate the proposed method (EdgeTrustX) or best-performing results for comparison purposes.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<p>As shown in <xref ref-type="table" rid="table-1">Table 1</xref>, most existing frameworks fail to simultaneously achieve high detection accuracy, privacy guarantees, explainability, and scalability. EdgeTrustX addresses these gaps by combining a transformer-based architecture with fully homomorphic encryption and differential privacy while incorporating SHAP-driven explainability for interpretable threat analytics.</p>
</sec>
</sec>
<sec id="s3">
<label>3</label>
<title>Methodology</title>
<p>This section details the methodology behind EdgeTrustX: the problem formulation, the system architecture, the mathematical basis, and the algorithmic implementation. It begins with the problem definition and then describes each component of the proposed framework.</p>
<sec id="s3_1">
<label>3.1</label>
<title>Problem Formulation</title>
<p>Consider a federated IoT network consisting of <inline-formula id="ieqn-1"><mml:math id="mml-ieqn-1"><mml:mi>N</mml:mi></mml:math></inline-formula> heterogeneous devices <inline-formula id="ieqn-2"><mml:math id="mml-ieqn-2"><mml:mrow><mml:mi mathvariant="script">D</mml:mi></mml:mrow><mml:mo>=</mml:mo><mml:mo fence="false" stretchy="false">{</mml:mo><mml:msub><mml:mi>D</mml:mi><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>D</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mi>D</mml:mi><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo fence="false" stretchy="false">}</mml:mo></mml:math></inline-formula>, where each device <inline-formula id="ieqn-3"><mml:math id="mml-ieqn-3"><mml:msub><mml:mi>D</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> maintains a local dataset <inline-formula id="ieqn-4"><mml:math id="mml-ieqn-4"><mml:msub><mml:mrow><mml:mi mathvariant="script">X</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mo fence="false" stretchy="false">{</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:msubsup><mml:mo fence="false" stretchy="false">}</mml:mo><mml:mrow><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msubsup></mml:math></inline-formula>. 
Here, <inline-formula id="ieqn-5"><mml:math id="mml-ieqn-5"><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>&#x2208;</mml:mo><mml:msup><mml:mrow><mml:mi mathvariant="double-struck">R</mml:mi></mml:mrow><mml:mrow><mml:mi>d</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> is the <inline-formula id="ieqn-6"><mml:math id="mml-ieqn-6"><mml:mi>j</mml:mi></mml:math></inline-formula>-th feature vector of dimension <inline-formula id="ieqn-7"><mml:math id="mml-ieqn-7"><mml:mi>d</mml:mi></mml:math></inline-formula> on device <inline-formula id="ieqn-8"><mml:math id="mml-ieqn-8"><mml:msub><mml:mi>D</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>, and <inline-formula id="ieqn-9"><mml:math id="mml-ieqn-9"><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>&#x2208;</mml:mo><mml:mo fence="false" stretchy="false">{</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo fence="false" stretchy="false">}</mml:mo></mml:math></inline-formula> is the corresponding binary label indicating whether the sample is a threat (1) or benign behaviour (0). The total number of samples on device <inline-formula id="ieqn-10"><mml:math id="mml-ieqn-10"><mml:msub><mml:mi>D</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> is <inline-formula id="ieqn-11"><mml:math id="mml-ieqn-11"><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>.</p>
<p>The objective of EdgeTrustX is to learn a global threat detection model <inline-formula id="ieqn-12"><mml:math id="mml-ieqn-12"><mml:msub><mml:mi>f</mml:mi><mml:mrow><mml:mi>&#x03B8;</mml:mi></mml:mrow></mml:msub><mml:mo>&#x003A;</mml:mo><mml:msup><mml:mrow><mml:mi mathvariant="double-struck">R</mml:mi></mml:mrow><mml:mrow><mml:mi>d</mml:mi></mml:mrow></mml:msup><mml:mo stretchy="false">&#x2192;</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula> parameterised by <inline-formula id="ieqn-13"><mml:math id="mml-ieqn-13"><mml:mi>&#x03B8;</mml:mi></mml:math></inline-formula> that minimises the global loss function:
<disp-formula id="eqn-1"><label>(1)</label><mml:math id="mml-eqn-1" display="block"><mml:msub><mml:mrow><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mtext>global</mml:mtext></mml:mrow></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03B8;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:munderover><mml:mfrac><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mrow><mml:mtext>total</mml:mtext></mml:mrow></mml:mrow></mml:msub></mml:mfrac><mml:msub><mml:mrow><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03B8;</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula>here, <inline-formula id="ieqn-14"><mml:math id="mml-ieqn-14"><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mrow><mml:mtext>total</mml:mtext></mml:mrow></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:munderover><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> is the total number of data samples, ensuring proportional contribution from each client. For binary classification tasks, the local loss <inline-formula id="ieqn-15"><mml:math id="mml-ieqn-15"><mml:msub><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03B8;</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is computed using the cross-entropy function:
<disp-formula id="eqn-2"><label>(2)</label><mml:math id="mml-eqn-2" display="block"><mml:msub><mml:mrow><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03B8;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mfrac><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:munderover><mml:mrow><mml:mi>&#x2113;</mml:mi></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>f</mml:mi><mml:mrow><mml:mi>&#x03B8;</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula></p>
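<p>As an illustration of Eqs. (1) and (2), the following minimal sketch evaluates the weighted global loss for a toy federation. It assumes the per-sample loss is binary cross-entropy, as stated above; the function names and the toy predictions are hypothetical and are not part of EdgeTrustX.</p>
<code language="python"><![CDATA[
```python
import math


def binary_cross_entropy(p, y, eps=1e-12):
    """Per-sample loss l(p, y) for a prediction p in [0, 1] and a label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def local_loss(predictions, labels):
    """Eq. (2): mean per-sample loss over the n_i samples of one device."""
    return sum(binary_cross_entropy(p, y) for p, y in zip(predictions, labels)) / len(labels)


def global_loss(per_device):
    """Eq. (1): local losses weighted by n_i / n_total.

    per_device is a list of (predictions, labels) pairs, one pair per device.
    """
    n_total = sum(len(labels) for _, labels in per_device)
    return sum(len(labels) / n_total * local_loss(preds, labels)
               for preds, labels in per_device)


# Hypothetical toy federation: device 1 holds two samples, device 2 holds one.
devices = [([0.9, 0.2], [1, 0]), ([0.6], [1])]
print(global_loss(devices))
```
]]></code>
<p>Because each local loss is a mean and the weights are proportional to the local sample counts, the weighted sum equals the mean loss over the pooled data, which is the sense in which each client contributes proportionally.</p>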
<p>The learning process must satisfy the following constraints:</p>
<p><bold>Privacy Constraint:</bold> To ensure formal privacy guarantees, EdgeTrustX adopts <inline-formula id="ieqn-16"><mml:math id="mml-ieqn-16"><mml:mrow><mml:mo>(</mml:mo><mml:mo>&#x03F5;</mml:mo><mml:mo>,</mml:mo><mml:mi>&#x03B4;</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>-differential privacy. Two datasets <inline-formula id="ieqn-17"><mml:math id="mml-ieqn-17"><mml:mrow><mml:mi mathvariant="script">X</mml:mi></mml:mrow></mml:math></inline-formula> and <inline-formula id="ieqn-18"><mml:math id="mml-ieqn-18"><mml:msup><mml:mrow><mml:mi mathvariant="script">X</mml:mi></mml:mrow><mml:mrow><mml:mo>&#x2032;</mml:mo></mml:mrow></mml:msup></mml:math></inline-formula> are defined as adjacent if they differ by at most one record. The randomised mechanism <inline-formula id="ieqn-19"><mml:math id="mml-ieqn-19"><mml:mrow><mml:mi>&#x1D49C;</mml:mi></mml:mrow></mml:math></inline-formula> satisfies <inline-formula id="ieqn-20"><mml:math id="mml-ieqn-20"><mml:mrow><mml:mo>(</mml:mo><mml:mo>&#x03F5;</mml:mo><mml:mo>,</mml:mo><mml:mi>&#x03B4;</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>-DP if:
<disp-formula id="eqn-3"><label>(3)</label><mml:math id="mml-eqn-3" display="block"><mml:mo movablelimits="true" form="prefix">Pr</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mrow><mml:mi>&#x1D49C;</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mrow><mml:mi mathvariant="script">X</mml:mi></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2208;</mml:mo><mml:mi>S</mml:mi><mml:mo>]</mml:mo></mml:mrow><mml:mo>&#x2264;</mml:mo><mml:msup><mml:mi>e</mml:mi><mml:mrow><mml:mo>&#x03F5;</mml:mo></mml:mrow></mml:msup><mml:mo movablelimits="true" form="prefix">Pr</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mrow><mml:mi>&#x1D49C;</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:msup><mml:mrow><mml:mrow><mml:mi mathvariant="script">X</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mo>&#x2032;</mml:mo></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2208;</mml:mo><mml:mi>S</mml:mi><mml:mo>]</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:mi>&#x03B4;</mml:mi></mml:math></disp-formula>here, <italic>S</italic> ranges over all measurable sets of outputs, and <inline-formula id="ieqn-21"><mml:math id="mml-ieqn-21"><mml:mo>&#x03F5;</mml:mo></mml:math></inline-formula> controls the privacy budget: a smaller <inline-formula id="ieqn-22"><mml:math id="mml-ieqn-22"><mml:mo>&#x03F5;</mml:mo></mml:math></inline-formula> enforces stronger privacy. The parameter <inline-formula id="ieqn-23"><mml:math id="mml-ieqn-23"><mml:mi>&#x03B4;</mml:mi></mml:math></inline-formula> bounds the probability with which this multiplicative bound may fail. In EdgeTrustX, Gaussian noise <inline-formula id="ieqn-24"><mml:math id="mml-ieqn-24"><mml:mrow><mml:mi mathvariant="script">N</mml:mi></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:msup><mml:mi>&#x03C3;</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is added to gradients before secure aggregation to satisfy this constraint.</p>
<p><bold>Communication Constraint:</bold> To optimise communication efficiency, a constraint is defined on the total transmitted model updates across <inline-formula id="ieqn-25"><mml:math id="mml-ieqn-25"><mml:mi>T</mml:mi></mml:math></inline-formula> training rounds and <inline-formula id="ieqn-26"><mml:math id="mml-ieqn-26"><mml:mi>N</mml:mi></mml:math></inline-formula> clients. Instead of using the <inline-formula id="ieqn-27"><mml:math id="mml-ieqn-27"><mml:msub><mml:mi>&#x2113;</mml:mi><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula>-norm, which counts non-zero parameters, the <inline-formula id="ieqn-28"><mml:math id="mml-ieqn-28"><mml:msub><mml:mi>&#x2113;</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula>-norm is adopted to measure the magnitude of updates <inline-formula id="ieqn-29"><mml:math id="mml-ieqn-29"><mml:mrow><mml:mi mathvariant="normal">&#x0394;</mml:mi></mml:mrow><mml:msubsup><mml:mi>&#x03B8;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup></mml:math></inline-formula>, making the communication cost differentiable and better aligned with federated optimisation practices:
<disp-formula id="eqn-4"><label>(4)</label><mml:math id="mml-eqn-4" display="block"><mml:msub><mml:mrow><mml:mrow><mml:mi mathvariant="script">C</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mtext>comm</mml:mtext></mml:mrow></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:munderover><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:munderover><mml:mo>&#x2225;</mml:mo><mml:mi mathvariant="normal">&#x0394;</mml:mi><mml:msubsup><mml:mi>&#x03B8;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:msub><mml:mo>&#x2225;</mml:mo><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo>&#x2264;</mml:mo><mml:msub><mml:mi>C</mml:mi><mml:mrow><mml:mo movablelimits="true" form="prefix">max</mml:mo></mml:mrow></mml:msub><mml:mo>,</mml:mo></mml:math></disp-formula>here, <inline-formula id="ieqn-30"><mml:math id="mml-ieqn-30"><mml:mo stretchy="false">&#x2225;</mml:mo><mml:mrow><mml:mi mathvariant="normal">&#x0394;</mml:mi></mml:mrow><mml:msubsup><mml:mi>&#x03B8;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:msub><mml:mo stretchy="false">&#x2225;</mml:mo><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> measures the <inline-formula id="ieqn-31"><mml:math id="mml-ieqn-31"><mml:msub><mml:mi>&#x2113;</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula>-norm of local parameter updates sent by client <inline-formula id="ieqn-32"><mml:math id="mml-ieqn-32"><mml:mi>i</mml:mi></mml:math></inline-formula> during round <inline-formula 
id="ieqn-33"><mml:math id="mml-ieqn-33"><mml:mi>t</mml:mi></mml:math></inline-formula>, and <inline-formula id="ieqn-34"><mml:math id="mml-ieqn-34"><mml:msub><mml:mi>C</mml:mi><mml:mrow><mml:mo movablelimits="true" form="prefix">max</mml:mo></mml:mrow></mml:msub></mml:math></inline-formula> represents the maximum allowable communication budget.</p>
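<p>The budget check in Eq. (4) can be sketched as follows. The nested-list layout of the updates and the helper names are assumptions made for illustration only; the source does not specify how updates are stored or how the budget is enforced.</p>
<code language="python"><![CDATA[
```python
import math


def l2_norm(vec):
    """Euclidean norm of one flattened parameter update."""
    return math.sqrt(sum(v * v for v in vec))


def communication_cost(updates):
    """Eq. (4): sum over rounds t and clients i of the l2-norm of each update.

    updates[t][i] is the update vector sent by client i in round t.
    """
    return sum(l2_norm(u) for round_updates in updates for u in round_updates)


def within_budget(updates, c_max):
    """True when the accumulated cost does not exceed the budget C_max."""
    return communication_cost(updates) <= c_max


# Hypothetical toy run: T = 2 rounds, N = 2 clients.
updates = [
    [[3.0, 4.0], [0.0, 0.0]],  # round 1: norms 5 and 0
    [[6.0, 8.0], [1.0, 0.0]],  # round 2: norms 10 and 1
]
cost = communication_cost(updates)
print(cost, within_budget(updates, c_max=20.0))
```
]]></code>
<p>In this toy run the four norms are 5, 0, 10, and 1, so the accumulated cost is 16 and a budget of 20 is respected.</p>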
<p><bold>Explainability Constraint:</bold> The model must provide interpretable explanations for its decisions through attention weights and feature importance scores.</p>
</sec>
<sec id="s3_2">
<label>3.2</label>
<title>EdgeTrustX Architecture</title>
<p><xref ref-type="fig" rid="fig-2">Fig. 2</xref> illustrates the comprehensive architecture of EdgeTrustX, which consists of five main components: (1) Local Transformer Modules, (2) Privacy-Preserving Aggregation Layer, (3) Global Model Coordination, (4) Explainability Engine, and (5) Threat Detection Interface.</p>
<fig id="fig-2">
<label>Figure 2</label>
<caption>
<title>Overall architecture of EdgeTrustX framework showing the integration of federated learning, transformer networks, privacy preservation, and explainability components</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_73584-fig-2.tif"/>
</fig>
<sec id="s3_2_1">
<label>3.2.1</label>
<title>Local Transformer Module</title>
<p>Each IoT device <inline-formula id="ieqn-35"><mml:math id="mml-ieqn-35"><mml:msub><mml:mi>D</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> implements a local transformer module that processes device-specific data while maintaining privacy. The transformer architecture consists of multi-head self-attention layers followed by position-wise feed-forward networks.</p>
<p>The multi-head attention mechanism is defined as:
<disp-formula id="eqn-5"><label>(5)</label><mml:math id="mml-eqn-5" display="block"><mml:mrow><mml:mtext>MultiHead</mml:mtext></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>Q</mml:mi><mml:mo>,</mml:mo><mml:mi>K</mml:mi><mml:mo>,</mml:mo><mml:mi>V</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mtext>Concat</mml:mtext></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mtext>head</mml:mtext><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mtext>head</mml:mtext><mml:mrow><mml:mi>h</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:msup><mml:mi>W</mml:mi><mml:mrow><mml:mi>O</mml:mi></mml:mrow></mml:msup></mml:math></disp-formula>where each attention head is computed as:
<disp-formula id="eqn-6"><label>(6)</label><mml:math id="mml-eqn-6" display="block"><mml:msub><mml:mtext>head</mml:mtext><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mrow><mml:mtext>Attention</mml:mtext></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>Q</mml:mi><mml:msubsup><mml:mi>W</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>Q</mml:mi></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:mi>K</mml:mi><mml:msubsup><mml:mi>W</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>K</mml:mi></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:mi>V</mml:mi><mml:msubsup><mml:mi>W</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>V</mml:mi></mml:mrow></mml:msubsup><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula>The scaled dot-product attention is defined as:
<disp-formula id="eqn-7"><label>(7)</label><mml:math id="mml-eqn-7" display="block"><mml:mrow><mml:mtext>Attention</mml:mtext></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>Q</mml:mi><mml:mo>,</mml:mo><mml:mi>K</mml:mi><mml:mo>,</mml:mo><mml:mi>V</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mtext>softmax</mml:mtext></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mfrac><mml:mrow><mml:mi>Q</mml:mi><mml:msup><mml:mi>K</mml:mi><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msup></mml:mrow><mml:msqrt><mml:msub><mml:mi>d</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub></mml:msqrt></mml:mfrac><mml:mo>)</mml:mo></mml:mrow><mml:mi>V</mml:mi></mml:math></disp-formula>To preserve input-level privacy during multi-head self-attention computation, Gaussian noise is injected into the attention logits, which hinders adversaries from reconstructing sensitive information from intermediate attention maps while maintaining accurate global representations:
<disp-formula id="eqn-8"><label>(8)</label><mml:math id="mml-eqn-8" display="block"><mml:msub><mml:mrow><mml:mover><mml:mi>A</mml:mi><mml:mo stretchy="false">&#x007E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mrow><mml:mtext>softmax</mml:mtext></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mfrac><mml:mrow><mml:msub><mml:mi>q</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:msubsup><mml:mi>k</mml:mi><mml:mrow><mml:mi>j</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msubsup></mml:mrow><mml:msqrt><mml:msub><mml:mi>d</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub></mml:msqrt></mml:mfrac><mml:mo>+</mml:mo><mml:mrow><mml:mrow><mml:mi mathvariant="script">N</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:msup><mml:mi>&#x03C3;</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula>here, <inline-formula id="ieqn-36"><mml:math id="mml-ieqn-36"><mml:msub><mml:mi>q</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula id="ieqn-37"><mml:math id="mml-ieqn-37"><mml:msub><mml:mi>k</mml:mi><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> represent the query and key vectors, <inline-formula id="ieqn-38"><mml:math id="mml-ieqn-38"><mml:msub><mml:mi>d</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> is the key dimension, and <inline-formula id="ieqn-39"><mml:math id="mml-ieqn-39"><mml:mrow><mml:mrow><mml:mi mathvariant="script">N</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:msup><mml:mi>&#x03C3;</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> denotes Gaussian noise with 
variance <inline-formula id="ieqn-40"><mml:math id="mml-ieqn-40"><mml:msup><mml:mi>&#x03C3;</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula> calibrated to the privacy budget <inline-formula id="ieqn-41"><mml:math id="mml-ieqn-41"><mml:mo>&#x03F5;</mml:mo></mml:math></inline-formula>. Provided that the logits have bounded sensitivity, this mechanism yields formal differential privacy guarantees and limits leakage through attention visualisation.</p>
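<p>A minimal single-head sketch of Eqs. (7) and (8) is given below. It is written in plain Python for readability, uses hypothetical toy matrices, and takes the noise scale as a free argument; how that scale is tied to the privacy budget in the actual implementation is not shown here.</p>
<code language="python"><![CDATA[
```python
import math
import random


def softmax(row):
    """Numerically stable softmax over one row of logits."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def noisy_attention(Q, K, V, sigma, rng):
    """Scaled dot-product attention with Gaussian noise added to each logit.

    Q, K, V are lists of row vectors; sigma is the noise standard deviation.
    Returns the attended outputs and the noisy attention weights.
    """
    d_k = len(K[0])
    outputs, weights = [], []
    for q in Q:
        logits = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) + rng.gauss(0.0, sigma)
            for k in K
        ]
        row = softmax(logits)
        weights.append(row)
        outputs.append([sum(w * v[c] for w, v in zip(row, V)) for c in range(len(V[0]))])
    return outputs, weights


# Hypothetical toy inputs: two positions, key dimension 2.
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
out, attn = noisy_attention(Q, K, V, sigma=0.5, rng=random.Random(0))
print(attn)
```
]]></code>
<p>Because the noise is added before the softmax, every row of the noisy attention map still sums to one; setting the noise scale to zero recovers the standard attention of Eq. (7).</p>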
</sec>
<sec id="s3_2_2">
<label>3.2.2</label>
<title>Privacy-Preserving Aggregation Protocol</title>
<p>The aggregation protocol combines differential privacy with homomorphic-encryption-based secure aggregation to ensure privacy while maintaining model utility. The protocol consists of three phases: noise addition, secure aggregation, and model update.</p>
<p><bold>Phase 1: Noise Addition.</bold> Each device <inline-formula id="ieqn-42"><mml:math id="mml-ieqn-42"><mml:msub><mml:mi>D</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> adds calibrated Gaussian noise to its local model updates:
<disp-formula id="eqn-9"><label>(9)</label><mml:math id="mml-eqn-9" display="block"><mml:msubsup><mml:mrow><mml:mover><mml:mi>&#x03B8;</mml:mi><mml:mo stretchy="false">&#x007E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:msubsup><mml:mi>&#x03B8;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>+</mml:mo><mml:mrow><mml:mrow><mml:mi mathvariant="script">N</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:msup><mml:mi>&#x03C3;</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mi>I</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula>where the noise variance is calibrated according to the differential privacy requirements:
<disp-formula id="eqn-10"><label>(10)</label><mml:math id="mml-eqn-10" display="block"><mml:msup><mml:mi>&#x03C3;</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mn>2</mml:mn><mml:mi>log</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mn>1.25</mml:mn><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:mi>&#x03B4;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:msup><mml:mi>C</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:mrow><mml:msup><mml:mo>&#x03F5;</mml:mo><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:mfrac></mml:math></disp-formula>here, <inline-formula id="ieqn-43"><mml:math id="mml-ieqn-43"><mml:mi>C</mml:mi></mml:math></inline-formula> represents the clipping bound for gradient updates.</p>
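<p>Phase 1 can be sketched as a clip-then-perturb step. The sketch assumes that each update is first clipped to an l2-norm of at most <italic>C</italic>, which is what makes <italic>C</italic> a valid sensitivity bound in Eq. (10); the function names and the toy update are hypothetical.</p>
<code language="python"><![CDATA[
```python
import math
import random


def gaussian_sigma(epsilon, delta, clip_bound):
    """Eq. (10): sigma^2 = 2 * ln(1.25 / delta) * C^2 / epsilon^2."""
    return math.sqrt(2.0 * math.log(1.25 / delta)) * clip_bound / epsilon


def clip_update(update, clip_bound):
    """Rescale the update so that its l2-norm is at most clip_bound."""
    norm = math.sqrt(sum(v * v for v in update))
    scale = min(1.0, clip_bound / norm) if norm > 0 else 1.0
    return [v * scale for v in update]


def privatise_update(update, epsilon, delta, clip_bound, rng):
    """Eq. (9): add N(0, sigma^2 I) noise to the clipped update."""
    sigma = gaussian_sigma(epsilon, delta, clip_bound)
    return [v + rng.gauss(0.0, sigma) for v in clip_update(update, clip_bound)]


# Hypothetical parameters and update, for illustration only.
noisy = privatise_update([3.0, 4.0], epsilon=1.0, delta=1e-5, clip_bound=1.0,
                         rng=random.Random(0))
print(noisy)
```
]]></code>
<p>The noise scale grows linearly with the clipping bound and inversely with the privacy budget, so a tighter budget or a looser clip both translate directly into noisier updates.</p>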
<p><bold>Phase 2: Secure Aggregation.</bold> The noisy updates are aggregated using homomorphic encryption. Each device encrypts its update using a shared public key:
<disp-formula id="eqn-11"><label>(11)</label><mml:math id="mml-eqn-11" display="block"><mml:mi>E</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msubsup><mml:mrow><mml:mover><mml:mi>&#x03B8;</mml:mi><mml:mo stretchy="false">&#x007E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:msub><mml:mtext>Enc</mml:mtext><mml:mrow><mml:mtext>pk</mml:mtext></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:msubsup><mml:mrow><mml:mover><mml:mi>&#x03B8;</mml:mi><mml:mo stretchy="false">&#x007E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula>The server then aggregates the ciphertexts homomorphically:
<disp-formula id="eqn-12"><label>(12)</label><mml:math id="mml-eqn-12" display="block"><mml:mi>E</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msup><mml:mi>&#x03B8;</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mi>N</mml:mi></mml:mfrac><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:munderover><mml:mi>E</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msubsup><mml:mrow><mml:mover><mml:mi>&#x03B8;</mml:mi><mml:mo stretchy="false">&#x007E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula></p>
<p><bold>Phase 3: Model Update.</bold> The aggregated model is decrypted and distributed to all participating devices:
<disp-formula id="eqn-13"><label>(13)</label><mml:math id="mml-eqn-13" display="block"><mml:msup><mml:mi>&#x03B8;</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:msub><mml:mtext>Dec</mml:mtext><mml:mrow><mml:mtext>sk</mml:mtext></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>E</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msup><mml:mi>&#x03B8;</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula></p>
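<p>The encrypt, aggregate, and decrypt cycle of Eqs. (11) to (13) can be illustrated with a toy additively homomorphic scheme. The sketch below substitutes a textbook Paillier cryptosystem for the homomorphic encryption used in EdgeTrustX, purely because Paillier fits in a few lines of standard-library Python. The primes are far too small for real use, the fixed-point scale is an arbitrary choice, and the division by the number of clients is applied after decryption because Paillier ciphertexts cannot be multiplied by a fraction. All names are hypothetical.</p>
<code language="python"><![CDATA[
```python
import math
import random

# Toy Paillier parameters (two Mersenne primes); NOT secure, illustration only.
P, Q = 2**31 - 1, 2**61 - 1
N_MOD = P * Q
N_SQ = N_MOD * N_MOD
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(p - 1, q - 1)
MU = pow(LAM, -1, N_MOD)                           # valid for generator g = n + 1
SCALE = 10**6                                      # fixed-point scale for real values


def encrypt(x, rng):
    """Enc_pk: encode x as a fixed-point integer and encrypt it."""
    m = round(x * SCALE) % N_MOD
    while True:
        r = rng.randrange(1, N_MOD)
        if math.gcd(r, N_MOD) == 1:
            break
    return (1 + m * N_MOD) * pow(r, N_MOD, N_SQ) % N_SQ


def add_ciphertexts(c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return c1 * c2 % N_SQ


def decrypt(c):
    """Dec_sk: recover the signed fixed-point value."""
    m = (pow(c, LAM, N_SQ) - 1) // N_MOD * MU % N_MOD
    if m > N_MOD // 2:  # map the upper half of the range to negative numbers
        m -= N_MOD
    return m / SCALE


def aggregate(encrypted_updates):
    """Server side: element-wise sum computed on ciphertexts only."""
    total = list(encrypted_updates[0])
    for enc in encrypted_updates[1:]:
        total = [add_ciphertexts(a, b) for a, b in zip(total, enc)]
    return total


rng = random.Random(0)
client_updates = [[0.5, -1.25], [1.5, 0.25], [-0.5, 1.0]]  # hypothetical noisy updates
encrypted = [[encrypt(v, rng) for v in u] for u in client_updates]
summed = aggregate(encrypted)
global_update = [decrypt(c) / len(client_updates) for c in summed]
print(global_update)
```
]]></code>
<p>The server in this sketch only ever multiplies ciphertexts and never sees an individual update. Who holds the secret key and performs the final decryption is a deployment decision that the sketch leaves open.</p>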
<p><bold>Conceptual Summary of the Privacy-Preserving Aggregation Workflow:</bold> The privacy-preserving aggregation process of EdgeTrustX operates in three sequential stages to ensure secure model coordination across distributed IoT devices.
<list list-type="simple">
<list-item><label>(1)</label><p><bold>Noise Addition:</bold> Each client locally perturbs its gradient updates by injecting calibrated Gaussian noise according to differential-privacy parameters <inline-formula id="ieqn-44"><mml:math id="mml-ieqn-44"><mml:mo stretchy="false">(</mml:mo><mml:mi>&#x03B5;</mml:mi><mml:mo>,</mml:mo><mml:mi>&#x03B4;</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:math></inline-formula>. This step guarantees that the contribution of any single data sample remains indistinguishable, protecting sensitive device-level information.</p></list-item>
<list-item><label>(2)</label><p><bold>Secure Aggregation:</bold> The perturbed gradients are then encrypted using the CKKS homomorphic encryption scheme, enabling the server to perform summation directly on ciphertexts without accessing the raw gradients. This keeps individual updates confidential while the aggregate remains computable.</p></list-item>
<list-item><label>(3)</label><p><bold>Model Update:</bold> The encrypted global aggregate is decrypted using the private key and redistributed to clients for the next training round, completing a secure and privacy-aware learning cycle.</p></list-item>
</list></p>
<p><xref ref-type="fig" rid="fig-3">Fig. 3</xref> visually summarises these three phases, depicting the interaction between IoT devices, the privacy-preserving aggregation layer, and the global coordination server.</p>
<fig id="fig-3">
<label>Figure 3</label>
<caption>
<title>Conceptual flow of the EdgeTrustX privacy-preserving aggregation protocol showing differential-privacy noise addition, homomorphic-encryption-based aggregation, and secure global model update</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_73584-fig-3.tif"/>
</fig>
</sec>
<sec id="s3_2_3">
<label>3.2.3</label>
<title>Explainability Engine</title>
<p>The explainability engine provides multi-faceted interpretability through three complementary approaches: attention visualisation, SHAP analysis, and feature importance ranking.</p>
<p><bold>Attention-Based Explanations:</bold> The attention weights from the transformer provide insight into which input features or time steps are most relevant for the decision:
<disp-formula id="eqn-14"><label>(14)</label><mml:math id="mml-eqn-14" display="block"><mml:mrow><mml:mtext>Importance</mml:mtext></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>h</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>H</mml:mi></mml:mrow></mml:munderover><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>L</mml:mi></mml:mrow></mml:munderover><mml:msubsup><mml:mi>&#x03B1;</mml:mi><mml:mrow><mml:mi>h</mml:mi><mml:mo>,</mml:mo><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup></mml:math></disp-formula>where <inline-formula id="ieqn-45"><mml:math id="mml-ieqn-45"><mml:msubsup><mml:mi>&#x03B1;</mml:mi><mml:mrow><mml:mi>h</mml:mi><mml:mo>,</mml:mo><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup></mml:math></inline-formula> represents the attention weight that query position <italic>i</italic> assigns to feature <inline-formula id="ieqn-46"><mml:math id="mml-ieqn-46"><mml:mi>j</mml:mi></mml:math></inline-formula> in head <inline-formula id="ieqn-47"><mml:math id="mml-ieqn-47"><mml:mi>h</mml:mi></mml:math></inline-formula> of layer <inline-formula id="ieqn-48"><mml:math id="mml-ieqn-48"><mml:mi>l</mml:mi></mml:math></inline-formula>.</p>
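<p>One possible reading of Eq. (14) is sketched below for a single layer. It assumes that the inner index runs over query positions, so that the importance of feature <italic>j</italic> is the total attention it receives across all heads and positions; the nested-list layout and the toy weights are hypothetical.</p>
<code language="python"><![CDATA[
```python
def attention_importance(alpha):
    """Sum the attention received by each feature over heads and query positions.

    alpha[h][i][j] is the weight that query position i gives to feature j in head h
    (one layer only; this indexing is an assumption of the sketch).
    """
    n_features = len(alpha[0][0])
    return [
        sum(head[i][j] for head in alpha for i in range(len(head)))
        for j in range(n_features)
    ]


# Hypothetical attention maps: 2 heads, 2 query positions, 2 features.
alpha = [
    [[0.7, 0.3], [0.4, 0.6]],  # head 1
    [[0.5, 0.5], [0.1, 0.9]],  # head 2
]
importance = attention_importance(alpha)
print(importance)
```
]]></code>
<p>Since every attention row sums to one, the importance scores of one layer add up to the number of heads times the number of query positions, which gives a simple sanity check on the aggregation.</p>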
<p><bold>SHAP-Based Explanations:</bold> SHAP values are computed to provide feature-level explanations:
<disp-formula id="eqn-15"><label>(15)</label><mml:math id="mml-eqn-15" display="block"><mml:msub><mml:mi>&#x03D5;</mml:mi><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:munder><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>S</mml:mi><mml:mo>&#x2286;</mml:mo><mml:mi>F</mml:mi><mml:mi mathvariant="normal">\</mml:mi><mml:mo fence="false" stretchy="false">{</mml:mo><mml:mi>j</mml:mi><mml:mo fence="false" stretchy="false">}</mml:mo></mml:mrow></mml:munder><mml:mfrac><mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:mi>S</mml:mi><mml:mo>|</mml:mo></mml:mrow><mml:mo>!</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mo>|</mml:mo><mml:mi>F</mml:mi><mml:mo>|</mml:mo></mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mrow><mml:mo>|</mml:mo><mml:mi>S</mml:mi><mml:mo>|</mml:mo></mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn><mml:mo>)</mml:mo></mml:mrow><mml:mo>!</mml:mo></mml:mrow><mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:mi>F</mml:mi><mml:mo>|</mml:mo></mml:mrow><mml:mo>!</mml:mo></mml:mrow></mml:mfrac><mml:mrow><mml:mo>[</mml:mo><mml:mi>f</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>S</mml:mi><mml:mo>&#x222A;</mml:mo><mml:mo fence="false" stretchy="false">{</mml:mo><mml:mi>j</mml:mi><mml:mo fence="false" stretchy="false">}</mml:mo><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mi>f</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>S</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:math></disp-formula>where <inline-formula id="ieqn-49"><mml:math id="mml-ieqn-49"><mml:mi>F</mml:mi></mml:math></inline-formula> is the set of all features and <inline-formula id="ieqn-50"><mml:math id="mml-ieqn-50"><mml:mi>S</mml:mi></mml:math></inline-formula> is a subset of features.</p>
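<p>As an illustration of Eq. (15), the following minimal sketch computes exact Shapley values by enumerating every subset. It is not the EdgeTrustX implementation: the function names <monospace>shapley_values</monospace> and <monospace>toy_value</monospace> and the three-feature toy model are hypothetical, and exhaustive enumeration is only feasible for a handful of features (practical SHAP tooling relies on approximations).</p>
<preformat>
```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values: enumerate every subset S of F that excludes j."""
    features = list(range(n_features))
    phi = [0.0] * n_features
    for j in features:
        others = [k for k in features if k != j]
        for size in range(len(others) + 1):
            # Weight |S|! (|F| - |S| - 1)! / |F|! from the Shapley formula.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution f(S + {j}) - f(S).
                phi[j] += weight * (value_fn(s | {j}) - value_fn(s))
    return phi


def toy_value(subset):
    """Hypothetical model: additive scores plus one interaction term."""
    base = {0: 2.0, 1: 1.0, 2: 0.0}
    score = sum(base[k] for k in subset)
    if 0 in subset and 1 in subset:
        score += 1.0  # interaction between features 0 and 1
    return score


phi = shapley_values(toy_value, 3)
print(phi)
```
</preformat>
<p>In this toy model the attributions sum to the value of the full feature set, which is the efficiency property that makes Shapley values attractive for feature-level explanations.</p>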
<p><bold>Integrated Explainability Score:</bold> The attention-based and SHAP-based explanations are combined into a single interpretability score:
<disp-formula id="eqn-16"><label>(16)</label><mml:math id="mml-eqn-16" display="block"><mml:mrow><mml:mtext>ExplainScore</mml:mtext></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mi>&#x03B1;</mml:mi><mml:mo>&#x22C5;</mml:mo><mml:mrow><mml:mtext>Importance</mml:mtext></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x2212;</mml:mo><mml:mi>&#x03B1;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x22C5;</mml:mo><mml:mrow><mml:mo>|</mml:mo><mml:msub><mml:mi>&#x03D5;</mml:mi><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo></mml:mrow></mml:math></disp-formula>where <inline-formula id="ieqn-51"><mml:math id="mml-ieqn-51"><mml:mi>&#x03B1;</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula> is a weighting parameter.</p>
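<p>Eqs. (14) and (16) can be sketched in a few lines of NumPy. The sketch below is illustrative only and rests on assumptions not stated in the text: the attention tensor is taken from a single layer with shape (heads, positions, features), the SHAP values are random placeholders, and the two terms are combined exactly as written in Eq. (16) without rescaling either one. The function names are hypothetical.</p>
<preformat>
```python
import numpy as np


def attention_importance(attn):
    """Eq. (14): sum attention weights over heads h and positions i.

    attn is assumed to have shape (H, L, L) for one layer, where
    attn[h, i, j] is the weight that position i places on feature j.
    """
    return attn.sum(axis=(0, 1))


def explain_score(attn, shap_values, alpha=0.5):
    """Eq. (16): weighted combination of attention importance and |SHAP|."""
    if not (alpha >= 0.0 and 1.0 >= alpha):
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * attention_importance(attn) + (1.0 - alpha) * np.abs(shap_values)


# Hypothetical inputs: 4 heads, 6 features, softmax-normalised attention rows.
rng = np.random.default_rng(0)
n_heads, seq_len = 4, 6
logits = rng.normal(size=(n_heads, seq_len, seq_len))
attn = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
shap_values = rng.normal(size=seq_len)  # placeholder SHAP values

scores = explain_score(attn, shap_values, alpha=0.5)
ranking = np.argsort(-scores)  # features ordered from most to least important
print(ranking)
```
</preformat>
<p>Because each softmax row sums to one, the attention term sums to the number of heads times the number of positions, whereas SHAP magnitudes depend on the scale of the model output; in practice one may wish to normalise both terms before mixing them.</p>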
</sec>
</sec>
<sec id="s3_3">
<label>3.3</label>
<title>Training Algorithm</title>
<p>Algorithm 1 presents the complete EdgeTrustX training procedure, which integrates federated learning with privacy preservation and explainability.</p>
<fig id="fig-15">
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_73584-fig-15.tif"/>
</fig>
<p><bold>Data Distribution and Client Heterogeneity Modelling:</bold> To reflect the statistical heterogeneity typically observed in large-scale IoT networks, the local training data for each client was organised under both independent and identically distributed (IID) and non-identically distributed (non-IID) conditions. The non-IID scenario was simulated using a Dirichlet allocation strategy with a concentration parameter of <italic>&#x03B1;</italic> &#x003D; 0.5, which produces variable class proportions and sample volumes per client. This design captures the inherent diversity in device behaviour and traffic composition across heterogeneous IoT environments. The IID configuration, in contrast, used class-balanced random sampling and served as a controlled benchmark. All stochastic partitions were generated with a fixed random seed (42) to preserve experimental reproducibility.</p>
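<p>One common way to realise such a Dirichlet-based non-IID split is sketched below. The concentration parameter (0.5) and seed (42) follow the text; the function name <monospace>dirichlet_partition</monospace>, the number of clients, the number of classes, and the synthetic labels are assumptions made for illustration and may differ from the partitioning code actually used.</p>
<preformat>
```python
import numpy as np


def dirichlet_partition(labels, n_clients, alpha=0.5, seed=42):
    """Split sample indices across clients with per-class Dirichlet shares."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(n_clients)]
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        # Fraction of class c that each client receives.
        proportions = rng.dirichlet(alpha * np.ones(n_clients))
        # Turn the fractions into split points inside this class.
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client, part in enumerate(np.split(idx, cuts)):
            client_indices[client].extend(part.tolist())
    return [np.array(ix, dtype=int) for ix in client_indices]


# Hypothetical data: 1000 samples, 5 traffic classes, 10 clients.
labels = np.random.default_rng(0).integers(0, 5, size=1000)
parts = dirichlet_partition(labels, n_clients=10)
for client, ix in enumerate(parts):
    print(client, len(ix), np.bincount(labels[ix], minlength=5))
```
</preformat>
<p>Smaller values of the concentration parameter give more skewed clients; large values approach the IID case. Fixing the seed makes the partition repeatable across runs.</p>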
</sec>
<sec id="s3_4">
<label>3.4</label>
<title>Rigorous Privacy and Utility Analysis</title>
<p>We formalise the privacy guarantees of EdgeTrustX and derive an explicit optimisation/utility bound under standard smoothness/convexity assumptions.</p>
<sec id="s3_4_1">
<label>3.4.1</label>
<title>Notation and Assumptions</title>
<p>Two datasets <inline-formula id="ieqn-77"><mml:math id="mml-ieqn-77"><mml:mrow><mml:mi mathvariant="script">X</mml:mi></mml:mrow></mml:math></inline-formula> and <inline-formula id="ieqn-78"><mml:math id="mml-ieqn-78"><mml:msup><mml:mrow><mml:mi mathvariant="script">X</mml:mi></mml:mrow><mml:mrow><mml:mo>&#x2032;</mml:mo></mml:mrow></mml:msup></mml:math></inline-formula> are adjacent if they differ in at most one record. Let <inline-formula id="ieqn-79"><mml:math id="mml-ieqn-79"><mml:msub><mml:mrow><mml:mi mathvariant="normal">&#x0394;</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>g</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> denote the <inline-formula id="ieqn-80"><mml:math id="mml-ieqn-80"><mml:msub><mml:mi>&#x2113;</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula>-sensitivity of a (vector-valued) function <inline-formula id="ieqn-81"><mml:math id="mml-ieqn-81"><mml:mi>g</mml:mi></mml:math></inline-formula>. Let <inline-formula id="ieqn-82"><mml:math id="mml-ieqn-82"><mml:mrow><mml:mi mathvariant="script">N</mml:mi></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:msup><mml:mi>&#x03C3;</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mi>I</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> denote isotropic Gaussian noise.</p>
<p>We assume throughout:
<list list-type="bullet">
<list-item>
<p>(A1) Per-example gradients are clipped: for all examples <inline-formula id="ieqn-83"><mml:math id="mml-ieqn-83"><mml:mi>z</mml:mi></mml:math></inline-formula>, <inline-formula id="ieqn-84"><mml:math id="mml-ieqn-84"><mml:mo stretchy="false">&#x2225;</mml:mo><mml:mi mathvariant="normal">&#x2207;</mml:mi><mml:mrow><mml:mi>&#x2113;</mml:mi></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03B8;</mml:mi><mml:mo>;</mml:mo><mml:mi>z</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:msub><mml:mo>&#x2225;</mml:mo><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo>&#x2264;</mml:mo><mml:mi>C</mml:mi></mml:math></inline-formula>; client update deltas are clipped to norm <inline-formula id="ieqn-85"><mml:math id="mml-ieqn-85"><mml:mo>&#x2264;</mml:mo><mml:mi>C</mml:mi></mml:math></inline-formula> before noise is added.</p></list-item>
<list-item>
<p>(A2) For the attention-logit function <inline-formula id="ieqn-86"><mml:math id="mml-ieqn-86"><mml:mi>h</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mo>&#x22C5;</mml:mo><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> computed locally (pre-softmax), per-example logits are norm-bounded: <inline-formula id="ieqn-87"><mml:math id="mml-ieqn-87"><mml:mo stretchy="false">&#x2225;</mml:mo><mml:mi>h</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>z</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:msub><mml:mo>&#x2225;</mml:mo><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo>&#x2264;</mml:mo><mml:mi>B</mml:mi></mml:math></inline-formula> (enforced by per-example logit clipping); hence <inline-formula id="ieqn-88"><mml:math id="mml-ieqn-88"><mml:msub><mml:mrow><mml:mi mathvariant="normal">&#x0394;</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>h</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2264;</mml:mo><mml:mn>2</mml:mn><mml:mi>B</mml:mi></mml:math></inline-formula>.</p></list-item>
<list-item>
<p>(A3) The loss <inline-formula id="ieqn-89"><mml:math id="mml-ieqn-89"><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03B8;</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is <inline-formula id="ieqn-90"><mml:math id="mml-ieqn-90"><mml:mi>L</mml:mi></mml:math></inline-formula>-smooth; for the convergence bound, we additionally assume <inline-formula id="ieqn-91"><mml:math id="mml-ieqn-91"><mml:mi>&#x03BC;</mml:mi></mml:math></inline-formula>-strong convexity (<inline-formula id="ieqn-92"><mml:math id="mml-ieqn-92"><mml:mn>0</mml:mn><mml:mo>&#x003C;</mml:mo><mml:mi>&#x03BC;</mml:mi><mml:mo>&#x2264;</mml:mo><mml:mi>L</mml:mi></mml:math></inline-formula>).</p></list-item>
</list></p>
</sec>
<sec id="s3_4_2">
<label>3.4.2</label>
<title>Per-Round Differential Privacy</title>
<p><bold>Theorem 1 (Per-Mechanism DP via Gaussian Mechanism):</bold> <italic>Let</italic> <inline-formula id="ieqn-93"><mml:math id="mml-ieqn-93"><mml:mi>g</mml:mi></mml:math></inline-formula> <italic>be a function with</italic> <inline-formula id="ieqn-94"><mml:math id="mml-ieqn-94"><mml:msub><mml:mi>&#x2113;</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula><italic>-sensitivity</italic> <inline-formula id="ieqn-95"><mml:math id="mml-ieqn-95"><mml:msub><mml:mi mathvariant="normal">&#x0394;</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>g</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. <italic>For any</italic> <inline-formula id="ieqn-96"><mml:math id="mml-ieqn-96"><mml:msub><mml:mo>&#x03F5;</mml:mo><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>&#x2208;</mml:mo><mml:mo stretchy="false">(</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">]</mml:mo></mml:math></inline-formula> <italic>and</italic> <inline-formula id="ieqn-97"><mml:math id="mml-ieqn-97"><mml:msub><mml:mi>&#x03B4;</mml:mi><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>&#x2208;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, <italic>the mechanism</italic>
<disp-formula id="eqn-17"><label>(17)</label><mml:math id="mml-eqn-17" display="block"><mml:mrow><mml:mrow><mml:mi mathvariant="script">M</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mi>g</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:mrow><mml:mrow><mml:mi mathvariant="script">N</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:msup><mml:mi>&#x03C3;</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mi>I</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:mtext>with</mml:mtext></mml:mrow><mml:mtext>&#x00A0;</mml:mtext><mml:mi>&#x03C3;</mml:mi><mml:mo>&#x2265;</mml:mo><mml:mfrac><mml:mrow><mml:msqrt><mml:mn>2</mml:mn><mml:mi>ln</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mfrac><mml:mn>1.25</mml:mn><mml:msub><mml:mi>&#x03B4;</mml:mi><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:mfrac><mml:mo>)</mml:mo></mml:mrow></mml:msqrt><mml:msub><mml:mi mathvariant="normal">&#x0394;</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>g</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:msub><mml:mo>&#x03F5;</mml:mo><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:mfrac></mml:math></disp-formula>is <inline-formula id="ieqn-98"><mml:math id="mml-ieqn-98"><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mo>&#x03F5;</mml:mo><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>&#x03B4;</mml:mi><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>-differentially private.</p>
<p><bold>Proof:</bold> This is the classical Gaussian mechanism; see, e.g., Dwork&#x2013;Roth (2014, Thm 3.22). &#x25A1;</p>
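<p>The calibration in Eq. (17) is straightforward to evaluate numerically. The helper below is a minimal sketch (the name <monospace>gaussian_sigma</monospace> is hypothetical) that returns the smallest noise standard deviation permitted by Theorem 1 for a given sensitivity and privacy budget; the sensitivity value in the example is an arbitrary placeholder.</p>
<preformat>
```python
import math


def gaussian_sigma(sensitivity, epsilon, delta):
    """Smallest sigma satisfying Eq. (17) for the classical Gaussian mechanism."""
    if not (epsilon > 0.0 and 1.0 >= epsilon):
        raise ValueError("Theorem 1 requires epsilon in (0, 1]")
    if not (delta > 0.0 and 1.0 > delta):
        raise ValueError("delta must lie in (0, 1)")
    return math.sqrt(2.0 * math.log(1.25 / delta)) * sensitivity / epsilon


# Example with a placeholder l2-sensitivity of 1.0.
sigma = gaussian_sigma(sensitivity=1.0, epsilon=0.1, delta=1e-5)
print(round(sigma, 3))
```
</preformat>
<p>The required noise grows linearly with the sensitivity and inversely with the privacy parameter, which is why tight clipping bounds matter for utility.</p>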
<p><bold>Corollary 1 (Per-Round DP for EdgeTrustX components):</bold> <italic>In each communication round</italic> <inline-formula id="ieqn-99"><mml:math id="mml-ieqn-99"><mml:mi>t</mml:mi></mml:math></inline-formula>:
<list list-type="simple">
<list-item><label>1.</label>
<p><italic>Attention-logit privatisation (<xref ref-type="disp-formula" rid="eqn-8">Eq. (8)</xref>): with logit clipping enforcing</italic> <inline-formula id="ieqn-100"><mml:math id="mml-ieqn-100"><mml:msub><mml:mi mathvariant="normal">&#x0394;</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>h</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2264;</mml:mo><mml:mn>2</mml:mn><mml:mi>B</mml:mi></mml:math></inline-formula> <italic>and noise standard deviation</italic> <inline-formula id="ieqn-101"><mml:math id="mml-ieqn-101"><mml:msub><mml:mi>&#x03C3;</mml:mi><mml:mrow><mml:mtext>att</mml:mtext></mml:mrow></mml:msub><mml:mo>&#x2265;</mml:mo><mml:msqrt><mml:mn>2</mml:mn><mml:mi>ln</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mn>1.25</mml:mn><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:msub><mml:mi>&#x03B4;</mml:mi><mml:mrow><mml:mtext>att</mml:mtext></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:msqrt><mml:mrow><mml:mo>(</mml:mo><mml:mn>2</mml:mn><mml:mi>B</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mspace width="negativethinmathspace" /><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:msub><mml:mo>&#x03F5;</mml:mo><mml:mrow><mml:mtext>att</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula>, <italic>the attention-logit release is</italic> <inline-formula id="ieqn-102"><mml:math id="mml-ieqn-102"><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mo>&#x03F5;</mml:mo><mml:mrow><mml:mtext>att</mml:mtext></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>&#x03B4;</mml:mi><mml:mrow><mml:mtext>att</mml:mtext></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula><italic>-DP</italic>.</p></list-item>
<list-item><label>2.</label>
<p><italic>Noisy client updates (<xref ref-type="disp-formula" rid="eqn-9">Eqs. (9)</xref> and <xref ref-type="disp-formula" rid="eqn-10">(10)</xref>): with clipping bound</italic> <inline-formula id="ieqn-103"><mml:math id="mml-ieqn-103"><mml:mi>C</mml:mi></mml:math></inline-formula> <italic>so that</italic> <inline-formula id="ieqn-104"><mml:math id="mml-ieqn-104"><mml:msub><mml:mi mathvariant="normal">&#x0394;</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="normal">&#x0394;</mml:mi><mml:msub><mml:mi>&#x03B8;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2264;</mml:mo><mml:mn>2</mml:mn><mml:mi>C</mml:mi></mml:math></inline-formula>, <italic>and</italic> <inline-formula id="ieqn-105"><mml:math id="mml-ieqn-105"><mml:msub><mml:mi>&#x03C3;</mml:mi><mml:mrow><mml:mtext>upd</mml:mtext></mml:mrow></mml:msub><mml:mo>&#x2265;</mml:mo><mml:msqrt><mml:mn>2</mml:mn><mml:mi>ln</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mn>1.25</mml:mn><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:msub><mml:mi>&#x03B4;</mml:mi><mml:mrow><mml:mtext>upd</mml:mtext></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:msqrt><mml:mrow><mml:mo>(</mml:mo><mml:mn>2</mml:mn><mml:mi>C</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mspace width="negativethinmathspace" /><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:msub><mml:mo>&#x03F5;</mml:mo><mml:mrow><mml:mtext>upd</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula>, <italic>each client&#x2019;s transmitted update is</italic> <inline-formula id="ieqn-106"><mml:math id="mml-ieqn-106"><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mo>&#x03F5;</mml:mo><mml:mrow><mml:mtext>upd</mml:mtext></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>&#x03B4;</mml:mi><mml:mrow><mml:mtext>upd</mml:mtext></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula><italic>-DP</italic>.</p></list-item>
</list></p>
<p><bold>Proof:</bold> Apply Theorem 1 with <inline-formula id="ieqn-107"><mml:math id="mml-ieqn-107"><mml:msub><mml:mrow><mml:mi mathvariant="normal">&#x0394;</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>h</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2264;</mml:mo><mml:mn>2</mml:mn><mml:mi>B</mml:mi></mml:math></inline-formula> and <inline-formula id="ieqn-108"><mml:math id="mml-ieqn-108"><mml:msub><mml:mrow><mml:mi mathvariant="normal">&#x0394;</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi mathvariant="normal">&#x0394;</mml:mi></mml:mrow><mml:msub><mml:mi>&#x03B8;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2264;</mml:mo><mml:mn>2</mml:mn><mml:mi>C</mml:mi></mml:math></inline-formula>. &#x25A1;</p>
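<p>The client-side step covered by Corollary 1 (clip the update, then add Gaussian noise calibrated to a sensitivity of 2C) might look as follows. This is a sketch under stated assumptions rather than the EdgeTrustX code: the helper names, the clipping bound C = 1, and the random update vector are all hypothetical.</p>
<preformat>
```python
import numpy as np


def clip_l2(vec, bound):
    """Rescale vec so that its l2 norm does not exceed bound."""
    norm = np.linalg.norm(vec)
    scale = min(1.0, bound / max(norm, 1e-12))
    return vec * scale


def privatise_update(delta_theta, clip_c, epsilon, delta, rng):
    """Clip a client update to norm C, then add noise for sensitivity 2C."""
    clipped = clip_l2(delta_theta, clip_c)
    sigma = np.sqrt(2.0 * np.log(1.25 / delta)) * (2.0 * clip_c) / epsilon
    noisy = clipped + rng.normal(0.0, sigma, size=clipped.shape)
    return noisy, sigma


# Hypothetical 8-dimensional update with a large norm, so clipping is active.
rng = np.random.default_rng(0)
update = 5.0 * rng.normal(size=8)
noisy_update, sigma_upd = privatise_update(
    update, clip_c=1.0, epsilon=0.1, delta=1e-5, rng=rng
)
print(np.linalg.norm(clip_l2(update, 1.0)), sigma_upd)
```
</preformat>
<p>The same pattern applies to the attention logits, with the logit bound B in place of C.</p>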
<p>The differential privacy parameters are set to &#x03B5; &#x003D; 0.1 and &#x03B4; &#x003D; 1 &#x00D7; 10<sup>&#x2212;5</sup> to balance privacy and utility. With these values, the Gaussian mechanism of Theorem 1 tightly bounds the influence of any single sample on the released quantities, while the added noise remains small enough for federated optimisation to converge stably. Prior empirical work on privacy-preserving learning reports that <inline-formula id="ieqn-109"><mml:math id="mml-ieqn-109"><mml:mi>&#x03B5;</mml:mi><mml:mo>&#x2264;</mml:mo><mml:mn>0.1</mml:mn></mml:math></inline-formula> offers strong protection against gradient inversion at an accuracy cost of no more than 1%&#x2013;2%, which motivates the configuration selected for EdgeTrustX.</p>
</sec>
<sec id="s3_4_3">
<label>3.4.3</label>
<title>Composition across Mechanisms and Rounds</title>
<p><bold>Theorem 2 (Advanced Composition across Rounds):</bold> <italic>Suppose each round applies two mechanisms as in Cor. 1, each satisfying</italic> <inline-formula id="ieqn-110"><mml:math id="mml-ieqn-110"><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mo>&#x03F5;</mml:mo><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>&#x03B4;</mml:mi><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula><italic>-DP with</italic> <inline-formula id="ieqn-111"><mml:math id="mml-ieqn-111"><mml:msub><mml:mo>&#x03F5;</mml:mo><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>&#x2208;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula>. <italic>Across</italic> <inline-formula id="ieqn-112"><mml:math id="mml-ieqn-112"><mml:mi>T</mml:mi></mml:math></inline-formula> <italic>rounds, the total mechanism is</italic> <inline-formula id="ieqn-113"><mml:math id="mml-ieqn-113"><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mo>&#x03F5;</mml:mo><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>&#x03B4;</mml:mi><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula><italic>-DP with</italic> <inline-formula id="ieqn-114"><mml:math id="mml-ieqn-114"><mml:msub><mml:mo>&#x03F5;</mml:mo><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msub><mml:mo>&#x2264;</mml:mo></mml:math></inline-formula>
<disp-formula id="eqn-18"><label>(18)</label><mml:math id="mml-eqn-18" display="block"><mml:msqrt><mml:mn>2</mml:mn><mml:mrow><mml:mo>(</mml:mo><mml:mn>2</mml:mn><mml:mi>T</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mi>ln</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mrow><mml:mover><mml:mi>&#x03B4;</mml:mi><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow></mml:mfrac><mml:mo>)</mml:mo></mml:mrow></mml:msqrt><mml:mtext>&#x00A0;</mml:mtext><mml:msub><mml:mo>&#x03F5;</mml:mo><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mn>2</mml:mn><mml:mi>T</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:msub><mml:mo>&#x03F5;</mml:mo><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:msup><mml:mi>e</mml:mi><mml:mrow><mml:msub><mml:mo>&#x03F5;</mml:mo><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:msup><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn><mml:mo>)</mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:mspace width="1em" /><mml:msub><mml:mi>&#x03B4;</mml:mi><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mn>2</mml:mn><mml:mi>T</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:msub><mml:mi>&#x03B4;</mml:mi><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:mrow><mml:mover><mml:mi>&#x03B4;</mml:mi><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow><mml:mo>,</mml:mo></mml:math></disp-formula></p>
<p><italic>for any</italic> <inline-formula id="ieqn-115"><mml:math id="mml-ieqn-115"><mml:mrow><mml:mover><mml:mi>&#x03B4;</mml:mi><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow><mml:mo>&#x2208;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>.</p>
<p><bold>Proof:</bold> Use the advanced composition theorem (e.g., Dwork&#x2013;Roth 2014, Thm 3.20) on <inline-formula id="ieqn-116"><mml:math id="mml-ieqn-116"><mml:mn>2</mml:mn><mml:mi>T</mml:mi></mml:math></inline-formula> sequential mechanisms with identical <inline-formula id="ieqn-117"><mml:math id="mml-ieqn-117"><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mo>&#x03F5;</mml:mo><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>&#x03B4;</mml:mi><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, then collect <inline-formula id="ieqn-118"><mml:math id="mml-ieqn-118"><mml:mi>&#x03B4;</mml:mi></mml:math></inline-formula> terms. &#x25A1;</p>
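<p>To give a feel for how the budget accumulates over rounds, the sketch below evaluates the advanced composition bound for 2T mechanisms. It uses the form stated in Dwork&#x2013;Roth (2014, Thm 3.20), whose second term is k&#x03F5;<sub>0</sub>(e<sup>&#x03F5;<sub>0</sub></sup> &#x2212; 1), which is approximately k&#x03F5;<sub>0</sub><sup>2</sup> for small &#x03F5;<sub>0</sub>. The function name and the choice of T = 100 rounds are hypothetical.</p>
<preformat>
```python
import math


def advanced_composition(eps0, delta0, rounds, delta_hat, mechanisms_per_round=2):
    """Total (epsilon, delta) after composing k = 2T mechanisms."""
    k = mechanisms_per_round * rounds
    eps_total = (math.sqrt(2.0 * k * math.log(1.0 / delta_hat)) * eps0
                 + k * eps0 * math.expm1(eps0))
    delta_total = k * delta0 + delta_hat
    return eps_total, delta_total


# Hypothetical run: T = 100 rounds with eps0 = 0.1 and delta0 = 1e-5.
eps_total, delta_total = advanced_composition(
    eps0=0.1, delta0=1e-5, rounds=100, delta_hat=1e-5
)
basic_eps = 2 * 100 * 0.1  # basic composition simply adds the epsilons
print(eps_total, delta_total, basic_eps)
```
</preformat>
<p>For these illustrative numbers the advanced bound is well below the basic-composition total, although it is still far larger than the per-mechanism budget, which is the motivation for tighter accountants.</p>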
<p><bold>Remark (Moments Accountant/RDP Alternative):</bold> Tighter bounds are obtainable via R&#x00E9;nyi DP (RDP) or the moments accountant. If each mechanism is Gaussian with variance <inline-formula id="ieqn-119"><mml:math id="mml-ieqn-119"><mml:msup><mml:mi>&#x03C3;</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula> and unit &#x2113;<sub>2</sub>-sensitivity (that is, &#x03C3; is expressed as a multiple of the sensitivity), then for order <inline-formula id="ieqn-120"><mml:math id="mml-ieqn-120"><mml:mi>&#x03B1;</mml:mi><mml:mo>&#x003E;</mml:mo><mml:mn>1</mml:mn></mml:math></inline-formula>, <inline-formula id="ieqn-121"><mml:math id="mml-ieqn-121"><mml:msub><mml:mi>&#x03B5;</mml:mi><mml:mrow><mml:mrow><mml:mtext>RDP</mml:mtext></mml:mrow></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03B1;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mi>&#x03B1;</mml:mi><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mn>2</mml:mn><mml:msup><mml:mi>&#x03C3;</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> (no subsampling). 
Over <inline-formula id="ieqn-122"><mml:math id="mml-ieqn-122"><mml:mn>2</mml:mn><mml:mi>T</mml:mi></mml:math></inline-formula> compositions: <inline-formula id="ieqn-123"><mml:math id="mml-ieqn-123"><mml:msub><mml:mi>&#x03B5;</mml:mi><mml:mrow><mml:mrow><mml:mtext>RDP</mml:mtext></mml:mrow><mml:mo>,</mml:mo><mml:mi>T</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03B1;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mn>2</mml:mn><mml:mi>T</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mi>&#x03B1;</mml:mi><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mn>2</mml:mn><mml:msup><mml:mi>&#x03C3;</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, and conversion to <inline-formula id="ieqn-124"><mml:math id="mml-ieqn-124"><mml:mrow><mml:mo>(</mml:mo><mml:mo>&#x03F5;</mml:mo><mml:mo>,</mml:mo><mml:mi>&#x03B4;</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> gives <inline-formula id="ieqn-125"><mml:math id="mml-ieqn-125"><mml:mo>&#x03F5;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03B4;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:munder><mml:mo movablelimits="true" 
form="prefix">min</mml:mo><mml:mrow><mml:mi>&#x03B1;</mml:mi><mml:mo>&#x003E;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:munder><mml:mrow><mml:mo>{</mml:mo><mml:msub><mml:mi>&#x03B5;</mml:mi><mml:mrow><mml:mrow><mml:mtext>RDP</mml:mtext></mml:mrow><mml:mo>,</mml:mo><mml:mi>T</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03B1;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:mfrac><mml:mrow><mml:mi>ln</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mn>1</mml:mn><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:mi>&#x03B4;</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>&#x03B1;</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:mfrac><mml:mo>}</mml:mo></mml:mrow></mml:math></inline-formula>. This can be reported alongside the advanced-composition bound.</p>
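<p>The RDP conversion in the remark can be evaluated with a simple grid search over the order. The sketch below assumes unit sensitivity, so that its <monospace>noise_multiplier</monospace> argument is the ratio of the noise standard deviation to the sensitivity; the function name, the grid of orders, and the example values (200 compositions, noise multiplier 48.448) are illustrative assumptions.</p>
<preformat>
```python
import math


def rdp_epsilon(noise_multiplier, n_compositions, delta, orders=None):
    """Convert composed Gaussian RDP to (epsilon, delta)-DP by grid search."""
    if orders is None:
        # Orders alpha = 1.1, 1.2, ..., 100.9 (a coarse illustrative grid).
        orders = [1.0 + x / 10.0 for x in range(1, 1000)]
    best = math.inf
    for a in orders:
        # RDP of one Gaussian mechanism is a / (2 sigma^2); it adds up
        # linearly over n_compositions mechanisms.
        rdp = n_compositions * a / (2.0 * noise_multiplier ** 2)
        eps = rdp + math.log(1.0 / delta) / (a - 1.0)
        best = min(best, eps)
    return best


# Hypothetical setting: 2T = 200 mechanisms, each calibrated for eps0 = 0.1.
eps_rdp = rdp_epsilon(noise_multiplier=48.448, n_compositions=200, delta=1e-5)
print(eps_rdp)
```
</preformat>
<p>Because the objective is convex in the order, a finer grid or a closed-form minimiser changes the result only marginally; production accountants also handle subsampling, which this sketch does not.</p>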
</sec>
<sec id="s3_4_4">
<label>3.4.4</label>
<title>Effect of Secure Aggregation (HE) and Post-Processing</title>
<p><bold>Theorem 3 (HE Aggregation Preserves DP by Post-Processing):</bold> <italic>Let each client transmit a DP-sanitised vector</italic> <inline-formula id="ieqn-126"><mml:math id="mml-ieqn-126"><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mrow><mml:mi mathvariant="script">M</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>D</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> <italic>where</italic> <inline-formula id="ieqn-127"><mml:math id="mml-ieqn-127"><mml:msub><mml:mrow><mml:mi mathvariant="script">M</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> <italic>is</italic> <inline-formula id="ieqn-128"><mml:math id="mml-ieqn-128"><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mo>&#x03F5;</mml:mo><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>&#x03B4;</mml:mi><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula><italic>-DP w.r.t</italic>. <inline-formula id="ieqn-129"><mml:math id="mml-ieqn-129"><mml:msub><mml:mi>D</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>. 
<italic>Let</italic> <inline-formula id="ieqn-130"><mml:math id="mml-ieqn-130"><mml:mrow><mml:mrow><mml:mi>E</mml:mi><mml:mi>n</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:mrow></mml:math></inline-formula><italic>/</italic><inline-formula id="ieqn-131"><mml:math id="mml-ieqn-131"><mml:mspace width="negativethinmathspace" /><mml:mspace width="negativethinmathspace" /><mml:mrow><mml:mrow><mml:mi>D</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:mrow></mml:math></inline-formula> <italic>be any (possibly randomised) encryption/decryption maps independent of</italic> <inline-formula id="ieqn-132"><mml:math id="mml-ieqn-132"><mml:msub><mml:mi>D</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>, <italic>and let the server compute</italic> <inline-formula id="ieqn-133"><mml:math id="mml-ieqn-133"><mml:mi>y</mml:mi><mml:mo>=</mml:mo><mml:mrow><mml:mrow><mml:mi>D</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mo>&#x00A0;</mml:mo><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mrow><mml:mi>E</mml:mi><mml:mi>n</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. 
<italic>Then the distribution of</italic> <inline-formula id="ieqn-134"><mml:math id="mml-ieqn-134"><mml:mi>y</mml:mi></mml:math></inline-formula> <italic>is also</italic> <inline-formula id="ieqn-135"><mml:math id="mml-ieqn-135"><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mo>&#x03F5;</mml:mo><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>&#x03B4;</mml:mi><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula><italic>-DP with respect to each</italic> <inline-formula id="ieqn-136"><mml:math id="mml-ieqn-136"><mml:msub><mml:mi>D</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>.</p>
<p><bold>Proof.</bold> Differential privacy is closed under <italic>post-processing</italic>: for any (possibly randomised) function <inline-formula id="ieqn-137"><mml:math id="mml-ieqn-137"><mml:mi>F</mml:mi></mml:math></inline-formula> independent of the underlying data, <inline-formula id="ieqn-138"><mml:math id="mml-ieqn-138"><mml:mi>F</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi mathvariant="script">M</mml:mi></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>D</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is <inline-formula id="ieqn-139"><mml:math id="mml-ieqn-139"><mml:mrow><mml:mo>(</mml:mo><mml:mo>&#x03F5;</mml:mo><mml:mo>,</mml:mo><mml:mi>&#x03B4;</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>-DP if <inline-formula id="ieqn-140"><mml:math id="mml-ieqn-140"><mml:mrow><mml:mi mathvariant="script">M</mml:mi></mml:mrow></mml:math></inline-formula> is. Encryption, homomorphic addition, and decryption are (data-independent) mappings applied to DP outputs <inline-formula id="ieqn-141"><mml:math id="mml-ieqn-141"><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>; hence, <inline-formula id="ieqn-142"><mml:math id="mml-ieqn-142"><mml:mi>y</mml:mi></mml:math></inline-formula> preserves the same <inline-formula id="ieqn-143"><mml:math id="mml-ieqn-143"><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mo>&#x03F5;</mml:mo><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>&#x03B4;</mml:mi><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. &#x25A1;</p>
<p><bold>Corollary 2 (End-to-End DP for EdgeTrustX):</bold> <italic>Under (A1)&#x2013;(A2), with per-round parameters set as in Corollary 1 and composition as in Theorem 2, the global model sequence</italic> <inline-formula id="ieqn-144"><mml:math id="mml-ieqn-144"><mml:mo fence="false" stretchy="false">{</mml:mo><mml:msup><mml:mi>&#x03B8;</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup><mml:msubsup><mml:mo fence="false" stretchy="false">}</mml:mo><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> <italic>released to clients is</italic> <inline-formula id="ieqn-145"><mml:math id="mml-ieqn-145"><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mo>&#x03F5;</mml:mo><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>&#x03B4;</mml:mi><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula><italic>-DP with respect to every individual record across all devices</italic>.</p>
<p><bold>Proof:</bold> Combine Corollary 1, Theorem 2, and Theorem 3; the server&#x2019;s decryption and model update are post-processing steps. &#x25A1;</p>
</sec>
<sec id="s3_4_5">
<label>3.4.5</label>
<title>Utility/Convergence under DP Noise</title>
<p>We analyse FedAvg with clipping and Gaussian noise in the <italic>strongly convex</italic> case.</p>
<p><bold>Lemma 1 (Noise on Averaged Update):</bold> <italic>Let each of</italic> <inline-formula id="ieqn-146"><mml:math id="mml-ieqn-146"><mml:mi>N</mml:mi></mml:math></inline-formula> <italic>clients send a clipped update with added noise</italic> <inline-formula id="ieqn-147"><mml:math id="mml-ieqn-147"><mml:msub><mml:mi>&#x03BE;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x223C;</mml:mo><mml:mrow><mml:mrow><mml:mi mathvariant="script">N</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:msubsup><mml:mi>&#x03C3;</mml:mi><mml:mrow><mml:mrow><mml:mtext>upd</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mi>I</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. <italic>The server forms the average</italic> <inline-formula id="ieqn-148"><mml:math id="mml-ieqn-148"><mml:mover><mml:mrow><mml:mi>u</mml:mi></mml:mrow><mml:mrow><mml:mo stretchy="false">&#x21BC;</mml:mo></mml:mrow></mml:mover><mml:mo>=</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mi>N</mml:mi></mml:mfrac><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:munderover><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="normal">&#x0394;</mml:mi><mml:msub><mml:mi>&#x03B8;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>&#x03BE;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. 
<italic>Then</italic> <inline-formula id="ieqn-149"><mml:math id="mml-ieqn-149"><mml:mrow><mml:mi mathvariant="double-struck">E</mml:mi></mml:mrow><mml:mrow><mml:mo>[</mml:mo><mml:mover><mml:mrow><mml:mi>u</mml:mi></mml:mrow><mml:mrow><mml:mo stretchy="false">&#x21BC;</mml:mo></mml:mrow></mml:mover><mml:mo>]</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mi>N</mml:mi></mml:mfrac><mml:munder><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:munder><mml:mi mathvariant="normal">&#x0394;</mml:mi><mml:msub><mml:mi>&#x03B8;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> <italic>and</italic> <inline-formula id="ieqn-150"><mml:math id="mml-ieqn-150"><mml:mi>Var</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mover><mml:mrow><mml:mi>u</mml:mi></mml:mrow><mml:mrow><mml:mo stretchy="false">&#x21BC;</mml:mo></mml:mrow></mml:mover><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:msubsup><mml:mi>&#x03C3;</mml:mi><mml:mrow><mml:mrow><mml:mtext>upd</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mi>N</mml:mi></mml:mfrac><mml:mi>I</mml:mi></mml:math></inline-formula>.</p>
<p><bold>Proof:</bold> The mean follows from linearity of expectation, since each <inline-formula id="ieqn-151"><mml:math id="mml-ieqn-151"><mml:msub><mml:mi>&#x03BE;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> has zero mean; independence of the noise vectors reduces the variance by a factor of <inline-formula id="ieqn-152"><mml:math id="mml-ieqn-152"><mml:mi>N</mml:mi></mml:math></inline-formula>. &#x25A1;</p>
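<p>The variance claim in Lemma 1 can be checked numerically. The following is a minimal Monte Carlo sketch, not part of the EdgeTrustX implementation: the function name, the number of trials, and the example values of the noise scale and dimension are illustrative assumptions. It draws independent Gaussian noise vectors for each client, averages them, and compares the empirical per-coordinate variance with the predicted value of the squared noise scale divided by the number of clients.</p>
<code language="python"><![CDATA[
```python
import numpy as np


def averaged_noise_variance(num_clients, sigma_upd, dim, trials, seed=0):
    """Empirical per-coordinate variance of the mean of i.i.d. Gaussian noise vectors.

    All argument values used below are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    # xi[k, i, :] is the noise vector of client i in trial k, each entry ~ N(0, sigma_upd^2)
    xi = rng.normal(0.0, sigma_upd, size=(trials, num_clients, dim))
    # Noise component of the averaged update (the deterministic updates cancel out of the variance)
    mean_noise = xi.mean(axis=1)
    return float(mean_noise.var())


N = 20           # number of clients (matches the 20-client setup; any N works)
sigma_upd = 2.0  # hypothetical per-client noise standard deviation
empirical = averaged_noise_variance(N, sigma_upd, dim=8, trials=20000)
predicted = sigma_upd ** 2 / N
print(f"empirical={empirical:.4f}, predicted={predicted:.4f}")
```
]]></code>
<p>The two numbers should agree up to sampling error, which shrinks as the number of trials grows.</p>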
<p><bold>Theorem 4 (Utility Bound for DP-FedAvg (Strongly Convex)):</bold> <italic>Assume (A1) and (A3). Consider global updates</italic> <inline-formula id="ieqn-153"><mml:math id="mml-ieqn-153"><mml:msup><mml:mi>&#x03B8;</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:msup><mml:mi>&#x03B8;</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup><mml:mo>&#x2212;</mml:mo><mml:mi>&#x03B7;</mml:mi><mml:msup><mml:mrow><mml:mover><mml:mi>g</mml:mi><mml:mrow><mml:mo stretchy="false">&#x21BC;</mml:mo></mml:mrow></mml:mover></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup></mml:math></inline-formula>, <italic>where</italic> <inline-formula id="ieqn-154"><mml:math id="mml-ieqn-154"><mml:msup><mml:mrow><mml:mover><mml:mi>g</mml:mi><mml:mrow><mml:mo stretchy="false">&#x21BC;</mml:mo></mml:mrow></mml:mover></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup></mml:math></inline-formula> <italic>is the clipped, noisy average gradient proxy from Lemma 1 with variance</italic> <inline-formula id="ieqn-155"><mml:math id="mml-ieqn-155"><mml:msubsup><mml:mi>&#x03C3;</mml:mi><mml:mrow><mml:mrow><mml:mtext>eff</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mo>&#x003A;</mml:mo><mml:mo>=</mml:mo><mml:msubsup><mml:mi>&#x03C3;</mml:mi><mml:mrow><mml:mrow><mml:mtext>upd</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:mi>N</mml:mi></mml:math></inline-formula> <italic>per coordinate. 
Choose step size</italic> <inline-formula id="ieqn-156"><mml:math id="mml-ieqn-156"><mml:mn>0</mml:mn><mml:mo>&#x003C;</mml:mo><mml:mi>&#x03B7;</mml:mi><mml:mo>&#x2264;</mml:mo><mml:mn>1</mml:mn><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:mi>L</mml:mi></mml:math></inline-formula>. <italic>Then for all</italic> <inline-formula id="ieqn-157"><mml:math id="mml-ieqn-157"><mml:mi>T</mml:mi><mml:mo>&#x2265;</mml:mo><mml:mn>1</mml:mn></mml:math></inline-formula>,
<disp-formula id="eqn-19"><label>(19)</label><mml:math id="mml-eqn-19" display="block"><mml:mrow><mml:mrow><mml:mi mathvariant="double-struck">E</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:msup><mml:mi>&#x03B8;</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>T</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2212;</mml:mo><mml:msup><mml:mrow><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mo>&#x22C6;</mml:mo></mml:mrow></mml:msup><mml:mo>]</mml:mo></mml:mrow><mml:mo>&#x2264;</mml:mo><mml:msup><mml:mrow><mml:mo>(</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x2212;</mml:mo><mml:mi>&#x03B7;</mml:mi><mml:mi>&#x03BC;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msup><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:msup><mml:mi>&#x03B8;</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mn>0</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2212;</mml:mo><mml:msup><mml:mrow><mml:mrow><mml:mi 
mathvariant="script">L</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mo>&#x22C6;</mml:mo></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:mfrac><mml:mrow><mml:mi>&#x03B7;</mml:mi><mml:mi>L</mml:mi><mml:mi>d</mml:mi><mml:msubsup><mml:mi>&#x03C3;</mml:mi><mml:mrow><mml:mrow><mml:mtext>eff</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:mrow><mml:mrow><mml:mn>2</mml:mn><mml:mi>&#x03BC;</mml:mi></mml:mrow></mml:mfrac><mml:mo>+</mml:mo><mml:mfrac><mml:mrow><mml:mi>&#x03B7;</mml:mi><mml:mi>L</mml:mi><mml:msup><mml:mi>&#x03B6;</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:mrow><mml:mrow><mml:mn>2</mml:mn><mml:mi>&#x03BC;</mml:mi></mml:mrow></mml:mfrac><mml:mo>,</mml:mo></mml:math></disp-formula></p>
<p><italic>where</italic> <inline-formula id="ieqn-158"><mml:math id="mml-ieqn-158"><mml:mi>d</mml:mi></mml:math></inline-formula> <italic>is the parameter dimension and</italic> <inline-formula id="ieqn-159"><mml:math id="mml-ieqn-159"><mml:msup><mml:mi>&#x03B6;</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula> <italic>collects the (bounded) variance/bias from stochastic gradients and clipping (with</italic> <inline-formula id="ieqn-160"><mml:math id="mml-ieqn-160"><mml:mo stretchy="false">&#x2225;</mml:mo><mml:mi mathvariant="normal">&#x2207;</mml:mi><mml:mi>&#x2113;</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03B8;</mml:mi><mml:mo>;</mml:mo><mml:mi>z</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2225;&#x2264;</mml:mo><mml:mi>C</mml:mi></mml:math></inline-formula> <italic>giving</italic> <inline-formula id="ieqn-161"><mml:math id="mml-ieqn-161"><mml:msup><mml:mi>&#x03B6;</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mo>&#x2264;</mml:mo><mml:msup><mml:mi>C</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula><italic>)</italic>.</p>
<p><bold>Proof:</bold> By <inline-formula id="ieqn-162"><mml:math id="mml-ieqn-162"><mml:mi>L</mml:mi></mml:math></inline-formula>-smoothness and <inline-formula id="ieqn-163"><mml:math id="mml-ieqn-163"><mml:mi>&#x03BC;</mml:mi></mml:math></inline-formula>-strong convexity (standard descent lemma),
<disp-formula id="eqn-20"><label>(20)</label><mml:math id="mml-eqn-20" display="block"><mml:mrow><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:msup><mml:mi>&#x03B8;</mml:mi><mml:mrow><mml:mo>+</mml:mo></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2264;</mml:mo><mml:mrow><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03B8;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mi>&#x03B7;</mml:mi><mml:mrow><mml:mo>&#x27E8;</mml:mo><mml:mi mathvariant="normal">&#x2207;</mml:mi><mml:mrow><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03B8;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:mover><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mrow><mml:mo stretchy="false">&#x21BC;</mml:mo></mml:mrow></mml:mover><mml:mo>&#x27E9;</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:mfrac><mml:mrow><mml:mi>L</mml:mi><mml:msup><mml:mi>&#x03B7;</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:mrow><mml:mn>2</mml:mn></mml:mfrac><mml:mo>&#x2225;</mml:mo><mml:mover><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mrow><mml:mo stretchy="false">&#x21BC;</mml:mo></mml:mrow></mml:mover><mml:msup><mml:mo>&#x2225;</mml:mo><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mo>.</mml:mo></mml:math></disp-formula></p>
<p>Take the conditional expectation given <inline-formula id="ieqn-164"><mml:math id="mml-ieqn-164"><mml:mi>&#x03B8;</mml:mi></mml:math></inline-formula>. Write <inline-formula id="ieqn-165"><mml:math id="mml-ieqn-165"><mml:mover><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mrow><mml:mo stretchy="false">&#x21BC;</mml:mo></mml:mrow></mml:mover><mml:mo>=</mml:mo><mml:mi mathvariant="normal">&#x2207;</mml:mi><mml:mrow><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03B8;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:mo>&#x03F5;</mml:mo></mml:math></inline-formula>, where <inline-formula id="ieqn-166"><mml:math id="mml-ieqn-166"><mml:mo>&#x03F5;</mml:mo></mml:math></inline-formula> has zero mean and a covariance <inline-formula><mml:math><mml:mi mathvariant="normal">&#x03A3;</mml:mi></mml:math></inline-formula> whose trace satisfies <inline-formula id="ieqn-167"><mml:math id="mml-ieqn-167"><mml:mi>tr</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="normal">&#x03A3;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2264;</mml:mo><mml:mi>d</mml:mi><mml:msubsup><mml:mi>&#x03C3;</mml:mi><mml:mrow><mml:mrow><mml:mtext>eff</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mo>+</mml:mo><mml:msup><mml:mi>&#x03B6;</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula> (DP noise with per-coordinate variance <inline-formula><mml:math><mml:msubsup><mml:mi>&#x03C3;</mml:mi><mml:mrow><mml:mrow><mml:mtext>eff</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:math></inline-formula>, plus stochastic/clipping noise with total variance at most <inline-formula><mml:math><mml:msup><mml:mi>&#x03B6;</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula>). 
Then <inline-formula id="ieqn-168"><mml:math id="mml-ieqn-168"><mml:mrow><mml:mrow><mml:mi mathvariant="double-struck">E</mml:mi></mml:mrow></mml:mrow><mml:mo>&#x2225;</mml:mo><mml:mover><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mrow><mml:mo stretchy="false">&#x21BC;</mml:mo></mml:mrow></mml:mover><mml:msup><mml:mo>&#x2225;</mml:mo><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mo>&#x2264;</mml:mo><mml:mo>&#x2225;</mml:mo><mml:mi mathvariant="normal">&#x2207;</mml:mi><mml:mrow><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03B8;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:msup><mml:mo>&#x2225;</mml:mo><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mo>+</mml:mo><mml:mi>d</mml:mi><mml:msubsup><mml:mi>&#x03C3;</mml:mi><mml:mrow><mml:mrow><mml:mtext>eff</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mo>+</mml:mo><mml:msup><mml:mi>&#x03B6;</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula>. 
With <inline-formula id="ieqn-169"><mml:math id="mml-ieqn-169"><mml:mi>&#x03B7;</mml:mi><mml:mo>&#x2264;</mml:mo><mml:mn>1</mml:mn><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:mi>L</mml:mi></mml:math></inline-formula> and strong convexity (<inline-formula id="ieqn-170"><mml:math id="mml-ieqn-170"><mml:mo stretchy="false">&#x2225;</mml:mo><mml:mi mathvariant="normal">&#x2207;</mml:mi><mml:mrow><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03B8;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:msup><mml:mo>&#x2225;</mml:mo><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mo>&#x2265;</mml:mo><mml:mn>2</mml:mn><mml:mi>&#x03BC;</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03B8;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2212;</mml:mo><mml:msup><mml:mrow><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mo>&#x22C6;</mml:mo></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>), it yields:
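<disp-formula id="eqn-21"><label>(21)</label><mml:math id="mml-eqn-21" display="block"><mml:mrow><mml:mrow><mml:mi mathvariant="double-struck">E</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:msup><mml:mi>&#x03B8;</mml:mi><mml:mrow><mml:mo>+</mml:mo></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2212;</mml:mo><mml:msup><mml:mrow><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mo>&#x22C6;</mml:mo></mml:mrow></mml:msup><mml:mo>]</mml:mo></mml:mrow><mml:mo>&#x2264;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x2212;</mml:mo><mml:mi>&#x03B7;</mml:mi><mml:mi>&#x03BC;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03B8;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2212;</mml:mo><mml:msup><mml:mrow><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mo>&#x22C6;</mml:mo></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:mfrac><mml:mrow><mml:msup><mml:mi>&#x03B7;</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mi>L</mml:mi></mml:mrow><mml:mn>2</mml:mn></mml:mfrac><mml:mi>d</mml:mi><mml:msubsup><mml:mi>&#x03C3;</mml:mi><mml:mrow><mml:mrow><mml:mtext>eff</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mo>+</mml:mo><mml:mfrac><mml:mrow><mml:msup><mml:mi>&#x03B7;</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mi>L</mml:mi></mml:mrow><mml:mn>2</mml:mn></mml:mfrac><mml:msup><mml:mi>&#x03B6;</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mo>.</mml:mo></mml:math></disp-formula></p>
<p>Unroll the recursion for <inline-formula id="ieqn-171"><mml:math id="mml-ieqn-171"><mml:mi>T</mml:mi></mml:math></inline-formula> steps and bound the geometric series <inline-formula><mml:math><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:mrow><mml:mrow><mml:mi>T</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:munderover><mml:msup><mml:mrow><mml:mo>(</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x2212;</mml:mo><mml:mi>&#x03B7;</mml:mi><mml:mi>&#x03BC;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msup><mml:mo>&#x2264;</mml:mo><mml:mn>1</mml:mn><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03B7;</mml:mi><mml:mi>&#x03BC;</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>; this converts the per-step factor <inline-formula><mml:math><mml:msup><mml:mi>&#x03B7;</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mi>L</mml:mi><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:mn>2</mml:mn></mml:math></inline-formula> into <inline-formula><mml:math><mml:mi>&#x03B7;</mml:mi><mml:mi>L</mml:mi><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mn>2</mml:mn><mml:mi>&#x03BC;</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> and gives the stated bound. &#x25A1;</p>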
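<p>As a sanity check on the bound in Eq. (19), consider a toy objective that is not used in the paper: the quadratic loss equal to half the squared norm of the parameters, for which the smoothness and strong-convexity constants are both 1 and the minimum loss is 0. With an exact gradient plus isotropic Gaussian noise, the expected loss obeys an exact recursion, so it can be compared with the bound without any sampling. The sketch below assumes this toy setting; the function names and numerical values are illustrative.</p>
<code language="python"><![CDATA[
```python
def expected_loss_quadratic(loss0, eta, d, sigma_eff_sq, T):
    """Exact E[L(theta_T)] for L(theta) = 0.5 * ||theta||^2 under noisy gradient steps.

    The update theta <- (1 - eta) * theta - eta * noise, with noise ~ N(0, sigma_eff_sq * I),
    gives E[L] <- (1 - eta)^2 * E[L] + 0.5 * eta^2 * d * sigma_eff_sq.
    """
    loss = loss0
    for _ in range(T):
        loss = (1.0 - eta) ** 2 * loss + 0.5 * eta ** 2 * d * sigma_eff_sq
    return loss


def theorem4_bound(loss0, eta, L, mu, d, sigma_eff_sq, zeta_sq, T):
    """Right-hand side of Eq. (19), with the optimal loss taken to be zero."""
    transient = (1.0 - eta * mu) ** T * loss0
    dp_term = eta * L * d * sigma_eff_sq / (2.0 * mu)
    sgd_term = eta * L * zeta_sq / (2.0 * mu)
    return transient + dp_term + sgd_term


# Hypothetical values: step size 0.1, dimension 100, effective noise variance 0.01
for T in (1, 10, 100, 1000):
    exact = expected_loss_quadratic(5.0, 0.1, 100, 0.01, T)
    bound = theorem4_bound(5.0, 0.1, 1.0, 1.0, 100, 0.01, 0.0, T)
    print(T, exact, bound)
```
]]></code>
<p>In this toy case the exact expected loss stays below the bound for every horizon, and both settle at a noise floor proportional to the step size, the dimension, and the effective noise variance.</p>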
<p><bold>Corollary 3 (Explicit DP&#x2013;Utility Tradeoff):</bold> <italic>Using the Gaussian calibration</italic> <inline-formula id="ieqn-172"><mml:math id="mml-ieqn-172"><mml:msub><mml:mi>&#x03C3;</mml:mi><mml:mrow><mml:mrow><mml:mtext>upd</mml:mtext></mml:mrow></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mn>2</mml:mn><mml:mi>C</mml:mi><mml:msqrt><mml:mn>2</mml:mn><mml:mi>ln</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mn>1.25</mml:mn><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:msub><mml:mi>&#x03B4;</mml:mi><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:msqrt></mml:mrow><mml:msub><mml:mo>&#x03F5;</mml:mo><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:mfrac></mml:math></inline-formula> <italic>per round (Cor. 1), the effective noise term in Thm. 4 is</italic> <inline-formula id="ieqn-173"><mml:math id="mml-ieqn-173"><mml:msubsup><mml:mi>&#x03C3;</mml:mi><mml:mrow><mml:mrow><mml:mtext>eff</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mn>8</mml:mn><mml:msup><mml:mi>C</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mi>ln</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mn>1.25</mml:mn><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:msub><mml:mi>&#x03B4;</mml:mi><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>N</mml:mi><mml:msubsup><mml:mo>&#x03F5;</mml:mo><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:mrow></mml:mfrac></mml:math></inline-formula>. <italic>Thus, the DP-induced steady-state error contribution is</italic>:
<disp-formula id="eqn-22"><label>(22)</label><mml:math id="mml-eqn-22" display="block"><mml:mfrac><mml:mrow><mml:mi>&#x03B7;</mml:mi><mml:mi>L</mml:mi><mml:mi>d</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn><mml:mi>&#x03BC;</mml:mi></mml:mrow></mml:mfrac><mml:msubsup><mml:mi>&#x03C3;</mml:mi><mml:mrow><mml:mrow><mml:mtext>eff</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mn>4</mml:mn><mml:mi>&#x03B7;</mml:mi><mml:mi>L</mml:mi><mml:mi>d</mml:mi></mml:mrow><mml:mi>&#x03BC;</mml:mi></mml:mfrac><mml:mo>&#x22C5;</mml:mo><mml:mfrac><mml:mrow><mml:msup><mml:mi>C</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mi>ln</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mfrac><mml:mn>1.25</mml:mn><mml:msub><mml:mi>&#x03B4;</mml:mi><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:mfrac><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>N</mml:mi><mml:msubsup><mml:mo>&#x03F5;</mml:mo><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:mrow></mml:mfrac></mml:math></disp-formula></p>
<p><italic>This term depends on the per-round budget</italic> <inline-formula id="ieqn-174"><mml:math id="mml-ieqn-174"><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mo>&#x03F5;</mml:mo><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>&#x03B4;</mml:mi><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> <italic>and the number of devices</italic> <inline-formula id="ieqn-175"><mml:math id="mml-ieqn-175"><mml:mi>N</mml:mi></mml:math></inline-formula><italic>: with all other quantities fixed, tighter privacy (smaller</italic> <inline-formula id="ieqn-176"><mml:math id="mml-ieqn-176"><mml:msub><mml:mo>&#x03F5;</mml:mo><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula><italic>) increases it as</italic> <inline-formula id="ieqn-177"><mml:math id="mml-ieqn-177"><mml:mn>1</mml:mn><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:mspace width="negativethinmathspace" /><mml:msubsup><mml:mo>&#x03F5;</mml:mo><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:math></inline-formula>, <italic>while more devices decrease it as</italic> <inline-formula id="ieqn-178"><mml:math id="mml-ieqn-178"><mml:mn>1</mml:mn><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:mi>N</mml:mi></mml:math></inline-formula>.</p>
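<p>The trade-off in Corollary 3 can be made concrete with a short calculation. The sketch below evaluates the Gaussian calibration and the DP-induced error term of Eq. (22); the function names and all numerical values (step size, smoothness and strong-convexity constants, dimension, budget) are illustrative assumptions rather than the settings used in the experiments.</p>
<code language="python"><![CDATA[
```python
import math


def gaussian_sigma_upd(eps0, delta0, clip_c):
    """Per-round noise scale: sigma_upd = 2C * sqrt(2 ln(1.25 / delta0)) / eps0."""
    return 2.0 * clip_c * math.sqrt(2.0 * math.log(1.25 / delta0)) / eps0


def dp_steady_state_error(eta, L, mu, d, eps0, delta0, clip_c, n_clients):
    """DP-induced term of Eq. (22): eta * L * d * sigma_eff^2 / (2 * mu)."""
    sigma_eff_sq = gaussian_sigma_upd(eps0, delta0, clip_c) ** 2 / n_clients
    return eta * L * d * sigma_eff_sq / (2.0 * mu)


# Hypothetical problem constants, chosen only to illustrate the scaling
base = dict(eta=0.01, L=1.0, mu=0.1, d=1000, delta0=1e-5, clip_c=1.0)
err = dp_steady_state_error(eps0=0.5, n_clients=20, **base)
err_tighter = dp_steady_state_error(eps0=0.25, n_clients=20, **base)
err_more_clients = dp_steady_state_error(eps0=0.5, n_clients=40, **base)
print(err_tighter / err, err_more_clients / err)
```
]]></code>
<p>Halving the per-round privacy parameter multiplies the error term by four, whereas doubling the number of devices halves it, matching the scaling stated in the corollary.</p>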
<p><bold>Gradient Clipping and Convergence Assumptions:</bold> To ensure numerical stability during privacy-preserving training, gradients were clipped to <inline-formula id="ieqn-179"><mml:math id="mml-ieqn-179"><mml:mo fence="false" stretchy="false">&#x2016;</mml:mo><mml:mi>g</mml:mi><mml:msub><mml:mo fence="false" stretchy="false">&#x2016;</mml:mo><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo>&#x2264;</mml:mo><mml:mn>1.0</mml:mn></mml:math></inline-formula>, which bounds the sensitivity of model updates before differential-privacy noise is injected. Because the Gaussian noise scale is proportional to the clipping bound, this constraint also controls the noise variance, and it prevents gradient explosion under non-IID client distributions. Training was considered converged when the improvement in validation loss remained below 1% over five consecutive communication rounds, a condition verified across all experimental configurations. These settings provide a reproducible basis for analysing training dynamics under privacy and encryption constraints.</p>
</sec>
<sec id="s3_4_6">
<label>3.4.6</label>
<title>Computational Complexity</title>
<p>The computational complexity of EdgeTrustX per round is analysed as follows:</p>
<p><bold>Local Computation:</bold> Each device performs transformer computation with complexity <inline-formula id="ieqn-180"><mml:math id="mml-ieqn-180"><mml:mi>O</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:msup><mml:mi>d</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mo>+</mml:mo><mml:msubsup><mml:mi>n</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mi>d</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> for attention computation and <inline-formula id="ieqn-181"><mml:math id="mml-ieqn-181"><mml:mi>O</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:msup><mml:mi>d</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> for feed-forward layers.</p>
<p><bold>Communication Complexity:</bold> The communication cost per round is <inline-formula id="ieqn-182"><mml:math id="mml-ieqn-182"><mml:mi>O</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>N</mml:mi><mml:mi>d</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> for encrypted model updates.</p>
<p><bold>Aggregation Complexity:</bold> The server aggregation complexity is <inline-formula id="ieqn-183"><mml:math id="mml-ieqn-183"><mml:mi>O</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>N</mml:mi><mml:mi>d</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> for homomorphic operations.</p>
<p>The total complexity per round is:
<disp-formula id="eqn-23"><label>(23)</label><mml:math id="mml-eqn-23" display="block"><mml:msub><mml:mrow><mml:mrow><mml:mi mathvariant="script">C</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mtext>total</mml:mtext></mml:mrow></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mi>O</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:munderover><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:msup><mml:mi>d</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mo>+</mml:mo><mml:msubsup><mml:mi>n</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mi>d</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:mi>N</mml:mi><mml:mi>d</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula></p>
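<p>To see which term of Eq. (23) dominates for a given deployment, the expression can be evaluated with all hidden constants set to 1. The helper below is a rough sketch under that assumption; its name and the example sizes are hypothetical, and the result is a unitless order-of-growth proxy rather than a FLOP count or a runtime prediction.</p>
<code language="python"><![CDATA[
```python
def per_round_cost(local_sizes, d):
    """Evaluate Eq. (23) with unit constants: sum_i (n_i * d^2 + n_i^2 * d) + N * d."""
    # Local transformer cost on each device (projection term plus attention term)
    local = sum(n * d ** 2 + n ** 2 * d for n in local_sizes)
    # Communication/aggregation cost for N encrypted updates of dimension d
    comm_and_agg = len(local_sizes) * d
    return local + comm_and_agg


# Hypothetical example: 20 devices with n_i = 128 and hidden dimension d = 512
sizes = [128] * 20
total = per_round_cost(sizes, 512)
comm_share = (len(sizes) * 512) / total
print(total, comm_share)
```
]]></code>
<p>Under these illustrative sizes the local transformer terms dominate by several orders of magnitude. The quadratic term in the local size overtakes the quadratic term in the hidden dimension only once the local size exceeds the hidden dimension.</p>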
<p>Incorporating DP and HE introduces additional computational overhead due to noise injection and encryption. However, this cost is offset by reduced communication frequency and parameter sharing, leading to an overall 23% reduction in transmission volume while maintaining &#x003C;0.5 s per round latency.</p>
</sec>
</sec>
</sec>
<sec id="s4">
<label>4</label>
<title>Results and Discussion</title>
<p>This section presents a comprehensive experimental evaluation of EdgeTrustX across multiple dimensions: threat detection performance, privacy preservation effectiveness, scalability analysis, and explainability assessment. Experiments are conducted on real-world IoT datasets and compare EdgeTrustX against state-of-the-art baselines.</p>
<sec id="s4_1">
<label>4.1</label>
<title>Experimental Setup</title>
<sec id="s4_1_1">
<label>4.1.1</label>
<title>Datasets</title>
<p>EdgeTrustX is evaluated on four widely used IoT security datasets, with explicit links to the datasets provided to ensure reproducibility.
<list list-type="simple">
<list-item><label>(1)</label><p><bold>IoT-23 Dataset</bold><xref ref-type="fn" rid="fn1"><sup>1</sup></xref><fn id="fn1"><label>1</label><p><ext-link ext-link-type="uri" xlink:href="https://www.kaggle.com/datasets/surajsooraj26/iot-23/data">https://www.kaggle.com/datasets/surajsooraj26/iot-23/data</ext-link>, (accessed on 11 November 2025).</p></fn><bold>:</bold> Contains network traffic from 20 IoT devices with various malware families, including 325,307 benign flows and 129,274 malicious flows.</p></list-item>
<list-item><label>(2)</label><p><bold>N-BaIoT Dataset</bold><xref ref-type="fn" rid="fn2"><sup>2</sup></xref><fn id="fn2"><label>2</label><p><ext-link ext-link-type="uri" xlink:href="https://www.kaggle.com/datasets/mkashifn/nbaiot-dataset">https://www.kaggle.com/datasets/mkashifn/nbaiot-dataset</ext-link>, (accessed on 11 November 2025).</p></fn><bold>:</bold> Comprises network traffic from 9 commercial IoT devices infected with Mirai and BASHLITE botnets, totalling 7,062,606 instances.</p></list-item>
<list-item><label>(3)</label><p><bold>UNSW-NB15 Dataset</bold><xref ref-type="fn" rid="fn3"><sup>3</sup></xref><fn id="fn3"><label>3</label><p><ext-link ext-link-type="uri" xlink:href="https://www.kaggle.com/datasets/mrwellsdavid/unsw-nb15">https://www.kaggle.com/datasets/mrwellsdavid/unsw-nb15</ext-link>, (accessed on 11 November 2025).</p></fn><bold>:</bold> Contains 2,540,044 records with nine types of attacks, including Exploits, Reconnaissance, DoS, and Generic attacks.</p></list-item>
<list-item><label>(4)</label><p><bold>CIC-IDS2017 Dataset</bold><xref ref-type="fn" rid="fn4"><sup>4</sup></xref><fn id="fn4"><label>4</label><p><ext-link ext-link-type="uri" xlink:href="https://www.kaggle.com/datasets/dhoogla/cicids2017">https://www.kaggle.com/datasets/dhoogla/cicids2017</ext-link>, (accessed on 11 November 2025).</p></fn><bold>:</bold> Includes 2,830,743 network flow records capturing normal traffic and contemporary attack scenarios.</p></list-item>
</list></p>
<p>Each dataset was partitioned across 20 clients under both IID and non-IID conditions. In the IID setting, samples were distributed uniformly at random while preserving class balance; in the non-IID setting, a Dirichlet distribution (<italic>&#x03B1;</italic> &#x003D; 0.5) was used to emulate heterogeneity in client data sizes (ranging from 1 K to 80 K samples). A random seed of 42 was fixed for all splits to ensure reproducibility. Unless specified otherwise, reported results correspond to the non-IID configuration, so they reflect the behaviour of EdgeTrustX under heterogeneous data distributions.</p>
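<p>A Dirichlet-based split of this kind can be sketched as follows. This is an illustrative reconstruction, not the authors' partitioning script: it assumes the Dirichlet draw controls only the share of samples each client receives (quantity skew), it does not enforce the 1 K to 80 K size range, and the function name is hypothetical. The defaults mirror the stated settings of 20 clients, a concentration of 0.5, and a seed of 42.</p>
<code language="python"><![CDATA[
```python
import numpy as np


def dirichlet_size_partition(num_samples, num_clients=20, alpha=0.5, seed=42):
    """Split sample indices across clients with Dirichlet-distributed client sizes.

    Assumption: quantity skew only; label skew would need a per-class Dirichlet draw.
    """
    rng = np.random.default_rng(seed)
    # Share of the dataset assigned to each client; small alpha gives uneven shares
    proportions = rng.dirichlet(alpha * np.ones(num_clients))
    # Cumulative shares become cut points into a shuffled index array
    cuts = (np.cumsum(proportions)[:-1] * num_samples).astype(int)
    shuffled = rng.permutation(num_samples)
    return np.split(shuffled, cuts)


parts = dirichlet_size_partition(10_000)
print(sorted(len(p) for p in parts))
```
]]></code>
<p>Every index is assigned to exactly one client, and fixing the seed makes the split repeatable. Very small clients (including empty ones) can occur with a concentration of 0.5, so a minimum-size check may be needed in practice.</p>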
<p><xref ref-type="table" rid="table-2">Table 2</xref> summarises the primary hyperparameters used for training EdgeTrustX and the comparison baselines to ensure experimental reproducibility.</p>
<table-wrap id="table-2">
<label>Table 2</label>
<caption>
<title>Model hyperparameter settings</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
</colgroup>
<thead>
<tr>
<th>Model/Framework</th>
<th>Learning rate</th>
<th>Batch size</th>
<th>Epochs</th>
<th>Optimizer</th>
<th>Hidden dim (d)</th>
<th>Layers</th>
<th>DP &#x03B5;</th>
<th>HE scheme</th>
<th>Quantization</th>
</tr>
</thead>
<tbody>
<tr>
<td>EdgeTrustX</td>
<td>1 &#x00D7; 10<sup>&#x2212;4</sup></td>
<td>64</td>
<td>100</td>
<td>Adam</td>
<td>512</td>
<td>6</td>
<td>0.1</td>
<td>CKKS</td>
<td>8-bit</td>
</tr>
<tr>
<td>DP-Fed</td>
<td>1 &#x00D7; 10<sup>&#x2212;4</sup></td>
<td>64</td>
<td>100</td>
<td>Adam</td>
<td>256</td>
<td>4</td>
<td>0.1</td>
<td>N/A</td>
<td>16-bit</td>
</tr>
<tr>
<td>HE-Fed</td>
<td>1 &#x00D7; 10<sup>&#x2212;4</sup></td>
<td>64</td>
<td>100</td>
<td>Adam</td>
<td>256</td>
<td>4</td>
<td>N/A</td>
<td>CKKS</td>
<td>16-bit</td>
</tr>
<tr>
<td>Base Transformer</td>
<td>1 &#x00D7; 10<sup>&#x2212;4</sup></td>
<td>64</td>
<td>100</td>
<td>Adam</td>
<td>512</td>
<td>6</td>
<td>N/A</td>
<td>N/A</td>
<td>16-bit</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="s4_1_2">
<label>4.1.2</label>
<title>Baseline Methods</title>
<p>We compare EdgeTrustX with the following state-of-the-art approaches:</p>
<p>FedAvg-CNN: Standard federated averaging with a deep convolutional architecture, following the deep federated learning approach of Albogami [<xref ref-type="bibr" rid="ref-2">2</xref>].</p>
<p>FedProx-LSTM: Federated proximal variant built on an LSTM-based intrusion detection framework, as in the privacy-preserving LSTM-JSO federated model of Sorour et al. [<xref ref-type="bibr" rid="ref-15">15</xref>].</p>
<p>DP-Fed: Differential-privacy&#x2013;enhanced federated learning, following the DP-integrated FL mechanisms adopted by Albogami [<xref ref-type="bibr" rid="ref-2">2</xref>].</p>
<p>HE-Fed: Homomorphic-encryption&#x2013;based federated learning, consistent with the blockchain-assisted, HE-enabled FL system proposed by Han [<xref ref-type="bibr" rid="ref-20">20</xref>].</p>
<p>Centralised-Transformer: Centralised transformer-based intrusion detection system without privacy guarantees, represented by the explainable transformer-powered IoMT threat detection model of Alsharaiah et al. [<xref ref-type="bibr" rid="ref-9">9</xref>].</p>
</sec>
<sec id="s4_1_3">
<label>4.1.3</label>
<title>Implementation Details</title>
<p>EdgeTrustX is implemented in PyTorch with the following configuration: Transformer layers: 6 encoder layers with 8 attention heads; Hidden dimension: 512; Feed-forward dimension: 2048; Learning rate: 0.001 with Adam optimiser; Batch size: 32; Local epochs: 5; Privacy parameters: <inline-formula id="ieqn-184"><mml:math id="mml-ieqn-184"><mml:mo>&#x03F5;</mml:mo><mml:mspace width="thinmathspace" /><mml:mo>=</mml:mo><mml:mn>0.1</mml:mn></mml:math></inline-formula>, <inline-formula id="ieqn-185"><mml:math id="mml-ieqn-185"><mml:mi>&#x03B4;</mml:mi><mml:mo>=</mml:mo><mml:msup><mml:mn>10</mml:mn><mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mn>5</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula>; Communication rounds: 100.</p>
<p>The experiments are conducted on a distributed testbed consisting of 20 edge devices (Raspberry Pi 4) and a central server (Intel Xeon E5-2690).</p>
</sec>
</sec>
<sec id="s4_2">
<label>4.2</label>
<title>Threat Detection Performance</title>
<p><xref ref-type="table" rid="table-3">Table 3</xref> presents a comprehensive comparison of threat detection performance across all datasets and methods.</p>
<table-wrap id="table-3">
<label>Table 3</label>
<caption>
<title>Threat detection performance comparison (Mean &#x00B1; Std over 5 runs)</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
</colgroup>
<thead>
<tr>
<th align="center" rowspan="2">Method</th>
<th colspan="2">IoT-23</th>
<th colspan="2">N-BaIoT</th>
<th colspan="2">UNSW-NB15</th>
<th colspan="2">CIC-IDS2017</th>
<th rowspan="2">Avg. Acc.</th>
</tr>
<tr>
<th>Acc. (%)</th>
<th>F1</th>
<th>Acc. (%)</th>
<th>F1</th>
<th>Acc. (%)</th>
<th>F1</th>
<th>Acc. (%)</th>
<th>F1</th>
</tr>
</thead>
<tbody>
<tr>
<td>FedAvg-CNN</td>
<td>87.3 &#x00B1; 0.5</td>
<td>0.85 &#x00B1; 0.01</td>
<td>89.1 &#x00B1; 0.6</td>
<td>0.87 &#x00B1; 0.02</td>
<td>85.6 &#x00B1; 0.7</td>
<td>0.83 &#x00B1; 0.02</td>
<td>88.9 &#x00B1; 0.6</td>
<td>0.86 &#x00B1; 0.01</td>
<td>87.7</td>
</tr>
<tr>
<td>FedProx-LSTM</td>
<td>89.5 &#x00B1; 0.6</td>
<td>0.88 &#x00B1; 0.02</td>
<td>91.2 &#x00B1; 0.7</td>
<td>0.89 &#x00B1; 0.03</td>
<td>87.8 &#x00B1; 0.8</td>
<td>0.85 &#x00B1; 0.02</td>
<td>90.3 &#x00B1; 0.7</td>
<td>0.88 &#x00B1; 0.01</td>
<td>89.7</td>
</tr>
<tr>
<td>DP-Fed</td>
<td>86.1 &#x00B1; 0.7</td>
<td>0.83 &#x00B1; 0.02</td>
<td>88.4 &#x00B1; 0.6</td>
<td>0.85 &#x00B1; 0.02</td>
<td>84.2 &#x00B1; 0.7</td>
<td>0.81 &#x00B1; 0.02</td>
<td>87.6 &#x00B1; 0.6</td>
<td>0.84 &#x00B1; 0.02</td>
<td>86.6</td>
</tr>
<tr>
<td>HE-Fed</td>
<td>88.7 &#x00B1; 0.5</td>
<td>0.86 &#x00B1; 0.01</td>
<td>90.1 &#x00B1; 0.7</td>
<td>0.87 &#x00B1; 0.02</td>
<td>86.9 &#x00B1; 0.7</td>
<td>0.84 &#x00B1; 0.01</td>
<td>89.7 &#x00B1; 0.6</td>
<td>0.87 &#x00B1; 0.02</td>
<td>88.9</td>
</tr>
<tr>
<td>Centralised-Transformer</td>
<td><bold>95.2 &#x00B1; 0.3</bold></td>
<td><bold>0.94 &#x00B1; 0.01</bold></td>
<td><bold>96.8 &#x00B1; 0.3</bold></td>
<td><bold>0.96 &#x00B1; 0.01</bold></td>
<td><bold>93.4 &#x00B1; 0.4</bold></td>
<td><bold>0.92 &#x00B1; 0.01</bold></td>
<td><bold>95.9 &#x00B1; 0.3</bold></td>
<td><bold>0.95 &#x00B1; 0.01</bold></td>
<td><bold>95.3</bold></td>
</tr>
<tr>
<td>EdgeTrustX (Ours)</td>
<td>94.7 &#x00B1; 0.2</td>
<td>0.93 &#x00B1; 0.01</td>
<td>96.1 &#x00B1; 0.3</td>
<td>0.95 &#x00B1; 0.01</td>
<td>92.8 &#x00B1; 0.3</td>
<td>0.91 &#x00B1; 0.01</td>
<td>95.2 &#x00B1; 0.2</td>
<td>0.94 &#x00B1; 0.01</td>
<td>94.7</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="table-3fn1" fn-type="other">
<p>Note: Bold values indicate the best performance metrics among all compared methods.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<p>EdgeTrustX achieves an average accuracy of 94.7% across the four datasets, 5.0 to 8.1 percentage points above the federated baselines (89.7% for FedProx-LSTM down to 86.6% for DP-Fed). It remains within 0.6 percentage points of the centralised transformer (95.3%), which has the highest accuracy but offers no privacy guarantees.</p>
<p><xref ref-type="fig" rid="fig-4">Fig. 4</xref> provides a three-part analysis of EdgeTrustX&#x2019;s effectiveness. <xref ref-type="fig" rid="fig-4">Fig. 4a</xref> breaks down power consumption by components across IoT device types, with Jetson Nano consuming the most. <xref ref-type="fig" rid="fig-4">Fig. 4b</xref> shows method-wise comparison on Raspberry Pi 4, where EdgeTrustX offers the best energy efficiency and battery life. <xref ref-type="fig" rid="fig-4">Fig. 4c</xref> evaluates the energy-accuracy-privacy trade-off, plotting detection accuracy against energy, with bubble size indicating privacy score. EdgeTrustX achieves the optimal point, balancing all three axes. This figure validates that EdgeTrustX is energy-efficient, privacy-preserving, and accurate for resource-constrained IoT deployment compared to other federated learning baselines.</p>
<fig id="fig-4">
<label>Figure 4</label>
<caption>
<title>Detailed performance comparison showing accuracy, precision, recall, and F1-score across different datasets and methods</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_73584-fig-4.tif"/>
</fig>
<p><italic>Attack Type-Specific Analysis</italic></p>
<p><xref ref-type="table" rid="table-4">Table 4</xref> presents the detection performance for different attack types, demonstrating EdgeTrustX&#x2019;s effectiveness across various threat categories.</p>
<table-wrap id="table-4">
<label>Table 4</label>
<caption>
<title>Attack type-specific detection performance</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
</colgroup>
<thead>
<tr>
<th>Attack type</th>
<th>Precision (%)</th>
<th>Recall (%)</th>
<th>F1-score</th>
<th>Support (Samples)</th>
</tr>
</thead>
<tbody>
<tr>
<td>DDoS</td>
<td>96.2</td>
<td>94.8</td>
<td>0.955</td>
<td>30,000</td>
</tr>
<tr>
<td>Malware</td>
<td>93.7</td>
<td>95.1</td>
<td>0.944</td>
<td>22,000</td>
</tr>
<tr>
<td>Reconnaissance</td>
<td>91.4</td>
<td>89.6</td>
<td>0.905</td>
<td>18,500</td>
</tr>
<tr>
<td>Data exfiltration</td>
<td>94.8</td>
<td>92.3</td>
<td>0.935</td>
<td>20,000</td>
</tr>
<tr>
<td>Botnet</td>
<td>95.5</td>
<td>96.7</td>
<td>0.961</td>
<td>25,000</td>
</tr>
<tr>
<td>IoT-specific attacks</td>
<td>92.1</td>
<td>93.4</td>
<td>0.927</td>
<td>19,000</td>
</tr>
<tr>
<td><bold>Average</bold></td>
<td><bold>93.9</bold></td>
<td><bold>93.6</bold></td>
<td><bold>0.938</bold></td>
<td>&#x2014;</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="table-4fn1" fn-type="other">
<p>Note: Bold values denote the average across the six attack types.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<p>The results indicate that EdgeTrustX maintains robust performance across all attack types, with F1-scores ranging from 0.905 for reconnaissance to 0.961 for botnet detection; DDoS detection reaches 0.955.</p>
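<p>As a consistency check on <xref ref-type="table" rid="table-4">Table 4</xref>, the following minimal sketch recomputes each F1-score as the harmonic mean of the reported precision and recall, and recomputes the averages as unweighted (macro) means. The function and variable names are illustrative and are not part of the EdgeTrustX implementation.</p>
<preformat preformat-type="code">
```python
# Recompute the F1-scores and the "Average" row of Table 4 from the reported
# precision and recall values (percentages copied from the table).
TABLE_4 = {
    "DDoS": (96.2, 94.8),
    "Malware": (93.7, 95.1),
    "Reconnaissance": (91.4, 89.6),
    "Data exfiltration": (94.8, 92.3),
    "Botnet": (95.5, 96.7),
    "IoT-specific attacks": (92.1, 93.4),
}


def f1_score(precision_pct, recall_pct):
    """Harmonic mean of precision and recall, returned on a 0-1 scale."""
    p, r = precision_pct / 100.0, recall_pct / 100.0
    return 2.0 * p * r / (p + r)


def macro_average(values):
    """Unweighted mean: every attack type counts equally, whatever its support."""
    values = list(values)
    return sum(values) / len(values)


f1_by_attack = {name: f1_score(p, r) for name, (p, r) in TABLE_4.items()}
macro_precision = macro_average(p for p, _ in TABLE_4.values())
macro_recall = macro_average(r for _, r in TABLE_4.values())
macro_f1 = macro_average(f1_by_attack.values())

for name, f1 in f1_by_attack.items():
    print(f"{name:22s} F1 = {f1:.3f}")
print(f"macro precision = {macro_precision:.2f}%, "
      f"macro recall = {macro_recall:.2f}%, macro F1 = {macro_f1:.3f}")
```
</preformat>
<p>The unweighted means of precision and recall are 93.95% and 93.65%, which the table reports as 93.9 and 93.6; the macro F1 rounds to 0.938. The averages therefore weight each attack type equally rather than by its support.</p>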
</sec>
<sec id="s4_3">
<label>4.3</label>
<title>Privacy Preservation Analysis</title>
<sec id="s4_3_1">
<label>4.3.1</label>
<title>Differential Privacy Evaluation</title>
<p>We evaluate the privacy preservation capabilities of EdgeTrustX through membership inference attacks and reconstruction attacks. <xref ref-type="fig" rid="fig-5">Fig. 5</xref> illustrates the privacy-utility trade-off under different privacy budgets.</p>
<fig id="fig-5">
<label>Figure 5</label>
<caption>
<title>Privacy-utility trade-off analysis showing model accuracy vs. privacy budget (<inline-formula id="ieqn-186"><mml:math id="mml-ieqn-186"><mml:mo>&#x03F5;</mml:mo></mml:math></inline-formula>) for different privacy mechanisms</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_73584-fig-5.tif"/>
</fig>
<p><xref ref-type="fig" rid="fig-5">Fig. 5</xref> examines privacy-utility trade-offs in federated IoT learning. <xref ref-type="fig" rid="fig-5">Fig. 5a</xref> illustrates how detection accuracy declines as the privacy budget (<inline-formula id="ieqn-187"><mml:math id="mml-ieqn-187"><mml:mo>&#x03F5;</mml:mo></mml:math></inline-formula>) is tightened, with EdgeTrustX maintaining high accuracy even at low <inline-formula id="ieqn-188"><mml:math id="mml-ieqn-188"><mml:mo>&#x03F5;</mml:mo></mml:math></inline-formula> values. <xref ref-type="fig" rid="fig-5">Fig. 5b</xref> compares effectiveness scores of privacy mechanisms (DP, HE, secure aggregation, noise addition), where EdgeTrustX outperforms others across all categories. <xref ref-type="fig" rid="fig-5">Fig. 5c</xref> visualises the privacy-utility Pareto landscape. EdgeTrustX resides in the optimal zone, offering high privacy and utility simultaneously. The figure indicates that EdgeTrustX preserves privacy with minimal performance compromise, outperforming traditional DP-Fed, HE-Fed, and centralised models on the combined privacy-utility trade-off.</p>
<p><xref ref-type="table" rid="table-5">Table 5</xref> presents the quantitative results of the privacy evaluation using standard privacy attack methods.</p>
<table-wrap id="table-5">
<label>Table 5</label>
<caption>
<title>Privacy preservation evaluation</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
</colgroup>
<thead>
<tr>
<th>Privacy metric</th>
<th>EdgeTrustX</th>
<th>DP-Fed</th>
<th>HE-Fed</th>
</tr>
</thead>
<tbody>
<tr>
<td>MIA success rate (%)</td>
<td>52.1</td>
<td>67.3</td>
<td>58.9</td>
</tr>
<tr>
<td>Reconstruction error</td>
<td>0.847</td>
<td>0.623</td>
<td>0.751</td>
</tr>
<tr>
<td>Privacy budget (<inline-formula id="ieqn-189"><mml:math id="mml-ieqn-189"><mml:mo>&#x03F5;</mml:mo></mml:math></inline-formula>)</td>
<td>0.1</td>
<td>0.1</td>
<td>N/A</td>
</tr>
<tr>
<td>Utility loss (%)</td>
<td>0.6</td>
<td>8.7</td>
<td>6.4</td>
</tr>
<tr>
<td>Privacy score</td>
<td><bold>0.923</bold></td>
<td>0.756</td>
<td>0.834</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="table-5fn1" fn-type="other">
<p>Note: Bold values indicate the best performance metrics among all compared methods.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<p>EdgeTrustX achieves the best privacy score (0.923) with minimal utility loss (0.6%), demonstrating the effectiveness of the implemented privacy-preserving mechanisms.</p>
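<p>One way to read the membership-inference row of <xref ref-type="table" rid="table-5">Table 5</xref> is relative to random guessing. The sketch below assumes a balanced evaluation set (equal numbers of members and non-members), which this section does not state; under that assumption chance accuracy is 50% and the attacker&#x2019;s advantage equals 2 &#x00D7; accuracy &#x2212; 1. The function name is illustrative.</p>
<preformat preformat-type="code">
```python
# Attack success rates copied from Table 5 (percent).
MIA_SUCCESS_PCT = {"EdgeTrustX": 52.1, "DP-Fed": 67.3, "HE-Fed": 58.9}


def membership_advantage(success_pct):
    """Advantage over random guessing for a membership-inference attack.

    ASSUMPTION: the evaluation set is balanced, so accuracy = (TPR + TNR) / 2
    and the advantage TPR - FPR equals 2 * accuracy - 1.
    A value of 0 means chance level; 1 means a perfect attack.
    """
    return 2.0 * (success_pct / 100.0) - 1.0


for method, pct in MIA_SUCCESS_PCT.items():
    print(f"{method:11s} advantage = {membership_advantage(pct):.3f}")
```
</preformat>
<p>Under this assumption the advantage is about 0.04 for EdgeTrustX, against roughly 0.35 for DP-Fed and 0.18 for HE-Fed, so the attack on EdgeTrustX performs close to chance.</p>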
<p>The evaluation was conducted under a semi-honest adversarial model where the attacker observes exchanged gradients during federated rounds. Membership inference was performed using multiple shadow models trained on disjoint subsets of the same dataset, while gradient reconstruction employed an inversion strategy optimised with the Adam algorithm (learning rate &#x003D; 0.001, 2000 iterations). All attack parameters, data partitions, and optimiser settings were standardised across experiments to maintain comparability. The privacy score reported in <xref ref-type="table" rid="table-5">Table 5</xref> is a normalised indicator combining membership-inference resistance, reconstruction difficulty, and utility retention, where higher values denote stronger privacy preservation. Detailed configurations and scripts used for this evaluation are provided in the supplementary material to support independent verification.</p>
<p>The strong performance of EdgeTrustX at low privacy budgets (&#x03B5; &#x2264; 0.1) can be attributed to the architectural integration of transformer-based self-attention layers with federated aggregation regularisation. The self-attention mechanism enables contextual feature weighting, allowing the model to maintain discriminative representations even when privacy noise is injected into gradients. Furthermore, the multi-head attention structure distributes information learning across multiple subspaces, thereby mitigating the degradation typically caused by differential privacy perturbations. The use of layer normalisation and residual connections additionally stabilises training under high noise variance, preserving convergence and limiting accuracy loss. Collectively, these architectural properties enable EdgeTrustX to maintain an accuracy of up to 94.7% while adhering to stringent privacy guarantees, indicating a favourable privacy-utility trade-off.</p>
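<p>To give a sense of the noise scale that such a small privacy budget implies, the sketch below evaluates the classical Gaussian-mechanism calibration, in which the noise standard deviation is the sensitivity multiplied by sqrt(2 ln(1.25/&#x03B4;)) and divided by &#x03B5;, valid for &#x03B5; strictly between 0 and 1. That EdgeTrustX uses this particular calibration, and the unit clipping norm, are assumptions made for illustration only; the value is for a single release and ignores composition over communication rounds.</p>
<preformat preformat-type="code">
```python
import math


def gaussian_sigma(epsilon, delta, sensitivity=1.0):
    """Noise std for the classical (epsilon, delta)-DP Gaussian mechanism.

    Valid only for epsilon strictly between 0 and 1. `sensitivity` is the L2
    sensitivity of the released quantity, e.g. a gradient clipping norm
    (ASSUMED to be 1.0 here; the paper's value is not given in this section).
    """
    if not (epsilon > 0.0 and 1.0 > epsilon):
        raise ValueError("epsilon must lie strictly between 0 and 1")
    if not (delta > 0.0 and 1.0 > delta):
        raise ValueError("delta must lie strictly between 0 and 1")
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


# Privacy parameters from the implementation details, plus two looser budgets.
DELTA = 1e-5
for eps in (0.1, 0.5, 0.9):
    print(f"epsilon = {eps}: sigma = {gaussian_sigma(eps, DELTA):.2f} x sensitivity")
```
</preformat>
<p>For &#x03B5; = 0.1 and &#x03B4; = 10<sup>&#x2212;5</sup> the resulting standard deviation is roughly 48 times the sensitivity, which illustrates why training stability under high noise variance matters at this budget.</p>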
</sec>
<sec id="s4_3_2">
<label>4.3.2</label>
<title>Homomorphic Encryption Efficiency</title>
<p>We analyse the computational overhead of homomorphic encryption operations in EdgeTrustX. <xref ref-type="fig" rid="fig-6">Fig. 6</xref> shows the encryption, computation, and decryption times for different model sizes.</p>
<fig id="fig-6">
<label>Figure 6</label>
<caption>
<title>Homomorphic encryption efficiency analysis showing computational overhead for different operations and model sizes</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_73584-fig-6.tif"/>
</fig>
<p><xref ref-type="fig" rid="fig-6">Fig. 6</xref> evaluates the efficiency of homomorphic encryption (HE) in EdgeTrustX. <xref ref-type="fig" rid="fig-6">Fig. 6a</xref> shows computation time scaling with model size, where EdgeTrustX outperforms baseline HE implementations. <xref ref-type="fig" rid="fig-6">Fig. 6b</xref> plots memory usage, demonstrating the impact of encryption on model overhead. <xref ref-type="fig" rid="fig-6">Fig. 6c</xref> analyses throughput versus device count, revealing that EdgeTrustX maintains linear scalability while standard HE-Fed approaches degrade rapidly beyond 100 devices. These results highlight EdgeTrustX&#x2019;s optimisation for resource-constrained environments: it remains scalable and efficient under cryptographic load, which supports its practical feasibility for large-scale IoT deployments with privacy guarantees.</p>
</sec>
</sec>
<sec id="s4_4">
<label>4.4</label>
<title>Scalability Analysis</title>
<sec id="s4_4_1">
<label>4.4.1</label>
<title>Device Scalability</title>
<p>We evaluate EdgeTrustX&#x2019;s scalability by varying the number of participating devices from 10 to 200. <xref ref-type="fig" rid="fig-7">Fig. 7</xref> presents the scalability analysis results.</p>
<fig id="fig-7">
<label>Figure 7</label>
<caption>
<title>Scalability analysis showing convergence time, communication overhead, and model accuracy as functions of the number of participating devices</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_73584-fig-7.tif"/>
</fig>
<p><xref ref-type="fig" rid="fig-7">Fig. 7</xref> assesses EdgeTrustX scalability across varying IoT network sizes. <xref ref-type="fig" rid="fig-7">Fig. 7a</xref> presents convergence time contours as device count increases, showing that EdgeTrustX remains efficient up to 1000 nodes. <xref ref-type="fig" rid="fig-7">Fig. 7b</xref> uses a forest plot to compare performance (in terms of memory, convergence, efficiency, and accuracy) across 200 different device setups. <xref ref-type="fig" rid="fig-7">Fig. 7c</xref> analyses the trade-off between convergence time and efficiency, with EdgeTrustX sustaining low latency and high efficiency as the number of devices scales. This indicates the framework&#x2019;s robustness and adaptability under heavy load, supporting its suitability for diverse, large-scale federated IoT networks.</p>
<p>Although <xref ref-type="table" rid="table-6">Table 6</xref> presents empirical scalability results up to 200 physical IoT devices, an additional large-scale simulation was conducted to examine performance under extended network sizes. Using a federated simulation environment developed with PyTorch and Flower, up to 1000 virtual clients were modelled to emulate the communication latency, aggregation delay, and bandwidth heterogeneity typical of large IoT deployments. The simulated results indicated that EdgeTrustX maintained an average convergence efficiency above 78% and exhibited less than 18% accuracy degradation compared with the 200-device configuration, while communication overhead scaled linearly with client participation. These findings substantiate the claim that EdgeTrustX can remain computationally viable and communication-efficient when scaled toward 1000 nodes under realistic distributed network conditions. However, comprehensive hardware-based validation beyond 200 devices is reserved for future work.</p>
<table-wrap id="table-6">
<label>Table 6</label>
<caption>
<title>Scalability performance metrics</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
</colgroup>
<thead>
<tr>
<th>#Devices</th>
<th>Conv. time (min)</th>
<th>Comm. overhead (MB)</th>
<th>Accuracy (%)</th>
<th>Efficiency score</th>
</tr>
</thead>
<tbody>
<tr>
<td><bold>10</bold></td>
<td>12.3</td>
<td>8.7</td>
<td>94.2</td>
<td>0.89</td>
</tr>
<tr>
<td><bold>25</bold></td>
<td>18.7</td>
<td>15.2</td>
<td>94.5</td>
<td>0.87</td>
</tr>
<tr>
<td><bold>50</bold></td>
<td>31.4</td>
<td>24.8</td>
<td>94.7</td>
<td>0.85</td>
</tr>
<tr>
<td><bold>100</bold></td>
<td>52.1</td>
<td>38.6</td>
<td>94.9</td>
<td>0.83</td>
</tr>
<tr>
<td><bold>200</bold></td>
<td>89.3</td>
<td>61.2</td>
<td>95.1</td>
<td>0.81</td>
</tr>
</tbody>
</table>
</table-wrap>
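<p>The growth pattern in <xref ref-type="table" rid="table-6">Table 6</xref> can be made explicit by normalising each row by the number of devices. The following sketch uses only the values reported in the table; the helper names are illustrative.</p>
<preformat preformat-type="code">
```python
# Rows of Table 6: (devices, convergence time in min, communication overhead in MB).
TABLE_6 = [
    (10, 12.3, 8.7),
    (25, 18.7, 15.2),
    (50, 31.4, 24.8),
    (100, 52.1, 38.6),
    (200, 89.3, 61.2),
]


def per_device(rows):
    """Return (devices, minutes per device, MB per device) for each row."""
    return [(n, minutes / n, megabytes / n) for n, minutes, megabytes in rows]


def growth_factor(rows, column):
    """Ratio of the last to the first value in the given column index."""
    return rows[-1][column] / rows[0][column]


for n, min_per_dev, mb_per_dev in per_device(TABLE_6):
    print(f"{n:4d} devices: {min_per_dev:.3f} min/device, {mb_per_dev:.3f} MB/device")

print(f"devices x{growth_factor(TABLE_6, 0):.0f}, "
      f"time x{growth_factor(TABLE_6, 1):.2f}, "
      f"overhead x{growth_factor(TABLE_6, 2):.2f}")
```
</preformat>
<p>For the reported values, a 20-fold increase in devices corresponds to roughly a 7-fold increase in both convergence time and communication overhead, so both quantities grow sub-linearly over the measured range of 10 to 200 devices.</p>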
</sec>
<sec id="s4_4_2">
<label>4.4.2</label>
<title>Communication Efficiency</title>
<p>We analyse the communication efficiency of EdgeTrustX compared to baseline methods. <xref ref-type="fig" rid="fig-8">Fig. 8</xref> illustrates the reduction in communication overhead achieved by the proposed framework.</p>
<fig id="fig-8">
<label>Figure 8</label>
<caption>
<title>Communication efficiency comparison showing total communication overhead and convergence behaviour for different federated learning approaches</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_73584-fig-8.tif"/>
</fig>
<p><xref ref-type="fig" rid="fig-8">Fig. 8</xref> provides a communication-centric analysis. <xref ref-type="fig" rid="fig-8">Fig. 8a</xref> maps communication overhead across communication rounds and device numbers, showing that EdgeTrustX achieves low overhead at scale. <xref ref-type="fig" rid="fig-8">Fig. 8b</xref> evaluates bandwidth-latency trade-offs with message size and compression, highlighting EdgeTrustX&#x2019;s optimal region. <xref ref-type="fig" rid="fig-8">Fig. 8c</xref> compares convergence efficiency with communication overhead over time, where EdgeTrustX consistently outperforms FedAvg and HE-Fed in convergence while incurring lower overhead. The figure indicates that EdgeTrustX is communication-efficient, latency-aware, and scalable, making it a strong candidate for real-world federated deployments with constrained bandwidth.</p>
</sec>
</sec>
<sec id="s4_5">
<label>4.5</label>
<title>Explainability Assessment</title>
<sec id="s4_5_1">
<label>4.5.1</label>
<title>Attention Visualisation</title>
<p><xref ref-type="fig" rid="fig-9">Fig. 9</xref> presents attention heatmaps for different types of attacks, demonstrating how EdgeTrustX focuses on relevant features for threat detection.</p>
<fig id="fig-9">
<label>Figure 9</label>
<caption>
<title>Multi-level attention visualisation and feature-importance analysis across different IoT attack types. (<bold>a</bold>) Feature-attention landscape illustrating contour-based saliency over key network features. (<bold>b</bold>) Temporal feature-attention evolution showing sequence-level sensitivity across time steps. (<bold>c</bold>) Feature-importance distribution derived from SHAP and attention scores across attack categories</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_73584-fig-9.tif"/>
</fig>
<p><xref ref-type="fig" rid="fig-9">Fig. 9</xref> explores the explainability of EdgeTrustX via attention mechanisms. <xref ref-type="fig" rid="fig-9">Fig. 9a</xref> maps attention strengths over feature-attack type combinations, identifying patterns (e.g., system-level vs. command and control). <xref ref-type="fig" rid="fig-9">Fig. 9b</xref> tracks the temporal evolution of attention during DDoS attacks, clearly showing the decision-making phases. <xref ref-type="fig" rid="fig-9">Fig. 9c</xref> presents the statistical distribution of feature importance across different threat categories, classified as critical, high, moderate, and supportive. This visual evidence confirms that EdgeTrustX not only detects threats effectively but also provides granular interpretability, crucial for real-time analyst trust and forensic auditing in cybersecurity environments. The heatmaps emphasise that attention is concentrated on high-impact indicators, such as packet rate, source-port diversity, and flow duration, during DDoS and botnet activities. In contrast, reconnaissance and exfiltration attacks exhibit a stronger dependence on destination-port entropy and connection frequency. These visualisations demonstrate how the attention mechanism captures context-aware patterns unique to each threat type.</p>
</sec>
<sec id="s4_5_2">
<label>4.5.2</label>
<title>SHAP Analysis</title>
<p>We conduct a comprehensive SHAP analysis to provide feature-level explanations. <xref ref-type="table" rid="table-7">Table 7</xref> presents the top contributing features for each attack type.</p>
<table-wrap id="table-7">
<label>Table 7</label>
<caption>
<title>Top contributing features (SHAP analysis)</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
</colgroup>
<thead>
<tr>
<th>Attack type</th>
<th>Top contributing features</th>
</tr>
</thead>
<tbody>
<tr>
<td>DDoS</td>
<td>Packet rate, Flow duration, Port diversity</td>
</tr>
<tr>
<td>Malware</td>
<td>System calls, Network connections, File access</td>
</tr>
<tr>
<td>Reconnaissance</td>
<td>Port scanning, DNS queries, Protocol usage</td>
</tr>
<tr>
<td>Data exfiltration</td>
<td>Data volume, Destination IP, Encryption status</td>
</tr>
<tr>
<td>Botnet</td>
<td>Command frequency, Response time, Network topology</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="s4_5_3">
<label>4.5.3</label>
<title>Explainability Metrics</title>
<p>We evaluate the explainability quality using standard metrics. <xref ref-type="fig" rid="fig-10">Fig. 10</xref> shows the explainability assessment results.</p>
<fig id="fig-10">
<label>Figure 10</label>
<caption>
<title>Explainability-quality assessment comparing multiple interpretability methods in terms of faithfulness, stability, and comprehensiveness. (<bold>a</bold>) Explainability-quality landscape highlighting Pareto-optimal trade-offs among competing metrics. (<bold>b</bold>) Temporal explainability evolution showing faithfulness progression during training. (<bold>c</bold>) Multi-metric forest-plot comparison across explanation techniques</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_73584-fig-10.tif"/>
</fig>
<p><xref ref-type="fig" rid="fig-10">Fig. 10</xref> provides a comprehensive evaluation of EdgeTrustX&#x2019;s explainability. <xref ref-type="fig" rid="fig-10">Fig. 10a</xref> plots faithfulness versus comprehensiveness in a Pareto-optimal landscape, where EdgeTrustX ranks among the best. <xref ref-type="fig" rid="fig-10">Fig. 10b</xref> illustrates how explainability evolves during training, with EdgeTrustX exhibiting consistent growth. <xref ref-type="fig" rid="fig-10">Fig. 10c</xref> presents a forest plot with scores across eight explainability metrics, confirming EdgeTrustX&#x2019;s balance of interpretability dimensions. The plots indicate that EdgeTrustX achieves the most balanced explainability profile, maintaining high faithfulness and comprehensiveness while ensuring temporal stability. Compared with baseline methods, its SHAP-attention fusion yields consistent interpretability gains throughout training. These results suggest that privacy-preserving federated models can achieve high transparency, addressing the black-box criticism of deep learning.</p>
<p><xref ref-type="table" rid="table-8">Table 8</xref> compares the explainability performance of EdgeTrustX with baseline methods.</p>
<table-wrap id="table-8">
<label>Table 8</label>
<caption>
<title>Explainability performance comparison</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
</colgroup>
<thead>
<tr>
<th>Method</th>
<th>Faithfulness</th>
<th>Stability</th>
<th>Comprehensiveness</th>
</tr>
</thead>
<tbody>
<tr>
<td>FedAvg-CNN</td>
<td>0.623</td>
<td>0.567</td>
<td>0.698</td>
</tr>
<tr>
<td>FedProx-LSTM</td>
<td>0.698</td>
<td>0.634</td>
<td>0.721</td>
</tr>
<tr>
<td>DP-Fed</td>
<td>0.587</td>
<td>0.612</td>
<td>0.656</td>
</tr>
<tr>
<td>Centralised-Transformer</td>
<td>0.834</td>
<td>0.798</td>
<td>0.867</td>
</tr>
<tr>
<td><bold>EdgeTrustX (Ours)</bold></td>
<td><bold>0.826</bold></td>
<td><bold>0.789</bold></td>
<td><bold>0.851</bold></td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="table-8fn1" fn-type="other">
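<p>Note: Bold values indicate the best performance among the federated methods for each metric; the centralised transformer baseline scores slightly higher on all three.</p>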
</fn>
</table-wrap-foot>
</table-wrap>
<p>EdgeTrustX achieves explainability performance comparable to centralised approaches while maintaining privacy preservation. For instance, in DDoS scenarios, the model assigns higher attention weights to packet-rate variability and port-diversity features, enabling analysts to directly trace anomalies to high-frequency traffic bursts, thereby supporting transparent incident interpretation.</p>
</sec>
</sec>
<sec id="s4_6">
<label>4.6</label>
<title>Ablation Studies</title>
<sec id="s4_6_1">
<label>4.6.1</label>
<title>Component-Wise Analysis</title>
<p>We conduct ablation studies to understand the contribution of each component in EdgeTrustX. <xref ref-type="table" rid="table-9">Table 9</xref> presents the results.</p>
<table-wrap id="table-9">
<label>Table 9</label>
<caption>
<title>Ablation study results (Mean &#x00B1; Std over 5 runs)</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
</colgroup>
<thead>
<tr>
<th>Configuration</th>
<th>Accuracy (%)</th>
<th>Privacy score</th>
<th>Explainability</th>
</tr>
</thead>
<tbody>
<tr>
<td>Base transformer</td>
<td>91.2 &#x00B1; 0.4</td>
<td>0.234</td>
<td>0.678 &#x00B1; 0.01</td>
</tr>
<tr>
<td>&#x002B; Federated learning</td>
<td>90.8 &#x00B1; 0.5</td>
<td>0.456</td>
<td>0.672 &#x00B1; 0.01</td>
</tr>
<tr>
<td>&#x002B; Differential privacy</td>
<td>89.3 &#x00B1; 0.6</td>
<td>0.823</td>
<td>0.665 &#x00B1; 0.02</td>
</tr>
<tr>
<td>&#x002B; Homomorphic encryption</td>
<td>90.1 &#x00B1; 0.5</td>
<td>0.891</td>
<td>0.669 &#x00B1; 0.01</td>
</tr>
<tr>
<td>&#x002B; Attention explanation</td>
<td>90.1 &#x00B1; 0.5</td>
<td>0.891</td>
<td>0.789 &#x00B1; 0.02</td>
</tr>
<tr>
<td>&#x002B; SHAP analysis</td>
<td>90.1 &#x00B1; 0.5</td>
<td>0.891</td>
<td>0.823 &#x00B1; 0.02</td>
</tr>
<tr>
<td><bold>EdgeTrustX (Full)</bold></td>
<td>94.7 &#x00B1; 0.3</td>
<td><bold>0.923</bold></td>
<td><bold>0.851 &#x00B1; 0.01</bold></td>
</tr>
</tbody>
</table>
</table-wrap>
<p>To evaluate the contribution of each module within EdgeTrustX, an ablation study is conducted, as shown in <xref ref-type="table" rid="table-9">Table 9</xref>. The base Transformer model achieves an accuracy of 91.2% &#x00B1; 0.4 with limited privacy preservation and moderate explainability.</p>
<p>All reported results are presented as the mean &#x00B1; standard deviation computed over five independent runs to capture experimental variability. A paired <italic>t</italic>-test with a significance threshold of <italic>p</italic> &#x003C; 0.05 was conducted to compare EdgeTrustX with the baseline configurations, confirming that the observed improvements in accuracy and privacy score are statistically significant. These tests ensure the reliability and reproducibility of the reported performance metrics.</p>
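<p>The paired <italic>t</italic>-test described above can be sketched as follows. The per-run accuracies are hypothetical placeholders chosen only so that their means match the reported 94.7% and 89.7%; they are not the raw data behind the tables. The critical value 2.776 is the two-sided 5% threshold of Student&#x2019;s <italic>t</italic> distribution with four degrees of freedom, which applies to five paired runs.</p>
<preformat preformat-type="code">
```python
import math
import statistics

# HYPOTHETICAL per-run accuracies (percent), invented purely for illustration.
# They are NOT the paper's raw data; only the number of runs (five) is from the text.
edgetrustx_runs = [94.5, 95.0, 94.4, 94.7, 94.9]
baseline_runs = [89.5, 89.0, 90.4, 89.7, 89.9]

# Two-sided critical value of Student's t for alpha = 0.05 and 4 degrees of
# freedom; valid only when there are exactly five paired runs.
T_CRITICAL_DF4 = 2.776


def paired_t_statistic(a, b):
    """t = mean(d) / (stdev(d) / sqrt(n)) for paired differences d = a - b."""
    if len(a) != len(b):
        raise ValueError("paired samples must have the same length")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error


t_stat = paired_t_statistic(edgetrustx_runs, baseline_runs)
significant = abs(t_stat) > T_CRITICAL_DF4
print(f"t = {t_stat:.2f}, exceeds the 5% critical value: {significant}")
```
</preformat>
<p>With these placeholder values the statistic is about 15.8, well above the critical value. Pairing the runs (for example by random seed or data partition) removes run-to-run variation shared by both methods, which is why the test operates on the differences rather than on the two samples separately.</p>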
<p>Integrating Federated Learning (FL) enhances data decentralisation but slightly decreases accuracy due to non-IID distributions. Incorporating Differential Privacy (DP) increases the privacy score to 0.823 while causing a slight reduction in accuracy to 89.3% &#x00B1; 0.6, reflecting the inherent trade-off between privacy preservation and model utility.</p>
<p>When Homomorphic Encryption (HE) is introduced, the privacy score increases further to <inline-formula id="ieqn-190"><mml:math id="mml-ieqn-190"><mml:mn>0.891</mml:mn></mml:math></inline-formula>, while accuracy recovers to <inline-formula id="ieqn-191"><mml:math id="mml-ieqn-191"><mml:mn>90.1</mml:mn><mml:mrow><mml:mtext>%</mml:mtext></mml:mrow><mml:mo>&#x00B1;</mml:mo><mml:mn>0.5</mml:mn></mml:math></inline-formula>, demonstrating that encrypted aggregation can retain competitive utility. Incorporating the Attention Explanation module enhances interpretability, raising the explainability score from <inline-formula id="ieqn-192"><mml:math id="mml-ieqn-192"><mml:mn>0.669</mml:mn></mml:math></inline-formula> to <inline-formula id="ieqn-193"><mml:math id="mml-ieqn-193"><mml:mn>0.789</mml:mn></mml:math></inline-formula> without changing accuracy or the privacy score. Adding SHAP Analysis raises interpretability further to <inline-formula id="ieqn-194"><mml:math id="mml-ieqn-194"><mml:mn>0.823</mml:mn><mml:mo>&#x00B1;</mml:mo><mml:mn>0.02</mml:mn></mml:math></inline-formula>, enabling human-understandable feature importance rankings.</p>
<p>Finally, the complete EdgeTrustX framework integrates all modules, achieving the best overall performance with an accuracy of <inline-formula id="ieqn-195"><mml:math id="mml-ieqn-195"><mml:mn>94.7</mml:mn><mml:mrow><mml:mtext>%</mml:mtext></mml:mrow><mml:mo>&#x00B1;</mml:mo><mml:mn>0.3</mml:mn></mml:math></inline-formula>, the highest privacy score (<inline-formula id="ieqn-196"><mml:math id="mml-ieqn-196"><mml:mn>0.923</mml:mn></mml:math></inline-formula>), and an explainability score of <inline-formula id="ieqn-197"><mml:math id="mml-ieqn-197"><mml:mn>0.851</mml:mn><mml:mo>&#x00B1;</mml:mo><mml:mn>0.01</mml:mn></mml:math></inline-formula>. These results validate the synergistic benefit of combining DP, HE, federated learning, and explainable attention mechanisms.</p>
<p>In addition to the component-level analysis, the effect of training hyperparameters on communication efficiency and model performance is examined. <xref ref-type="fig" rid="fig-11">Fig. 11</xref> illustrates the relationship between accuracy and communication cost as the local training epochs, client sampling rates, and quantisation levels are varied. Specifically, local epochs were adjusted among {1, 3, 5}, client participation rates were varied among {20%, 50%, 100%}, and model precision was compared between 8-bit and 16-bit quantisation. Results show that EdgeTrustX achieves a detection accuracy above 93% even under 8-bit quantisation and 50% client participation, while reducing the total communication volume by approximately 35% compared to the full-precision baseline. Increasing the number of local epochs beyond three slightly improves convergence stability, but it also increases communication by about 22%, indicating diminishing returns. Applying gradient sparsification further reduces transmitted data by 28% while maintaining accuracy within a 1% variation of the baseline model. These findings indicate that EdgeTrustX achieves a strong balance between detection accuracy and bandwidth efficiency, making it well-suited for large-scale, bandwidth-limited IoT deployments.</p>
<fig id="fig-11">
<label>Figure 11</label>
<caption>
<title>Accuracy&#x2013;communication trade-off for EdgeTrustX under different local epochs, client sampling rates (20%, 50%, 100%), and quantisation precisions (8-bit vs. 16-bit). The framework achieves over 93% accuracy while reducing communication costs by up to 35% through quantisation and partial client participation</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_73584-fig-11.tif"/>
</fig>
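<p>The two compression steps discussed above, low-bit quantisation and gradient sparsification, can be illustrated with a minimal sketch. It assumes symmetric uniform quantisation with a single shared scale and top-<italic>k</italic> magnitude sparsification; the exact schemes used in EdgeTrustX are not specified in this section, and the update vector is a hypothetical example.</p>
<preformat preformat-type="code">
```python
def quantise_uniform(values, bits=8):
    """Map floats to signed integers of the given width using one shared scale."""
    levels = 2 ** (bits - 1) - 1          # 127 for 8-bit, 32767 for 16-bit
    max_abs = max(abs(v) for v in values)
    scale = max_abs / levels if max_abs > 0 else 1.0
    codes = [int(round(v / scale)) for v in values]
    return codes, scale


def dequantise(codes, scale):
    """Recover approximate floats; each rounding error is at most scale / 2."""
    return [c * scale for c in codes]


def top_k_sparsify(values, keep_fraction):
    """Keep the largest-magnitude entries and zero the rest (top-k sparsification)."""
    k = max(1, int(len(values) * keep_fraction))
    order = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(values)]


# HYPOTHETICAL model update, for illustration only.
update = [0.42, -0.07, 0.003, -0.91, 0.15, 0.0004, -0.33, 0.08]

codes, scale = quantise_uniform(update, bits=8)
restored = dequantise(codes, scale)
max_error = max(abs(a - b) for a, b in zip(update, restored))
sparse = top_k_sparsify(update, keep_fraction=0.5)

print("8-bit codes:", codes)
print(f"max quantisation error = {max_error:.5f} (bound {scale / 2:.5f})")
print("top-50% sparsified update:", sparse)
```
</preformat>
<p>In this sketch each value is transmitted as one signed 8-bit code plus a shared scale factor, and sparsification transmits only the retained entries. The accuracy impact of either step depends on the model and data and is what <xref ref-type="fig" rid="fig-11">Fig. 11</xref> reports empirically.</p>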
<p>Increasing the number of attention heads enhances the model&#x2019;s ability to represent multi-context features, thereby improving its capacity to capture diverse traffic behaviours. However, beyond eight heads, diminishing returns are observed due to redundancy in attention patterns, suggesting optimal trade-offs for future transformer tuning.</p>
</sec>
<sec id="s4_6_2">
<label>4.6.2</label>
<title>Hyperparameter Sensitivity</title>
<p>Except where the privacy budget is itself the parameter being varied, the hyperparameter-sensitivity experiments were conducted under fixed privacy parameters (&#x03B5; &#x003D; 0.1, &#x03B4; &#x003D; 1 &#x00D7; 10<sup>&#x2212;5</sup>) to isolate the effects of architectural and optimisation settings. <xref ref-type="fig" rid="fig-12">Fig. 12</xref> shows the sensitivity analysis for key hyperparameters, including privacy budget, learning rate, and number of attention heads.</p>
<fig id="fig-12">
<label>Figure 12</label>
<caption>
<title>Hyperparameter sensitivity analysis showing the impact of (<bold>a</bold>) Privacy Budget vs. Learning Rate Interaction Surface, (<bold>b</bold>) Multi-Parameter Sensitivity Profiles, and (<bold>c</bold>) QMobNet Quantum Component Contribution Analysis</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_73584-fig-12a.tif"/>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_73584-fig-12b.tif"/>
</fig>
<p><xref ref-type="fig" rid="fig-12">Fig. 12</xref> investigates how tuning key hyperparameters affects EdgeTrustX. <xref ref-type="fig" rid="fig-12">Fig. 12a</xref> shows a 3D privacy-budget vs. learning-rate surface, pinpointing the optimal configuration (<inline-formula id="ieqn-198"><mml:math id="mml-ieqn-198"><mml:mo>&#x03F5;</mml:mo><mml:mtext>&#x00A0;</mml:mtext><mml:mo>=</mml:mo><mml:mn>0.1</mml:mn></mml:math></inline-formula>, <inline-formula id="ieqn-199"><mml:math id="mml-ieqn-199"><mml:mrow><mml:mtext>LR</mml:mtext></mml:mrow><mml:mo>=</mml:mo><mml:msup><mml:mn>10</mml:mn><mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula>) with confidence contours. <xref ref-type="fig" rid="fig-12">Fig. 12b</xref> normalises the influence of each parameter (privacy budget, attention heads, epochs, and batch size), revealing that the privacy and attention parameters dominate performance. <xref ref-type="fig" rid="fig-12">Fig. 12c</xref> presents a related quantum framework (QMobNet) to contrast incremental performance contributions. Although EdgeTrustX has no quantum component, the comparative subplot underscores modular performance gains. Overall, this figure highlights the importance of careful hyperparameter tuning in balancing privacy, accuracy, and efficiency.</p>
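<p>To give a sense of scale for the fixed privacy parameters used in these experiments, the sketch below computes the noise standard deviation implied by a given (&#x03B5;, &#x03B4;) pair under the classical Gaussian mechanism. This is an assumption made for illustration: this subsection does not state which noise mechanism or accounting method EdgeTrustX uses, and the clipping norm of 1.0 (the sensitivity) is a hypothetical placeholder.</p>
<code language="python">
```python
import math


def gaussian_sigma(epsilon, delta, sensitivity=1.0):
    """Noise std of the classical Gaussian mechanism, valid for epsilon in (0, 1)."""
    if epsilon >= 1.0 or 0.0 >= epsilon:
        raise ValueError("classical bound requires epsilon strictly between 0 and 1")
    if delta >= 1.0 or 0.0 >= delta:
        raise ValueError("delta must lie strictly between 0 and 1")
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


DELTA = 1e-5      # value stated in the text
CLIP_NORM = 1.0   # hypothetical L2 clipping norm, used as the sensitivity

# Noise scale at the privacy setting held constant in the experiments.
sigma_at_paper_setting = gaussian_sigma(0.1, DELTA, CLIP_NORM)

# A small sweep shows how quickly the noise grows as epsilon shrinks.
for eps in (0.05, 0.1, 0.5):
    sigma = gaussian_sigma(eps, DELTA, CLIP_NORM)
    print(f"epsilon={eps:4.2f}  sigma={sigma:7.2f}")
```
</code>
<p>For &#x03B5; &#x003D; 0.1 and &#x03B4; &#x003D; 10<sup>&#x2212;5</sup> this bound gives a noise standard deviation of about 48.4 times the sensitivity for a single release. Composition across training rounds is not modelled here.</p>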
</sec>
</sec>
<sec id="s4_7">
<label>4.7</label>
<title>Real-World Deployment Analysis</title>
<sec id="s4_7_1">
<label>4.7.1</label>
<title>Energy Consumption</title>
<p>We analyse the energy consumption of EdgeTrustX on IoT devices. <xref ref-type="fig" rid="fig-13">Fig. 13</xref> presents a comparison of energy efficiency.</p>
<fig id="fig-13">
<label>Figure 13</label>
<caption>
<title>Energy consumption analysis showing power usage for different components and comparison with baseline methods across various IoT device types</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_73584-fig-13.tif"/>
</fig>
<p><xref ref-type="fig" rid="fig-13">Fig. 13</xref> evaluates EdgeTrustX&#x2019;s energy footprint. <xref ref-type="fig" rid="fig-13">Fig. 13a</xref> breaks down power usage across five IoT device types (including Raspberry Pi 4 and Jetson Nano), revealing that Jetson Nano consumes the most, mainly during local training. <xref ref-type="fig" rid="fig-13">Fig. 13b</xref> compares federated learning methods on the Raspberry Pi 4, showing that EdgeTrustX achieves better battery life due to lower power consumption. <xref ref-type="fig" rid="fig-13">Fig. 13c</xref> plots detection accuracy vs. energy, highlighting the zone in which EdgeTrustX balances energy, performance, and privacy. This figure supports EdgeTrustX&#x2019;s hardware efficiency, making it well suited to real-time threat detection on resource-constrained edge platforms.</p>
</sec>
<sec id="s4_7_2">
<label>4.7.2</label>
<title>Latency Analysis</title>
<p><xref ref-type="table" rid="table-10">Table 10</xref> presents the latency analysis for different operations in EdgeTrustX.</p>
<table-wrap id="table-10">
<label>Table 10</label>
<caption>
<title>Latency analysis for EdgeTrustX operations</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
</colgroup>
<thead>
<tr>
<th>Operation</th>
<th>Average latency (ms)</th>
<th>Std. deviation (ms)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Local training</td>
<td>234.7</td>
<td>23.8</td>
</tr>
<tr>
<td>Privacy noise addition</td>
<td>12.3</td>
<td>2.1</td>
</tr>
<tr>
<td>Homomorphic encryption</td>
<td>45.6</td>
<td>5.7</td>
</tr>
<tr>
<td>Model aggregation</td>
<td>89.2</td>
<td>8.9</td>
</tr>
<tr>
<td>Explainability computation</td>
<td>67.4</td>
<td>7.2</td>
</tr>
<tr>
<td><bold>Total per round</bold></td>
<td><bold>449.2</bold></td>
<td><bold>47.7</bold></td>
</tr>
</tbody>
</table>
</table-wrap>
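<p>The totals in <xref ref-type="table" rid="table-10">Table 10</xref> can be checked with a few lines of arithmetic. The sketch below sums the per-operation means and also compares two ways of combining the standard deviations. The independence of the per-operation latencies is an assumption introduced here for illustration and is not stated in the source.</p>
<code language="python">
```python
import math

# (mean latency in ms, standard deviation in ms), copied from Table 10.
operations = {
    "Local training": (234.7, 23.8),
    "Privacy noise addition": (12.3, 2.1),
    "Homomorphic encryption": (45.6, 5.7),
    "Model aggregation": (89.2, 8.9),
    "Explainability computation": (67.4, 7.2),
}

total_mean = sum(mean for mean, _ in operations.values())
sum_of_stds = sum(std for _, std in operations.values())
# Assumption: if the operations were independent, variances would add,
# so the combined deviation is the root of the summed squares.
combined_std = math.sqrt(sum(std ** 2 for _, std in operations.values()))

for name, (mean, _) in operations.items():
    print(f"{name:28s} {100 * mean / total_mean:5.1f}% of round")
print(f"total mean latency: {total_mean:.1f} ms")
print(f"sum of std devs:    {sum_of_stds:.1f} ms")
print(f"root-sum-square:    {combined_std:.1f} ms")
```
</code>
<p>The summed mean reproduces the reported 449.2 ms, with local training accounting for just over half of each round. The reported 47.7 ms equals the plain sum of the individual standard deviations; under the independence assumption the combined deviation would be about 27.1 ms.</p>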
</sec>
<sec id="s4_7_3">
<label>4.7.3</label>
<title>Throughput and Memory Profiling</title>
<p>To evaluate the computational feasibility of running EdgeTrustX on resource-constrained hardware, throughput and memory profiling were performed using two representative edge devices: Raspberry Pi 4 (4 GB RAM) and NVIDIA Jetson Nano (8 GB RAM). Each device executed the local transformer model, which was configured with six encoder layers and a hidden dimension of d &#x003D; 512. The Raspberry Pi 4 achieved an average throughput of 42 inferences per second, with a peak memory footprint of 1.28 GB. In contrast, the Jetson Nano reached 78 inferences per second and utilised 1.94 GB of memory. To validate scalability, a lightweight configuration comprising four layers with d &#x003D; 256 was developed, which reduced memory usage by 37% while maintaining detection accuracy within 2% of the full model. Further 8-bit quantisation of model weights reduced total memory to below 1 GB with negligible accuracy loss (&#x003C;0.5%). <xref ref-type="fig" rid="fig-14">Fig. 14</xref> illustrates the throughput&#x2013;layer-depth trade-off, indicating that the transformer backbone remains computationally feasible for practical IoT deployment. These results support the claim that EdgeTrustX can operate effectively on embedded platforms without exceeding their resource budgets, addressing concerns regarding transformer complexity in edge-level environments.</p>
<fig id="fig-14">
<label>Figure 14</label>
<caption>
<title>Throughput and memory profiling of EdgeTrustX models on Raspberry Pi 4 and Jetson Nano platforms</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_73584-fig-14.tif"/>
</fig>
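<p>The profiling procedure itself is not detailed in this subsection. As a rough illustration of how throughput and peak memory can be measured on a device, the sketch below times a batch of inferences and tracks peak allocation with the Python standard library. The function <monospace>dummy_inference</monospace> is a hypothetical stand-in for the local transformer, and <monospace>tracemalloc</monospace> only observes Python-level allocations, so a model running in a native framework would need process-level memory measurement instead.</p>
<code language="python">
```python
import time
import tracemalloc


def dummy_inference(sample):
    """Hypothetical stand-in for one forward pass of the local model."""
    return sum(x * x for x in sample)


def measure_throughput(infer, samples, warmup=10):
    """Inferences per second over the sample list, after a short warm-up."""
    for sample in samples[:warmup]:
        infer(sample)
    start = time.perf_counter()
    for sample in samples:
        infer(sample)
    return len(samples) / (time.perf_counter() - start)


def measure_peak_mib(infer, samples):
    """Peak Python-heap allocation in MiB while running the samples."""
    tracemalloc.start()
    for sample in samples:
        infer(sample)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak / 2**20


# Synthetic inputs: 200 feature vectors of length 512 (illustrative sizes).
samples = [[float(i + j) for j in range(512)] for i in range(200)]

# Timing and memory tracing run in separate passes so that the tracing
# overhead does not distort the throughput figure.
throughput = measure_throughput(dummy_inference, samples)
peak_mib = measure_peak_mib(dummy_inference, samples)
print(f"throughput: {throughput:.1f} inferences/s, peak: {peak_mib:.3f} MiB")
```
</code>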
</sec>
</sec>
<sec id="s4_8">
<label>4.8</label>
<title>Comparative Trade-off Analysis</title>
<p>The integrated evaluation of EdgeTrustX highlights the interdependence between privacy preservation, scalability, and explainability in federated IoT environments. Incorporating differential privacy (DP) and homomorphic encryption (HE) introduces a modest computational overhead of approximately 11% compared to the baseline federated model, primarily due to noise injection and ciphertext operations. However, these mechanisms reduce information leakage under membership-inference attacks by over 20% and enhance robustness against reconstruction attacks, supporting their practical utility in privacy-sensitive deployments.</p>
<p>From a scalability perspective, the optimised aggregation pipeline supports up to 1000 clients with an average communication latency below 0.5 s per round, validating its real-time feasibility on heterogeneous IoT infrastructures. Although the explainability modules (attention visualisation and SHAP analysis) increase inference time by less than 5%, they substantially improve interpretability and model transparency for security analysts.</p>
<p>Overall, the framework achieves a balanced trade-off, offering strong privacy protection, efficient scalability, and high interpretability without significantly compromising detection accuracy. This equilibrium between privacy, performance, and transparency demonstrates EdgeTrustX&#x2019;s suitability for large-scale, trustworthy IoT threat detection and mitigation.</p>
</sec>
</sec>
<sec id="s5">
<label>5</label>
<title>Discussion</title>
<p>The results demonstrate that EdgeTrustX delivers consistently strong performance across multiple evaluation criteria, including accuracy, privacy preservation, explainability, and scalability. By integrating federated transformer networks with differential privacy and homomorphic encryption, EdgeTrustX achieves an average accuracy of 94.7% across four real-world IoT security datasets while ensuring user data privacy through a strict differential privacy budget of &#x03F5; &#x003D; 0.1. Moreover, the explainability analysis confirms that EdgeTrustX maintains interpretability comparable to centralised transformer baselines, which is essential for practical threat intelligence workflows in operational IoT environments.</p>
<p>A key observation is EdgeTrustX&#x2019;s ability to sustain performance with increasing device participation. Even at 200 participating nodes, EdgeTrustX retains over 80% operational efficiency with reduced communication overhead and stable convergence rates. This demonstrates the framework&#x2019;s suitability for large-scale, heterogeneous IoT deployments where device capabilities often vary significantly. Additionally, edge-based energy and latency evaluation results confirm that the proposed architecture is more deployable on real IoT hardware than heavy centralised transformer-based models that incur high computational costs.</p>
<p>To further contextualise these improvements, <xref ref-type="table" rid="table-11">Table 11</xref> compares EdgeTrustX with representative existing methods drawn from the reference list.</p>
<table-wrap id="table-11">
<label>Table 11</label>
<caption>
<title>Comparison of EdgeTrustX with prior state-of-the-art studies</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
</colgroup>
<thead>
<tr>
<th>Study</th>
<th>Privacy technique</th>
<th>Explainability</th>
<th>Avg. accuracy (%)</th>
<th>Scalability (Devices)</th>
</tr>
</thead>
<tbody>
<tr>
<td>FedAvg-CNN [<xref ref-type="bibr" rid="ref-2">2</xref>]</td>
<td>None</td>
<td>No</td>
<td>87.7</td>
<td>&#x2264;50</td>
</tr>
<tr>
<td>DP-Fed [<xref ref-type="bibr" rid="ref-2">2</xref>]</td>
<td>Differential privacy</td>
<td>No</td>
<td>86.6</td>
<td>&#x2264;100</td>
</tr>
<tr>
<td>HE-Fed [<xref ref-type="bibr" rid="ref-20">20</xref>]</td>
<td>Homomorphic encryption</td>
<td>No</td>
<td>88.9</td>
<td>&#x2264;50</td>
</tr>
<tr>
<td>Transformer-XAI [<xref ref-type="bibr" rid="ref-9">9</xref>]</td>
<td>None</td>
<td>SHAP &#x002B; Attention</td>
<td>93.4</td>
<td>Centralised</td>
</tr>
<tr>
<td>EdgeTrustX (Ours)</td>
<td>DP &#x002B; HE</td>
<td>SHAP &#x002B; Attention fusion</td>
<td>94.7</td>
<td>200&#x002B;</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>As shown in <xref ref-type="table" rid="table-11">Table 11</xref>, most prior approaches sacrifice either privacy, scalability, or interpretability. FedAvg-CNN and DP-Fed provide basic privacy or distributed training but lack integrated explainability and do not scale effectively beyond small device groups. HE-Fed offers strong encryption but suffers from high computational overhead and limited scalability. Transformer-XAI models demonstrate strong accuracy and rich interpretability but only in fully centralised settings where privacy risks are significantly higher.</p>
<p>In contrast, EdgeTrustX takes a balanced and unified approach. It combines the privacy-preserving strengths of differential privacy and homomorphic encryption with the modelling power of transformer architectures and SHAP-attention&#x2013;driven explainability. This combination enables both high performance and operational transparency while maintaining scalable and privacy-conscious analytics in distributed IoT environments.</p>
<p>The strength of EdgeTrustX lies in its holistic design philosophy. Instead of treating privacy, scalability, and interpretability as isolated objectives, EdgeTrustX integrates them within one coherent architecture. While future research may explore further optimisations, such as ultra-low-latency configurations or integration with trusted execution environments, the current implementation already establishes a strong benchmark for next-generation secure IoT analytics.</p>
</sec>
<sec id="s6">
<label>6</label>
<title>Conclusion</title>
<p>This study presents EdgeTrustX, a privacy-aware federated transformer framework developed for scalable and explainable IoT threat detection. By integrating transformer-based neural architectures with differential privacy and homomorphic encryption, EdgeTrustX addresses the fundamental challenges of data confidentiality, interpretability, and heterogeneity in large-scale distributed IoT ecosystems. Comprehensive experiments on four benchmark datasets (IoT-23, N-BaIoT, UNSW-NB15, and CIC-IDS2017) demonstrate that EdgeTrustX achieves an average detection accuracy of 94.7%, approaching the centralised transformer baseline of 95.3% while ensuring strong privacy guarantees under a strict &#x03B5; &#x003D; 0.1 differential privacy budget. The framework improves scalability by 23%, sustains over 80% system efficiency across 200 heterogeneous devices, and achieves a per-round latency of 449.2 ms, supporting its real-time viability for edge environments. The integration of SHAP-based feature attribution and attention-weight visualisation yields high interpretability, achieving faithfulness and comprehensiveness scores of 0.826 and 0.851, respectively. Despite its advantages, EdgeTrustX introduces moderate computational overhead due to encryption and depends on labelled datasets for supervised learning. These factors may constrain deployment on ultra-low-power nodes and limit adaptability to unlabelled data streams.
Future research will focus on several directions: (1) designing lightweight homomorphic-encryption schemes and integrating Trusted Execution Environments (TEEs) to minimise cryptographic cost while preserving end-to-end security; (2) developing adaptive differential-privacy mechanisms with dynamic noise scaling and gradient clipping for improved privacy-utility balance; (3) extending the framework toward a malicious threat model through Byzantine-robust aggregation; (4) enabling cross-domain generalisation to support heterogeneous IoT verticals; (5) incorporating blockchain-based authentication and auditability to enhance trust management; and (6) scaling EdgeTrustX beyond 1000 devices using hierarchical federated optimisation and efficient parameter-compression strategies. Overall, EdgeTrustX establishes a unified, practical, and high-performance paradigm that combines accuracy, privacy preservation, interpretability, and system efficiency, positioning it as a strong foundation for next-generation IoT threat-detection systems.</p>
</sec>
</body>
<back>
<ack>
<p>The author would like to thank the Deanship of Scientific Research at Shaqra University for supporting this work.</p>
</ack>
<sec>
<title>Funding Statement</title>
<p>The author received no specific funding for this study.</p>
</sec>
<sec sec-type="data-availability">
<title>Availability of Data and Materials</title>
<p>The data supporting this study&#x2019;s findings are available from the corresponding author upon reasonable request at: saleh@su.edu.sa.</p>
</sec>
<sec>
<title>Ethics Approval</title>
<p>This research study solely involves the use of historical datasets. No human participants or animals were involved in the collection or analysis of data for this study. As a result, ethical approval was not required.</p>
</sec>
<sec sec-type="COI-statement">
<title>Conflicts of Interest</title>
<p>The author declares no conflicts of interest.</p>
</sec>
<ref-list content-type="authoryear">
<title>References</title>
<ref id="ref-1"><label>[1]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Abd Elaziz</surname> <given-names>M</given-names></string-name>, <string-name><surname>Fares</surname> <given-names>IA</given-names></string-name>, <string-name><surname>Dahou</surname> <given-names>A</given-names></string-name>, <string-name><surname>Shrahili</surname> <given-names>M</given-names></string-name></person-group>. <article-title>Federated learning framework for IoT intrusion detection using tab transformer and nature-inspired hyperparameter optimization</article-title>. <source>Front Big Data</source>. <year>2025</year>;<volume>8</volume>:<fpage>1526480</fpage>. doi:<pub-id pub-id-type="doi">10.3389/fdata.2025.1526480</pub-id>; <pub-id pub-id-type="pmid">40438227</pub-id></mixed-citation></ref>
<ref id="ref-2"><label>[2]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Albogami</surname> <given-names>NN</given-names></string-name></person-group>. <article-title>Intelligent deep federated learning model for enhancing security in internet of things enabled edge computing environment</article-title>. <source>Sci Rep</source>. <year>2025</year>;<volume>15</volume>(<issue>1</issue>):<fpage>4041</fpage>. doi:<pub-id pub-id-type="doi">10.1038/s41598-025-88163-5</pub-id>; <pub-id pub-id-type="pmid">39900657</pub-id></mixed-citation></ref>
<ref id="ref-3"><label>[3]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Saraladeve</surname> <given-names>L</given-names></string-name>, <string-name><surname>Chandrasekar</surname> <given-names>A</given-names></string-name>, <string-name><surname>Nithya</surname> <given-names>T</given-names></string-name>, <string-name><surname>Kalaiarasi</surname> <given-names>S</given-names></string-name>, <string-name><surname>Sampathkumar</surname> <given-names>B</given-names></string-name>, <string-name><surname>Thanikachalam</surname> <given-names>R</given-names></string-name></person-group>. <article-title>A multiclass attack classification framework for IoT using hybrid deep learning model</article-title>. <source>J Cybersecur Inf Manag</source>. <year>2025</year>;<volume>15</volume>(<issue>1</issue>):<fpage>151</fpage>. doi:<pub-id pub-id-type="doi">10.54216/JCIM.150112</pub-id>.</mixed-citation></ref>
<ref id="ref-4"><label>[4]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Khan</surname> <given-names>IA</given-names></string-name>, <string-name><surname>Razzak</surname> <given-names>I</given-names></string-name>, <string-name><surname>Pi</surname> <given-names>D</given-names></string-name>, <string-name><surname>Khan</surname> <given-names>N</given-names></string-name>, <string-name><surname>Hussain</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Li</surname> <given-names>B</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Fed-inforce-fusion: a federated reinforcement-based fusion model for security and privacy protection of IoMT networks against cyber-attacks</article-title>. <source>Inf Fusion</source>. <year>2024</year>;<volume>101</volume>(<issue>3</issue>):<fpage>102002</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.inffus.2023.102002</pub-id>.</mixed-citation></ref>
<ref id="ref-5"><label>[5]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Park</surname> <given-names>JH</given-names></string-name>, <string-name><surname>Yotxay</surname> <given-names>S</given-names></string-name>, <string-name><surname>Singh</surname> <given-names>SK</given-names></string-name>, <string-name><surname>Park</surname> <given-names>JH</given-names></string-name></person-group>. <article-title>PoAh-enabled federated learning architecture for DDoS attack detection in IoT networks</article-title>. <source>Hum Centric Comput Inf Sci</source>. <year>2024</year>;<volume>14</volume>(<issue>3</issue>):<fpage>1</fpage>&#x2013;<lpage>25</lpage>. doi:<pub-id pub-id-type="doi">10.22967/HCIS.2024.14.003</pub-id>.</mixed-citation></ref>
<ref id="ref-6"><label>[6]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Song</surname> <given-names>X</given-names></string-name>, <string-name><surname>Ma</surname> <given-names>Q</given-names></string-name></person-group>. <article-title>Intrusion detection using federated attention neural network for edge enabled internet of things</article-title>. <source>J Grid Comput</source>. <year>2024</year>;<volume>22</volume>(<issue>1</issue>):<fpage>15</fpage>. doi:<pub-id pub-id-type="doi">10.1007/s10723-023-09725-3</pub-id>.</mixed-citation></ref>
<ref id="ref-7"><label>[7]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Hamdi</surname> <given-names>N</given-names></string-name></person-group>. <article-title>Federated learning-based intrusion detection system for internet of things</article-title>. <source>Int J Inf Secur</source>. <year>2023</year>;<volume>22</volume>(<issue>6</issue>):<fpage>1937</fpage>&#x2013;<lpage>48</lpage>. doi:<pub-id pub-id-type="doi">10.1007/s10207-023-00727-6</pub-id>.</mixed-citation></ref>
<ref id="ref-8"><label>[8]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Karunamurthy</surname> <given-names>A</given-names></string-name>, <string-name><surname>Vijayan</surname> <given-names>K</given-names></string-name>, <string-name><surname>Kshirsagar</surname> <given-names>PR</given-names></string-name>, <string-name><surname>Tan</surname> <given-names>KT</given-names></string-name></person-group>. <article-title>An optimal federated learning-based intrusion detection for IoT environment</article-title>. <source>Sci Rep</source>. <year>2025</year>;<volume>15</volume>(<issue>1</issue>):<fpage>8696</fpage>. doi:<pub-id pub-id-type="doi">10.1038/s41598-025-93501-8</pub-id>; <pub-id pub-id-type="pmid">40082567</pub-id></mixed-citation></ref>
<ref id="ref-9"><label>[9]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Alsharaiah</surname> <given-names>MA</given-names></string-name>, <string-name><surname>Almaiah</surname> <given-names>MA</given-names></string-name>, <string-name><surname>Shehab</surname> <given-names>R</given-names></string-name>, <string-name><surname>Obeidat</surname> <given-names>M</given-names></string-name>, <string-name><surname>Ali El-Qirem</surname> <given-names>F</given-names></string-name>, <string-name><surname>Aldhyani</surname> <given-names>T</given-names></string-name></person-group>. <article-title>An explainable AI-driven transformer model for spoofing attack detection in internet of medical things (IoMT) networks</article-title>. <source>Discov Appl Sci</source>. <year>2025</year>;<volume>7</volume>(<issue>5</issue>):<fpage>488</fpage>. doi:<pub-id pub-id-type="doi">10.1007/s42452-025-07071-5</pub-id>.</mixed-citation></ref>
<ref id="ref-10"><label>[10]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Al-Haboosi</surname> <given-names>IT</given-names></string-name>, <string-name><surname>Elbagoury</surname> <given-names>BM</given-names></string-name>, <string-name><surname>El-Regaily</surname> <given-names>S</given-names></string-name>, <string-name><surname>El-Horbaty</surname> <given-names>EM</given-names></string-name></person-group>. <article-title>A hybrid-transformer-based cyber-attack detection in IoT networks</article-title>. <source>Int J Interact Mob Technol</source>. <year>2024</year>;<volume>18</volume>(<issue>14</issue>):<fpage>90</fpage>&#x2013;<lpage>102</lpage>. doi:<pub-id pub-id-type="doi">10.3991/ijim.v18i14.50343</pub-id>.</mixed-citation></ref>
<ref id="ref-11"><label>[11]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><surname>Saghir</surname> <given-names>A</given-names></string-name>, <string-name><surname>Beniwal</surname> <given-names>H</given-names></string-name>, <string-name><surname>Tran</surname> <given-names>KD</given-names></string-name>, <string-name><surname>Raza</surname> <given-names>A</given-names></string-name>, <string-name><surname>Koehl</surname> <given-names>L</given-names></string-name>, <string-name><surname>Zeng</surname> <given-names>X</given-names></string-name>, <etal>et al</etal></person-group>. <chapter-title>Explainable transformer-based anomaly detection for internet of things security</chapter-title>. In: <person-group person-group-type="editor"><string-name><surname>Tran</surname> <given-names>KP</given-names></string-name>, <string-name><surname>Li</surname> <given-names>S</given-names></string-name>, <string-name><surname>Heuchenne</surname> <given-names>C</given-names></string-name>, <string-name><surname>Truong</surname> <given-names>TH</given-names></string-name></person-group>, editors. <source>EAI/Springer innovations in communication and computing</source>. <publisher-loc>Cham, Switzerland</publisher-loc>: <publisher-name>Springer</publisher-name>; <year>2024</year>. p. <fpage>83</fpage>&#x2013;<lpage>97</lpage>. doi:<pub-id pub-id-type="doi">10.1007/978-3-031-53028-9_6</pub-id>.</mixed-citation></ref>
<ref id="ref-12"><label>[12]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Ahmed</surname> <given-names>SF</given-names></string-name>, <string-name><surname>Alam</surname> <given-names>MSB</given-names></string-name>, <string-name><surname>Afrin</surname> <given-names>S</given-names></string-name>, <string-name><surname>Rafa</surname> <given-names>SJ</given-names></string-name>, <string-name><surname>Taher</surname> <given-names>SB</given-names></string-name>, <string-name><surname>Kabir</surname> <given-names>M</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Toward a secure 5G-enabled internet of things: a survey on requirements, privacy, security, challenges, and opportunities</article-title>. <source>IEEE Access</source>. <year>2024</year>;<volume>12</volume>:<fpage>13125</fpage>&#x2013;<lpage>45</lpage>. doi:<pub-id pub-id-type="doi">10.1109/access.2024.3352508</pub-id>.</mixed-citation></ref>
<ref id="ref-13"><label>[13]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Gholami</surname> <given-names>P</given-names></string-name>, <string-name><surname>Seferoglu</surname> <given-names>H</given-names></string-name></person-group>. <article-title>DIGEST: fast and communication efficient decentralized learning with local updates</article-title>. <source>Trans Mach Learn Comm Netw</source>. <year>2024</year>;<volume>2</volume>(<issue>213</issue>):<fpage>1456</fpage>&#x2013;<lpage>74</lpage>. doi:<pub-id pub-id-type="doi">10.1109/tmlcn.2024.3354236</pub-id>.</mixed-citation></ref>
<ref id="ref-14"><label>[14]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Alabbadi</surname> <given-names>A</given-names></string-name>, <string-name><surname>Bajaber</surname> <given-names>F</given-names></string-name></person-group>. <article-title>An intrusion detection system over the IoT data streams using eXplainable artificial intelligence (XAI)</article-title>. <source>Sensors</source>. <year>2025</year>;<volume>25</volume>(<issue>3</issue>):<fpage>847</fpage>. doi:<pub-id pub-id-type="doi">10.3390/s25030847</pub-id>; <pub-id pub-id-type="pmid">39943488</pub-id></mixed-citation></ref>
<ref id="ref-15"><label>[15]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Sorour</surname> <given-names>SE</given-names></string-name>, <string-name><surname>Aljaafari</surname> <given-names>M</given-names></string-name>, <string-name><surname>Shaker</surname> <given-names>AM</given-names></string-name>, <string-name><surname>Amin</surname> <given-names>AE</given-names></string-name></person-group>. <article-title>LSTM-JSO framework for privacy preserving adaptive intrusion detection in federated IoT networks</article-title>. <source>Sci Rep</source>. <year>2025</year>;<volume>15</volume>(<issue>1</issue>):<fpage>11321</fpage>. doi:<pub-id pub-id-type="doi">10.1038/s41598-025-95966-z</pub-id>; <pub-id pub-id-type="pmid">40175537</pub-id></mixed-citation></ref>
<ref id="ref-16"><label>[16]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Agbor</surname> <given-names>BA</given-names></string-name>, <string-name><surname>Stephen</surname> <given-names>BU</given-names></string-name>, <string-name><surname>Asuquo</surname> <given-names>P</given-names></string-name>, <string-name><surname>Luke</surname> <given-names>UO</given-names></string-name>, <string-name><surname>Anaga</surname> <given-names>V</given-names></string-name></person-group>. <article-title>Hybrid CNN-BiLSTM&#x2013;DNN approach for detecting cybersecurity threats in IoT networks</article-title>. <source>Computers</source>. <year>2025</year>;<volume>14</volume>(<issue>2</issue>):<fpage>58</fpage>. doi:<pub-id pub-id-type="doi">10.3390/computers14020058</pub-id>.</mixed-citation></ref>
<ref id="ref-17"><label>[17]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Danquah</surname> <given-names>LKG</given-names></string-name>, <string-name><surname>Appiah</surname> <given-names>SY</given-names></string-name>, <string-name><surname>Mantey</surname> <given-names>VA</given-names></string-name>, <string-name><surname>Danlard</surname> <given-names>I</given-names></string-name>, <string-name><surname>Akowuah</surname> <given-names>EK</given-names></string-name></person-group>. <article-title>Computationally efficient deep federated learning with optimized feature selection for IoT botnet attack detection</article-title>. <source>Intell Syst Appl</source>. <year>2025</year>;<volume>25</volume>(<issue>4</issue>):<fpage>200462</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.iswa.2024.200462</pub-id>.</mixed-citation></ref>
<ref id="ref-18"><label>[18]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Wen</surname> <given-names>M</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>P</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>L</given-names></string-name></person-group>. <article-title>IDS-DWKAFL: an intrusion detection scheme based on dynamic weighted K-asynchronous federated learning for smart grid</article-title>. <source>J Inf Secur Appl</source>. <year>2025</year>;<volume>89</volume>(<issue>3</issue>):<fpage>103993</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.jisa.2025.103993</pub-id>.</mixed-citation></ref>
<ref id="ref-19"><label>[19]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Rampone</surname> <given-names>G</given-names></string-name>, <string-name><surname>Ivaniv</surname> <given-names>T</given-names></string-name>, <string-name><surname>Rampone</surname> <given-names>S</given-names></string-name></person-group>. <article-title>A hybrid federated learning framework for privacy-preserving near-real-time intrusion detection in IoT environments</article-title>. <source>Electronics</source>. <year>2025</year>;<volume>14</volume>(<issue>7</issue>):<fpage>1430</fpage>. doi:<pub-id pub-id-type="doi">10.3390/electronics14071430</pub-id>.</mixed-citation></ref>
<ref id="ref-20"><label>[20]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Han</surname> <given-names>Y</given-names></string-name></person-group>. <article-title>A privacy preserving federated learning system for IoT devices using blockchain and optimization</article-title>. <source>J Comput Commun</source>. <year>2024</year>;<volume>12</volume>(<issue>9</issue>):<fpage>78</fpage>&#x2013;<lpage>102</lpage>. doi:<pub-id pub-id-type="doi">10.4236/jcc.2024.129005</pub-id>.</mixed-citation></ref>
<ref id="ref-21"><label>[21]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Yuan</surname> <given-names>J</given-names></string-name>, <string-name><surname>Liu</surname> <given-names>W</given-names></string-name>, <string-name><surname>Shi</surname> <given-names>J</given-names></string-name>, <string-name><surname>Li</surname> <given-names>Q</given-names></string-name></person-group>. <article-title>Approximate homomorphic encryption based privacy-preserving machine learning: a survey</article-title>. <source>Artif Intell Rev</source>. <year>2025</year>;<volume>58</volume>(<issue>3</issue>):<fpage>82</fpage>. doi:<pub-id pub-id-type="doi">10.1007/s10462-024-11076-8</pub-id>.</mixed-citation></ref>
<ref id="ref-22"><label>[22]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Kheddar</surname> <given-names>H</given-names></string-name></person-group>. <article-title>Transformers and large language models for efficient intrusion detection systems: a comprehensive survey</article-title>. <source>Inf Fusion</source>. <year>2025</year>;<volume>124</volume>(<issue>1</issue>):<fpage>103347</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.inffus.2025.103347</pub-id>.</mixed-citation></ref>
<ref id="ref-23"><label>[23]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Gueriani</surname> <given-names>A</given-names></string-name>, <string-name><surname>Kheddar</surname> <given-names>H</given-names></string-name>, <string-name><surname>Mazari</surname> <given-names>AC</given-names></string-name>, <string-name><surname>Ghanem</surname> <given-names>MC</given-names></string-name></person-group>. <article-title>A robust cross-domain IDS using BiGRU-LSTM-attention for medical and industrial IoT security</article-title>. <source>ICT Express</source>. <year>2025</year>;<volume>8</volume>(<issue>11</issue>):<fpage>1</fpage>&#x2013;<lpage>10</lpage>. doi:<pub-id pub-id-type="doi">10.1016/j.icte.2025.08.011</pub-id>.</mixed-citation></ref>
</ref-list>
</back></article>
