<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.1 20151215//EN" "http://jats.nlm.nih.gov/publishing/1.1/JATS-journalpublishing1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xml:lang="en" article-type="research-article" dtd-version="1.1">
<front>
<journal-meta>
<journal-id journal-id-type="pmc">CMC</journal-id>
<journal-id journal-id-type="nlm-ta">CMC</journal-id>
<journal-id journal-id-type="publisher-id">CMC</journal-id>
<journal-title-group>
<journal-title>Computers, Materials &#x0026; Continua</journal-title>
</journal-title-group>
<issn pub-type="epub">1546-2226</issn>
<issn pub-type="ppub">1546-2218</issn>
<publisher>
<publisher-name>Tech Science Press</publisher-name>
<publisher-loc>USA</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">76608</article-id>
<article-id pub-id-type="doi">10.32604/cmc.2026.076608</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>From Hardening to Understanding: Adversarial Training vs. CF-Aug for Explainable Cyber-Threat Detection System</article-title>
<alt-title alt-title-type="left-running-head">From Hardening to Understanding: Adversarial Training vs. CF-Aug for Explainable Cyber-Threat Detection System</alt-title>
<alt-title alt-title-type="right-running-head">From Hardening to Understanding: Adversarial Training vs. CF-Aug for Explainable Cyber-Threat Detection System</alt-title>
</title-group>
<contrib-group>
<contrib id="author-1" contrib-type="author" corresp="yes">
<name name-style="western"><surname>Al-Essa</surname><given-names>Malik</given-names></name><xref ref-type="aff" rid="aff-1">1</xref><email>m.alessa@ju.edu.jo</email></contrib>
<contrib id="author-2" contrib-type="author">
<name name-style="western"><surname>Qatawneh</surname><given-names>Mohammad</given-names></name><xref ref-type="aff" rid="aff-2">2</xref><xref ref-type="aff" rid="aff-1">1</xref></contrib>
<contrib id="author-3" contrib-type="author">
<name name-style="western"><surname>Al-Shamayleh</surname><given-names>Ahmad Sami</given-names></name><xref ref-type="aff" rid="aff-3">3</xref></contrib>
<contrib id="author-4" contrib-type="author">
<name name-style="western"><surname>Abualghanam</surname><given-names>Orieb</given-names></name><xref ref-type="aff" rid="aff-1">1</xref></contrib>
<contrib id="author-5" contrib-type="author">
<name name-style="western"><surname>Almobaideen</surname><given-names>Wesam</given-names></name><xref ref-type="aff" rid="aff-4">4</xref><xref ref-type="aff" rid="aff-1">1</xref></contrib>
<aff id="aff-1"><label>1</label><institution>Computer Science Department, King Abdullah II Faculty for Information Technology, The University of Jordan</institution>, <addr-line>Amman, 11942</addr-line>, <country>Jordan</country></aff>
<aff id="aff-2"><label>2</label><institution>Department of Networks and Cybersecurity, Faculty of Information Technology, Al-Ahliyya Amman University</institution>, <addr-line>Amman, 19111</addr-line>, <country>Jordan</country></aff>
<aff id="aff-3"><label>3</label><institution>Department of Data Science and Artificial Intelligence, Faculty of Information Technology, Al-Ahliyya Amman University</institution>, <addr-line>Amman, 19111</addr-line>, <country>Jordan</country></aff>
<aff id="aff-4"><label>4</label><institution>Department of Electrical Engineering and Computing Sciences, Rochester Institute of Technology</institution>, <addr-line>Dubai, 341055</addr-line>, <country>United Arab Emirates</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>&#x002A;</label>Corresponding Author: Malik Al-Essa. Email: <email>m.alessa@ju.edu.jo</email></corresp>
</author-notes>
<pub-date date-type="collection" publication-format="electronic">
<year>2026</year>
</pub-date>
<pub-date date-type="pub" publication-format="electronic">
<day>9</day><month>4</month><year>2026</year>
</pub-date>
<volume>87</volume>
<issue>3</issue>
<elocation-id>17</elocation-id>
<history>
<date date-type="received">
<day>23</day>
<month>11</month>
<year>2025</year>
</date>
<date date-type="accepted">
<day>22</day>
<month>12</month>
<year>2025</year>
</date>
</history>
<permissions>
<copyright-statement>&#x00A9; 2026 The Authors. Published by Tech Science Press.</copyright-statement>
<copyright-year>2026</copyright-year>
<copyright-holder>The Authors</copyright-holder>
<license xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>This work is licensed under a <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</ext-link>, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<self-uri content-type="pdf" xlink:href="TSP_CMC_76608.pdf"></self-uri>
<abstract>
<p>Machine Learning (ML)-based intrusion detection systems (IDS) are vulnerable to small, protocol-valid manipulations that can push samples across brittle decision boundaries. We study two complementary remedies that reshape the learner in distinct ways. Adversarial Training (AT) exposes the model to worst-case, in-threat perturbations during learning to thicken local margins; Counterfactual Augmentation (CF-Aug) adds near-boundary exemplars that are explicitly constrained to be feasible, causally consistent, and operationally meaningful for defenders. The main goal of this work is to investigate and compare how AT and CF-Aug reshape the decision surface of the IDS. eXplainable Artificial Intelligence (XAI) is used to analyze shifts in the stability of global feature importance under both AT and CF-Aug and to link these shifts to the accuracy of the IDS in detecting cyber-threats. This yields a clear picture of when boundary hardening (AT) or boundary sculpting (CF-Aug) better serves an IDS. Adversarial samples are generated with two well-known techniques, the Fast Gradient Sign Method (FGSM) and Projected Gradient Descent (PGD). Both AT and CF-Aug achieve better accuracy than the baseline IDS.</p>
</abstract>
<kwd-group kwd-group-type="author">
<kwd>eXplainable artificial intelligence (XAI)</kwd>
<kwd>intrusion detection systems (IDS)</kwd>
<kwd>counterfactual explanation</kwd>
<kwd>adversarial training</kwd>
<kwd>deep learning</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec id="s1">
<label>1</label>
<title>Introduction</title>
<p>AI has become a core engine of technological progress, with steady advances across domains that include computer vision, healthcare, and autonomous platforms [<xref ref-type="bibr" rid="ref-1">1</xref>&#x2013;<xref ref-type="bibr" rid="ref-3">3</xref>]. Cybersecurity is one of the domains that has started to integrate AI, particularly Machine Learning (ML) and Deep Learning (DL), into its technologies. Data-driven methods now reinforce digital infrastructure against evolving threats through IDS/Intrusion Prevention Systems (IPS), malware and phishing classification, and behavior- and anomaly-based monitoring at network and endpoint layers [<xref ref-type="bibr" rid="ref-4">4</xref>]. This adoption wave is propelled by abundant, fine-grained telemetry (network flows, system logs, and endpoint signals), paired with elastic compute and rapid progress in representation learning.</p>
<p>Contemporary campaigns increasingly mix AI-assisted evasion with zero-day exploitation and rapidly adapting malware, which raises the bar for detection systems: they must learn new patterns quickly and provide transparent reasoning for their decisions [<xref ref-type="bibr" rid="ref-5">5</xref>]. At the same time, adversaries actively craft inputs to mislead ML pipelines, escalating an offense-defense arms race that makes robust and interpretable security models a practical requirement rather than an academic ideal [<xref ref-type="bibr" rid="ref-6">6</xref>]. Despite the growing adoption of AI in cybersecurity, two limitations still hold it back. First, DL models are vulnerable to small, protocol-consistent manipulations that can produce large prediction swings, enabling adversarial activity that remains operationally plausible. Second, many state-of-the-art Deep Neural Network (DNN) architectures operate as black boxes, offering limited visibility into why an alert was triggered, which features drove the decision, or how the prediction would change under a feasible mitigation. In practice, this combination of adversarial vulnerability and opacity erodes analyst trust, complicates governance and compliance audits, and slows incident response.</p>
<p>Recent research addressing these weaknesses calls for techniques that balance robustness with domain constraints and connect explainability to actionable controls. Explainability should progress beyond post hoc saliency to include counterfactuals, uncertainty estimates, and feature stability analyses that map cleanly to defense levers (rate limits, access control lists, service hardening). This work advances this agenda by comparing two complementary strategies, AT and CF-Aug, under security-aware constraints, and by assessing not only the accuracy but also the explainability of the DL models. Using MoDel Agnostic Language for Exploration and eXplanation (DALEX), an XAI technique, we provide model-agnostic explanations that reveal how each intervention shifts feature importance and decision boundaries under clean and adversarial conditions.</p>
<p>The main contributions of this paper are:
<list list-type="simple">
<list-item><label>1.</label><p>We design a unified experimental framework that directly compares AT and CF-Aug for DL-based IDS, under the same architecture, datasets, and evaluation metrics. This allows us to disentangle the effects of boundary hardening (AT) from boundary sculpting (CF-Aug) on cyber-threat detection performance.</p></list-item>
<list-item><label>2.</label><p>We introduce an XAI-driven augmentation pipeline in which counterfactual explanations are constrained to be protocol-valid, label-preserving, and operationally meaningful for defenders. These counterfactuals are then reused as training samples, enabling us to quantify how XAI-generated data reshapes the decision surface and improves minority-class detection.</p></list-item>
<list-item><label>3.</label><p>We perform a global explainability study using DALEX to compare feature-importance profiles across Baseline, AT, and CF-Aug models. By connecting shifts in global importance to changes in per-class F1, we show that CF-Aug encourages the model to rely more on semantic, security-relevant features, especially for rare attack types such as R2L and U2R in NSL-KDD and Slowloris/SSH-Patator in CICIDS17.</p></list-item>
<list-item><label>4.</label><p>We evaluate the framework on two benchmark datasets (NSL-KDD and a cleaned multi-class version of CICIDS17) and report detailed per-class metrics, highlighting the conditions under which AT and CF-Aug provide complementary benefits in terms of overall robustness, minority-class F1, and interpretability.</p></list-item>
</list></p>
<p>The remainder of this paper is structured as follows. <xref ref-type="sec" rid="s2">Section 2</xref> surveys the most relevant work in the literature. <xref ref-type="sec" rid="s3">Section 3</xref> presents the proposed approach in detail. <xref ref-type="sec" rid="s4">Section 4</xref> introduces the datasets, describes the implementation setup, and analyzes the experimental results. Finally, <xref ref-type="sec" rid="s5">Section 5</xref> summarizes the main contributions and discusses potential avenues for future research.</p>
</sec>
<sec id="s2">
<label>2</label>
<title>Related Work</title>
<sec id="s2_1">
<label>2.1</label>
<title>Adversarial Learning</title>
<p>Research on adversarial learning has developed along both the &#x201C;offensive&#x201D; and &#x201C;defensive&#x201D; dimensions of the problem. On the offensive side, a large body of work focuses on how small, carefully designed changes to legitimate inputs (so-called adversarial samples) can sharply raise the misclassification rate of DL models [<xref ref-type="bibr" rid="ref-7">7</xref>]. Numerous attack algorithms have been proposed for this purpose. Among them, the Fast Gradient Sign Method (FGSM) [<xref ref-type="bibr" rid="ref-8">8</xref>] is one of the most popular gradient-based white-box attacks. However, it may cause catastrophic overfitting when naively integrated into training procedures [<xref ref-type="bibr" rid="ref-9">9</xref>]. FGSM and similar methods leverage the model&#x2019;s loss landscape, usually cross-entropy, to compute perturbations that nudge inputs toward regions where the classifier&#x2019;s decisions are more fragile for a given class.</p>
<p>On the defensive side, research has mostly converged on two main families of methods: AT and provable (or certified) defenses [<xref ref-type="bibr" rid="ref-10">10</xref>]. In a nutshell, AT-based defenses augment the training set with adversarially perturbed samples (while keeping their original labels) and then retrain or fine-tune the classifier so that it learns to correctly classify both clean and perturbed inputs. The goal is a model that is empirically more robust than the original one against a specified threat model. Provable defenses, on the other hand, aim to provide formal guarantees of robustness by computing and optimizing robustness certificates, usually at considerable additional computational cost and with reduced scalability. Empirical studies such as [<xref ref-type="bibr" rid="ref-9">9</xref>] also suggest that AT remains the most widely adopted approach in practice, since it can provide strong empirical robustness, scales fairly well to large DNNs, and can be adapted to different attack algorithms. For these practical reasons, we focus on AT and refer the reader to [<xref ref-type="bibr" rid="ref-11">11</xref>] for a recent taxonomy of its many variants.</p>
<p>The intersection of adversarial learning and cybersecurity has attracted considerable attention, predominantly from the attack perspective. Various works design adversarial samples specifically to evade network intrusion and malware detectors [<xref ref-type="bibr" rid="ref-12">12</xref>,<xref ref-type="bibr" rid="ref-13">13</xref>], showing that even small perturbations can significantly degrade the performance of DL-based security systems. More recently, a smaller but growing body of work has started to explore adversarial defenses for DL in cybersecurity settings. For example, AT has been combined with generative adversarial networks (GANs) to improve intrusion detection and malware classification in [<xref ref-type="bibr" rid="ref-14">14</xref>]. In [<xref ref-type="bibr" rid="ref-15">15</xref>], adversarial samples are generated to extend and rebalance the training data, increasing the fraction of malicious traffic in a binary classification task. Reference [<xref ref-type="bibr" rid="ref-16">16</xref>] applies AT to enhance the robustness of supervised models for the automated detection of cyber threats in industrial control systems. Reference [<xref ref-type="bibr" rid="ref-17">17</xref>] performs a systematic evaluation of various evasion attacks, including an evaluation of AT on multiple DNN architectures.</p>
</sec>
<sec id="s2_2">
<label>2.2</label>
<title>XAI</title>
<p>Most DL models today are still black boxes: they produce a final verdict, for instance benign vs. malicious or normal vs. attack traffic, but give no indication of why they made that decision. For security teams, this lack of insight erodes trust, slows incident response, and makes it difficult for analysts to confirm or contest alerts in high-stakes environments. To address this, Reference [<xref ref-type="bibr" rid="ref-18">18</xref>] discusses how XAI techniques can be used to explain the decisions made by neural models for malware detection and vulnerability discovery. Similarly, Reference [<xref ref-type="bibr" rid="ref-19">19</xref>] applies XAI methods to identify which input features are most important for detecting each intrusion category, while Reference [<xref ref-type="bibr" rid="ref-20">20</xref>] proposes an explainable IDS/IPS designed for cloud environments. More recently, researchers have started to link XAI with adversarial learning in cybersecurity: Reference [<xref ref-type="bibr" rid="ref-21">21</xref>] uses an adversarial learning framework to understand why some network intrusions are misclassified by a DNN, expressing explanations as the smallest changes needed in the input to flip the model&#x2019;s decision for misclassified samples. In [<xref ref-type="bibr" rid="ref-22">22</xref>], local, instance-level explanations are combined with AT: the explanations guide the fine-tuning of a DNN previously trained with AT. Reference [<xref ref-type="bibr" rid="ref-23">23</xref>] integrates counterfactual explanations to provide defense mechanisms against poisoning and backdoor attacks.
In particular, a counterfactual explanation algorithm, originally designed to generate &#x201C;what-if&#x201D; explanations, is repurposed to systematically modify inputs in ways that drive the model toward different predictions [<xref ref-type="bibr" rid="ref-24">24</xref>]. In [<xref ref-type="bibr" rid="ref-25">25</xref>], Golden Jackal Driven Optimization is used to design a transparent, interpretable intrusion detection system that leverages explainable AI to strengthen and modernize cybersecurity defenses. In [<xref ref-type="bibr" rid="ref-26">26</xref>], XAI is used to improve the reliability of DoS attack detection in autonomous vehicle systems while providing transparent, interpretable insights into the vehicles&#x2019; security decisions.</p>
<p>Several recent studies combine AT with explainability techniques for IDS. For example, in our prior work [<xref ref-type="bibr" rid="ref-22">22</xref>], we used XAI to analyze and guide the AT process by identifying vulnerable regions of the feature space, but the explanations were not reused as synthetic training data. In contrast, the present work leverages counterfactual explanations directly as an augmentation mechanism (CF-Aug), allowing us to study how XAI-generated samples reshape the decision boundary and affect robustness. A complementary line of research focuses on building explainable IDS models without explicitly contrasting different robustness strategies. Existing XAI-IDS approaches typically employ post-hoc methods such as feature-importance analyses or local explanations to interpret deep or tree-based detectors, but they do not systematically compare AT and CF-Aug under a unified experimental setup. Our study fills this gap by placing AT and CF-Aug on the same footing and jointly evaluating their impact on both predictive performance and model interpretability. <xref ref-type="table" rid="table-1">Table 1</xref> summarizes related work and compares it to the proposed approach.</p>
<table-wrap id="table-1">
<label>Table 1</label>
<caption>
<title>Summary of representative adversarial training and XAI-based schemes related to IDS, positioned relative to the proposed approach</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
</colgroup>
<thead>
<tr>
<th>Work (Ref.)</th>
<th>Domain/Task</th>
<th>AT</th>
<th>CF</th>
<th>XAI</th>
</tr>
</thead>
<tbody>
<tr>
<td>[<xref ref-type="bibr" rid="ref-14">14</xref>]</td>
<td>Network intrusion detection</td>
<td>Yes</td>
<td>No</td>
<td>No</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-27">27</xref>]</td>
<td>Malware detection with GAN-based defenses</td>
<td>Yes</td>
<td>No</td>
<td>No</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-15">15</xref>]</td>
<td>Android malware detection</td>
<td>Yes</td>
<td>Partial</td>
<td>No</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-16">16</xref>]</td>
<td>ICS cyber-threat detection</td>
<td>Yes</td>
<td>No</td>
<td>No</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-17">17</xref>]</td>
<td>DL-based IDS under evasion attacks</td>
<td>Yes</td>
<td>No</td>
<td>No</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-18">18</xref>]</td>
<td>Malware and vulnerability detection</td>
<td>No</td>
<td>No</td>
<td>Yes</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-19">19</xref>]</td>
<td>Intrusion detection (feature importance)</td>
<td>No</td>
<td>No</td>
<td>Yes</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-20">20</xref>]</td>
<td>Cloud-based IDS/IPS</td>
<td>No</td>
<td>No</td>
<td>Yes</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-21">21</xref>]</td>
<td>Intrusion detection with adversarial analysis</td>
<td>Yes</td>
<td>No</td>
<td>Yes</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-22">22</xref>]</td>
<td>Cybersecurity classification with guided fine-tuning</td>
<td>Yes</td>
<td>No</td>
<td>Yes</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-23">23</xref>]</td>
<td>Defenses against poisoning/backdoor attacks</td>
<td>No</td>
<td>Yes</td>
<td>Yes</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-25">25</xref>]</td>
<td>Intrusion detection with metaheuristic optimization</td>
<td>No</td>
<td>No</td>
<td>Yes</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-26">26</xref>]</td>
<td>DoS detection in autonomous vehicles</td>
<td>No</td>
<td>No</td>
<td>Yes</td>
</tr>
<tr>
<td>This work</td>
<td>DL-based IDS on NSL-KDD and CICIDS17</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
</sec>
<sec id="s3">
<label>3</label>
<title>Method</title>
<p>The proposed method, summarized in Algorithm 1 (<xref ref-type="fig" rid="fig-1">Fig. 1</xref> illustrates its overall schema), establishes a comparative framework to study how distinct training regimes shape both robustness and global interpretability in DL. Given a labeled dataset <inline-formula id="ieqn-1"><mml:math id="mml-ieqn-1"><mml:mrow><mml:mi>&#x1D4AF;</mml:mi></mml:mrow><mml:mo>=</mml:mo><mml:mo fence="false" stretchy="false">{</mml:mo><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy="false">)</mml:mo><mml:msubsup><mml:mo fence="false" stretchy="false">}</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>N</mml:mi></mml:msubsup></mml:math></inline-formula> with <inline-formula id="ieqn-2"><mml:math id="mml-ieqn-2"><mml:msub><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x2208;</mml:mo><mml:msup><mml:mrow><mml:mi mathvariant="double-struck">R</mml:mi></mml:mrow><mml:mi>d</mml:mi></mml:msup></mml:math></inline-formula>, a baseline DL model <inline-formula id="ieqn-3"><mml:math id="mml-ieqn-3"><mml:msub><mml:mrow><mml:mi>&#x02133;</mml:mi></mml:mrow><mml:mi>&#x03B8;</mml:mi></mml:msub></mml:math></inline-formula> is first trained under standard empirical risk minimization and serves as the reference point for subsequent augmentation strategies (Algorithm 1 line 2).
To induce robustness to gradient-based perturbations, an adversarial augmentation set <inline-formula id="ieqn-4"><mml:math id="mml-ieqn-4"><mml:mrow><mml:mi>&#x1D49C;</mml:mi></mml:mrow></mml:math></inline-formula> is generated by applying FGSM and PGD to each training sample under a perturbation factor <inline-formula id="ieqn-5"><mml:math id="mml-ieqn-5"><mml:mi>&#x03F5;</mml:mi></mml:math></inline-formula> (Algorithm 1 line 6). Concretely, FGSM constructs<disp-formula id="eqn-1"><label>(1)</label><mml:math id="mml-eqn-1" display="block"><mml:mtable columnalign="right left right left right left right left right left right left" rowspacing="3pt" columnspacing="0em 2em 0em 2em 0em 2em 0em 2em 0em 2em 0em" displaystyle="true"><mml:mtr><mml:mtd /><mml:mtd><mml:mrow><mml:mover><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow><mml:mo stretchy="false">&#x007E;</mml:mo></mml:mover></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow><mml:mo>+</mml:mo><mml:mi>&#x03F5;</mml:mi><mml:mspace width="thinmathspace" /><mml:mrow><mml:mtext>sign</mml:mtext></mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi mathvariant="normal">&#x2207;</mml:mi><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow></mml:mrow></mml:msub><mml:mi>&#x2113;</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mrow><mml:mi>&#x02133;</mml:mi></mml:mrow><mml:mi>&#x03B8;</mml:mi></mml:msub><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow><mml:mo stretchy="false">)</mml:mo><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo stretchy="false">)</mml:mo><mml:mo stretchy="false">)</mml:mo><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>where <inline-formula id="ieqn-6"><mml:math id="mml-ieqn-6"><mml:msub><mml:mi mathvariant="normal">&#x2207;</mml:mi><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow></mml:mrow></mml:msub></mml:math></inline-formula> denotes the gradient computed with respect to <inline-formula id="ieqn-7"><mml:math id="mml-ieqn-7"><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow></mml:math></inline-formula>, and <inline-formula id="ieqn-8"><mml:math id="mml-ieqn-8"><mml:mi>&#x2113;</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mrow><mml:mi>&#x02133;</mml:mi></mml:mrow><mml:mi>&#x03B8;</mml:mi></mml:msub><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow><mml:mo stretchy="false">)</mml:mo><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:math></inline-formula> denotes the loss function of the neural model <inline-formula id="ieqn-9"><mml:math id="mml-ieqn-9"><mml:msub><mml:mrow><mml:mi>&#x02133;</mml:mi></mml:mrow><mml:mi>&#x03B8;</mml:mi></mml:msub></mml:math></inline-formula>. PGD, in contrast, performs <italic>K</italic> iterative steps with step size <inline-formula id="ieqn-10"><mml:math id="mml-ieqn-10"><mml:mi>&#x03B1;</mml:mi></mml:math></inline-formula> and projection <inline-formula id="ieqn-11"><mml:math id="mml-ieqn-11"><mml:msub><mml:mi mathvariant="normal">&#x03A0;</mml:mi><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x0212C;</mml:mi></mml:mrow><mml:mi>p</mml:mi></mml:msub><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow><mml:mo>,</mml:mo><mml:mi>&#x03F5;</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msub></mml:math></inline-formula> onto the <inline-formula id="ieqn-12"><mml:math id="mml-ieqn-12"><mml:msub><mml:mi>&#x2113;</mml:mi><mml:mi>p</mml:mi></mml:msub></mml:math></inline-formula>-ball of radius <inline-formula id="ieqn-13"><mml:math id="mml-ieqn-13"><mml:mi>&#x03F5;</mml:mi></mml:math></inline-formula>:
<disp-formula id="eqn-2"><label>(2)</label><mml:math id="mml-eqn-2" display="block"><mml:mtable columnalign="right left right left right left right left right left right left" rowspacing="3pt" columnspacing="0em 2em 0em 2em 0em 2em 0em 2em 0em 2em 0em" displaystyle="true"><mml:mtr><mml:mtd /><mml:mtd><mml:msup><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msup><mml:mo stretchy="false">&#x2190;</mml:mo><mml:msub><mml:mi mathvariant="normal">&#x03A0;</mml:mi><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x0212C;</mml:mi></mml:mrow><mml:mi>p</mml:mi></mml:msub><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow><mml:mo>,</mml:mo><mml:mi>&#x03F5;</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msub><mml:mspace width="negativethinmathspace" /><mml:mrow><mml:mo>(</mml:mo><mml:msup><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>t</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msup><mml:mo>+</mml:mo><mml:mi>&#x03B1;</mml:mi><mml:mspace width="thinmathspace" /><mml:mrow><mml:mtext>sign</mml:mtext></mml:mrow><mml:mspace width="negativethinmathspace" /><mml:mstyle scriptlevel="0"><mml:mrow><mml:mo maxsize="1.2em" minsize="1.2em">(</mml:mo></mml:mrow></mml:mstyle><mml:msub><mml:mi mathvariant="normal">&#x2207;</mml:mi><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow></mml:mrow></mml:msub><mml:mi>&#x2113;</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>&#x02133;</mml:mi></mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msup><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>t</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msup><mml:mo 
stretchy="false">)</mml:mo><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo stretchy="false">)</mml:mo><mml:mstyle scriptlevel="0"><mml:mrow><mml:mo maxsize="1.2em" minsize="1.2em">)</mml:mo></mml:mrow></mml:mstyle><mml:mo>)</mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:mspace width="1em" /><mml:msup><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mn>0</mml:mn><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula></p>
<fig id="fig-1">
<label>Figure 1</label>
<caption>
<title>Overall architecture of the proposed explainable intrusion detection framework comparing the baseline, adversarially trained, and counterfactually augmented DL models</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_76608-fig-1.tif"/>
</fig>
<fig id="fig-5">
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_76608-fig-5.tif"/>
</fig>
<p>In all our experiments, <xref ref-type="disp-formula" rid="eqn-1">Eqs. (1)</xref> and <xref ref-type="disp-formula" rid="eqn-2">(2)</xref> are instantiated with the <inline-formula id="ieqn-14"><mml:math id="mml-ieqn-14"><mml:msub><mml:mi>&#x2113;</mml:mi><mml:mi mathvariant="normal">&#x221E;</mml:mi></mml:msub></mml:math></inline-formula> norm. For both the NSL-KDD and CICIDS17 datasets, FGSM uses a perturbation factor of <inline-formula id="ieqn-15"><mml:math id="mml-ieqn-15"><mml:mi>&#x03F5;</mml:mi><mml:mo>=</mml:mo><mml:mn>0.01</mml:mn></mml:math></inline-formula>, while PGD uses the same factor with a step size of <inline-formula id="ieqn-16"><mml:math id="mml-ieqn-16"><mml:mtext>step_size</mml:mtext><mml:mo>=</mml:mo><mml:mn>10</mml:mn></mml:math></inline-formula>.</p>
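<p>The two update rules in Eqs. (1) and (2) can be sketched in a few lines of NumPy. The listing below is an illustrative sketch only, not the implementation used in the experiments: it assumes a linear softmax classifier as a stand-in for the DNN so that the input gradient has a closed form, and the names <monospace>input_gradient</monospace>, <monospace>fgsm</monospace>, and <monospace>pgd</monospace>, the random toy data, and the values chosen for the PGD step size and number of steps are hypothetical. Only the perturbation factor of 0.01 and the infinity norm are taken from the text.</p>
<code language="python"><![CDATA[
```python
import numpy as np


def softmax(z):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def input_gradient(W, b, x, y):
    """Gradient of the cross-entropy loss w.r.t. x for a linear softmax model.

    Assumption: the model is logits = x @ W + b, a stand-in for the DNN.
    """
    p = softmax(x @ W + b)            # class probabilities, shape (n, c)
    p[np.arange(len(y)), y] -= 1.0    # dL/dlogits = p - onehot(y)
    return p @ W.T                    # chain rule back to the input, (n, d)


def fgsm(W, b, x, y, eps):
    """Single-step attack of Eq. (1): x + eps * sign(grad_x loss)."""
    return x + eps * np.sign(input_gradient(W, b, x, y))


def pgd(W, b, x, y, eps, alpha, steps):
    """Iterative attack of Eq. (2) under the l-infinity norm."""
    x_adv = x.copy()                  # x^(0) = x
    for _ in range(steps):
        x_adv = x_adv + alpha * np.sign(input_gradient(W, b, x_adv, y))
        # Projection onto the l-infinity ball of radius eps centred at x.
        x_adv = np.clip(x_adv, x - eps, x + eps)
    return x_adv


# Toy data (hypothetical): 4 samples, 5 features, 3 classes.
rng = np.random.default_rng(0)
W = rng.normal(size=(5, 3))
b = np.zeros(3)
x = rng.normal(size=(4, 5))
y = np.array([0, 1, 2, 0])

x_fgsm = fgsm(W, b, x, y, eps=0.01)
# alpha and steps are illustrative assumptions, not the paper's settings.
x_pgd = pgd(W, b, x, y, eps=0.01, alpha=0.0025, steps=10)
```
]]></code>
<p>Under the infinity norm the projection reduces to element-wise clipping, which is why a single <monospace>np.clip</monospace> call suffices here. For tabular IDS features, an additional clipping step to each feature&#x2019;s valid range would likely be needed to keep the perturbed samples protocol-valid; the sketch omits it.</p>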
<p>These adversarial samples are retained with their ground-truth labels and aggregated into <inline-formula id="ieqn-59"><mml:math id="mml-ieqn-59"><mml:mrow><mml:mi>&#x1D49C;</mml:mi></mml:mrow></mml:math></inline-formula>, after which a second model <inline-formula id="ieqn-60"><mml:math id="mml-ieqn-60"><mml:msub><mml:mrow><mml:mi>&#x02133;</mml:mi></mml:mrow><mml:mi>&#x03D5;</mml:mi></mml:msub></mml:math></inline-formula> is trained on <inline-formula id="ieqn-61"><mml:math id="mml-ieqn-61"><mml:mrow><mml:mi>&#x1D4AF;</mml:mi></mml:mrow><mml:mo>&#x222A;</mml:mo><mml:mrow><mml:mi>&#x1D49C;</mml:mi></mml:mrow></mml:math></inline-formula> following the classical AT paradigm, typically improving accuracy (Algorithm 1 line 12). In parallel, the method introduces CF-Aug to encourage semantically meaningful and smoother decision boundaries: for each <inline-formula id="ieqn-62"><mml:math id="mml-ieqn-62"><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo stretchy="false">)</mml:mo><mml:mo>&#x2208;</mml:mo><mml:mrow><mml:mi>&#x1D4AF;</mml:mi></mml:mrow></mml:math></inline-formula>, a minimally modified <inline-formula id="ieqn-63"><mml:math id="mml-ieqn-63"><mml:mrow><mml:mover><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow></mml:math></inline-formula> is generated that flips the <italic>baseline</italic> model&#x2019;s prediction while preserving the original label, and these counterfactuals are collected into <inline-formula id="ieqn-64"><mml:math id="mml-ieqn-64"><mml:mrow><mml:mi>&#x1D49E;</mml:mi></mml:mrow></mml:math></inline-formula> (Algorithm 1 line 10). In this work, generating a counterfactual that &#x201C;flips the prediction minimally&#x201D; is formalized as a constrained optimization problem. 
Given a trained baseline model <inline-formula id="ieqn-65"><mml:math id="mml-ieqn-65"><mml:msub><mml:mrow><mml:mi>&#x02133;</mml:mi></mml:mrow><mml:mi>&#x03B8;</mml:mi></mml:msub></mml:math></inline-formula> and an input <inline-formula id="ieqn-66"><mml:math id="mml-ieqn-66"><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow></mml:math></inline-formula>, we seek a counterfactual <inline-formula id="ieqn-67"><mml:math id="mml-ieqn-67"><mml:mrow><mml:mover><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow></mml:math></inline-formula> that changes the model&#x2019;s predicted class while staying as close as possible to <inline-formula id="ieqn-68"><mml:math id="mml-ieqn-68"><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow></mml:math></inline-formula> and remaining domain-feasible. Concretely, we approximate the solution of
<disp-formula id="ueqn-3"><mml:math id="mml-ueqn-3" display="block"><mml:mtable columnalign="right left right left right left right left right left right left" rowspacing="3pt" columnspacing="0em 2em 0em 2em 0em 2em 0em 2em 0em 2em 0em" displaystyle="true"><mml:mtr><mml:mtd /><mml:mtd><mml:mi></mml:mi><mml:munder><mml:mo movablelimits="true" form="prefix">min</mml:mo><mml:mrow><mml:mrow><mml:mover><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow></mml:mrow></mml:munder><mml:mtext>&#xA0;</mml:mtext><mml:mo fence="false" stretchy="false">&#x2016;</mml:mo><mml:mrow><mml:mover><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow><mml:msub><mml:mo fence="false" stretchy="false">&#x2016;</mml:mo><mml:mn>1</mml:mn></mml:msub><mml:mspace width="1em" /><mml:mrow><mml:mtext>s.t.</mml:mtext></mml:mrow><mml:mspace width="1em" /><mml:mi>arg</mml:mi><mml:mo>&#x2061;</mml:mo><mml:munder><mml:mo movablelimits="true" form="prefix">max</mml:mo><mml:mi>c</mml:mi></mml:munder><mml:msub><mml:mrow><mml:mi>&#x02133;</mml:mi></mml:mrow><mml:mi>&#x03B8;</mml:mi></mml:msub><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mover><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow><mml:msub><mml:mo stretchy="false">)</mml:mo><mml:mi>c</mml:mi></mml:msub><mml:mo>&#x2260;</mml:mo><mml:mi>arg</mml:mi><mml:mo>&#x2061;</mml:mo><mml:munder><mml:mo movablelimits="true" form="prefix">max</mml:mo><mml:mi>c</mml:mi></mml:munder><mml:msub><mml:mrow><mml:mi>&#x02133;</mml:mi></mml:mrow><mml:mi>&#x03B8;</mml:mi></mml:msub><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow><mml:msub><mml:mo 
stretchy="false">)</mml:mo><mml:mi>c</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:mtext>&#xA0;</mml:mtext><mml:mrow><mml:mover><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow><mml:mo>&#x2208;</mml:mo><mml:mrow><mml:mi>&#x2131;</mml:mi></mml:mrow><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>where <inline-formula id="ieqn-69"><mml:math id="mml-ieqn-69"><mml:mrow><mml:mi>&#x2131;</mml:mi></mml:mrow></mml:math></inline-formula> denotes the set of feasible samples defined by domain constraints. The optimization is implemented as an iterative search procedure that updates <inline-formula id="ieqn-70"><mml:math id="mml-ieqn-70"><mml:mrow><mml:mover><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow></mml:math></inline-formula> while monitoring both the model prediction and the distance <inline-formula id="ieqn-71"><mml:math id="mml-ieqn-71"><mml:mo fence="false" stretchy="false">&#x2016;</mml:mo><mml:mrow><mml:mover><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow><mml:msub><mml:mo fence="false" stretchy="false">&#x2016;</mml:mo><mml:mn>1</mml:mn></mml:msub></mml:math></inline-formula>; after each update, the candidate is projected back onto <inline-formula id="ieqn-72"><mml:math id="mml-ieqn-72"><mml:mrow><mml:mi>&#x2131;</mml:mi></mml:mrow></mml:math></inline-formula> and discarded if it violates any constraint, exceeds a maximum allowed distance <inline-formula id="ieqn-73"><mml:math id="mml-ieqn-73"><mml:mo fence="false" stretchy="false">&#x2016;</mml:mo><mml:mrow><mml:mover><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow><mml:mo 
stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow><mml:msub><mml:mo fence="false" stretchy="false">&#x2016;</mml:mo><mml:mn>1</mml:mn></mml:msub><mml:mo>&#x2264;</mml:mo><mml:mi>&#x03C4;</mml:mi></mml:math></inline-formula>, or fails to flip the prediction within a fixed number of iterations. Only candidates that both flip the prediction and satisfy all feasibility criteria are retained and added to the CF-Aug set <inline-formula id="ieqn-74"><mml:math id="mml-ieqn-74"><mml:mrow><mml:mi>&#x1D49E;</mml:mi></mml:mrow></mml:math></inline-formula>, ensuring that CF-Aug uses realistic, near-boundary samples rather than arbitrary perturbations.</p>
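<p>The search loop described above can be illustrated with a minimal sketch. The greedy single-feature moves, the box-shaped feasible set, the step size, and the toy linear scorer (<monospace>W</monospace>, <monospace>B</monospace>) are assumptions made purely for illustration; the text does not specify the actual generator, and all function names below are hypothetical.</p>
<code language="python"><![CDATA[
```python
from typing import Callable, List, Optional, Sequence


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """L1 distance between a candidate and the original input."""
    return sum(abs(p - q) for p, q in zip(a, b))


def project_to_feasible(x: Sequence[float], low: float = 0.0, high: float = 1.0) -> List[float]:
    """Assumed feasible set F: a box (e.g., non-negative, min-max scaled features)."""
    return [min(max(v, low), high) for v in x]


def find_counterfactual(
    x: Sequence[float],
    predict: Callable[[Sequence[float]], int],
    class_score: Callable[[Sequence[float], int], float],
    tau: float,
    step: float = 0.05,
    max_iter: int = 50,
) -> Optional[List[float]]:
    """Greedy search for a nearby input that flips the predicted class."""
    original_class = predict(x)
    candidate = list(x)
    for _ in range(max_iter):
        best, best_score = None, class_score(candidate, original_class)
        # Try moving each feature up or down by `step`, projecting back onto F.
        for i in range(len(candidate)):
            for delta in (-step, step):
                trial = list(candidate)
                trial[i] += delta
                trial = project_to_feasible(trial)
                s = class_score(trial, original_class)
                if s < best_score:
                    best, best_score = trial, s
        if best is None:
            return None  # no feasible move lowers the original-class score
        candidate = best
        if l1_distance(candidate, x) > tau:
            return None  # distance budget tau exceeded: discard
        if predict(candidate) != original_class:
            return candidate  # prediction flipped within the budget: retain
    return None  # iteration limit reached without a flip: discard


# Hypothetical two-feature linear model used only to exercise the search.
W, B = [2.0, -1.0], -0.45


def logit(x: Sequence[float]) -> float:
    return sum(w * v for w, v in zip(W, x)) + B


def predict(x: Sequence[float]) -> int:
    return 1 if logit(x) > 0 else 0


def class_score(x: Sequence[float], c: int) -> float:
    return logit(x) if c == 1 else -logit(x)


x = [0.6, 0.2]
cf = find_counterfactual(x, predict, class_score, tau=0.5)
too_far = find_counterfactual(x, predict, class_score, tau=0.1)
```
]]></code>
<p>In this toy run, <monospace>cf</monospace> is a retained counterfactual, whereas <monospace>too_far</monospace> is <monospace>None</monospace> because the tighter budget is exhausted before the prediction flips. A retained counterfactual would then be added to the CF-Aug set together with the original label of the input.</p>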
<p>Training a third model <inline-formula id="ieqn-75"><mml:math id="mml-ieqn-75"><mml:msub><mml:mrow><mml:mi>&#x02133;</mml:mi></mml:mrow><mml:mi mathvariant="normal">&#x2207;</mml:mi></mml:msub></mml:math></inline-formula> on <inline-formula id="ieqn-76"><mml:math id="mml-ieqn-76"><mml:mrow><mml:mi>&#x1D4AF;</mml:mi></mml:mrow><mml:mo>&#x222A;</mml:mo><mml:mrow><mml:mi>&#x1D49E;</mml:mi></mml:mrow></mml:math></inline-formula> promotes decision surfaces that are less sensitive to superficial manipulations, especially when the counterfactual generator enforces domain constraints such as non-negativity, type integrity, and known invariants (Algorithm 1 line 14). After training <inline-formula id="ieqn-77"><mml:math id="mml-ieqn-77"><mml:msub><mml:mrow><mml:mi>&#x02133;</mml:mi></mml:mrow><mml:mi>&#x03B8;</mml:mi></mml:msub></mml:math></inline-formula>, <inline-formula id="ieqn-78"><mml:math id="mml-ieqn-78"><mml:msub><mml:mrow><mml:mi>&#x02133;</mml:mi></mml:mrow><mml:mi>&#x03D5;</mml:mi></mml:msub></mml:math></inline-formula>, and <inline-formula id="ieqn-79"><mml:math id="mml-ieqn-79"><mml:msub><mml:mrow><mml:mi>&#x02133;</mml:mi></mml:mrow><mml:mi mathvariant="normal">&#x2207;</mml:mi></mml:msub></mml:math></inline-formula>, the pipeline employs DALEX to obtain global explanations (e.g., feature-importance profiles); comparing these profiles across the three models (Algorithm 1 line 17) reveals whether robustness-oriented training reallocates importance from brittle or spurious features to more stable, task-relevant features.</p>
</sec>
<sec id="s4">
<label>4</label>
<title>Empirical Evaluation and Discussion</title>
<p>We assessed the effectiveness of the proposed approach through a series of experiments conducted on two established benchmark datasets for cybersecurity. The details of these datasets are presented in <xref ref-type="sec" rid="s4_1">Section 4.1</xref>, while <xref ref-type="sec" rid="s4_2">Section 4.2</xref> details the implementation aspects of the proposed method. The results are discussed in <xref ref-type="sec" rid="s4_3">Section 4.3</xref>.</p>
<sec id="s4_1">
<label>4.1</label>
<title>Datasets</title>
<p>Two datasets are used to evaluate the proposed method: NSL-KDD and CICIDS17. Both are multi-class datasets comprising one normal class and several attack classes. The datasets are described in <xref ref-type="table" rid="table-2">Table 2</xref>.</p>
<table-wrap id="table-2">
<label>Table 2</label>
<caption>
<title>Data description (#: Number of)</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
</colgroup>
<thead>
<tr>
<th>Dataset</th>
<th>#Training set</th>
<th>#Testing set</th>
<th>#Features</th>
<th>#Classes</th>
</tr>
</thead>
<tbody>
<tr>
<td>NSL-KDD</td>
<td>25,192</td>
<td>22,544</td>
<td>41</td>
<td>5</td>
</tr>
<tr>
<td>CICIDS17</td>
<td>100,000</td>
<td>100,000</td>
<td>72</td>
<td>9</td>
</tr>
</tbody>
</table>
</table-wrap>
<p><italic>NSL-KDD</italic></p>
<p>The NSL-KDD dataset [<xref ref-type="bibr" rid="ref-28">28</xref>]<xref ref-type="fn" rid="fn1"><sup>1</sup></xref><fn id="fn1"><label>1</label><p><ext-link ext-link-type="uri" xlink:href="https://www.unb.ca/cic/datasets/nsl.html">https://www.unb.ca/cic/datasets/nsl.html</ext-link></p></fn> contains labeled network-flow records covering <italic>normal</italic> traffic and four broad attack categories: Denial of Service (DoS), Probe, Remote-to-Local (R2L), and User-to-Root (U2R). Each record comprises 41 features (38 numeric and 3 categorical) plus the class label; feature definitions are provided in [<xref ref-type="bibr" rid="ref-28">28</xref>]. A distinctive aspect of NSL-KDD is its shift between training and testing distributions: the training split includes 21 attack subtypes, whereas the test split contains 37, introducing 16 previously unseen attacks at evaluation time. This property stresses generalization to novel behaviors, an ability required of any practical IDS. The label distribution is highly skewed, with R2L and U2R representing rare classes.</p>
<p>Although NSL-KDD is dated and does not perfectly reflect today&#x2019;s production networks, it remains a de facto multi-class benchmark that enables controlled comparison across IDS methods and careful analysis of minority-class detection. Following common practice, we adopt the predefined NSL-KDD splits. We use the <monospace>KDDTrain&#x002B;20Percent</monospace> subset as the training set and <monospace>KDDTest&#x002B;</monospace> as the held-out test set. This preserves the benchmark&#x2019;s original difficulty and allows a fair comparison with prior work while keeping the training size computationally manageable. All models are trained on <monospace>KDDTrain&#x002B;20Percent</monospace> and evaluated on <monospace>KDDTest&#x002B;</monospace>. A stratified 20% subset of the training set is further held out as a validation set for hyperparameter tuning.</p>
<p><italic>CICIDS17</italic></p>
<p>The CICIDS17 dataset [<xref ref-type="bibr" rid="ref-29">29</xref>] is one of the most commonly used benchmarks in cybersecurity research. It was captured in a realistic network testbed that included a variety of devices and operating systems, arranged into separate victim and attacker networks that communicated over the Internet. The recorded traffic covers widely used protocols such as HTTP, HTTPS, FTP, SSH, and SMTP. To better reflect real-world usage, the authors employed profiling agents&#x2014;trained on traffic generated by real human users&#x2014;to automatically produce both benign and malicious network flows. The dataset contains multiple attack scenarios, including brute-force attacks, Heartbleed, botnet communications, several forms of DoS and DDoS, infiltration attempts, and web-based attacks. Each flow in CICIDS17 is represented by 78 numerical features and a single class label, all extracted using the CICFlowMeter tool and described in detail in [<xref ref-type="bibr" rid="ref-29">29</xref>].</p>
<p>For our experiments, we rely on the refined version of CICIDS17<xref ref-type="fn" rid="fn2"><sup>2</sup></xref><fn id="fn2"><label>2</label><p><ext-link ext-link-type="uri" xlink:href="https://downloads.distrinet-research.be/WTMC2021">downloads.distrinet-research.be/WTMC2021</ext-link></p></fn> released by [<xref ref-type="bibr" rid="ref-30">30</xref>]. This cleaned version corrects issues in the original dataset by removing meaningless artifacts, fixing dataset errors, and discarding mislabeled traces. In particular, it retains 72 features from the original feature set, excluding those deemed irrelevant or prone to causing overfitting. The refined dataset was previously used in [<xref ref-type="bibr" rid="ref-31">31</xref>] for binary classification; in this work, we extend its use to a multi-class classification setting. From this dataset, we construct two disjoint subsets of 100,000 samples each: one for training and one for testing. We perform <italic>stratified sampling without replacement</italic> using a fixed random seed, preserving the original class proportions in both splits (approximately 80% benign and 20% attack traffic). Within the training subset, we randomly select 20% of the samples (again using stratified sampling and the same seed) as a validation set. This procedure ensures that (i) the training and test sets do not overlap, (ii) each split reflects the original class distribution of CICIDS17, and (iii) our results are reproducible. The final processed dataset includes <inline-formula id="ieqn-80"><mml:math id="mml-ieqn-80"><mml:mn>8</mml:mn></mml:math></inline-formula> attack types: PortScan, DoS Hulk, DDoS, DoS GoldenEye, DoS Slowloris, FTP-Patator, SSH-Patator, and DoS Slowhttptest.</p>
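<p>The sampling procedure described above can be sketched as follows. This is an illustrative sketch rather than the exact script used for the experiments: the function name, the seed value, and the rounding of per-class quotas are assumptions.</p>
<code language="python"><![CDATA[
```python
import random
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple


def stratified_disjoint_split(
    labels: Sequence[Hashable], n_train: int, n_test: int, seed: int = 42
) -> Tuple[List[int], List[int]]:
    """Draw disjoint train/test index sets that keep the class proportions."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    by_class: Dict[Hashable, List[int]] = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    total = len(labels)
    train_idx: List[int] = []
    test_idx: List[int] = []
    for label in sorted(by_class, key=str):
        indices = by_class[label]
        rng.shuffle(indices)
        # Per-class quota proportional to the class frequency (rounded).
        k_train = round(n_train * len(indices) / total)
        k_test = round(n_test * len(indices) / total)
        if k_train + k_test > len(indices):
            raise ValueError(f"not enough samples in class {label!r}")
        # Sampling without replacement: the two slices never overlap.
        train_idx.extend(indices[:k_train])
        test_idx.extend(indices[k_train:k_train + k_test])
    return train_idx, test_idx


# Toy label vector with the approximate 80/20 benign/attack ratio.
labels = ["benign"] * 800 + ["attack"] * 200
train_idx, test_idx = stratified_disjoint_split(labels, 100, 100, seed=7)
```
]]></code>
<p>Because quotas are rounded per class, the realized split sizes can differ from the requested ones by a few samples when class frequencies do not divide evenly; the same function could be reapplied to the training indices to carve out the 20% validation subset.</p>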
</sec>
<sec id="s4_2">
<label>4.2</label>
<title>Implementation Details</title>
<p>The implementation was carried out in Python 3.12 using Keras 2.7 for model construction, with TensorFlow as the computational backend. All numerical features were normalized via min&#x2013;max scaling, which linearly maps each feature to the interval <inline-formula id="ieqn-81"><mml:math id="mml-ieqn-81"><mml:mo stretchy="false">[</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">]</mml:mo></mml:math></inline-formula>. The scaling parameters were fit on the training split only and then applied unchanged to the test set to prevent leakage. This step mitigates disproportionate influence from features with different units or scales and typically improves convergence stability for gradient-based optimization. Neural-network hyperparameters were tuned with the Tree-structured Parzen Estimator (TPE) as implemented in <monospace>Hyperopt</monospace>. We held out <inline-formula id="ieqn-82"><mml:math id="mml-ieqn-82"><mml:mn>20</mml:mn><mml:mi mathvariant="normal">&#x0025;</mml:mi></mml:math></inline-formula> of the training set as a stratified validation subset, which preserves class proportions and follows the common 80/20 (Pareto-style) allocation. We ran thirty optimization trials and selected the configuration with the lowest validation loss. The search space is summarized in <xref ref-type="table" rid="table-3">Table 3</xref>. The resulting hyperparameters are used for all DL models in this study.</p>
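<p>As a minimal illustration of the leakage-free normalization described above, the sketch below fits the per-feature minimum and maximum on training rows only and reuses them for held-out rows. The function names and the handling of constant features are assumptions, not details taken from the implementation.</p>
<code language="python"><![CDATA[
```python
from typing import List, Sequence, Tuple


def fit_min_max(train_rows: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Compute per-feature minima and maxima on the training split only."""
    columns = list(zip(*train_rows))
    return [min(col) for col in columns], [max(col) for col in columns]


def transform_min_max(
    rows: Sequence[Sequence[float]], mins: Sequence[float], maxs: Sequence[float]
) -> List[List[float]]:
    """Apply x' = (x - min) / (max - min) with the training statistics."""
    return [
        [
            (v - lo) / (hi - lo) if hi > lo else 0.0  # constant feature -> 0 (assumption)
            for v, lo, hi in zip(row, mins, maxs)
        ]
        for row in rows
    ]


train = [[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]]
test = [[20.0, 15.0]]

mins, maxs = fit_min_max(train)                    # statistics from training data only
train_scaled = transform_min_max(train, mins, maxs)
test_scaled = transform_min_max(test, mins, maxs)  # reuses the training statistics
```
]]></code>
<p>Here the first test feature maps to 2.0, which shows that held-out values beyond the training range fall outside the unit interval; whether such values should be clipped is a separate design choice that is not specified here.</p>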
<table-wrap id="table-3">
<label>Table 3</label>
<caption>
<title>Hyperparameter search space</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
</colgroup>
<thead>
<tr>
<th>Hyperparameter</th>
<th>Values</th>
</tr>
</thead>
<tbody>
<tr>
<td>Mini-batch size</td>
<td><inline-formula id="ieqn-83"><mml:math id="mml-ieqn-83"><mml:mo fence="false" stretchy="false">{</mml:mo><mml:msup><mml:mn>2</mml:mn><mml:mn>5</mml:mn></mml:msup><mml:mo>,</mml:mo><mml:msup><mml:mn>2</mml:mn><mml:mn>6</mml:mn></mml:msup><mml:mo>,</mml:mo><mml:msup><mml:mn>2</mml:mn><mml:mn>7</mml:mn></mml:msup><mml:mo>,</mml:mo><mml:msup><mml:mn>2</mml:mn><mml:mn>8</mml:mn></mml:msup><mml:mo>,</mml:mo><mml:msup><mml:mn>2</mml:mn><mml:mn>9</mml:mn></mml:msup><mml:mo fence="false" stretchy="false">}</mml:mo></mml:math></inline-formula></td>
</tr>
<tr>
<td>Learning rate</td>
<td><inline-formula id="ieqn-84"><mml:math id="mml-ieqn-84"><mml:mo stretchy="false">[</mml:mo><mml:msup><mml:mn>10</mml:mn><mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mn>4</mml:mn></mml:mrow></mml:msup><mml:mo>,</mml:mo><mml:mspace width="thinmathspace" /><mml:msup><mml:mn>10</mml:mn><mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mn>3</mml:mn></mml:mrow></mml:msup><mml:mo stretchy="false">]</mml:mo></mml:math></inline-formula></td>
</tr>
<tr>
<td>Dropout</td>
<td><inline-formula id="ieqn-85"><mml:math id="mml-ieqn-85"><mml:mo stretchy="false">[</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mspace width="thinmathspace" /><mml:mn>1</mml:mn><mml:mo stretchy="false">]</mml:mo></mml:math></inline-formula></td>
</tr>
<tr>
<td>Number of neurons per hidden layer</td>
<td><inline-formula id="ieqn-86"><mml:math id="mml-ieqn-86"><mml:mo fence="false" stretchy="false">{</mml:mo><mml:msup><mml:mn>2</mml:mn><mml:mn>5</mml:mn></mml:msup><mml:mo>,</mml:mo><mml:msup><mml:mn>2</mml:mn><mml:mn>6</mml:mn></mml:msup><mml:mo>,</mml:mo><mml:msup><mml:mn>2</mml:mn><mml:mn>7</mml:mn></mml:msup><mml:mo>,</mml:mo><mml:msup><mml:mn>2</mml:mn><mml:mn>8</mml:mn></mml:msup><mml:mo>,</mml:mo><mml:msup><mml:mn>2</mml:mn><mml:mn>9</mml:mn></mml:msup><mml:mo>,</mml:mo><mml:msup><mml:mn>2</mml:mn><mml:mrow><mml:mn>10</mml:mn></mml:mrow></mml:msup><mml:mo fence="false" stretchy="false">}</mml:mo></mml:math></inline-formula></td>
</tr>
</tbody>
</table>
</table-wrap>
<p>The DNN architecture comprises three fully connected layers, with the number of neurons per layer chosen by the hyperparameter search. To enhance generalization and reduce overfitting, the network includes dropout and batch normalization. Hidden layers use the Rectified Linear Unit (ReLU) activation for nonlinearity, and the output layer employs a softmax activation to produce class probabilities. Training proceeded for a maximum of <inline-formula id="ieqn-87"><mml:math id="mml-ieqn-87"><mml:mi>T</mml:mi><mml:mo>=</mml:mo><mml:mn>150</mml:mn></mml:math></inline-formula> epochs with early stopping based on validation loss. Let <inline-formula id="ieqn-88"><mml:math id="mml-ieqn-88"><mml:msub><mml:mrow><mml:mi>&#x02133;</mml:mi></mml:mrow><mml:mi>t</mml:mi></mml:msub></mml:math></inline-formula> denote the DL model after epoch <inline-formula id="ieqn-89"><mml:math id="mml-ieqn-89"><mml:mi>t</mml:mi></mml:math></inline-formula> and <inline-formula id="ieqn-90"><mml:math id="mml-ieqn-90"><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mrow><mml:mtext>val</mml:mtext></mml:mrow></mml:mrow></mml:msub><mml:mo stretchy="false">(</mml:mo><mml:mi>t</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:math></inline-formula> the validation loss; we select
<disp-formula id="ueqn-4"><mml:math id="mml-ueqn-4" display="block"><mml:mtable columnalign="right left right left right left right left right left right left" rowspacing="3pt" columnspacing="0em 2em 0em 2em 0em 2em 0em 2em 0em 2em 0em" displaystyle="true"><mml:mtr><mml:mtd /><mml:mtd><mml:msup><mml:mi>t</mml:mi><mml:mo>&#x22C6;</mml:mo></mml:msup><mml:mtext>&#x00A0;</mml:mtext><mml:mo>=</mml:mo><mml:mtext>&#x00A0;</mml:mtext><mml:mi>arg</mml:mi><mml:mo>&#x2061;</mml:mo><mml:munder><mml:mo movablelimits="true" form="prefix">min</mml:mo><mml:mrow><mml:mi>t</mml:mi><mml:mo>&#x2264;</mml:mo><mml:mi>T</mml:mi></mml:mrow></mml:munder><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mrow><mml:mtext>val</mml:mtext></mml:mrow></mml:mrow></mml:msub><mml:mo stretchy="false">(</mml:mo><mml:mi>t</mml:mi><mml:mo stretchy="false">)</mml:mo><mml:mo>,</mml:mo><mml:mspace width="2em" /><mml:mrow><mml:mover><mml:mrow><mml:mi>&#x02133;</mml:mi></mml:mrow><mml:mo>&#x005E;</mml:mo></mml:mover></mml:mrow><mml:mtext>&#x00A0;</mml:mtext><mml:mo>=</mml:mo><mml:mtext>&#x00A0;</mml:mtext><mml:msub><mml:mrow><mml:mi>&#x02133;</mml:mi></mml:mrow><mml:mrow><mml:msup><mml:mi>t</mml:mi><mml:mo>&#x22C6;</mml:mo></mml:msup></mml:mrow></mml:msub><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula></p>
<p>We used patience <inline-formula id="ieqn-91"><mml:math id="mml-ieqn-91"><mml:mi>p</mml:mi><mml:mo>=</mml:mo><mml:mn>10</mml:mn></mml:math></inline-formula>: training halts if <inline-formula id="ieqn-92"><mml:math id="mml-ieqn-92"><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mrow><mml:mtext>val</mml:mtext></mml:mrow></mml:mrow></mml:msub></mml:math></inline-formula> fails to improve for 10 consecutive epochs, after which the best checkpoint is restored. The cap of 150 epochs provides adequate headroom while avoiding unnecessary computation; empirical learning curves indicated that MacroF1 gains beyond <inline-formula id="ieqn-93"><mml:math id="mml-ieqn-93"><mml:mo>&#x223C;</mml:mo></mml:math></inline-formula>140 epochs were negligible across runs.</p>
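<p>The checkpoint-selection rule above can be sketched in a few lines. The helper below is hypothetical and operates on a precomputed list of validation losses rather than on a live training loop.</p>
<code language="python"><![CDATA[
```python
from typing import Sequence, Tuple


def early_stopping_epoch(
    val_losses: Sequence[float], patience: int = 10, max_epochs: int = 150
) -> Tuple[int, int]:
    """Return (best_epoch, stop_epoch), both 1-indexed.

    best_epoch is the argmin of the validation loss over the epochs that
    were actually run; stop_epoch is the last epoch executed.
    """
    best_loss = float("inf")
    best_epoch = 0
    stop_epoch = 0
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        stop_epoch = epoch
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` consecutive epochs
    return best_epoch, stop_epoch


# The loss improves until epoch 3 and then plateaus.
plateau = [1.0, 0.8, 0.7] + [0.75] * 20
best_epoch, stop_epoch = early_stopping_epoch(plateau)

# A loss that keeps improving is capped at the epoch limit.
always_improving = [1.0 / t for t in range(1, 201)]
capped = early_stopping_epoch(always_improving)
```
]]></code>
<p>For the plateau sequence, the best checkpoint is epoch 3 and training stops at epoch 13. In Keras, comparable behavior is provided by the <monospace>EarlyStopping</monospace> callback with <monospace>monitor="val_loss"</monospace>, <monospace>patience=10</monospace>, and <monospace>restore_best_weights=True</monospace>.</p>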
<p>DALEX is employed to derive <italic>global explanations</italic> of the DL model, characterizing behavior over the entire dataset rather than at the level of individual predictions. By summarizing feature importance, inspecting model response patterns, and revealing structure-induced effects, DALEX offers a dataset-wide view of the model&#x2019;s decision logic. This perspective helps surface systematic biases, assess alignment with domain knowledge, and validate that salient signals are consistent with practitioner expectations. We integrate the DALEX Python package (v1.2.0) to quantify the global relevance of input features. We create adversarial samples using the FGSM and PGD methods as implemented in the Adversarial Robustness Toolbox (ART) library.</p>
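<p>To make the notion of global feature relevance concrete, the following sketch implements a generic permutation-based importance score: a feature is important if shuffling its column increases the model loss. It is a simplified stand-in and not the DALEX implementation; the error-rate loss, the number of repetitions, and the toy model are assumptions.</p>
<code language="python"><![CDATA[
```python
import random
from typing import Callable, List, Sequence


def error_rate(predict: Callable[[Sequence[float]], int],
               X: Sequence[Sequence[float]], y: Sequence[int]) -> float:
    """Fraction of misclassified rows (the loss assumed in this sketch)."""
    return sum(1 for row, label in zip(X, y) if predict(row) != label) / len(y)


def permutation_importance(predict: Callable[[Sequence[float]], int],
                           X: List[List[float]], y: Sequence[int],
                           n_repeats: int = 5, seed: int = 0) -> List[float]:
    """Mean loss increase after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = error_rate(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the label
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(error_rate(predict, X_perm, y) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_predict(row: Sequence[float]) -> int:
    """Hypothetical model that only looks at feature 0."""
    return 1 if row[0] > 0.5 else 0


X = [[0.1 if i % 2 == 0 else 0.9, (i % 5) / 5] for i in range(20)]
y = [toy_predict(row) for row in X]
scores = permutation_importance(toy_predict, X, y)
```
]]></code>
<p>Because the toy model ignores the second feature, its importance is exactly zero, while shuffling the first feature raises the error. Comparing such profiles across the baseline, AT, and CF-Aug models is the kind of analysis the global explanations are used for.</p>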
</sec>
<sec id="s4_3">
<label>4.3</label>
<title>Results and Discussion</title>
<p>The experimental study is designed with several objectives. First, we evaluate how well the proposed method performs in practice (<xref ref-type="sec" rid="s4_3_1">Section 4.3.1</xref>). Second, we conduct an ablation analysis to compare two enhancement strategies for strengthening a DL intrusion detection model: (i) training with CF-augmented samples, and (ii) AT. This comparison targets the overall predictive accuracy of the DL model for detecting attacks.</p>
<p>In addition to performance, we examine how these training strategies influence model behavior and decision logic. To that end, we apply the DALEX XAI framework to study and visualize how CF-Aug training and AT reshape the model&#x2019;s reasoning, and we contrast both against the baseline model <inline-formula id="ieqn-94"><mml:math id="mml-ieqn-94"><mml:msub><mml:mrow><mml:mi>&#x02133;</mml:mi></mml:mrow><mml:mi>&#x03B8;</mml:mi></mml:msub></mml:math></inline-formula> trained on the original dataset. This analysis (<xref ref-type="sec" rid="s4_3_2">Section 4.3.2</xref>) allows us to quantify not only which approach yields better classification results, but also which approach produces a model whose decisions are more interpretable and more stable.</p>
<sec id="s4_3_1">
<label>4.3.1</label>
<title>Analysis of the Performance for the Proposed Method</title>
<p>We perform an ablation study to assess the effectiveness of the proposed approach on two widely used intrusion-detection benchmarks: NSL-KDD and CICIDS17. <xref ref-type="table" rid="table-4">Table 4</xref> summarizes the number of training samples per class for both datasets under the three training regimes. In the Baseline setting, the models are trained only on the original data distribution. In the Adversarial-Augmented (Adv-Augmented) setting, each original sample is paired with an adversarial counterpart, effectively doubling the number of samples per class; consequently, this configuration yields the largest training sets for both benchmarks. In contrast, the CF-Augmented setting increases the dataset size more selectively: only those samples for which valid counterfactuals can be generated (given the chosen hyperparameters and the counterfactual generator&#x2019;s feasibility constraints) are added to the training pool. As a result, the total number of samples under CF-Augmented lies between the Baseline and Adv-Augmented cases, with a noticeable relative increase for rare classes (e.g., R2L and U2R in NSL-KDD and several minority attack types in CICIDS17). This leads to a more balanced training distribution without incurring the full cost of duplicating the entire dataset.</p>
<table-wrap id="table-4">
<label>Table 4</label>
<caption>
<title>Number of training samples per class under baseline, Adv-augmented, and CF-Aug settings for NSL-KDD and CICIDS17</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
</colgroup>
<thead>
<tr>
<th>Dataset</th>
<th>Class</th>
<th>Baseline</th>
<th>Adv-Augmented</th>
<th>CF-Augmented</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>Normal</td>
<td>13,449</td>
<td>26,898</td>
<td>14,155</td>
</tr>
<tr>
<td></td>
<td>DoS</td>
<td>9234</td>
<td>18,468</td>
<td>10,323</td>
</tr>
<tr>
<td>NSL-KDD</td>
<td>Probe</td>
<td>2289</td>
<td>4578</td>
<td>3589</td>
</tr>
<tr>
<td></td>
<td>R2L</td>
<td>209</td>
<td>418</td>
<td>665</td>
</tr>
<tr>
<td></td>
<td>U2R</td>
<td>11</td>
<td>22</td>
<td>466</td>
</tr>
<tr>
<td>Total</td>
<td>&#x2013;</td>
<td>25,192</td>
<td>50,384</td>
<td>29,198</td>
</tr>
<tr>
<td></td>
<td>Normal</td>
<td>79,288</td>
<td>158,576</td>
<td>85,631</td>
</tr>
<tr>
<td></td>
<td>DoS Hulk</td>
<td>7582</td>
<td>15,164</td>
<td>11,774</td>
</tr>
<tr>
<td></td>
<td>PortScan</td>
<td>7609</td>
<td>15,218</td>
<td>14,006</td>
</tr>
<tr>
<td></td>
<td>DDoS</td>
<td>4551</td>
<td>9102</td>
<td>11,013</td>
</tr>
<tr>
<td>CICIDS17</td>
<td>DoS GoldenEye</td>
<td>362</td>
<td>724</td>
<td>4976</td>
</tr>
<tr>
<td></td>
<td>FTP-Patator</td>
<td>191</td>
<td>382</td>
<td>3037</td>
</tr>
<tr>
<td></td>
<td>SSH-Patator</td>
<td>143</td>
<td>286</td>
<td>4937</td>
</tr>
<tr>
<td></td>
<td>DoS Slowloris</td>
<td>191</td>
<td>382</td>
<td>5053</td>
</tr>
<tr>
<td></td>
<td>DoS Slowhttptest</td>
<td>83</td>
<td>166</td>
<td>6896</td>
</tr>
<tr>
<td>Total</td>
<td>&#x2013;</td>
<td>100,000</td>
<td>200,000</td>
<td>147,296</td>
</tr>
</tbody>
</table>
</table-wrap>
<p><xref ref-type="table" rid="table-5">Table 5</xref> reports the per-class F1 scores for NSL-KDD and CICIDS17 across the four training strategies. For NSL-KDD, all variants perform similarly on the majority classes (Normal and DoS): PGD-based AT achieves the best F1 for the Normal class (0.83), while the Baseline and FGSM models share the best F1 on DoS (0.89). The benefits of the proposed CF-based augmentation are most evident on the minority classes. For R2L and U2R, CF-Aug yields the highest F1 scores (0.31 and 0.17, respectively), clearly improving over the Baseline (0.22 and 0.12). This indicates that counterfactual samples help the model learn more informative decision boundaries for rare attack types, at the cost of only small changes on the frequent classes (at most 0.03 F1, on Probe).</p>
<table-wrap id="table-5">
<label>Table 5</label>
<caption>
<title>F1 per class for NSL-KDD and CICIDS17 datasets. The best results are in bold</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
</colgroup>
<thead>
<tr>
<th>Dataset</th>
<th>Class</th>
<th>Baseline</th>
<th>Adv-Training (FGSM)</th>
<th>Adv-Training (PGD)</th>
<th>CF-Aug</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>Normal</td>
<td>0.82</td>
<td>0.81</td>
<td><bold>0.83</bold></td>
<td>0.82</td>
</tr>
<tr>
<td></td>
<td>DoS</td>
<td><bold>0.89</bold></td>
<td><bold>0.89</bold></td>
<td>0.88</td>
<td>0.88</td>
</tr>
<tr>
<td>NSL-KDD</td>
<td>Probe</td>
<td><bold>0.78</bold></td>
<td>0.76</td>
<td>0.77</td>
<td>0.75</td>
</tr>
<tr>
<td></td>
<td>R2L</td>
<td>0.22</td>
<td>0.22</td>
<td>0.27</td>
<td><bold>0.31</bold></td>
</tr>
<tr>
<td></td>
<td>U2R</td>
<td>0.12</td>
<td>0.14</td>
<td>0.15</td>
<td><bold>0.17</bold></td>
</tr>
<tr>
<td></td>
<td>Normal</td>
<td>0.96</td>
<td>0.96</td>
<td><bold>0.97</bold></td>
<td><bold>0.97</bold></td>
</tr>
<tr>
<td></td>
<td>DoS Hulk</td>
<td>0.97</td>
<td><bold>0.98</bold></td>
<td><bold>0.98</bold></td>
<td><bold>0.98</bold></td>
</tr>
<tr>
<td></td>
<td>PortScan</td>
<td>0.81</td>
<td>0.84</td>
<td>0.86</td>
<td><bold>0.87</bold></td>
</tr>
<tr>
<td></td>
<td>DDoS</td>
<td><bold>0.99</bold></td>
<td>0.97</td>
<td><bold>0.99</bold></td>
<td>0.98</td>
</tr>
<tr>
<td>CICIDS17</td>
<td>DoS GoldenEye</td>
<td>0.36</td>
<td>0.38</td>
<td><bold>0.44</bold></td>
<td>0.38</td>
</tr>
<tr>
<td></td>
<td>FTP-Patator</td>
<td>0.70</td>
<td><bold>0.97</bold></td>
<td>0.94</td>
<td>0.84</td>
</tr>
<tr>
<td></td>
<td>SSH-Patator</td>
<td>0.53</td>
<td><bold>0.69</bold></td>
<td>0.61</td>
<td><bold>0.69</bold></td>
</tr>
<tr>
<td></td>
<td>DoS Slowloris</td>
<td>0.21</td>
<td>0.20</td>
<td>0.22</td>
<td><bold>0.40</bold></td>
</tr>
<tr>
<td></td>
<td>DoS Slowhttptest</td>
<td><bold>0.47</bold></td>
<td>0.17</td>
<td>0.41</td>
<td>0.27</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>For CICIDS17, all methods attain high F1 on Normal and DoS Hulk traffic, with PGD and CF-Aug reaching the top score on both (0.97 for Normal and 0.98 for DoS Hulk, the latter shared with FGSM). On more challenging or less frequent attack categories, the behavior diverges more clearly. CF-Aug achieves the best F1 for PortScan (0.87) and DoS Slowloris (0.40), and ties with FGSM-based AT for the highest score on SSH-Patator (0.69), highlighting its effectiveness in improving detection of certain sophisticated or low-frequency attacks. Conversely, FGSM-based AT excels on FTP-Patator (0.97), while PGD-based AT offers the best performance for DoS GoldenEye (0.44). However, on DoS Slowhttptest the Baseline model still yields the highest F1 (0.47), suggesting that augmentation, whether adversarial or counterfactual, can occasionally distort the decision boundary in ways that hurt specific classes.</p>
<p>Overall, these results suggest a complementary effect between AT and CF-Aug. AT tends to enhance robustness and F1 for several medium-frequency attack types, whereas CF-Aug is particularly beneficial for hard or underrepresented classes, improving their detection with little or no degradation on the dominant classes. This supports the claim that counterfactual samples provide targeted, label-preserving variations that help the model better discriminate fine-grained attack behaviors.</p>
<p><xref ref-type="fig" rid="fig-2">Fig. 2</xref> compares the overall accuracy (OA), WeightedF1, and MacroF1 obtained on NSL-KDD and CICIDS17 for the four training regimes (Baseline, AT with FGSM, AT with PGD, and CF-Aug). Across both datasets, AT with PGD and CF-Aug consistently outperform the Baseline and FGSM-based AT in all three metrics. On NSL-KDD, AT (PGD) and CF-Aug yield the highest OA and WeightedF1, while CF-Aug achieves the best MacroF1, indicating a more balanced treatment of minority classes. A similar trend is observed on CICIDS17, where AT (PGD) and CF-Aug improve OA and WeightedF1 over the Baseline, and both reach the highest MacroF1 values.</p>
<fig id="fig-2">
<label>Figure 2</label>
<caption>
<title>Overall accuracy (OA), WeightedF1, and MacroF1 for NSL-KDD and CICIDS17</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_76608-fig-2.tif"/>
</fig>
<p>The improvement in MacroF1 under CF-Aug is driven primarily by gains on rare classes such as R2L and U2R in NSL-KDD and low-frequency DoS/Probe attacks in CICIDS17 (see <xref ref-type="table" rid="table-5">Table 5</xref>). Because counterfactual samples are generated near the decision boundary, they tend to concentrate around ambiguous regions where minority classes are underrepresented, effectively increasing the density of informative training points for these classes. In contrast, AT doubles the dataset size by adding highly correlated adversarial variants of all samples. While this improves OA and WeightedF1, it can also reallocate capacity towards the majority or easier classes, which is reflected in drops for some rare categories (e.g., DoS Slowhttptest, where F1 falls from 0.47 for the Baseline to 0.17 with FGSM and 0.41 with PGD) despite the overall positive trend. Together, these results suggest that AT (PGD) and CF-Aug provide complementary benefits: AT mainly hardens the decision boundary around frequent behaviors, whereas CF-Aug focuses on minority, near-boundary behaviors, leading to more accurate and more balanced intrusion-detection performance.</p>
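<p>The difference between MacroF1 and WeightedF1 that underlies this discussion can be made explicit with a small sketch. The labels below are toy values chosen for illustration and are unrelated to the reported results.</p>
<code language="python"><![CDATA[
```python
from typing import Dict, Hashable, Sequence


def per_class_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> Dict[Hashable, float]:
    """F1 = 2TP / (2TP + FP + FN), computed separately for each class."""
    scores = {}
    for c in set(y_true) | set(y_pred):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores[c] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Per-class F1 weighted by the number of true samples of each class."""
    scores = per_class_f1(y_true, y_pred)
    n = len(y_true)
    return sum(scores[c] * sum(1 for t in y_true if t == c) / n for c in scores)


# Toy example: 8 majority samples, 2 minority samples, one minority miss.
y_true = ["normal"] * 8 + ["u2r"] * 2
y_pred = ["normal"] * 8 + ["normal", "u2r"]

macro = macro_f1(y_true, y_pred)
weighted = weighted_f1(y_true, y_pred)
```
]]></code>
<p>A single missed minority sample lowers the macro average much more than the weighted average, which is why gains on rare classes show up primarily in MacroF1.</p>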

<p>All results reported in this work are obtained from a single train/validation/test split with a fixed random seed. While this is common practice in the IDS literature, it does not capture variability due to random initialization and sampling. A more exhaustive study with multiple independent runs (reporting the mean and standard deviation of OA and F1 scores) is an important direction for future work. Nevertheless, we note that the relative behavior of the Baseline, AT, and CF-Aug models is consistent across our experiments and in line with the qualitative differences highlighted by the explanation analysis.</p>
</sec>
<sec id="s4_3_2">
<label>4.3.2</label>
<title>Analysis of the XAI-Based Strategy</title>
<p><xref ref-type="fig" rid="fig-3">Fig. 3</xref> visualizes the geometric relationship between the original training samples and the different types of synthetic data used in our experiments. Each plot shows a two-dimensional PCA projection of the feature space, with original points overlaid with either CF-based samples or adversarial samples generated by FGSM and PGD.</p>
<fig id="fig-3">
<label>Figure 3</label>
<caption>
<title>PCA projections of original and synthetic samples for NSL-KDD (top row) and CICIDS17 (bottom row). Each panel shows the original training samples (blue) together with (<bold>a,d</bold>) counterfactual (CF) samples, (<bold>b,e</bold>) FGSM adversarial samples, and (<bold>c,f</bold>) PGD adversarial samples. Axes correspond to the first two principal components computed on the original training data</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_76608-fig-3.tif"/>
</fig>
<p>For NSL-KDD (top row, <xref ref-type="fig" rid="fig-3">Fig. 3a</xref>&#x2013;<xref ref-type="fig" rid="fig-3">c</xref>), the CF samples (<xref ref-type="fig" rid="fig-3">Fig. 3a</xref>) largely trace the same manifold as the original data, forming tight clouds that remain close to the corresponding original points. This indicates that counterfactuals perform local, label-preserving moves in feature space, exploring nearby decision boundaries without drifting far from the data distribution. By contrast, FGSM and especially PGD (<xref ref-type="fig" rid="fig-3">Fig. 3b</xref>,<xref ref-type="fig" rid="fig-3">c</xref>) produce more pronounced displacements: the perturbed samples spread further along specific directions of the PCA axes, revealing that adversarial samples tend to push inputs toward regions where the classifier is uncertain or misclassifies, even when those regions lie somewhat off the main data manifold.</p>
<p>A similar pattern is observed for CICIDS17 (bottom row, <xref ref-type="fig" rid="fig-3">Fig. 3d</xref>&#x2013;<xref ref-type="fig" rid="fig-3">f</xref>). The CF samples again stay close to the dense regions of the original distribution, effectively thickening the existing clusters rather than creating new ones. In comparison, adversarial samples generated by FGSM and PGD introduce more extreme shifts and sharper &#x201C;fringes&#x201D; around the original clusters, reflecting higher-magnitude, targeted perturbations. Overall, these visualizations support our interpretation that CF-Aug enriches the training data with realistic, near-manifold variations, whereas AT exposes the model to more aggressive worst-case perturbations that deliberately probe the boundaries of the learned decision regions. To complement the qualitative impression from <xref ref-type="fig" rid="fig-3">Fig. 3</xref>, we compute a simple quantitative measure of displacement in the PCA space. Let <inline-formula id="ieqn-95"><mml:math id="mml-ieqn-95"><mml:msub><mml:mrow><mml:mtext mathvariant="bold">z</mml:mtext></mml:mrow><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x2208;</mml:mo><mml:msup><mml:mrow><mml:mi mathvariant="double-struck">R</mml:mi></mml:mrow><mml:mn>2</mml:mn></mml:msup></mml:math></inline-formula> denote the projection of an original sample <inline-formula id="ieqn-96"><mml:math id="mml-ieqn-96"><mml:msub><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow><mml:mi>i</mml:mi></mml:msub></mml:math></inline-formula> onto the first two principal components and <inline-formula id="ieqn-97"><mml:math id="mml-ieqn-97"><mml:msubsup><mml:mrow><mml:mover><mml:mrow><mml:mtext mathvariant="bold">z</mml:mtext></mml:mrow><mml:mo stretchy="false">&#x007E;</mml:mo></mml:mover></mml:mrow><mml:mi>i</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>m</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msubsup></mml:math></inline-formula> the projection of its
corresponding synthetic sample generated by method <inline-formula id="ieqn-98"><mml:math id="mml-ieqn-98"><mml:mi>m</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mo fence="false" stretchy="false">{</mml:mo><mml:mtext>CF</mml:mtext><mml:mo>,</mml:mo><mml:mtext>FGSM</mml:mtext><mml:mo>,</mml:mo><mml:mtext>PGD</mml:mtext><mml:mo fence="false" stretchy="false">}</mml:mo></mml:math></inline-formula>. We define the average PCA displacement for method <inline-formula id="ieqn-99"><mml:math id="mml-ieqn-99"><mml:mi>m</mml:mi></mml:math></inline-formula> as
<disp-formula id="ueqn-5"><mml:math id="mml-ueqn-5" display="block"><mml:mtable columnalign="right left right left right left right left right left right left" rowspacing="3pt" columnspacing="0em 2em 0em 2em 0em 2em 0em 2em 0em 2em 0em" displaystyle="true"><mml:mtr><mml:mtd /><mml:mtd><mml:msub><mml:mi>d</mml:mi><mml:mi>m</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mi>N</mml:mi></mml:mfrac><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>N</mml:mi></mml:munderover><mml:mstyle scriptlevel="0"><mml:mrow><mml:mo symmetric="true" maxsize="1.2em" minsize="1.2em">&#x2016;</mml:mo></mml:mrow></mml:mstyle><mml:msubsup><mml:mrow><mml:mover><mml:mrow><mml:mtext mathvariant="bold">z</mml:mtext></mml:mrow><mml:mo stretchy="false">&#x007E;</mml:mo></mml:mover></mml:mrow><mml:mi>i</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>m</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msubsup><mml:mo>&#x2212;</mml:mo><mml:msub><mml:mrow><mml:mtext mathvariant="bold">z</mml:mtext></mml:mrow><mml:mi>i</mml:mi></mml:msub><mml:msub><mml:mstyle scriptlevel="0"><mml:mrow><mml:mo symmetric="true" maxsize="1.2em" minsize="1.2em">&#x2016;</mml:mo></mml:mrow></mml:mstyle><mml:mn>2</mml:mn></mml:msub><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula></p>

<p>Across both NSL-KDD and CICIDS17, we obtain <inline-formula id="ieqn-100"><mml:math id="mml-ieqn-100"><mml:msub><mml:mi>d</mml:mi><mml:mrow><mml:mtext>CF</mml:mtext></mml:mrow></mml:msub><mml:mo>&#x003C;</mml:mo><mml:msub><mml:mi>d</mml:mi><mml:mrow><mml:mtext>FGSM</mml:mtext></mml:mrow></mml:msub><mml:mo>&#x003C;</mml:mo><mml:msub><mml:mi>d</mml:mi><mml:mrow><mml:mtext>PGD</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula>, confirming that counterfactual samples remain closest to their original counterparts in the projected space, whereas PGD-based adversarial samples move farthest away.</p>
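<p>The average PCA displacement defined above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the principal components are fitted on the original samples only (as in the caption of <xref ref-type="fig" rid="fig-3">Fig. 3</xref>), each synthetic sample is assumed to be row-aligned with the original sample it was generated from, and the function names and the random toy data are hypothetical.</p>
<code language="python"><![CDATA[
```python
import numpy as np


def fit_pca_2d(X):
    """Return the mean and the first two principal axes of X (rows = samples)."""
    mean = X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:2]


def average_pca_displacement(X_orig, X_synth, mean, components):
    """Mean Euclidean distance between paired 2-D projections (d_m in the text).

    Row i of X_synth is assumed to be the synthetic counterpart of row i of X_orig.
    """
    z = (X_orig - mean) @ components.T         # projections of original samples
    z_tilde = (X_synth - mean) @ components.T  # projections of synthetic samples
    return float(np.linalg.norm(z_tilde - z, axis=1).mean())


if __name__ == "__main__":
    # Toy data only: random features and random perturbations of two magnitudes.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    direction = rng.normal(size=X.shape)
    mean, components = fit_pca_2d(X)
    for name, scale in [("small perturbation", 0.05), ("large perturbation", 0.5)]:
        d = average_pca_displacement(X, X + scale * direction, mean, components)
        print(f"{name}: d = {d:.4f}")
```
]]></code>
<p>Because the projection is linear, the displacement depends only on the component of each perturbation that lies in the plane of the first two principal components; perturbations orthogonal to that plane do not contribute to the measure.</p>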
<p><xref ref-type="fig" rid="fig-4">Fig. 4</xref> presents heatmaps of the top-20 ranked features for the four models (Baseline, AT (FGSM), AT (PGD), and CF-Aug) on NSL-KDD and CICIDS17. Each row corresponds to a feature and each column to a model; the cell value denotes the rank of that feature within the corresponding model (1 &#x003D; most important, 20 &#x003D; least important among the top-20), while empty cells indicate that the feature does not appear in that model&#x2019;s top-20 list.</p>
<fig id="fig-4">
<label>Figure 4</label>
<caption>
<title>DALEX-based ranking of the top 20 features. (1) Baseline, (2) AT (FGSM), (3) AT (PGD), (4) CF-Aug</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_76608-fig-4.tif"/>
</fig>
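<p>The rank table underlying such a heatmap can be sketched as follows. The snippet is illustrative only: it assumes that a global importance score per feature is already available for each model (for example, from a DALEX-style importance analysis), it does not call any XAI library, and the function name <monospace>top_k_rank_table</monospace>, the feature names, and the scores are hypothetical.</p>
<code language="python"><![CDATA[
```python
def top_k_rank_table(importances, k=20):
    """Build a feature -> {model: rank} table, with rank 1 = most important.

    `importances` maps a model name to a {feature: importance score} dict.
    A model is absent from a feature's row when that feature falls outside
    the model's top k, which corresponds to an empty heatmap cell.
    """
    table = {}
    for model, scores in importances.items():
        # Sort features by decreasing score and keep the k highest.
        ranked = sorted(scores, key=scores.get, reverse=True)[:k]
        for rank, feature in enumerate(ranked, start=1):
            table.setdefault(feature, {})[model] = rank
    return table


if __name__ == "__main__":
    # Hypothetical scores for two models and three placeholder features.
    toy = {
        "Baseline": {"feature_a": 0.5, "feature_b": 0.3, "feature_c": 0.1},
        "CF-Aug": {"feature_a": 0.1, "feature_b": 0.2, "feature_c": 0.6},
    }
    for feature, ranks in top_k_rank_table(toy, k=2).items():
        print(feature, ranks)
```
]]></code>
<p>Storing only the models in which a feature is ranked keeps the distinction between a low rank and an absent feature explicit, which is what the empty cells in the heatmaps convey.</p>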
<p>For CICIDS17 (<xref ref-type="fig" rid="fig-4">Fig. 4</xref>), the Baseline and both adversarially trained models exhibit very similar importance profiles. Their top-10 is dominated by <italic>Idle Max</italic>, <italic>Protocol</italic>, and <italic>Bwd IAT Total</italic>, together with related connection-count features. In contrast, these features drop out of the top-10 for the CF-Aug model, which instead assigns the highest importance to packet- and flow-level descriptors, including <italic>Average Packet Size</italic>, <italic>Packet Length Std</italic>, <italic>Bwd Segment Size Avg</italic>, and <italic>RST Flag Count</italic>. This shift suggests that CF-Aug encourages the detector to rely less on coarse host aggregates, which can be more easily manipulated by changing the traffic mix, and more on fine-grained traffic characteristics that are tightly coupled to the morphology of malicious flows.</p>

<p>On NSL-KDD (<xref ref-type="fig" rid="fig-4">Fig. 4</xref>), all four models agree on the central role of the <italic>dst_host</italic> family of attributes: <italic>dst_host_serror_rate</italic>, <italic>dst_host_srv_count</italic>, and <italic>dst_host_rerror_rate</italic> consistently appear among the top three features across training regimes. However, the CF-Aug model diverges in the lower portion of the ranking. In addition to these dominant host-level rates, CF-Aug elevates more semantic features such as <italic>is_guest_login</italic>, <italic>srv_rerror_rate</italic>, and <italic>su_attempt</italic>, which are strongly associated with R2L and U2R behaviors. This is consistent with our per-class results, where CF-Aug provides the largest gains for these minority attack classes. Overall, the heatmaps indicate that AT largely preserves the baseline inductive biases, whereas CF-Aug actively reshapes the feature-importance landscape in a way that emphasizes features relevant to underrepresented and harder-to-detect threats.</p>

<p>These differences can be traced back to the way the two augmentation schemes interact with the decision boundary. Counterfactuals are explicitly optimized to flip the prediction with minimal, domain-feasible changes, so, for minority classes, they tend to perturb precisely those few semantic attributes that separate R2L/U2R (or rare CICIDS17 attacks) from normal and majority classes. When the model is retrained on <inline-formula id="ieqn-101"><mml:math id="mml-ieqn-101"><mml:mrow><mml:mi>&#x1D4AF;</mml:mi></mml:mrow><mml:mo>&#x222A;</mml:mo><mml:mrow><mml:mi>&#x1D49E;</mml:mi></mml:mrow></mml:math></inline-formula>, these boundary-probing counterfactuals repeatedly expose the classifier to such semantic cues in ambiguous regions of the feature space, naturally increasing their global importance. In contrast, FGSM/PGD adversarial samples are generated by following the loss gradient around all training points within an <inline-formula id="ieqn-102"><mml:math id="mml-ieqn-102"><mml:msub><mml:mi>&#x2113;</mml:mi><mml:mi mathvariant="normal">&#x221E;</mml:mi></mml:msub></mml:math></inline-formula> perturbation budget; they do not target specific minority decision boundaries, and they primarily nudge the model to stabilize its existing predictions without substantially reweighting the features on which those predictions depend. As a result, AT tends to maintain the baseline inductive biases visible in the heatmaps, whereas CF-Aug shifts the model toward features that are most informative for underrepresented attack classes. A systematic comparison of our DALEX-based analysis with alternative XAI methods, such as SHAP and LIME, is left as future work.</p>
</sec>
</sec>
</sec>
<sec id="s5">
<label>5</label>
<title>Conclusion</title>
<p>In this paper, we examined how two complementary data-centric strategies, AT and CF-Aug, reshape the accuracy and interpretability of DL-based IDS. Using NSL-KDD and CICIDS17 as benchmark datasets, we instantiated a common experimental pipeline in which a baseline DNN was first trained on the original data, and then contrasted with variants retrained on adversarially generated samples (FGSM- and PGD-based) and on label-preserving counterfactual samples. This design allowed us to disentangle the effects of boundary hardening via adversarial perturbations from those of boundary sculpting via counterfactual explanations, and to assess their impact on predictive performance. The empirical results indicate that AT and CF-Aug consistently improve the overall performance compared to the baseline in terms of multiple metrics: OA, WeightedF1, and MacroF1. The explainability analysis further elucidates how these training regimes alter model behavior. Global feature-importance profiles indicate that both AT and CF-Aug can shift emphasis away from brittle or spurious signals toward more stable, domain-relevant features. However, CF-Aug tends to produce models whose importance rankings are more stable under perturbations and more closely aligned with security intuition, while AT can occasionally distort the decision surface for specific classes despite improving aggregate robustness. Taken together, quantitative metrics and explanation profiles underscore that no single strategy universally outperforms others; rather, AT and CF-Aug provide complementary benefits that can be chosen or combined depending on whether the primary objective is worst-case robustness, minority-class detection, or analyst-facing interpretability. A number of avenues for future work arise naturally from this study. 
On the robustness side, extending the evaluation to broader threat models, including black-box, transfer, and problem-space attacks, would provide a more complete picture of the protection afforded by AT and CF-Aug. On the explainability side, combining the global analysis with local explanations and causal constraints could yield richer, action-oriented insights for Security Operations Center workflows.</p>
</sec>
</body>
<back>
<ack>
<p>Not applicable.</p>
</ack>
<sec>
<title>Funding Statement</title>
<p>Not applicable.</p>
</sec>
<sec>
<title>Author Contributions</title>
<p>Conceptualization: Malik Al-Essa, Mohammad Qatawneh, and Orieb Abualghanam; Methodology: Malik Al-Essa, Mohammad Qatawneh, Ahmad Sami Al-Shamayleh, Orieb Abualghanam, and Wesam Almobaideen; Software: Malik Al-Essa, Orieb Abualghanam, and Wesam Almobaideen; Validation: Malik Al-Essa, Mohammad Qatawneh, Ahmad Sami Al-Shamayleh, Orieb Abualghanam, and Wesam Almobaideen; Formal analysis: Malik Al-Essa, Mohammad Qatawneh, Ahmad Sami Al-Shamayleh, Orieb Abualghanam, and Wesam Almobaideen; Investigation: Malik Al-Essa and Mohammad Qatawneh; Resources: Malik Al-Essa, Mohammad Qatawneh, Ahmad Sami Al-Shamayleh, Orieb Abualghanam, and Wesam Almobaideen; Data curation: Malik Al-Essa and Orieb Abualghanam; Writing&#x2013;original draft preparation: Malik Al-Essa, Mohammad Qatawneh, Ahmad Sami Al-Shamayleh, Orieb Abualghanam, and Wesam Almobaideen; Writing&#x2013;review and editing: Malik Al-Essa, Mohammad Qatawneh, Ahmad Sami Al-Shamayleh, Orieb Abualghanam, and Wesam Almobaideen; Visualization: Malik Al-Essa, Mohammad Qatawneh, and Ahmad Sami Al-Shamayleh; Supervision: Wesam Almobaideen and Mohammad Qatawneh; Project administration: Wesam Almobaideen and Mohammad Qatawneh. All authors reviewed the results and approved the final version of the manuscript.</p>
</sec>
<sec sec-type="data-availability">
<title>Availability of Data and Materials</title>
<p>Data is openly available in a public repository.</p>
</sec>
<sec>
<title>Ethics Approval</title>
<p>Not applicable.</p>
</sec>
<sec sec-type="COI-statement">
<title>Conflicts of Interest</title>
<p>The authors declare no conflicts of interest to report regarding the present study.</p>
</sec>
<ref-list content-type="authoryear">
<title>References</title>
<ref id="ref-1"><label>[1]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Malik</surname> <given-names>J</given-names></string-name>, <string-name><surname>Akhunzada</surname> <given-names>A</given-names></string-name>, <string-name><surname>Al-Shamayleh</surname> <given-names>AS</given-names></string-name>, <string-name><surname>Zeadally</surname> <given-names>S</given-names></string-name>, <string-name><surname>Almogren</surname> <given-names>A</given-names></string-name></person-group>. <article-title>Hybrid deep learning based threat intelligence framework for industrial IoT systems</article-title>. <source>J Ind Inf Integr</source>. <year>2025</year>;<volume>45</volume>(<issue>1</issue>):<fpage>100846</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.jii.2025.100846</pub-id>.</mixed-citation></ref>
<ref id="ref-2"><label>[2]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Assmi</surname> <given-names>H</given-names></string-name>, <string-name><surname>Guezzaz</surname> <given-names>A</given-names></string-name>, <string-name><surname>Benkirane</surname> <given-names>S</given-names></string-name>, <string-name><surname>Azrour</surname> <given-names>M</given-names></string-name>, <string-name><surname>Jabbour</surname> <given-names>S</given-names></string-name>, <string-name><surname>Innab</surname> <given-names>N</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>A robust security detection strategy for next generation IoT networks</article-title>. <source>Comput Mater Contin</source>. <year>2025</year>;<volume>82</volume>(<issue>1</issue>):<fpage>443</fpage>&#x2013;<lpage>66</lpage>. doi:<pub-id pub-id-type="doi">10.32604/cmc.2024.059047</pub-id>.</mixed-citation></ref>
<ref id="ref-3"><label>[3]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Ghazal</surname> <given-names>TM</given-names></string-name>, <string-name><surname>Hasan</surname> <given-names>MK</given-names></string-name>, <string-name><surname>Raju</surname> <given-names>KN</given-names></string-name>, <string-name><surname>Khan</surname> <given-names>MA</given-names></string-name>, <string-name><surname>Alshamayleh</surname> <given-names>A</given-names></string-name>, <string-name><surname>Bhatt</surname> <given-names>MW</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Data space privacy model with federated learning technique for securing IoT communications in autonomous marine vehicles</article-title>. <source>J Intell Robotic Syst</source>. <year>2025</year>;<volume>111</volume>(<issue>3</issue>):<fpage>84</fpage>. doi:<pub-id pub-id-type="doi">10.1007/s10846-025-02298-1</pub-id>.</mixed-citation></ref>
<ref id="ref-4"><label>[4]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Al-Essa</surname> <given-names>M</given-names></string-name>, <string-name><surname>Appice</surname> <given-names>A</given-names></string-name></person-group>. <article-title>Dealing with imbalanced data in multi-class network intrusion detection systems using xgboost</article-title>. In: <conf-name>Joint European Conference on Machine Learning and Knowledge Discovery in Databases</conf-name>. <publisher-loc>Cham, Switzerland</publisher-loc>: <publisher-name>Springer</publisher-name>; <year>2021</year>. p. <fpage>5</fpage>&#x2013;<lpage>21</lpage>.</mixed-citation></ref>
<ref id="ref-5"><label>[5]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Ali</surname> <given-names>WA</given-names></string-name>, <string-name><surname>Roccotelli</surname> <given-names>M</given-names></string-name>, <string-name><surname>Boggia</surname> <given-names>G</given-names></string-name>, <string-name><surname>Fanti</surname> <given-names>MP</given-names></string-name></person-group>. <article-title>Intrusion detection system for vehicular ad hoc network attacks based on machine learning techniques</article-title>. <source>Inf Secur J Global Perspect</source>. <year>2024</year>;<volume>33</volume>(<issue>6</issue>):<fpage>659</fpage>&#x2013;<lpage>77</lpage>. doi:<pub-id pub-id-type="doi">10.1080/19393555.2024.2307638</pub-id>.</mixed-citation></ref>
<ref id="ref-6"><label>[6]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Xu</surname> <given-names>J</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>H</given-names></string-name>, <string-name><surname>Shen</surname> <given-names>Z</given-names></string-name></person-group>. <article-title>Adversarial machine learning in cybersecurity: attacks and defenses</article-title>. <source>Int J Manag Sci Res</source>. <year>2025</year>;<volume>8</volume>(<issue>2</issue>):<fpage>26</fpage>&#x2013;<lpage>33</lpage>. doi:<pub-id pub-id-type="doi">10.53469/ijomsr.2025.08(02).04</pub-id>.</mixed-citation></ref>
<ref id="ref-7"><label>[7]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Asha</surname> <given-names>S</given-names></string-name>, <string-name><surname>Vinod</surname> <given-names>P</given-names></string-name></person-group>. <article-title>Evaluation of adversarial machine learning tools for securing AI systems</article-title>. <source>Clust Comput</source>. <year>2022</year>;<volume>25</volume>(<issue>1</issue>):<fpage>503</fpage>&#x2013;<lpage>22</lpage>. doi:<pub-id pub-id-type="doi">10.1007/s10586-021-03421-1</pub-id>.</mixed-citation></ref>
<ref id="ref-8"><label>[8]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Goodfellow</surname> <given-names>IJ</given-names></string-name>, <string-name><surname>Shlens</surname> <given-names>J</given-names></string-name>, <string-name><surname>Szegedy</surname> <given-names>C</given-names></string-name></person-group>. <article-title>Explaining and harnessing adversarial examples</article-title>. In: <conf-name>Proceedings of the 3rd International Conference on Learning Representations; 2015 May 7&#x2013;9</conf-name>; <publisher-loc>San Diego, CA, USA</publisher-loc>. p. <fpage>1</fpage>&#x2013;<lpage>11</lpage>.</mixed-citation></ref>
<ref id="ref-9"><label>[9]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Andriushchenko</surname> <given-names>M</given-names></string-name>, <string-name><surname>Flammarion</surname> <given-names>N</given-names></string-name></person-group>. <article-title>Understanding and improving fast adversarial training</article-title>. In: <conf-name>Proceedings of the 34th International Conference on Neural Information Processing Systems; 2020 Dec 6&#x2013;12</conf-name>; <publisher-loc>Vancouver, BC, Canada</publisher-loc>. p. <fpage>16048</fpage>&#x2013;<lpage>59</lpage>.</mixed-citation></ref>
<ref id="ref-10"><label>[10]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Szegedy</surname> <given-names>C</given-names></string-name>, <string-name><surname>Zaremba</surname> <given-names>W</given-names></string-name>, <string-name><surname>Sutskever</surname> <given-names>I</given-names></string-name>, <string-name><surname>Bruna</surname> <given-names>J</given-names></string-name>, <string-name><surname>Erhan</surname> <given-names>D</given-names></string-name>, <string-name><surname>Goodfellow</surname> <given-names>IJ</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Intriguing properties of neural networks</article-title>. In: <conf-name>Proceedings of the 2nd International Conference on Learning Representations; 2014 Apr 14&#x2013;16</conf-name>; <publisher-loc>Banff, AB, Canada</publisher-loc>. p. <fpage>1</fpage>&#x2013;<lpage>10</lpage>.</mixed-citation></ref>
<ref id="ref-11"><label>[11]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Bai</surname> <given-names>T</given-names></string-name>, <string-name><surname>Luo</surname> <given-names>J</given-names></string-name>, <string-name><surname>Zhao</surname> <given-names>J</given-names></string-name>, <string-name><surname>Wen</surname> <given-names>B</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>Q</given-names></string-name></person-group>. <article-title>Recent advances in adversarial training for adversarial robustness</article-title>. In: <conf-name>Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence Survey Track; 2021 Aug 19&#x2013;26</conf-name>; <publisher-loc>Virtual</publisher-loc>. p. <fpage>4312</fpage>&#x2013;<lpage>21</lpage>. doi:<pub-id pub-id-type="doi">10.24963/ijcai.2021/591</pub-id>.</mixed-citation></ref>
<ref id="ref-12"><label>[12]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Alhajjar</surname> <given-names>E</given-names></string-name>, <string-name><surname>Maxwell</surname> <given-names>P</given-names></string-name>, <string-name><surname>Bastian</surname> <given-names>N</given-names></string-name></person-group>. <article-title>Adversarial machine learning in Network Intrusion Detection Systems</article-title>. <source>Expert Syst Appl</source>. <year>2021</year>;<volume>186</volume>(<issue>2</issue>):<fpage>115782</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.eswa.2021.115782</pub-id>.</mixed-citation></ref>
<ref id="ref-13"><label>[13]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Pierazzi</surname> <given-names>F</given-names></string-name>, <string-name><surname>Pendlebury</surname> <given-names>F</given-names></string-name>, <string-name><surname>Cortellazzi</surname> <given-names>J</given-names></string-name>, <string-name><surname>Cavallaro</surname> <given-names>L</given-names></string-name></person-group>. <article-title>Intriguing properties of adversarial ML attacks in the problem space</article-title>. In: <conf-name>Proceedings of the 2020 IEEE Symposium on Security and Privacy (SP); 2020 May 18&#x2013;21</conf-name>; <publisher-loc>San Francisco, CA, USA</publisher-loc>. p. <fpage>1332</fpage>&#x2013;<lpage>49</lpage>.</mixed-citation></ref>
<ref id="ref-14"><label>[14]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Yin</surname> <given-names>C</given-names></string-name>, <string-name><surname>Zhu</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Liu</surname> <given-names>S</given-names></string-name>, <string-name><surname>Fei</surname> <given-names>J</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>H</given-names></string-name></person-group>. <article-title>Enhancing network intrusion detection classifiers using supervised adversarial training</article-title>. <source>J Supercomput</source>. <year>2020</year>;<volume>76</volume>(<issue>9</issue>):<fpage>6690</fpage>&#x2013;<lpage>719</lpage>. doi:<pub-id pub-id-type="doi">10.1007/s11227-019-03092-1</pub-id>.</mixed-citation></ref>
<ref id="ref-15"><label>[15]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Wang</surname> <given-names>C</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>L</given-names></string-name>, <string-name><surname>Zhao</surname> <given-names>K</given-names></string-name>, <string-name><surname>Ding</surname> <given-names>X</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>X</given-names></string-name></person-group>. <article-title>AdvAndMal: adversarial training for android malware detection and family classification</article-title>. <source>Symmetry</source>. <year>2021</year>;<volume>13</volume>(<issue>6</issue>):<fpage>1081</fpage>. doi:<pub-id pub-id-type="doi">10.3390/sym13061081</pub-id>.</mixed-citation></ref>
<ref id="ref-16"><label>[16]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Anthi</surname> <given-names>E</given-names></string-name>, <string-name><surname>Williams</surname> <given-names>L</given-names></string-name>, <string-name><surname>Rhode</surname> <given-names>M</given-names></string-name>, <string-name><surname>Burnap</surname> <given-names>P</given-names></string-name>, <string-name><surname>Wedgbury</surname> <given-names>A</given-names></string-name></person-group>. <article-title>Adversarial attacks on machine learning cybersecurity defences in industrial control systems</article-title>. <source>J Inf Secur Appl</source>. <year>2021</year>;<volume>58</volume>(<issue>8</issue>):<fpage>102717</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.jisa.2020.102717</pub-id>.</mixed-citation></ref>
<ref id="ref-17"><label>[17]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Khamis</surname> <given-names>RA</given-names></string-name>, <string-name><surname>Matrawy</surname> <given-names>A</given-names></string-name></person-group>. <article-title>Evaluation of adversarial training on different types of neural networks in deep learning-based IDSs</article-title>. In: <conf-name>Proceedings of the 2020 International Symposium on Networks, Computers and Communications (ISNCC); 2020 Oct 20&#x2013;22</conf-name>; <publisher-loc>Montreal, QC, Canada</publisher-loc>. p. <fpage>1</fpage>&#x2013;<lpage>6</lpage>.</mixed-citation></ref>
<ref id="ref-18"><label>[18]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Warnecke</surname> <given-names>A</given-names></string-name>, <string-name><surname>Arp</surname> <given-names>D</given-names></string-name>, <string-name><surname>Wressnegger</surname> <given-names>C</given-names></string-name>, <string-name><surname>Rieck</surname> <given-names>K</given-names></string-name></person-group>. <article-title>Evaluating explanation methods for deep learning in security</article-title>. In: <conf-name>Proceedings of the 2020 IEEE European Symposium on Security and Privacy (EuroS&#x0026;P); 2020 Sep 7&#x2013;11</conf-name>; <publisher-loc>Genoa, Italy</publisher-loc>. p. <fpage>158</fpage>&#x2013;<lpage>74</lpage>.</mixed-citation></ref>
<ref id="ref-19"><label>[19]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Wang</surname> <given-names>M</given-names></string-name>, <string-name><surname>Zheng</surname> <given-names>K</given-names></string-name>, <string-name><surname>Yang</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>X</given-names></string-name></person-group>. <article-title>An explainable machine learning framework for intrusion detection systems</article-title>. <source>IEEE Access</source>. <year>2020</year>;<volume>8</volume>:<fpage>73127</fpage>&#x2013;<lpage>41</lpage>. doi:<pub-id pub-id-type="doi">10.1109/access.2020.2988359</pub-id>.</mixed-citation></ref>
<ref id="ref-20"><label>[20]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Sangeetha</surname> <given-names>SKB</given-names></string-name>, <string-name><surname>A</surname> <given-names>NB</given-names></string-name></person-group>. <article-title>Intrumer: a multi module distributed explainable IDS/IPS for securing cloud environment</article-title>. <source>Comput Mater Contin</source>. <year>2025</year>;<volume>82</volume>(<issue>1</issue>):<fpage>579</fpage>&#x2013;<lpage>607</lpage>. doi:<pub-id pub-id-type="doi">10.32604/cmc.2024.059805</pub-id>.</mixed-citation></ref>
<ref id="ref-21"><label>[21]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Marino</surname> <given-names>DL</given-names></string-name>, <string-name><surname>Wickramasinghe</surname> <given-names>CS</given-names></string-name>, <string-name><surname>Manic</surname> <given-names>M</given-names></string-name></person-group>. <article-title>An adversarial approach for explainable AI in intrusion detection systems</article-title>. In: <conf-name>Proceedings of the 44th Annual Conference of the IEEE Industrial Electronics Society, IECON 2018; 2018 Oct 21&#x2013;23</conf-name>; <publisher-loc>Washington, DC, USA</publisher-loc>. p. <fpage>3237</fpage>&#x2013;<lpage>43</lpage>.</mixed-citation></ref>
<ref id="ref-22"><label>[22]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Al-Essa</surname> <given-names>M</given-names></string-name>, <string-name><surname>Andresini</surname> <given-names>G</given-names></string-name>, <string-name><surname>Appice</surname> <given-names>A</given-names></string-name>, <string-name><surname>Malerba</surname> <given-names>D</given-names></string-name></person-group>. <article-title>An XAI-based adversarial training approach for cyber-threat detection</article-title>. In: <conf-name>Proceedings of the 2022 IEEE International Conference on Dependable, Autonomic and Secure Computing, International Conference on Pervasive Intelligence and Computing, International Conference on Cloud and Big Data Computing; 2022 Sep 12&#x2013;15</conf-name>; <publisher-loc>Falerna, Italy</publisher-loc>. p. <fpage>1</fpage>&#x2013;<lpage>8</lpage>.</mixed-citation></ref>
<ref id="ref-23"><label>[23]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Kuppa</surname> <given-names>A</given-names></string-name>, <string-name><surname>Le-Khac</surname> <given-names>NA</given-names></string-name></person-group>. <article-title>Adversarial XAI methods in cybersecurity</article-title>. <source>IEEE Trans Inf Forensics Secur</source>. <year>2021</year>;<volume>16</volume>:<fpage>4924</fpage>&#x2013;<lpage>38</lpage>. doi:<pub-id pub-id-type="doi">10.1109/tifs.2021.3117075</pub-id>.</mixed-citation></ref>
<ref id="ref-24"><label>[24]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Pawlicki</surname> <given-names>M</given-names></string-name>, <string-name><surname>Pawlicka</surname> <given-names>A</given-names></string-name>, <string-name><surname>Kozik</surname> <given-names>R</given-names></string-name>, <string-name><surname>Chora&#x015B;</surname> <given-names>M</given-names></string-name></person-group>. <article-title>Explainability vs. security: the unintended consequences of xAI in cybersecurity</article-title>. In: <conf-name>Proceedings of the 2nd ACM Workshop on Secure and Trustworthy Deep Learning Systems; 2024 Jul 2&#x2013;20</conf-name>; <publisher-loc>Singapore</publisher-loc>. p. <fpage>1</fpage>&#x2013;<lpage>7</lpage>.</mixed-citation></ref>
<ref id="ref-25"><label>[25]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Shah</surname> <given-names>SHA</given-names></string-name>, <string-name><surname>Akhtar</surname> <given-names>LH</given-names></string-name>, <string-name><surname>Ali</surname> <given-names>MN</given-names></string-name>, <string-name><surname>Kim</surname> <given-names>BS</given-names></string-name></person-group>. <article-title>Golden jackal driven optimization for a transparent and interpretable intrusion detection system using explainable AI to revolutionize cybersecurity</article-title>. <source>Egypt Inform J</source>. <year>2025</year>;<volume>32</volume>(<issue>1</issue>):<fpage>100837</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.eij.2025.100837</pub-id>.</mixed-citation></ref>
<ref id="ref-26"><label>[26]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Ali</surname> <given-names>MN</given-names></string-name>, <string-name><surname>Imran</surname> <given-names>M</given-names></string-name>, <string-name><surname>Bahoo</surname> <given-names>G</given-names></string-name>, <string-name><surname>Kim</surname> <given-names>BS</given-names></string-name></person-group>. <article-title>DoS attack detection enhancement in autonomous vehicle systems with explainable AI</article-title>. In: <conf-name>Proceedings of the 2024 IEEE International Conference on Future Machine Learning and Data Science (FMLDS); 2024 Nov 20&#x2013;23</conf-name>; <publisher-loc>Sydney, NSW, Australia</publisher-loc>. p. <fpage>228</fpage>&#x2013;<lpage>33</lpage>.</mixed-citation></ref>
<ref id="ref-27"><label>[27]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Wang</surname> <given-names>J</given-names></string-name>, <string-name><surname>Chang</surname> <given-names>X</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Rodr&#x00ED;guez</surname> <given-names>RJ</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>J</given-names></string-name></person-group>. <article-title>LSGAN-AT: enhancing malware detector robustness against adversarial examples</article-title>. <source>Cybersecurity</source>. <year>2021</year>;<volume>4</volume>(<issue>1</issue>):<fpage>1</fpage>&#x2013;<lpage>15</lpage>. doi:<pub-id pub-id-type="doi">10.1186/s42400-021-00102-9</pub-id>.</mixed-citation></ref>
<ref id="ref-28"><label>[28]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Tavallaee</surname> <given-names>M</given-names></string-name>, <string-name><surname>Bagheri</surname> <given-names>E</given-names></string-name>, <string-name><surname>Lu</surname> <given-names>W</given-names></string-name>, <string-name><surname>Ghorbani</surname> <given-names>AA</given-names></string-name></person-group>. <article-title>A detailed analysis of the KDD CUP 99 data set</article-title>. In: <conf-name>Proceedings of the 2009 IEEE Symposium on Computational Intelligence for Security and Defense Applications; 2009 Jul 8&#x2013;10</conf-name>; <publisher-loc>Ottawa, ON, Canada</publisher-loc>. p. <fpage>1</fpage>&#x2013;<lpage>6</lpage>.</mixed-citation></ref>
<ref id="ref-29"><label>[29]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Sharafaldin</surname> <given-names>I</given-names></string-name>, <string-name><surname>Lashkari</surname> <given-names>AH</given-names></string-name>, <string-name><surname>Ghorbani</surname> <given-names>AA</given-names></string-name></person-group>. <article-title>Toward generating a new intrusion detection dataset and intrusion traffic characterization</article-title>. In: <conf-name>Proceedings of the 4th International Conference on Information Systems Security and Privacy (ICISSP 2018); 2018 Jan 22&#x2013;24</conf-name>; <publisher-loc>Funchal, Madeira, Portugal</publisher-loc>. p. <fpage>108</fpage>&#x2013;<lpage>16</lpage>.</mixed-citation></ref>
<ref id="ref-30"><label>[30]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Engelen</surname> <given-names>G</given-names></string-name>, <string-name><surname>Rimmer</surname> <given-names>V</given-names></string-name>, <string-name><surname>Joosen</surname> <given-names>W</given-names></string-name></person-group>. <article-title>Troubleshooting an intrusion detection dataset: the CICIDS2017 case study</article-title>. In: <conf-name>Proceedings of the 2021 IEEE Security and Privacy Workshops (SPW); 2021 May 27</conf-name>; <publisher-loc>San Francisco, CA, USA</publisher-loc>. p. <fpage>7</fpage>&#x2013;<lpage>12</lpage>.</mixed-citation></ref>
<ref id="ref-31"><label>[31]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Andresini</surname> <given-names>G</given-names></string-name>, <string-name><surname>Pendlebury</surname> <given-names>F</given-names></string-name>, <string-name><surname>Pierazzi</surname> <given-names>F</given-names></string-name>, <string-name><surname>Loglisci</surname> <given-names>C</given-names></string-name>, <string-name><surname>Appice</surname> <given-names>A</given-names></string-name>, <string-name><surname>Cavallaro</surname> <given-names>L</given-names></string-name></person-group>. <article-title>INSOMNIA: towards concept-drift robustness in network intrusion detection</article-title>. In: <conf-name>Proceedings of the 14th ACM Workshop on Artificial Intelligence and Security; 2021 Nov 15</conf-name>; <publisher-loc>Virtual Event, Republic of Korea</publisher-loc>. p. <fpage>111</fpage>&#x2013;<lpage>22</lpage>.</mixed-citation></ref>
</ref-list>
</back></article>