<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.1 20151215//EN" "http://jats.nlm.nih.gov/publishing/1.1/JATS-journalpublishing1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xml:lang="en" article-type="review-article" dtd-version="1.1">
<front>
<journal-meta>
<journal-id journal-id-type="pmc">CMC</journal-id>
<journal-id journal-id-type="nlm-ta">CMC</journal-id>
<journal-id journal-id-type="publisher-id">CMC</journal-id>
<journal-title-group>
<journal-title>Computers, Materials &#x0026; Continua</journal-title>
</journal-title-group>
<issn pub-type="epub">1546-2226</issn>
<issn pub-type="ppub">1546-2218</issn>
<publisher>
<publisher-name>Tech Science Press</publisher-name>
<publisher-loc>USA</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">74473</article-id>
<article-id pub-id-type="doi">10.32604/cmc.2025.074473</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Review</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Recent Advances in Deep-Learning Side-Channel Attacks on AES Implementations</article-title>
<alt-title alt-title-type="left-running-head">Recent Advances in Deep-Learning Side-Channel Attacks on AES Implementations</alt-title>
<alt-title alt-title-type="right-running-head">Recent Advances in Deep-Learning Side-Channel Attacks on AES Implementations</alt-title>
</title-group>
<contrib-group>
<contrib id="author-1" contrib-type="author">
<name name-style="western"><surname>Wang</surname><given-names>Junnian</given-names></name><xref ref-type="aff" rid="aff-1">1</xref></contrib>
<contrib id="author-2" contrib-type="author">
<name name-style="western"><surname>Wang</surname><given-names>Xiaoxia</given-names></name><xref ref-type="aff" rid="aff-1">1</xref></contrib>
<contrib id="author-3" contrib-type="author">
<name name-style="western"><surname>Luo</surname><given-names>Zexin</given-names></name><xref ref-type="aff" rid="aff-1">1</xref></contrib>
<contrib id="author-4" contrib-type="author">
<name name-style="western"><surname>Ouyang</surname><given-names>Qixiang</given-names></name><xref ref-type="aff" rid="aff-1">1</xref></contrib>
<contrib id="author-5" contrib-type="author">
<name name-style="western"><surname>Zhou</surname><given-names>Chao</given-names></name><xref ref-type="aff" rid="aff-1">1</xref></contrib>
<contrib id="author-6" contrib-type="author" corresp="yes">
<name name-style="western"><surname>Wang</surname><given-names>Huanyu</given-names></name><xref ref-type="aff" rid="aff-2">2</xref><xref rid="cor1" ref-type="corresp">&#x002A;</xref><email>huanyu@hnust.edu.cn</email></contrib>
<aff id="aff-1"><label>1</label><institution>School of Physics and Electronic Science, Hunan University of Science and Technology</institution>, <addr-line>Xiangtan, 411201</addr-line>, <country>China</country></aff>
<aff id="aff-2"><label>2</label><institution>School of Computer Science and Engineering, Hunan University of Science and Technology</institution>, <addr-line>Xiangtan, 411201</addr-line>, <country>China</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>&#x002A;</label>Corresponding Author: Huanyu Wang. Email: <email>huanyu@hnust.edu.cn</email></corresp>
</author-notes>
<pub-date date-type="collection" publication-format="electronic">
<year>2026</year>
</pub-date>
<pub-date date-type="pub" publication-format="electronic">
<day>10</day><month>2</month><year>2026</year>
</pub-date>
<volume>87</volume>
<issue>1</issue>
<elocation-id>3</elocation-id>
<history>
<date date-type="received">
<day>11</day>
<month>10</month>
<year>2025</year>
</date>
<date date-type="accepted">
<day>18</day>
<month>11</month>
<year>2025</year>
</date>
</history>
<permissions>
<copyright-statement>&#x00A9; 2026 The Authors.</copyright-statement>
<copyright-year>2026</copyright-year>
<copyright-holder>Published by Tech Science Press.</copyright-holder>
<license xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>This work is licensed under a <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</ext-link>, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<self-uri content-type="pdf" xlink:href="TSP_CMC_74473.pdf"></self-uri>
<abstract>
<p>Internet of Things (IoT) devices are bringing about a revolutionary change in our society by enabling connectivity regardless of time and location. However, the extensive deployment of these devices also makes them attractive targets for malicious adversaries. Within the spectrum of existing threats, Side-Channel Attacks (SCAs) have established themselves as an effective way to compromise cryptographic implementations. These attacks exploit unintended physical leakage that occurs during the cryptographic execution of devices, bypassing the theoretical strength of the crypto design. In recent times, the advancement of deep learning has provided SCAs with a powerful ally. Well-trained deep-learning models demonstrate an exceptional capacity to identify correlations between side-channel measurements and sensitive data, thereby significantly enhancing such attacks. To further understand the security threats posed by deep-learning SCAs and to aid in formulating robust countermeasures in the future, this paper undertakes an exhaustive investigation of leading-edge SCAs targeting Advanced Encryption Standard (AES) implementations. The study specifically focuses on attacks that exploit power consumption and electromagnetic (EM) emissions as primary leakage sources, systematically evaluating the extent to which diverse deep-learning techniques enhance SCAs across multiple critical dimensions. These dimensions include: (i) the characteristics of publicly available datasets derived from various hardware and software platforms; (ii) the formalization of leakage models tailored to different attack scenarios; (iii) the architectural suitability and performance of state-of-the-art deep-learning models. 
Furthermore, the survey provides a systematic synthesis of current research findings, identifies significant unresolved issues in the existing literature and suggests promising directions for future work, including cross-device attack transferability and the impact of quantum-classical hybrid computing on side-channel security.</p>
</abstract>
<kwd-group kwd-group-type="author">
<kwd>Side-channel attacks</kwd>
<kwd>deep learning</kwd>
<kwd>advanced encryption standard</kwd>
<kwd>power analysis</kwd>
<kwd>EM analysis</kwd>
</kwd-group>
<funding-group>
<award-group id="awg1">
<funding-source>The Key R&#x0026;D Program of Hunan Province of the Department of Science and Technology of Hunan Province</funding-source>
<award-id>2025AQ2024</award-id>
</award-group>
<award-group id="awg2">
<funding-source>Distinguished Young Scientists Fund of Hunan Education Department</funding-source>
<award-id>24B0446</award-id>
</award-group>
</funding-group>
</article-meta>
</front>
<body>
<sec id="s1">
<label>1</label>
<title>Introduction</title>
<p>The Internet of Things (IoT) is bringing about a revolution in society with advanced connectivity and real-time analytics, turning sensor data into immediate and actionable insights for better operations. Real-time analytics is key to preventing downtime and managing risks, but integrating it with IoT involves challenges such as securing the system. In particular, since numerous embedded edge devices commonly execute encryption and decryption operations on-site, significant security concerns are raised by Side-Channel Attacks (SCAs), as shown in <xref ref-type="fig" rid="fig-1">Fig. 1</xref>. Owing to the inherent imperfections of physical implementations, devices may exhibit physical leakage, such as power consumption and electromagnetic (EM) emissions, which can disclose sensitive data in IoT embedded devices. Adversaries can exploit this unintentional physical leakage to analyze and extract secrets; such attacks are called SCAs. A secret key leaking from a cryptographic module can entirely compromise information security across IoT systems [<xref ref-type="bibr" rid="ref-1">1</xref>].</p>
<fig id="fig-1">
<label>Figure 1</label>
<caption>
<title>Illustration of how SCAs extract secrets from physical devices. It depicts the process where attackers capture side-channel leakage from devices, analyze the measurements via techniques like deep learning, and ultimately extract sensitive data</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_74473-fig-1.tif"/>
</fig>
<p>Over the past two decades, diverse side channels have been exploited across various applications. These include compromising cryptographic implementations [<xref ref-type="bibr" rid="ref-2">2</xref>,<xref ref-type="bibr" rid="ref-3">3</xref>], reverse-engineering neural network architectures [<xref ref-type="bibr" rid="ref-4">4</xref>,<xref ref-type="bibr" rid="ref-5">5</xref>], stealing intellectual property [<xref ref-type="bibr" rid="ref-6">6</xref>], monitoring user-browser information [<xref ref-type="bibr" rid="ref-7">7</xref>,<xref ref-type="bibr" rid="ref-8">8</xref>], predicting the output generated by random number generators [<xref ref-type="bibr" rid="ref-9">9</xref>], tracking how code executes [<xref ref-type="bibr" rid="ref-10">10</xref>,<xref ref-type="bibr" rid="ref-11">11</xref>] and intercepting victims&#x2019; password from the keystroke [<xref ref-type="bibr" rid="ref-12">12</xref>,<xref ref-type="bibr" rid="ref-13">13</xref>] and fingerprints [<xref ref-type="bibr" rid="ref-14">14</xref>].</p>
<p>Different side channels can be exploited in various practical scenarios. Consider far field EM SCA [<xref ref-type="bibr" rid="ref-15">15</xref>&#x2013;<xref ref-type="bibr" rid="ref-17">17</xref>] (also called screaming channel attacks), which pose a security threat due to their remote execution capability. These attacks remain effective even when the target device operates in a seemingly secure office environment without direct physical access. Adversaries could position themselves in a neighboring office and capture the EM traces using specialized radio equipment. By analyzing these captured traces, the attacker may achieve various malicious objectives. In another scenario, the adversary might not be interested in the sensitive data itself but rather in the behavior of the victim device. In this case, the attacker could track the execution path of the victim&#x2019;s code by analyzing the traces of specific blocks or even library functions used by the device [<xref ref-type="bibr" rid="ref-10">10</xref>].</p>
<p>However, the attack scenarios can vary significantly depending on the type of side channel being exploited. For instance, when power consumption is used as the side channel, the attacker generally requires physical access to the target device to measure its power consumption during operation. A practical example could involve an adversary legally acquiring a car, which grants them access to the car&#x2019;s digital key. Once in possession of the key, the adversary could capture power traces while using the key to communicate with the car. By analyzing these traces, the cryptographic information shared between the car and the key could be extracted. This would enable the adversary to create and sell unauthorized copies of the car key without the car company&#x2019;s authorization, causing significant damage to the company&#x2019;s profits and reputation.</p>
<p>Presently, there exist six commonly utilized side channels within both academia and industry, as shown below.
<list list-type="bullet">
<list-item>
<p>Power consumption [<xref ref-type="bibr" rid="ref-18">18</xref>], which exploits the inherent variations in power consumption exhibited by logic circuits across different operations and data inputs.</p></list-item>
<list-item>
<p>Time consumption [<xref ref-type="bibr" rid="ref-19">19</xref>], which exploits data-dependent differences in execution time.</p></list-item>
<list-item>
<p>Optical leakage [<xref ref-type="bibr" rid="ref-20">20</xref>], which exploits changes in the optical properties of silicon caused by variations in voltage or current.</p></list-item>
<list-item>
<p>Acoustic leakage [<xref ref-type="bibr" rid="ref-21">21</xref>], which exploits the piezoelectric characteristics of ceramic capacitors used in power-supply filtering and AC-to-DC conversion.</p></list-item>
<list-item>
<p>Near field EM emissions [<xref ref-type="bibr" rid="ref-22">22</xref>], resulting from the rapid change of current in the logic components. These emissions exhibit high-frequency components and are typically identified by a close-proximity probe placed in the immediate vicinity of the chip.</p></list-item>
<list-item>
<p>Far field EM emissions (screaming channels) [<xref ref-type="bibr" rid="ref-15">15</xref>], resulting from the coupling that occurs between distinct components on mixed-signal chips. These far-field EM emissions are detectable at a certain distance away from the target device. Thus, attackers do not have to physically approach the victim.</p></list-item>
</list></p>
<p>This paper focuses on two side channels: power consumption and EM emissions.</p>
<p>In recent years, Deep Learning (DL) techniques have become extremely prevalent owing to their remarkable ability to find complex patterns and make accurate predictions or classifications from large volumes of data. This popularity is attributed to their success in multiple fields like computer vision [<xref ref-type="bibr" rid="ref-23">23</xref>], natural language processing [<xref ref-type="bibr" rid="ref-24">24</xref>], edge computing [<xref ref-type="bibr" rid="ref-25">25</xref>], and resource management [<xref ref-type="bibr" rid="ref-26">26</xref>]. However, like any great scientific discovery, deep-learning techniques have the potential to be used for malicious purposes. For example, in most Deep-Learning Side-Channel Attacks (DLSCAs), attackers start by building a deep learning model as a leakage profile, aiming to establish a relationship between sensitive data and side-channel traces collected from a copy of the target device&#x2014;a replica referred to as the profiling device, over which the adversaries have complete control. Afterwards, the profiled model can be utilized to categorize the traces collected from the target device, enabling adversaries to extract sensitive data. Typically, a thoroughly trained deep-learning model is capable of enhancing attack efficiency by several orders of magnitude, which is substantially greater than the performance of traditional signal processing methods [<xref ref-type="bibr" rid="ref-16">16</xref>], such as Correlation Power Analysis (CPA) [<xref ref-type="bibr" rid="ref-27">27</xref>] and Template Attacks (TA) [<xref ref-type="bibr" rid="ref-28">28</xref>]. Furthermore, various deep-learning models have demonstrated their capability in assisting adversaries to bypass diverse countermeasures in side-channel attacks. 
For example, Convolutional Neural Networks (CNNs) have proven to be effective in addressing misaligned traces and overcoming countermeasures based on jitter [<xref ref-type="bibr" rid="ref-29">29</xref>].</p>
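The key-recovery step of the profiling workflow described above can be sketched in a few lines. In this hypothetical NumPy example (the substitution table, the key, and the model output are all stand-ins, not from any real dataset or trained network), a profiled model is assumed to output, for each target trace, a probability vector over the 256 possible SBox outputs; the attacker then accumulates log-probabilities of the hypothetical label under each key guess and ranks the guesses:

```python
import numpy as np

rng = np.random.default_rng(0)
SBOX = rng.permutation(256)               # stand-in substitution table (hypothetical)
true_key = 0x42
pts = rng.integers(0, 256, 200)           # known plaintext bytes for 200 target traces
labels = SBOX[pts ^ true_key]             # the sensitive intermediate values

# Mock "model output": probability mass concentrated on the true label, to
# emulate a well-trained classifier (a real attack would use network softmax output).
probs = np.full((200, 256), 1e-3)
probs[np.arange(200), labels] = 0.9
probs /= probs.sum(axis=1, keepdims=True)

# Key ranking: sum the log-probabilities of the hypothetical label per key guess.
scores = np.array([np.log(probs[np.arange(200), SBOX[pts ^ k]]).sum()
                   for k in range(256)])
best_guess = scores.argmax()
```

The guess maximizing the accumulated log-likelihood is taken as the recovered key byte; with an accurate model, a handful of traces usually suffices.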
<p>As deep-learning based SCAs continue to grow in threat and significance, the foremost concern is how to mitigate these attacks effectively. It is essential to understand the capabilities and limitations of deep-learning based side-channel attacks in order to develop robust defensive measures in the future. Therefore, a comprehensive review of the DLSCA literature is crucial. Although certain reviews exist within the realm of side-channel attacks, their coverage remains inadequate. Reference [<xref ref-type="bibr" rid="ref-30">30</xref>] summarizes conventional SCA approaches, such as Differential Power Analysis (DPA) [<xref ref-type="bibr" rid="ref-18">18</xref>], template attacks, correlation power analysis, Mutual Information Analysis (MIA) [<xref ref-type="bibr" rid="ref-31">31</xref>], and Test Vector Leakage Assessment (TVLA) [<xref ref-type="bibr" rid="ref-32">32</xref>]. Afterwards, Hettwer et al. reviewed attacks based on Machine Learning (ML) techniques in [<xref ref-type="bibr" rid="ref-33">33</xref>], for instance Support Vector Machines (SVMs) [<xref ref-type="bibr" rid="ref-34">34</xref>], Decision Trees (DTs) [<xref ref-type="bibr" rid="ref-35">35</xref>] and random forests [<xref ref-type="bibr" rid="ref-36">36</xref>], published in the JCEN journal in 2020. In 2023, Picek et al. [<xref ref-type="bibr" rid="ref-37">37</xref>] provided a systematic review of DLSCAs in ACM Computing Surveys, across a broad range of applications and side channels. Reference [<xref ref-type="bibr" rid="ref-38">38</xref>] reviews the current research status of EM-SCA in cryptographic attack scenarios, with a focus on three core dimensions: cryptographic algorithms vulnerable to EM-SCA, attack-resistant designs and device implementations, and promising emerging EM-SCA attack paradigms. Notably, that review does not emphasize the role and impact of DL techniques in the field of SCA. 
Reference [<xref ref-type="bibr" rid="ref-39">39</xref>] focuses on attack analysis in SCA based on deep-learning techniques, conducting a systematic evaluation and comparative analysis of diverse deep-learning-enhanced SCA schemes. Specifically, the performance of these schemes is assessed and compared using the ANSSI SCA database (ASCAD) as the benchmark testbed. Reference [<xref ref-type="bibr" rid="ref-40">40</xref>] provides a comprehensive review of recent advances in attack techniques targeting the Advanced Encryption Standard (AES). Specifically, the reviewed attack methods are systematically categorized into four distinct research domains, namely SCAs, fault injection attacks (FIAs), attacks based on machine learning and artificial intelligence (ML/AI), and quantum-computing-enabled threats. Reference [<xref ref-type="bibr" rid="ref-41">41</xref>] investigates SCAs targeting implementations of Post-Quantum Cryptography (PQC) algorithms, and categorizes these attacks from an adversarial perspective to identify the most vulnerable components in the implementations of such algorithms. Building on these foundations, we take a step further by presenting a comprehensive summary of cutting-edge DLSCAs on AES [<xref ref-type="bibr" rid="ref-42">42</xref>], with a particular focus on power analysis and EM analysis across diverse attack scenarios, since AES is the most widely employed symmetric cryptographic algorithm. Our analysis examines these attacks from multiple perspectives, including deep-learning model architectures, hyperparameter optimization, and differences between hardware and software implementations. We stress at the outset that some approaches and works may fit into more than one category in our survey.</p>
<p><bold><italic>Contributions and Paper Structure</italic></bold></p>
<p>We provide a comprehensive review of how deep-learning techniques can be used in power and EM based analysis to compromise different implementations of AES. AES stands as the most extensively employed symmetric cryptographic algorithm across IoT devices due to its pivotal role in maintaining a specific degree of information security.</p>
<p>For every attack vector, we conduct a systematic review on methodologies presented in the literature and make comparisons across various criteria, including the quantity of measurements needed for the attack, target physical implementations, leakage models, and countermeasures applied. Our primary contribution lies in identifying the DL methods and corresponding parameters that are suitable for specific attack scenarios.</p>
<p>The paper&#x2019;s structure is outlined below. <xref ref-type="sec" rid="s2">Section 2</xref> offers a thorough overview of AES, DLSCAs, and various DL approaches. In <xref ref-type="sec" rid="s3">Section 3</xref>, we review research studies on DLSCAs that utilize power consumption as the side channel. <xref ref-type="sec" rid="s4">Section 4</xref> explores open questions related to DLSCAs on AES. <xref ref-type="sec" rid="s5">Sections 5</xref> and <xref ref-type="sec" rid="s6">6</xref> summarize the paper and discuss future work.</p>
</sec>
<sec id="s2">
<label>2</label>
<title>Background</title>
<p>This section begins with a review of the AES. Following that, it introduces the concepts of deep learning and explains the role that deep learning plays in enabling side-channel attacks. In addition, it covers various leakage models and commonly utilized evaluation metrics for SCAs.</p>
<sec id="s2_1">
<label>2.1</label>
<title>Advanced Encryption Standard</title>
<p>AES, a symmetric encryption algorithm, was standardized by the U.S. National Institute of Standards and Technology (NIST) in 2001. It is often the preferred choice for implementing cryptographic modules in IoT embedded devices for applications that require secure communications and data encryption. This preference stems primarily from the fact that AES is known for its speed, efficiency, and wide support within the industry. In these cases, AES is implemented with different key sizes of <inline-formula id="ieqn-1"><mml:math id="mml-ieqn-1"><mml:mi>n</mml:mi></mml:math></inline-formula> bits (<inline-formula id="ieqn-2"><mml:math id="mml-ieqn-2"><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:mn>128</mml:mn><mml:mo>,</mml:mo><mml:mn>192</mml:mn><mml:mo>,</mml:mo><mml:mrow><mml:mtext>or</mml:mtext></mml:mrow><mml:mspace width="thinmathspace" /><mml:mn>256</mml:mn></mml:math></inline-formula>), called AES-<inline-formula id="ieqn-3"><mml:math id="mml-ieqn-3"><mml:mi>n</mml:mi></mml:math></inline-formula>. The device using AES-n holds a secret key <inline-formula id="ieqn-4"><mml:math id="mml-ieqn-4"><mml:mrow><mml:mi>&#x1D4A6;</mml:mi></mml:mrow></mml:math></inline-formula>, shared with other authorized parties. For instance, AES-128 employs a 128-bit key. 
Operating as a block cipher, AES-128 employs the secret key <inline-formula id="ieqn-5"><mml:math id="mml-ieqn-5"><mml:mrow><mml:mi>&#x1D4A6;</mml:mi></mml:mrow></mml:math></inline-formula> to encrypt a 128-bit block of plaintext denoted <inline-formula id="ieqn-6"><mml:math id="mml-ieqn-6"><mml:mrow><mml:mi>&#x1D4AB;</mml:mi></mml:mrow><mml:mo>&#x2208;</mml:mo><mml:msup><mml:mrow><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mn>128</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula> and generates a corresponding 128-bit block of ciphertext <inline-formula id="ieqn-7"><mml:math id="mml-ieqn-7"><mml:mrow><mml:mi>&#x1D49E;</mml:mi></mml:mrow><mml:mo>&#x2208;</mml:mo><mml:msup><mml:mrow><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mn>128</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula>. The quantity of encryption rounds in AES varies according to the size of the key, with 10 rounds for AES-128, 12 rounds for AES-192, and 14 rounds for AES-256.</p>
<p>Using AES-128 as an example, each round except the final one consists of four steps: non-linear substitution (<italic>SubBytes</italic>), transposition of rows (<italic>ShiftRows</italic>), mixing of columns (<italic>MixColumns</italic>), and round key addition (<italic>AddRoundKey</italic>). The final encryption round does not include the <italic>MixColumns</italic> operation. In the AES algorithm, the SubBytes operation replaces an 8-bit symbol with another symbol according to a lookup table known as the <italic>SBox</italic>. AES-128 operates on a 4 <inline-formula id="ieqn-8"><mml:math id="mml-ieqn-8"><mml:mo>&#x00D7;</mml:mo></mml:math></inline-formula> 4 matrix of bytes stored in column-major order. In every encryption round, excluding the final round, the 16-byte intermediate state <italic>X</italic> is first arranged into a 4 <inline-formula id="ieqn-9"><mml:math id="mml-ieqn-9"><mml:mo>&#x00D7;</mml:mo></mml:math></inline-formula> 4 matrix and processed to obtain the subsequent 16-byte intermediate state <italic>Y</italic>. In each round, a unique round key <inline-formula id="ieqn-10"><mml:math id="mml-ieqn-10"><mml:mi>R</mml:mi><mml:mi>K</mml:mi><mml:mi>i</mml:mi></mml:math></inline-formula>, <inline-formula id="ieqn-11"><mml:math id="mml-ieqn-11"><mml:mi>i</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mo fence="false" stretchy="false">{</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>,</mml:mo><mml:mn>10</mml:mn><mml:mo fence="false" stretchy="false">}</mml:mo></mml:math></inline-formula>, derived from the original key <italic>K</italic>, is applied in the <italic>AddRoundKey</italic> operation.</p>
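The SBox lookup table mentioned above is not an arbitrary permutation: each entry is the multiplicative inverse of the input byte in GF(2^8), followed by a fixed affine transformation, as specified in FIPS-197. A short Python sketch (illustrative, using a brute-force inverse rather than an optimized implementation) reproduces the standard table:

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) with the AES modulus x^8+x^4+x^3+x+1 (0x11B)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r

def gf_inv(a):
    """Multiplicative inverse in GF(2^8); AES maps 0 to 0 by convention."""
    if a == 0:
        return 0
    return next(x for x in range(1, 256) if gf_mul(a, x) == 1)

def affine(b):
    """AES affine transformation applied bitwise, with constant 0x63."""
    res = 0
    for i in range(8):
        bit = ((b >> i) ^ (b >> ((i + 4) % 8)) ^ (b >> ((i + 5) % 8))
               ^ (b >> ((i + 6) % 8)) ^ (b >> ((i + 7) % 8)) ^ (0x63 >> i)) & 1
        res |= bit << i
    return res

SBOX = [affine(gf_inv(x)) for x in range(256)]
```

For instance, `SBOX[0x00]` is `0x63` and `SBOX[0x53]` is `0xED`, matching the table in FIPS-197.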
<p>A practical SCA aims to obtain secrets, such as the encryption key used in AES implementations, where the set of all potential keys is denoted by <inline-formula id="ieqn-12"><mml:math id="mml-ieqn-12"><mml:mrow><mml:mi>&#x1D4A6;</mml:mi></mml:mrow></mml:math></inline-formula>. To retrieve the secret key, attackers often utilize some known data (such as plaintext and ciphertext) together with side-channel measurements, in order to identify the correlation between the measured data and the key-related sensitive value. An attack point refers to an intermediate state chosen to model the measurements obtained from side-channel analysis. The selection of attack points for AES is determined by the physical leakage characteristics of the specific implementation, the target being intermediate values whose computation induces measurable side-channel signals. For software-based AES implementations, the <italic>SubBytes</italic> operation, which substitutes an 8-bit symbol with another via the pre-stored lookup table <italic>SBox</italic>, serves as the core attack point. This is because reading the <italic>SBox</italic> output onto the data bus triggers charge/discharge processes of MOS capacitors: bit flips between the input address and output data correlate positively with power consumption peaks. In contrast, for hardware-based AES implementations on FPGAs or ASICs, the round-key XOR operation is critical. Owing to its CMOS-based implementation, this operation generates switching currents proportional to the switching activity of the transistors. Consequently, the attack point commonly selected for hardware implementations is the XOR between the input and output of the final round.</p>
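The two attack points above correspond to two common leakage models: the Hamming-weight (HW) model for software targets, which assumes leakage proportional to the number of set bits in the SBox output, and the Hamming-distance (HD) model for hardware targets, which assumes leakage proportional to the number of register bits that flip between the final-round input and output. A minimal sketch (illustrative values only, not tied to any real measurement setup):

```python
# Precomputed Hamming weights for all byte values
HW = [bin(v).count("1") for v in range(256)]

def hw_leakage(sbox_out):
    """Software HW model: leakage ~ number of set bits in the SBox output byte."""
    return HW[sbox_out]

def hd_leakage(reg_before, reg_after):
    """Hardware HD model: leakage ~ number of bits that flip in the register,
    i.e. the Hamming weight of the XOR of consecutive register states."""
    return HW[reg_before ^ reg_after]
```

For example, `hw_leakage(0xB1)` is 4 (0xB1 has four set bits), and `hd_leakage(0xA5, 0x5A)` is 8, since every bit flips.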
</sec>
<sec id="s2_2">
<label>2.2</label>
<title>Deep Learning</title>
<p>Through machine learning techniques, systems can extract features from the provided data, allowing them to acquire their own knowledge. As a subset of machine learning, deep learning makes use of deep neural networks as models. While traditional machine learning models usually operate on human-engineered features, deep-learning approaches use deep neural networks to learn features directly from raw data for the task at hand. DL techniques have become the privileged tool for many tasks such as classification and prediction. SCAs leveraging deep learning demonstrate strong potential, with performance varying significantly across architectures.</p>
<p>The Multilayer Perceptron (MLP) ranks among the fundamental neural network types, comprising three components: one input layer, one or more hidden layers, and one final output layer. Each layer except the output layer includes a bias neuron, and each neuron is fully connected to the subsequent layer. MLPs are usually trained with a back-propagation algorithm, in which two steps are repeated over the training set: a forward pass and a backward pass. In the forward pass, the model assesses how much the true output differs from the predicted one; this difference is called the network output loss. During the backward pass, the neuron weights are updated using the computed gradients so as to reduce the loss.</p>
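The forward/backward training loop described above can be illustrated with a tiny NumPy MLP (a self-contained sketch on the XOR toy problem, not a side-channel model): one tanh hidden layer, a softmax output, and plain gradient-descent updates on the cross-entropy loss.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
Y = np.eye(2)[[0, 1, 1, 0]]                 # one-hot XOR labels

W1 = rng.standard_normal((2, 8)) * 0.5      # input -> hidden weights
b1 = np.zeros(8)
W2 = rng.standard_normal((8, 2)) * 0.5      # hidden -> output weights
b2 = np.zeros(2)

def forward(X):
    H = np.tanh(X @ W1 + b1)                # hidden activations
    Z = H @ W2 + b2
    P = np.exp(Z - Z.max(axis=1, keepdims=True))
    P /= P.sum(axis=1, keepdims=True)       # softmax class probabilities
    return H, P

def loss(P):
    return -np.mean(np.sum(Y * np.log(P), axis=1))   # cross-entropy

loss_before = loss(forward(X)[1])
for _ in range(300):                        # repeated forward + backward passes
    H, P = forward(X)
    dZ = (P - Y) / len(X)                   # gradient of cross-entropy w.r.t. logits
    dW2, db2 = H.T @ dZ, dZ.sum(0)
    dH = (dZ @ W2.T) * (1 - H ** 2)         # back-propagate through tanh
    dW1, db1 = X.T @ dH, dH.sum(0)
    for p, g in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
        p -= 0.5 * g                        # gradient-descent update
loss_after = loss(forward(X)[1])
```

After repeated passes the loss decreases, which is exactly the behavior the back-propagation description above refers to.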
<p>A CNN is a category of neural network that exploits spatial structure through convolutional layers [<xref ref-type="bibr" rid="ref-43">43</xref>]. Typically, a CNN model is constructed as a series of convolutional modules, where each module is made up of a convolutional layer followed by a pooling layer. Following the final module, there are one or more dense layers. The final dense layer incorporates one neuron per class and employs a softmax activation function for classification. A convolutional layer applies a defined number of convolution filters to the raw side-channel traces to extract and learn higher-level features, which the model subsequently uses for classification. The output of the convolution operation is often referred to as a feature map. A pooling layer reduces the dimensionality of the feature maps extracted by convolutional layers through downsampling. In the context of side-channel attacks, CNN models are usually employed to overcome countermeasures, to break masked AES, and to handle noise in traces [<xref ref-type="bibr" rid="ref-29">29</xref>].</p>
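The convolution-then-pooling pipeline can be sketched in NumPy on a toy trace (an illustrative example; real CNNs learn the filter values during training rather than using a fixed kernel):

```python
import numpy as np

def conv1d(trace, kernel):
    """Slide a filter over a 1-D trace, producing a feature map (no padding, stride 1)."""
    k = len(kernel)
    return np.array([np.dot(trace[i:i + k], kernel)
                     for i in range(len(trace) - k + 1)])

def max_pool(fmap, size=2):
    """Downsample a feature map by keeping the maximum of each window."""
    n = len(fmap) // size * size
    return fmap[:n].reshape(-1, size).max(axis=1)

trace = np.array([0., 1., 0., 3., 0., 1., 0.])   # toy trace with a leakage "peak"
fmap = conv1d(trace, np.array([1., 1.]))         # feature map: [1, 1, 3, 3, 1, 1]
pooled = max_pool(fmap)                          # pooled map:  [1, 3, 1]
```

Note that after pooling, the peak's exact position matters less: this positional tolerance is one intuition behind CNNs' robustness to misaligned and jittered traces.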
<p>The Transformer Network (TN) is a neural network architecture that captures long-range dependencies via self-attention mechanisms, distinguishing it from local-feature-focused models such as CNNs [<xref ref-type="bibr" rid="ref-44">44</xref>]. Typically, a Transformer model for side-channel tasks consists of stacked encoder layers, where each encoder layer comprises a multi-head self-attention sublayer and a feedforward neural network (FFN) sublayer. Following the encoder stack, there are one or more dense layers; the final dense layer integrates task-specific outputs, employing a softmax activation function for key-related classification in side-channel attacks. The multi-head self-attention sublayer computes relevance weights between all positions in a side-channel trace sequence, enabling the model to focus on leakage-correlated segments across the entire trace length. The FFN sublayer then transforms the attention-augmented features to enhance their discriminative power. In the context of side-channel attacks, Transformers can readily capture dependencies between distant Points of Interest (PoIs), making them a strong choice for attacking cryptographic implementations protected by countermeasures such as masking [<xref ref-type="bibr" rid="ref-45">45</xref>].</p>
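The core of the self-attention sublayer is scaled dot-product attention. The following stripped-down NumPy sketch omits the learned query/key/value projections and multi-head structure of a real Transformer, and simply uses the input itself as queries, keys, and values:

```python
import numpy as np

def self_attention(X):
    """Scaled dot-product self-attention with Q = K = V = X (single head,
    no learned projections): each position attends to every other position."""
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)                 # pairwise relevance of positions
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)            # softmax: each row sums to 1
    return w @ X, w                               # weighted mix and the weights

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 4))                   # 6 trace positions, 4 features each
out, weights = self_attention(X)
```

Because every position attends to every other position, two distant leakage-relevant samples can influence each other's representation in a single layer, which is why attention suits widely separated PoIs.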
<p>A Graph Neural Network (GNN) [<xref ref-type="bibr" rid="ref-46">46</xref>] is a type of neural network used for graph-related deep learning, comprising three core components: an input layer, one or more message-passing hidden layers, and an output layer. In the hidden layers, each node aggregates neighbor features via predefined functions while retaining its own features to preserve node characteristics. Typically, GNNs are optimized via gradient-based training with two iterative steps on the training data: forward and backward propagation. In backward propagation, parameter gradients are computed to update parameters iteratively, minimizing the loss and enhancing the GNN&#x2019;s ability to capture graph correlations. Notably, GNNs are emerging as a promising option for enhancing the accuracy and effectiveness of SCA detection models, an advantage rooted in their demonstrated efficacy in capturing the relational information inherent to graph-structured data [<xref ref-type="bibr" rid="ref-47">47</xref>].</p>
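The message-passing step described above can be sketched as a single GNN layer in NumPy (a generic mean-aggregation layer under assumed conventions, not any specific published SCA architecture):

```python
import numpy as np

def message_pass(A, H, W):
    """One GNN layer: each node aggregates its neighbors' features (plus its
    own, via self-loops), then applies a shared linear map and ReLU."""
    A_hat = A + np.eye(len(A))                   # add self-loops to keep own features
    deg = A_hat.sum(axis=1, keepdims=True)
    return np.maximum(A_hat / deg @ H @ W, 0.0)  # mean-aggregate, transform, ReLU

# toy graph: 4 nodes in a path 0-1-2-3
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
H = rng.standard_normal((4, 3))                  # initial node features
W = rng.standard_normal((3, 2))                  # shared weight matrix
H1 = message_pass(A, H, W)
```

Stacking such layers lets information propagate over multi-hop neighborhoods, which is how GNNs capture the relational structure mentioned above.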
<p>An autoencoder (AE) [<xref ref-type="bibr" rid="ref-48">48</xref>] converts the inputs to an efficient internal representation and then generates an output that closely resembles the input. Thus, in an autoencoder, the output layer&#x2019;s neuron count generally matches the number of inputs. An autoencoder is typically made up of two parts: an encoder and a decoder. The encoder converts the inputs to an internal representation, and the decoder converts the obtained internal representation to the output. One way to use an autoencoder in SCA scenarios is to denoise the traces obtained through side-channel measurements [<xref ref-type="bibr" rid="ref-49">49</xref>].</p>
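<p>A minimal NumPy sketch of a denoising autoencoder, under toy assumptions: the &#x201C;traces&#x201D; are noisy sine waves standing in for side-channel measurements, the encoder and decoder are single linear layers, and training is plain full-batch gradient descent on the mean-squared reconstruction error against the clean signal.</p>

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy traces: 200 noisy copies of a sine wave stand in for side-channel
# measurements corrupted by acquisition noise.
t = np.linspace(0, 2 * np.pi, 50)
clean = np.tile(np.sin(t), (200, 1))
noisy = clean + 0.3 * rng.standard_normal(clean.shape)

d, h = 50, 8                                 # 50-sample traces, 8-dim code
W_enc = 0.1 * rng.standard_normal((d, h))    # encoder weights
W_dec = 0.1 * rng.standard_normal((h, d))    # decoder weights

lr = 0.02
for _ in range(300):
    z = noisy @ W_enc            # encoder: compress each trace
    recon = z @ W_dec            # decoder: reconstruct the trace
    err = recon - clean          # denoising target is the clean trace
    # Gradients of the mean-squared error w.r.t. both weight matrices.
    g_dec = z.T @ err / len(noisy)
    g_enc = noisy.T @ (err @ W_dec.T) / len(noisy)
    W_dec -= lr * g_dec
    W_enc -= lr * g_enc

mse = float(np.mean((noisy @ W_enc @ W_dec - clean) ** 2))
```

<p>After training, passing a fresh noisy trace through the encoder and decoder suppresses noise components that fall outside the low-dimensional internal representation.</p>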
<p>Attention Mechanism and Multi-Scale Convolutional Neural Network (AMCNNet) [<xref ref-type="bibr" rid="ref-50">50</xref>] is a specialized neural network architecture designed for SCA, which enhances feature extraction from multi-dimensional trace data by fusing convolutional layers with attention mechanisms. Typically, an AMCNNet model is composed of a sequence of attention-convolution modules, where each module consists of a multi-channel convolutional layer followed by a channel-wise attention sublayer. The multi-channel convolutional layer applies task-specific filters to raw side-channel traces across multiple data channels, capturing both local temporal features and cross-channel correlations. The channel-wise attention sublayer then assigns adaptive weights to different feature channels, emphasizing leakage-relevant channels while suppressing noise-interfered ones.</p>
</sec>
<sec id="s2_3">
<label>2.3</label>
<title>Deep Learning Side-Channel Attack</title>
<p>Side-channel attacks are commonly classified into two categories: <italic>non-profiled</italic> and <italic>profiled</italic>.</p>
<p>Non-profiled attacks seek to compromise cryptographic implementations directly, exemplified by techniques such as DPA [<xref ref-type="bibr" rid="ref-18">18</xref>] and CPA [<xref ref-type="bibr" rid="ref-27">27</xref>].</p>
<p>Profiled attacks begin by learning a leakage profile that establishes a connection between the captured side-channel measurements and the cryptographic algorithm&#x2019;s sensitive intermediate value that is dependent on the key. This learned profile is then utilized to carry out the attack. Profiled attacks generally demonstrate superior efficiency compared to non-profiled attacks when subjected to identical attack conditions. In some specific scenarios, the attack efficiency of profiled attacks can even be several orders of magnitude higher. Nevertheless, it is essential to highlight that setting up profiled attacks generally involves more preparation than non-profiled attacks. In order to understand the leakage profile of the device across all potential values of the sensitive intermediate state, adversaries are required to have full control over one or more copies of the victim device, called profiling device(s), to capture an extensive number of side-channel traces and associated data to construct the leakage profile. Template attacks [<xref ref-type="bibr" rid="ref-28">28</xref>], as an example, utilize traces obtained from the profiling device to generate a set of probability distributions. These distributions are then employed to characterize traces based on the relevant key-dependent intermediate value. This allows the attacker to compare the traces obtained from the targeted device against the templates, facilitating identification of the most probable key used in the cryptographic algorithm [<xref ref-type="bibr" rid="ref-51">51</xref>].</p>
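<p>The template-attack idea just described can be sketched in a few lines of NumPy, under simplifying assumptions: four classes instead of 256, simulated Gaussian leakage, and independent (diagonal-covariance) samples; real template attacks typically use full covariance matrices over selected PoIs.</p>

```python
import numpy as np

rng = np.random.default_rng(3)
n_classes, m = 4, 10             # toy: 4 intermediate values, 10 samples
means = rng.normal(0.0, 1.0, (n_classes, m))  # class-dependent leakage

def acquire(c, n):
    """Simulate n traces for intermediate value c (mean + Gaussian noise)."""
    return means[c] + 0.5 * rng.standard_normal((n, m))

# Profiling stage: per-class mean and variance form the templates.
templates = [(tr.mean(axis=0), tr.var(axis=0, ddof=1))
             for tr in (acquire(c, 200) for c in range(n_classes))]

def log_likelihood(trace, mu, var):
    """Diagonal-covariance Gaussian log-likelihood of one trace."""
    return float(-0.5 * np.sum(np.log(2 * np.pi * var)
                               + (trace - mu) ** 2 / var))

# Attack stage: match a fresh trace (true class 2) against every template.
attack_trace = acquire(2, 1)[0]
scores = [log_likelihood(attack_trace, mu, var) for mu, var in templates]
best = int(np.argmax(scores))    # most probable intermediate value
```

<p>The class with the highest likelihood is taken as the processed intermediate value, from which the key candidate is deduced.</p>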
<p>Barring a small number of exceptions [<xref ref-type="bibr" rid="ref-52">52</xref>], deep learning techniques are predominantly employed for side-channel attacks in profiled scenarios [<xref ref-type="bibr" rid="ref-29">29</xref>,<xref ref-type="bibr" rid="ref-53">53</xref>&#x2013;<xref ref-type="bibr" rid="ref-56">56</xref>]. Typically, deep-learning side-channel attacks involve the following two steps (as shown in <xref ref-type="fig" rid="fig-2">Figs. 2</xref> and <xref ref-type="fig" rid="fig-3">3</xref>):</p>
<fig id="fig-2">
<label>Figure 2</label>
<caption>
<title>The overview of the profiling stage in DLSCAs on implementations of AES. It illustrates the process: capture side-channel profiling traces from a profiling device with known keys, then build a DL model to learn the leakage profile for subsequent attacks</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_74473-fig-2.tif"/>
</fig><fig id="fig-3">
<label>Figure 3</label>
<caption>
<title>The overview of the attack stage in DLSCAs on implementations of AES: capture traces from a victim device, input them into a trained deep-learning model, and classify them to extract the key</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_74473-fig-3.tif"/>
</fig>
<p><bold>Profiling stage.</bold> During this stage, the common assumption is that adversaries possess full control over at least one profiling device. This device resembles the victim device and is programmed with the identical version of the cryptographic algorithm. The attacker can therefore capture numerous side-channel traces from the profiling device(s) and gather related information such as known plaintexts and keys.</p>
<p>We use <inline-formula id="ieqn-13"><mml:math id="mml-ieqn-13"><mml:mrow><mml:mi mathvariant="bold-script">T</mml:mi></mml:mrow><mml:mo>=</mml:mo><mml:mo fence="false" stretchy="false">{</mml:mo><mml:msub><mml:mrow><mml:mi>&#x1D4AF;</mml:mi></mml:mrow><mml:mn>1</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>&#x1D4AF;</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mrow><mml:mi mathvariant="bold-script">T</mml:mi></mml:mrow><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow></mml:mrow></mml:msub><mml:mo fence="false" stretchy="false">}</mml:mo></mml:math></inline-formula>, where <inline-formula id="ieqn-14"><mml:math id="mml-ieqn-14"><mml:msub><mml:mrow><mml:mi>&#x1D4AF;</mml:mi></mml:mrow><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x2208;</mml:mo><mml:msup><mml:mrow><mml:mi mathvariant="double-struck">R</mml:mi></mml:mrow><mml:mi>m</mml:mi></mml:msup></mml:math></inline-formula>, and <inline-formula id="ieqn-15"><mml:math id="mml-ieqn-15"><mml:mi>i</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mo fence="false" stretchy="false">{</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>,</mml:mo><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mrow><mml:mi mathvariant="bold-script">T</mml:mi></mml:mrow><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mo fence="false" stretchy="false">}</mml:mo></mml:math></inline-formula>, to indicate a set of traces for detailed analysis. Every trace denotes the entire or partial execution process of AES. 
The plaintext <inline-formula id="ieqn-16"><mml:math id="mml-ieqn-16"><mml:msub><mml:mrow><mml:mi>&#x1D4AB;</mml:mi></mml:mrow><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x2208;</mml:mo><mml:mo fence="false" stretchy="false">{</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:msup><mml:mo fence="false" stretchy="false">}</mml:mo><mml:mrow><mml:mn>128</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula>, secret key <inline-formula id="ieqn-17"><mml:math id="mml-ieqn-17"><mml:msub><mml:mrow><mml:mi>&#x1D4A6;</mml:mi></mml:mrow><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x2208;</mml:mo><mml:mo fence="false" stretchy="false">{</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:msup><mml:mo fence="false" stretchy="false">}</mml:mo><mml:mrow><mml:mn>128</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula> and ciphertext <inline-formula id="ieqn-18"><mml:math id="mml-ieqn-18"><mml:msub><mml:mrow><mml:mi>&#x1D49E;</mml:mi></mml:mrow><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x2208;</mml:mo><mml:mo fence="false" stretchy="false">{</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:msup><mml:mo fence="false" stretchy="false">}</mml:mo><mml:mrow><mml:mn>128</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula> are all the corresponding information for the trace <inline-formula id="ieqn-19"><mml:math id="mml-ieqn-19"><mml:msub><mml:mrow><mml:mi>&#x1D4AF;</mml:mi></mml:mrow><mml:mi>i</mml:mi></mml:msub></mml:math></inline-formula>. 
We use <inline-formula id="ieqn-20"><mml:math id="mml-ieqn-20"><mml:msub><mml:mrow><mml:mi>&#x1D4AB;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula id="ieqn-21"><mml:math id="mml-ieqn-21"><mml:msub><mml:mrow><mml:mi>&#x1D49E;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> to represent the <inline-formula id="ieqn-22"><mml:math id="mml-ieqn-22"><mml:mi>j</mml:mi></mml:math></inline-formula>th byte of the <inline-formula id="ieqn-23"><mml:math id="mml-ieqn-23"><mml:mn>16</mml:mn></mml:math></inline-formula>-byte plaintext <inline-formula id="ieqn-24"><mml:math id="mml-ieqn-24"><mml:msub><mml:mrow><mml:mi>&#x1D4AB;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and ciphertext <inline-formula id="ieqn-25"><mml:math id="mml-ieqn-25"><mml:msub><mml:mrow><mml:mi>&#x1D49E;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>, with <inline-formula id="ieqn-26"><mml:math id="mml-ieqn-26"><mml:mi>j</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mo fence="false" stretchy="false">{</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>,</mml:mo><mml:mn>15</mml:mn><mml:mo fence="false" stretchy="false">}</mml:mo></mml:math></inline-formula>. 
Subsequently, the selected deep-learning model <inline-formula id="ieqn-27"><mml:math id="mml-ieqn-27"><mml:msub><mml:mrow><mml:mi>&#x02133;</mml:mi></mml:mrow><mml:mi>j</mml:mi></mml:msub></mml:math></inline-formula> for the <inline-formula id="ieqn-28"><mml:math id="mml-ieqn-28"><mml:mi>j</mml:mi></mml:math></inline-formula>th subkey is trained to understand the leakage profile linking traces and the <inline-formula id="ieqn-29"><mml:math id="mml-ieqn-29"><mml:mi>j</mml:mi></mml:math></inline-formula>th key-dependent labels. Based on the chosen attack point and leakage model, the label is determined (as described below).</p>
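<p>For a software AES target with the first-round <italic>SBox</italic> output as the attack point, the label for each trace is derived from the corresponding plaintext byte and subkey byte. The sketch below builds the SBox from its GF(2<sup>8</sup>)-inversion-plus-affine definition rather than a hard-coded lookup table; the helper names and example subkey are illustrative.</p>

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) with the AES polynomial x^8 + x^4 + x^3 + x + 1."""
    p = 0
    for _ in range(8):
        if b & 1:
            p ^= a
        carry = a & 0x80
        a = (a << 1) & 0xFF
        if carry:
            a ^= 0x1B
        b >>= 1
    return p

def gf_inv(a: int) -> int:
    """Multiplicative inverse via a^254 (0 maps to 0, as in AES)."""
    r = 1
    for _ in range(254):
        r = gf_mul(r, a)
    return r if a else 0

def sbox(x: int) -> int:
    """AES SBox: GF(2^8) inversion followed by the affine transform."""
    c = gf_inv(x)
    s = 0
    for i in range(8):
        bit = ((c >> i) ^ (c >> ((i + 4) % 8)) ^ (c >> ((i + 5) % 8))
               ^ (c >> ((i + 6) % 8)) ^ (c >> ((i + 7) % 8)) ^ (0x63 >> i)) & 1
        s |= bit << i
    return s

def label(p_byte: int, k_byte: int) -> int:
    """ID-model label for the first-round SBox output attack point."""
    return sbox(p_byte ^ k_byte)

# Example: ID-model labels for plaintext bytes under a hypothetical subkey.
labels = [label(p, 0x4B) for p in (0x00, 0x7F, 0xFF)]
```

<p>Under the HW leakage model, the label would instead be the Hamming weight of this SBox output, reducing the classification task from 256 to 9 classes.</p>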
<p><bold>Attack stage.</bold> At this phase, we presume that the adversary can identify a means to capture a restricted set of side-channel traces from the victim device. In addition, the attacker can also record some known information <inline-formula id="ieqn-30"><mml:math id="mml-ieqn-30"><mml:msub><mml:mrow><mml:mover><mml:mrow><mml:mi>&#x1D4B3;</mml:mi></mml:mrow><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow><mml:mi>i</mml:mi></mml:msub></mml:math></inline-formula>. Afterwards, the DL model <inline-formula id="ieqn-31"><mml:math id="mml-ieqn-31"><mml:msub><mml:mrow><mml:mi>&#x02133;</mml:mi></mml:mrow><mml:mi>j</mml:mi></mml:msub></mml:math></inline-formula>, after being trained, classifies the traces obtained from the target device. Next, the attacker can use the known information recorded in collaboration with the classification results to deduce the secret subkey <inline-formula id="ieqn-32"><mml:math id="mml-ieqn-32"><mml:msub><mml:mrow><mml:mrow><mml:mi>&#x1D4A6;</mml:mi></mml:mrow></mml:mrow><mml:mi>j</mml:mi></mml:msub></mml:math></inline-formula>.</p>
<p>Notice that a well-optimized deep-learning model can notably boost the efficiency of side-channel attacks, often outperforming traditional signal processing methods by several orders of magnitude. For example, to compromise an AES implementation, the template attack used in [<xref ref-type="bibr" rid="ref-15">15</xref>] requires 52K traces from the target device at a 1 m distance, where each trace is the average of 500 measurements conducted under identical encryption conditions. By contrast, the deep-learning based method of reference [<xref ref-type="bibr" rid="ref-16">16</xref>] requires only 350 traces, using a CNN model to compromise the same AES implementation at a 15 m distance, which amounts to a roughly five-orders-of-magnitude improvement in attack efficiency despite the longer attack distance. However, this increased efficiency comes at a cost: DLSCAs typically demand substantially more preparation resources, such as computational power, training data, and time, compared to conventional techniques. In cases where adversaries are unable to obtain a copy of the victim device to serve as a profiling device for training the model, the challenge grows further. Thus, a major concern in scenarios where deep learning offers clear advantages in SCAs is that adversaries must obtain one or more profiling devices that resemble the victim device and can be programmed with the identical version of the cryptographic algorithm.</p>
<sec id="s2_3_1">
<label>2.3.1</label>
<title>Attack Point and Leakage Model</title>
<p>In a side-channel attack, the attack point refers to a specific point or component in the cryptographic algorithm or its implementation that is targeted by the attacker to exploit side-channel information. This attack point is selected based on its potential to leak information related to the secret key or other sensitive data.</p>
<p>The choice of attack point is determined by various factors, such as the side-channel type, the characteristics of the target device or algorithm, and the attacker&#x2019;s knowledge and resources. In software implementations of AES (microcontrollers, microprocessors), frequently exploited attack points include the <italic>SBox</italic> outputs of the initial and final rounds. This is because in software implementations, the resulting 8-bit symbol from the SBox procedure is typically transferred from memory to a data bus, which often leads to higher power consumption compared to other processes. In CMOS devices, the predominant share of power use stems from switching activity, so for hardware implementations of AES (such as FPGA or ASIC), the attack point is commonly selected as the XORed value derived from the input and output of the last round.</p>
<p>Within a side-channel attack, a leakage model refers to a mathematical or statistical representation that captures how side-channel measurements of a device relate to the underlying operations or data being processed. The leakage model can capture how side-channel measurements correlate with the internal states or computations of the cryptographic algorithm. A leakage model typically describes how the side-channel measurements of a device changes based on the values of specific variables or intermediate data during the algorithm&#x2019;s execution. Some commonly used leakage models in SCAs include:
<list list-type="simple">
<list-item><label>1.</label><p>Identity (ID) model. Under the ID model, side-channel measurements at the attack point are assumed to be directly proportional to the value of the processed data. For instance, when the data contains one byte, the ID model generates a set of <inline-formula id="ieqn-33"><mml:math id="mml-ieqn-33"><mml:msup><mml:mn>2</mml:mn><mml:mn>8</mml:mn></mml:msup><mml:mo>=</mml:mo><mml:mn>256</mml:mn></mml:math></inline-formula> classes.</p></list-item>
<list-item><label>2.</label><p>Hamming weight (HW) model. Within the HW model, the assumption is that side-channel measurements demonstrate proportionality to the number of 1<inline-formula id="ieqn-34"><mml:math id="mml-ieqn-34"><mml:mi>s</mml:mi></mml:math></inline-formula> in the data processed at the attack point.</p></list-item>
<list-item><label>3.</label><p>Hamming distance (HD) model. Within the HD model, it is assumed that side-channel measurements exhibit proportionality to how many 0 <inline-formula id="ieqn-35"><mml:math id="mml-ieqn-35"><mml:mo stretchy="false">&#x2192;</mml:mo></mml:math></inline-formula> 1 and 1 <inline-formula id="ieqn-36"><mml:math id="mml-ieqn-36"><mml:mo stretchy="false">&#x2192;</mml:mo></mml:math></inline-formula> 0 transitions occur between two states in the cryptographic algorithm undergoing processing. Two transitions are considered to have the same impact.</p></list-item>
</list></p>
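<p>The three leakage models above reduce to a few lines of Python; the example values are arbitrary.</p>

```python
def hamming_weight(v: int) -> int:
    """HW model label: number of 1-bits in the processed value."""
    return bin(v).count("1")

def hamming_distance(a: int, b: int) -> int:
    """HD model label: number of 0->1 and 1->0 transitions between states."""
    return hamming_weight(a ^ b)

# ID model: the 8-bit value itself, 256 classes.
# HW model: 9 classes (0..8) for one byte.
id_label = 0xA7                              # the value itself
hw_label = hamming_weight(0xA7)              # 0xA7 = 0b10100111 -> 5
hd_label = hamming_distance(0b1100, 0b1010)  # 2 bit flips
```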
</sec>
<sec id="s2_3_2">
<label>2.3.2</label>
<title>Leakage Detection</title>
<p>In practical scenarios, the traces obtained to represent the execution of AES can consist of a large number of samples, reaching into the thousands or even millions. Performing an attack in such circumstances can require significant resources and time. To streamline the attack process, adversaries often employ leakage detection methods to identify Points of Interest (PoIs) within the traces [<xref ref-type="bibr" rid="ref-57">57</xref>], allowing them to concentrate on a smaller subset of points where information leakage is more pronounced. The act of pinpointing the leakage interval in side-channel measurements, aimed at extracting information tied to secrets, is known as leakage detection. The TVLA technique [<xref ref-type="bibr" rid="ref-32">32</xref>], built on the widely recognized Welch&#x2019;s <italic>t</italic>-test [<xref ref-type="bibr" rid="ref-58">58</xref>], has gained significant prominence as a statistical method for detecting information leakage [<xref ref-type="bibr" rid="ref-59">59</xref>,<xref ref-type="bibr" rid="ref-60">60</xref>] and has become one of the most widely used of the available methods.</p>
<p>To conduct a TVLA, the captured side-channel measurements are initially segregated into two distinct groups based on the associated key-dependent intermediate value, known as the label, processed by the device. This division entails creating a set <inline-formula id="ieqn-37"><mml:math id="mml-ieqn-37"><mml:msub><mml:mrow><mml:mi mathvariant="bold-script">T</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> of traces where <inline-formula id="ieqn-38"><mml:math id="mml-ieqn-38"><mml:mi>H</mml:mi><mml:mi>W</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>l</mml:mi><mml:mi>a</mml:mi><mml:mi>b</mml:mi><mml:mi>e</mml:mi><mml:mi>l</mml:mi><mml:mo stretchy="false">)</mml:mo><mml:mo>&#x003E;</mml:mo><mml:mi>&#x03B2;</mml:mi></mml:math></inline-formula>, and another set <inline-formula id="ieqn-39"><mml:math id="mml-ieqn-39"><mml:msub><mml:mrow><mml:mi mathvariant="bold-script">T</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> containing traces where <inline-formula id="ieqn-40"><mml:math id="mml-ieqn-40"><mml:mi>H</mml:mi><mml:mi>W</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>l</mml:mi><mml:mi>a</mml:mi><mml:mi>b</mml:mi><mml:mi>e</mml:mi><mml:mi>l</mml:mi><mml:mo stretchy="false">)</mml:mo><mml:mo>&#x003C;</mml:mo><mml:mi>&#x03B2;</mml:mi></mml:math></inline-formula>. Here, <inline-formula id="ieqn-41"><mml:math id="mml-ieqn-41"><mml:mi>H</mml:mi><mml:mi>W</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>l</mml:mi><mml:mi>a</mml:mi><mml:mi>b</mml:mi><mml:mi>e</mml:mi><mml:mi>l</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:math></inline-formula> denotes the HW of the label set to undergo processing, representing the count of 1s in the binary form of the processed intermediate value. 
Afterwards, the <italic>t</italic>-test is applied: a sample from each of the two sets of traces is used to check for a significant difference, under the null hypothesis that the two sets have equal means.</p>
<p>In TVLA, the Second-Order Statistical Test (SOST) is employed to evaluate the difference between the two trace groups, which is defined by <xref ref-type="disp-formula" rid="eqn-1">Formula (1)</xref>.
<disp-formula id="eqn-1"><label>(1)</label><mml:math id="mml-eqn-1" display="block"><mml:mi>S</mml:mi><mml:mi>O</mml:mi><mml:mi>S</mml:mi><mml:mi>T</mml:mi><mml:mo>=</mml:mo><mml:mrow><mml:msup><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mfrac><mml:mrow><mml:mrow><mml:msub><mml:mi>&#x03BC;</mml:mi><mml:mn>0</mml:mn></mml:msub></mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mrow><mml:msub><mml:mi>&#x03BC;</mml:mi><mml:mn>1</mml:mn></mml:msub></mml:mrow></mml:mrow><mml:mrow><mml:msqrt><mml:mfrac><mml:mrow><mml:msubsup><mml:mi>&#x03C3;</mml:mi><mml:mn>0</mml:mn><mml:mn>2</mml:mn></mml:msubsup></mml:mrow><mml:mrow><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mn>0</mml:mn></mml:msub></mml:mrow></mml:mrow></mml:mfrac><mml:mo>+</mml:mo><mml:mfrac><mml:mrow><mml:msubsup><mml:mi>&#x03C3;</mml:mi><mml:mn>1</mml:mn><mml:mn>2</mml:mn></mml:msubsup></mml:mrow><mml:mrow><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mn>1</mml:mn></mml:msub></mml:mrow></mml:mrow></mml:mfrac></mml:msqrt></mml:mrow></mml:mfrac></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mn>2</mml:mn></mml:msup></mml:mrow></mml:math></disp-formula>where <inline-formula id="ieqn-42"><mml:math id="mml-ieqn-42"><mml:msub><mml:mi>&#x03BC;</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:math></inline-formula> and <inline-formula id="ieqn-43"><mml:math id="mml-ieqn-43"><mml:msub><mml:mi>&#x03C3;</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:math></inline-formula> represent the mean and standard deviation of the set <inline-formula id="ieqn-44"><mml:math id="mml-ieqn-44"><mml:msub><mml:mrow><mml:mi mathvariant="bold-script">T</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>, respectively, while <inline-formula id="ieqn-45"><mml:math id="mml-ieqn-45"><mml:msub><mml:mi>n</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:math></inline-formula> denotes the number of traces in the set.</p>
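<p>Formula (1) is straightforward to apply per sample point with NumPy. In the sketch below the two trace groups are synthetic, with an artificial leak injected at sample index 40, which the SOST trace then flags as the PoI.</p>

```python
import numpy as np

rng = np.random.default_rng(1)
n0, n1, m = 500, 500, 100
# Synthetic trace groups; group t1 leaks at sample index 40.
t0 = rng.normal(0.0, 1.0, (n0, m))
t1 = rng.normal(0.0, 1.0, (n1, m))
t1[:, 40] += 2.0                     # injected data-dependent leakage

def sost(a, b):
    """Per-sample SOST: squared Welch's t-statistic, as in Formula (1)."""
    num = a.mean(axis=0) - b.mean(axis=0)
    den = np.sqrt(a.var(axis=0, ddof=1) / len(a)
                  + b.var(axis=0, ddof=1) / len(b))
    return (num / den) ** 2

scores = sost(t0, t1)
poi = int(np.argmax(scores))         # point of interest
```

<p>Sample points whose SOST value exceeds a chosen threshold are retained as PoIs, and the rest of the trace can be discarded before profiling.</p>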
<p><xref ref-type="fig" rid="fig-4">Fig. 4</xref> shows an example of how leakage detection allocates PoIs for side-channel traces. The upper part of <xref ref-type="fig" rid="fig-4">Fig. 4</xref> illustrates the leakage detection results for 16 subkeys in the first round of an STM32F3 MCU implementation of AES-128. The bottom part of <xref ref-type="fig" rid="fig-4">Fig. 4</xref> shows an example trace representing an encryption round. The dashed red line illustrates the PoI allocation for the first SubByte operation in the first round of AES-128. By focusing on this specific trace segment and ignoring other information, adversaries can make the profiling stage more efficient.</p>
<fig id="fig-4">
<label>Figure 4</label>
<caption>
<title>An example of how leakage detection works for side-channel traces. The top picture shows the leakage detection results for 16 subkeys in the first round of an STM32F3 MCU implementation of AES-128. The bottom picture shows an example trace representing an encryption round</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_74473-fig-4.tif"/>
</fig>
</sec>
<sec id="s2_3_3">
<label>2.3.3</label>
<title>Evaluation Metrics</title>
<p>Within the side-channel community, Success Rate (SR) and Guessing Entropy (GE) [<xref ref-type="bibr" rid="ref-61">61</xref>] stand out as two extensively employed evaluation metrics [<xref ref-type="bibr" rid="ref-62">62</xref>]. These metrics provide a fair and standardized framework for evaluating how effectively an attacker can exploit side-channel leakage to recover cryptographic keys. SR gauges the likelihood of successfully identifying the correct key within a specified number of traces, while GE quantifies the average uncertainty by calculating the expected rank of the correct key within a key-ranking scenario. Together, they offer complementary insights into the efficiency and robustness of side-channel attacks under different conditions.</p>
<p>The success rate quantifies the likelihood that the correct subkey can be successfully identified using a given set of side-channel traces. It reflects the model&#x2019;s capacity to reduce uncertainty and precisely pinpoint the subkey within a specified number of traces. A higher success rate indicates that the adversary requires fewer traces to reliably recover the subkey, demonstrating how effective the model is at extracting keys from side-channel leakage. For instance, in the context of single-trace SR, this metric represents the probability that a model can correctly classify the key-dependent value using only one trace. In an experimental setup with 100 testing traces, if the model successfully recovers the key from 90 traces while failing in 10 cases, the single-trace SR for this model in that specific scenario is calculated as 90.0%.</p>
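<p>As a minimal illustration of the 90-out-of-100 example above:</p>

```python
# Single-trace success rate: fraction of attack traces for which the
# model's top-ranked key guess equals the true key. The 90/100 split
# mirrors the hypothetical experiment described in the text.
per_trace_success = [True] * 90 + [False] * 10
sr = sum(per_trace_success) / len(per_trace_success)
print(f"single-trace SR = {sr:.1%}")   # 90.0%
```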
<p>In certain attack scenarios, relying solely on the SR might not be adequate as an evaluation metric, and it is advisable to include guessing entropy as an additional measure. GE offers information about the degree to which the secret key is disclosed given a specific number of traces. Guessing entropy denotes the degree of uncertainty or randomness associated with predicting the secret key from its side-channel exposure. As a common metric for evaluating attack complexity, guessing entropy implies that higher entropy corresponds to a lower likelihood of success for the attacker.</p>
<p>When recovering 8-bit subkeys individually, the guessing-entropy evaluation is performed independently for every subkey, and the estimation metric used is the Partial Guessing Entropy (PGE) instead of GE [<xref ref-type="bibr" rid="ref-63">63</xref>]. PGE represents the expected <italic>rank</italic> of the actual subkey. The <italic>rank</italic> of a subkey measures the position of the correct subkey value <inline-formula id="ieqn-46"><mml:math id="mml-ieqn-46"><mml:msup><mml:mi>k</mml:mi><mml:mo>&#x2217;</mml:mo></mml:msup></mml:math></inline-formula> within the key guessing vector. For example, through the classification of a trace set, the trained deep-learning model generates a cumulative key guessing vector <inline-formula id="ieqn-47"><mml:math id="mml-ieqn-47"><mml:mi>g</mml:mi></mml:math></inline-formula>, which denotes the probabilities of all possible subkey values. When the correct subkey value <inline-formula id="ieqn-48"><mml:math id="mml-ieqn-48"><mml:msup><mml:mi>k</mml:mi><mml:mo>&#x2217;</mml:mo></mml:msup></mml:math></inline-formula> holds the highest probability in <inline-formula id="ieqn-49"><mml:math id="mml-ieqn-49"><mml:mi>g</mml:mi></mml:math></inline-formula> (appearing first in <inline-formula id="ieqn-50"><mml:math id="mml-ieqn-50"><mml:mi>S</mml:mi><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mi>t</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>g</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:math></inline-formula>), the rank of the correct subkey is 0. The PGE is the average rank of <inline-formula id="ieqn-51"><mml:math id="mml-ieqn-51"><mml:msup><mml:mi>k</mml:mi><mml:mo>&#x2217;</mml:mo></mml:msup></mml:math></inline-formula>.</p>
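<p>A small NumPy sketch of the rank computation, using simulated model outputs (the subkey value, trace count, and logit bias are arbitrary assumptions): per-trace log-probabilities are accumulated into the key guessing vector, sorted, and the position of the correct subkey gives its rank.</p>

```python
import numpy as np

rng = np.random.default_rng(2)
n_traces, n_classes = 100, 256
true_subkey = 137                     # hypothetical correct subkey value

# Simulated per-trace class probabilities from a model whose logits
# mildly favour the correct subkey.
logits = rng.normal(0.0, 1.0, (n_traces, n_classes))
logits[:, true_subkey] += 1.0
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

# Cumulative key guessing vector g: summed log-likelihood per key guess.
g = np.log(probs).sum(axis=0)
order = np.argsort(-g)                # Sort(g): most to least likely
rank = int(np.argwhere(order == true_subkey)[0, 0])
# rank == 0 means the correct subkey tops the guessing vector; PGE is
# this rank averaged over repeated experiments.
```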
<p>Within DLSCAs, adversaries typically begin by constructing a deep-learning model to serve as a leakage profile, which connects side-channel traces to key-related labels. During the profiling stage, various pre-defined loss functions from the deep-learning community are employed to evaluate the extent to which the trained deep-learning classifier fits the training traces. For example, in most attacks targeting software implementations of AES [<xref ref-type="bibr" rid="ref-16">16</xref>,<xref ref-type="bibr" rid="ref-64">64</xref>,<xref ref-type="bibr" rid="ref-65">65</xref>], it is common to use <italic>categorical cross-entropy loss</italic> to quantify classification errors (see below), as these attacks can often be simplified into multi-class classification tasks.
<disp-formula id="eqn-2"><label>(2)</label><mml:math id="mml-eqn-2" display="block"><mml:mi>C</mml:mi><mml:mi>E</mml:mi><mml:mo>=</mml:mo><mml:mo>&#x2212;</mml:mo><mml:munder><mml:mo movablelimits="false">&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mi>C</mml:mi></mml:mrow></mml:munder><mml:msub><mml:mi>t</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mi>log</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mfrac><mml:msup><mml:mi>e</mml:mi><mml:mrow><mml:msub><mml:mi>s</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:msup><mml:mrow><mml:munder><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>j</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mi>C</mml:mi></mml:mrow></mml:munder><mml:msup><mml:mi>e</mml:mi><mml:mrow><mml:msub><mml:mi>s</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:mfrac><mml:mo>)</mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:math></disp-formula>where <inline-formula id="ieqn-52"><mml:math id="mml-ieqn-52"><mml:msub><mml:mi>t</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:math></inline-formula> and <inline-formula id="ieqn-53"><mml:math id="mml-ieqn-53"><mml:msub><mml:mi>s</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:math></inline-formula> are the ground truth and the classifier score for each class <inline-formula id="ieqn-54"><mml:math id="mml-ieqn-54"><mml:mi>i</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mi>C</mml:mi></mml:math></inline-formula>. 
To recover an 8-bit subkey <inline-formula id="ieqn-55"><mml:math id="mml-ieqn-55"><mml:msub><mml:mrow><mml:mi>&#x1D4A6;</mml:mi></mml:mrow><mml:mi>j</mml:mi></mml:msub></mml:math></inline-formula>, each trace can belong to one of <inline-formula id="ieqn-56"><mml:math id="mml-ieqn-56"><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mi>C</mml:mi><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mn>256</mml:mn></mml:math></inline-formula> classes and the model&#x2019;s output is a probability array over the 256 classes for each trace.</p>
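<p>Formula (2) in NumPy, applied to a 256-class subkey classification with an arbitrary score vector (the class index and scores are illustrative):</p>

```python
import numpy as np

def categorical_cross_entropy(t, s):
    """Formula (2): t is the one-hot ground truth, s the raw class scores."""
    s = s - s.max()                          # shift for numerical stability
    p = np.exp(s) / np.exp(s).sum()          # softmax over the classes
    return float(-(t * np.log(p)).sum())

# 256-class subkey example: correct class 42, scores mildly favour it.
t = np.zeros(256)
t[42] = 1.0
s = np.zeros(256)
s[42] = 3.0
loss = categorical_cross_entropy(t, s)
```

<p>The loss shrinks toward zero as the classifier concentrates probability mass on the correct class, which is exactly what training minimizes over the profiling traces.</p>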
</sec>
</sec>
</sec>
<sec id="s3">
<label>3</label>
<title>Practical Attack Cases of DLSCAs</title>
<p>To identify the capabilities of, and enable comparison among, different DLSCAs on AES implementations that use power consumption as the side channel, this section provides a comprehensive review of existing work.</p>
<p>Paul Kocher first introduced power analysis in 1999 [<xref ref-type="bibr" rid="ref-18">18</xref>]. The power consumption of a device can depend on the particular data undergoing processing and the operations being executed. If an operation is correlated with some key-dependent intermediate state, the attacker is able to examine the power consumed by the target device and deduce the key. Power-based side-channel attacks capitalize on fluctuations in power usage during the execution of encryption on the victim device, with such fluctuations potentially differing according to various input data and operations [<xref ref-type="bibr" rid="ref-66">66</xref>]. <xref ref-type="fig" rid="fig-5">Fig. 5</xref> shows the comparison of two power traces corresponding to different data calculations within the same instruction, indicating that different data executed by the victim device result in distinct side-channel measurements.</p>
<fig id="fig-5">
<label>Figure 5</label>
<caption>
<title>The comparison of two power traces (each averaged over 100 acquisitions) indicates that different data executed by the victim device result in distinct power consumption (side-channel measurements)</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_74473-fig-5.tif"/>
</fig>
<p>One method to gauge a device&#x2019;s power consumption is to insert a small resistor in series with the power or ground input, typically referred to as a shunt. Power consumption is determined by dividing the resistor&#x2019;s voltage drop by its resistance [<xref ref-type="bibr" rid="ref-67">67</xref>]. A series of power measurements obtained by sampling these voltage drops using an oscilloscope over a specific duration is referred to as a trace.</p>
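<p>Numerically, dividing each sampled voltage drop by the shunt resistance yields the instantaneous current drawn by the device, which, multiplied by the supply voltage, approximates instantaneous power; the constants below are illustrative, not values from the cited setup.</p>

```python
# Converting a shunt resistor's sampled voltage drops into a power trace.
# R_SHUNT and V_DD are illustrative assumptions, not values from the text.
R_SHUNT = 10.0                        # shunt resistance, ohms
V_DD = 3.3                            # supply voltage, volts
v_drop = [0.012, 0.015, 0.011]        # oscilloscope samples, volts

current = [v / R_SHUNT for v in v_drop]   # I = V / R (Ohm's law)
power = [V_DD * i for i in current]       # instantaneous P ~ V_DD * I
```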
<p>In the following subsections, we first introduce publicly available power-based SCA datasets. We then review existing DLSCAs on both software and hardware AES implementations that employ power consumption as the side channel.</p>
<sec id="s3_1">
<label>3.1</label>
<title>Publicly Available Datasets for Power Analysis</title>
<p>In power analysis, there are five widely exploited side-channel datasets: ASCAD, DPA_V2, DPA_V4, AES_RD, and AES_HD. These datasets provide traces collected from different cryptographic implementations under various scenarios, enabling the community to evaluate attack techniques and countermeasures.</p>
<sec id="s3_1_1">
<label>3.1.1</label>
<title>Datasets of Software Implementations</title>
<p><bold>DPA_V4 (DPA contest V4 Dataset)</bold>. The DPA_V4 dataset<xref ref-type="fn" rid="fn-1"><sup>1</sup></xref><fn id="fn-1"><label>1</label><p>The DPA_V4 dataset is publicly available at <ext-link ext-link-type="uri" xlink:href="https://dpacontest.telecom-paris.fr/v4/index.php">https://dpacontest.telecom-paris.fr/v4/index.php</ext-link> (accessed on 02 November 2025).</p></fn> was collected for a later stage of the DPA contest. Its target is an Atmel ATMega-163 smart-card implementation of AES protected with Rotating Sbox Masking (RSM) [<xref ref-type="bibr" rid="ref-68">68</xref>]. The card includes 16 KB of in-system programmable flash, 512 bytes of EEPROM, 1 KB of internal SRAM, and 32 general-purpose working registers. A simple reader interface mounted on the SASEBO-W board is used to read the smart card. The traces were collected with a LeCroy WaveRunner 6100A oscilloscope, with an acquisition bandwidth of 200 MHz and a sampling rate of 500 MS/s. The dataset contains 80K traces under 16 different keys; for each key, there are 5K traces corresponding to the encryption of 5K different plaintexts. The dataset requires approximately 64 GB of storage.</p>
<p><bold>ASCAD (ANSSI SCA Dataset)</bold>. The ASCAD dataset<xref ref-type="fn" rid="fn-2"><sup>2</sup></xref><fn id="fn-2"><label>2</label><p>The ASCAD dataset is publicly available at <ext-link ext-link-type="uri" xlink:href="https://github.com/ANSSI-FR/ASCAD">https://github.com/ANSSI-FR/ASCAD</ext-link> (accessed on 02 November 2025).</p></fn> includes traces collected from two distinct victim devices: an ATMega8515 implementing a Boolean-masked AES and an STM32F303RCT7 implementing an affine-masked AES. These datasets are commonly denoted ASCADv1 (the ATMega8515 implementation) and ASCADv2 (the STM32F303RCT7 implementation). ASCADv1 comprises 60K synchronized traces captured with a fixed secret key and 300K jitter-affected traces obtained with variable keys; the traces&#x2019; desynchronization is simulated artificially. To sample measurements in the ASCAD dataset, a digital oscilloscope operates at a sampling rate of 2 GS/s, and the temporal acquisition window records only the first round of the AES. Each trace consists of 1.4K sample points, and the dataset takes up around 77 GB of storage space. ASCADv2 contains 800K traces, each generated with random plaintexts and keys. Its traces consist of 1M sample points, covering the entire AES encryption process. The dataset requires approximately 807 GB of storage; however, a smaller extracted dataset of only 7 GB is available, offering a more convenient option for quick access.</p>
<p><bold>AES_RD</bold>. AES_RD<xref ref-type="fn" rid="fn-3"><sup>3</sup></xref><fn id="fn-3"><label>3</label><p>The AES_RD dataset is publicly available at <ext-link ext-link-type="uri" xlink:href="https://github.com/ikizhvatov/randomdelays-traces">https://github.com/ikizhvatov/randomdelays-traces</ext-link> (accessed on 02 November 2025).</p></fn> contains power-consumption traces captured from an 8-bit Atmel AVR microcontroller implementation of AES-128, where the algorithm is protected by a random delay countermeasure [<xref ref-type="bibr" rid="ref-69">69</xref>]. The dataset contains 50K traces in total, each with 3.5K samples, captured with a LeCroy WaveRunner 104MXi oscilloscope. The power traces are compressed by selecting one sample per CPU clock cycle. The dataset is compactly organized, occupying less than 1 GB of storage.</p>
</sec>
<sec id="s3_1_2">
<label>3.1.2</label>
<title>Datasets of Hardware Implementations</title>
<p><bold>DPA_V2 (DPA contest V2 Dataset)</bold>. The DPA_V2 dataset<xref ref-type="fn" rid="fn-4"><sup>4</sup></xref><fn id="fn-4"><label>4</label><p>The DPA_V2 dataset is publicly available at <ext-link ext-link-type="uri" xlink:href="https://dpacontest.telecom-paris.fr/v2/participate.php">https://dpacontest.telecom-paris.fr/v2/participate.php</ext-link> (accessed on 02 November 2025).</p></fn> was collected as part of the DPA Contest, an international contest that permits researchers across the globe to compete on a common basis. Introduced in 2008, the contest has completed four versions since then. For DPA_V2, the target is an AES-128 hardware implementation on the Xilinx Virtex-5 FPGA of a SASEBO GII evaluation board. A high-resolution oscilloscope measures the power consumed by the device during encryption. The dataset contains 1M traces in total with 3.2K sample points per trace and requires approximately 9 GB of storage. For research convenience, the traces are well aligned after capture.</p>
<p><bold>AES_HD</bold>. AES_HD<xref ref-type="fn" rid="fn-5"><sup>5</sup></xref><fn id="fn-5"><label>5</label><p>The AES_HD dataset is publicly available at <ext-link ext-link-type="uri" xlink:href="https://github.com/AISyLab/AES_HD">https://github.com/AISyLab/AES_HD</ext-link> (accessed on 02 November 2025).</p></fn> is captured from an unprotected AES-128 hardware implementation on a Xilinx Virtex-5 FPGA, integrated into a SASEBO GII evaluation board. Note that the acquisition method is not explicitly mentioned on the AES_HD dataset website. The dataset comprises 100K traces, each linked to a distinct random plaintext and containing 1.25K sample points. The dataset is efficiently stored, requiring less than 1 GB of storage space.</p>
<p>The characteristics of the five datasets are summarized in <xref ref-type="table" rid="table-1">Table 1</xref>.</p>
<table-wrap id="table-1">
<label>Table 1</label>
<caption>
<title>Key characteristics of benchmark datasets in DLSCAs</title>
</caption>
<table>
<colgroup>
<col align="center" width="25mm"/>
<col align="center" width="25mm"/>
<col align="center" width="25mm"/>
<col align="center" width="25mm"/> </colgroup>
<thead>
<tr>
<th>Dataset</th>
<th>Sampling rate</th>
<th>Sampling point</th>
<th>Trace count</th>
</tr>
</thead>
<tbody>
<tr>
<td>ASCADv1</td>
<td>200 MS/s</td>
<td>1.4K</td>
<td>360K traces</td>
</tr>
<tr>
<td>ASCADv2</td>
<td>50 MS/s</td>
<td>1M</td>
<td>800K traces</td>
</tr>
<tr>
<td>DPA_V2</td>
<td>500 MS/s</td>
<td>3.2K</td>
<td>1M traces</td>
</tr>
<tr>
<td>DPA_V4</td>
<td>500 MS/s</td>
<td>1,704,402</td>
<td>80K traces</td>
</tr>
<tr>
<td>AES_RD</td>
<td>&#x2013;</td>
<td>3.5K</td>
<td>50K traces</td>
</tr>
<tr>
<td>AES_HD</td>
<td>&#x2013;</td>
<td>1.25K</td>
<td>100K traces</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="table-1fn1" fn-type="other">
<p>Note: &#x2018;&#x2013;&#x2019; indicates that the sampling rate is not provided in the original dataset documentation.</p>
</fn>
</table-wrap-foot>
</table-wrap>
</sec>
<sec id="s3_1_3">
<label>3.1.3</label>
<title>Comparison on Datasets</title>
<p>When comparing side-channel analysis datasets such as ASCAD, AES_RD, AES_HD, DPA_V2, and DPA_V4, their relevance to real-world IoT environments can be assessed based on their characteristics and limitations.</p>
<p>ASCAD, AES_RD and DPA_V4, which provide traces from microcontrollers running AES, are highly representative of software-based cryptographic implementations in real-world IoT environments. Software-based AES is widely used in low-cost and resource-constrained IoT devices. A large number of such devices run on low-power chips that lack dedicated cryptographic accelerators, making software implementations the only viable option. Among them, AES_RD, DPA_V4 and parts of the ASCAD dataset contain traces captured from 8-bit MCU implementations of AES, which can represent low-power, cost-sensitive IoT applications, such as smart sensors, simple automation systems, and basic communication modules, owing to their simplicity, low energy consumption, and affordability. However, implementing AES in software on an 8-bit MCU presents significant performance challenges. AES is a block cipher operating on 128-bit data blocks, which must be split into multiple 8-bit operations when executed on an 8-bit MCU. This results in a high number of memory accesses, increased computational overhead, and slow encryption speeds.</p>
<p>For the remaining part of the ASCAD dataset (ASCADv2), traces are captured from a 32-bit MCU implementation of AES. In this case, the data represent a more scalable approach for software-based AES in IoT, achieving a more reasonable balance between security level and cost. Real-world IoT scenarios that commonly rely on software AES on 32-bit MCUs include industrial automation controllers, connected medical devices, and smart home gateways.</p>
<p>AES_HD and DPA_V2 can be used to represent IoT applications with hardware cryptographic modules, as both contain traces captured from AES implementations on FPGAs. In contrast to software AES, hardware-based AES is preferred for performance-critical applications that demand fast encryption and decryption. IoT devices involved in real-time video streaming, high-throughput communication, or industrial control systems benefit significantly from dedicated cryptographic accelerators, as hardware AES completes encryption operations faster. Security-critical IoT applications, including payment terminals, secure bootloaders, and biometric authentication systems, also rely on hardware AES due to its resistance to attacks.</p>
</sec>
</sec>
<sec id="s3_2">
<label>3.2</label>
<title>Power Analysis of Software Implementations</title>
<p>A software-based AES implementation denotes the execution of the AES algorithm through software or programming code. In this approach, the AES algorithm is executed on a general-purpose computing device, such as a computer or a microcontroller, using software instructions to perform the required cryptographic operations. The software implementation typically involves translating the AES algorithm&#x2019;s steps, such as key expansion, substitution, permutation, and XOR operations, into programming instructions executable by the device&#x2019;s Central Processing Unit (CPU). Software implementations of AES are commonly used in various applications, including secure communication protocols, file encryption, and cryptographic libraries.</p>
<p>Advantages of software implementations of AES compared to hardware implementations include:
<list list-type="simple">
<list-item><label>1.</label><p>Flexibility: Software implementations offer great flexibility in terms of programmability and adaptability to different platforms and operating systems [<xref ref-type="bibr" rid="ref-70">70</xref>].</p></list-item>
<list-item><label>2.</label><p>Cost-effectiveness: Software implementations typically require less upfront investment compared to hardware implementations, as they can utilize existing computing infrastructure without the need for dedicated hardware components [<xref ref-type="bibr" rid="ref-71">71</xref>].</p></list-item>
<list-item><label>3.</label><p>Compatibility: Software implementations of AES can be developed to comply with standardized cryptographic libraries and protocols, ensuring compatibility and interoperability with other software systems [<xref ref-type="bibr" rid="ref-72">72</xref>].</p></list-item>
</list></p>
<p>In 2011, Hospodar et al. proposed, in the JCEN journal, one of the first machine-learning-based attacks, which trains a Least Squares Support Vector Machine (LS-SVM) [<xref ref-type="bibr" rid="ref-73">73</xref>] to classify power traces captured from an AES software implementation based on the first round&#x2019;s SBox output. Instead of recovering an actual key, Hospodar et al. [<xref ref-type="bibr" rid="ref-74">74</xref>] focus on a single SBox lookup. The investigation considers three significant properties of the SBox output: whether its HW is smaller or larger than 4, whether it is odd or even, and the value of the fourth least significant bit. This analysis aims to illustrate the impact of LS-SVM hyperparameters on classification accuracy.</p>
<p>Afterwards, in 2013, deep-learning techniques began contributing to power analysis [<xref ref-type="bibr" rid="ref-75">75</xref>]. A three-layer MLP network is trained to compromise an AES-128 SmartCard implementation featuring an 8-bit PIC16F84 microcontroller. In [<xref ref-type="bibr" rid="ref-75">75</xref>], the MLP model is trained with the standard sigmoid activation function and achieves an 85.2% single-trace attack accuracy in recovering the first subkey of the implementation. Since then, neural-network-based power analysis targeting various AES implementations has gradually emerged.</p>
<p>By training an MLP model on the generated patterns, reference [<xref ref-type="bibr" rid="ref-76">76</xref>] improves classification accuracy to 96.5% for the same AES implementation as in [<xref ref-type="bibr" rid="ref-75">75</xref>]. Afterwards, reference [<xref ref-type="bibr" rid="ref-77">77</xref>] compares the MLP-based attack introduced in [<xref ref-type="bibr" rid="ref-75">75</xref>,<xref ref-type="bibr" rid="ref-76">76</xref>] with the conventional template attack on the same dataset as in [<xref ref-type="bibr" rid="ref-75">75</xref>,<xref ref-type="bibr" rid="ref-76">76</xref>]. However, in [<xref ref-type="bibr" rid="ref-77">77</xref>], only 2560 traces are captured for training the MLP model, which may lead to results far from optimal. The average PGE of their MLP model is 1.04, which indicates the trained model&#x2019;s capability to extract the subkey from an AES-128 Smart Card implementation featuring an 8-bit PIC16F84 microcontroller.</p>
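<p>Since metrics such as PGE build on the rank of the correct key hypothesis, the following sketch shows how a key rank can be computed from a profiled model&#x2019;s per-trace output probabilities. The toy probabilities and the 4-hypothesis key space are illustrative assumptions; a real attack on AES uses 256 hypotheses per subkey byte.</p>

```python
import numpy as np

def key_rank(log_probs: np.ndarray, true_key: int) -> int:
    # log_probs[i, k] is the model's log-probability that trace i was
    # produced under key hypothesis k. Evidence is accumulated by summing
    # log-probabilities over traces; rank 0 means the true key scores
    # highest (PGE averages this rank over many attack runs).
    scores = log_probs.sum(axis=0)         # accumulate over traces
    order = np.argsort(scores)[::-1]       # best hypothesis first
    return int(np.where(order == true_key)[0][0])

# Toy example: 3 "traces" over 4 key hypotheses, true key = 2
lp = np.log(np.array([
    [0.1, 0.2, 0.6, 0.1],
    [0.2, 0.1, 0.5, 0.2],
    [0.3, 0.3, 0.3, 0.1],
]))
rank = key_rank(lp, true_key=2)  # 0: the true key is ranked first
```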
<p>In addition to MLPs, reference [<xref ref-type="bibr" rid="ref-56">56</xref>] explores the efficiency of various other deep-learning models in enhancing side-channel attacks using three distinct datasets. In [<xref ref-type="bibr" rid="ref-56">56</xref>], five deep-learning attack approaches are compared: MLP with Principal Component Analysis (PCA), MLP without PCA, CNN, AE, and Long Short-Term Memory (LSTM). For comparison, they also test a machine-learning approach (random forest) and the template attack on the same three datasets. They experimentally show the overwhelming advantage of the deep-learning-based approaches in breaking both unprotected and protected AES.</p>
<p>To further examine research on power analysis based on CNNs, Cagli et al. [<xref ref-type="bibr" rid="ref-29">29</xref>] utilize a CNN model alongside a data augmentation method [<xref ref-type="bibr" rid="ref-78">78</xref>], introduced at CHES 2017. This approach aims to overcome trace misalignment and handle jitter-based countermeasures. Cagli et al. [<xref ref-type="bibr" rid="ref-29">29</xref>] first highlight the challenges of the traditional template-attack strategy, particularly in addressing trace misalignment, which requires the attacker to meticulously realign the captured traces. They then conduct experiments demonstrating that the CNN-based strategy significantly streamlines the attack process by eliminating the need for trace realignment and precise selection of points of interest. In their first experiment, they use CNNs to compromise an AES implementation on an ATmega328P microprocessor featuring a uniform Random Delay Interrupt (RDI). Training their CNN with the HW leakage model, they need only 7 traces on average to recover a subkey, confirming that their CNN model is robust to RDI. Their second experiment introduces clock instability, misaligning traces on the adversary&#x2019;s side. The findings show that the CNN method outperforms Gaussian templates regardless of whether trace realignment is performed. Meanwhile, reference [<xref ref-type="bibr" rid="ref-79">79</xref>] also examines the advantages of CNNs in comparison to various ML techniques. Their experiments reveal that methods such as random forest may be preferable to CNNs in certain attack scenarios, casting doubt on CNNs as the optimal approach for every profiled SCA setting.
Another work on CNN-based attacks [<xref ref-type="bibr" rid="ref-80">80</xref>] shows that manually adding non-task-specific noise to training sets may prove advantageous to the attack efficiency of the trained network; the addition of such noise can be considered on par with incorporating a regularization term. CNN models trained on data with additional non-task-specific noise in [<xref ref-type="bibr" rid="ref-80">80</xref>] can successfully recover the key using 2 traces on the AES_RD dataset. For the DPA_V4 dataset, the addition of noise helps the pooled template break the implementation with 2 traces, outperforming other CNN-based approaches.</p>
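<p>A data augmentation step in the spirit of Cagli et al.&#x2019;s approach can be sketched as follows; the circular shift and the uniform shift range are simplifying assumptions rather than the exact shifting deformation used in the original work.</p>

```python
import numpy as np

def augment_with_shifts(traces: np.ndarray, max_shift: int, seed: int = 0):
    # Randomly shift each training trace by up to max_shift samples so the
    # network learns shift-invariant features, mimicking jitter/misalignment.
    rng = np.random.default_rng(seed)
    shifts = rng.integers(0, max_shift + 1, size=len(traces))
    shifted = np.stack([np.roll(t, s) for t, s in zip(traces, shifts)])
    return shifted, shifts

batch = np.tile(np.arange(8.0), (4, 1))  # 4 identical toy traces
augmented, shifts = augment_with_shifts(batch, max_shift=5)
```

In a real attack, the augmented traces keep their original labels, since the underlying intermediate value is unchanged by the shift.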
<p>During the profiling stage, the HW and HD leakage models face a common challenge of imbalanced data. For instance, if the intermediate values are evenly distributed across the range from 0 to 255, some HW classes occur in only 1/256 of instances (HW equal to 0) while another occurs in 70/256 of instances (HW equal to 4). To reduce the effect of imbalanced profiling data, reference [<xref ref-type="bibr" rid="ref-81">81</xref>] uses a data-balancing approach called the Synthetic Minority Oversampling Technique (SMOTE) [<xref ref-type="bibr" rid="ref-82">82</xref>] to ensure a balanced distribution of classes.</p>
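<p>The imbalance follows directly from the binomial distribution of byte Hamming weights, as this short check illustrates:</p>

```python
from collections import Counter
from math import comb

# Count how many of the 256 byte values fall into each HW class 0..8
hw_counts = Counter(bin(v).count("1") for v in range(256))
sizes = [hw_counts[w] for w in range(9)]
# sizes == [1, 8, 28, 56, 70, 56, 28, 8, 1]: class HW=0 covers 1/256 of the
# values while class HW=4 covers 70/256, motivating balancing such as SMOTE
assert sizes == [comb(8, w) for w in range(9)]
```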
<p>In [<xref ref-type="bibr" rid="ref-88">88</xref>], neural networks are trained using each bit of the intermediate data processed at the attack point as a separate label; consequently, for a subkey represented as a byte, a total of 8 labels are considered. This method is introduced to address the class imbalance challenge: unlike the HW leakage model, every bit exhibits an almost uniform distribution. They evaluate the multi-label approach on several datasets. For the three publicly available datasets ASCAD, AES_RD and AES_HD, the proposed multi-label model requires 202, 10 and 831 traces, respectively, for subkey recovery. As described in Section 3.1.1, the power traces in AES_RD are captured from an 8-bit Atmel AVR microcontroller implementation of AES-128 protected by a random delay countermeasure [<xref ref-type="bibr" rid="ref-69">69</xref>]. <xref ref-type="fig" rid="fig-6">Fig. 6a</xref> shows an example trace of the AES_RD dataset and <xref ref-type="table" rid="table-2">Table 2</xref> summarizes existing DLSCAs on the AES_RD dataset.</p>
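<p>The multi-label idea can be sketched as follows: each bit of the 8-bit intermediate value becomes its own binary label. The LSB-first bit order chosen here is an arbitrary illustrative convention.</p>

```python
def bit_labels(intermediate: int) -> list:
    # One binary label per bit: 8 labels per byte, each close to uniformly
    # distributed over random data, unlike the imbalanced HW classes.
    return [(intermediate >> i) & 1 for i in range(8)]

labels = bit_labels(0xA5)  # 0b10100101 -> [1, 0, 1, 0, 0, 1, 0, 1] (LSB first)

# Each bit label is perfectly balanced over all 256 byte values
per_bit_ones = [sum(bit_labels(v)[i] for v in range(256)) for i in range(8)]
```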
<fig id="fig-6">
<label>Figure 6</label>
<caption>
<title>Plots of traces captured from software implementations of AES. The first plot shows a power trace with random delay captured from an 8-bit AVR microcontroller implementation of AES-128 (AES_RD dataset) and the second plot shows a trace captured from an 8-bit Atmel ATXmega128D4 microcontroller implementation of AES-128</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_74473-fig-6.tif"/>
</fig><table-wrap id="table-2">
<label>Table 2</label>
<caption>
<title>Overview of current DLSCAs utilizing power consumption as the side channel on the AES_RD dataset</title>
</caption>
<table>
<colgroup>
<col align="center" width="25mm"/>
<col align="center" width="25mm"/>
<col align="center" width="25mm"/>
<col align="center" width="25mm"/> </colgroup>
<thead>
<tr>
<th>Work</th>
<th>Best classifier</th>
<th>Leakage model</th>
<th>Result</th>
</tr>
</thead>
<tbody>
<tr>
<td>[<xref ref-type="bibr" rid="ref-81">81</xref>]</td>
<td>Random forest</td>
<td>HW</td>
<td><inline-formula id="ieqn-57"><mml:math id="mml-ieqn-57"><mml:mo>&#x2248;</mml:mo></mml:math></inline-formula>1619 traces</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-80">80</xref>]</td>
<td>CNN</td>
<td>ID</td>
<td><inline-formula id="ieqn-58"><mml:math id="mml-ieqn-58"><mml:mo>&#x2248;</mml:mo></mml:math></inline-formula>600 traces to reach <inline-formula id="ieqn-59"><mml:math id="mml-ieqn-59"><mml:mi>P</mml:mi><mml:mi>G</mml:mi><mml:mi>E</mml:mi><mml:mo>&#x003C;</mml:mo><mml:mn>20</mml:mn></mml:math></inline-formula></td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-83">83</xref>]</td>
<td>CNN</td>
<td>ID</td>
<td>171 traces</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-84">84</xref>]</td>
<td>CNN</td>
<td>One bit</td>
<td>10 traces</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-85">85</xref>]</td>
<td>CNN</td>
<td>ID</td>
<td>5 traces</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-45">45</xref>]</td>
<td>TransNet</td>
<td>ID</td>
<td>2 traces</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-86">86</xref>]</td>
<td>InceptionNet</td>
<td>ID</td>
<td>3 traces</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-87">87</xref>]</td>
<td>CNN</td>
<td>ID</td>
<td>1953 traces</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>In 2022, the denoising autoencoder is applied in [<xref ref-type="bibr" rid="ref-89">89</xref>] to reduce or remove noise and countermeasure effects before the training process. The denoising autoencoder is trained using noisy-clean trace pairs. On the ASCAD dataset, the CNN model in [<xref ref-type="bibr" rid="ref-89">89</xref>] requires 831 traces to retrieve the correct key before denoising with the trained autoencoder; after denoising, this number decreases to 751. Afterwards, Zaid et al. [<xref ref-type="bibr" rid="ref-90">90</xref>] proposed a Conditional Variational AutoEncoder at TCHES 2023, which bridges DL models and SCA paradigms based on theoretical findings from stochastic attacks. Following the path of [<xref ref-type="bibr" rid="ref-89">89</xref>], Hu et al. [<xref ref-type="bibr" rid="ref-65">65</xref>] further design a multi-loss DAE model that makes power- and near-field EM-based SCAs more efficient, presented in TIFS in 2023. Despite their strength, deep-learning techniques face a well-known limitation: reliance on large datasets. This issue has received relatively little attention within the side-channel community. However, in practical scenarios, the quantity of profiling traces available to adversaries is often far smaller than assumed, constrained by factors such as limited preparation time or scheme-specific restrictions. As a result, improving the efficiency of the profiling stage in DLSCAs is also an important area of focus. Reference [<xref ref-type="bibr" rid="ref-91">91</xref>] proposes a Label Correlation (LC) based profiling method, which transforms the widely used one-hot labels into correlated label patterns to accelerate the convergence of model profiling. Their experiments demonstrate that both CNN and MLP models can successfully recover subkeys from ASCAD traces with only 10K profiling traces, underscoring the potential of this approach to improve profiling efficiency.
Reference [<xref ref-type="bibr" rid="ref-92">92</xref>] proposes a novel CPA method applicable to scenarios where cryptographic algorithms employ parallel S-box implementations, narrowing the performance gap between profiling and non-profiling SCA attacks.</p>
<p>Numerous existing architectures enhance model accuracy by stacking multiple network layers, which introduces several challenges: elevated algorithmic and computational complexity, overfitting, reduced training efficiency, and constrained feature-representation capacity. In addition, deep-learning methods depend on data correlation, and the presence of noise often reduces this correlation, making attacks more difficult. TNs exhibit a strong capability to capture dependencies between distant POIs in side-channel traces; leveraging this advantage, Hajra et al. [<xref ref-type="bibr" rid="ref-45">45</xref>] employ TNs to attack cryptographic implementations equipped with protective measures such as masking and desynchronization. Zhang et al. [<xref ref-type="bibr" rid="ref-93">93</xref>] propose a CNN-Transformer architecture, which integrates power analysis techniques to automatically identify and prioritize relevant POIs in side-channel power traces. Empirical evaluations demonstrate that, compared with LSTM and CNN models, this hybrid architecture achieves significantly higher attack efficiency in DLSCAs. In 2024, reference [<xref ref-type="bibr" rid="ref-50">50</xref>] proposes AMCNNet for better extraction of feature and temporal information from traces. Reference [<xref ref-type="bibr" rid="ref-86">86</xref>] applies a network structure based on InceptionNet to side-channel attacks. The proposed network employs fewer training parameters, attains accelerated convergence via parallel processing of input data, and enhances attack efficiency. Additionally, a network architecture based on LU-Net is put forward to denoise side-channel datasets. On the AES_RD and ASCAD datasets, 3 and 30 traces, respectively, are used to recover the subkeys.
Reference [<xref ref-type="bibr" rid="ref-87">87</xref>] designs a lightweight deep-learning model incorporating random convolutional kernels to address the exponentially increasing training time caused by excessive features. Compared with the most advanced methods, the number of required power traces and trainable parameters are reduced by over 70% and 94%, respectively. Reference [<xref ref-type="bibr" rid="ref-94">94</xref>] proposes a general side-channel evaluation metric called Leading Degree (LD) for assessing the performance of deep-learning models. By using LD as the reward function in reinforcement-learning-based hyperparameter tuning, a better model structure is obtained compared with previous state-of-the-art models.</p>
<sec id="s3_2_1">
<label>3.2.1</label>
<title>Hyperparameter Tuning</title>
<p>While many existing works focus on the attack efficiency of the designed model, they often keep their hyperparameterization secret. A comprehensive study of hyperparameter selection for deep-learning models across various side-channel attack scenarios has not been fully explored [<xref ref-type="bibr" rid="ref-95">95</xref>]. To address this limitation, reference [<xref ref-type="bibr" rid="ref-96">96</xref>] investigates the impact of modifying hyperparameters in MLP and CNN models for side-channel attacks. The methodologies for selecting hyperparameters can serve as guidance for subsequent researchers optimizing their own deep-learning models. Specifically, the study not only puts forward a customized hyperparameter selection framework for DL models, encompassing pivotal parameters such as batch size, epochs, filter configurations, and fully-connected layers, but also systematically elucidates how adjustments to these hyperparameters directly modulate critical model attributes, thereby exerting a decisive influence on the reliability of DL-SCA models in practical key recovery tasks. The impact of some hyperparameters on the model is shown in <xref ref-type="fig" rid="fig-7">Fig. 7</xref>. In [<xref ref-type="bibr" rid="ref-96">96</xref>], the best CNN model is able to recover a single subkey from a masked AES implementation on an 8-bit ATMega8515 microcontroller using around 200 traces without any desynchronization. For traces with a maximal desynchronization value of 50, it achieves a mean rank close to 20 with 5K traces. Where the maximum desynchronization value is 100, the model requires 5K traces to reach a mean rank near 40. In addition, reference [<xref ref-type="bibr" rid="ref-96">96</xref>] shows that attacks based on MLPs and CNNs are notably superior to template attacks when traces are desynchronized.</p>
<fig id="fig-7">
<label>Figure 7</label>
<caption>
<title>An analysis of deep learning for SCA using the ASCAD database, investigating the impacts of different hyperparameters (epochs, batch sizes, blocks, CONV layers, filters, fully-connected layers) and desynchronization levels on model performance</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_74473-fig-7.tif"/>
</fig>
<p>Hyperparameter optimization plays a vital role in enhancing both the performance and robustness of models used in SCAs. Traditional approaches such as grid search and random search systematically explore different hyperparameter combinations, though they can be computationally expensive, as shown in many SCA cases [<xref ref-type="bibr" rid="ref-65">65</xref>,<xref ref-type="bibr" rid="ref-97">97</xref>,<xref ref-type="bibr" rid="ref-98">98</xref>]. More sophisticated techniques such as Bayesian optimization offer a more efficient method by modeling the hyperparameter space probabilistically, allowing the search to focus on promising regions [<xref ref-type="bibr" rid="ref-99">99</xref>,<xref ref-type="bibr" rid="ref-100">100</xref>]. A traditional CNN model is optimized by incorporating an attention mechanism into the convolutional layers, thereby strengthening the model&#x2019;s capability to capture global information and improving the extraction of leakage information [<xref ref-type="bibr" rid="ref-101">101</xref>]. Another effective strategy is learning-rate scheduling, where the learning rate is adjusted dynamically during training, often using techniques such as cosine annealing, step decay, or adaptive methods [<xref ref-type="bibr" rid="ref-102">102</xref>]. To elucidate the impact of each hyperparameter of neural networks in side-channel attack scenarios during the feature selection phase, reference [<xref ref-type="bibr" rid="ref-85">85</xref>] employs three visualization techniques to illustrate the internal operations of models: weight visualization [<xref ref-type="bibr" rid="ref-103">103</xref>], gradient visualization [<xref ref-type="bibr" rid="ref-104">104</xref>], and heatmaps [<xref ref-type="bibr" rid="ref-105">105</xref>].
By employing these visualization approaches, reference [<xref ref-type="bibr" rid="ref-85">85</xref>] demonstrates different methodologies for building suitable CNN architectures for different attack scenarios. Their optimal model recovers a subkey from the DPA_V4 dataset using 3 traces. Compared with [<xref ref-type="bibr" rid="ref-80">80</xref>] and [<xref ref-type="bibr" rid="ref-81">81</xref>], they manage to lower the model&#x2019;s complexity without degrading model accuracy. For the AES_HD, ASCAD and AES_RD datasets, the CNN models in [<xref ref-type="bibr" rid="ref-85">85</xref>] require 1050, 191 and 5 traces, respectively, to recover the subkey. To address the threats posed by SCAs, one defensive measure involves injecting random noise to enhance security. Reference [<xref ref-type="bibr" rid="ref-106">106</xref>] proposes a novel denoising approach that combines wavelet coefficient analysis with Generative Adversarial Networks (GANs). This integration is tailored to mitigate noise artifacts induced by side-channel countermeasures. Through noise-reduction preprocessing on traces, only 108 traces are required on the ASCAD dataset, compared with [<xref ref-type="bibr" rid="ref-85">85</xref>].</p>
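<p>As a minimal illustration of the grid-search baseline mentioned above, the sketch below exhaustively scores hyperparameter combinations. The toy search space and scoring function are assumptions for illustration; a real DLSCA objective would be a side-channel metric, such as guessing entropy measured on validation traces.</p>

```python
from itertools import product

def grid_search(objective, space):
    # Exhaustive grid search: evaluate every hyperparameter combination
    # and keep the highest-scoring configuration.
    names = list(space)
    best_cfg, best_score = None, float("-inf")
    for values in product(*(space[n] for n in names)):
        cfg = dict(zip(names, values))
        score = objective(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

# Toy objective favoring small batches and more epochs (illustrative only)
space = {"batch_size": [32, 64, 128, 256], "epochs": [10, 50, 100]}
best, best_score = grid_search(lambda c: c["epochs"] - c["batch_size"], space)
```

Bayesian optimization, as used by AutoSCA, replaces the exhaustive loop with a surrogate model that proposes the next configuration to try.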
<p>Afterwards, References [<xref ref-type="bibr" rid="ref-99">99</xref>,<xref ref-type="bibr" rid="ref-100">100</xref>] propose two distinct methods for automatically tuning neural networks&#x2019; hyperparameters. Reference [<xref ref-type="bibr" rid="ref-100">100</xref>] builds a custom framework denoted as AutoSCA based on Bayesian Optimization, in which the model is selected over 50 iterations of testing various hyperparameter combinations. In every iteration, the Bayesian Optimization function generates a set of hyperparameters for model construction, followed by the training process. They compare different types of neural networks and show that the AutoSCA MLP model reaches the best performance for the ASCAD dataset with the shortest training time, using 129 traces to recover a subkey. Another automated hyperparameter tuning approach for SCA is proposed by [<xref ref-type="bibr" rid="ref-99">99</xref>], in which a reinforcement learning framework [<xref ref-type="bibr" rid="ref-107">107</xref>] with two reward functions based on side-channel metrics adjusts the hyperparameters of a convolutional neural network. In [<xref ref-type="bibr" rid="ref-99">99</xref>], the CNN model requires 202 traces to recover a subkey from the ASCAD dataset. Reference [<xref ref-type="bibr" rid="ref-108">108</xref>] formulates SCA problems as graph signal processing (GSP) problems. Leveraging the inherent advantage of GNNs in modeling and analyzing graph-structured data, the study further applies GNNs to SCA tasks to enhance the extraction and utilization of leakage-related features. <xref ref-type="table" rid="table-3">Table 3</xref> summarizes existing DLSCAs on the ASCAD dataset.</p>
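<p>The metric used throughout this section, the number of traces needed to recover a subkey, comes from accumulating per-trace classifier outputs into a ranking of key candidates: the attack succeeds once the true subkey reaches rank 0. The following is a minimal sketch using a toy 4-bit S-box (the real AES S-box has 256 entries); all names here are illustrative rather than taken from any cited implementation.</p>

```python
import numpy as np

# Toy 4-bit S-box: a random permutation standing in for the 8-bit AES S-box.
SBOX = np.array([12, 5, 6, 11, 9, 0, 10, 13, 3, 14, 15, 8, 4, 7, 1, 2])

def key_rank(probs, plaintexts, true_key):
    """Rank of the true subkey after accumulating per-trace log-likelihoods.

    probs: (n_traces, 16) classifier probabilities over S-box output values.
    plaintexts: (n_traces,) known plaintext nibbles.
    Rank 0 means the true key is the top candidate, i.e., the attack
    succeeds with this many traces.
    """
    scores = np.zeros(16)
    for k in range(16):
        hyp = SBOX[plaintexts ^ k]  # hypothetical labels under key guess k
        scores[k] = np.log(probs[np.arange(len(plaintexts)), hyp] + 1e-40).sum()
    order = np.argsort(scores)[::-1]  # best guess first
    return int(np.where(order == true_key)[0][0])
```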
<table-wrap id="table-3">
<label>Table 3</label>
<caption>
<title>Summary of existing DLSCAs on the ASCAD dataset</title>
</caption>
<table>
<colgroup>
<col align="center" width="25mm"/>
<col align="center" width="25mm"/>
<col align="center" width="25mm"/>
<col align="center" width="25mm"/> </colgroup>
<thead>
<tr>
<th>Work</th>
<th>Best classifier</th>
<th>Leakage model</th>
<th>Result</th>
</tr>
</thead>
<tbody>
<tr>
<td>[<xref ref-type="bibr" rid="ref-49">49</xref>]</td>
<td>AE &#x002B; TA</td>
<td>ID</td>
<td>160 traces to reach <inline-formula id="ieqn-60"><mml:math id="mml-ieqn-60"><mml:mi>P</mml:mi><mml:mi>G</mml:mi><mml:mi>E</mml:mi><mml:mo>&#x003C;</mml:mo><mml:mn>2</mml:mn></mml:math></inline-formula></td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-83">83</xref>]</td>
<td>CNN</td>
<td>ID</td>
<td>552 traces</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-84">84</xref>]</td>
<td>CNN</td>
<td>One bit</td>
<td>202 traces</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-85">85</xref>]</td>
<td>CNN</td>
<td>ID</td>
<td>191 traces</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-86">86</xref>]</td>
<td>InceptionNet</td>
<td>ID</td>
<td>30 traces</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-99">99</xref>]</td>
<td>CNN</td>
<td>ID</td>
<td>202 traces</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-100">100</xref>]</td>
<td>MLP</td>
<td>ID</td>
<td>129 traces</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-45">45</xref>]</td>
<td>TransNet</td>
<td>ID</td>
<td><inline-formula id="ieqn-61"><mml:math id="mml-ieqn-61"><mml:mo>&#x2248;</mml:mo></mml:math></inline-formula>200 traces</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-50">50</xref>]</td>
<td>AMCNNet</td>
<td>ID</td>
<td>155 traces</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-101">101</xref>]</td>
<td>CNN</td>
<td>ID</td>
<td>48 traces</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-106">106</xref>]</td>
<td>CNN</td>
<td>ID</td>
<td>108 traces</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-94">94</xref>]</td>
<td>CNN</td>
<td>ID</td>
<td>102 traces</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="s3_2_2">
<label>3.2.2</label>
<title>Board Diversity</title>
<p>Most deep-learning based side-channel attacks on AES reported by 2019 do not account for the influence of board diversity. They train and test the deep-learning models on traces from the same board, which is unrealistic in a genuine attack scenario.</p>
<p>To examine the impact of board diversity on the efficiency of DLSCAs, Das et al. [<xref ref-type="bibr" rid="ref-109">109</xref>] experimentally demonstrate the success rate gap when the trained model is applied to the profiling device vs. the victim device, as presented at the DAC conference. They train deep-learning models using traces obtained from one 8-bit ATxmega128D4 microcontroller implementation of AES and test this model using traces obtained from another board with the same chip and the same version of AES. Afterwards, Wang et al. [<xref ref-type="bibr" rid="ref-97">97</xref>] investigate to what extent the trained models&#x2019; attack efficiency degrades when targeting a victim device with the same implementation but on a different Printed Circuit Board (PCB) than the profiling device. Although the MLP model in [<xref ref-type="bibr" rid="ref-97">97</xref>] recovers the key from the training board in 88.5% of the cases, the success rate drops significantly to 13.7% for the testing board. This demonstrates the potential for overestimating classification accuracy when a model is trained and tested on traces obtained from the same equipment. Notably, the model structure in [<xref ref-type="bibr" rid="ref-97">97</xref>] is relatively simple compared to the complex architectures typically used in computer vision and natural language processing, particularly because the captured traces exhibit a high level of leakage. For instance, the peak leakage value, represented by the SOST value, for the first subkey derived from 5K power traces collected from the ATxmega128D4 microcontroller&#x2019;s AES implementation, is approximately 70. This value is roughly 15 times higher than the detectable leakage threshold (SOST value <inline-formula id="ieqn-62"><mml:math id="mml-ieqn-62"><mml:mo>&#x003C;</mml:mo><mml:mn>4.5</mml:mn></mml:math></inline-formula>), enabling adversaries to construct relatively simple models.</p>
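<p>The SOST metric quoted above can be computed per sample point as the sum of squared pairwise t-statistics between class-conditional trace means; the points with the highest values leak the most. The sketch below uses our own function and variable names and is not drawn from the cited implementations.</p>

```python
import numpy as np

def sost(traces, labels, n_classes):
    """Sum Of Squared pairwise T-differences per sample point.

    traces: (n_traces, n_samples); labels: (n_traces,) class of each trace.
    High SOST values mark sample points with strong side-channel leakage.
    """
    means, var_over_n = [], []
    for c in range(n_classes):
        grp = traces[labels == c]
        means.append(grp.mean(axis=0))
        var_over_n.append(grp.var(axis=0) / len(grp))  # Welch's t denominator term
    score = np.zeros(traces.shape[1])
    for i in range(n_classes):
        for j in range(i + 1, n_classes):
            score += (means[i] - means[j]) ** 2 / (var_over_n[i] + var_over_n[j] + 1e-12)
    return score
```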
<p>To alleviate the influence of board diversity and improve the efficiency of deep-learning models, a data-level aggregation approach is proposed by [<xref ref-type="bibr" rid="ref-84">84</xref>,<xref ref-type="bibr" rid="ref-109">109</xref>,<xref ref-type="bibr" rid="ref-110">110</xref>]. The fundamental concept of the data-level aggregation approach is to train DL models using traces obtained from multiple boards, rather than one, as shown in <xref ref-type="fig" rid="fig-8">Fig. 8a</xref>. For example, in [<xref ref-type="bibr" rid="ref-84">84</xref>], 9 profiling devices (the same implementations as in [<xref ref-type="bibr" rid="ref-97">97</xref>]) are used to train an MLP model, raising the probability of key recovery from a single trace from 40.0% to 86.1%. To further increase the attack efficiency, reference [<xref ref-type="bibr" rid="ref-110">110</xref>] uses 30 devices for profiling and preprocesses the captured traces with the PCA technique before the profiling stage. As a result, they successfully achieve a <inline-formula id="ieqn-63"><mml:math id="mml-ieqn-63"><mml:mo>&#x2265;</mml:mo><mml:mspace width="negativethinmathspace" /><mml:mspace width="negativethinmathspace" /><mml:mn>90</mml:mn><mml:mi mathvariant="normal">&#x0025;</mml:mi></mml:math></inline-formula> single-trace attack accuracy to break the same 8-bit ATXmega128D4 microcontroller implementation of AES as in [<xref ref-type="bibr" rid="ref-84">84</xref>,<xref ref-type="bibr" rid="ref-97">97</xref>,<xref ref-type="bibr" rid="ref-109">109</xref>]. <xref ref-type="fig" rid="fig-6">Fig. 6b</xref> shows an example power trace obtained from an 8-bit ATXmega128D4 microcontroller implementation of AES, which represents 16 SBox operations. <xref ref-type="table" rid="table-4">Table 4</xref> summarizes the analyzed DLSCAs on ATXmega128D4 microcontroller implementations of AES. 
Reference [<xref ref-type="bibr" rid="ref-116">116</xref>] proposes an automated trace segmentation approach based on reinforcement learning, which is applicable to a broad range of common implementations of public-key algorithms. The authors experimentally verify the transferability of the proposed framework on over ten datasets. Reference [<xref ref-type="bibr" rid="ref-115">115</xref>] introduces Switch-T, a neural architecture grounded in the Transformer framework. This approach integrates elastic weight consolidation (EWC) with a multi-task learning paradigm to facilitate coordinated multi-task adversarial attacks. The experimental results demonstrate that the Switch-T model is capable of retaining the knowledge acquired from prior tasks in cross-architecture and cross-channel SCA scenarios. Reference [<xref ref-type="bibr" rid="ref-117">117</xref>] proposes a novel cross-device attack method, which integrates the Denoising Diffusion Probability Model (DDPM) for universal model construction, adopts an adaptive multi-task loss function to balance multiple training objectives, and evaluates the proposed strategy on five cross-device SCA datasets.</p>
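<p>The data-level aggregation with PCA preprocessing described above can be sketched as pooling traces from several profiling boards and projecting them onto the top principal components before training. This is a simplified sketch under our own naming; the exact pipelines of the cited works may differ.</p>

```python
import numpy as np

def aggregate_and_project(trace_sets, label_sets, n_components=20):
    """Data-level aggregation: pool traces from several profiling boards,
    then project them onto the top principal components (PCA via SVD).

    trace_sets: list of (n_i, n_samples) arrays, one per board.
    label_sets: list of (n_i,) label arrays matching trace_sets.
    """
    X = np.concatenate(trace_sets, axis=0)
    y = np.concatenate(label_sets, axis=0)
    Xc = X - X.mean(axis=0)                      # center before PCA
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)  # rows of Vt = components
    return Xc @ Vt[:n_components].T, y
```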
<fig id="fig-8">
<label>Figure 8</label>
<caption>
<title>Illustration of different aggregation methods to mitigate the impact caused by the board diversity in DLSCAs on AES. <xref ref-type="fig" rid="fig-8">Fig. 8a</xref> presents data-level aggregation, where a model is trained on traces from different devices. <xref ref-type="fig" rid="fig-8">Fig. 8b</xref> shows model-level aggregation for SCA based on the horizontal federated learning framework, with N participants jointly building a federated deep-learning model. <xref ref-type="fig" rid="fig-8">Fig. 8c</xref> illustrates output-level aggregation: N classifiers are independently trained on traces from N devices. (<bold>a</bold>) Data-level aggregation approach [<xref ref-type="bibr" rid="ref-109">109</xref>,<xref ref-type="bibr" rid="ref-110">110</xref>]. (<bold>b</bold>) Model-level aggregation approach [<xref ref-type="bibr" rid="ref-111">111</xref>]. (<bold>c</bold>) Output-level aggregation approach [<xref ref-type="bibr" rid="ref-111">111</xref>]</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_74473-fig-8.tif"/>
</fig><table-wrap id="table-4">
<label>Table 4</label>
<caption>
<title>Summary of existing DLSCAs on ATxmega128D4 microcontroller implementations of AES-128</title>
</caption>
<table>
<colgroup>
<col align="center" width="25mm"/>
<col align="center" width="25mm"/>
<col align="center" width="25mm"/>
<col align="center" width="35mm"/> </colgroup>
<thead>
<tr>
<th>Work</th>
<th>Best classifier</th>
<th>Leakage model</th>
<th>Result</th>
</tr>
</thead>
<tbody>
<tr>
<td>[<xref ref-type="bibr" rid="ref-97">97</xref>]</td>
<td>MLP</td>
<td>ID</td>
<td>13.7% with 1 trace</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-109">109</xref>]</td>
<td>MLP</td>
<td>ID</td>
<td>&#x003E;90.0% with 1 trace</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-110">110</xref>]</td>
<td>MLP</td>
<td>ID</td>
<td>91.72% with 1 trace</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-111">111</xref>]</td>
<td>MLP</td>
<td>ID</td>
<td>77.7% with 1 trace</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-84">84</xref>]</td>
<td>MLP</td>
<td>ID</td>
<td>88.7% with 1 trace</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-112">112</xref>]</td>
<td>CNN</td>
<td>ID</td>
<td>93.0% with 1 trace</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-93">93</xref>]</td>
<td>CNN-Transformer</td>
<td>ID</td>
<td>99.76% with 1 trace</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-113">113</xref>]</td>
<td>DNN&#x002B;VA</td>
<td>HW</td>
<td>901 traces</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-114">114</xref>]</td>
<td>MLP</td>
<td>ID</td>
<td>1 trace</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-115">115</xref>]</td>
<td>Switch-T Transformer</td>
<td>ID</td>
<td>98.7% with 1 trace</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>When datasets are dispersed across various sources, matching data while preserving the privacy of each dataset is a widely demanded capability. Preserving the privacy of distributed data, particularly data stored on individual edge devices, is consistently crucial [<xref ref-type="bibr" rid="ref-118">118</xref>]. Another technique to combine multiple deep-learning models is the well-known Federated Learning (FL) [<xref ref-type="bibr" rid="ref-119">119</xref>]. FL allows several participants to train a global deep-learning model without sharing their individual local training data [<xref ref-type="bibr" rid="ref-120">120</xref>]. Like any significant scientific breakthrough, FL can also be employed for malicious purposes. Reference [<xref ref-type="bibr" rid="ref-111">111</xref>] uses the FL framework as a model-level aggregation for deep-learning based SCA. <xref ref-type="fig" rid="fig-8">Fig. 8b</xref> illustrates the model-level aggregation of DLSCAs to reduce the impact brought about by board diversity. In their experiments, three 8-bit ATxmega128D4 microcontroller implementations of AES are used to train three MLP models, which are aggregated into one global MLP. The federated model in [<xref ref-type="bibr" rid="ref-111">111</xref>] achieves a 77.7% average single-trace attack accuracy even though it takes board diversity into account. Another aggregation approach proposed in [<xref ref-type="bibr" rid="ref-111">111</xref>] operates at the output level, using the ensemble learning scheme [<xref ref-type="bibr" rid="ref-121">121</xref>] to integrate the classification outcomes of multiple models trained on various profiling devices, as shown in <xref ref-type="fig" rid="fig-8">Fig. 8c</xref>.</p>
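<p>The model-level and output-level aggregation ideas above reduce, respectively, to weighted averaging of the participants&#x2019; model weights (as in FedAvg) and averaging of the classifiers&#x2019; probability outputs. The sketch below is illustrative and is not the exact implementation of the cited work.</p>

```python
import numpy as np

def fedavg(local_weights, n_samples):
    """Model-level aggregation (FedAvg-style): average each layer's weights
    across participants, weighted by the number of local training traces.

    local_weights: list (one per participant) of lists of weight arrays.
    """
    total = sum(n_samples)
    return [
        sum(w[i] * (n / total) for w, n in zip(local_weights, n_samples))
        for i in range(len(local_weights[0]))
    ]

def ensemble_predict(prob_outputs):
    """Output-level aggregation: average the classifiers' probability
    vectors and pick the most likely class for each trace."""
    return np.mean(prob_outputs, axis=0).argmax(axis=1)
```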

<p>Subsequently, reference [<xref ref-type="bibr" rid="ref-64">64</xref>] demonstrates the first deep-learning power analysis of a commercial USIM card implementing MILENAGE (based on AES). The researchers train a CNN model on one USIM and demonstrate its ability to recover the key from a different USIM using an average of only 20 traces. Afterwards, in 2023, reference [<xref ref-type="bibr" rid="ref-122">122</xref>] introduces a novel method to address the portability issue: a neural network layer evaluation method based on the ablation paradigm. By evaluating the sensitivity and resilience of each layer, this approach provides useful insights for constructing a Multiple Device Model from a Single Device (MDMSD). Physical side-channel analysis generally assumes that the plaintext or ciphertext is known, but this premise often breaks down in practice. Blind SCA tackles this challenge by operating without knowledge of plaintext or ciphertext. Reference [<xref ref-type="bibr" rid="ref-113">113</xref>] proposes the first successful blind SCA against hiding countermeasures, introducing the Multi-point Cluster-based (MC) labeling technique, validated on four datasets covering symmetric-key algorithms (AES, ASCON) and the post-quantum cryptography algorithm Kyber. Reference [<xref ref-type="bibr" rid="ref-114">114</xref>] investigates the portability of deep-learning based side-channel attacks on EM traces and performs a comparative analysis of a set of preprocessing and unsupervised domain-shift methods; it also provides a large-scale public dataset for benchmarking and reproducing side-channel attacks that handle domain shifts over EM traces.</p>
<p>Although the varied power profiles of IoT devices present a unique challenge for DLSCAs, adversaries still pose a considerable threat. As mentioned earlier, an attacker can prepare a dedicated deep-learning model for a specific target by acquiring an identical device from the market, allowing them to train on highly similar power traces. This approach enables precise modeling of the device&#x2019;s power consumption characteristics, significantly improving the attack&#x2019;s effectiveness. Since many consumer IoT devices are mass-produced with identical components and firmware, attackers can exploit this uniformity to extract sensitive information like cryptographic keys or executed operations with high precision.</p>
<p>However, when the target is a dedicated or custom-built device, and the adversary has no opportunity to obtain an identical copy, the challenge becomes more significant. Variations in architecture, firmware optimizations, and even the version of the cryptographic algorithm introduce discrepancies in power consumption, thereby increasing the difficulty for attackers to generalize a pre-trained model.</p>
</sec>
</sec>
<sec id="s3_3">
<label>3.3</label>
<title>Power Analysis of Hardware Implementations</title>
<p>In software implementations of AES, leakage is spread over time and tends to be less noisy due to the sequential execution of instructions. This characteristic makes it easier for deep-learning models to explore the features linked to each subkey through a divide-and-conquer strategy. In contrast, traces obtained from hardware implementations often exhibit overlapping features arising from multiple concurrent operations as a result of parallel execution. Consequently, side-channel attacks become intrinsically more difficult, especially with advanced process technologies. Hardware implementations of AES offer several advantages over software implementations:
<list list-type="simple">
<list-item><label>1.</label><p>Speed and efficiency. Hardware implementations of AES can provide significantly faster encryption and decryption speeds compared to software implementations [<xref ref-type="bibr" rid="ref-123">123</xref>].</p></list-item>
<list-item><label>2.</label><p>Lower power consumption. Hardware implementations of AES can be more power-efficient compared to software implementations, especially in resource-constrained devices [<xref ref-type="bibr" rid="ref-124">124</xref>].</p></list-item>
<list-item><label>3.</label><p>Physical security. Encryption and decryption operations are performed within dedicated hardware components, rendering it more difficult for attackers to retrieve sensitive data via side-channel attacks or software vulnerabilities [<xref ref-type="bibr" rid="ref-95">95</xref>].</p></list-item>
</list></p>
<p>Many existing attacks targeting hardware implementations of AES are founded on two widely recognized datasets: DPA contest V2 [<xref ref-type="bibr" rid="ref-125">125</xref>] and AES_HD [<xref ref-type="bibr" rid="ref-81">81</xref>]. These datasets&#x2019; traces are obtained from AES implementations on Xilinx Virtex-5 FPGA series. <xref ref-type="table" rid="table-5">Table 5</xref> summarizes some existing attack results on these two well-known datasets.</p>
<table-wrap id="table-5">
<label>Table 5</label>
<caption>
<title>Summary of existing DLSCAs on AES_HD, DPA_V2 and DPA_V4 datasets</title>
</caption>
<table>
<colgroup>
<col align="center" width="20mm"/>
<col align="center" width="20mm"/>
<col align="center" width="25mm"/>
<col align="center" width="25mm"/>
<col align="center" width="35mm"/> </colgroup>
<thead>
<tr>
<th>Work</th>
<th>Dataset</th>
<th>Best classifier</th>
<th>Leakage model</th>
<th>Result</th>
</tr>
</thead>
<tbody>
<tr>
<td>[<xref ref-type="bibr" rid="ref-56">56</xref>]</td>
<td>DPA_V2</td>
<td>CNN</td>
<td>ID</td>
<td><inline-formula id="ieqn-64"><mml:math id="mml-ieqn-64"><mml:mo>&#x2248;</mml:mo></mml:math></inline-formula>200 traces (full key)</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-81">81</xref>]</td>
<td>DPA_V4</td>
<td>SVM</td>
<td>HW</td>
<td>3 traces</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-81">81</xref>]</td>
<td>AES_HD</td>
<td>TA</td>
<td>HW</td>
<td>700 traces</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-80">80</xref>]</td>
<td>AES_HD</td>
<td>CNN</td>
<td>ID</td>
<td><inline-formula id="ieqn-65"><mml:math id="mml-ieqn-65"><mml:mo>&#x2248;</mml:mo></mml:math></inline-formula>800 traces to reach <inline-formula id="ieqn-66"><mml:math id="mml-ieqn-66"><mml:mi>P</mml:mi><mml:mi>G</mml:mi><mml:mi>E</mml:mi><mml:mo>&#x003C;</mml:mo><mml:mn>50</mml:mn></mml:math></inline-formula></td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-49">49</xref>]</td>
<td>DPA_V2</td>
<td>AE &#x002B; TA</td>
<td>ID</td>
<td>450 traces</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-83">83</xref>]</td>
<td>AES_HD</td>
<td>CNN</td>
<td>ID</td>
<td><inline-formula id="ieqn-67"><mml:math id="mml-ieqn-67"><mml:mo>&#x2248;</mml:mo></mml:math></inline-formula>2.1K traces</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-83">83</xref>]</td>
<td>DPA_V4</td>
<td>CNN</td>
<td>ID</td>
<td><inline-formula id="ieqn-68"><mml:math id="mml-ieqn-68"><mml:mo>&#x2248;</mml:mo></mml:math></inline-formula>3 traces</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-85">85</xref>]</td>
<td>AES_HD</td>
<td>CNN</td>
<td>ID</td>
<td>1050 traces</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-84">84</xref>]</td>
<td>AES_HD</td>
<td>CNN</td>
<td>One bit</td>
<td>831 traces</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-85">85</xref>]</td>
<td>DPA_V4</td>
<td>CNN</td>
<td>ID</td>
<td>3 traces</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-108">108</xref>]</td>
<td>DPA_V4</td>
<td>GNN</td>
<td>HW</td>
<td>10 traces</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-45">45</xref>]</td>
<td>DPA_V4</td>
<td>TransNet</td>
<td>ID</td>
<td>2 traces</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-45">45</xref>]</td>
<td>AES_HD</td>
<td>TransNet</td>
<td>ID</td>
<td><inline-formula id="ieqn-69"><mml:math id="mml-ieqn-69"><mml:mo>&#x2248;</mml:mo></mml:math></inline-formula>900 traces</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-92">92</xref>]</td>
<td>DPA_V2</td>
<td>CPA</td>
<td>HD</td>
<td>495 traces</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-50">50</xref>]</td>
<td>DPA_V4</td>
<td>AMCNNet</td>
<td>ID</td>
<td>1 trace</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-50">50</xref>]</td>
<td>AES_HD</td>
<td>AMCNNet</td>
<td>ID</td>
<td>683 traces</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-87">87</xref>]</td>
<td>AES_HD</td>
<td>CNN</td>
<td>ID</td>
<td>1974 traces</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-86">86</xref>]</td>
<td>DPA_V4</td>
<td>InceptionNet</td>
<td>ID</td>
<td>1 trace</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>In [<xref ref-type="bibr" rid="ref-81">81</xref>], the random forest approach requires more than 5000 traces to retrieve a subkey from the Virtex-5 implementation of AES. Reference [<xref ref-type="bibr" rid="ref-126">126</xref>] investigates the theoretical soundness of CNNs in the context of SCAs on hardware implementations of AES. References [<xref ref-type="bibr" rid="ref-56">56</xref>,<xref ref-type="bibr" rid="ref-79">79</xref>,<xref ref-type="bibr" rid="ref-80">80</xref>] demonstrate effective attacks on Virtex-5 FPGAs through the application of CNNs. In [<xref ref-type="bibr" rid="ref-80">80</xref>], CNN models trained on data with additional non-task-specific noise are able to recover a subkey with 25,000 traces for the AES_HD dataset.</p>
<p>Apart from Xilinx Virtex-5 FPGAs, a non-profiled attack [<xref ref-type="bibr" rid="ref-127">127</xref>] employs roughly 3.7K traces to compromise a lightweight Artix-7 FPGA implementation of AES. Additionally, reference [<xref ref-type="bibr" rid="ref-128">128</xref>] demonstrates the efficacy of CNN-based side-channel attacks on ASICs. Afterwards, reference [<xref ref-type="bibr" rid="ref-102">102</xref>] introduces a multi-point attack framework named the tandem scheme to attack hardware implementations of AES. This approach combines the classification outcomes of CNN models trained at various attack points. In [<xref ref-type="bibr" rid="ref-102">102</xref>], the proposed tandem scheme successfully recovers a subkey from a Xilinx Artix-7 FPGA implementation of AES using 219 traces.</p>
<p>When targeting advanced embedded system implementations of AES with parallel computing and countermeasures, the preprocessing step gains significant importance, especially for reducing the noise level in the captured traces. First, electronic noise inherently exists in cryptographic devices, particularly in hardware implementations. Moreover, noise serves as a fundamental component in many countermeasures designed to mitigate SCA [<xref ref-type="bibr" rid="ref-49">49</xref>]. These noise sources significantly diminish the Signal-to-Noise Ratio (SNR) of side-channel leakages, thus increasing the complexity of side-channel attacks. To reduce the noise level in captured power or EM traces, the SCA community has begun to use deep learning techniques, especially autoencoders, to mitigate the effects caused by environmental noise or countermeasures. In [<xref ref-type="bibr" rid="ref-49">49</xref>], the training of the Convolutional Denoising Autoencoder (CDAE) involves learning a non-linear mapping that converts noisy traces into clean ones during the profiling phase. In the attack stage, reference [<xref ref-type="bibr" rid="ref-49">49</xref>] first uses the trained CDAE to obtain &#x2018;clean&#x2019; traces. Afterwards, these traces are fed to a trained classifier to extract the secret key. They test their model on two publicly available datasets. For the DPA_V2 dataset, they use the ID of the register value written in the final round as the power model to train the classifier. By utilizing the trained CDAE model for denoising, they recover the key using 450 traces. For the ASCAD dataset, 160 traces suffice to reach a guessing entropy (GE) below 2.</p>
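<p>To make the denoising idea concrete, the following is a heavily simplified stand-in for a denoising autoencoder: a one-hidden-layer linear autoencoder trained by gradient descent to map synthetic noisy traces back to their clean versions. All data, dimensions, and hyperparameters here are synthetic and illustrative; a real CDAE uses convolutional layers, non-linear activations, and actual profiling traces.</p>

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-ins for side-channel traces: "clean" sinusoidal signals
# of varying frequency, plus Gaussian noise mimicking countermeasure noise.
n, d, h = 256, 32, 8
t = np.linspace(0, 2 * np.pi, d)
clean = np.sin(np.outer(rng.uniform(0.5, 2.0, n), t))
noisy = clean + rng.normal(0.0, 0.5, clean.shape)

# One-hidden-layer linear autoencoder, trained to map noisy -> clean
# (the denoising objective: reconstruct the clean trace from the noisy one).
W1 = rng.normal(0, 0.1, (d, h))   # encoder weights
W2 = rng.normal(0, 0.1, (h, d))   # decoder weights
lr = 0.05
for _ in range(1000):
    Z = noisy @ W1                 # encode
    err = Z @ W2 - clean           # reconstruction error vs. the clean trace
    g2 = Z.T @ err / n             # gradient of mean squared error w.r.t. W2
    g1 = noisy.T @ (err @ W2.T) / n  # gradient w.r.t. W1
    W1 -= lr * g1
    W2 -= lr * g2

mse_before = ((noisy - clean) ** 2).mean()           # raw noise level
mse_after = ((noisy @ W1 @ W2 - clean) ** 2).mean()  # after denoising
```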
<p><xref ref-type="fig" rid="fig-9">Fig. 9</xref> shows example traces obtained from hardware implementations of AES. <xref ref-type="fig" rid="fig-9">Fig. 9a</xref> shows a power trace obtained from a Xilinx Virtex-5 FPGA implementation of AES-128 (AES_HD dataset) and <xref ref-type="fig" rid="fig-9">Fig. 9b</xref> shows a trace obtained from a Xilinx Artix-7 FPGA implementation of AES-128. In <xref ref-type="fig" rid="fig-6">Fig. 6b</xref> in the section on software implementations, distinct operations and executions associated with different subkeys are clearly observable. In <xref ref-type="fig" rid="fig-9">Fig. 9b</xref>, however, the trace segment representing the 10-round operation of AES-128 on an Artix-7 FPGA shows the entire encryption completing within a few clock cycles. Intuitively, the task of training deep-learning models becomes more challenging for adversaries due to the need to deal with overlapping features. Besides, comparing the existing DLSCA results between software and hardware implementations of AES (as shown in the tables above) makes it evident that hardware implementations of AES consistently exhibit higher resistance against side-channel attacks than software implementations. Considering this, our recommendation for designing a cryptographic module with a focus on DLSCA resistance is to employ hardware implementations accompanied by suitable countermeasures. Unfortunately, many cryptographic modules, particularly those utilized in lightweight IoT edge computing devices, continue to rely on unprotected software implementations due to resource limitations [<xref ref-type="bibr" rid="ref-17">17</xref>]. This leaves a significant number of systems vulnerable to practical, non-invasive, and highly effective attacks. It is imperative to recognize the existing threat landscape and take proactive measures to address these vulnerabilities.</p>
<fig id="fig-9">
<label>Figure 9</label>
<caption>
<title>Graphs of traces collected from hardware implementations of AES. The initial figure depicts a power trace collected from a Xilinx Virtex-5 FPGA of AES-128 (AES_HD dataset) and the second plot shows a power trace obtained from a Xilinx Artix-7 FPGA implementation of AES-128</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_74473-fig-9.tif"/>
</fig>
</sec>
</sec>
<sec id="s4">
<label>4</label>
<title>Discussion</title>
<p>This section begins with a discussion of the performance of deep learning models used in SCAs. Afterwards, we discuss the unresolved issues in DLSCAs on AES, followed by an exploration of potential mitigation strategies.</p>
<sec id="s4_1">
<label>4.1</label>
<title>Technical Taxonomy</title>
<p>The characteristics of the dataset exert a direct influence on the selection of DL models. For sequential side-channel trace datasets (e.g., ASCAD, AES_RD), where traces exhibit temporal or local feature correlations, CNNs are well-suited for extracting local sequential features through their convolutional layers [<xref ref-type="bibr" rid="ref-96">96</xref>], while Transformers demonstrate superior performance in capturing long-range dependencies via self-attention mechanisms [<xref ref-type="bibr" rid="ref-115">115</xref>]. For datasets with high noise levels and low feature dimensions (e.g., early traces in DPA_V2), MLPs are applicable for basic feature fitting; their performance can be further improved through integration with CNNs or Transformers to achieve enhanced robustness. For graph-structured datasets (e.g., datasets where side-channel trace dependencies are modeled as graphs), GNNs are the models best suited to effectively learning the inter-node relationships within the graph [<xref ref-type="bibr" rid="ref-47">47</xref>].</p>
<p>The complexity of leakage patterns dictates the level of &#x201C;expressiveness&#x201D; demanded of a DL model. For the ID leakage model, in which leakage exhibits a direct mapping to sensitive data and follows a simple pattern, the multilayer perceptron structure of MLPs can adequately fit such linear or weakly nonlinear relationships [<xref ref-type="bibr" rid="ref-88">88</xref>]. For the HD and HW leakage models, in which leakage involves fine-grained patterns related to bit differences or bit weights, the local feature extraction capability of CNNs and the global pattern capture ability of Transformers prove crucial; these models enable the excavation of subtle bit-level patterns corresponding to HD/HW from side-channel traces. In summary, the &#x201C;complexity level&#x201D; of the leakage model requires that the DL models possess commensurate feature learning capabilities, ranging from basic fitting to advanced pattern recognition. A technical taxonomy for DL-based side-channel analyses is shown in <xref ref-type="fig" rid="fig-10">Fig. 10</xref>.</p>
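<p>The leakage models above can be made concrete as labeling functions applied to an intermediate value, such as an S-box output byte or a register update. The sketch below uses our own function names.</p>

```python
def identity_label(v):
    """ID model: the intermediate byte itself (256 classes for a byte)."""
    return v

def hamming_weight(v):
    """HW model: number of set bits in the value (9 classes for a byte)."""
    return bin(v).count("1")

def hamming_distance(v_prev, v_next):
    """HD model: number of bits flipped when a register changes value."""
    return hamming_weight(v_prev ^ v_next)
```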
<fig id="fig-10">
<label>Figure 10</label>
<caption>
<title>A technical taxonomy for DL-based side-channel analyses, comprising leakage models, DL models, and datasets</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_74473-fig-10.tif"/>
</fig>
</sec>
<sec id="s4_2">
<label>4.2</label>
<title>Analysis on Deep Learning Models</title>
<p>Deep learning methods have grown progressively prominent within the domain of SCAs due to their capacity to deal with complex patterns from side-channel measurements. As highlighted in this work, well-trained deep learning models have proven to be significantly more efficient at extracting secret keys from side-channel traces compared to traditional signal processing methods, especially when the leakage in the captured traces is relatively low. In this subsection, we expand on how different deep learning models discussed in this paper are suited for various attack scenarios.</p>
<p>MLPs are among the simplest deep learning models used in SCAs, typically applied when the side-channel traces are already pre-processed or feature-engineered. For example, as described in <xref ref-type="sec" rid="s3">Section 3</xref>, it is feasible to achieve a high single-trace success rate by using a straightforward MLP model when targeting ATXmega128D4 implementations of AES-128. This is because the traces leaked from the target are typically well-synchronized and noise-filtered.</p>
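<p>As a hedged illustration (not a model from any cited work), the forward pass of such a profiling MLP can be sketched in numpy; the layer sizes below (700 samples per trace, 64 hidden units, 256 key-byte classes) are hypothetical choices, and real attacks train the weights on profiling traces:</p>

```python
import numpy as np

def mlp_forward(trace, W1, b1, W2, b2):
    """Forward pass of a one-hidden-layer MLP profiling model.

    trace: pre-processed side-channel trace (feature vector).
    Returns a probability distribution over the 256 values of one key byte.
    """
    h = np.maximum(0.0, trace @ W1 + b1)   # ReLU hidden layer
    logits = h @ W2 + b2
    e = np.exp(logits - logits.max())      # numerically stable softmax
    return e / e.sum()

# Hypothetical, untrained weights just to exercise the shapes.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(0, 0.05, (700, 64)), np.zeros(64)
W2, b2 = rng.normal(0, 0.05, (64, 256)), np.zeros(256)
probs = mlp_forward(rng.normal(0, 1, 700), W1, b1, W2, b2)
```

In an actual attack, the 256-way output is accumulated over many traces to score key hypotheses.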
<p>CNNs, by contrast, excel at capturing spatial hierarchies within data, making them well-suited to side-channel traces, especially noisy or desynchronized traces, or traces from devices implementing countermeasures. CNNs apply convolutional layers to extract local features, which lets them identify subtle leakage patterns without manual feature engineering. In side-channel analysis, CNNs can learn both spatial (across multiple samples) and temporal (across successive clock cycles) correlations, which is critical for capturing key-dependent information from traces. For instance, CNNs have been shown to outperform MLPs when traces contain low leakage levels or are noisy, as they are better at identifying the underlying features that correspond to the secret key. A major benefit of CNNs is their capacity to work with raw side-channel data, reducing reliance on prior data-processing steps and improving attack efficiency, especially on large-scale datasets.</p>
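<p>A minimal numpy sketch of why a convolutional feature followed by global max pooling tolerates desynchronization; the kernel and trace values below are synthetic and purely illustrative:</p>

```python
import numpy as np

def conv_max_response(trace, kernel):
    """Valid cross-correlation of a learned kernel with a trace,
    followed by global max pooling, as in a CNN feature extractor."""
    n = len(trace) - len(kernel) + 1
    responses = np.array([np.dot(trace[i:i + len(kernel)], kernel)
                          for i in range(n)])
    return responses.max()

kernel = np.array([1.0, -2.0, 1.0])    # kernel matching a leakage pattern
pattern = np.array([1.0, -2.0, 1.0])

aligned = np.zeros(50)
aligned[10:13] = pattern               # leakage at its nominal position
desynced = np.zeros(50)
desynced[27:30] = pattern              # same leakage, jittered in time

r1 = conv_max_response(aligned, kernel)
r2 = conv_max_response(desynced, kernel)
```

Because the same kernel slides over every offset, the pooled response is unchanged when the leakage pattern shifts in time, which an MLP with fixed per-sample weights cannot guarantee.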
<p>AEs are another deep learning model that has gained attention in SCAs. AEs learn efficient data representations by encoding and decoding input data, and are particularly useful for anomaly detection, denoising, and feature extraction. In SCAs, AEs can reduce the dimensionality of side-channel traces, removing noise and focusing the model&#x2019;s attention on the most important features. When used for anomaly detection, AEs can help identify unusual or anomalous side-channel traces, which can then be further processed by another classifier (such as an MLP or CNN) to extract the key. The ability of AEs to compress side-channel data while preserving crucial information makes them useful when the available traces are noisy or contain outlier data points that might otherwise degrade model accuracy.</p>
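<p>A linear AE with a small bottleneck learns essentially the same subspace as PCA, so the compress-then-reconstruct idea can be sketched with a plain SVD on synthetic traces; the signal shape and noise level below are assumptions, not values from any cited dataset:</p>

```python
import numpy as np

rng = np.random.default_rng(1)
basis = np.sin(np.linspace(0, 4 * np.pi, 100))    # informative leakage shape
coeffs = rng.normal(0, 1, 200)                    # per-trace amplitude
clean = np.outer(coeffs, basis)
noisy = clean + rng.normal(0, 0.3, clean.shape)   # measurement noise

# "Encode" to a 1-D bottleneck and "decode" back (PCA = optimal linear AE).
mean = noisy.mean(axis=0)
_, _, Vt = np.linalg.svd(noisy - mean, full_matrices=False)
code = (noisy - mean) @ Vt[:1].T                  # 200 traces -> 1 feature each
recon = code @ Vt[:1] + mean                      # reconstruction from the code

mse_recon = np.mean((recon - clean) ** 2)
mse_noisy = np.mean((noisy - clean) ** 2)
```

Discarding the 99 noise-dominated dimensions makes the reconstruction closer to the clean signal than the raw noisy traces are, which is exactly the denoising effect exploited before a downstream classifier.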
<p>To summarize, the choice of model depends heavily on the particular attack scenario and the features of the side-channel data. Next, we expand on unresolved issues with specific examples in DLSCAs on AES.</p>
</sec>
<sec id="s4_3">
<label>4.3</label>
<title>Analysis on Leakage Model</title>
<p>The selection of an appropriate leakage model is a pivotal determinant of the success and efficiency of DL-SCAs, because it delineates the core learning objective of the neural network and directly influences the final attack accuracy. Each of the leakage models commonly adopted in current research exhibits unique advantages and inherent limitations, which in turn modulate its applicability and performance in specific attack scenarios.</p>
<p>The HD model represents a robust analytical choice that is deployed almost exclusively against hardware-based cryptographic implementations. Its efficacy hinges on the attacker&#x2019;s capacity to mathematically model the bit-flip events occurring within a specific register during the interval between two consecutive cryptographic operations. When such precise architectural knowledge is available, the HD model enables highly accurate characterization of power consumption behavior, frequently facilitating rapid and successful key recovery in targeted hardware-oriented scenarios. Nevertheless, a fundamental limitation of the HD model lies in its inapplicability to software-based cryptographic implementations, as well as to any scenario in which the dynamic changes in the internal state of the cryptographic system cannot be precisely delineated. This constraint ultimately restricts the HD model to a relatively narrow domain of side-channel attack applications.</p>
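<p>As a small illustrative sketch (the register values below are hypothetical), the HD label is simply the Hamming weight of the XOR of two consecutive register states:</p>

```python
def hamming_distance(prev_state: int, next_state: int) -> int:
    """HD leakage label: number of bits that flip when a register
    updates from prev_state to next_state."""
    return bin(prev_state ^ next_state).count("1")

# Two hypothetical consecutive values held by an 8-bit hardware register:
flips = hamming_distance(0b1100_1010, 0b1010_1010)   # 2 bits differ
```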
<p>For software-oriented cryptographic targets, the primary comparison lies between the ID model and the HW model. The HW model, which categorizes intermediate cryptographic values into classes according to the number of bits set to logic one, imposes a more generalized leakage assumption than the ID model; this generalization enhances its adaptability to the variable execution environments commonly associated with software-based cryptographic implementations. While the HW model is theoretically valid, this generalization gives rise to a notable practical limitation: severe label imbalance. The distribution of HW values follows a binomial distribution, so certain classes occur substantially more frequently than others. This label imbalance poses challenges for model training: it may bias the model toward majority classes, impairing its ability to converge to a high-accuracy, robust solution across all key hypotheses and undermining the attack&#x2019;s effectiveness.</p>
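<p>The imbalance is easy to quantify: labeling all 256 byte values by their Hamming weight yields binomial class sizes, as the following sketch shows:</p>

```python
from collections import Counter
from math import comb

def hamming_weight(v: int) -> int:
    return bin(v).count("1")

# Class sizes when every byte value is labeled by its HW:
dist = Counter(hamming_weight(v) for v in range(256))
# dist[w] equals C(8, w), so class HW = 4 contains 70 of the 256
# values while classes HW = 0 and HW = 8 contain only one each.
```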
<p>In contrast, the ID model has evolved into the predominant choice for state-of-the-art (SOTA) DL-SCAs. By treating each distinct intermediate cryptographic value as a unique class label, the ID model completely circumvents the label imbalance issue inherent to alternative leakage models. This approach formulates a classification task for the deep learning network that exhibits greater uniformity in label distribution. This methodological design enables the model to directly learn the nuanced correlation between the exact sensitive data values and their corresponding power consumption signals. Consequently, attacks utilizing the ID model consistently demonstrate faster convergence, superior generalization capability, and higher overall attack accuracy compared to those relying on the HW model, solidifying its position as the preferred leakage model for software implementations.</p>
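<p>For a first-round attack on AES, the ID label is typically the exact S-box output of the plaintext byte XORed with the key byte. The sketch below derives the AES S-box from its GF(2<sup>8</sup>) definition rather than hard-coding the table; the helper names are our own:</p>

```python
def gf_mul(a: int, b: int) -> int:
    """Multiplication in GF(2^8) modulo the AES polynomial x^8+x^4+x^3+x+1."""
    p = 0
    while b:
        if b & 1:
            p ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return p

def gf_inv(a: int) -> int:
    """Multiplicative inverse in GF(2^8), with 0 mapped to 0 (brute force)."""
    return next((x for x in range(1, 256) if gf_mul(a, x) == 1), 0)

def affine(b: int) -> int:
    """The AES affine transformation applied to each bit of b."""
    r = 0
    for i in range(8):
        bit = ((b >> i) ^ (b >> (i + 4) % 8) ^ (b >> (i + 5) % 8)
               ^ (b >> (i + 6) % 8) ^ (b >> (i + 7) % 8) ^ (0x63 >> i)) & 1
        r |= bit << i
    return r

SBOX = [affine(gf_inv(x)) for x in range(256)]

def id_label(plaintext_byte: int, key_byte: int) -> int:
    """ID leakage label: the exact first-round S-box output, 256 classes."""
    return SBOX[plaintext_byte ^ key_byte]
```

Because every one of the 256 S-box outputs is its own class and the S-box is a bijection, uniformly random plaintexts give a perfectly balanced label distribution.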
</sec>
<sec id="s4_4">
<label>4.4</label>
<title>Unresolved Issues in DLSCAs</title>
<p>Unresolved issues in DLSCAs on AES often revolve around enhancing deep-learning models&#x2019; generalization capability and interpretability. At present, some studies employ deep-learning models as tools for segmenting traces across various victims, rather than as an attack scheme. For instance, reference [<xref ref-type="bibr" rid="ref-116">116</xref>] puts forward an automated trace segmentation approach based on reinforcement learning that applies to many common implementations of public-key algorithms. Through experiments, the authors demonstrated the transferability of the proposed framework on more than 10 datasets. Another unresolved challenge in DLSCAs on AES is the lack of model interpretability. Deep learning models, particularly those with complex architectures such as CNNs or AEs, often function as black boxes, offering limited insight into the features they learn or the decision-making processes they follow. For example, while a CNN might effectively retrieve the secret key from power traces protected by jitter-based countermeasures [<xref ref-type="bibr" rid="ref-29">29</xref>], it remains unclear which specific features of the traces contribute most to the success. This lack of interpretability not only hinders the debugging and fine-tuning of models but also complicates the evaluation of countermeasures.</p>
<p>In the current research landscape, the evaluation indicator system for DLSCA remains unstandardized: due to differences in model architectures, dataset characteristics, and training strategies, core performance indicators like SR and PGE show significant result variability across different deep learning models, while practical indicators such as key recovery time and computational complexity lack unified, clear definitions in existing literature. Additionally, experimental settings vary substantially across studies, collectively creating significant barriers to objective, quantitative cross-study comparisons of DLSCA performance.</p>
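<p>Although definitions vary across papers, the key rank underlying PGE-style metrics is commonly computed by accumulating per-trace log-likelihoods over all key guesses; the numpy sketch below uses synthetic model outputs, and the hypothesis map and array shapes are illustrative assumptions:</p>

```python
import numpy as np

def key_rank(log_probs, hyp_labels, true_key):
    """Rank of the true key byte (0 = recovered) after fusing per-trace
    model outputs: score(k) = sum over traces of log p_t[label(t, k)].

    log_probs:  (n_traces, 256) log-probabilities from the profiling model
    hyp_labels: (n_traces, 256) hypothetical label under each key guess
    """
    rows = np.arange(log_probs.shape[0])[:, None]
    scores = log_probs[rows, hyp_labels].sum(axis=0)   # one score per guess
    order = np.argsort(scores)[::-1]                   # best guess first
    return int(np.where(order == true_key)[0][0])

# Synthetic sanity check: the model puts extra mass on key 0x2A's labels.
rng = np.random.default_rng(2)
n = 50
hyp_labels = np.tile(np.arange(256), (n, 1))           # toy hypothesis map
log_probs = rng.normal(0, 0.1, (n, 256))
log_probs[:, 0x2A] += 1.0
rank = key_rank(log_probs, hyp_labels, 0x2A)
```

A standardized definition of this kind (together with fixed trace budgets and repetitions) would make SR and PGE figures comparable across studies.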
<p>As quantum computing progresses from theoretical constructs toward practical hybrid architectures, its potential influence on side-channel analysis and DLSCA frameworks is becoming increasingly significant. Quantum algorithms, such as Grover&#x2019;s search and variational quantum classifiers, could dramatically accelerate key recovery or leakage modeling tasks when integrated into classical deep learning pipelines. Conversely, quantum hardware itself introduces novel side-channel vectors&#x2014;arising from decoherence dynamics, qubit control signals, and cryogenic interfaces&#x2014;that may be exploitable using techniques conceptually similar to classical DLSCAs. This convergence raises new security concerns where quantum and classical leakage paths may interact, amplifying vulnerability in hybrid cryptographic environments. Consequently, future research on DLSCAs should consider quantum-aware threat models and hybrid analysis frameworks capable of evaluating cross-domain leakage between quantum and classical computation layers.</p>
<p>Next, we explore various potential mitigation strategies highlighted in the review.</p>
</sec>
<sec id="s4_5">
<label>4.5</label>
<title>Potential Mitigation Strategies</title>
<p>From the review, it is evident that countermeasures are effective in mitigating the impact of SCAs. In this section, we categorize potential mitigation strategies into three different levels: algorithmic, implementation, and physical levels.</p>
<p><bold>Algorithm level.</bold> From the review, we can see that it is consistently more difficult for adversaries to compromise a protected version of AES (such as a masked one) than unprotected AES in the same implementation. For example, Aysu et al. [<xref ref-type="bibr" rid="ref-129">129</xref>] propose a lightweight yet effective countermeasure that leverages the specific opportunities of Binary Ring-Learning With Errors (B-RLWE) and is based on randomizing intermediate states and masked threshold decoding. Cui and Balasch [<xref ref-type="bibr" rid="ref-130">130</xref>] propose extending a RISC-V core with custom instructions to accelerate AES finite field arithmetic, thereby combating SCAs. At the same time, a well-chosen algorithm configuration can also demonstrate strong resistance against certain specific SCAs. For example, Barthe et al. [<xref ref-type="bibr" rid="ref-131">131</xref>] present a general method grounded in the concept of constant-time simulation to defend against cache-based timing attacks.</p>
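<p>As a hedged sketch of the masking idea mentioned above (using a toy bijective S-box in place of the AES one), first-order Boolean masking can be realized by table recomputation so that only masked shares are ever processed:</p>

```python
import secrets

def remask_table(sbox, m_in, m_out):
    """Precompute T such that T[v ^ m_in] == sbox[v] ^ m_out.
    Looking up T never touches the unmasked value v."""
    return [sbox[i ^ m_in] ^ m_out for i in range(256)]

# Toy bijective S-box standing in for the AES S-box.
toy_sbox = [(7 * i + 3) % 256 for i in range(256)]

secret_v = 0x5A
m_in, m_out = secrets.randbelow(256), secrets.randbelow(256)
masked_v = secret_v ^ m_in              # only the masked share is processed
table = remask_table(toy_sbox, m_in, m_out)
masked_result = table[masked_v]         # equals toy_sbox[secret_v] ^ m_out
```

Because every intermediate (masked_v, masked_result) is uniformly distributed for fresh random masks, first-order leakage of the intermediate no longer correlates with the secret value.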
<p><bold>Implementation level.</bold> Another path to protect IoT edge devices from DLSCAs is to implement a cryptographic algorithm with deliberate interference. Techniques that introduce noise and randomness have proven highly effective, as shown in this review, at making side-channel leakage hard to detect. Hardware implementations are, in general, more resistant to side-channel attacks than software implementations, since they are designed with fixed, streamlined circuits that minimize variations in power consumption, timing, and EM emissions. Unlike software, which relies on sequential execution of instructions that can leak information through observable patterns, hardware circuits can execute operations in parallel, making it harder to correlate side-channel information with secret data [<xref ref-type="bibr" rid="ref-95">95</xref>]. Moreover, for SCA threats that software countermeasures on existing hardware cannot address, new approaches emerge directly in hardware design. For example, Bhandari et al. introduce LiCSPA, a novel countermeasure strategy addressing a critical threat to cryptographic hardware in modern technology nodes [<xref ref-type="bibr" rid="ref-132">132</xref>]. Implementation-level countermeasures that introduce noise are another effective strategy for mitigating the impact of SCAs. For instance, the AES duplication technique proposed in [<xref ref-type="bibr" rid="ref-133">133</xref>] ensures that a duplicated block within the device generates algorithmic noise closely resembling the power profile of the primary block and dependent on its input. This approach adds leakage-like noise to the side-channel traces, making it more challenging for adversaries to distinguish actual leakage from noise. For the current mainstream deep-learning-based SCA methods, injected noise makes the target features less distinct, thereby reducing the attacker&#x2019;s capability.</p>
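<p>The effect of injected noise can be sketched with synthetic traces: the correlation an attacker observes between the traces and the Hamming weight of the sensitive value drops sharply once extra noise is added (all signal and noise levels below are assumptions):</p>

```python
import numpy as np

hw = np.array([bin(v).count("1") for v in range(256)])

rng = np.random.default_rng(3)
values = rng.integers(0, 256, 5000)       # sensitive intermediate values
leak = hw[values].astype(float)           # idealized HW leakage component

base_noise = rng.normal(0, 1.0, 5000)     # intrinsic measurement noise
extra_noise = rng.normal(0, 4.0, 5000)    # countermeasure: injected noise

corr_plain = np.corrcoef(leak, leak + base_noise)[0, 1]
corr_protected = np.corrcoef(leak, leak + base_noise + extra_noise)[0, 1]
```

The protected correlation is several times smaller, which in practice translates into a correspondingly larger number of traces needed for a successful attack.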
<p><bold>Physical level.</bold> The final mitigation strategy suggested in this paper is to directly employ other physical measures to stop or slow adversaries from capturing side-channel measurements. As demonstrated in [<xref ref-type="bibr" rid="ref-134">134</xref>], placing a hidden camera behind a door or in an adjacent room significantly complicates detection via EM radiation. Reference [<xref ref-type="bibr" rid="ref-135">135</xref>] investigates the influence of physical countermeasures on side-channel vulnerability through an experimental setup featuring a PCB-based protective enclosure designed to shield the integrated circuit from electromagnetic and power analysis attacks. The experimental results demonstrate that physical-layer side-channel countermeasures degrade the effectiveness of an attacker&#x2019;s attack.</p>
<p>Effective countermeasures in these areas require continuous research and development to keep pace with the evolving sophistication of attack techniques. From the review, it is evident that for security-critical applications, deploying multi-level countermeasures that integrate algorithmic, implementation, and physical approaches is essential.</p>
</sec>
<sec id="s4_6">
<label>4.6</label>
<title>Impact of Microcontrollers on Leakage Characteristics</title>
<p>The effectiveness of a side-channel attack is profoundly influenced by the underlying hardware architecture executing the cryptographic algorithm. A critical, often overlooked, factor in profiling attacks is the correlation between the microcontroller&#x2019;s data path width and the resulting leakage model.</p>
<p>In 8-bit architectures, operations are inherently byte-oriented. When a 128-bit AES key is processed, each 8-bit subkey is typically manipulated in a separate instruction cycle. This high locality and specificity of leakage make 8-bit MCUs comparatively more vulnerable to SCAs, as the attacker can easily model and target the leakage of individual key bytes. In contrast, 32-bit architectures like the ARM Cortex-M series introduce significant complexity. These processors possess a 32-bit data bus and can perform word-level operations. Consequently, a single instruction might process up to four AES key bytes simultaneously. This parallelism causes the leakage of multiple subkeys to overlap in the time domain, creating a composite leakage signal that is much harder to dissect. The attacker&#x2019;s model, which often assumes leakage from a single small part of the state, becomes less accurate. As illustrated in <xref ref-type="fig" rid="fig-11">Fig. 11a</xref>, which presents the <italic>t</italic>-test results of the XMega dataset, while <xref ref-type="fig" rid="fig-11">Fig. 11b</xref> depicts those of the STM32 dataset: the side-channel leakage of the XMega dataset is prominent, with a relatively sparse distribution; in contrast, the leakage in the data collected from the STM32 platform exhibits a more compact distribution, rendering SCA more difficult.</p>
<fig id="fig-11">
<label>Figure 11</label>
<caption>
<title>The <italic>t</italic>-test results for AES-128 implementations on diverse MCU architectures: the first figure corresponds to the ATxmega128D4 microcontroller, the second to the STM32F3 MCU, and the third to the Artix-7 FPGA, with side-channel leakage trace examples included for each architecture</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_74473-fig-11.tif"/>
</fig>
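<p>The <italic>t</italic>-test results shown in Fig. 11 are typically obtained with Welch&#x2019;s formula applied per sample point; the numpy sketch below uses synthetic traces and the conventional |t| > 4.5 leakage threshold (trace counts and noise levels are assumptions):</p>

```python
import numpy as np

def welch_t(group_a, group_b):
    """Per-sample Welch's t-statistic between two sets of traces, as used
    in fixed-vs-random leakage assessment (|t| > 4.5 flags leakage)."""
    ma, mb = group_a.mean(axis=0), group_b.mean(axis=0)
    va, vb = group_a.var(axis=0, ddof=1), group_b.var(axis=0, ddof=1)
    return (ma - mb) / np.sqrt(va / len(group_a) + vb / len(group_b))

# Synthetic traces: sample point 3 leaks (mean shift), the others do not.
rng = np.random.default_rng(4)
fixed = rng.normal(0, 1, (500, 10))
random_ = rng.normal(0, 1, (500, 10))
fixed[:, 3] += 1.0                    # key-dependent mean difference

t = welch_t(fixed, random_)
```

On real MCU data the leaking points would be the prominent spikes in Fig. 11 rather than a single planted sample.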
<p>Furthermore, in an FPGA, the cryptographic algorithm is often fully or partially unrolled and deeply pipelined. Different operations on multiple blocks of data occur concurrently across the chip&#x2019;s fabric. The side-channel signal becomes an aggregate of numerous concurrent switching activities, making the contribution of any specific operation extremely weak and noisy. Therefore, attacking a well-designed FPGA implementation generally requires orders of magnitude more traces and more advanced analysis techniques compared to attacking a software implementation on a sequential MCU.</p>
<p>In summary, the attack complexity escalates from the byte-level serial processing of 8-bit AVR, through the word-level parallelism of 32-bit ARM cores, to the massive spatial and temporal concurrency of FPGAs. Any comprehensive SCA study must account for these architectural differences when selecting attack strategies and setting performance expectations.</p>
</sec>
</sec>
<sec id="s5">
<label>5</label>
<title>Conclusions</title>
<p>This paper offers a thorough review of different deep learning methods used in side-channel attacks on AES implementations. We begin with an introduction to deep learning-based side-channel attacks, followed by an extensive comparison of diverse deep learning techniques. Subsequently, we conduct an extensive review of relevant research works, organizing them based on the targeted implementations and utilized side channels. The findings and steps of each approach are reported in detail. In summary, this study highlights the growing concerns and potential risks associated with the use of deep learning in side-channel attacks. Although deep learning offers powerful capabilities in extracting and analyzing side-channel information, it also faces challenges such as the requirement for substantial volumes of training data, vulnerability to adversarial attacks, and the potential for overfitting.</p>
<p>While real-world applications of DLSCAs are rarely reported, their potential to threaten cryptographic system security is well recognized through research and proof-of-concept demonstrations. With the ongoing advancement of the deep learning field, it becomes imperative for researchers and practitioners to develop robust countermeasures and defense strategies to mitigate the risks associated with these attacks. This survey functions as a valuable resource to understand the current state of DLSCAs, providing information on their capabilities, challenges, and potential avenues for future research.</p>
</sec>
<sec id="s6">
<label>6</label>
<title>Future Works</title>
<p>Future research on DLSCAs on AES in IoT environments should address critical challenges to enhance their effectiveness and adaptability. A key direction involves improving model generalization to account for real-world variability. Current studies often rely on controlled datasets, but IoT devices operate in dynamic conditions with noise and interference. Techniques such as transfer learning, domain adaptation, and the creation of more diverse datasets could improve model robustness across diverse operational environments.</p>
<p>Another crucial issue is the interpretability of deep learning models in SCA tasks. Despite their high performance, these models often serve as &#x201C;black boxes,&#x201D; offering little understanding of the features they exploit. Research should focus on applying explainability techniques, such as feature attribution or visualizing model outputs, to identify key characteristics in side-channel traces. Developing inherently interpretable architectures tailored for side-channel analysis would also advance this area.</p>
<p>Lastly, the most important future work is the development of countermeasures. Techniques such as run-time anomaly detection, adaptive cryptographic mechanisms, and active interference sources could significantly mitigate risks. Collaboration between academia and industry will be essential to translate these strategies into practical tools for securing IoT systems.</p>
<p>The development of DLSCAs raises ethical concerns, as the same research that strengthens cybersecurity can also be exploited by malicious actors. While exploring how DLSCAs threaten cryptographic implementations is essential for evaluating the vulnerabilities of physical devices, publicly shared details of DLSCA techniques may inspire adversaries&#x2019; malicious activities. However, without exploring advanced attack techniques and case studies, it is difficult to stay current on the capabilities of such malicious actors. An essential approach to balancing ethical considerations with academic requirements in DLSCA research is to restrict the experimental setup to a controlled environment and a predefined scope of targets.</p>
</sec>
</body>
<back>
<ack>
<p>The authors would like to acknowledge the support from the Key R&#x0026;D Program of Hunan Province (Grant No. 2025AQ2024) and Distinguished Young Scientists Fund (Grant No. 24B0446) for providing the research environment. Additionally, we extend our gratitude to the researchers who have made their datasets publicly available, enabling comprehensive comparative analysis in this field.</p>
</ack>
<sec>
<title>Funding Statement</title>
<p>This work is supported by the following research grants.</p>
<p>1. The Key R&#x0026;D Program of Hunan Province (Grant No. 2025AQ2024) of the Department of Science and Technology of Hunan Province.</p>
<p>2. Distinguished Young Scientists Fund (Grant No. 24B0446) of Hunan Education Department.</p>
</sec>
<sec>
<title>Author Contributions</title>
<p>Junnian Wang conceptualized the study framework and methodology, conducted the literature analysis and synthesis, and wrote the original draft. Xiaoxia Wang performed the systematic literature investigation, data extraction, and validation, and contributed to manuscript structuring. Zexin Luo provided critical analysis of research trends and contributed to manuscript review and editing. Qixiang Ouyang supervised project progress and contributed to manuscript review and editing. Chao Zhou conducted comparative analysis of methodologies and contributed to results interpretation and manuscript revision. Huanyu Wang, as the corresponding author, oversaw project coordination, provided overall supervision and resources, and led the manuscript finalization process. All authors reviewed the results and approved the final version of the manuscript.</p>
</sec>
<sec sec-type="data-availability">
<title>Availability of Data and Materials</title>
<p>There are no data generated in this manuscript.</p>
</sec>
<sec>
<title>Ethics Approval</title>
<p>This manuscript does not involve any ethical issues.</p>
</sec>
<sec sec-type="COI-statement">
<title>Conflicts of Interest</title>
<p>The authors declare no conflicts of interest to report regarding the present study.</p>
</sec>
<ref-list content-type="authoryear">
<title>References</title>
<ref id="ref-1"><label>[1]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Admass</surname> <given-names>WS</given-names></string-name>, <string-name><surname>Munaye</surname> <given-names>YY</given-names></string-name>, <string-name><surname>Diro</surname> <given-names>A</given-names></string-name></person-group>. <article-title>Cyber security: state of the art, challenges and future directions</article-title>. <source>Cyber Secur Appl</source>. <year>2023</year>;<volume>2</volume>(<issue>1</issue>):<fpage>100031</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.csa.2023.100031</pub-id>.</mixed-citation></ref>
<ref id="ref-2"><label>[2]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Liu</surname> <given-names>C</given-names></string-name>, <string-name><surname>Chakraborty</surname> <given-names>A</given-names></string-name>, <string-name><surname>Chawla</surname> <given-names>N</given-names></string-name>, <string-name><surname>Roggel</surname> <given-names>N</given-names></string-name></person-group>. <article-title>Frequency throttling side-channel attack</article-title>. In: <conf-name>ACM SIGSAC Conference on Computer and Communications Security (CCS); 2022 Nov 7&#x2013;11; Los Angeles, CA, USA</conf-name>.</mixed-citation></ref>
<ref id="ref-3"><label>[3]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Tosun</surname> <given-names>T</given-names></string-name>, <string-name><surname>Savas</surname> <given-names>E</given-names></string-name></person-group>. <article-title>Zero-value filtering for accelerating non-profiled side-channel attack on incomplete NTT based implementations of lattice-based cryptography</article-title>. <source>IEEE Trans Inf Forensics Secur (TIFS)</source>. <year>2024</year>;<volume>19</volume>(<issue>3</issue>):<fpage>3353</fpage>&#x2013;<lpage>65</lpage>. doi:<pub-id pub-id-type="doi">10.1109/tifs.2024.3359890</pub-id>.</mixed-citation></ref>
<ref id="ref-4"><label>[4]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Gao</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Qiu</surname> <given-names>H</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>B</given-names></string-name>, <string-name><surname>Ma</surname> <given-names>H</given-names></string-name>, <string-name><surname>Abuadbba</surname> <given-names>A</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>DeepTheft: stealing DNN model architectures through power side channel</article-title>. In: <conf-name>IEEE Symposium on Security and Privacy (SP)</conf-name>. <publisher-loc>Piscataway, NJ, USA</publisher-loc>: <publisher-name>IEEE</publisher-name>; <year>2024</year>. p. <fpage>3311</fpage>&#x2013;<lpage>26</lpage>.</mixed-citation></ref>
<ref id="ref-5"><label>[5]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Batina</surname> <given-names>L</given-names></string-name>, <string-name><surname>Bhasin</surname> <given-names>S</given-names></string-name>, <string-name><surname>Jap</surname> <given-names>D</given-names></string-name>, <string-name><surname>Picek</surname> <given-names>S</given-names></string-name></person-group>. <article-title>CSI NN: reverse engineering of neural network architectures through electromagnetic side channel</article-title>. In: <conf-name>USENIX Security Symposium (USENIX Security); 2019 Aug 14&#x2013;16; Santa Clara, CA, USA</conf-name>. p. <fpage>515</fpage>&#x2013;<lpage>32</lpage>.</mixed-citation></ref>
<ref id="ref-6"><label>[6]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Moradi</surname> <given-names>A</given-names></string-name>, <string-name><surname>Schneider</surname> <given-names>T</given-names></string-name></person-group>. <article-title>Improved side-channel analysis attacks on Xilinx bitstream encryption of 5, 6, and 7 series</article-title>. In: <conf-name>International Workshop on Constructive Side-Channel Analysis and Secure Design</conf-name>. <publisher-loc>Cham, Switzerland</publisher-loc>: <publisher-name>Springer</publisher-name>; <year>2016</year>. p. <fpage>71</fpage>&#x2013;<lpage>87</lpage>. doi:<pub-id pub-id-type="doi">10.1007/978-3-319-43283-0_5</pub-id>.</mixed-citation></ref>
<ref id="ref-7"><label>[7]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Wang</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Paccagnella</surname> <given-names>R</given-names></string-name>, <string-name><surname>Gang</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Vasquez</surname> <given-names>WR</given-names></string-name>, <string-name><surname>Kohlbrenner</surname> <given-names>D</given-names></string-name>, <string-name><surname>Shacham</surname> <given-names>H</given-names></string-name>, <etal>et al.</etal></person-group> <article-title>GPU. zip: on the side-channel implications of hardware-based graphical data compression</article-title>. In: <conf-name>2024 IEEE Symposium on Security and Privacy (SP); 2024 May 19&#x2013;23; San Francisco, CA, USA</conf-name>. p. <fpage>3716</fpage>&#x2013;<lpage>34</lpage>.</mixed-citation></ref>
<ref id="ref-8"><label>[8]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Yang</surname> <given-names>Q</given-names></string-name>, <string-name><surname>Gasti</surname> <given-names>P</given-names></string-name>, <string-name><surname>Zhou</surname> <given-names>G</given-names></string-name>, <string-name><surname>Farajidavar</surname> <given-names>A</given-names></string-name>, <string-name><surname>Balagani</surname> <given-names>KS</given-names></string-name></person-group>. <article-title>On inferring browsing activity on smartphones via USB power analysis side-channel</article-title>. <source>IEEE Trans Inf Forensics Secur (TIFS)</source>. <year>2016</year>;<volume>12</volume>(<issue>5</issue>):<fpage>1056</fpage>&#x2013;<lpage>66</lpage>. doi:<pub-id pub-id-type="doi">10.1109/tifs.2016.2639446</pub-id>.</mixed-citation></ref>
<ref id="ref-9"><label>[9]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Yu</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Moraitis</surname> <given-names>M</given-names></string-name>, <string-name><surname>Dubrova</surname> <given-names>E</given-names></string-name></person-group>. <article-title>Why deep learning makes it difficult to keep secrets in FPGAs</article-title>. In: <conf-name>Workshop in Dynamic and Novel Advances in Machine Learning and Intelligent Cyber Security; 2020 Dec 7; Lexington, MA, USA</conf-name>. p. <fpage>1</fpage>&#x2013;<lpage>9</lpage>. doi:<pub-id pub-id-type="doi">10.1145/3477997.3478001</pub-id>.</mixed-citation></ref>
<ref id="ref-10"><label>[10]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>He</surname> <given-names>D</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>H</given-names></string-name>, <string-name><surname>Deng</surname> <given-names>T</given-names></string-name>, <string-name><surname>Liu</surname> <given-names>J</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>J</given-names></string-name></person-group>. <article-title>Improving IIoT security: unveiling threats through advanced side-channel analysis</article-title>. <source>Comput Secur</source>. <year>2024</year>;<volume>148</volume>(<issue>1</issue>):<fpage>104135</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.cose.2024.104135</pub-id>.</mixed-citation></ref>
<ref id="ref-11"><label>[11]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Liu</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Wei</surname> <given-names>L</given-names></string-name>, <string-name><surname>Zhou</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>K</given-names></string-name>, <string-name><surname>Xu</surname> <given-names>W</given-names></string-name>, <string-name><surname>Xu</surname> <given-names>Q</given-names></string-name></person-group>. <article-title>On code execution tracking via power side-channel</article-title>. In: <conf-name>ACM SIGSAC Conference on Computer and Communications Security (CCS)</conf-name>. <publisher-loc>New York, NY, USA</publisher-loc>: <publisher-name>ACM</publisher-name>; <year>2016</year>. p. <fpage>1019</fpage>&#x2013;<lpage>31</lpage>.</mixed-citation></ref>
<ref id="ref-12"><label>[12]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Hu</surname> <given-names>J</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>H</given-names></string-name>, <string-name><surname>Zheng</surname> <given-names>T</given-names></string-name>, <string-name><surname>Hu</surname> <given-names>J</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Jiang</surname> <given-names>H</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Password-stealing without hacking: Wi-Fi enabled practical keystroke eavesdropping</article-title>. In: <conf-name>ACM SIGSAC Conference on Computer and Communications Security (CCS)</conf-name>. <publisher-loc>New York, NY, USA</publisher-loc>: <publisher-name>ACM</publisher-name>; <year>2023</year>. p. <fpage>239</fpage>&#x2013;<lpage>52</lpage>.</mixed-citation></ref>
<ref id="ref-13"><label>[13]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Cardaioli</surname> <given-names>M</given-names></string-name>, <string-name><surname>Conti</surname> <given-names>M</given-names></string-name>, <string-name><surname>Balagani</surname> <given-names>K</given-names></string-name>, <string-name><surname>Gasti</surname> <given-names>P</given-names></string-name></person-group>. <article-title>Your PIN sounds good! Augmentation of PIN guessing strategies via audio leakage</article-title>. In: <conf-name>European Symposium on Research in Computer Security (ESORICS)</conf-name>. <publisher-loc>Cham, Switzerland</publisher-loc>: <publisher-name>Springer</publisher-name>; <year>2020</year>. p. <fpage>720</fpage>&#x2013;<lpage>35</lpage>.</mixed-citation></ref>
<ref id="ref-14"><label>[14]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Ni</surname> <given-names>T</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>X</given-names></string-name>, <string-name><surname>Zhao</surname> <given-names>Q</given-names></string-name></person-group>. <article-title>Recovering fingerprints from in-display fingerprint sensors via electromagnetic side channel</article-title>. In: <conf-name>ACM SIGSAC Conference on Computer and Communications Security (CCS)</conf-name>. <publisher-loc>New York, NY, USA</publisher-loc>: <publisher-name>ACM</publisher-name>; <year>2023</year>. p. <fpage>253</fpage>&#x2013;<lpage>67</lpage>.</mixed-citation></ref>
<ref id="ref-15"><label>[15]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Camurati</surname> <given-names>G</given-names></string-name>, <string-name><surname>Poeplau</surname> <given-names>S</given-names></string-name>, <string-name><surname>Muench</surname> <given-names>M</given-names></string-name>, <string-name><surname>Hayes</surname> <given-names>T</given-names></string-name>, <string-name><surname>Francillon</surname> <given-names>A</given-names></string-name></person-group>. <article-title>Screaming channels: when electromagnetic side channels meet radio transceivers</article-title>. In: <conf-name>ACM SIGSAC Conference on Computer and Communications Security (CCS)</conf-name>. <publisher-loc>New York, NY, USA</publisher-loc>: <publisher-name>ACM</publisher-name>; <year>2018</year>. p. <fpage>163</fpage>&#x2013;<lpage>77</lpage>.</mixed-citation></ref>
<ref id="ref-16"><label>[16]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Wang</surname> <given-names>R</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>H</given-names></string-name>, <string-name><surname>Dubrova</surname> <given-names>E</given-names></string-name>, <string-name><surname>Brisfors</surname> <given-names>M</given-names></string-name></person-group>. <article-title>Advanced far field EM side-channel attack on AES</article-title>. In: <conf-name>ACM Cyber-Physical System Security Workshop (CPSS)</conf-name>. <publisher-loc>New York, NY, USA</publisher-loc>: <publisher-name>ACM</publisher-name>; <year>2021</year>. p. <fpage>29</fpage>&#x2013;<lpage>39</lpage>. doi:<pub-id pub-id-type="doi">10.1145/3411504.3421214</pub-id>.</mixed-citation></ref>
<ref id="ref-17"><label>[17]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Wang</surname> <given-names>H</given-names></string-name></person-group>. <article-title>Amplitude-modulated EM side-channel attack on provably secure masked AES</article-title>. <source>J Cryptogr Eng</source>. <year>2024</year>;<volume>14</volume>(<issue>3</issue>):<fpage>537</fpage>&#x2013;<lpage>49</lpage>. doi:<pub-id pub-id-type="doi">10.1007/s13389-024-00347-3</pub-id>.</mixed-citation></ref>
<ref id="ref-18"><label>[18]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Kocher</surname> <given-names>P</given-names></string-name>, <string-name><surname>Jaffe</surname> <given-names>J</given-names></string-name>, <string-name><surname>Jun</surname> <given-names>B</given-names></string-name></person-group>. <article-title>Differential power analysis</article-title>. In: <conf-name>Annual International Cryptology Conference</conf-name>. <publisher-loc>Cham, Switzerland</publisher-loc>: <publisher-name>Springer</publisher-name>; <year>1999</year>. p. <fpage>388</fpage>&#x2013;<lpage>97</lpage>.</mixed-citation></ref>
<ref id="ref-19"><label>[19]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Kocher</surname> <given-names>PC</given-names></string-name></person-group>. <article-title>Timing attacks on implementations of Diffie-Hellman, RSA, DSS, and other systems</article-title>. In: <conf-name>Annual International Cryptology Conference</conf-name>. <publisher-loc>Cham, Switzerland</publisher-loc>: <publisher-name>Springer</publisher-name>; <year>1996</year>. p. <fpage>104</fpage>&#x2013;<lpage>13</lpage>.</mixed-citation></ref>
<ref id="ref-20"><label>[20]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Nassi</surname> <given-names>B</given-names></string-name>, <string-name><surname>Vayner</surname> <given-names>O</given-names></string-name>, <string-name><surname>Iluz</surname> <given-names>E</given-names></string-name>, <string-name><surname>Nassi</surname> <given-names>D</given-names></string-name>, <string-name><surname>Jancar</surname> <given-names>J</given-names></string-name>, <string-name><surname>Genkin</surname> <given-names>D</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Optical cryptanalysis: recovering cryptographic keys from power LED light fluctuations</article-title>. In: <conf-name>ACM SIGSAC Conference on Computer and Communications Security (CCS)</conf-name>. <publisher-loc>New York, NY, USA</publisher-loc>: <publisher-name>ACM</publisher-name>; <year>2023</year>. p. <fpage>268</fpage>&#x2013;<lpage>80</lpage>.</mixed-citation></ref>
<ref id="ref-21"><label>[21]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Genkin</surname> <given-names>D</given-names></string-name>, <string-name><surname>Shamir</surname> <given-names>A</given-names></string-name>, <string-name><surname>Tromer</surname> <given-names>E</given-names></string-name></person-group>. <article-title>Acoustic cryptanalysis</article-title>. <source>J Cryptol</source>. <year>2017</year>;<volume>30</volume>(<issue>2</issue>):<fpage>392</fpage>&#x2013;<lpage>443</lpage>. doi:<pub-id pub-id-type="doi">10.1007/s00145-015-9224-2</pub-id>.</mixed-citation></ref>
<ref id="ref-22"><label>[22]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Agrawal</surname> <given-names>D</given-names></string-name>, <string-name><surname>Archambeault</surname> <given-names>B</given-names></string-name>, <string-name><surname>Rao</surname> <given-names>JR</given-names></string-name>, <string-name><surname>Rohatgi</surname> <given-names>P</given-names></string-name></person-group>. <article-title>The EM side-channel(s)</article-title>. In: <conf-name>International Workshop on Cryptographic Hardware and Embedded Systems (CHES)</conf-name>. <publisher-loc>Cham, Switzerland</publisher-loc>: <publisher-name>Springer</publisher-name>; <year>2002</year>. p. <fpage>29</fpage>&#x2013;<lpage>45</lpage>.</mixed-citation></ref>
<ref id="ref-23"><label>[23]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Bayoudh</surname> <given-names>K</given-names></string-name></person-group>. <article-title>A survey of multimodal hybrid deep learning for computer vision: architectures, applications, trends, and challenges</article-title>. <source>Inf Fusion</source>. <year>2024</year>;<volume>105</volume>:<fpage>102217</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.inffus.2023.102217</pub-id>.</mixed-citation></ref>
<ref id="ref-24"><label>[24]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Wu</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Yao</surname> <given-names>X</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>X</given-names></string-name>, <string-name><surname>Zhou</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Xue</surname> <given-names>J</given-names></string-name></person-group>. <article-title>JAN: joint attention networks for automatic ICD coding</article-title>. <source>IEEE J Biomed Health Inform</source>. <year>2022</year>;<volume>26</volume>(<issue>10</issue>):<fpage>5235</fpage>&#x2013;<lpage>46</lpage>. doi:<pub-id pub-id-type="doi">10.1109/jbhi.2022.3189404</pub-id>; <pub-id pub-id-type="pmid">35802549</pub-id></mixed-citation></ref>
<ref id="ref-25"><label>[25]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Hua</surname> <given-names>H</given-names></string-name>, <string-name><surname>Li</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>T</given-names></string-name>, <string-name><surname>Dong</surname> <given-names>N</given-names></string-name>, <string-name><surname>Li</surname> <given-names>W</given-names></string-name>, <string-name><surname>Cao</surname> <given-names>J</given-names></string-name></person-group>. <article-title>Edge computing with artificial intelligence: a machine learning perspective</article-title>. <source>ACM Comput Surv</source>. <year>2023</year>;<volume>55</volume>(<issue>9</issue>):<fpage>1</fpage>&#x2013;<lpage>35</lpage>. doi:<pub-id pub-id-type="doi">10.1145/3555802</pub-id>.</mixed-citation></ref>
<ref id="ref-26"><label>[26]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Golightly</surname> <given-names>L</given-names></string-name>, <string-name><surname>Modesti</surname> <given-names>P</given-names></string-name>, <string-name><surname>Garcia</surname> <given-names>R</given-names></string-name>, <string-name><surname>Chang</surname> <given-names>V</given-names></string-name></person-group>. <article-title>Securing distributed systems: a survey on access control techniques for cloud, blockchain, IoT and SDN</article-title>. <source>Cyber Secur Appl</source>. <year>2023</year>;<volume>1</volume>(<issue>7</issue>):<fpage>100015</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.csa.2023.100015</pub-id>.</mixed-citation></ref>
<ref id="ref-27"><label>[27]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Brier</surname> <given-names>E</given-names></string-name>, <string-name><surname>Clavier</surname> <given-names>C</given-names></string-name>, <string-name><surname>Olivier</surname> <given-names>F</given-names></string-name></person-group>. <article-title>Correlation power analysis with a leakage model</article-title>. In: <conf-name>International Workshop on Cryptographic Hardware and Embedded Systems (CHES)</conf-name>. <publisher-loc>Cham, Switzerland</publisher-loc>: <publisher-name>Springer</publisher-name>; <year>2004</year>. p. <fpage>16</fpage>&#x2013;<lpage>29</lpage>.</mixed-citation></ref>
<ref id="ref-28"><label>[28]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Chari</surname> <given-names>S</given-names></string-name>, <string-name><surname>Rao</surname> <given-names>JR</given-names></string-name>, <string-name><surname>Rohatgi</surname> <given-names>P</given-names></string-name></person-group>. <article-title>Template attacks</article-title>. In: <conf-name>International Workshop on Cryptographic Hardware and Embedded Systems (CHES)</conf-name>. <publisher-loc>Cham, Switzerland</publisher-loc>: <publisher-name>Springer</publisher-name>; <year>2002</year>. p. <fpage>13</fpage>&#x2013;<lpage>28</lpage>.</mixed-citation></ref>
<ref id="ref-29"><label>[29]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Cagli</surname> <given-names>E</given-names></string-name>, <string-name><surname>Dumas</surname> <given-names>C</given-names></string-name>, <string-name><surname>Prouff</surname> <given-names>E</given-names></string-name></person-group>. <article-title>Convolutional neural networks with data augmentation against jitter-based countermeasures</article-title>. In: <conf-name>International Conference on Cryptographic Hardware and Embedded Systems (CHES)</conf-name>. <publisher-loc>Cham, Switzerland</publisher-loc>: <publisher-name>Springer</publisher-name>; <year>2017</year>. p. <fpage>45</fpage>&#x2013;<lpage>68</lpage>.</mixed-citation></ref>
<ref id="ref-30"><label>[30]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Randolph</surname> <given-names>M</given-names></string-name>, <string-name><surname>Diehl</surname> <given-names>W</given-names></string-name></person-group>. <article-title>Power side-channel attack analysis: a review of 20 years of study for the layman</article-title>. <source>Cryptography</source>. <year>2020</year>;<volume>4</volume>(<issue>2</issue>):<fpage>15</fpage>. doi:<pub-id pub-id-type="doi">10.3390/cryptography4020015</pub-id>.</mixed-citation></ref>
<ref id="ref-31"><label>[31]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Gierlichs</surname> <given-names>B</given-names></string-name>, <string-name><surname>Batina</surname> <given-names>L</given-names></string-name>, <string-name><surname>Tuyls</surname> <given-names>P</given-names></string-name>, <string-name><surname>Preneel</surname> <given-names>B</given-names></string-name></person-group>. <article-title>Mutual information analysis: a generic side-channel distinguisher</article-title>. In: <conf-name>International Workshop on Cryptographic Hardware and Embedded Systems (CHES)</conf-name>. <publisher-loc>Cham, Switzerland</publisher-loc>: <publisher-name>Springer</publisher-name>; <year>2008</year>. p. <fpage>426</fpage>&#x2013;<lpage>42</lpage>.</mixed-citation></ref>
<ref id="ref-32"><label>[32]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Goodwill</surname> <given-names>G</given-names></string-name>, <string-name><surname>Jun</surname> <given-names>B</given-names></string-name>, <string-name><surname>Jaffe</surname> <given-names>J</given-names></string-name>, <string-name><surname>Rohatgi</surname> <given-names>P</given-names></string-name></person-group>. <article-title>A testing methodology for side-channel resistance validation</article-title>. In: <conf-name>NIST Non-Invasive Attack Testing Workshop; 2011 Sep 26&#x2013;27; Nara, Japan</conf-name>; Vol. <volume>7</volume>, p. <fpage>115</fpage>&#x2013;<lpage>36</lpage>.</mixed-citation></ref>
<ref id="ref-33"><label>[33]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Hettwer</surname> <given-names>B</given-names></string-name>, <string-name><surname>Gehrer</surname> <given-names>S</given-names></string-name>, <string-name><surname>G&#x00FC;neysu</surname> <given-names>T</given-names></string-name></person-group>. <article-title>Applications of machine learning techniques in side-channel attacks: a survey</article-title>. <source>J Cryptogr Eng</source>. <year>2020</year>;<volume>10</volume>(<issue>2</issue>):<fpage>135</fpage>&#x2013;<lpage>62</lpage>. doi:<pub-id pub-id-type="doi">10.1007/s13389-019-00212-8</pub-id>.</mixed-citation></ref>
<ref id="ref-34"><label>[34]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Cortes</surname> <given-names>C</given-names></string-name>, <string-name><surname>Vapnik</surname> <given-names>V</given-names></string-name></person-group>. <article-title>Support-vector networks</article-title>. <source>Mach Learn</source>. <year>1995</year>;<volume>20</volume>(<issue>3</issue>):<fpage>273</fpage>&#x2013;<lpage>97</lpage>. doi:<pub-id pub-id-type="doi">10.1023/a:1022627411411</pub-id>.</mixed-citation></ref>
<ref id="ref-35"><label>[35]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Murthy</surname> <given-names>SK</given-names></string-name></person-group>. <article-title>Automatic construction of decision trees from data: a multi-disciplinary survey</article-title>. <source>Data Min Knowl Discov</source>. <year>1998</year>;<volume>2</volume>(<issue>4</issue>):<fpage>345</fpage>&#x2013;<lpage>89</lpage>. doi:<pub-id pub-id-type="doi">10.1023/a:1009744630224</pub-id>.</mixed-citation></ref>
<ref id="ref-36"><label>[36]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Breiman</surname> <given-names>L</given-names></string-name></person-group>. <article-title>Random forests</article-title>. <source>Mach Learn</source>. <year>2001</year>;<volume>45</volume>(<issue>1</issue>):<fpage>5</fpage>&#x2013;<lpage>32</lpage>. doi:<pub-id pub-id-type="doi">10.1023/a:1010933404324</pub-id>.</mixed-citation></ref>
<ref id="ref-37"><label>[37]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Picek</surname> <given-names>S</given-names></string-name>, <string-name><surname>Perin</surname> <given-names>G</given-names></string-name>, <string-name><surname>Mariot</surname> <given-names>L</given-names></string-name>, <string-name><surname>Wu</surname> <given-names>L</given-names></string-name>, <string-name><surname>Batina</surname> <given-names>L</given-names></string-name></person-group>. <article-title>SoK: deep learning-based physical side-channel analysis</article-title>. <source>ACM Comput Surv</source>. <year>2023</year>;<volume>55</volume>(<issue>11</issue>):<fpage>1</fpage>&#x2013;<lpage>35</lpage>. doi:<pub-id pub-id-type="doi">10.1145/3569577</pub-id>.</mixed-citation></ref>
<ref id="ref-38"><label>[38]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><surname>Zunaidi</surname> <given-names>MR</given-names></string-name>, <string-name><surname>Sayakkara</surname> <given-names>A</given-names></string-name>, <string-name><surname>Scanlon</surname> <given-names>M</given-names></string-name></person-group>. <article-title>Systematic literature review of EM-SCA attacks on encryption</article-title>. <comment>arXiv:2402.10030. 2024</comment>.</mixed-citation></ref>
<ref id="ref-39"><label>[39]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Panoff</surname> <given-names>M</given-names></string-name>, <string-name><surname>Yu</surname> <given-names>H</given-names></string-name>, <string-name><surname>Shan</surname> <given-names>H</given-names></string-name>, <string-name><surname>Jin</surname> <given-names>Y</given-names></string-name></person-group>. <article-title>A review and comparison of AI-enhanced side channel analysis</article-title>. <source>J Emerg Technol Comput Syst</source>. <year>2022</year>;<volume>18</volume>(<issue>3</issue>):<fpage>62</fpage>. doi:<pub-id pub-id-type="doi">10.1145/3517810</pub-id>.</mixed-citation></ref>
<ref id="ref-40"><label>[40]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Naserelden</surname> <given-names>S</given-names></string-name>, <string-name><surname>Alias</surname> <given-names>N</given-names></string-name>, <string-name><surname>Altigani</surname> <given-names>A</given-names></string-name>, <string-name><surname>Mohamed</surname> <given-names>A</given-names></string-name>, <string-name><surname>Badreddine</surname> <given-names>S</given-names></string-name></person-group>. <article-title>Advance attacks on AES: a comprehensive review of side channel, fault injection, machine learning and quantum techniques</article-title>. <source>Edelweiss Appl Sci Technol</source>. <year>2025</year>;<volume>9</volume>(<issue>4</issue>):<fpage>2471</fpage>&#x2013;<lpage>86</lpage>. doi:<pub-id pub-id-type="doi">10.55214/25768484.v9i4.6586</pub-id>.</mixed-citation></ref>
<ref id="ref-41"><label>[41]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Dobias</surname> <given-names>P</given-names></string-name>, <string-name><surname>Rezaeezade</surname> <given-names>A</given-names></string-name>, <string-name><surname>Chmielewski</surname> <given-names>L</given-names></string-name>, <string-name><surname>Malina</surname> <given-names>L</given-names></string-name>, <string-name><surname>Batina</surname> <given-names>L</given-names></string-name></person-group>. <article-title>SoK: reassessing side-channel vulnerabilities and countermeasures in PQC implementations</article-title>. <comment>Cryptology ePrint Archive; 2025. Paper 2025/1222. [cited 2025 Nov 2]</comment>. Available from: <ext-link ext-link-type="uri" xlink:href="https://eprint.iacr.org/2025/1222">https://eprint.iacr.org/2025/1222</ext-link>.</mixed-citation></ref>
<ref id="ref-42"><label>[42]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><surname>Daemen</surname> <given-names>J</given-names></string-name>, <string-name><surname>Rijmen</surname> <given-names>V</given-names></string-name></person-group>. <source>The design of Rijndael</source>. Vol. <volume>2</volume>. <publisher-loc>Cham, Switzerland</publisher-loc>: <publisher-name>Springer</publisher-name>; <year>2002</year>.</mixed-citation></ref>
<ref id="ref-43"><label>[43]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><surname>Wu</surname> <given-names>J</given-names></string-name></person-group>. <source>Introduction to convolutional neural networks</source>. <publisher-loc>Nanjing, China</publisher-loc>: <publisher-name>National Key Lab for Novel Software Technology Nanjing University China</publisher-name>; <year>2017</year>.</mixed-citation></ref>
<ref id="ref-44"><label>[44]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Vaswani</surname> <given-names>A</given-names></string-name>, <string-name><surname>Shazeer</surname> <given-names>N</given-names></string-name>, <string-name><surname>Parmar</surname> <given-names>N</given-names></string-name>, <string-name><surname>Uszkoreit</surname> <given-names>J</given-names></string-name>, <string-name><surname>Jones</surname> <given-names>L</given-names></string-name>, <string-name><surname>Gomez</surname> <given-names>AN</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Attention is all you need</article-title>. <source>Adv Neural Inf Process Syst</source>. <year>2017</year>;<volume>30</volume>:<fpage>6000</fpage>&#x2013;<lpage>10</lpage>.</mixed-citation></ref>
<ref id="ref-45"><label>[45]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Hajra</surname> <given-names>S</given-names></string-name>, <string-name><surname>Saha</surname> <given-names>S</given-names></string-name>, <string-name><surname>Alam</surname> <given-names>M</given-names></string-name>, <string-name><surname>Mukhopadhyay</surname> <given-names>D</given-names></string-name></person-group>. <article-title>TransNet: shift invariant transformer network for side channel analysis</article-title>. In: <conf-name>International Conference on Cryptology in Africa</conf-name>. <publisher-loc>Cham, Switzerland</publisher-loc>: <publisher-name>Springer</publisher-name>; <year>2022</year>. p. <fpage>371</fpage>&#x2013;<lpage>96</lpage>.</mixed-citation></ref>
<ref id="ref-46"><label>[46]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Scarselli</surname> <given-names>F</given-names></string-name>, <string-name><surname>Gori</surname> <given-names>M</given-names></string-name>, <string-name><surname>Tsoi</surname> <given-names>AC</given-names></string-name>, <string-name><surname>Hagenbuchner</surname> <given-names>M</given-names></string-name>, <string-name><surname>Monfardini</surname> <given-names>G</given-names></string-name></person-group>. <article-title>The graph neural network model</article-title>. <source>IEEE Trans Neural Netw</source>. <year>2009</year>;<volume>20</volume>(<issue>1</issue>):<fpage>61</fpage>&#x2013;<lpage>80</lpage>. doi:<pub-id pub-id-type="doi">10.1109/tnn.2008.2005605</pub-id>; <pub-id pub-id-type="pmid">19068426</pub-id></mixed-citation></ref>
<ref id="ref-47"><label>[47]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Abbas</surname> <given-names>S</given-names></string-name>, <string-name><surname>Ojo</surname> <given-names>S</given-names></string-name>, <string-name><surname>Bouazzi</surname> <given-names>I</given-names></string-name>, <string-name><surname>Sampedro</surname> <given-names>GA</given-names></string-name>, <string-name><surname>Al Hejaili</surname> <given-names>A</given-names></string-name>, <string-name><surname>Almadhor</surname> <given-names>AS</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Securing data from side-channel attacks: a graph neural network-based approach for smartphone-based side channel attack detection</article-title>. <source>IEEE Access</source>. <year>2024</year>;<volume>12</volume>:<fpage>138904</fpage>&#x2013;<lpage>20</lpage>. doi:<pub-id pub-id-type="doi">10.1109/access.2024.3465662</pub-id>.</mixed-citation></ref>
<ref id="ref-48"><label>[48]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><surname>Rumelhart</surname> <given-names>DE</given-names></string-name>, <string-name><surname>Hinton</surname> <given-names>GE</given-names></string-name>, <string-name><surname>Williams</surname> <given-names>RJ</given-names></string-name></person-group>. <chapter-title>Learning internal representations by error propagation</chapter-title>. In: <source>Parallel distributed processing: explorations in the microstructure of cognition: foundations</source>. <publisher-loc>Cambridge, MA, USA</publisher-loc>: <publisher-name>MIT Press</publisher-name>; <year>1986</year>. p. <fpage>318</fpage>&#x2013;<lpage>62</lpage>.</mixed-citation></ref>
<ref id="ref-49"><label>[49]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Yang</surname> <given-names>G</given-names></string-name>, <string-name><surname>Li</surname> <given-names>H</given-names></string-name>, <string-name><surname>Ming</surname> <given-names>J</given-names></string-name>, <string-name><surname>Zhou</surname> <given-names>Y</given-names></string-name></person-group>. <article-title>CDAE: towards empowering denoising in side-channel analysis</article-title>. In: <conf-name>International Conference on Information and Communications Security</conf-name>. <publisher-loc>Cham, Switzerland</publisher-loc>: <publisher-name>Springer</publisher-name>; <year>2019</year>. p. <fpage>269</fpage>&#x2013;<lpage>86</lpage>.</mixed-citation></ref>
<ref id="ref-50"><label>[50]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>He</surname> <given-names>P</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Gan</surname> <given-names>H</given-names></string-name>, <string-name><surname>Ma</surname> <given-names>J</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>H</given-names></string-name></person-group>. <article-title>Side-channel attacks based on attention mechanism and multi-scale convolutional neural network</article-title>. <source>Comput Electr Eng</source>. <year>2024</year>;<volume>119</volume>(<issue>2</issue>):<fpage>109515</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.compeleceng.2024.109515</pub-id>.</mixed-citation></ref>
<ref id="ref-51"><label>[51]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><surname>Zhao</surname> <given-names>Z</given-names></string-name></person-group>. <article-title>Far field electromagnetic side channel analysis of AES</article-title> [master&#x2019;s thesis]. <publisher-loc>Stockholm, Sweden</publisher-loc>: <publisher-name>KTH Royal Institute of Technology</publisher-name>; <year>2020</year>.</mixed-citation></ref>
<ref id="ref-52"><label>[52]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Timon</surname> <given-names>B</given-names></string-name></person-group>. <article-title>Non-profiled deep learning-based side-channel attacks with sensitivity analysis</article-title>. <source>IACR Trans Cryptogr Hardw Embed Syst (TCHES)</source>. <year>2019</year>;<volume>2019</volume>(<issue>2</issue>):<fpage>107</fpage>&#x2013;<lpage>31</lpage>. doi:<pub-id pub-id-type="doi">10.46586/tches.v2019.i2.107-131</pub-id>.</mixed-citation></ref>
<ref id="ref-53"><label>[53]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Ji</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>R</given-names></string-name>, <string-name><surname>Ngo</surname> <given-names>K</given-names></string-name>, <string-name><surname>Dubrova</surname> <given-names>E</given-names></string-name>, <string-name><surname>Backlund</surname> <given-names>L</given-names></string-name></person-group>. <article-title>A side-channel attack on a hardware implementation of CRYSTALS-Kyber</article-title>. In: <conf-name>IEEE European Test Symposium (ETS)</conf-name>. <publisher-loc>Piscataway, NJ, USA</publisher-loc>: <publisher-name>IEEE</publisher-name>; <year>2023</year>. p. <fpage>1</fpage>&#x2013;<lpage>5</lpage>.</mixed-citation></ref>
<ref id="ref-54"><label>[54]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><surname>Wang</surname> <given-names>R</given-names></string-name>, <string-name><surname>Dubrova</surname> <given-names>E</given-names></string-name></person-group>. <chapter-title>A side-channel secret key recovery attack on CRYSTALS-Kyber using k chosen ciphertexts</chapter-title>. In: <source>Codes, cryptology and information security</source>. <publisher-loc>Cham, Switzerland</publisher-loc>: <publisher-name>Springer</publisher-name>; <year>2023</year>. p. <fpage>109</fpage>&#x2013;<lpage>28</lpage>.</mixed-citation></ref>
<ref id="ref-55"><label>[55]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Ngo</surname> <given-names>K</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>R</given-names></string-name>, <string-name><surname>Dubrova</surname> <given-names>E</given-names></string-name></person-group>. <article-title>Higher-order boolean masking does not prevent side-channel attacks on LWE/LWR-based PKE/KEMs</article-title>. In: <conf-name>International Symposium on Multiple-Valued Logic (ISMVL)</conf-name>. <publisher-loc>Piscataway, NJ, USA</publisher-loc>: <publisher-name>IEEE</publisher-name>; <year>2023</year>. p. <fpage>190</fpage>&#x2013;<lpage>5</lpage>.</mixed-citation></ref>
<ref id="ref-56"><label>[56]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Maghrebi</surname> <given-names>H</given-names></string-name>, <string-name><surname>Portigliatti</surname> <given-names>T</given-names></string-name>, <string-name><surname>Prouff</surname> <given-names>E</given-names></string-name></person-group>. <article-title>Breaking cryptographic implementations using deep learning techniques</article-title>. In: <conf-name>International Conference on Security, Privacy, and Applied Cryptography Engineering</conf-name>. <publisher-loc>Cham, Switzerland</publisher-loc>: <publisher-name>Springer</publisher-name>; <year>2016</year>. p. <fpage>3</fpage>&#x2013;<lpage>26</lpage>.</mixed-citation></ref>
<ref id="ref-57"><label>[57]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><surname>Liu</surname> <given-names>K</given-names></string-name></person-group>. <article-title>Far field EM side-channel attack based on deep learning with automated hyperparameter tuning [master&#x2019;s thesis]</article-title>. <publisher-loc>Stockholm, Sweden</publisher-loc>: <publisher-name>KTH Royal Institute of Technology</publisher-name>; <year>2021</year>.</mixed-citation></ref>
<ref id="ref-58"><label>[58]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Welch</surname> <given-names>BL</given-names></string-name></person-group>. <article-title>The generalization of &#x2018;Student&#x2019;s&#x2019; problem when several different population variances are involved</article-title>. <source>Biometrika</source>. <year>1947</year>;<volume>34</volume>(<issue>1&#x2013;2</issue>):<fpage>28</fpage>&#x2013;<lpage>35</lpage>. doi:<pub-id pub-id-type="doi">10.1093/biomet/34.1-2.28</pub-id>; <pub-id pub-id-type="pmid">20287819</pub-id></mixed-citation></ref>
<ref id="ref-59"><label>[59]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Sim</surname> <given-names>BY</given-names></string-name>, <string-name><surname>Kang</surname> <given-names>J</given-names></string-name>, <string-name><surname>Han</surname> <given-names>DG</given-names></string-name></person-group>. <article-title>Key bit-dependent side-channel attacks on protected binary scalar multiplication</article-title>. <source>Appl Sci</source>. <year>2018</year>;<volume>8</volume>(<issue>11</issue>):<fpage>2168</fpage>. doi:<pub-id pub-id-type="doi">10.3390/app8112168</pub-id>.</mixed-citation></ref>
<ref id="ref-60"><label>[60]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Elaabid</surname> <given-names>MA</given-names></string-name>, <string-name><surname>Meynard</surname> <given-names>O</given-names></string-name>, <string-name><surname>Guilley</surname> <given-names>S</given-names></string-name>, <string-name><surname>Danger</surname> <given-names>JL</given-names></string-name></person-group>. <article-title>Combined side-channel attacks</article-title>. In: <conf-name>International Workshop on Information Security Applications</conf-name>. <publisher-loc>Berlin/Heidelberg, Germany</publisher-loc>: <publisher-name>Springer</publisher-name>; <year>2010</year>. p. <fpage>175</fpage>&#x2013;<lpage>90</lpage>.</mixed-citation></ref>
<ref id="ref-61"><label>[61]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Standaert</surname> <given-names>FX</given-names></string-name>, <string-name><surname>Malkin</surname> <given-names>TG</given-names></string-name>, <string-name><surname>Yung</surname> <given-names>M</given-names></string-name></person-group>. <article-title>A unified framework for the analysis of side-channel key recovery attacks</article-title>. In: <conf-name>Annual International Conference on the Theory and Applications of Cryptographic Techniques</conf-name>. <publisher-loc>Berlin/Heidelberg, Germany</publisher-loc>: <publisher-name>Springer</publisher-name>; <year>2009</year>. p. <fpage>443</fpage>&#x2013;<lpage>61</lpage>.</mixed-citation></ref>
<ref id="ref-62"><label>[62]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zhang</surname> <given-names>J</given-names></string-name>, <string-name><surname>Zheng</surname> <given-names>M</given-names></string-name>, <string-name><surname>Nan</surname> <given-names>J</given-names></string-name>, <string-name><surname>Hu</surname> <given-names>H</given-names></string-name>, <string-name><surname>Yu</surname> <given-names>N</given-names></string-name></person-group>. <article-title>A novel evaluation metric for deep learning-based side channel analysis and its extended application to imbalanced data</article-title>. <source>IACR Trans Cryptogr Hardw Embed Syst (TCHES)</source>. <year>2020</year>;<volume>2020</volume>(<issue>3</issue>):<fpage>73</fpage>&#x2013;<lpage>96</lpage>. doi:<pub-id pub-id-type="doi">10.46586/tches.v2020.i3.73-96</pub-id>.</mixed-citation></ref>
<ref id="ref-63"><label>[63]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Pahlevanzadeh</surname> <given-names>H</given-names></string-name>, <string-name><surname>Dofe</surname> <given-names>J</given-names></string-name>, <string-name><surname>Yu</surname> <given-names>Q</given-names></string-name></person-group>. <article-title>Assessing CPA resistance of AES with different fault tolerance mechanisms</article-title>. In: <conf-name>Asia and South Pacific Design Automation Conference (ASP-DAC)</conf-name>. <publisher-loc>Piscataway, NJ, USA</publisher-loc>: <publisher-name>IEEE</publisher-name>; <year>2016</year>. p. <fpage>661</fpage>&#x2013;<lpage>6</lpage>.</mixed-citation></ref>
<ref id="ref-64"><label>[64]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Brisfors</surname> <given-names>M</given-names></string-name>, <string-name><surname>Forsmark</surname> <given-names>S</given-names></string-name>, <string-name><surname>Dubrova</surname> <given-names>E</given-names></string-name></person-group>. <article-title>How deep learning helps compromising USIM</article-title>. In: <conf-name>International Conference on Smart Card Research and Advanced Applications</conf-name>. <publisher-loc>Cham, Switzerland</publisher-loc>: <publisher-name>Springer</publisher-name>; <year>2020</year>. p. <fpage>135</fpage>&#x2013;<lpage>50</lpage>.</mixed-citation></ref>
<ref id="ref-65"><label>[65]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Hu</surname> <given-names>F</given-names></string-name>, <string-name><surname>Shen</surname> <given-names>J</given-names></string-name>, <string-name><surname>Vijayakumar</surname> <given-names>P</given-names></string-name></person-group>. <article-title>Side-channel attacks based on multi-loss regularized denoising AutoEncoder</article-title>. <source>IEEE Trans Inf Foren Secur</source>. <year>2024</year>;<volume>19</volume>:<fpage>2051</fpage>&#x2013;<lpage>65</lpage>. doi:<pub-id pub-id-type="doi">10.1109/tifs.2023.3343947</pub-id>.</mixed-citation></ref>
<ref id="ref-66"><label>[66]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><surname>Mangard</surname> <given-names>S</given-names></string-name>, <string-name><surname>Oswald</surname> <given-names>E</given-names></string-name>, <string-name><surname>Popp</surname> <given-names>T</given-names></string-name></person-group>. <source>Power analysis attacks: revealing the secrets of smart cards</source>. Vol. <volume>31</volume>. <publisher-loc>Cham, Switzerland</publisher-loc>: <publisher-name>Springer Science &#x0026; Business Media</publisher-name>; <year>2008</year>.</mixed-citation></ref>
<ref id="ref-67"><label>[67]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><surname>Koeune</surname> <given-names>F</given-names></string-name>, <string-name><surname>Standaert</surname> <given-names>FX</given-names></string-name></person-group>. <chapter-title>A tutorial on physical security and side-channel attacks</chapter-title>. In: <source>Foundations of security analysis and design III (FOSAD 2005, FOSAD 2004)</source>. <publisher-loc>Berlin/Heidelberg, Germany</publisher-loc>: <publisher-name>Springer</publisher-name>; <year>2004</year>. p. <fpage>78</fpage>&#x2013;<lpage>108</lpage>.</mixed-citation></ref>
<ref id="ref-68"><label>[68]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Bhasin</surname> <given-names>S</given-names></string-name>, <string-name><surname>Danger</surname> <given-names>JL</given-names></string-name>, <string-name><surname>Guilley</surname> <given-names>S</given-names></string-name>, <string-name><surname>Najm</surname> <given-names>Z</given-names></string-name></person-group>. <article-title>A low-entropy first-degree secure provable masking scheme for resource-constrained devices</article-title>. In: <conf-name>WESS &#x2019;13: Proceedings of the Workshop on Embedded Systems Security</conf-name>. <publisher-loc>New York, NY, USA</publisher-loc>: <publisher-name>ACM</publisher-name>; <year>2013</year>. p. <fpage>1</fpage>&#x2013;<lpage>10</lpage>.</mixed-citation></ref>
<ref id="ref-69"><label>[69]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Coron</surname> <given-names>JS</given-names></string-name>, <string-name><surname>Kizhvatov</surname> <given-names>I</given-names></string-name></person-group>. <article-title>An efficient method for random delay generation in embedded software</article-title>. In: <conf-name>International Workshop on Cryptographic Hardware and Embedded Systems (CHES)</conf-name>. <publisher-loc>Berlin/Heidelberg, Germany</publisher-loc>: <publisher-name>Springer</publisher-name>; <year>2009</year>. p. <fpage>156</fpage>&#x2013;<lpage>70</lpage>.</mixed-citation></ref>
<ref id="ref-70"><label>[70]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Bogdanov</surname> <given-names>A</given-names></string-name>, <string-name><surname>Khovratovich</surname> <given-names>D</given-names></string-name>, <string-name><surname>Rechberger</surname> <given-names>C</given-names></string-name></person-group>. <article-title>Biclique cryptanalysis of the full AES</article-title>. In: <conf-name>International Conference on the Theory and Application of Cryptology and Information Security</conf-name>. <publisher-loc>Berlin/Heidelberg, Germany</publisher-loc>: <publisher-name>Springer</publisher-name>; <year>2011</year>. p. <fpage>344</fpage>&#x2013;<lpage>71</lpage>.</mixed-citation></ref>
<ref id="ref-71"><label>[71]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Zorba</surname> <given-names>BB</given-names></string-name>, <string-name><surname>Alkar</surname> <given-names>AZ</given-names></string-name>, <string-name><surname>Aydos</surname> <given-names>M</given-names></string-name>, <string-name><surname>Kolukisa-Tarhan</surname> <given-names>A</given-names></string-name></person-group>. <article-title>Software implementation performances of block ciphers: a systematic literature review</article-title>. In: <conf-name>International Workshop on Big Data and Information Security (IWBIS)</conf-name>. <publisher-loc>Piscataway, NJ, USA</publisher-loc>: <publisher-name>IEEE</publisher-name>; <year>2022</year>. p. <fpage>65</fpage>&#x2013;<lpage>74</lpage>.</mixed-citation></ref>
<ref id="ref-72"><label>[72]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><surname>Dworkin</surname> <given-names>M</given-names></string-name></person-group>. <source>Recommendation for block cipher modes of operation: Galois/Counter Mode (GCM) and GMAC</source>. <publisher-loc>Gaithersburg, MD, USA</publisher-loc>: <publisher-name>NIST Special Publication</publisher-name>; <year>2007</year>.</mixed-citation></ref>
<ref id="ref-73"><label>[73]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Suykens</surname> <given-names>JA</given-names></string-name>, <string-name><surname>Vandewalle</surname> <given-names>J</given-names></string-name></person-group>. <article-title>Least squares support vector machine classifiers</article-title>. <source>Neural Process Lett</source>. <year>1999</year>;<volume>9</volume>(<issue>3</issue>):<fpage>293</fpage>&#x2013;<lpage>300</lpage>. doi:<pub-id pub-id-type="doi">10.1023/a:1018628609742</pub-id>.</mixed-citation></ref>
<ref id="ref-74"><label>[74]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Hospodar</surname> <given-names>G</given-names></string-name>, <string-name><surname>Gierlichs</surname> <given-names>B</given-names></string-name>, <string-name><surname>De Mulder</surname> <given-names>E</given-names></string-name>, <string-name><surname>Verbauwhede</surname> <given-names>I</given-names></string-name>, <string-name><surname>Vandewalle</surname> <given-names>J</given-names></string-name></person-group>. <article-title>Machine learning in side-channel analysis: a first study</article-title>. <source>J Cryptogr Eng</source>. <year>2011</year>;<volume>1</volume>(<issue>4</issue>):<fpage>293</fpage>. doi:<pub-id pub-id-type="doi">10.1007/s13389-011-0023-x</pub-id>.</mixed-citation></ref>
<ref id="ref-75"><label>[75]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Martinasek</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Zeman</surname> <given-names>V</given-names></string-name></person-group>. <article-title>Innovative method of the power analysis</article-title>. <source>Radioengineering</source>. <year>2013</year>;<volume>22</volume>(<issue>2</issue>):<fpage>586</fpage>&#x2013;<lpage>94</lpage>.</mixed-citation></ref>
<ref id="ref-76"><label>[76]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Martinasek</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Hajny</surname> <given-names>J</given-names></string-name>, <string-name><surname>Malina</surname> <given-names>L</given-names></string-name></person-group>. <article-title>Optimization of power analysis using neural network</article-title>. In: <conf-name>International Conference on Smart Card Research and Advanced Applications</conf-name>. <publisher-loc>Berlin/Heidelberg, Germany</publisher-loc>: <publisher-name>Springer</publisher-name>; <year>2013</year>. p. <fpage>94</fpage>&#x2013;<lpage>107</lpage>.</mixed-citation></ref>
<ref id="ref-77"><label>[77]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><surname>Martinasek</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Malina</surname> <given-names>L</given-names></string-name>, <string-name><surname>Trasy</surname> <given-names>K</given-names></string-name></person-group>. <chapter-title>Profiling power analysis attack based on multi-layer perceptron network</chapter-title>. In: <source>Computational problems in science and engineering</source>. <publisher-loc>Berlin/Heidelberg, Germany</publisher-loc>: <publisher-name>Springer</publisher-name>; <year>2015</year>. p. <fpage>317</fpage>&#x2013;<lpage>39</lpage>.</mixed-citation></ref>
<ref id="ref-78"><label>[78]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Wong</surname> <given-names>SC</given-names></string-name>, <string-name><surname>Gatt</surname> <given-names>A</given-names></string-name>, <string-name><surname>Stamatescu</surname> <given-names>V</given-names></string-name>, <string-name><surname>McDonnell</surname> <given-names>MD</given-names></string-name></person-group>. <article-title>Understanding data augmentation for classification: when to warp?</article-title> In: <conf-name>International Conference on Digital Image Computing: Techniques and Applications (DICTA)</conf-name>. <publisher-loc>Piscataway, NJ, USA</publisher-loc>: <publisher-name>IEEE</publisher-name>; <year>2016</year>. p. <fpage>1</fpage>&#x2013;<lpage>6</lpage>.</mixed-citation></ref>
<ref id="ref-79"><label>[79]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Picek</surname> <given-names>S</given-names></string-name>, <string-name><surname>Samiotis</surname> <given-names>IP</given-names></string-name>, <string-name><surname>Kim</surname> <given-names>J</given-names></string-name>, <string-name><surname>Heuser</surname> <given-names>A</given-names></string-name>, <string-name><surname>Bhasin</surname> <given-names>S</given-names></string-name>, <string-name><surname>Legay</surname> <given-names>A</given-names></string-name></person-group>. <article-title>On the performance of convolutional neural networks for side-channel analysis</article-title>. In: <conf-name>International Conference on Security, Privacy, and Applied Cryptographic Engineering</conf-name>. <publisher-loc>Berlin/Heidelberg, Germany</publisher-loc>: <publisher-name>Springer</publisher-name>; <year>2018</year>. p. <fpage>157</fpage>&#x2013;<lpage>76</lpage>.</mixed-citation></ref>
<ref id="ref-80"><label>[80]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Kim</surname> <given-names>J</given-names></string-name>, <string-name><surname>Picek</surname> <given-names>S</given-names></string-name>, <string-name><surname>Heuser</surname> <given-names>A</given-names></string-name>, <string-name><surname>Bhasin</surname> <given-names>S</given-names></string-name>, <string-name><surname>Hanjalic</surname> <given-names>A</given-names></string-name></person-group>. <article-title>Make some noise: unleashing the power of convolutional neural networks for profiled side-channel analysis</article-title>. <source>IACR Trans Cryptogr Hardw Embed Syst (TCHES)</source>. <year>2019</year>;<volume>2019</volume>(<issue>3</issue>):<fpage>148</fpage>&#x2013;<lpage>79</lpage>. doi:<pub-id pub-id-type="doi">10.46586/tches.v2019.i3.148-179</pub-id>.</mixed-citation></ref>
<ref id="ref-81"><label>[81]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Picek</surname> <given-names>S</given-names></string-name>, <string-name><surname>Heuser</surname> <given-names>A</given-names></string-name>, <string-name><surname>Jovic</surname> <given-names>A</given-names></string-name>, <string-name><surname>Bhasin</surname> <given-names>S</given-names></string-name>, <string-name><surname>Regazzoni</surname> <given-names>F</given-names></string-name></person-group>. <article-title>The curse of class imbalance and conflicting metrics with machine learning for side-channel evaluations</article-title>. <source>IACR Trans Cryptogr Hardw Embed Syst (TCHES)</source>. <year>2018</year>;<volume>2019</volume>(<issue>1</issue>):<fpage>209</fpage>&#x2013;<lpage>37</lpage>. doi:<pub-id pub-id-type="doi">10.46586/tches.v2019.i1.209-237</pub-id>.</mixed-citation></ref>
<ref id="ref-82"><label>[82]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Chawla</surname> <given-names>NV</given-names></string-name>, <string-name><surname>Bowyer</surname> <given-names>KW</given-names></string-name>, <string-name><surname>Hall</surname> <given-names>LO</given-names></string-name>, <string-name><surname>Kegelmeyer</surname> <given-names>WP</given-names></string-name></person-group>. <article-title>SMOTE: synthetic minority over-sampling technique</article-title>. <source>J Artif Intell Res</source>. <year>2002</year>;<volume>16</volume>:<fpage>321</fpage>&#x2013;<lpage>57</lpage>. doi:<pub-id pub-id-type="doi">10.1613/jair.953</pub-id>.</mixed-citation></ref>
<ref id="ref-83"><label>[83]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><surname>Jin</surname> <given-names>M</given-names></string-name>, <string-name><surname>Zheng</surname> <given-names>M</given-names></string-name>, <string-name><surname>Hu</surname> <given-names>H</given-names></string-name>, <string-name><surname>Yu</surname> <given-names>N</given-names></string-name></person-group>. <article-title>An enhanced convolutional neural network in side-channel attacks and its visualization</article-title>. <comment>arXiv:2009.08898. 2020</comment>.</mixed-citation></ref>
<ref id="ref-84"><label>[84]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Wang</surname> <given-names>H</given-names></string-name>, <string-name><surname>Forsmark</surname> <given-names>S</given-names></string-name>, <string-name><surname>Brisfors</surname> <given-names>M</given-names></string-name>, <string-name><surname>Dubrova</surname> <given-names>E</given-names></string-name></person-group>. <article-title>Multi-source training deep-learning side-channel attacks</article-title>. In: <conf-name>IEEE International Symposium on Multiple-Valued Logic (ISMVL)</conf-name>. <publisher-loc>Piscataway, NJ, USA</publisher-loc>: <publisher-name>IEEE</publisher-name>; <year>2020</year>. p. <fpage>58</fpage>&#x2013;<lpage>63</lpage>.</mixed-citation></ref>
<ref id="ref-85"><label>[85]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zaid</surname> <given-names>G</given-names></string-name>, <string-name><surname>Bossuet</surname> <given-names>L</given-names></string-name>, <string-name><surname>Habrard</surname> <given-names>A</given-names></string-name>, <string-name><surname>Venelli</surname> <given-names>A</given-names></string-name></person-group>. <article-title>Methodology for efficient CNN architectures in profiling attacks</article-title>. <source>IACR Trans Cryptogr Hardw Embed Syst (TCHES)</source>. <year>2020</year>;<volume>2020</volume>(<issue>1</issue>):<fpage>1</fpage>&#x2013;<lpage>36</lpage>. doi:<pub-id pub-id-type="doi">10.46586/tches.v2020.i1.1-36</pub-id>.</mixed-citation></ref>
<ref id="ref-86"><label>[86]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Huang</surname> <given-names>H</given-names></string-name>, <string-name><surname>Wu</surname> <given-names>J</given-names></string-name>, <string-name><surname>Tang</surname> <given-names>X</given-names></string-name>, <string-name><surname>Zhao</surname> <given-names>S</given-names></string-name>, <string-name><surname>Liu</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Yu</surname> <given-names>B</given-names></string-name></person-group>. <article-title>Deep learning-based improved side-channel attacks using data denoising and feature fusion</article-title>. <source>PLoS One</source>. <year>2025</year>;<volume>20</volume>(<issue>4</issue>):<fpage>e0315340</fpage>. doi:<pub-id pub-id-type="doi">10.1371/journal.pone.0315340</pub-id>; <pub-id pub-id-type="pmid">40203055</pub-id></mixed-citation></ref>
<ref id="ref-87"><label>[87]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Ou</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Wei</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Rodr&#x00ED;guez-Aldama</surname> <given-names>R</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>F</given-names></string-name></person-group>. <article-title>A lightweight deep learning model for profiled SCA based on random convolution kernels</article-title>. <source>Information</source>. <year>2025</year>;<volume>16</volume>(<issue>5</issue>):<fpage>351</fpage>. doi:<pub-id pub-id-type="doi">10.3390/info16050351</pub-id>.</mixed-citation></ref>
<ref id="ref-88"><label>[88]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Zhang</surname> <given-names>L</given-names></string-name>, <string-name><surname>Xing</surname> <given-names>X</given-names></string-name>, <string-name><surname>Fan</surname> <given-names>J</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>S</given-names></string-name></person-group>. <article-title>Multi-label deep learning based side channel attack</article-title>. In: <conf-name>Asian Hardware Oriented Security and Trust Symposium (AsianHOST)</conf-name>. <publisher-loc>Piscataway, NJ, USA</publisher-loc>: <publisher-name>IEEE</publisher-name>; <year>2019</year>. p. <fpage>1</fpage>&#x2013;<lpage>6</lpage>.</mixed-citation></ref>
<ref id="ref-89"><label>[89]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Wu</surname> <given-names>L</given-names></string-name>, <string-name><surname>Picek</surname> <given-names>S</given-names></string-name></person-group>. <article-title>Remove some noise: on pre-processing of side-channel measurements with autoencoders</article-title>. <source>IACR Trans Cryptogr Hardw Embed Syst (TCHES)</source>. <year>2020</year>;<volume>2020</volume>(<issue>4</issue>):<fpage>389</fpage>&#x2013;<lpage>415</lpage>. doi:<pub-id pub-id-type="doi">10.46586/tches.v2020.i4.389-415</pub-id>.</mixed-citation></ref>
<ref id="ref-90"><label>[90]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zaid</surname> <given-names>G</given-names></string-name>, <string-name><surname>Bossuet</surname> <given-names>L</given-names></string-name>, <string-name><surname>Carbone</surname> <given-names>M</given-names></string-name>, <string-name><surname>Habrard</surname> <given-names>A</given-names></string-name>, <string-name><surname>Venelli</surname> <given-names>A</given-names></string-name></person-group>. <article-title>Conditional variational autoencoder based on stochastic attacks</article-title>. <source>IACR Trans Cryptogr Hardw Embed Syst (TCHES)</source>. <year>2023</year>;<volume>2023</volume>(<issue>2</issue>):<fpage>310</fpage>&#x2013;<lpage>57</lpage>. doi:<pub-id pub-id-type="doi">10.46586/tches.v2023.i2.310-357</pub-id>.</mixed-citation></ref>
<ref id="ref-91"><label>[91]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Wu</surname> <given-names>L</given-names></string-name>, <string-name><surname>Weissbart</surname> <given-names>L</given-names></string-name>, <string-name><surname>Kr&#x010D;ek</surname> <given-names>M</given-names></string-name>, <string-name><surname>Li</surname> <given-names>H</given-names></string-name>, <string-name><surname>Perin</surname> <given-names>G</given-names></string-name>, <string-name><surname>Batina</surname> <given-names>L</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Label correlation in deep learning-based side-channel analysis</article-title>. <source>IEEE Trans Inf Foren Secur</source>. <year>2023</year>;<volume>18</volume>:<fpage>3849</fpage>&#x2013;<lpage>61</lpage>. doi:<pub-id pub-id-type="doi">10.1109/tifs.2023.3287728</pub-id>.</mixed-citation></ref>
<ref id="ref-92"><label>[92]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Yao</surname> <given-names>F</given-names></string-name>, <string-name><surname>Wei</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>H</given-names></string-name>, <string-name><surname>Pasalic</surname> <given-names>E</given-names></string-name></person-group>. <article-title>Improving the performance of CPA attacks for ciphers using parallel implementation of S-boxes</article-title>. <source>IET Inf Secur</source>. <year>2023</year>;<volume>2023</volume>(<issue>1</issue>):<fpage>6653956</fpage>. doi:<pub-id pub-id-type="doi">10.1049/2023/6653956</pub-id>.</mixed-citation></ref>
<ref id="ref-93"><label>[93]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zhang</surname> <given-names>X</given-names></string-name>, <string-name><surname>Zhu</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Hu</surname> <given-names>B</given-names></string-name>, <string-name><surname>Cao</surname> <given-names>J</given-names></string-name>, <string-name><surname>Lin</surname> <given-names>Z</given-names></string-name></person-group>. <article-title>A novel power system side channel attack method based on machine learning CNN-transformer</article-title>. <source>J Phys Conf Ser</source>. <year>2023</year>;<volume>2615</volume>(<issue>1</issue>):<fpage>012011</fpage>. doi:<pub-id pub-id-type="doi">10.1088/1742-6596/2615/1/012011</pub-id>.</mixed-citation></ref>
<ref id="ref-94"><label>[94]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zhu</surname> <given-names>J</given-names></string-name>, <string-name><surname>Lu</surname> <given-names>J</given-names></string-name></person-group>. <article-title>Leading degree: a metric for model performance evaluation and hyperparameter tuning in deep learning-based side-channel analysis</article-title>. <source>IACR Trans Cryptogr Hardw Embed Syst (TCHES)</source>. <year>2025</year>;<volume>2025</volume>(<issue>2</issue>):<fpage>333</fpage>&#x2013;<lpage>61</lpage>. doi:<pub-id pub-id-type="doi">10.46586/tches.v2025.i2.333-361</pub-id>.</mixed-citation></ref>
<ref id="ref-95"><label>[95]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><surname>Wang</surname> <given-names>H</given-names></string-name></person-group>. <article-title>Deep learning side-channel attacks on advanced encryption standard [dissertation]</article-title>. <publisher-loc>Stockholm, Sweden</publisher-loc>: <publisher-name>KTH Royal Institute of Technology</publisher-name>; <year>2023</year>.</mixed-citation></ref>
<ref id="ref-96"><label>[96]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Benadjila</surname> <given-names>R</given-names></string-name>, <string-name><surname>Prouff</surname> <given-names>E</given-names></string-name>, <string-name><surname>Strullu</surname> <given-names>R</given-names></string-name>, <string-name><surname>Cagli</surname> <given-names>E</given-names></string-name>, <string-name><surname>Dumas</surname> <given-names>C</given-names></string-name></person-group>. <article-title>Deep learning for side-channel analysis and introduction to ASCAD database</article-title>. <source>J Cryptogr Eng</source>. <year>2020</year>;<volume>10</volume>(<issue>2</issue>):<fpage>163</fpage>&#x2013;<lpage>88</lpage>. doi:<pub-id pub-id-type="doi">10.1007/s13389-019-00220-8</pub-id>.</mixed-citation></ref>
<ref id="ref-97"><label>[97]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Wang</surname> <given-names>H</given-names></string-name>, <string-name><surname>Brisfors</surname> <given-names>M</given-names></string-name>, <string-name><surname>Forsmark</surname> <given-names>S</given-names></string-name>, <string-name><surname>Dubrova</surname> <given-names>E</given-names></string-name></person-group>. <article-title>How diversity affects deep-learning side-channel attacks</article-title>. In: <conf-name>IEEE Nordic Circuits and Systems Conference (NORCAS)</conf-name>. <publisher-loc>Piscataway, NJ, USA</publisher-loc>: <publisher-name>IEEE</publisher-name>; <year>2019</year>. p. <fpage>1</fpage>&#x2013;<lpage>7</lpage>.</mixed-citation></ref>
<ref id="ref-98"><label>[98]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Wang</surname> <given-names>R</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>H</given-names></string-name>, <string-name><surname>Dubrova</surname> <given-names>E</given-names></string-name></person-group>. <article-title>Far field EM side-channel attack on AES using deep learning</article-title>. In: <conf-name>ACM Workshop on Attacks and Solutions in Hardware Security (ASHES)</conf-name>. <publisher-loc>New York, NY, USA</publisher-loc>: <publisher-name>ACM</publisher-name>; <year>2020</year>. p. <fpage>35</fpage>&#x2013;<lpage>44</lpage>.</mixed-citation></ref>
<ref id="ref-99"><label>[99]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Rijsdijk</surname> <given-names>J</given-names></string-name>, <string-name><surname>Wu</surname> <given-names>L</given-names></string-name>, <string-name><surname>Perin</surname> <given-names>G</given-names></string-name>, <string-name><surname>Picek</surname> <given-names>S</given-names></string-name></person-group>. <article-title>Reinforcement learning for hyperparameter tuning in deep learning-based side-channel analysis</article-title>. <source>IACR Trans Cryptogr Hardw Embed Syst</source>. <year>2021</year>;<volume>2021</volume>(<issue>3</issue>):<fpage>677</fpage>&#x2013;<lpage>707</lpage>. doi:<pub-id pub-id-type="doi">10.46586/tches.v2021.i3.677-707</pub-id>.</mixed-citation></ref>
<ref id="ref-100"><label>[100]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Wu</surname> <given-names>L</given-names></string-name>, <string-name><surname>Perin</surname> <given-names>G</given-names></string-name>, <string-name><surname>Picek</surname> <given-names>S</given-names></string-name></person-group>. <article-title>I choose you: automated hyperparameter tuning for deep learning-based side-channel analysis</article-title>. <source>IEEE Trans Emerg Top Comput</source>. <year>2024</year>;<volume>12</volume>(<issue>2</issue>):<fpage>546</fpage>&#x2013;<lpage>57</lpage>. doi:<pub-id pub-id-type="doi">10.1109/tetc.2022.3218372</pub-id>.</mixed-citation></ref>
<ref id="ref-101"><label>[101]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Feng</surname> <given-names>T</given-names></string-name>, <string-name><surname>Gao</surname> <given-names>H</given-names></string-name>, <string-name><surname>Li</surname> <given-names>X</given-names></string-name>, <string-name><surname>Liu</surname> <given-names>C</given-names></string-name></person-group>. <article-title>Side-channel attacks on convolutional neural networks based on the hybrid attention mechanism</article-title>. <source>Discov Appl Sci</source>. <year>2025</year>;<volume>7</volume>(<issue>5</issue>):<fpage>1</fpage>&#x2013;<lpage>15</lpage>. doi:<pub-id pub-id-type="doi">10.1007/s42452-025-06854-0</pub-id>.</mixed-citation></ref>
<ref id="ref-102"><label>[102]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><surname>Wang</surname> <given-names>H</given-names></string-name>, <string-name><surname>Dubrova</surname> <given-names>E</given-names></string-name></person-group>. <source>Tandem deep learning side-channel attack on FPGA implementation of AES</source>. Vol. <volume>2</volume>. <publisher-loc>Berlin/Heidelberg, Germany</publisher-loc>: <publisher-name>Springer</publisher-name>; <year>2021</year>. p. <fpage>1</fpage>&#x2013;<lpage>12</lpage>.</mixed-citation></ref>
<ref id="ref-103"><label>[103]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Bischof</surname> <given-names>H</given-names></string-name>, <string-name><surname>Pinz</surname> <given-names>A</given-names></string-name>, <string-name><surname>Kropatsch</surname> <given-names>WG</given-names></string-name></person-group>. <article-title>Visualization methods for neural networks</article-title>. In: <conf-name>IAPR International Conference on Pattern Recognition</conf-name>. <publisher-loc>Piscataway, NJ, USA</publisher-loc>: <publisher-name>IEEE</publisher-name>; <year>1992</year>. p. <fpage>581</fpage>&#x2013;<lpage>5</lpage>.</mixed-citation></ref>
<ref id="ref-104"><label>[104]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Masure</surname> <given-names>L</given-names></string-name>, <string-name><surname>Dumas</surname> <given-names>C</given-names></string-name>, <string-name><surname>Prouff</surname> <given-names>E</given-names></string-name></person-group>. <article-title>Gradient visualization for general characterization in profiling attacks</article-title>. In: <conf-name>International Workshop on Constructive Side-Channel Analysis and Secure Design</conf-name>. <publisher-loc>Berlin/Heidelberg, Germany</publisher-loc>: <publisher-name>Springer</publisher-name>; <year>2019</year>. p. <fpage>145</fpage>&#x2013;<lpage>67</lpage>.</mixed-citation></ref>
<ref id="ref-105"><label>[105]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Zeiler</surname> <given-names>MD</given-names></string-name>, <string-name><surname>Fergus</surname> <given-names>R</given-names></string-name></person-group>. <article-title>Visualizing and understanding convolutional networks</article-title>. In: <conf-name>European Conference on Computer Vision</conf-name>. <publisher-loc>Berlin/Heidelberg, Germany</publisher-loc>: <publisher-name>Springer</publisher-name>; <year>2014</year>. p. <fpage>818</fpage>&#x2013;<lpage>33</lpage>.</mixed-citation></ref>
<ref id="ref-106"><label>[106]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Wang</surname> <given-names>H</given-names></string-name>, <string-name><surname>Feng</surname> <given-names>T</given-names></string-name>, <string-name><surname>Liu</surname> <given-names>C</given-names></string-name></person-group>. <article-title>Wavelet coefficients based generative adversarial networks for side-channel attack preprocessing</article-title>. <source>Signal Image Video Process</source>. <year>2025</year>;<volume>19</volume>(<issue>9</issue>):<fpage>697</fpage>. doi:<pub-id pub-id-type="doi">10.1007/s11760-025-04272-8</pub-id>.</mixed-citation></ref>
<ref id="ref-107"><label>[107]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><surname>Sutton</surname> <given-names>RS</given-names></string-name>, <string-name><surname>Barto</surname> <given-names>AG</given-names></string-name></person-group>. <source>Reinforcement learning: an introduction</source>. <publisher-loc>Cambridge, MA, USA</publisher-loc>: <publisher-name>MIT Press</publisher-name>; <year>2018</year>.</mixed-citation></ref>
<ref id="ref-108"><label>[108]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><surname>de Bruijn</surname> <given-names>V</given-names></string-name></person-group>. <article-title>Side-channel analysis with graph neural networks [master&#x2019;s thesis]</article-title>. <publisher-loc>Delft, The Netherlands</publisher-loc>: <publisher-name>Delft University of Technology</publisher-name>; <year>2021</year>.</mixed-citation></ref>
<ref id="ref-109"><label>[109]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Das</surname> <given-names>D</given-names></string-name>, <string-name><surname>Golder</surname> <given-names>A</given-names></string-name>, <string-name><surname>Danial</surname> <given-names>J</given-names></string-name>, <string-name><surname>Ghosh</surname> <given-names>S</given-names></string-name>, <string-name><surname>Raychowdhury</surname> <given-names>A</given-names></string-name>, <string-name><surname>Sen</surname> <given-names>S</given-names></string-name></person-group>. <article-title>X-DeepSCA: cross-device deep learning side channel attack</article-title>. In: <conf-name>2019 56th ACM/IEEE Design Automation Conference (DAC)</conf-name>. <publisher-loc>Piscataway, NJ, USA</publisher-loc>: <publisher-name>IEEE</publisher-name>; <year>2019</year>. p. <fpage>1</fpage>&#x2013;<lpage>6</lpage>.</mixed-citation></ref>
<ref id="ref-110"><label>[110]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Golder</surname> <given-names>A</given-names></string-name>, <string-name><surname>Das</surname> <given-names>D</given-names></string-name>, <string-name><surname>Danial</surname> <given-names>J</given-names></string-name>, <string-name><surname>Ghosh</surname> <given-names>S</given-names></string-name>, <string-name><surname>Sen</surname> <given-names>S</given-names></string-name>, <string-name><surname>Raychowdhury</surname> <given-names>A</given-names></string-name></person-group>. <article-title>Practical approaches toward deep-learning-based cross-device power side-channel attack</article-title>. <source>IEEE Trans Very Large Scale Integr Syst</source>. <year>2019</year>;<volume>27</volume>(<issue>12</issue>):<fpage>2720</fpage>&#x2013;<lpage>33</lpage>. doi:<pub-id pub-id-type="doi">10.1109/tvlsi.2019.2926324</pub-id>.</mixed-citation></ref>
<ref id="ref-111"><label>[111]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Wang</surname> <given-names>H</given-names></string-name>, <string-name><surname>Dubrova</surname> <given-names>E</given-names></string-name></person-group>. <article-title>Federated learning in side-channel analysis</article-title>. In: <conf-name>Information Security and Cryptology-ICISC 2020: 23rd International Conference</conf-name>. <publisher-loc>Berlin/Heidelberg, Germany</publisher-loc>: <publisher-name>Springer</publisher-name>; <year>2021</year>. p. <fpage>257</fpage>&#x2013;<lpage>72</lpage>.</mixed-citation></ref>
<ref id="ref-112"><label>[112]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Hu</surname> <given-names>F</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>H</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>J</given-names></string-name></person-group>. <article-title>Cross subkey SCA based on small samples</article-title>. <source>Sci Rep</source>. <year>2022</year>;<volume>12</volume>(<issue>1</issue>):<fpage>6254</fpage>. PMID: <pub-id pub-id-type="pmid">35428761</pub-id>.</mixed-citation></ref>
<ref id="ref-113"><label>[113]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><surname>Rezaeezade</surname> <given-names>A</given-names></string-name>, <string-name><surname>Yap</surname> <given-names>T</given-names></string-name>, <string-name><surname>Jap</surname> <given-names>D</given-names></string-name>, <string-name><surname>Bhasin</surname> <given-names>S</given-names></string-name>, <string-name><surname>Picek</surname> <given-names>S</given-names></string-name></person-group>. <article-title>Breaking the blindfold: deep learning-based blind side-channel analysis</article-title>. Cryptology ePrint Archive; <year>2025</year>. Paper 2025/157. <comment>[cited 2025 Nov 2]</comment>. Available from: <ext-link ext-link-type="uri" xlink:href="https://eprint.iacr.org/2025/157">https://eprint.iacr.org/2025/157</ext-link>.</mixed-citation></ref>
<ref id="ref-114"><label>[114]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><surname>Ninan</surname> <given-names>M</given-names></string-name>, <string-name><surname>Nimmo</surname> <given-names>E</given-names></string-name>, <string-name><surname>Reilly</surname> <given-names>S</given-names></string-name>, <string-name><surname>Smith</surname> <given-names>C</given-names></string-name>, <string-name><surname>Sun</surname> <given-names>W</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>B</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>A second look at the portability of deep learning side-channel attacks over EM traces</article-title>. <comment>[cited 2025 Nov 2]</comment>. Available from: <ext-link ext-link-type="uri" xlink:href="https://par.nsf.gov/biblio/10566521">https://par.nsf.gov/biblio/10566521</ext-link>.</mixed-citation></ref>
<ref id="ref-115"><label>[115]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Liao</surname> <given-names>J</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>H</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>J</given-names></string-name>, <string-name><surname>Tang</surname> <given-names>Y</given-names></string-name></person-group>. <article-title>Switch-T: a novel multi-task deep-learning network for cross-device side-channel attack</article-title>. <source>J Inf Secur Appl</source>. <year>2025</year>;<volume>93</volume>(<issue>1</issue>):<fpage>104146</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.jisa.2025.104146</pub-id>.</mixed-citation></ref>
<ref id="ref-116"><label>[116]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Wang</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Ding</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>A</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Wei</surname> <given-names>C</given-names></string-name>, <string-name><surname>Sun</surname> <given-names>S</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>SPA-GPT: general pulse tailor for simple power analysis based on reinforcement learning</article-title>. <source>IACR Trans Cryptogr Hardw Embed Syst</source>. <year>2024</year>;<volume>2024</volume>(<issue>4</issue>):<fpage>40</fpage>&#x2013;<lpage>83</lpage>. doi:<pub-id pub-id-type="doi">10.46586/tches.v2024.i4.40-83</pub-id>.</mixed-citation></ref>
<ref id="ref-117"><label>[117]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Chen</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>B</given-names></string-name>, <string-name><surname>Su</surname> <given-names>C</given-names></string-name>, <string-name><surname>Li</surname> <given-names>A</given-names></string-name>, <string-name><surname>Tang</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Li</surname> <given-names>G</given-names></string-name></person-group>. <article-title>Enhancing model generalization for efficient cross-device side-channel analysis</article-title>. <source>IEEE Trans Inf Forensics Secur</source>. <year>2025</year>;<volume>20</volume>(<issue>2</issue>):<fpage>10114</fpage>&#x2013;<lpage>29</lpage>. doi:<pub-id pub-id-type="doi">10.1109/tifs.2025.3611696</pub-id>.</mixed-citation></ref>
<ref id="ref-118"><label>[118]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Lv</surname> <given-names>H</given-names></string-name>, <string-name><surname>Zheng</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Luo</surname> <given-names>T</given-names></string-name>, <string-name><surname>Wu</surname> <given-names>F</given-names></string-name>, <string-name><surname>Tang</surname> <given-names>S</given-names></string-name>, <string-name><surname>Hua</surname> <given-names>L</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Data-free evaluation of user contributions in federated learning</article-title>. In: <conf-name>International Symposium on Modeling and Optimization in Mobile, Ad Hoc, and Wireless Networks</conf-name>. <publisher-loc>Piscataway, NJ, USA</publisher-loc>: <publisher-name>IEEE</publisher-name>; <year>2021</year>. p. <fpage>1</fpage>&#x2013;<lpage>8</lpage>.</mixed-citation></ref>
<ref id="ref-119"><label>[119]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><surname>McMahan</surname> <given-names>B</given-names></string-name>, <string-name><surname>Moore</surname> <given-names>E</given-names></string-name>, <string-name><surname>Ramage</surname> <given-names>D</given-names></string-name>, <string-name><surname>Hampson</surname> <given-names>S</given-names></string-name>, <string-name><surname>Arcas</surname> <given-names>BA</given-names></string-name></person-group>. <chapter-title>Communication-efficient learning of deep networks from decentralized data</chapter-title>. In: <source>Artificial Intelligence and Statistics</source>. <publisher-loc>London, UK</publisher-loc>: <publisher-name>PMLR</publisher-name>; <year>2017</year>. p. <fpage>1273</fpage>&#x2013;<lpage>82</lpage>.</mixed-citation></ref>
<ref id="ref-120"><label>[120]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Liu</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Lv</surname> <given-names>H</given-names></string-name>, <string-name><surname>Liu</surname> <given-names>X</given-names></string-name>, <string-name><surname>Ma</surname> <given-names>C</given-names></string-name>, <string-name><surname>Wu</surname> <given-names>F</given-names></string-name>, <string-name><surname>Liu</surname> <given-names>L</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Similarity and diversity: PCA-based contribution evaluation in federated learning</article-title>. <source>IEEE Internet Things J</source>. <year>2025</year>;<volume>12</volume>(<issue>12</issue>):<fpage>20393</fpage>&#x2013;<lpage>405</lpage>. doi:<pub-id pub-id-type="doi">10.1109/jiot.2025.3546679</pub-id>.</mixed-citation></ref>
<ref id="ref-121"><label>[121]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><surname>Polikar</surname> <given-names>R</given-names></string-name></person-group>. <chapter-title>Ensemble learning</chapter-title>. In: <source>Ensemble machine learning</source>. <publisher-loc>Berlin/Heidelberg, Germany</publisher-loc>: <publisher-name>Springer</publisher-name>; <year>2012</year>. p. <fpage>1</fpage>&#x2013;<lpage>34</lpage>.</mixed-citation></ref>
<ref id="ref-122"><label>[122]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Wu</surname> <given-names>L</given-names></string-name>, <string-name><surname>Won</surname> <given-names>YS</given-names></string-name>, <string-name><surname>Jap</surname> <given-names>D</given-names></string-name>, <string-name><surname>Perin</surname> <given-names>G</given-names></string-name>, <string-name><surname>Bhasin</surname> <given-names>S</given-names></string-name>, <string-name><surname>Picek</surname> <given-names>S</given-names></string-name></person-group>. <article-title>Ablation analysis for multi-device deep learning-based physical side-channel analysis</article-title>. <source>IEEE Trans Depend Secure Comput</source>. <year>2024</year>;<volume>21</volume>(<issue>3</issue>):<fpage>1331</fpage>&#x2013;<lpage>41</lpage>. doi:<pub-id pub-id-type="doi">10.1109/tdsc.2023.3278857</pub-id>.</mixed-citation></ref>
<ref id="ref-123"><label>[123]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Priya</surname> <given-names>SSS</given-names></string-name>, <string-name><surname>Karthigaikumar</surname> <given-names>P</given-names></string-name>, <string-name><surname>Teja</surname> <given-names>NR</given-names></string-name></person-group>. <article-title>FPGA implementation of AES algorithm for high speed applications</article-title>. <source>Analog Integr Circuits Signal Process</source>. <year>2022</year>;<volume>112</volume>(<issue>1</issue>):<fpage>115</fpage>&#x2013;<lpage>25</lpage>. doi:<pub-id pub-id-type="doi">10.1007/s10470-021-01959-z</pub-id>.</mixed-citation></ref>
<ref id="ref-124"><label>[124]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Lee</surname> <given-names>U</given-names></string-name>, <string-name><surname>Kim</surname> <given-names>HK</given-names></string-name>, <string-name><surname>Lim</surname> <given-names>YJ</given-names></string-name>, <string-name><surname>Sunwoo</surname> <given-names>MH</given-names></string-name></person-group>. <article-title>Resource-efficient FPGA implementation of advanced encryption standard</article-title>. In: <conf-name>IEEE International Symposium on Circuits and Systems (ISCAS)</conf-name>. <publisher-loc>Piscataway, NJ, USA</publisher-loc>: <publisher-name>IEEE</publisher-name>; <year>2022</year>. p. <fpage>1165</fpage>&#x2013;<lpage>9</lpage>.</mixed-citation></ref>
<ref id="ref-125"><label>[125]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><collab>TELECOM ParisTech SEN research group</collab></person-group>. <article-title>DPA contest v2</article-title>. <year>2010</year> [Internet]. <comment>[cited 2025 Nov 2]</comment>. Available from: <ext-link ext-link-type="uri" xlink:href="http://www.DPAcontest.org/v2/">http://www.DPAcontest.org/v2/</ext-link>.</mixed-citation></ref>
<ref id="ref-126"><label>[126]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Masure</surname> <given-names>L</given-names></string-name>, <string-name><surname>Dumas</surname> <given-names>C</given-names></string-name>, <string-name><surname>Prouff</surname> <given-names>E</given-names></string-name></person-group>. <article-title>A comprehensive study of deep learning for side-channel analysis</article-title>. <source>IACR Trans Cryptogr Hardw Embed Syst</source>. <year>2020</year>;<volume>2020</volume>(<issue>1</issue>):<fpage>348</fpage>&#x2013;<lpage>75</lpage>. doi:<pub-id pub-id-type="doi">10.46586/tches.v2020.i1.348-375</pub-id>.</mixed-citation></ref>
<ref id="ref-127"><label>[127]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Ramezanpour</surname> <given-names>K</given-names></string-name>, <string-name><surname>Ampadu</surname> <given-names>P</given-names></string-name>, <string-name><surname>Diehl</surname> <given-names>W</given-names></string-name></person-group>. <article-title>SCAUL: power side-channel analysis with unsupervised learning</article-title>. <source>IEEE Trans Comput</source>. <year>2020</year>;<volume>69</volume>(<issue>11</issue>):<fpage>1626</fpage>&#x2013;<lpage>38</lpage>. doi:<pub-id pub-id-type="doi">10.1109/tc.2020.3013196</pub-id>.</mixed-citation></ref>
<ref id="ref-128"><label>[128]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Kubota</surname> <given-names>T</given-names></string-name>, <string-name><surname>Yoshida</surname> <given-names>K</given-names></string-name>, <string-name><surname>Shiozaki</surname> <given-names>M</given-names></string-name>, <string-name><surname>Fujino</surname> <given-names>T</given-names></string-name></person-group>. <article-title>Deep learning side-channel attack against hardware implementations of AES</article-title>. In: <conf-name>Euromicro Conference on Digital System Design</conf-name>. <publisher-loc>Piscataway, NJ, USA</publisher-loc>: <publisher-name>IEEE</publisher-name>; <year>2019</year>. p. <fpage>261</fpage>&#x2013;<lpage>8</lpage>.</mixed-citation></ref>
<ref id="ref-129"><label>[129]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Aysu</surname> <given-names>A</given-names></string-name>, <string-name><surname>Orshansky</surname> <given-names>M</given-names></string-name>, <string-name><surname>Tiwari</surname> <given-names>M</given-names></string-name></person-group>. <article-title>Binary ring-LWE hardware with power side-channel countermeasures</article-title>. In: <conf-name>2018 Design, Automation &#x0026; Test in Europe Conference &#x0026; Exhibition (DATE)</conf-name>. <publisher-loc>Piscataway, NJ, USA</publisher-loc>: <publisher-name>IEEE</publisher-name>; <year>2018</year>. p. <fpage>1253</fpage>&#x2013;<lpage>8</lpage>.</mixed-citation></ref>
<ref id="ref-130"><label>[130]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Cui</surname> <given-names>S</given-names></string-name>, <string-name><surname>Balasch</surname> <given-names>J</given-names></string-name></person-group>. <article-title>Efficient software masking of AES through instruction set extensions</article-title>. In: <conf-name>2023 Design, Automation &#x0026; Test in Europe Conference &#x0026; Exhibition (DATE)</conf-name>. <publisher-loc>Piscataway, NJ, USA</publisher-loc>: <publisher-name>IEEE</publisher-name>; <year>2023</year>. p. <fpage>1</fpage>&#x2013;<lpage>6</lpage>.</mixed-citation></ref>
<ref id="ref-131"><label>[131]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Barthe</surname> <given-names>G</given-names></string-name>, <string-name><surname>Gr&#x00E9;goire</surname> <given-names>B</given-names></string-name>, <string-name><surname>Laporte</surname> <given-names>V</given-names></string-name></person-group>. <article-title>Secure compilation of side-channel countermeasures: the case of cryptographic &#x201C;constant-time&#x201D;</article-title>. In: <conf-name>2018 IEEE 31st Computer Security Foundations Symposium (CSF)</conf-name>. <publisher-loc>Piscataway, NJ, USA</publisher-loc>: <publisher-name>IEEE</publisher-name>; <year>2018</year>. p. <fpage>328</fpage>&#x2013;<lpage>43</lpage>.</mixed-citation></ref>
<ref id="ref-132"><label>[132]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Bhandari</surname> <given-names>J</given-names></string-name>, <string-name><surname>Nabeel</surname> <given-names>M</given-names></string-name>, <string-name><surname>Mankali</surname> <given-names>L</given-names></string-name>, <string-name><surname>Sinanoglu</surname> <given-names>O</given-names></string-name>, <string-name><surname>Karri</surname> <given-names>R</given-names></string-name>, <string-name><surname>Knechtel</surname> <given-names>J</given-names></string-name></person-group>. <article-title>LiCSPA: lightweight countermeasure against static power side-channel attacks</article-title>. In: <conf-name>2025 IEEE International Symposium on Circuits and Systems (ISCAS)</conf-name>. <publisher-loc>Piscataway, NJ, USA</publisher-loc>: <publisher-name>IEEE</publisher-name>; <year>2025</year>. p. <fpage>1</fpage>&#x2013;<lpage>5</lpage>.</mixed-citation></ref>
<ref id="ref-133"><label>[133]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Moraitis</surname> <given-names>M</given-names></string-name>, <string-name><surname>Brisfors</surname> <given-names>M</given-names></string-name>, <string-name><surname>Dubrova</surname> <given-names>E</given-names></string-name>, <string-name><surname>Lindskog</surname> <given-names>N</given-names></string-name>, <string-name><surname>Englund</surname> <given-names>H</given-names></string-name></person-group>. <article-title>A side-channel resistant implementation of AES combining clock randomization with duplication</article-title>. In: <conf-name>IEEE International Symposium on Circuits and Systems (ISCAS)</conf-name>. <publisher-loc>Piscataway, NJ, USA</publisher-loc>: <publisher-name>IEEE</publisher-name>; <year>2023</year>. p. <fpage>1</fpage>&#x2013;<lpage>5</lpage>.</mixed-citation></ref>
<ref id="ref-134"><label>[134]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Zhang</surname> <given-names>Q</given-names></string-name>, <string-name><surname>Liu</surname> <given-names>D</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>X</given-names></string-name>, <string-name><surname>Cao</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Zeng</surname> <given-names>F</given-names></string-name>, <string-name><surname>Jiang</surname> <given-names>H</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Eye of sauron: long-range hidden spy camera detection and positioning with inbuilt memory EM radiation</article-title>. In: <conf-name>USENIX Security Symposium (USENIX Security)</conf-name>. <publisher-loc>Berkeley, CA, USA</publisher-loc>: <publisher-name>USENIX Association</publisher-name>; <year>2024</year>. p. <fpage>109</fpage>&#x2013;<lpage>26</lpage>.</mixed-citation></ref>
<ref id="ref-135"><label>[135]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Chen</surname> <given-names>P</given-names></string-name>, <string-name><surname>Li</surname> <given-names>J</given-names></string-name>, <string-name><surname>Cheng</surname> <given-names>W</given-names></string-name>, <string-name><surname>Cheng</surname> <given-names>C</given-names></string-name></person-group>. <article-title>Uncover secrets through the cover: a deep learning-based side-channel attack against Kyber implementations with anti-tampering covers</article-title>. <source>IEEE Trans Comput</source>. <year>2025</year>;<volume>74</volume>(<issue>6</issue>):<fpage>2159</fpage>&#x2013;<lpage>67</lpage>. doi:<pub-id pub-id-type="doi">10.1109/tc.2025.3547610</pub-id>.</mixed-citation></ref>
</ref-list>
</back></article>