<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.1 20151215//EN" "http://jats.nlm.nih.gov/publishing/1.1/JATS-journalpublishing1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xml:lang="en" article-type="research-article" dtd-version="1.1">
<front>
<journal-meta>
<journal-id journal-id-type="pmc">CMC</journal-id>
<journal-id journal-id-type="nlm-ta">CMC</journal-id>
<journal-id journal-id-type="publisher-id">CMC</journal-id>
<journal-title-group>
<journal-title>Computers, Materials &#x0026; Continua</journal-title>
</journal-title-group>
<issn pub-type="epub">1546-2226</issn>
<issn pub-type="ppub">1546-2218</issn>
<publisher>
<publisher-name>Tech Science Press</publisher-name>
<publisher-loc>USA</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">73076</article-id>
<article-id pub-id-type="doi">10.32604/cmc.2025.073076</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Dynamic Malware Detection Method Based on API Multiple Subsequences</article-title>
<alt-title alt-title-type="left-running-head">Dynamic Malware Detection Method Based on API Multiple Subsequences</alt-title>
<alt-title alt-title-type="right-running-head">Dynamic Malware Detection Method Based on API Multiple Subsequences</alt-title>
</title-group>
<contrib-group>
<contrib id="author-1" contrib-type="author">
<name name-style="western"><surname>Liang</surname><given-names>Jinhuo</given-names></name></contrib>
<contrib id="author-2" contrib-type="author" corresp="yes">
<name name-style="western"><surname>Shen</surname><given-names>Jinan</given-names></name><email>shenjinan@hbmzu.edu.cn</email></contrib>
<contrib id="author-3" contrib-type="author">
<name name-style="western"><surname>Wang</surname><given-names>Pengfei</given-names></name></contrib>
<contrib id="author-4" contrib-type="author">
<name name-style="western"><surname>Liang</surname><given-names>Fang</given-names></name></contrib>
<contrib id="author-5" contrib-type="author">
<name name-style="western"><surname>Deng</surname><given-names>Xuejian</given-names></name></contrib>
<aff id="aff-1">
<institution>College of Intelligent Systems Science and Engineering, Hubei Minzu University</institution>, <addr-line>Enshi, 445000</addr-line>, <country>China</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>&#x002A;</label>Corresponding Author: Jinan Shen. Email: <email>shenjinan@hbmzu.edu.cn</email></corresp>
</author-notes>
<pub-date date-type="collection" publication-format="electronic">
<year>2026</year>
</pub-date>
<pub-date date-type="pub" publication-format="electronic">
<day>10</day><month>2</month><year>2026</year>
</pub-date>
<volume>87</volume>
<issue>1</issue>
<elocation-id>76</elocation-id>
<history>
<date date-type="received">
<day>10</day>
<month>09</month>
<year>2025</year>
</date>
<date date-type="accepted">
<day>10</day>
<month>12</month>
<year>2025</year>
</date>
</history>
<permissions>
<copyright-statement>&#x00A9; 2026 The Authors.</copyright-statement>
<copyright-year>2026</copyright-year>
<copyright-holder>Published by Tech Science Press.</copyright-holder>
<license xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>This work is licensed under a <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</ext-link>, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<self-uri content-type="pdf" xlink:href="TSP_CMC_73076.pdf"></self-uri>
<abstract>
<p>The method for malware detection based on Application Programming Interface (API) call sequences, as a primary research focus within dynamic detection technologies, currently lacks attention to subsequences of API calls, the variety of API call types, and the length of sequences. This oversight leads to overly complex call sequences. To address this issue, a dynamic malware detection approach based on multiple subsequences is proposed. Initially, APIs are remapped and encoded, with the introduction of percentile lengths to process sequences. Subsequently, a combination of One-Dimensional Convolutional Neural Network (1D-CNN) and Bidirectional Long Short-Term Memory (Bi-LSTM) networks, along with an attention mechanism, is employed to extract features from subsequences of varying lengths for feature fusion and classification. Experiments conducted on two widely used public API-based datasets, namely MalBehavD-V1 and Alibaba Cloud, demonstrate that the proposed method reduces the number of API call types by approximately 20% compared to representative deep learning&#x2013;based API sequence detection methods, while achieving a peak accuracy of 98.70%. Additionally, experimental results indicate that sequence length at the 95th percentile represents the optimal solution that balances classification performance and computational efficiency.</p>
</abstract>
<kwd-group kwd-group-type="author">
<kwd>Malware detection</kwd>
<kwd>API call types</kwd>
<kwd>percentile</kwd>
<kwd>deep learning</kwd>
</kwd-group>
<funding-group>
<award-group id="awg1">
<funding-source>National Natural Science Foundation of China</funding-source>
<award-id>62262020</award-id>
</award-group>
<award-group id="awg2">
<funding-source>Graduate Education Innovation Project of Hubei Minzu University</funding-source>
<award-id>MYK2024025</award-id>
</award-group>
</funding-group>
</article-meta>
</front>
<body>
<sec id="s1">
<label>1</label>
<title>Introduction</title>
<p>With the escalating severity of network attacks, the volume of malicious software has grown exponentially. According to a report by AV-TEST [<xref ref-type="bibr" rid="ref-1">1</xref>], there are currently over 1.4 billion malware programs and potentially unwanted applications (PUAs) worldwide, a figure that continues to rise rapidly, posing a significant challenge to cybersecurity.</p>
<p>Traditional malware detection methodologies predominantly rely on signature matching and static analysis [<xref ref-type="bibr" rid="ref-2">2</xref>]. While these approaches are effective against known malware, they fall short in providing robust protection against novel or unknown malware variants. Contemporary malware not only employs traditional attack vectors but also emphasizes stealth and evasion techniques. Attackers utilize obfuscation, encryption, and anti-sandboxing strategies [<xref ref-type="bibr" rid="ref-3">3</xref>&#x2013;<xref ref-type="bibr" rid="ref-5">5</xref>], rendering malware increasingly difficult for detection systems to identify. Consequently, behavior-based dynamic detection methods have emerged as a response to these evolving threats.</p>
<p>Behavior-based dynamic detection methods assess whether a program is malicious by observing its behavioral characteristics at runtime. For instance, malware often accesses or modifies sensitive files, sends a large number of network requests [<xref ref-type="bibr" rid="ref-6">6</xref>,<xref ref-type="bibr" rid="ref-7">7</xref>], or performs operations using privileged access. These behaviors are typically reflected in the program&#x2019;s Application Programming Interface (API) calls. Existing malware detection methods based on API call sequences usually model and classify these sequences as linear text, treating each API as a &#x201C;word&#x201D; or &#x201C;token&#x201D; and representing the entire calling process as a &#x201C;sentence.&#x201D; This approach leverages sequence modeling techniques from natural language processing for feature extraction and classification. Such methods can capture the sequential relationships and local dependencies between APIs to some extent, offering good expressive power and classification performance. However, they often overlook the structural features and behavioral logic inherent in subsequences of varying lengths, thus failing to fully reflect the overall behavioral patterns of malware. Furthermore, as programs grow more complex, the number of API types may increase sharply, leading to excessively long call sequences that result in sparse features, increased computational cost, and reduced modeling capability. Additionally, such methods generally struggle to handle complex control-flow characteristics such as multi-branch structures in system calls, loop execution paths, and concurrent behaviors. Consequently, their detection capability may be significantly limited when dealing with obfuscated samples that exhibit complex behavioral paths or rely heavily on contextual semantics.</p>
<p>To address the aforementioned issues, this paper proposes a dynamic detection methodology based on subsequences of varying lengths within API call sequences. By mining critical subsequence features from the sequences, the method effectively determines whether a software sample is malicious. Specifically, the API call sequence serves as the original feature. Initially, it undergoes remapping encoding, with sequence lengths determined using percentiles. Subsequently, subsequence features of different lengths are extracted and fused to enhance the accuracy of malicious software detection.</p>
<p>The primary contributions of this paper are as follows.
<list list-type="bullet">
<list-item>
<p>A dynamic malware detection model is proposed that represents software behavior using multiple key API subsequences of varying lengths, enabling multi-scale behavioral pattern learning beyond conventional single-sequence representations.</p></list-item>
<list-item>
<p>An API remapping encoding algorithm based on API suffixes has been proposed, which reduces the variety of API calls and allows the model to focus on specific behaviors of APIs.</p></list-item>
<list-item>
<p>The concept of percentiles from statistics has been introduced into the field of malicious software detection, enabling a quantitative analysis of the optimal length for sequences of API calls, thereby avoiding simplistic truncation operations aimed at unifying their lengths.</p></list-item>
<list-item>
<p>To evaluate the effectiveness of the method presented in this paper, tests were conducted on various public datasets consisting of different sequences of API calls. Comparative experiments were performed using sequences with different percentile lengths, and ablation experiments were carried out to verify the effectiveness of the proposed API remapping encoding algorithm. Experimental results demonstrate that the method presented in this paper performs well across all utilized datasets, outperforming existing methods that use the same datasets.</p></list-item>
</list></p>
<p>The remainder of this paper is structured as follows: <xref ref-type="sec" rid="s2">Section 2</xref> delves into the background of related work and prior research. <xref ref-type="sec" rid="s3">Section 3</xref> introduces the proposed method for malicious software detection. <xref ref-type="sec" rid="s4">Section 4</xref> presents the experimental setup, results, and comparative analysis. <xref ref-type="sec" rid="s5">Section 5</xref> discusses the limitations of the study and future research directions. Finally, <xref ref-type="sec" rid="s6">Section 6</xref> concludes the paper.</p>
</sec>
<sec id="s2">
<label>2</label>
<title>Related Work</title>
<p>In recent years, the field of malicious software detection has witnessed a series of studies that integrate API call sequence characteristics with deep learning techniques. These approaches typically employ deep neural networks to model dynamic behavior sequences, demonstrating significant superiority over traditional detection methods such as signature matching and static analysis in terms of feature extraction, behavior pattern recognition, and classification accuracy.</p>
<p>de Oliveira and Sassi [<xref ref-type="bibr" rid="ref-8">8</xref>] proposed a malware detection approach based on Deep Graph Convolutional Neural Networks (DGCNNs) and introduced a dataset of 42,797 malware and 1079 benign software API call sequences (API-Call-Sequences). This approach converts API call sequences into malware behavior graphs, which are then input into a classifier. The study found that Long Short-Term Memory (LSTM) networks outperform the DGCNN model in classifying imbalanced datasets. However, the method utilizes only the first 100 API calls of each sequence, failing to comprehensively analyze the calling patterns of the full sequences, and it identifies 307 distinct API calls, so an excessive number of API call types is input into the model.</p>
<p>Agrawal et al. [<xref ref-type="bibr" rid="ref-9">9</xref>] proposed a classification method based on Long Short-Term Memory (LSTM) networks that requires the additional input of system API call parameters. Kang et al. [<xref ref-type="bibr" rid="ref-10">10</xref>] introduced an LSTM classification model that employs word2vec to vectorize and encode APIs. Catak et al. [<xref ref-type="bibr" rid="ref-11">11</xref>] developed a malware detection approach utilizing embedding layers and LSTM. These methods transform the classification of API call sequences into a text classification task, in which forward analysis of the sequences leverages LSTM&#x2019;s capability to capture temporal relationships and sequential characteristics among APIs. However, such approaches focus solely on the forward invocation features of APIs, neglecting backward invocation characteristics that may encapsulate crucial backward control-flow information, which is vital for identifying more sophisticated malicious behaviors. Moreover, these methods are primarily designed for short API call sequences; LSTM may suffer from vanishing or exploding gradients on longer sequences, degrading its processing capacity and accuracy. Consequently, these methods may be constrained by sequence length in practice, rendering them less effective on long sequences or complex malware behaviors.</p>
<p>Xiaofeng et al. [<xref ref-type="bibr" rid="ref-12">12</xref>] proposed a hybrid detection architecture named ASSCA, which employs a bidirectional residual LSTM network and random forest to process API sequences and their statistical features, respectively. Li et al. [<xref ref-type="bibr" rid="ref-13">13</xref>] proposed a malicious software detection framework based on Convolutional Neural Network (CNN) and Bi-LSTM, which captures and integrates intrinsic features of API sequences, including software behavior, API semantic information, and the relationships between APIs, to comprehensively assess the maliciousness of samples. However, to achieve such feature fusion, in addition to the original API call sequence, this method requires the introduction of &#x201C;level&#x201D; information indicating the degree of impact each API has on the computer system, serving to assist the model in distinguishing the importance of different APIs. This level information often relies on manual annotation or prior evaluation based on domain knowledge, incurring additional computational and annotation overhead in practice and thereby reducing the general applicability and scalability of the method. Iqbal et al. [<xref ref-type="bibr" rid="ref-14">14</xref>] proposed a two-stage ransomware detection framework based on signatures and API calls, significantly reducing the dimensionality of API features through feature selection, thereby validating the feasibility of low-dimensional features in maintaining detection accuracy.</p>
<p>The aforementioned methods employ API call sequences as dynamic features, utilizing embedding techniques to transform APIs into numerical vectors, followed by detection and classification through deep learning. However, their application is constrained by the number of API call types and the length of sequences, and some methods require additional analysis of APIs, diminishing the general applicability of the models. To address these issues, this paper proposes a detection method that relies solely on the original API call sequences, employs remapping encoding and percentile lengths to reduce the number of API call types and sequence lengths, and combines 1D-CNN and Bi-LSTM to extract features from subsequences of varying lengths.</p>
</sec>
<sec id="s3">
<label>3</label>
<title>Methodology</title>
<p>The proposed method in this paper consists of two primary stages: the API remapping and encoding phase and the classification phase. During the API remapping and encoding phase, APIs are initially subjected to remapping and encoding representation, ensuring uniform encoding and truncation for APIs with different character encodings and varying sequence lengths. In the classification phase, the standardized input is separately fed into a 1D-CNN and a Bi-LSTM network to extract and integrate features from subsequences of varying lengths. The final output is then passed through a fully connected layer to yield the classification results, as illustrated in <xref ref-type="fig" rid="fig-1">Fig. 1</xref>.</p>
<fig id="fig-1">
<label>Figure 1</label>
<caption>
<title>Overall framework</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_73076-fig-1.tif"/>
</fig>
<sec id="s3_1">
<label>3.1</label>
<title>API Remapping and Encoding Phase</title>
<p>In the Windows API, due to the variance in character set encoding methods, the preprocessor appends suffixes such as &#x201C;A&#x201D;, &#x201C;W&#x201D;, &#x201C;Ex&#x201D;, &#x201C;ExA&#x201D;, and &#x201C;ExW&#x201D; to the end of the general prototype during API invocation, based on the specific encoding type, to denote different versions (see <xref ref-type="table" rid="table-1">Table 1</xref>). However, during behavioral analysis of API sequences, these suffixes cause APIs with identical functionalities to be treated as distinct features, resulting in varying API sequences for the same behavior across different environments, thereby diminishing the model&#x2019;s generalization capability. To enable the model to focus on the core functionality of APIs, this paper maps APIs of different encoding methods to their general prototypes, thereby eliminating the influence of API suffixes, as illustrated in <xref ref-type="fig" rid="fig-2">Fig. 2</xref>: <inline-formula id="ieqn-1"><mml:math id="mml-ieqn-1"><mml:mi>f</mml:mi><mml:mspace width="negativethinmathspace" /><mml:mspace width="negativethinmathspace" /><mml:mo>:</mml:mo><mml:mi>A</mml:mi><mml:mo stretchy="false">&#x2192;</mml:mo><mml:mi>B</mml:mi></mml:math></inline-formula>, where <italic>A</italic> represents the set of APIs, <italic>B</italic> denotes the set of API prototypes, and <italic>f</italic> maps each API <italic>x</italic> in <italic>A</italic> to its invocation prototype in <italic>B</italic>.</p>
<table-wrap id="table-1">
<label>Table 1</label>
<caption>
<title>Different suffixes of the Windows API</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/> </colgroup>
<thead>
<tr>
<th>Suffixes</th>
<th>Explanation</th>
</tr>
</thead>
<tbody>
<tr>
<td>-A</td>
<td>ANSI character encoding (single-byte), employed for compatibility with legacy systems or non-Unicode environments.</td>
</tr>
<tr>
<td>-W</td>
<td>Wide Character Encoding (UTF-16) facilitates global multilingual support.</td>
</tr>
<tr>
<td>-Ex</td>
<td>Extended versions typically incorporate additional parameters or options.</td>
</tr>
<tr>
<td>-ExA</td>
<td>Extended version of ANSI character encoding.</td>
</tr>
<tr>
<td>-ExW</td>
<td>Extended version of Unicode character encoding.</td>
</tr>
</tbody>
</table>
</table-wrap><fig id="fig-2">
<label>Figure 2</label>
<caption>
<title>The mapping rules of the API</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_73076-fig-2.tif"/>
</fig>
<p>Based on the mapping rules defined above, this paper designs an API remapping algorithm. Algorithm 1 outlines the specific processing procedure: initially, each API is evaluated to determine if it contains a valid suffix identifier through regular expression matching. Upon successful matching, the WordSegment tokenization tool is employed to segment the API, followed by a verification of whether the final word belongs to a predefined suffix set. If the final word is identified as a member of this set, it is regarded as an appended suffix and subsequently removed, retaining only the prototype name of the API. Conversely, if the final word does not belong to the set, the original API is preserved without alteration. Ultimately, the algorithm yields an API call sequence that has undergone remapping, serving as a standardized input for subsequent feature extraction.</p>
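<p>The remapping step can be sketched in Python as follows. This is a minimal illustration rather than the paper&#x2019;s exact Algorithm 1: the suffix set is taken from Table 1, but the WordSegment tokenization check is replaced here with a simple CamelCase heuristic.</p>

```python
# Minimal sketch of API suffix remapping (not the paper's exact Algorithm 1):
# strip a trailing encoding/extension suffix so that variants such as
# CreateFileA / CreateFileW / CreateFileExW all map to one prototype.
SUFFIXES = ("ExA", "ExW", "Ex", "A", "W")  # suffix set from Table 1

def remap_api(api: str) -> str:
    for suf in SUFFIXES:  # longest suffixes first, so "ExA" wins over "A"
        if api.endswith(suf) and len(api) > len(suf):
            stem = api[:-len(suf)]
            # Heuristic stand-in for the WordSegment check: only strip when
            # the character before the suffix is lowercase, i.e. the suffix
            # starts a new CamelCase word rather than ending an existing one.
            if stem[-1].islower():
                return stem
    return api

def remap_sequence(apis):
    # Remap a whole API call sequence to prototype names.
    return [remap_api(a) for a in apis]
```

<p>For example, the call sequence CreateFileW &#x2192; WriteFile &#x2192; RegOpenKeyExA would be remapped to CreateFile &#x2192; WriteFile &#x2192; RegOpenKey.</p>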
<fig id="fig-9">
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_73076-fig-9.tif"/>
</fig>
<p>The API call sequence, post-remapping, is represented using the one-hot encoding method. Specifically, one-hot encoding transforms each distinct API into a sparse vector, where only one dimension is set to 1, with all other dimensions being 0. This encoding approach underscores the semantic independence among various APIs, introducing no implicit prior relationships and thus preventing potential interference caused by numerically similar vector values.</p>
<p>Moreover, one-hot encoding boasts the advantages of simplicity in implementation and the absence of reliance on external prior knowledge, providing a clear and unambiguous input representation for subsequent classification stages. Although this encoding method may result in a higher dimensionality, the API remapping in this approach has significantly reduced the variety of API calls, effectively mitigating the issue of excessive dimensionality.</p>
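<p>The one-hot representation described above can be sketched in pure Python; a hypothetical minimal version in which the vocabulary is simply built from the APIs observed in the training sequences.</p>

```python
def build_vocab(sequences):
    # Assign each distinct API a stable index (first-seen order).
    vocab = {}
    for seq in sequences:
        for api in seq:
            if api not in vocab:
                vocab[api] = len(vocab)
    return vocab

def one_hot_encode(seq, vocab):
    # Each API becomes a sparse vector: a single dimension set to 1,
    # all other dimensions 0, so no implicit similarity is introduced.
    dim = len(vocab)
    vectors = []
    for api in seq:
        v = [0] * dim
        v[vocab[api]] = 1
        vectors.append(v)
    return vectors
```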
</sec>
<sec id="s3_2">
<label>3.2</label>
<title>Classification Phase</title>
<p>The classification phase employs a 1D-CNN and a Bi-LSTM network for feature extraction. The 1D-CNN, by nature, excels in processing sequential data, adept at capturing local continuous patterns, thereby effectively extracting short-range dependency features from sequences. API call sequences, as direct representations of program dynamic behavior, often contain several key operational segments that reveal specific functional intents. These segments are typically composed of a series of consecutive APIs, forming semantically related subsequences. For instance, <inline-formula id="ieqn-8"><mml:math id="mml-ieqn-8"><mml:mi>S</mml:mi><mml:mi>o</mml:mi><mml:mi>c</mml:mi><mml:mi>k</mml:mi><mml:mi>e</mml:mi><mml:mi>t</mml:mi><mml:mo stretchy="false">&#x2192;</mml:mo><mml:mi>C</mml:mi><mml:mi>o</mml:mi><mml:mi>n</mml:mi><mml:mi>n</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:mi>t</mml:mi></mml:math></inline-formula> signifies a network connection, <inline-formula id="ieqn-9"><mml:math id="mml-ieqn-9"><mml:mi>V</mml:mi><mml:mi>i</mml:mi><mml:mi>r</mml:mi><mml:mi>t</mml:mi><mml:mi>u</mml:mi><mml:mi>a</mml:mi><mml:mi>l</mml:mi><mml:mi>A</mml:mi><mml:mi>l</mml:mi><mml:mi>l</mml:mi><mml:mi>o</mml:mi><mml:mi>c</mml:mi><mml:mo stretchy="false">&#x2192;</mml:mo><mml:mi>m</mml:mi><mml:mi>e</mml:mi><mml:mi>m</mml:mi><mml:mi>c</mml:mi><mml:mi>p</mml:mi><mml:mi>y</mml:mi><mml:mo stretchy="false">&#x2192;</mml:mo><mml:mi>V</mml:mi><mml:mi>i</mml:mi><mml:mi>r</mml:mi><mml:mi>t</mml:mi><mml:mi>u</mml:mi><mml:mi>a</mml:mi><mml:mi>l</mml:mi><mml:mi>F</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:mi>e</mml:mi></mml:math></inline-formula> describes the process of memory allocation, writing, and release, while <inline-formula id="ieqn-10"><mml:math id="mml-ieqn-10"><mml:mi>C</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:mi>a</mml:mi><mml:mi>t</mml:mi><mml:mi>e</mml:mi><mml:mi>F</mml:mi><mml:mi>i</mml:mi><mml:mi>l</mml:mi><mml:mi>e</mml:mi><mml:mo 
stretchy="false">&#x2192;</mml:mo><mml:mi>W</mml:mi><mml:mi>r</mml:mi><mml:mi>i</mml:mi><mml:mi>t</mml:mi><mml:mi>e</mml:mi><mml:mi>F</mml:mi><mml:mi>i</mml:mi><mml:mi>l</mml:mi><mml:mi>e</mml:mi><mml:mo stretchy="false">&#x2192;</mml:mo><mml:mi>R</mml:mi><mml:mi>e</mml:mi><mml:mi>a</mml:mi><mml:mi>d</mml:mi><mml:mi>F</mml:mi><mml:mi>i</mml:mi><mml:mi>l</mml:mi><mml:mi>e</mml:mi><mml:mo stretchy="false">&#x2192;</mml:mo><mml:mi>C</mml:mi><mml:mi>l</mml:mi><mml:mi>o</mml:mi><mml:mi>s</mml:mi><mml:mi>e</mml:mi><mml:mi>H</mml:mi><mml:mi>a</mml:mi><mml:mi>n</mml:mi><mml:mi>d</mml:mi><mml:mi>l</mml:mi><mml:mi>e</mml:mi></mml:math></inline-formula> corresponds to file read-write operations. These API segments with specific semantics serve as features to distinguish different behavioral patterns, referred to as key subsequences.</p>
<p>To extract key subsequences from the API sequence, a sliding window of size <inline-formula id="ieqn-11"><mml:math id="mml-ieqn-11"><mml:mi>l</mml:mi></mml:math></inline-formula> (where <inline-formula id="ieqn-12"><mml:math id="mml-ieqn-12"><mml:mi>l</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mo stretchy="false">[</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mi>n</mml:mi><mml:mo stretchy="false">]</mml:mo></mml:math></inline-formula>) is introduced, and a convolution operation is performed on the API call sequence. This sliding window moves sequentially along the API call sequence with a step size of 1, from the start bit to the stop bit, ultimately generating a feature map of length <inline-formula id="ieqn-13"><mml:math id="mml-ieqn-13"><mml:mi>n</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mi>l</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn></mml:math></inline-formula>.</p>
<p>Due to variations in the sliding window size, an API sequence yields <inline-formula id="ieqn-14"><mml:math id="mml-ieqn-14"><mml:mfrac><mml:mn>1</mml:mn><mml:mn>2</mml:mn></mml:mfrac><mml:mi>n</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>n</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">)</mml:mo></mml:math></inline-formula> subsequences, as calculated by <xref ref-type="disp-formula" rid="eqn-1">Eq. (1)</xref>.
<disp-formula id="eqn-1"><label>(1)</label><mml:math id="mml-eqn-1" display="block"><mml:mi>n</mml:mi><mml:mo>+</mml:mo><mml:mo>.</mml:mo><mml:mo>.</mml:mo><mml:mo>.</mml:mo><mml:mo>+</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>n</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mi>l</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:mo>.</mml:mo><mml:mo>.</mml:mo><mml:mo>.</mml:mo><mml:mo>+</mml:mo><mml:mn>1</mml:mn><mml:mo>=</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mn>2</mml:mn></mml:mfrac><mml:mi>n</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>n</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula></p>
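<p>Eq. (1) can be checked by brute-force enumeration: sliding a window of every size <italic>l</italic> from 1 to <italic>n</italic> over a length-<italic>n</italic> sequence yields <italic>n</italic> &#x2212; <italic>l</italic> + 1 windows per size, and these counts sum to <italic>n</italic>(<italic>n</italic> + 1)/2. A small illustrative check:</p>

```python
def count_subsequences(n):
    # Enumerate every window size l in [1, n]; each size contributes
    # n - l + 1 contiguous windows (the feature-map length for that size).
    return sum(n - l + 1 for l in range(1, n + 1))

# Closed form from Eq. (1): n(n + 1) / 2 holds for every n checked.
assert all(count_subsequences(n) == n * (n + 1) // 2 for n in range(1, 50))
```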
<p>To reduce computational load and enhance computational efficiency, the most commonly encountered subsequence lengths of 2, 3, and 4 were selected as sliding window sizes to extract subsequence features of corresponding lengths, as illustrated in <xref ref-type="fig" rid="fig-3">Fig. 3</xref>.</p>
<fig id="fig-3">
<label>Figure 3</label>
<caption>
<title>1D-CNN network architecture with convolutional kernel sizes of 2, 3, and 4</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_73076-fig-3.tif"/>
</fig>
<p>Upon completion of the extraction of subsequence features of varying lengths, a Global Max Pooling operation was applied to each category of subsequence features. This operation effectively compresses the feature dimension, reducing redundant information while preserving the most representative features of each type of subsequence. By focusing on the most activated features within the subsequences, Global Max Pooling enhances the model&#x2019;s perception of key behavioral patterns, aiding in the prominence of discriminative local features within malicious behaviors. Consequently, this approach further elevates the overall classification performance and detection accuracy.</p>
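<p>The convolution and pooling steps can be illustrated with a toy scalar example. A real 1D-CNN applies many learned kernels over one-hot vectors, but the window arithmetic is the same: a kernel of size <italic>l</italic> slid with stride 1 produces a feature map of length <italic>n</italic> &#x2212; <italic>l</italic> + 1, and Global Max Pooling keeps only the strongest activation of each map.</p>

```python
def conv1d_valid(seq, kernel):
    # Slide the kernel over the sequence with stride 1 ("valid" padding):
    # the output feature map has length len(seq) - len(kernel) + 1.
    l = len(kernel)
    return [sum(k * x for k, x in zip(kernel, seq[i:i + l]))
            for i in range(len(seq) - l + 1)]

def global_max_pool(feature_map):
    # Keep only the most activated feature of the map.
    return max(feature_map)

# Toy scalar sequence; kernels of sizes 2, 3, and 4 as in the model.
seq = [1.0, 0.0, 2.0, 3.0, 1.0]
pooled = [global_max_pool(conv1d_valid(seq, [1.0] * l)) for l in (2, 3, 4)]
```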
<p>In an API sequence, a single API constitutes a subsequence of length 1. A unidirectional LSTM can only utilize past API call information, whereas a Bi-LSTM processes the sequence simultaneously from both forward and backward directions, enabling a more comprehensive understanding of the dependencies among APIs, which facilitates more precise classification of API sequences. <xref ref-type="fig" rid="fig-4">Fig. 4</xref> illustrates the architecture of a Bi-LSTM unfolded along the time steps, where <inline-formula id="ieqn-15"><mml:math id="mml-ieqn-15"><mml:mi>x</mml:mi></mml:math></inline-formula> denotes the input, <inline-formula id="ieqn-16"><mml:math id="mml-ieqn-16"><mml:mi>y</mml:mi></mml:math></inline-formula> the output, <inline-formula id="ieqn-17"><mml:math id="mml-ieqn-17"><mml:mi>h</mml:mi></mml:math></inline-formula> the forward hidden state, and <inline-formula id="ieqn-18"><mml:math id="mml-ieqn-18"><mml:msup><mml:mi>h</mml:mi><mml:mrow><mml:mi mathvariant="normal">&#x2032;</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> the backward hidden state.</p>
<fig id="fig-4">
<label>Figure 4</label>
<caption>
<title>Bi-LSTM network architecture</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_73076-fig-4.tif"/>
</fig>
<p>A Bi-LSTM is constructed from LSTM units and addresses this unidirectional limitation by combining two independent LSTM layers: the forward and backward layers do not share parameters and are trained independently. The one-hot-encoded API sequences are input into the Bi-LSTM module, and its forward propagation, backward propagation, and final output are calculated using <xref ref-type="disp-formula" rid="eqn-2">Eqs. (2)</xref>&#x2013;<xref ref-type="disp-formula" rid="eqn-4">(4)</xref>.
<disp-formula id="eqn-2"><label>(2)</label><mml:math id="mml-eqn-2" display="block"><mml:mtable columnalign="right left right left right left right left right left right left" rowspacing="3pt" columnspacing="0em 2em 0em 2em 0em 2em 0em 2em 0em 2em 0em" displaystyle="true"><mml:mtr><mml:mtd /><mml:mtd><mml:mrow><mml:msub><mml:mi>h</mml:mi><mml:mi>t</mml:mi></mml:msub></mml:mrow><mml:mo>=</mml:mo><mml:mover><mml:mrow><mml:mi>L</mml:mi><mml:mi>S</mml:mi><mml:mi>T</mml:mi><mml:mi>M</mml:mi></mml:mrow><mml:mo>&#x2192;</mml:mo></mml:mover><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>t</mml:mi></mml:msub></mml:mrow><mml:mo>,</mml:mo><mml:mrow><mml:msub><mml:mi>h</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="eqn-3"><label>(3)</label><mml:math id="mml-eqn-3" display="block"><mml:mtable columnalign="right left right left right left right left right left right left" rowspacing="3pt" columnspacing="0em 2em 0em 2em 0em 2em 0em 2em 0em 2em 0em" displaystyle="true"><mml:mtr><mml:mtd /><mml:mtd><mml:msup><mml:mrow><mml:msub><mml:mi>h</mml:mi><mml:mi>t</mml:mi></mml:msub></mml:mrow><mml:mrow><mml:mi mathvariant="normal">&#x2032;</mml:mi></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:mover><mml:mrow><mml:mi>L</mml:mi><mml:mi>S</mml:mi><mml:mi>T</mml:mi><mml:mi>M</mml:mi></mml:mrow><mml:mo>&#x2190;</mml:mo></mml:mover><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>t</mml:mi></mml:msub></mml:mrow><mml:mo>,</mml:mo><mml:msup><mml:mrow><mml:msub><mml:mi>h</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mi mathvariant="normal">&#x2032;</mml:mi></mml:mrow></mml:msup><mml:mo stretchy="false">)</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="eqn-4"><label>(4)</label><mml:math id="mml-eqn-4" display="block"><mml:mtable columnalign="right left right left right left right left right left right left" rowspacing="3pt" columnspacing="0em 2em 0em 2em 0em 2em 0em 2em 0em 2em 0em" displaystyle="true"><mml:mtr><mml:mtd /><mml:mtd><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mi>t</mml:mi></mml:msub></mml:mrow><mml:mo>=</mml:mo><mml:mi>c</mml:mi><mml:mi>o</mml:mi><mml:mi>n</mml:mi><mml:mi>c</mml:mi><mml:mi>a</mml:mi><mml:mi>t</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mi>h</mml:mi><mml:mi>t</mml:mi></mml:msub></mml:mrow><mml:mo>,</mml:mo><mml:msup><mml:mrow><mml:msub><mml:mi>h</mml:mi><mml:mi>t</mml:mi></mml:msub></mml:mrow><mml:mrow><mml:mi mathvariant="normal">&#x2032;</mml:mi></mml:mrow></mml:msup><mml:mo stretchy="false">)</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula></p>
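<p>As an illustration only, Eqs. (2)&#x2013;(4) can be sketched in NumPy with a minimal LSTM cell; the gate packing, zero initialization, and parameter shapes here are simplifications for exposition, not the paper's Keras implementation:</p>

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """One LSTM step; gates packed as [input, forget, candidate, output]."""
    z = x @ W + h @ U + b
    H = h.shape[-1]
    i, f = sigmoid(z[:H]), sigmoid(z[H:2 * H])
    g, o = np.tanh(z[2 * H:3 * H]), sigmoid(z[3 * H:])
    c = f * c + i * g
    return o * np.tanh(c), c

def bilstm(seq, fwd, bwd, hidden):
    """Run independent forward/backward LSTMs, then concat states per step."""
    def run(params, steps):
        h, c = np.zeros(hidden), np.zeros(hidden)
        out = {}
        for t in steps:
            h, c = lstm_step(seq[t], h, c, *params)
            out[t] = h
        return out
    hf = run(fwd, range(len(seq)))             # Eq. (2): forward pass
    hb = run(bwd, reversed(range(len(seq))))   # Eq. (3): backward pass
    # Eq. (4): y_t = concat(h_t, h'_t), one output per time step
    return np.stack([np.concatenate([hf[t], hb[t]]) for t in range(len(seq))])
```

Passing two separate parameter tuples mirrors the paper's statement that the forward and backward layers do not share weights.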
<p>To identify critical APIs within the sequence, an attention mechanism was incorporated into the Bi-LSTM network. The specific computational procedure can be delineated into the following three steps:</p>
<p>1) Employing the dot product of vectors as the attention value score, as expressed in <xref ref-type="disp-formula" rid="eqn-5">Eq. (5)</xref>, where <inline-formula id="ieqn-19"><mml:math id="mml-ieqn-19"><mml:mrow><mml:msub><mml:mi>s</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> denotes the attention value score, <inline-formula id="ieqn-20"><mml:math id="mml-ieqn-20"><mml:mi>q</mml:mi></mml:math></inline-formula> represents the query vector, and <inline-formula id="ieqn-21"><mml:math id="mml-ieqn-21"><mml:mrow><mml:msub><mml:mi>k</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> signifies the key vector.
<disp-formula id="eqn-5"><label>(5)</label><mml:math id="mml-eqn-5" display="block"><mml:mtable columnalign="right left right left right left right left right left right left" rowspacing="3pt" columnspacing="0em 2em 0em 2em 0em 2em 0em 2em 0em 2em 0em" displaystyle="true"><mml:mtr><mml:mtd /><mml:mtd><mml:mrow><mml:msub><mml:mi>s</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>=</mml:mo><mml:mi>q</mml:mi><mml:mo>&#x22C5;</mml:mo><mml:mrow><mml:msub><mml:mi>k</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula></p>
<p>2) Normalizing the attention value scores using the Softmax function, as illustrated in <xref ref-type="disp-formula" rid="eqn-6">Eqs. (6)</xref> and <xref ref-type="disp-formula" rid="eqn-7">(7)</xref>, where <inline-formula id="ieqn-22"><mml:math id="mml-ieqn-22"><mml:mrow><mml:msub><mml:mi>a</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> corresponds to the weighting coefficient of the associated value vector, and <inline-formula id="ieqn-23"><mml:math id="mml-ieqn-23"><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>k</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> indicates the dimensionality of the key vector.
<disp-formula id="eqn-6"><label>(6)</label><mml:math id="mml-eqn-6" display="block"><mml:mtable columnalign="right left right left right left right left right left right left" rowspacing="3pt" columnspacing="0em 2em 0em 2em 0em 2em 0em 2em 0em 2em 0em" displaystyle="true"><mml:mtr><mml:mtd /><mml:mtd><mml:mi>S</mml:mi><mml:mi>o</mml:mi><mml:mi>f</mml:mi><mml:mi>t</mml:mi><mml:mi>m</mml:mi><mml:mi>a</mml:mi><mml:mi>x</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>x</mml:mi><mml:mo stretchy="false">)</mml:mo><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mrow><mml:msup><mml:mi>e</mml:mi><mml:mrow><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:mrow></mml:msup></mml:mrow></mml:mrow><mml:mrow><mml:munderover><mml:mo movablelimits="false">&#x2211;</mml:mo><mml:mrow><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>n</mml:mi></mml:munderover><mml:mrow><mml:mrow><mml:msup><mml:mi>e</mml:mi><mml:mrow><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:mrow></mml:msup></mml:mrow></mml:mrow></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="eqn-7"><label>(7)</label><mml:math id="mml-eqn-7" display="block"><mml:mtable columnalign="right left right left right left right left right left right left" rowspacing="3pt" columnspacing="0em 2em 0em 2em 0em 2em 0em 2em 0em 2em 0em" displaystyle="true"><mml:mtr><mml:mtd /><mml:mtd><mml:mrow><mml:msub><mml:mi>a</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>=</mml:mo><mml:mi>S</mml:mi><mml:mi>o</mml:mi><mml:mi>f</mml:mi><mml:mi>t</mml:mi><mml:mi>m</mml:mi><mml:mi>a</mml:mi><mml:mi>x</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mfrac><mml:mrow><mml:mrow><mml:msub><mml:mi>s</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:mrow><mml:mrow><mml:msqrt><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>k</mml:mi></mml:msub></mml:mrow></mml:msqrt></mml:mrow></mml:mfrac><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula></p>
<p>3) Summing the weighted value vectors to obtain the attention vector, as shown in <xref ref-type="disp-formula" rid="eqn-8">Eq. (8)</xref>, where <inline-formula id="ieqn-24"><mml:math id="mml-ieqn-24"><mml:mi>a</mml:mi></mml:math></inline-formula> denotes the attention vector, <inline-formula id="ieqn-25"><mml:math id="mml-ieqn-25"><mml:mrow><mml:msub><mml:mi>v</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> represents the value vector, and <inline-formula id="ieqn-26"><mml:math id="mml-ieqn-26"><mml:mi>n</mml:mi></mml:math></inline-formula> is the sequence length.
<disp-formula id="eqn-8"><label>(8)</label><mml:math id="mml-eqn-8" display="block"><mml:mtable columnalign="right left right left right left right left right left right left" rowspacing="3pt" columnspacing="0em 2em 0em 2em 0em 2em 0em 2em 0em 2em 0em" displaystyle="true"><mml:mtr><mml:mtd /><mml:mtd><mml:mi>a</mml:mi><mml:mo>=</mml:mo><mml:munderover><mml:mo movablelimits="false">&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>n</mml:mi></mml:munderover><mml:mrow><mml:mrow><mml:msub><mml:mi>a</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:mrow><mml:mo>&#x2217;</mml:mo><mml:mrow><mml:msub><mml:mi>v</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula></p>
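<p>The three attention steps above can be sketched as follows (a minimal NumPy illustration assuming a single query vector and row-wise key/value matrices; not the paper's exact implementation):</p>

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())               # Eq. (6), shifted for numerical stability
    return e / e.sum()

def attention(q, K, V):
    """Scaled dot-product attention over key/value vectors, Eqs. (5)-(8)."""
    s = K @ q                             # Eq. (5): s_i = q . k_i
    a = softmax(s / np.sqrt(K.shape[1]))  # Eq. (7): scale by sqrt(d_k), then Softmax
    return a @ V                          # Eq. (8): weighted sum of value vectors
```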
<p>The fused features extracted by the 1D-CNN and Bi-LSTM network are input into a neural network composed of a fully connected layer, Rectified Linear Unit (ReLU) activation function, Dropout layer, and an output layer employing the Sigmoid activation function for final classification. The fully connected layer receives the fused feature vector <inline-formula id="ieqn-27"><mml:math id="mml-ieqn-27"><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> and processes it according to <xref ref-type="disp-formula" rid="eqn-9">Eqs. (9)</xref> and <xref ref-type="disp-formula" rid="eqn-10">(10)</xref>, where <italic>W</italic> denotes the weight matrix of the fully connected layer, <inline-formula id="ieqn-28"><mml:math id="mml-ieqn-28"><mml:mrow><mml:mi>b</mml:mi></mml:mrow></mml:math></inline-formula> represents the bias term, and <inline-formula id="ieqn-29"><mml:math id="mml-ieqn-29"><mml:mrow><mml:msub><mml:mi>h</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is the output vector of the fully connected layer. Moreover, to prevent overfitting, a Dropout layer with a dropout rate of 0.5 is appended after each fully connected layer. During each training iteration, this mechanism randomly sets a portion of neuron outputs to zero, thereby preventing the model from relying on specific neurons.
<disp-formula id="eqn-9"><label>(9)</label><mml:math id="mml-eqn-9" display="block"><mml:mtable columnalign="right left right left right left right left right left right left" rowspacing="3pt" columnspacing="0em 2em 0em 2em 0em 2em 0em 2em 0em 2em 0em" displaystyle="true"><mml:mtr><mml:mtd /><mml:mtd><mml:mi>R</mml:mi><mml:mi>e</mml:mi><mml:mi>L</mml:mi><mml:mi>U</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>x</mml:mi><mml:mo stretchy="false">)</mml:mo><mml:mo>=</mml:mo><mml:mi>m</mml:mi><mml:mi>a</mml:mi><mml:mi>x</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mi>x</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="eqn-10"><label>(10)</label><mml:math id="mml-eqn-10" display="block"><mml:mtable columnalign="right left right left right left right left right left right left" rowspacing="3pt" columnspacing="0em 2em 0em 2em 0em 2em 0em 2em 0em 2em 0em" displaystyle="true"><mml:mtr><mml:mtd /><mml:mtd><mml:mrow><mml:msub><mml:mi>h</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>=</mml:mo><mml:mi>R</mml:mi><mml:mi>e</mml:mi><mml:mi>L</mml:mi><mml:mi>U</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>&#x2217;</mml:mo><mml:mi>W</mml:mi><mml:mo>+</mml:mo><mml:mi>b</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula></p>
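<p>A minimal sketch of Eqs. (9) and (10) followed by inverted dropout (the rescale-by-1/(1&#x2212;rate) variant used by modern frameworks is an assumption here, since the paper does not specify the dropout variant):</p>

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)                    # Eq. (9)

def dense_relu_dropout(x, W, b, rate=0.5, training=True, seed=None):
    """Fully connected layer with ReLU (Eq. (10)), then inverted dropout."""
    h = relu(x @ W + b)
    if training:
        rng = np.random.default_rng(seed)
        keep = rng.random(h.shape) >= rate       # zero out ~rate of the units
        h = h * keep / (1.0 - rate)              # rescale so expectations match
    return h
```

At inference time (`training=False`) dropout is disabled, matching the standard train/test behavior.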
<p>After computation through the fully connected layers, the output layer computes the probability <inline-formula id="ieqn-30"><mml:math id="mml-ieqn-30"><mml:mrow><mml:mi>y</mml:mi></mml:mrow></mml:math></inline-formula> that a sample belongs to the positive class according to <xref ref-type="disp-formula" rid="eqn-11">Eqs. (11)</xref> and <xref ref-type="disp-formula" rid="eqn-12">(12)</xref>.
<disp-formula id="eqn-11"><label>(11)</label><mml:math id="mml-eqn-11" display="block"><mml:mtable columnalign="right left right left right left right left right left right left" rowspacing="3pt" columnspacing="0em 2em 0em 2em 0em 2em 0em 2em 0em 2em 0em" displaystyle="true"><mml:mtr><mml:mtd /><mml:mtd><mml:mi>S</mml:mi><mml:mi>i</mml:mi><mml:mi>g</mml:mi><mml:mi>m</mml:mi><mml:mi>o</mml:mi><mml:mi>i</mml:mi><mml:mi>d</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>x</mml:mi><mml:mo stretchy="false">)</mml:mo><mml:mo>=</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mrow><mml:mn>1</mml:mn><mml:mo>+</mml:mo><mml:mrow><mml:msup><mml:mi>e</mml:mi><mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mi>x</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="eqn-12"><label>(12)</label><mml:math id="mml-eqn-12" display="block"><mml:mtable columnalign="right left right left right left right left right left right left" rowspacing="3pt" columnspacing="0em 2em 0em 2em 0em 2em 0em 2em 0em 2em 0em" displaystyle="true"><mml:mtr><mml:mtd /><mml:mtd><mml:mi>y</mml:mi><mml:mo>=</mml:mo><mml:mi>S</mml:mi><mml:mi>i</mml:mi><mml:mi>g</mml:mi><mml:mi>m</mml:mi><mml:mi>o</mml:mi><mml:mi>i</mml:mi><mml:mi>d</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mi>h</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>&#x2217;</mml:mo><mml:mi>W</mml:mi><mml:mo>+</mml:mo><mml:mi>b</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula></p>
<p>Additionally, the model employs binary cross-entropy as the loss function, utilizing the Adam optimizer to refine the parameters of the network and updating the weight matrices through backpropagation.</p>
</sec>
<sec id="s3_3">
<label>3.3</label>
<title>Model Structure and Hyperparameter Setting</title>
<p><xref ref-type="fig" rid="fig-5">Fig. 5</xref> illustrates the specific structure of the model in this paper and the hyperparameter combinations, which were determined through grid search. For further details, please refer to <xref ref-type="sec" rid="s4_3">Section 4.3</xref>.</p>
<fig id="fig-5">
<label>Figure 5</label>
<caption>
<title>Specific structure of the model and hyperparameter configuration</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_73076-fig-5.tif"/>
</fig>
</sec>
</sec>
<sec id="s4">
<label>4</label>
<title>Experiments and Results</title>
<p>This section elaborates on the dataset and hyperparameter selections employed in the experiments, introduces the evaluation metrics utilized during the assessment, analyzes the experimental results, and concludes with comparative experiments, ablation studies, and an exploration of sequence lengths.</p>
<sec id="s4_1">
<label>4.1</label>
<title>Experimental Setup and Tools</title>
<p>The proposed model was implemented and tested on a computer running Windows 11 Professional (64-bit), equipped with an Intel(R) Core(TM) i5-12600KF processor (3.70 GHz), 32 GB of memory, an NVIDIA GeForce RTX 4060 Ti graphics card (16 GB VRAM), and a 2TB hard disk. The model was developed using Python 3.10.15, leveraging the TensorFlow 2.10.0 and Keras 2.10.0 frameworks. Additionally, it relied on libraries such as Scikit-learn, NumPy, Pandas, Matplotlib, Seaborn, and WordSegment. These libraries are open-source software, freely available via the Python Package Index (PyPI) platform.</p>
</sec>
<sec id="s4_2">
<label>4.2</label>
<title>Dataset</title>
<p>To validate the efficacy of the proposed method, this study employs two widely recognized publicly available API sequence datasets&#x2014;MalBehavD-V1 [<xref ref-type="bibr" rid="ref-15">15</xref>] and Alibaba Cloud [<xref ref-type="bibr" rid="ref-16">16</xref>]&#x2014;for model training and evaluation. <xref ref-type="table" rid="table-2">Table 2</xref> enumerates the datasets utilized.</p>
<table-wrap id="table-2">
<label>Table 2</label>
<caption>
<title>The two datasets utilized for the experiment</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/> </colgroup>
<thead>
<tr>
<th>Dataset</th>
<th>No. malware samples</th>
<th>No. benign samples</th>
<th>Total</th>
<th>Released in</th>
</tr>
</thead>
<tbody>
<tr>
<td>MalBehavD-V1</td>
<td>1285</td>
<td>1285</td>
<td>2570</td>
<td>2022</td>
</tr>
<tr>
<td>Alibaba Cloud</td>
<td>8909</td>
<td>4978</td>
<td>13,887</td>
<td>2018</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>MalBehavD-V1 represents a novel dynamic dataset, constructed through the application of dynamic malware analysis methodologies, specifically designed to extract API call sequences from both benign and malicious executable files (EXE files) within Windows operating systems. Each sample undergoes independent execution within an isolated environment powered by the Cuckoo sandbox, ensuring precise behavioral logging. The malicious samples are sourced from VirusTotal, while the benign samples are collected from the CNET website. The dataset encompasses a total of 2570 executable files, comprising 1285 benign samples and 1285 malicious samples.</p>
<p>Alibaba Cloud is a large-scale behavioral dataset released by Alibaba Cloud for dynamic malware analysis and detection research, comprising approximately 90 million API call records. This dataset was collected by executing Windows executable files sourced from the internet within a simulated sandbox environment, capturing the API call sequences triggered during sample execution. All samples have undergone desensitization processing. The dataset contains a total of 13,887 executable files, including 4978 benign samples and 8909 malicious samples.</p>
</sec>
<sec id="s4_3">
<label>4.3</label>
<title>Hyperparameter Selection</title>
<p><xref ref-type="table" rid="table-3">Table 3</xref> presents the hyperparameter search space and its corresponding optimal configuration. A grid search strategy was employed to comprehensively explore all candidate combinations. Specifically, the training set was divided into three subsets, with evaluation conducted through cross-validation, where each subset was individually used for validation while the remaining subsets were used for training. For each hyperparameter configuration, the average validation accuracy across all subsets was calculated as the primary evaluation metric.</p>
<table-wrap id="table-3">
<label>Table 3</label>
<caption>
<title>Hyperparameter search space and optimal hyperparameter combination</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/> </colgroup>
<thead>
<tr>
<th>Hyperparameter</th>
<th>Search space</th>
<th>Best value</th>
</tr>
</thead>
<tbody>
<tr>
<td>Number of filters in convolutional layers</td>
<td>128, 256, 512</td>
<td>256</td>
</tr>
<tr>
<td>Number of convolutional layers</td>
<td>2, 3, 4, 5</td>
<td>3</td>
</tr>
<tr>
<td>Units of Bi-LSTM layer</td>
<td>128, 256, 512</td>
<td>256</td>
</tr>
<tr>
<td>Units of the first dense layers</td>
<td>128, 256, 512</td>
<td>256</td>
</tr>
<tr>
<td>Number of dense layers</td>
<td>1, 2, 3, 4</td>
<td>3</td>
</tr>
<tr>
<td>Activation function of dense layers</td>
<td>relu, sigmoid, tanh</td>
<td>relu</td>
</tr>
<tr>
<td>Rate of dropout layers</td>
<td>0.2, 0.5, 0.7</td>
<td>0.5</td>
</tr>
<tr>
<td>Learning rate</td>
<td>1e&#x2013;3, 1e&#x2013;4</td>
<td>1e&#x2013;3</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>The final hyperparameter settings were selected based on the configuration achieving the highest average validation accuracy, while also considering training stability and computational efficiency. Notably, although adopting larger hyperparameters (such as more convolutional filters or larger bidirectional LSTM units) may yield minor performance improvements, it significantly increases computational and memory overhead. Conversely, excessively small configurations are prone to convergence instability or degraded detection performance.</p>
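<p>The grid-search procedure described above can be sketched as follows; the candidate values mirror part of Table 3, and train_and_score is a hypothetical placeholder for training the actual model on one fold split and returning its validation accuracy:</p>

```python
from itertools import product
import numpy as np

# A subset of the Table 3 search space; extend with the remaining rows as needed.
SEARCH_SPACE = {
    "filters": [128, 256, 512],
    "conv_layers": [2, 3, 4, 5],
    "lstm_units": [128, 256, 512],
    "learning_rate": [1e-3, 1e-4],
}

def grid_search(X, y, train_and_score, k=3):
    """Exhaustive search; each configuration scored by k-fold cross-validation."""
    folds = np.array_split(np.arange(len(X)), k)
    best_cfg, best_acc = None, -1.0
    for combo in product(*SEARCH_SPACE.values()):
        cfg = dict(zip(SEARCH_SPACE, combo))
        accs = []
        for i in range(k):
            val = folds[i]                                   # one subset for validation
            trn = np.concatenate([folds[j] for j in range(k) if j != i])
            accs.append(train_and_score(cfg, X[trn], y[trn], X[val], y[val]))
        mean_acc = float(np.mean(accs))                      # primary evaluation metric
        if mean_acc > best_acc:
            best_cfg, best_acc = cfg, mean_acc
    return best_cfg, best_acc
```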
</sec>
<sec id="s4_4">
<label>4.4</label>
<title>Evaluation Metrics</title>
<p>To evaluate the performance of the model, this paper adopts Accuracy, Precision, Recall, and F1 score as evaluation metrics, with their calculation processes detailed in <xref ref-type="disp-formula" rid="eqn-13">Eqs. (13)</xref>&#x2013;<xref ref-type="disp-formula" rid="eqn-16">(16)</xref>.
<disp-formula id="eqn-13"><label>(13)</label><mml:math id="mml-eqn-13" display="block"><mml:mtable columnalign="right left right left right left right left right left right left" rowspacing="3pt" columnspacing="0em 2em 0em 2em 0em 2em 0em 2em 0em 2em 0em" displaystyle="true"><mml:mtr><mml:mtd /><mml:mtd><mml:mi>A</mml:mi><mml:mi>c</mml:mi><mml:mi>c</mml:mi><mml:mi>u</mml:mi><mml:mi>r</mml:mi><mml:mi>a</mml:mi><mml:mi>c</mml:mi><mml:mi>y</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi><mml:mo>+</mml:mo><mml:mi>T</mml:mi><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi><mml:mo>+</mml:mo><mml:mi>T</mml:mi><mml:mi>N</mml:mi><mml:mo>+</mml:mo><mml:mi>F</mml:mi><mml:mi>P</mml:mi><mml:mo>+</mml:mo><mml:mi>F</mml:mi><mml:mi>N</mml:mi></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="eqn-14"><label>(14)</label><mml:math id="mml-eqn-14" display="block"><mml:mtable columnalign="right left right left right left right left right left right left" rowspacing="3pt" columnspacing="0em 2em 0em 2em 0em 2em 0em 2em 0em 2em 0em" displaystyle="true"><mml:mtr><mml:mtd /><mml:mtd><mml:mi>P</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:mi>i</mml:mi><mml:mi>s</mml:mi><mml:mi>i</mml:mi><mml:mi>o</mml:mi><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi><mml:mo>+</mml:mo><mml:mi>F</mml:mi><mml:mi>P</mml:mi></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="eqn-15"><label>(15)</label><mml:math id="mml-eqn-15" display="block"><mml:mtable columnalign="right left right left right left right left right left right left" rowspacing="3pt" columnspacing="0em 2em 0em 2em 0em 2em 0em 2em 0em 2em 0em" displaystyle="true"><mml:mtr><mml:mtd /><mml:mtd><mml:mi>R</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:mi>a</mml:mi><mml:mi>l</mml:mi><mml:mi>l</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi><mml:mo>+</mml:mo><mml:mi>F</mml:mi><mml:mi>N</mml:mi></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="eqn-16"><label>(16)</label><mml:math id="mml-eqn-16" display="block"><mml:mtable columnalign="right left right left right left right left right left right left" rowspacing="3pt" columnspacing="0em 2em 0em 2em 0em 2em 0em 2em 0em 2em 0em" displaystyle="true"><mml:mtr><mml:mtd /><mml:mtd><mml:mi>F</mml:mi><mml:mn>1</mml:mn><mml:mo>=</mml:mo><mml:mn>2</mml:mn><mml:mo>&#x22C5;</mml:mo><mml:mfrac><mml:mrow><mml:mi>P</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:mi>i</mml:mi><mml:mi>s</mml:mi><mml:mi>i</mml:mi><mml:mi>o</mml:mi><mml:mi>n</mml:mi><mml:mo>&#x22C5;</mml:mo><mml:mi>R</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:mi>a</mml:mi><mml:mi>l</mml:mi><mml:mi>l</mml:mi></mml:mrow><mml:mrow><mml:mi>P</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:mi>i</mml:mi><mml:mi>s</mml:mi><mml:mi>i</mml:mi><mml:mi>o</mml:mi><mml:mi>n</mml:mi><mml:mo>+</mml:mo><mml:mi>R</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:mi>a</mml:mi><mml:mi>l</mml:mi><mml:mi>l</mml:mi></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula></p>
<p>These evaluation metrics are calculated based on the following variables: True Positive (TP): samples that are predicted as positive by the model and are indeed positive; False Positive (FP): samples that are predicted as positive by the model but are actually negative; False Negative (FN): samples that are predicted as negative by the model but are indeed positive; True Negative (TN): samples that are predicted as negative by the model and are indeed negative. Herein, positive refers to malicious samples, while negative refers to benign samples.</p>
<p>In addition, this paper also employs the Area Under Receiver Operating Characteristic Curve (AUC) as an evaluation metric. The calculation process is shown in <xref ref-type="disp-formula" rid="eqn-17">Eq. (17)</xref>.
<disp-formula id="eqn-17"><label>(17)</label><mml:math id="mml-eqn-17" display="block"><mml:mtable columnalign="right left right left right left right left right left right left" rowspacing="3pt" columnspacing="0em 2em 0em 2em 0em 2em 0em 2em 0em 2em 0em" displaystyle="true"><mml:mtr><mml:mtd /><mml:mtd><mml:mi>A</mml:mi><mml:mi>U</mml:mi><mml:mi>C</mml:mi><mml:mo>=</mml:mo><mml:msubsup><mml:mo>&#x222B;</mml:mo><mml:mn>0</mml:mn><mml:mn>1</mml:mn></mml:msubsup><mml:mfrac><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi><mml:mo>+</mml:mo><mml:mi>F</mml:mi><mml:mi>N</mml:mi></mml:mrow></mml:mfrac><mml:mi>d</mml:mi><mml:mfrac><mml:mrow><mml:mi>F</mml:mi><mml:mi>P</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi><mml:mi>N</mml:mi><mml:mo>+</mml:mo><mml:mi>F</mml:mi><mml:mi>P</mml:mi></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula></p>
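<p>Eq. (17) can be approximated by sweeping a threshold over the predicted scores and integrating the resulting ROC curve trapezoidally (a simplified sketch that does not handle tied scores):</p>

```python
import numpy as np

def roc_auc(y_true, scores):
    """Trapezoidal area under the ROC curve, approximating Eq. (17)."""
    order = np.argsort(-np.asarray(scores))        # descending-score threshold sweep
    y = np.asarray(y_true)[order]
    tpr = np.concatenate([[0.0], np.cumsum(y) / y.sum()])          # TP / (TP + FN)
    fpr = np.concatenate([[0.0], np.cumsum(1 - y) / (1 - y).sum()])  # FP / (TN + FP)
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))
```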
</sec>
<sec id="s4_5">
<label>4.5</label>
<title>Analysis of Experimental Results</title>
<p>In each dataset, stratified sampling with an 8:2 ratio was employed to partition the training and test sets, with 5-fold cross-validation additionally applied to the training set. Subsequently, the performance of the proposed method was evaluated on the test set based on the evaluation metrics defined in the previous section. <xref ref-type="table" rid="table-4">Table 4</xref> presents the experimental results of the proposed method across different datasets.</p>
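<p>The stratified 8:2 split can be sketched as follows (an illustrative index-level implementation; the paper's exact tooling, e.g., Scikit-learn's train_test_split, is not specified):</p>

```python
import numpy as np

def stratified_split(y, test_ratio=0.2, seed=0):
    """8:2 index split that preserves each class's label proportion."""
    rng = np.random.default_rng(seed)
    train, test = [], []
    for label in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == label))  # shuffle within the class
        cut = int(round(len(idx) * test_ratio))
        test.extend(idx[:cut])                             # first 20% of the shuffle
        train.extend(idx[cut:])                            # remaining 80%
    return np.sort(np.array(train)), np.sort(np.array(test))
```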
<table-wrap id="table-4">
<label>Table 4</label>
<caption>
<title>Results of the proposed method across various evaluation metrics</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/> </colgroup>
<thead>
<tr>
<th>Dataset</th>
<th>Accuracy</th>
<th>Precision</th>
<th>Recall</th>
<th>F1</th>
<th>AUC</th>
<th>5-fold</th>
</tr>
</thead>
<tbody>
<tr>
<td>MalBehavD-V1</td>
<td>0.9767</td>
<td>0.9960</td>
<td>0.9572</td>
<td>0.9762</td>
<td>0.9891</td>
<td>0.9728</td>
</tr>
<tr>
<td>Alibaba Cloud</td>
<td>0.9870</td>
<td>0.9910</td>
<td>0.9888</td>
<td>0.9899</td>
<td>0.9975</td>
<td>0.9842</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>On the MalBehavD-V1 dataset, the proposed method achieved outstanding detection performance, with a precision reaching 99.60%, indicating an extremely low false positive rate. The recall rate of 95.72% demonstrates that the model-extracted API subsequence features effectively encompass the malicious behavior patterns of the samples. Compared to the MalBehavD-V1 dataset, the model&#x2019;s recall rate on the Alibaba Cloud dataset increased by 3.16 percentage points, and the AUC improved by 0.0084, indicating that in a larger-scale and more diverse sample environment, the proposed method maintains high detection rates and low false-negative rates, accurately identifying malicious samples.</p>
<p>From a cross-dataset comparative perspective, the proposed method achieves an accuracy exceeding 97% on both datasets, with AUC values approaching or surpassing 0.99, indicating robust classification capability and generalization performance. Regarding the trade-off between false positives and false negatives, the method exhibits a near-zero false positive rate on the MalBehavD-V1 dataset; on the Alibaba Cloud dataset, although the precision slightly decreases, the recall is further enhanced, demonstrating that the model attains a lower false negative rate at an acceptable level of false positives. Furthermore, cross-validation results indicate that the model maintains stable performance even in the presence of distributional differences. <xref ref-type="fig" rid="fig-6">Fig. 6</xref> illustrates the confusion matrices of the proposed method across different datasets.</p>
<fig id="fig-6">
<label>Figure 6</label>
<caption>
<title>Confusion matrices of the proposed method across different datasets. (<bold>a</bold>) MalBehavD-V1 Dataset; (<bold>b</bold>) Alibaba cloud Dataset</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_73076-fig-6.tif"/>
</fig>
<p>In summary, the proposed method not only demonstrates outstanding performance on the small-scale MalBehavD-V1 dataset but also achieves higher recall and AUC on the more challenging Alibaba Cloud dataset, thereby exhibiting a balanced advantage in accuracy, robustness, and generalization capability.</p>
</sec>
<sec id="s4_6">
<label>4.6</label>
<title>Comparison with Different Deep Learning Models</title>
<p>In <xref ref-type="table" rid="table-5">Table 5</xref>, the proposed method is compared with various mainstream deep learning models on two publicly available datasets, MalBehavD-V1 and Alibaba Cloud, in terms of performance. On the MalBehavD-V1 dataset, the proposed method achieves an accuracy of 97.67%, representing an improvement of approximately 1.76% over the best-performing comparative method, Transformer. On the Alibaba Cloud dataset, the proposed method similarly attains the highest detection performance, with an accuracy of 98.70%, surpassing the best-performing comparative model, Gated Recurrent Unit (GRU), by approximately 2.37%, and achieving an AUC of 0.9975 (see <xref ref-type="fig" rid="fig-7">Fig. 7</xref>), approaching near-perfect classification. Experimental results indicate that single-sequence modeling approaches exhibit deficiencies in modeling long sequences and capturing contextual correlations, resulting in performance limitations in high-dimensional, long-dependency API call scenarios. In contrast, the proposed method demonstrates superior generalization capability across different data distributions and scales.</p>
<table-wrap id="table-5">
<label>Table 5</label>
<caption>
<title>Comparison with different deep learning models</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/> </colgroup>
<thead>
<tr>
<th>Dataset</th>
<th>Models</th>
<th>Accuracy</th>
<th>Precision</th>
<th>Recall</th>
<th>F1</th>
<th>5-fold</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>LSTM</td>
<td>0.9280</td>
<td>0.9545</td>
<td>0.8988</td>
<td>0.9259</td>
<td>0.9241</td>
</tr>
<tr>
<td></td>
<td>GRU</td>
<td>0.9358</td>
<td>0.9242</td>
<td>0.9494</td>
<td>0.9367</td>
<td>0.9261</td>
</tr>
<tr>
<td>MalBehavD-V1</td>
<td>Bi-GRU</td>
<td>0.9572</td>
<td>0.9681</td>
<td>0.9455</td>
<td>0.9567</td>
<td>0.9580</td>
</tr>
<tr>
<td></td>
<td>Transformer</td>
<td>0.9591</td>
<td>0.9646</td>
<td>0.9533</td>
<td>0.9589</td>
<td>0.9611</td>
</tr>
<tr>
<td></td>
<td>Ours</td>
<td>0.9767</td>
<td>0.9960</td>
<td>0.9572</td>
<td>0.9762</td>
<td>0.9728</td>
</tr>
<tr>
<td></td>
<td>LSTM</td>
<td>0.6886</td>
<td>0.7301</td>
<td>0.8165</td>
<td>0.7709</td>
<td>0.6810</td>
</tr>
<tr>
<td></td>
<td>GRU</td>
<td>0.9633</td>
<td>0.9661</td>
<td>0.9770</td>
<td>0.9715</td>
<td>0.9593</td>
</tr>
<tr>
<td>Alibaba Cloud</td>
<td>Bi-GRU</td>
<td>0.9590</td>
<td>0.9659</td>
<td>0.9703</td>
<td>0.9681</td>
<td>0.9512</td>
</tr>
<tr>
<td></td>
<td>Transformer</td>
<td>0.9291</td>
<td>0.9505</td>
<td>0.9383</td>
<td>0.9444</td>
<td>0.9270</td>
</tr>
<tr>
<td></td>
<td>Ours</td>
<td>0.9870</td>
<td>0.9910</td>
<td>0.9888</td>
<td>0.9899</td>
<td>0.9842</td>
</tr>
</tbody>
</table>
</table-wrap><fig id="fig-7">
<label>Figure 7</label>
<caption>
<title>Receiver operating characteristic (ROC) curves of various deep learning models across different datasets. (<bold>a</bold>) MalBehavD-V1; (<bold>b</bold>) Alibaba cloud</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_73076-fig-7.tif"/>
</fig>
</sec>
<sec id="s4_7">
<label>4.7</label>
<title>Comparison with Existing Methods</title>
<p>In addition to the baseline model, this paper conducts further comparisons with recent studies on malware behavior analysis based on the same dataset (see <xref ref-type="table" rid="table-6">Table 6</xref>), which include models based on CNN and Recurrent Neural Network (RNN), as well as those based on graph neural networks and Transformers. Despite differences in preprocessing strategies and feature representations, the proposed method demonstrates strong generalization capability across two datasets and various malware behavior patterns. Its consistently superior performance across different datasets and models highlights its potential for practical deployment in large-scale malware detection systems.</p>
<table-wrap id="table-6">
<label>Table 6</label>
<caption>
<title>Comparison of the proposed method with existing approaches</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/> </colgroup>
<thead>
<tr>
<th>Dataset</th>
<th>Approaches</th>
<th>Year</th>
<th>Accuracy</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>MalDetConv [<xref ref-type="bibr" rid="ref-15">15</xref>]</td>
<td>2022</td>
<td>0.9610</td>
</tr>
<tr>
<td></td>
<td>TCN-BiGRU [<xref ref-type="bibr" rid="ref-17">17</xref>]</td>
<td>2025</td>
<td>0.9465</td>
</tr>
<tr>
<td></td>
<td>Tiwari [<xref ref-type="bibr" rid="ref-18">18</xref>]</td>
<td>2024</td>
<td>0.9500</td>
</tr>
<tr>
<td>MalBehavD-V1</td>
<td>Pham et al. [<xref ref-type="bibr" rid="ref-19">19</xref>]</td>
<td>2025</td>
<td>0.9626</td>
</tr>
<tr>
<td></td>
<td>DawnGNN [<xref ref-type="bibr" rid="ref-20">20</xref>]</td>
<td>2024</td>
<td>0.9638</td>
</tr>
<tr>
<td></td>
<td>CNN-GRU-3 [<xref ref-type="bibr" rid="ref-21">21</xref>]</td>
<td>2025</td>
<td>0.9705</td>
</tr>
<tr>
<td></td>
<td>Ours</td>
<td></td>
<td>0.9767</td>
</tr>
<tr>
<td></td>
<td>Xu et al. [<xref ref-type="bibr" rid="ref-22">22</xref>]</td>
<td>2020</td>
<td>0.9344</td>
</tr>
<tr>
<td></td>
<td>Mal-ASSF [<xref ref-type="bibr" rid="ref-23">23</xref>]</td>
<td>2023</td>
<td>0.9449</td>
</tr>
<tr>
<td></td>
<td>Luo et al. [<xref ref-type="bibr" rid="ref-24">24</xref>]</td>
<td>2024</td>
<td>0.9467</td>
</tr>
<tr>
<td></td>
<td>Yan et al. [<xref ref-type="bibr" rid="ref-25">25</xref>]</td>
<td>2025</td>
<td>0.9561</td>
</tr>
<tr>
<td>Alibaba Cloud</td>
<td>DEGCN [<xref ref-type="bibr" rid="ref-26">26</xref>]</td>
<td>2022</td>
<td>0.9610</td>
</tr>
<tr>
<td></td>
<td>SDGNet [<xref ref-type="bibr" rid="ref-27">27</xref>]</td>
<td>2020</td>
<td>0.9730</td>
</tr>
<tr>
<td></td>
<td>GSB [<xref ref-type="bibr" rid="ref-28">28</xref>]</td>
<td>2024</td>
<td>0.9760</td>
</tr>
<tr>
<td></td>
<td>Ours</td>
<td></td>
<td>0.9870</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="s4_8">
<label>4.8</label>
<title>Ablation Experiment</title>
<p>This subsection presents ablation experiments on the fusion model to validate, individually, the effectiveness of the encoding method and of each model component.</p>
<p><xref ref-type="fig" rid="fig-8">Fig. 8</xref> illustrates the impact of different encoding methods on detection performance and on the number of API call types across the datasets. The results indicate that the proposed remapping encoding significantly outperforms traditional embedding-layer and one-hot encoding methods on both datasets (see <xref ref-type="fig" rid="fig-8">Fig. 8a</xref>). Moreover, in terms of the number of API call types, the remapping encoding effectively compresses the feature space, reducing its dimensionality by approximately 20%. This indicates that API call sequences contain redundant APIs that interfere with detection (see <xref ref-type="fig" rid="fig-8">Fig. 8b</xref>). The experimental results demonstrate that semantic feature reconstruction at the encoding stage can significantly enhance API-sequence-based malware detection.</p>
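<p>The remapping idea can be illustrated with a minimal Python sketch. This is not the paper&#x2019;s exact encoding: the frequency threshold, the reserved padding/UNK codes, and the example API names are all illustrative assumptions. It merely shows how collapsing rarely observed APIs into a shared code shrinks the vocabulary, analogous to the roughly 20% reduction reported above.</p>

```python
from collections import Counter

def remap_encode(sequences, min_count=2, unk_id=1):
    """Map API names to integer codes, collapsing rare APIs.

    APIs seen fewer than `min_count` times across the corpus share a
    single UNK code (a stand-in for the paper's semantic remapping;
    the threshold is illustrative). Code 0 is reserved for padding.
    """
    counts = Counter(api for seq in sequences for api in seq)
    kept = sorted(a for a, c in counts.items() if c >= min_count)
    table = {api: i + 2 for i, api in enumerate(kept)}  # 0 = pad, 1 = UNK
    encoded = [[table.get(api, unk_id) for api in seq] for seq in sequences]
    return encoded, table

# Three hypothetical API call sequences.
seqs = [["NtOpenFile", "NtReadFile", "NtClose"],
        ["NtOpenFile", "RegQueryValue", "NtClose"],
        ["NtOpenFile", "NtClose", "ObscureApi"]]
enc, table = remap_encode(seqs)
```

<p>Here the three singleton APIs collapse into one shared code, so the vocabulary shrinks from five distinct APIs to two retained codes plus UNK.</p>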
<fig id="fig-8">
<label>Figure 8</label>
<caption>
<title>Comparison of the effects of different encoding methods on detection performance and the number of API call types across various datasets. (<bold>a</bold>) Accuracy; (<bold>b</bold>) Number of API call types</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_73076-fig-8.tif"/>
</fig>
<p><xref ref-type="table" rid="table-7">Table 7</xref> presents the ablation results of the proposed method on the MalBehavD-V1 and Alibaba Cloud datasets. On both datasets, the standalone 1D-CNN performs relatively well, indicating that 1D-CNN has clear advantages in extracting local behavioral patterns and capturing short-range dependencies. However, its recall is slightly lower than that of the proposed model, implying that local features alone are insufficient to cover all malicious patterns. The standalone Bi-LSTM performs worse than 1D-CNN on both datasets, suggesting that bidirectional sequential models are prone to overfitting or vanishing gradients when handling high-dimensional long sequences. Nonetheless, Bi-LSTM achieves relatively higher recall, indicating its complementary value in capturing long-range dependencies and global temporal information. Incorporating the attention mechanism markedly improves performance, demonstrating its efficacy in mitigating redundancy in long sequences. Overall, the proposed method significantly outperforms the single and partial-combination models on both datasets, achieving the best accuracy, precision, recall, and F1-score, thereby substantiating the effectiveness and necessity of the multi-module fusion design for malicious behavior detection.</p>
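<p>The role the attention component plays in the fusion can be sketched in pure Python. This dot-product attention pooling is a simplified assumption rather than the paper&#x2019;s exact layer: it only shows how time steps whose hidden vectors align with a query receive higher weights, so redundant steps contribute less to the pooled representation.</p>

```python
import math

def attention_pool(hidden, query):
    """Dot-product attention pooling over a sequence of hidden vectors.

    Scores each time step against `query`, applies a numerically
    stable softmax, and returns the weighted sum of hidden vectors
    together with the per-step weights (a simplified sketch of the
    attention component, not the paper's implementation).
    """
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden]
    m = max(scores)                       # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden))
              for d in range(len(hidden[0]))]
    return pooled, weights

# Hypothetical Bi-LSTM outputs for three time steps.
hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
pooled, weights = attention_pool(hidden, query=[1.0, 0.0])
```

<p>Steps 1 and 3, which match the query direction, receive equal and larger weights than step 2, so the pooled vector is dominated by the query-aligned pattern rather than by sequence position.</p>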
<table-wrap id="table-7">
<label>Table 7</label>
<caption>
<title>Comparison of ablation experiment results</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/> </colgroup>
<thead>
<tr>
<th>Dataset</th>
<th>Ablation component</th>
<th>Accuracy</th>
<th>Precision</th>
<th>Recall</th>
<th>F1</th>
<th>5-fold</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>Only 1D-CNN</td>
<td>0.9553</td>
<td>0.9500</td>
<td>0.9611</td>
<td>0.9555</td>
<td>0.9630</td>
</tr>
<tr>
<td>MalBehavD-V1</td>
<td>Only Bi-LSTM</td>
<td>0.9416</td>
<td>0.9316</td>
<td>0.9533</td>
<td>0.9423</td>
<td>0.9514</td>
</tr>
<tr>
<td></td>
<td>Bi-LSTM &#x002B; Attention</td>
<td>0.9630</td>
<td>0.9877</td>
<td>0.9377</td>
<td>0.9621</td>
<td>0.9611</td>
</tr>
<tr>
<td></td>
<td>Ours</td>
<td>0.9767</td>
<td>0.9960</td>
<td>0.9572</td>
<td>0.9762</td>
<td>0.9728</td>
</tr>
<tr>
<td></td>
<td>Only 1D-CNN</td>
<td>0.9755</td>
<td>0.9751</td>
<td>0.9871</td>
<td>0.9810</td>
<td>0.9762</td>
</tr>
<tr>
<td>Alibaba Cloud</td>
<td>Only Bi-LSTM</td>
<td>0.9471</td>
<td>0.9632</td>
<td>0.9540</td>
<td>0.9586</td>
<td>0.9550</td>
</tr>
<tr>
<td></td>
<td>Bi-LSTM &#x002B; Attention</td>
<td>0.9719</td>
<td>0.9897</td>
<td>0.9663</td>
<td>0.9779</td>
<td>0.9634</td>
</tr>
<tr>
<td></td>
<td>Ours</td>
<td>0.9870</td>
<td>0.9910</td>
<td>0.9888</td>
<td>0.9899</td>
<td>0.9842</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="s4_9">
<label>4.9</label>
<title>Different Lengths of API Call Sequences</title>
<p>This paper introduces the percentile length to investigate the impact of sequence length on detection performance. A percentile is a statistical measure of the position of a value within a dataset: the p-th percentile is the value below or equal to which p% of the data fall. Because the lengths of API call sequences are extremely imbalanced across samples, truncation at an arbitrary fixed length fails to yield optimal detection performance. In contrast, the percentile length both characterizes the distribution of sequence lengths across samples and mitigates the influence of outliers on the model. Consequently, this paper selects commonly used percentiles (25%, 50%, 90%, 95%, 99%, 100%) as candidate API call sequence lengths and evaluates detection performance at each.</p>
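<p>The percentile-based truncation described above can be sketched as follows. The nearest-rank percentile and right padding with zeros are common choices assumed here for illustration, and the synthetic length distribution is hypothetical; both may differ in detail from the paper&#x2019;s implementation.</p>

```python
import math

def percentile_length(lengths, p):
    """Nearest-rank p-th percentile of the sequence-length distribution."""
    s = sorted(lengths)
    k = max(0, math.ceil(p * len(s) / 100) - 1)
    return s[k]

def truncate_or_pad(seq, target, pad_id=0):
    """Cut sequences longer than `target`; right-pad shorter ones with 0."""
    return seq[:target] + [pad_id] * max(0, target - len(seq))

lengths = list(range(1, 101))          # synthetic length distribution
cut = percentile_length(lengths, 95)   # nearest-rank 95th percentile -> 95
```

<p>Every sequence is then passed through <code>truncate_or_pad(seq, cut)</code>, so all inputs share the percentile-determined length while outliers no longer dictate the padding cost.</p>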
<p><xref ref-type="table" rid="table-8">Table 8</xref> compares performance across the datasets at different percentile lengths. In the MalBehavD-V1 dataset, where sequences are relatively short and evenly distributed, even a low percentile length maintains high accuracy. Conversely, in the Alibaba Cloud dataset, the length distribution is extremely imbalanced with a substantial number of long-tail sequences, making complete sequences impractical to process. When truncation lengths below the 90th percentile are applied, model accuracy declines significantly, indicating that excessive truncation discards substantial behavioral information. Overall, across both datasets, the 95th percentile length offers the best trade-off between performance and computational efficiency; hence, this truncation length is adopted throughout the experiments in this study.</p>
<table-wrap id="table-8">
<label>Table 8</label>
<caption>
<title>Comparison of percentile lengths utilized across various datasets</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/> </colgroup>
<thead>
<tr>
<th>Dataset</th>
<th>Percentile/%</th>
<th>Sequence length</th>
<th>Training time/s</th>
<th>Accuracy</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>100</td>
<td>174</td>
<td>29.31</td>
<td>0.9747</td>
</tr>
<tr>
<td></td>
<td>99</td>
<td>126</td>
<td>22.55</td>
<td>0.9708</td>
</tr>
<tr>
<td>MalBehavD-V1</td>
<td>95</td>
<td>101</td>
<td>19.62</td>
<td>0.9767</td>
</tr>
<tr>
<td></td>
<td>90</td>
<td>92</td>
<td>18.80</td>
<td>0.9728</td>
</tr>
<tr>
<td></td>
<td>50</td>
<td>37</td>
<td>13.30</td>
<td>0.9455</td>
</tr>
<tr>
<td></td>
<td>25</td>
<td>20</td>
<td>10.68</td>
<td>0.9358</td>
</tr>
<tr>
<td></td>
<td>100</td>
<td>511,775</td>
<td>-&#x002A;</td>
<td>-&#x002A;</td>
</tr>
<tr>
<td></td>
<td>99</td>
<td>29,034</td>
<td>-&#x002A;</td>
<td>-&#x002A;</td>
</tr>
<tr>
<td>Alibaba Cloud</td>
<td>95</td>
<td>11,061</td>
<td>4670.35</td>
<td>0.9870</td>
</tr>
<tr>
<td></td>
<td>90</td>
<td>7101</td>
<td>2947.20</td>
<td>0.9856</td>
</tr>
<tr>
<td></td>
<td>50</td>
<td>659</td>
<td>310.09</td>
<td>0.9672</td>
</tr>
<tr>
<td></td>
<td>25</td>
<td>122</td>
<td>74.06</td>
<td>0.9456</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="table-8fn1" fn-type="other">
<p>Note: &#x002A;Data exceeds the processing capacity of the device.</p>
</fn>
</table-wrap-foot>
</table-wrap>
</sec>
</sec>
<sec id="s5">
<label>5</label>
<title>Limitations and Discussion</title>
<p>Although the proposed method demonstrates excellent detection performance, it may still face multiple challenges in practical real-time deployment, such as computational overhead, sandbox evasion behaviors, and data distribution discrepancies. The following subsections analyze and discuss these challenges.</p>
<sec id="s5_1">
<label>5.1</label>
<title>Computational Overhead</title>
<p>In practical deployment, in addition to detection performance, computational complexity and real-time performance must also be fully considered. The overall time complexity of the proposed method is approximately <inline-formula id="ieqn-31"><mml:math id="mml-ieqn-31"><mml:mrow><mml:mi>&#x1D4AA;</mml:mi></mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>L</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mrow><mml:msup><mml:mi>H</mml:mi><mml:mn>2</mml:mn></mml:msup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:math></inline-formula>, where <italic>L</italic> denotes the sequence length and <italic>H</italic> represents the hidden layer dimension. The model&#x2019;s average single-sample inference time on the Alibaba Cloud dataset is 16.38 ms, which meets the near-real-time requirements of large-scale detection tasks. However, because the model uses bidirectional recurrent layers and attention mechanisms, the computational burden during the training phase remains relatively high. In the future, we will explore more efficient sequence truncation strategies, lightweight model architectures, and distributed training schemes to reduce computational overhead.</p>
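<p>A per-sample latency figure such as the 16.38 ms reported above is typically obtained by wall-clock averaging with warm-up runs discarded. The sketch below is a generic measurement harness under that assumption, not the paper&#x2019;s benchmarking code; <code>predict</code> stands in for any single-sample inference callable.</p>

```python
import time

def mean_inference_ms(predict, samples, warmup=3):
    """Average per-sample inference latency in milliseconds.

    A few warm-up calls are discarded so cold-start effects (caching,
    lazy initialization) do not inflate the average; the remaining
    calls are timed in one block and divided by the sample count.
    """
    for s in samples[:warmup]:
        predict(s)
    t0 = time.perf_counter()
    for s in samples:
        predict(s)
    return (time.perf_counter() - t0) * 1000.0 / len(samples)
```

<p>Averaging over many samples smooths out scheduler jitter, which matters when the per-sample time is on the order of milliseconds.</p>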
</sec>
<sec id="s5_2">
<label>5.2</label>
<title>Sandbox Evasion Behaviors</title>
<p>This paper primarily conducts malware detection research based on API call sequences within the Windows system, extracting behavioral features through the dynamic execution of malicious samples in sandbox environments. However, with the continuous advancement of anti-sandbox techniques, certain malicious programs can identify sandbox environments and adopt countermeasures to evade dynamic analysis. Future research may explore hybrid analysis strategies that integrate static and dynamic features, as well as model cross-platform adaptability.</p>
</sec>
<sec id="s5_3">
<label>5.3</label>
<title>Data Distribution Discrepancies</title>
<p>The MalBehavD-V1 and Alibaba Cloud datasets cover a limited range of malware sample types, which may differ from the diverse attack behaviors encountered in real-world environments. Future work could employ incremental learning or adversarial training so that the model continuously learns new malware types while retaining detection capability for existing samples. Moreover, although the Alibaba Cloud dataset already exhibits some class imbalance, malware data in practical applications are often far more imbalanced and dynamically variable. Distributional differences across data collected at different times or from different sources may cause fluctuations in model performance. Future research could explore transfer learning and adaptive learning strategies to enhance the model&#x2019;s robustness under distribution drift.</p>
</sec>
</sec>
<sec id="s6">
<label>6</label>
<title>Conclusion</title>
<p>This paper proposes a dynamic malware detection method based on multiple API subsequences, aiming to streamline API call sequences and extract key behavioral features from them. First, the original API call sequences undergo remapping encoding, transforming each API into a numerical representation with semantic discriminative capability to enhance the model&#x2019;s ability to distinguish different APIs. A fusion architecture is then constructed, incorporating two sub-models, 1D-CNN and Bi-LSTM, to model and integrate features from subsequences of varying lengths. Experimental results demonstrate that the proposed method outperforms existing approaches on two public datasets, showing robust generalization. Furthermore, by analyzing the distribution of sequence lengths in each dataset, the 95th percentile length was adopted as the truncation length for input sequences, effectively balancing preservation of sequence information against computational efficiency and thereby enhancing overall detection performance. However, the proposed method currently relies primarily on dynamic features and does not exploit static features. In future research, we will explore fusion strategies for dynamic and static features to construct a more comprehensive hybrid malware detection framework, enabling more thorough and accurate malware identification.</p>
</sec>
</body>
<back>
<ack>
<p>The authors acknowledge the foundational support of Hubei Minzu University, whose infrastructure and funding were instrumental in conducting this study.</p>
</ack>
<sec>
<title>Funding Statement</title>
<p>This study was supported by the National Natural Science Foundation of China (62262020) and the Graduate Education Innovation Project of Hubei Minzu University (MYK2024025).</p>
</sec>
<sec>
<title>Author Contributions</title>
<p>Conceptualization and study design were performed by Jinhuo Liang. Jinan Shen conducted the experiments and data collection. Data analysis and interpretation were carried out by Jinan Shen and Pengfei Wang. The original draft of the manuscript was written by Jinhuo Liang and Jinan Shen. Pengfei Wang, Fang Liang and Xuejian Deng critically reviewed and edited the manuscript. Fang Liang provided technical and material support. Xuejian Deng supervised the entire project. All authors reviewed the results and approved the final version of the manuscript.</p>
</sec>
<sec sec-type="data-availability">
<title>Availability of Data and Materials</title>
<p>The datasets utilized in this study are available in MalbehavD-V1 at <ext-link ext-link-type="uri" xlink:href="https://github.com/mpasco/MalbehavD-V1">https://github.com/mpasco/MalbehavD-V1</ext-link> (accessed on 09 December 2025) and in Alibaba Cloud at <ext-link ext-link-type="uri" xlink:href="https://tianchi.aliyun.com/competition/entrance/231694/information">https://tianchi.aliyun.com/competition/entrance/231694/information</ext-link> (accessed on 09 December 2025).</p>
</sec>
<sec>
<title>Ethics Approval</title>
<p>Not applicable.</p>
</sec>
<sec sec-type="COI-statement">
<title>Conflicts of Interest</title>
<p>The authors declare no conflicts of interest to report regarding the present study.</p>
</sec>
<ref-list content-type="authoryear">
<title>References</title>
<ref id="ref-1"><label>[1]</label><mixed-citation publication-type="other"><article-title>Malware Statistics &#x0026; Trends Report &#x007C; AV-TEST&#x2014;av-test.org</article-title>. <comment>[cited 2024 Sep 20]</comment>. Available from: <ext-link ext-link-type="uri" xlink:href="https://www.av-test.org/en/statistics/malware/">https://www.av-test.org/en/statistics/malware/</ext-link>.</mixed-citation></ref>
<ref id="ref-2"><label>[2]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Gopinath</surname> <given-names>M</given-names></string-name>, <string-name><surname>Sethuraman</surname> <given-names>SC</given-names></string-name></person-group>. <article-title>A comprehensive survey on deep learning based malware detection techniques</article-title>. <source>Comput Sci Rev</source>. <year>2023</year>;<volume>47</volume>:<fpage>100529</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.cosrev.2022.100529</pub-id>.</mixed-citation></ref>
<ref id="ref-3"><label>[3]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Brezinski</surname> <given-names>K</given-names></string-name>, <string-name><surname>Ferens</surname> <given-names>K</given-names></string-name></person-group>. <article-title>Metamorphic malware and obfuscation: a survey of techniques, variants, and generation kits</article-title>. <source>Secur Commun Netw</source>. <year>2023</year>;<volume>2023</volume>(<issue>1</issue>):<fpage>8227751</fpage>. doi:<pub-id pub-id-type="doi">10.1155/2023/8227751</pub-id>.</mixed-citation></ref>
<ref id="ref-4"><label>[4]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Begovic</surname> <given-names>K</given-names></string-name>, <string-name><surname>Al-Ali</surname> <given-names>A</given-names></string-name>, <string-name><surname>Malluhi</surname> <given-names>Q</given-names></string-name></person-group>. <article-title>Cryptographic ransomware encryption detection: survey</article-title>. <source>Comput Secur</source>. <year>2023</year>;<volume>132</volume>:<fpage>103349</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.cose.2023.103349</pub-id>.</mixed-citation></ref>
<ref id="ref-5"><label>[5]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Sharma</surname> <given-names>A</given-names></string-name>, <string-name><surname>Gupta</surname> <given-names>BB</given-names></string-name>, <string-name><surname>Singh</surname> <given-names>AK</given-names></string-name>, <string-name><surname>Saraswat</surname> <given-names>V</given-names></string-name></person-group>. <article-title>Orchestration of APT malware evasive manoeuvers employed for eluding anti-virus and sandbox defense</article-title>. <source>Comput Secur</source>. <year>2022</year>;<volume>115</volume>:<fpage>102627</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.cose.2022.102627</pub-id>.</mixed-citation></ref>
<ref id="ref-6"><label>[6]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Nadler</surname> <given-names>A</given-names></string-name>, <string-name><surname>Bitton</surname> <given-names>R</given-names></string-name>, <string-name><surname>Brodt</surname> <given-names>O</given-names></string-name>, <string-name><surname>Shabtai</surname> <given-names>A</given-names></string-name></person-group>. <article-title>On the vulnerability of anti-malware solutions to DNS attacks</article-title>. <source>Comput Secur</source>. <year>2022</year>;<volume>116</volume>:<fpage>102687</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.cose.2022.102687</pub-id>.</mixed-citation></ref>
<ref id="ref-7"><label>[7]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Kambar</surname> <given-names>MEZN</given-names></string-name>, <string-name><surname>Esmaeilzadeh</surname> <given-names>A</given-names></string-name>, <string-name><surname>Kim</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Taghva</surname> <given-names>K</given-names></string-name></person-group>. <article-title>A survey on mobile malware detection methods using machine learning</article-title>. In: <conf-name>2022 IEEE 12th Annual Computing and Communication Workshop and Conference (CCWC); 2022 Jan 26&#x2013;29</conf-name>; <publisher-loc>Las Vegas, NV, USA</publisher-loc>. p. <fpage>215</fpage>&#x2013;<lpage>21</lpage>.</mixed-citation></ref>
<ref id="ref-8"><label>[8]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>de Oliveira</surname> <given-names>AS</given-names></string-name>, <string-name><surname>Sassi</surname> <given-names>RJ</given-names></string-name></person-group>. <article-title>Behavioral malware detection using deep graph convolutional neural networks</article-title>. <source>Int J Comput Appl</source>. <year>2021</year>;<volume>174</volume>(<issue>29</issue>):<fpage>1</fpage>&#x2013;<lpage>8</lpage>. doi:<pub-id pub-id-type="doi">10.36227/techrxiv.10043099.v1</pub-id>.</mixed-citation></ref>
<ref id="ref-9"><label>[9]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Agrawal</surname> <given-names>R</given-names></string-name>, <string-name><surname>Stokes</surname> <given-names>JW</given-names></string-name>, <string-name><surname>Marinescu</surname> <given-names>M</given-names></string-name>, <string-name><surname>Selvaraj</surname> <given-names>K</given-names></string-name></person-group>. <article-title>Neural sequential malware detection with parameters</article-title>. In: <conf-name>Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); 2018 Apr 15&#x2013;20</conf-name>; <publisher-loc>Calgary, AB, Canada</publisher-loc>. p. <fpage>2656</fpage>&#x2013;<lpage>60</lpage>.</mixed-citation></ref>
<ref id="ref-10"><label>[10]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Kang</surname> <given-names>J</given-names></string-name>, <string-name><surname>Jang</surname> <given-names>S</given-names></string-name>, <string-name><surname>Li</surname> <given-names>S</given-names></string-name>, <string-name><surname>Jeong</surname> <given-names>YS</given-names></string-name>, <string-name><surname>Sung</surname> <given-names>Y</given-names></string-name></person-group>. <article-title>Long short-term memory-based malware classification method for information security</article-title>. <source>Comput Electr Eng</source>. <year>2019</year>;<volume>77</volume>:<fpage>366</fpage>&#x2013;<lpage>75</lpage>. doi:<pub-id pub-id-type="doi">10.1016/j.compeleceng.2019.06.014</pub-id>.</mixed-citation></ref>
<ref id="ref-11"><label>[11]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Catak</surname> <given-names>FO</given-names></string-name>, <string-name><surname>Yaz&#x0131;</surname> <given-names>AF</given-names></string-name>, <string-name><surname>Elezaj</surname> <given-names>O</given-names></string-name>, <string-name><surname>Ahmed</surname> <given-names>J</given-names></string-name></person-group>. <article-title>Deep learning based Sequential model for malware analysis using Windows exe API Calls</article-title>. <source>PeerJ Comput Sci</source>. <year>2020</year>;<volume>6</volume>:<fpage>e285</fpage>. doi:<pub-id pub-id-type="doi">10.7717/peerj-cs.285</pub-id>; <pub-id pub-id-type="pmid">33816936</pub-id></mixed-citation></ref>
<ref id="ref-12"><label>[12]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Lu</surname> <given-names>X</given-names></string-name>, <string-name><surname>Jiang</surname> <given-names>F</given-names></string-name>, <string-name><surname>Zhou</surname> <given-names>X</given-names></string-name>, <string-name><surname>Yi</surname> <given-names>S</given-names></string-name>, <string-name><surname>Sha</surname> <given-names>J</given-names></string-name>, <string-name><surname>Pietro</surname> <given-names>L</given-names></string-name></person-group>. <article-title>ASSCA: API sequence and statistics features combined architecture for malware detection</article-title>. <source>Comput Netw</source>. <year>2019</year>;<volume>157</volume>:<fpage>99</fpage>&#x2013;<lpage>111</lpage>. doi:<pub-id pub-id-type="doi">10.1016/j.comnet.2019.04.007</pub-id>.</mixed-citation></ref>
<ref id="ref-13"><label>[13]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Li</surname> <given-names>C</given-names></string-name>, <string-name><surname>Lv</surname> <given-names>Q</given-names></string-name>, <string-name><surname>Li</surname> <given-names>N</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Sun</surname> <given-names>D</given-names></string-name>, <string-name><surname>Qiao</surname> <given-names>Y</given-names></string-name></person-group>. <article-title>A novel deep framework for dynamic malware detection based on API sequence intrinsic features</article-title>. <source>Comput Secur</source>. <year>2022</year>;<volume>116</volume>:<fpage>102686</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.cose.2022.102686</pub-id>.</mixed-citation></ref>
<ref id="ref-14"><label>[14]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Iqbal</surname> <given-names>A</given-names></string-name>, <string-name><surname>Hussain</surname> <given-names>M</given-names></string-name>, <string-name><surname>Riaz</surname> <given-names>Q</given-names></string-name>, <string-name><surname>Khalid</surname> <given-names>M</given-names></string-name>, <string-name><surname>Mumtaz</surname> <given-names>R</given-names></string-name>, <string-name><surname>Jung</surname> <given-names>KH</given-names></string-name></person-group>. <article-title>Enhancing ransomware detection with machine learning techniques and effective API integration</article-title>. <source>Comput Mater Contin</source>. <year>2025</year>;<volume>85</volume>(<issue>1</issue>):<fpage>1693</fpage>&#x2013;<lpage>714</lpage>. doi:<pub-id pub-id-type="doi">10.32604/cmc.2025.064260</pub-id>.</mixed-citation></ref>
<ref id="ref-15"><label>[15]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><surname>Maniriho</surname> <given-names>P</given-names></string-name>, <string-name><surname>Mahmood</surname> <given-names>AN</given-names></string-name>, <string-name><surname>Chowdhury</surname> <given-names>MJM</given-names></string-name></person-group>. <article-title>MalDetConv: automated behaviour-based malware detection framework based on natural language processing and deep learning techniques</article-title>. <comment>arXiv:2209.03547. 2022</comment>.</mixed-citation></ref>
<ref id="ref-16"><label>[16]</label><mixed-citation publication-type="other"><article-title>Alibaba cloud malware detection based on behaviors&#x2014;tianchi.aliyun.com</article-title>. <comment>[cited 2024 Aug 20]</comment>. Available from: <ext-link ext-link-type="uri" xlink:href="https://tianchi.aliyun.com/competition/entrance/231694/information">https://tianchi.aliyun.com/competition/entrance/231694/information</ext-link>.</mixed-citation></ref>
<ref id="ref-17"><label>[17]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Aswin</surname> <given-names>V</given-names></string-name>, <string-name><surname>Kumar</surname> <given-names>BS</given-names></string-name></person-group>. <article-title>TCN-BiGRU model for malware detection based on API call sequences</article-title>. In: <conf-name>Proceedings of the 2025 5th International Conference on Pervasive Computing and Social Networking (ICPCSN); 2025 May 14&#x2013;16</conf-name>; <publisher-loc>Salem, India</publisher-loc>. p. <fpage>1461</fpage>&#x2013;<lpage>6</lpage>.</mixed-citation></ref>
<ref id="ref-18"><label>[18]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Tiwari</surname> <given-names>PK</given-names></string-name></person-group>. <article-title>Malware detection using control flow graphs</article-title>. In: <conf-name>Proceedings of the 2024 2nd International Conference on Device Intelligence, Computing and Communication Technologies (DICCT); 2024 Mar 15&#x2013;16</conf-name>; <publisher-loc>Dehradun, India</publisher-loc>. p. <fpage>216</fpage>&#x2013;<lpage>20</lpage>.</mixed-citation></ref>
<ref id="ref-19"><label>[19]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Pham</surname> <given-names>TB</given-names></string-name>, <string-name><surname>Duong</surname> <given-names>PHT</given-names></string-name>, <string-name><surname>Nguyen</surname> <given-names>DK</given-names></string-name>, <string-name><surname>Hien</surname> <given-names>DTT</given-names></string-name>, <string-name><surname>Cam</surname> <given-names>NT</given-names></string-name>, <string-name><surname>Pham</surname> <given-names>VH</given-names></string-name></person-group>. <article-title>Multimodal windows malware detection via hybrid analysis and enriched graphs: effectiveness and explainability</article-title>. In: <conf-name>Proceedings of the 2025 International Conference on Multimedia Analysis and Pattern Recognition (MAPR); 2025 Aug 14&#x2013;15</conf-name>; <publisher-loc>Khanh Hoa, Vietnam</publisher-loc>. p. <fpage>1</fpage>&#x2013;<lpage>6</lpage>.</mixed-citation></ref>
<ref id="ref-20"><label>[20]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Feng</surname> <given-names>P</given-names></string-name>, <string-name><surname>Gai</surname> <given-names>L</given-names></string-name>, <string-name><surname>Yang</surname> <given-names>L</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>Q</given-names></string-name>, <string-name><surname>Li</surname> <given-names>T</given-names></string-name>, <string-name><surname>Xi</surname> <given-names>N</given-names></string-name>, <etal>et al.</etal></person-group> <article-title>DawnGNN: documentation augmented windows malware detection using graph neural network</article-title>. <source>Comput Secur</source>. <year>2024</year>;<volume>140</volume>:<fpage>103788</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.cose.2024.103788</pub-id>.</mixed-citation></ref>
<ref id="ref-21"><label>[21]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Sar&#x0131;</surname> <given-names>NV</given-names></string-name>, <string-name><surname>Ac&#x0131;</surname> <given-names>M</given-names></string-name>, <string-name><surname>Ac&#x0131;</surname> <given-names>&#x00C7;&#x0130;</given-names></string-name></person-group>. <article-title>Windows malware detection via enhanced graph representations with Node2Vec and graph attention network</article-title>. <source>Appl Sci</source>. <year>2025</year>;<volume>15</volume>(<issue>9</issue>):<fpage>4775</fpage>. doi:<pub-id pub-id-type="doi">10.3390/app15094775</pub-id>.</mixed-citation></ref>
<ref id="ref-22"><label>[22]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Xu</surname> <given-names>A</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>L</given-names></string-name>, <string-name><surname>Kuang</surname> <given-names>X</given-names></string-name>, <string-name><surname>Lv</surname> <given-names>H</given-names></string-name>, <string-name><surname>Yang</surname> <given-names>H</given-names></string-name>, <string-name><surname>Jiang</surname> <given-names>Y</given-names></string-name>, <etal>et al.</etal></person-group> <article-title>A hybrid deep learning model for malicious behavior detection</article-title>. In: <conf-name>Proceedings of the 2020 IEEE 6th Intl Conference on Big Data Security on Cloud (BigDataSecurity), IEEE Intl Conference on High Performance and Smart Computing, (HPSC) and IEEE Intl Conference on Intelligent Data and Security (IDS); 2020 May 25&#x2013;27</conf-name>; <publisher-loc>Baltimore, MD, USA</publisher-loc>. p. <fpage>55</fpage>&#x2013;<lpage>9</lpage>.</mixed-citation></ref>
<ref id="ref-23"><label>[23]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zhang</surname> <given-names>S</given-names></string-name>, <string-name><surname>Wu</surname> <given-names>J</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>M</given-names></string-name>, <string-name><surname>Yang</surname> <given-names>W</given-names></string-name></person-group>. <article-title>Dynamic malware analysis based on API sequence semantic fusion</article-title>. <source>Appl Sci</source>. <year>2023</year>;<volume>13</volume>(<issue>11</issue>):<fpage>6526</fpage>. doi:<pub-id pub-id-type="doi">10.3390/app13116526</pub-id>.</mixed-citation></ref>
<ref id="ref-24"><label>[24]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Luo</surname> <given-names>J</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Luo</surname> <given-names>J</given-names></string-name>, <string-name><surname>Yang</surname> <given-names>P</given-names></string-name>, <string-name><surname>Jing</surname> <given-names>R</given-names></string-name></person-group>. <article-title>Sequence-based malware detection using a single-bidirectional graph embedding and multi-task learning framework</article-title>. <source>J Comput Secur</source>. <year>2024</year>;<volume>32</volume>(<issue>2</issue>):<fpage>141</fpage>&#x2013;<lpage>63</lpage>. doi:<pub-id pub-id-type="doi">10.3233/jcs-230041</pub-id>.</mixed-citation></ref>
<ref id="ref-25"><label>[25]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Yan</surname> <given-names>P</given-names></string-name>, <string-name><surname>Tan</surname> <given-names>S</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>M</given-names></string-name>, <string-name><surname>Huang</surname> <given-names>J</given-names></string-name></person-group>. <article-title>Prompt engineering-assisted malware dynamic analysis using GPT-4</article-title>. <source>IEEE Trans Dependable Secure Comput</source>. <year>2025</year>;<volume>22</volume>(<issue>6</issue>):<fpage>7712</fpage>&#x2013;<lpage>28</lpage>. doi:<pub-id pub-id-type="doi">10.1109/tdsc.2025.3599004</pub-id>.</mixed-citation></ref>
<ref id="ref-26"><label>[26]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zhang</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Li</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>W</given-names></string-name>, <string-name><surname>Song</surname> <given-names>H</given-names></string-name>, <string-name><surname>Dong</surname> <given-names>H</given-names></string-name></person-group>. <article-title>Malware detection with dynamic evolving graph convolutional networks</article-title>. <source>Int J Intell Syst</source>. <year>2022</year>;<volume>37</volume>(<issue>10</issue>):<fpage>7261</fpage>&#x2013;<lpage>80</lpage>. doi:<pub-id pub-id-type="doi">10.1002/int.22880</pub-id>.</mixed-citation></ref>
<ref id="ref-27"><label>[27]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zhang</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Li</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Dong</surname> <given-names>H</given-names></string-name>, <string-name><surname>Gao</surname> <given-names>H</given-names></string-name>, <string-name><surname>Jin</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>W</given-names></string-name></person-group>. <article-title>Spectral-based directed graph network for malware detection</article-title>. <source>IEEE Trans Netw Sci Eng</source>. <year>2020</year>;<volume>8</volume>(<issue>2</issue>):<fpage>957</fpage>&#x2013;<lpage>70</lpage>. doi:<pub-id pub-id-type="doi">10.1109/tnse.2020.3024557</pub-id>.</mixed-citation></ref>
<ref id="ref-28"><label>[28]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Hu</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Liu</surname> <given-names>G</given-names></string-name>, <string-name><surname>Xiang</surname> <given-names>X</given-names></string-name>, <string-name><surname>Li</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Zhuang</surname> <given-names>S</given-names></string-name></person-group>. <article-title>GSB: GNGS and SAG-BiGRU network for malware dynamic detection</article-title>. <source>PLoS One</source>. <year>2024</year>;<volume>19</volume>(<issue>4</issue>):<fpage>e0298809</fpage>. doi:<pub-id pub-id-type="doi">10.1371/journal.pone.0298809</pub-id>; <pub-id pub-id-type="pmid">38635682</pub-id>.</mixed-citation></ref>
</ref-list>
</back></article>