<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.1 20151215//EN" "http://jats.nlm.nih.gov/publishing/1.1/JATS-journalpublishing1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xml:lang="en" article-type="research-article" dtd-version="1.1">
<front>
<journal-meta>
<journal-id journal-id-type="pmc">CMC</journal-id>
<journal-id journal-id-type="nlm-ta">CMC</journal-id>
<journal-id journal-id-type="publisher-id">CMC</journal-id>
<journal-title-group>
<journal-title>Computers, Materials &#x0026; Continua</journal-title>
</journal-title-group>
<issn pub-type="epub">1546-2226</issn>
<issn pub-type="ppub">1546-2218</issn>
<publisher>
<publisher-name>Tech Science Press</publisher-name>
<publisher-loc>USA</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">53163</article-id>
<article-id pub-id-type="doi">10.32604/cmc.2024.053163</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>An Attention-Based Approach to Enhance the Detection and Classification of Android Malware</article-title>
<alt-title alt-title-type="left-running-head">An Attention-Based Approach to Enhance the Detection and Classification of Android Malware</alt-title>
<alt-title alt-title-type="right-running-head">An Attention-Based Approach to Enhance the Detection and Classification of Android Malware</alt-title>
</title-group>
<contrib-group>
<contrib id="author-1" contrib-type="author" corresp="yes">
<name name-style="western"><surname>Ghourabi</surname><given-names>Abdallah</given-names></name><email>aghourabi@ju.edu.sa</email></contrib>
<aff><institution>Department of Computer Science, College of Computer and Information Sciences, Jouf University</institution>, <addr-line>Sakaka, 72388</addr-line>, <country>Saudi Arabia</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>&#x002A;</label>Corresponding Author: Abdallah Ghourabi. Email: <email>aghourabi@ju.edu.sa</email></corresp>
</author-notes>
<pub-date date-type="collection" publication-format="electronic">
<year>2024</year></pub-date>
<pub-date date-type="pub" publication-format="electronic">
<day>15</day>
<month>8</month>
<year>2024</year></pub-date>
<volume>80</volume>
<issue>2</issue>
<fpage>2743</fpage>
<lpage>2760</lpage>
<history>
<date date-type="received">
<day>26</day>
<month>4</month>
<year>2024</year>
</date>
<date date-type="accepted">
<day>07</day>
<month>6</month>
<year>2024</year>
</date>
</history>
<permissions>
<copyright-statement>&#x00A9; 2024 Ghourabi</copyright-statement>
<copyright-year>2024</copyright-year>
<copyright-holder>Ghourabi</copyright-holder>
<license xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>This work is licensed under a <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</ext-link>, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<self-uri content-type="pdf" xlink:href="TSP_CMC_53163.pdf"></self-uri>
<abstract>
<p>The dominance of Android in the global mobile market and the open development characteristics of this platform have resulted in a significant increase in malware. These malicious applications have become a serious concern for the security of Android systems. To address this problem, researchers have proposed several machine-learning models to detect and classify Android malware based on features extracted from Android samples. However, most existing studies have focused on the classification task and overlooked the feature selection process, which is crucial for reducing training time while maintaining or improving classification results. The current paper proposes a new Android malware detection and classification approach that identifies the most important features to improve classification performance and reduce training time. The proposed approach consists of two main steps. First, a feature selection method based on the Attention mechanism is used to select the most important features. Then, an optimized Light Gradient Boosting Machine (LightGBM) classifier is applied to classify the Android samples and identify the malware. The feature selection method proposed in this paper integrates an Attention layer into a multilayer perceptron neural network. The role of the Attention layer is to compute a weighted value for each feature based on its importance to the classification process. Experimental evaluation of the approach showed that combining the Attention-based technique with an optimized classification algorithm improved the accuracy of Android malware detection from 98.64% to 98.71% while reducing the training time from 80 to 28 s.</p>
</abstract>
<kwd-group kwd-group-type="author">
<kwd>Android malware</kwd>
<kwd>malware detection</kwd>
<kwd>feature selection</kwd>
<kwd>attention mechanism</kwd>
<kwd>LightGBM</kwd>
<kwd>mobile security</kwd>
</kwd-group>
<funding-group>
<award-group id="awg1">
<funding-source>Deanship of Graduate Studies and Scientific Research at Jouf University</funding-source>
<award-id>DGSSR-2023-02-02178</award-id>
</award-group>
</funding-group>
</article-meta>
</front>
<body>
<sec id="s1">
<label>1</label>
<title>Introduction</title>
<p>The mobile application market has grown rapidly in recent years. According to the State of Mobile 2022 report by App Annie [<xref ref-type="bibr" rid="ref-1">1</xref>], 170 billion dollars were spent in application stores in 2021, 230 billion new applications were downloaded, and an average of 4.8 hours per day was spent on mobile devices. The major part of this application market belongs to the Android environment: as of August 2022, an estimated 71.52% of mobile phone users were using Android (according to StatCounter [<xref ref-type="bibr" rid="ref-2">2</xref>]). This great popularity has made Android applications the target of several types of malware. In fact, the personal data stored on Android devices, including account credentials and banking transactions, attract hackers and encourage them to carry out malicious behavior through Android applications.</p>
<p>A recent report on the evolution of mobile malware [<xref ref-type="bibr" rid="ref-3">3</xref>] indicated that, in 2021, Kaspersky Labs detected a large number of malicious mobile applications, including 97,661 new mobile banking Trojans, 3,464,756 malicious mobile installer packages, and 17,372 new mobile ransomware samples. The same report also showed that mobile attacks are becoming increasingly sophisticated. Even with Google's efforts to keep threats away from its application platform, experts still regularly find malware on Google Play. As for mobile malware detection systems, it remains challenging to create detection mechanisms as effective as those for personal computers, due to several constraints associated with the characteristics of mobile devices.</p>
<p>The research community has proposed several machine learning solutions for Android malware detection. These solutions usually involve two main steps. The first step is the extraction and selection of features from the applications. The second is the creation of a classifier to distinguish malicious from benign applications. Selecting appropriate techniques for both steps is crucial to achieving efficient classification results, especially if one wants to deploy these systems on mobile devices with limited hardware performance. For instance, the feature collection phase requires extracting diverse features, which may comprise requested permissions, API calls, opcodes, activities, and system features. This large number of features significantly increases the size of the dataset and can affect the classifier&#x2019;s execution time and accuracy. The choice of the classifier is also vital in this context. For example, deep learning classifiers achieve excellent classification results; however, they are known for their high memory footprint and slow execution. In the current paper, we propose an approach for Android malware detection whose feature selection and classification techniques both maintain a high level of classification accuracy and reduce the execution time of the model. Several recent studies [<xref ref-type="bibr" rid="ref-4">4</xref>&#x2013;<xref ref-type="bibr" rid="ref-6">6</xref>] have shown that data optimization using feature selection techniques can improve the performance of attack and anomaly detection systems, especially in environments with limited hardware, such as mobile and IoT devices.</p>
<p>The set of features extracted from Android applications for malware analysis is large and diverse. This can result in a large amount of data that also contains irrelevant and redundant elements, making the classification task extremely complex. For example, in the CCCS-CIC-AndMal-2020 dataset, which we used for the experimental evaluation of our approach, each instance contains 5911 features. Analyzing such a high-dimensional dataset is challenging, as it requires longer processing times and more storage and can potentially lead to misclassification of Android applications. This work proposes an innovative feature selection method for dimensionality reduction. In the literature, several research studies have proposed methods to select features and reduce their dimensionality in Android malware classification systems. The majority of these works are based on statistical methods such as chi-square [<xref ref-type="bibr" rid="ref-7">7</xref>,<xref ref-type="bibr" rid="ref-8">8</xref>], PCA [<xref ref-type="bibr" rid="ref-8">8</xref>,<xref ref-type="bibr" rid="ref-9">9</xref>], information gain [<xref ref-type="bibr" rid="ref-10">10</xref>,<xref ref-type="bibr" rid="ref-11">11</xref>] and the Fast Correlation-Based Filter [<xref ref-type="bibr" rid="ref-12">12</xref>]. Other studies have considered techniques based on neural networks, such as Restricted Boltzmann Machines [<xref ref-type="bibr" rid="ref-13">13</xref>] and RNN [<xref ref-type="bibr" rid="ref-14">14</xref>]. In other works, researchers have tried techniques based on graphs and trees, such as the Dominance Tree [<xref ref-type="bibr" rid="ref-15">15</xref>] and Random Forest [<xref ref-type="bibr" rid="ref-16">16</xref>]. The present paper complements and extends previous research efforts by exploring the effectiveness of the Attention mechanism as a feature selection technique for mobile malware detection and classification. The Attention mechanism has been very successful in the ML community in recent years, especially in NLP tasks. Based on a neural network architecture, this mechanism utilizes a weighted vector to help understand the relationship between the input features and the target and to estimate which parts of the data are more pivotal than others for the ML task. Despite this success in natural language processing, the mechanism has not yet been well explored for mobile malware detection.</p>
<p>In this paper, we propose an Android malware detection system based on the Attention mechanism and the LightGBM classifier. The goal of our solution is to optimize the performance of the Android malware detection systems by reducing the number of features while maintaining optimal classification accuracy and ensuring a fast and efficient classification process.</p>
<p>The main contributions of our work are summarized below:
<list list-type="bullet">
<list-item>
<p>The proposal of a neural network based on the Attention mechanism dedicated to reducing the number of features and selecting only those crucial to the classification results.</p></list-item>
<list-item>
<p>The application of a classification model created using a LightGBM algorithm optimized through the use of the Bayesian method.</p></list-item>
<list-item>
<p>The proposal of an optimized detection system that can be integrated into mobile devices while ensuring fast execution and high accuracy.</p></list-item>
<list-item>
<p>The evaluation of the proposed solution demonstrated a reduction in processing time and a slight improvement in classification accuracy from 98.64% to 98.71%.</p></list-item>
</list></p>
<p>The rest of the paper is organized as follows: <xref ref-type="sec" rid="s2">Section 2</xref> reviews the related works. <xref ref-type="sec" rid="s3">Section 3</xref> describes our approach and its model design. <xref ref-type="sec" rid="s4">Section 4</xref> presents the results of the experimental tests. Finally, we conclude the paper in <xref ref-type="sec" rid="s5">Section 5</xref>.</p>
</sec>
<sec id="s2">
<label>2</label>
<title>Related Work</title>
<p>In this section, we provide an overview of recent work on feature dimensionality reduction in the field of Android malware detection and classification. These works are classified into three categories according to the type of feature selection method used: statistical methods, neural-network-based methods, and tree-based methods.</p>
<sec id="s2_1">
<label>2.1</label>
<title>Statistical Based Methods</title>
<p>Most papers in the literature favor the use of statistical techniques (such as the chi-square test, Fisher&#x2019;s score, and information gain) to select or reduce features, due to their speed and low computational cost. For example, Cai et al. [<xref ref-type="bibr" rid="ref-10">10</xref>] proposed JOWMDroid, a feature-based approach to detecting Android malware that combines weight-mapping optimization with classifier parameter optimization. The idea is to utilize information gain to select the most relevant features from eight categories of features extracted from APK files. Next, three machine learning algorithms are used to calculate an initial weight for each feature, which is then mapped to a final weight. The final step is to jointly optimize the parameters of the weighting function and the classifier using the differential evolution algorithm.</p>
<p>Another work dealing with feature reduction for Android malware detection was proposed by Xie et al. [<xref ref-type="bibr" rid="ref-11">11</xref>]. In their approach, they employed a two-step feature selection method consisting of using InfoGain for an initial selection and applying the chi-square test for further reduction to remove redundant and irrelevant features. For the classification of Android malware, they used a stacking method of 5 base classifiers with the use of the Genetic Algorithm to optimize the hyperparameters of the model.</p>
<p>In [<xref ref-type="bibr" rid="ref-12">12</xref>], the authors introduced a malware detection framework called FAMD (Fast Android Malware Detector). It utilizes the FCBF (Fast Correlation-Based Filter) algorithm and the N-Gram technique to process the features and reduce their dimensionality. Then, the CatBoost classifier is utilized for malware classification. Experimental tests demonstrated a malware detection accuracy of 97.40% and a malware family classification accuracy of 97.38%.</p>
<p>In [<xref ref-type="bibr" rid="ref-7">7</xref>], the authors investigated the utility of feature subset selection methods for Android malware detection by comparing and contrasting these methods along several factors. They utilized various learning algorithms to empirically evaluate the feature subset selection methods and compared their predictive accuracy and execution times. The experimental findings demonstrated that feature selection is essential for increasing learning model accuracy and reducing runtime. The outcomes also illustrated that different learning algorithms perform differently with different feature selection techniques, and that no particular feature selection approach consistently outperforms the others.</p>
<p>Onwuzurike et al. [<xref ref-type="bibr" rid="ref-9">9</xref>] proposed a system that aims to detect Android malware from a behavioral perspective. It is based on abstracting the API calls executed by the application and building behavioral models through the use of Markov chains. The authors employed principal component analysis (PCA) to lower the dimensionality of the feature space and hence the computational and memory complexity of their system.</p>
<p>In [<xref ref-type="bibr" rid="ref-17">17</xref>], the authors developed a novel feature selection method called SAILS (selection of relevant attributes to improve locally extracted features through the use of classical feature selectors). This mechanism is constructed on top of conventional feature selection methods, including &#x201C;mutual information&#x201D;, the &#x201C;distinguishing feature selector&#x201D;, and &#x201C;Galavotti-Sebastiani-Simi&#x201D;. It aims to discover prominent system calls from Android applications.</p>
<p>Thiyagarajan et al. suggested in [<xref ref-type="bibr" rid="ref-8">8</xref>] a malware detection method based on the permissions requested by the application. Their idea was to minimize the data size by decreasing the number of permissions through a set of data reduction techniques (chi-square, permission ranking with a negative rate, support-based pruning, association-based pruning, and PCA). The reduced permissions were utilized for classifying the samples as malware or benign with a decision tree algorithm and categorizing the malware samples through the use of the K-means clustering algorithm.</p>
</sec>
<sec id="s2_2">
<label>2.2</label>
<title>Neural Network Based Methods</title>
<p>In other cases, authors have preferred to benefit from the strength of neural networks for feature selection. For instance, Liu et al. [<xref ref-type="bibr" rid="ref-13">13</xref>] proposed an unsupervised feature learning method named &#x201C;Subspace Based Restricted Boltzmann Machines&#x201D; (SRBM) to reduce the data dimensionality in mobile malware detection. Their method includes searching for suitable subspaces over the entire feature set using a clustering method and learning the features in each feature subspace using Restricted Boltzmann Machines. Then, all learned features are concatenated to represent the original features in a lower dimension. The authors illustrated that their method outperforms other feature reduction methods including &#x201C;RBM&#x201D;, &#x201C;Stacked Auto Encoder&#x201D;, &#x201C;Principal Components Analysis&#x201D;, and &#x201C;Agglomeration algorithms&#x201D; in terms of clustering evaluation metrics.</p>
<p>In [<xref ref-type="bibr" rid="ref-14">14</xref>], Wu et al. proposed a feature reduction framework called DroidRL for Android malware detection. They used the Double Deep Q Network (DDQN) and the recurrent neural network (RNN) algorithms to select a valid subset of features over a larger range. They also attempted to determine the semantic relevance of features by using word embedding for the input features. The experiments conducted by the authors showed that their approach reduced the number of features from 1083 to 24 while maintaining high accuracy.</p>
</sec>
<sec id="s2_3">
<label>2.3</label>
<title>Tree Based Methods</title>
<p>In [<xref ref-type="bibr" rid="ref-15">15</xref>], the authors proposed a method named DroidDomTree that mines the dominance tree of API calls included in an Android APK in order to find malicious modules. To select features efficiently, they developed a weighting scheme that assigns a weight to each node in the dominance tree, aiming to find the key modules that help in detecting malicious elements. In the experimental tests, the method achieved detection rates between 98.1% and 99.3% when applied with eight machine learning classifiers. In another paper, Sharma et al. [<xref ref-type="bibr" rid="ref-16">16</xref>] chose a simpler technique, reducing features by calculating the importance of each feature using the feature_importances_ property of the Random Forest classifier and then evaluating the result with various machine learning algorithms.</p>
<p><xref ref-type="table" rid="table-1">Table 1</xref> presents a comparative summary of the different works discussed in this section. These works demonstrate the importance of feature reduction methods in improving the performance of Android malware detection and classification systems. In the current paper, we exploit a technique different from those described in the literature: the Attention mechanism. Despite its potential, this mechanism has not yet been well exploited in the field of Android malware detection. Among the few works done in this context, we can cite the paper by Wu et al. [<xref ref-type="bibr" rid="ref-18">18</xref>], in which the authors proposed a neural network approach to classify Android malware based on two layers: an attention layer and a multilayer perceptron (MLP). The attention layer is intended to learn feature weights, which can be thought of as relevance scores between the features and the classification outcomes. The MLP then maps the weighted features to classify the samples as benign or malicious. In this approach, the Attention-based feature importance is associated with an MLP classifier to classify Android samples. However, we believe that the classification process requires a more robust algorithm than a simple MLP. For this reason, we propose in this paper a new classification approach for Android applications based on the Attention mechanism and the LightGBM algorithm. The Attention technique is utilized to determine the importance of features and reduce their dimensionality, while the LightGBM algorithm (with Bayesian optimization of hyperparameters) performs an efficient classification of Android samples based on the set of selected features. To the best of our knowledge, the system that we propose in the current article is the first solution combining an Attention mechanism and a distributed gradient boosting framework like LightGBM to classify Android malware.</p>
<table-wrap id="table-1">
<label>Table 1</label>
<caption>
<title>Comparative summary of related work</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th>Paper reference</th>
<th>Approach objective</th>
<th>Used methods</th>
<th>Dataset type</th>
</tr>
</thead>
<tbody>
<tr>
<td>Liu et al. [<xref ref-type="bibr" rid="ref-13">13</xref>]</td>
<td>Application of unsupervised feature learning to reduce data dimensionality in mobile malware datasets</td>
<td>Subspace based restricted Boltzmann machines</td>
<td>OmniDroid [<xref ref-type="bibr" rid="ref-19">19</xref>], CIC2019 [<xref ref-type="bibr" rid="ref-20">20</xref>] and CIC2020 [<xref ref-type="bibr" rid="ref-21">21</xref>]</td>
</tr>
<tr>
<td>Alam et al. [<xref ref-type="bibr" rid="ref-15">15</xref>]</td>
<td>Creation of a dominance tree of API calls to improve feature selection and the detection of Android malware</td>
<td>Dominance tree, TF-IDF</td>
<td>Android applications collected from different sources</td>
</tr>
<tr>
<td>Cai et al. [<xref ref-type="bibr" rid="ref-10">10</xref>]</td>
<td>Optimization of feature weight-mapping to detect Android malware</td>
<td>Feature weighting, ML classifiers, differential evolution algorithm</td>
<td>Drebin [<xref ref-type="bibr" rid="ref-22">22</xref>], AMD [<xref ref-type="bibr" rid="ref-23">23</xref>], applications collected from Google Play and APKPure.com</td>
</tr>
<tr>
<td>Wu et al. [<xref ref-type="bibr" rid="ref-14">14</xref>]</td>
<td>Feature reduction for Android malware detection and classification</td>
<td>DDQN, word embedding</td>
<td>AndroZoo [<xref ref-type="bibr" rid="ref-24">24</xref>] and Drebin [<xref ref-type="bibr" rid="ref-22">22</xref>]</td>
</tr>
<tr>
<td>Xie et al. [<xref ref-type="bibr" rid="ref-11">11</xref>]</td>
<td>Feature reduction for Android malware detection and classification</td>
<td>InfoGain, chi-square test, stacking and genetic algorithm</td>
<td>CIC-AndMal2017 [<xref ref-type="bibr" rid="ref-25">25</xref>] and<break/>CICMalDroid2020 [<xref ref-type="bibr" rid="ref-21">21</xref>]</td>
</tr>
<tr>
<td>Bai et al. [<xref ref-type="bibr" rid="ref-12">12</xref>]</td>
<td>Feature reduction for Android<break/>malware detection and classification</td>
<td>Fast correlation-based filter, catboost classifier</td>
<td>Drebin [<xref ref-type="bibr" rid="ref-22">22</xref>] and private dataset</td>
</tr>
<tr>
<td>Abawajy et al. [<xref ref-type="bibr" rid="ref-7">7</xref>]</td>
<td>Examine the effectiveness of the feature subset selection techniques for detecting Android malware</td>
<td>Pearson correlation coefficient, chi-square, analysis of variance (ANOVA), information gain, mutual information</td>
<td>Android applications collected from different sources</td>
</tr>
<tr>
<td>Onwuzurike et al. [<xref ref-type="bibr" rid="ref-9">9</xref>]</td>
<td>Detect Android malware by modeling application behavior</td>
<td>Markov chains, PCA</td>
<td>Android applications collected from different sources</td>
</tr>
<tr>
<td>Ananya et al. [<xref ref-type="bibr" rid="ref-17">17</xref>]</td>
<td>Feature selection for Android malware classification</td>
<td>SAILS, XGBoost, CART, logistic regression, random forest and deep neural networks</td>
<td>Drebin [<xref ref-type="bibr" rid="ref-22">22</xref>]</td>
</tr>
<tr>
<td>Thiyagarajan et al. [<xref ref-type="bibr" rid="ref-8">8</xref>]</td>
<td>Reduce the number of application permissions for real time malware detection and clustering</td>
<td>PCA, decision tree, K-means</td>
<td>AndroZoo [<xref ref-type="bibr" rid="ref-24">24</xref>]</td>
</tr>
<tr>
<td>Sharma et al. [<xref ref-type="bibr" rid="ref-16">16</xref>]</td>
<td>Android malware detection and family classification</td>
<td>Random forest, deep learning</td>
<td>AndroZoo [<xref ref-type="bibr" rid="ref-24">24</xref>]</td>
</tr>
<tr>
<td>Wu et al. [<xref ref-type="bibr" rid="ref-18">18</xref>]</td>
<td>Classify Android malware and interpret their malicious behaviors</td>
<td>Attention mechanism, multilayer perceptron</td>
<td>Drebin [<xref ref-type="bibr" rid="ref-22">22</xref>]</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
</sec>
<sec id="s3">
<label>3</label>
<title>Proposed Approach</title>
<p>In this section, we describe the approach proposed in this paper. The goal of this approach is to classify Android applications and detect those that are malware. It includes two major steps: (i) selecting the most important features based on an Attention mechanism and (ii) classifying Android applications as malware or normal through the use of an optimized LightGBM algorithm. The overall architecture of the proposed system is illustrated in <xref ref-type="fig" rid="fig-1">Fig. 1</xref>. The system comprises four main components: feature extraction, Attention-based feature importance, feature selection, and LightGBM classification. The process starts with feature extraction from Android applications. Then, an Attention-based technique is applied to these features in order to identify the most important ones. This step helps reduce the number of features and select only those that enhance the performance of the classifier. Finally, an optimized LightGBM algorithm is applied to determine whether the application is malware or normal.</p>
<fig id="fig-1">
<label>Figure 1</label>
<caption>
<title>Approach architecture</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_53163-fig-1.tif"/>
</fig>
<sec id="s3_1">
<label>3.1</label>
<title>Feature Extraction</title>
<p>The feature extraction process is based on the technique utilized in the &#x201C;CCCS-CIC-AndMal-2020&#x201D; dataset [<xref ref-type="bibr" rid="ref-21">21</xref>], which we used in the experimental evaluation of the approach. This process consists of statically analyzing the Android application by reverse engineering its APK file. The extracted features contain a large set of information, including:
<list list-type="bullet">
<list-item>
<p>Activities: the user interfaces of the Android app.</p></list-item>
<list-item>
<p>Broadcast receivers and providers.</p></list-item>
<list-item>
<p>Metadata: a method for storing information that can be accessed by application elements.</p></list-item>
<list-item>
<p>Permissions indicating the restriction of access to data on the device.</p></list-item>
<list-item>
<p>System features.</p></list-item>
</list></p>
<p>A feature vector is then created from the numerical values of the collected features. The dimension of the vector equals 9504.</p>
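<p>As an illustration, the mapping from extracted attributes to a numerical feature vector can be sketched as follows. This is a minimal, hypothetical example: the vocabulary entries and attribute names below are illustrative placeholders, not the actual 9504-entry feature list of the dataset.</p>

```python
# Hypothetical sketch: encoding statically extracted APK attributes
# (permissions, activities, system features, ...) as a binary vector.
# In the real dataset the vocabulary has 9504 entries.

def build_feature_vector(app_attributes, vocabulary):
    """Return a 0/1 vector: 1 if the attribute appears in the app."""
    present = set(app_attributes)
    return [1 if feat in present else 0 for feat in vocabulary]

# Illustrative vocabulary (placeholder names).
vocabulary = [
    "android.permission.INTERNET",
    "android.permission.SEND_SMS",
    "android.permission.READ_CONTACTS",
    "com.example.MainActivity",      # an activity name
    "android.hardware.camera",       # a system feature
]

# Attributes extracted from one (hypothetical) APK.
sample = ["android.permission.INTERNET", "android.hardware.camera"]
vector = build_feature_vector(sample, vocabulary)
print(vector)  # -> [1, 0, 0, 0, 1]
```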
</sec>
<sec id="s3_2">
<label>3.2</label>
<title>Feature Selection</title>
<p>Each instance of the dataset includes 9504 features. This number is very large and can impact the performance of the classifier in terms of speed and accuracy. In our approach, we use a selection method to reduce the number of features and keep only those that help enhance the performance of the classifier. We start by eliminating features with null or empty values, which reduces the number of features to 5911. Then, we apply a feature selection algorithm based on the Attention mechanism. The idea is to compute the weight of each feature using a neural network containing an Attention layer, which pays attention to the features that are more crucial than others for the classification process. The features with the highest scores are selected to take part in the classification task. The following subsection describes this step in detail.</p>
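<p>The two-stage selection just described (removing null/empty features, then keeping the features ranked highest by importance scores) can be sketched in NumPy as follows. This is a simplified illustration on toy data; the <code>scores</code> array stands in for the weights produced by the Attention network.</p>

```python
import numpy as np

def drop_empty_features(X):
    """Remove columns that are zero for every sample (null/empty features)."""
    keep = np.any(X != 0, axis=0)
    return X[:, keep], np.flatnonzero(keep)

def select_top_k(X, scores, k):
    """Keep the k columns with the highest importance scores."""
    top = np.sort(np.argsort(scores)[::-1][:k])
    return X[:, top], top

# Toy data: 4 samples, 6 features; columns 1 and 4 are entirely zero.
X = np.array([[1, 0, 2, 0, 0, 3],
              [0, 0, 1, 1, 0, 0],
              [2, 0, 0, 0, 0, 1],
              [1, 0, 3, 2, 0, 0]])

X_clean, kept = drop_empty_features(X)   # 6 -> 4 features
scores = np.array([0.9, 0.1, 0.7, 0.3])  # stand-in attention weights
X_sel, idx = select_top_k(X_clean, scores, k=2)
print(X_clean.shape, X_sel.shape)  # (4, 4) (4, 2)
```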
</sec>
<sec id="s3_3">
<label>3.3</label>
<title>Attention-Based Feature Importance</title>
<p>The Attention mechanism is a neural network concept that has gained much popularity in recent years; it allows a model to pay more attention to certain parts of the data during processing. It utilizes a weighted vector to help the neural architecture understand the relationship between the input elements and the target and to estimate which parts of the data are more important than others for the task at hand. In our approach, we used the Attention mechanism to select the most important features, those that help enhance the performance of the classifier.</p>
<p>The general architecture of the Attention-based feature importance is presented in <xref ref-type="fig" rid="fig-2">Fig. 2</xref>. It is a neural network architecture whose input is a feature vector containing the initial features of the model. This vector is connected to an Attention layer that computes a weighted value for each feature based on its importance to the classification process. In this way, a weighted feature vector is created, which is then connected to a Dense layer. The purpose of the Dense layer is to compute a &#x201C;<inline-formula id="ieqn-1"><mml:math id="mml-ieqn-1"><mml:mi>y</mml:mi></mml:math></inline-formula>&#x201D; score that predicts whether the instance is classified as malware or normal. After the training of the attention neural network is complete, the weighted feature vector is used to determine which features have higher weighted values than the others; features with higher weighted values are more crucial for the classification task.</p>
<fig id="fig-2">
<label>Figure 2</label>
<caption>
<title>Attention-based feature importance</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_53163-fig-2.tif"/>
</fig>
<p>In order to explain the feature extraction process, consider <inline-formula id="ieqn-2"><mml:math id="mml-ieqn-2"><mml:mi>X</mml:mi></mml:math></inline-formula> as the set of feature vectors extracted from the Android samples and <inline-formula id="ieqn-3"><mml:math id="mml-ieqn-3"><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> as the <inline-formula id="ieqn-4"><mml:math id="mml-ieqn-4"><mml:mi>i</mml:mi></mml:math></inline-formula>-th sample of the set <inline-formula id="ieqn-5"><mml:math id="mml-ieqn-5"><mml:mi>X</mml:mi></mml:math></inline-formula> denoted as <inline-formula id="ieqn-6"><mml:math id="mml-ieqn-6"><mml:mrow><mml:mo>(</mml:mo><mml:msubsup><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mn>1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:msubsup><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mn>2</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>,</mml:mo><mml:msubsup><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>j</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>,</mml:mo><mml:msubsup><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>N</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, where <inline-formula id="ieqn-7"><mml:math 
id="mml-ieqn-7"><mml:msubsup><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>j</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mrow><mml:mo>(</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x2264;</mml:mo><mml:mi>j</mml:mi><mml:mo>&#x2264;</mml:mo><mml:mi>N</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> refers to the <inline-formula id="ieqn-8"><mml:math id="mml-ieqn-8"><mml:mi>j</mml:mi></mml:math></inline-formula>-th feature of the <inline-formula id="ieqn-9"><mml:math id="mml-ieqn-9"><mml:mi>i</mml:mi></mml:math></inline-formula>-th sample. Each feature vector <inline-formula id="ieqn-10"><mml:math id="mml-ieqn-10"><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> is assigned a label value <inline-formula id="ieqn-11"><mml:math id="mml-ieqn-11"><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x2208;</mml:mo><mml:mo fence="false" stretchy="false">{</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo fence="false" stretchy="false">}</mml:mo></mml:math></inline-formula>, where 0 means that the sample is classified as normal and 1 means that it is classified as malware.</p>
<p>The implementation of the Attention layer is inspired by the work of Wu et al. [<xref ref-type="bibr" rid="ref-18">18</xref>]. It uses an adapted fully connected network followed by a SoftMax function to calculate the weight of each feature. First, the following equation computes an alignment score that evaluates how closely the output at each position matches the input features:</p>
<p><disp-formula id="eqn-1"><label>(1)</label><mml:math id="mml-eqn-1" display="block"><mml:msubsup><mml:mi>e</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>j</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:msubsup><mml:mo movablelimits="false">&#x2211;</mml:mo><mml:mrow><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:msubsup><mml:msubsup><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>k</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:msub><mml:mi>w</mml:mi><mml:mrow><mml:mi>k</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:math></disp-formula>where <inline-formula id="ieqn-12"><mml:math id="mml-ieqn-12"><mml:msub><mml:mi>w</mml:mi><mml:mrow><mml:mi>k</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> represents a weighting parameter learned during the training of the fully connected network in the attention layer and <inline-formula id="ieqn-13"><mml:math id="mml-ieqn-13"><mml:msubsup><mml:mi>e</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>j</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup></mml:math></inline-formula> denotes the output of the fully connected network at the <inline-formula id="ieqn-14"><mml:math id="mml-ieqn-14"><mml:mi>j</mml:mi></mml:math></inline-formula>-th position, which can be viewed as an association of a set of features with varying relevance to the input feature at position <inline-formula id="ieqn-15"><mml:math id="mml-ieqn-15"><mml:mi>j</mml:mi></mml:math></inline-formula>. 
Thus, training the neural model allows the parameter <inline-formula id="ieqn-16"><mml:math id="mml-ieqn-16"><mml:msub><mml:mi>w</mml:mi><mml:mrow><mml:mi>k</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> to be assigned a relevant value to indicate the relationship between the <inline-formula id="ieqn-17"><mml:math id="mml-ieqn-17"><mml:mi>j</mml:mi></mml:math></inline-formula>-th input feature and other input features.</p>
<p>Next, in order to determine the weights of the input features at various places, we apply a SoftMax function to the output of the fully connected network. The output of the attention layer is a vector denoted by <inline-formula id="ieqn-18"><mml:math id="mml-ieqn-18"><mml:msub><mml:mrow><mml:mi mathvariant="normal">&#x03B1;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:msubsup><mml:mrow><mml:mi mathvariant="normal">&#x03B1;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mn>1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:msubsup><mml:mrow><mml:mi mathvariant="normal">&#x03B1;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mn>2</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:msubsup><mml:mrow><mml:mi mathvariant="normal">&#x03B1;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mn>3</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>,</mml:mo><mml:msubsup><mml:mrow><mml:mi mathvariant="normal">&#x03B1;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>N</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> calculated as follows:</p>
<p><disp-formula id="eqn-2"><label>(2)</label><mml:math id="mml-eqn-2" display="block"><mml:msubsup><mml:mi>&#x03B1;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>j</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>exp</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:msubsup><mml:mi>e</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>j</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:munderover><mml:mi>exp</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:msubsup><mml:mi>e</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>k</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:math></disp-formula>where <inline-formula id="ieqn-19"><mml:math id="mml-ieqn-19"><mml:msubsup><mml:mrow><mml:mi mathvariant="normal">&#x03B1;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>j</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup></mml:math></inline-formula> denotes the weight of the <inline-formula id="ieqn-20"><mml:math id="mml-ieqn-20"><mml:mi>j</mml:mi></mml:math></inline-formula>-th feature in the <inline-formula id="ieqn-21"><mml:math id="mml-ieqn-21"><mml:mi>i</mml:mi></mml:math></inline-formula>-th sample and reflects its importance based on the classification results.</p>
<p>Then, in order to generate the weighted feature vector, we weight the input feature vector by the Attention vector <inline-formula id="ieqn-22"><mml:math id="mml-ieqn-22"><mml:msub><mml:mrow><mml:mi mathvariant="normal">&#x03B1;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> as follows:</p>
<p><disp-formula id="eqn-3"><label>(3)</label><mml:math id="mml-eqn-3" display="block"><mml:msub><mml:mi>c</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mi>&#x03B1;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:msubsup><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msubsup></mml:math></disp-formula>where <inline-formula id="ieqn-23"><mml:math id="mml-ieqn-23"><mml:msub><mml:mi>c</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> denotes the weighted feature vector of the <inline-formula id="ieqn-24"><mml:math id="mml-ieqn-24"><mml:mi>i</mml:mi></mml:math></inline-formula>-th sample.</p>
<p>Finally, we calculate the classification result <inline-formula id="ieqn-25"><mml:math id="mml-ieqn-25"><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> by mapping the input vector <inline-formula id="ieqn-26"><mml:math id="mml-ieqn-26"><mml:msub><mml:mi>c</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> into a binary prediction value.</p>
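Under these definitions, the forward pass of Eqs. (1)&#x2013;(3) can be sketched in a few lines of NumPy (a toy illustration: the matrix standing for the parameters <code>w_kj</code> is random here, whereas in the model it is learned during training, and the final Dense mapping is only indicated in a comment):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 5                         # number of features (toy size)
x_i = rng.random(N)           # feature vector of the i-th sample
W = rng.normal(size=(N, N))   # stands for the learned parameters w_kj

# Eq. (1): e_i^(j) = sum_k x_i^(k) * w_kj  (fully connected layer)
e_i = x_i @ W

# Eq. (2): SoftMax over the scores yields the attention weights alpha_i
alpha_i = np.exp(e_i) / np.exp(e_i).sum()

# Eq. (3): weighting the input features by alpha_i gives the
# weighted feature vector c_i
c_i = alpha_i * x_i

# A Dense layer would then map c_i to the binary prediction y_i,
# e.g. y_i = sigmoid(c_i @ w_out + b) thresholded at 0.5.
```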
<p>Once the training process is complete, the Attention-based model has assigned each feature a weight reflecting its contribution to the classification results: features with higher weights are more important to the classification outcome, while features with lower weights are less important. The weighted feature vector is used to select the top <inline-formula id="ieqn-27"><mml:math id="mml-ieqn-27"><mml:mi>n</mml:mi></mml:math></inline-formula> features. These features are considered more crucial than the others, and our main classifier (LightGBM) is applied to them. Since the choice of the parameter <inline-formula id="ieqn-28"><mml:math id="mml-ieqn-28"><mml:mi>n</mml:mi></mml:math></inline-formula> affects the classification results, it is essential to select a suitable value; in our case, we chose the value of <inline-formula id="ieqn-29"><mml:math id="mml-ieqn-29"><mml:mi>n</mml:mi></mml:math></inline-formula> empirically, based on the results of the experiments performed.</p>
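As a hedged sketch of this selection step, the learned attention weights can be averaged over the training samples and ranked (the <code>alphas</code> matrix and feature names below are hypothetical toy values):

```python
import numpy as np

def select_top_n(alphas, feature_names, n):
    """Rank features by their mean attention weight across samples
    and return the names of the n most important ones."""
    mean_w = np.asarray(alphas, dtype=float).mean(axis=0)
    order = np.argsort(mean_w)[::-1]          # descending importance
    return [feature_names[j] for j in order[:n]]

# Toy example: 3 samples, 4 features; the second feature dominates.
alphas = [[0.1, 0.4, 0.3, 0.2],
          [0.2, 0.5, 0.2, 0.1],
          [0.1, 0.6, 0.2, 0.1]]
names = ["perm_sms", "api_crypto", "recv_boot", "meta_theme"]
top2 = select_top_n(alphas, names, 2)
# top2 is ['api_crypto', 'recv_boot']
```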
</sec>
<sec id="s3_4">
<label>3.4</label>
<title>LightGBM Classification</title>
<p>LightGBM [<xref ref-type="bibr" rid="ref-26">26</xref>] is a distributed gradient boosting framework released by Microsoft in 2017 for machine learning tasks. It is based on decision trees and can be used for several machine learning tasks such as classification, ranking, and regression. It uses leaf-wise tree growth instead of the level-wise tree growth widely adopted by other tree-based learning algorithms. LightGBM has outperformed several machine learning algorithms in multiple applications thanks to its efficiency, speed, and low memory consumption. In our model, LightGBM classifies the Android samples as normal or malware using the most important features selected in the previous step.</p>
<p>The implementation of LightGBM requires setting a number of hyper-parameters, such as the number of leaves per tree, the maximum tree depth, and the learning rate. These parameters significantly affect how well the LightGBM algorithm performs, so choosing them carefully is crucial to achieving good results [<xref ref-type="bibr" rid="ref-27">27</xref>]. In our approach, a Bayesian optimization technique was utilized to identify the best parameters for this model.</p>
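As an illustration, the hyper-parameter values eventually retained in our experiments (see Table 3) can be passed to the <code>lightgbm</code> Python package as a plain dictionary; the parameter names follow the LightGBM documentation, the binary objective is an assumption for this sketch, and the training call itself is only indicated in comments since the data-loading step is assumed:

```python
# Hyper-parameter names follow the LightGBM parameter documentation;
# the values are those selected by the Bayesian optimization (Table 3).
params = {
    "boosting_type": "gbdt",
    "objective": "binary",        # assumption: binary malware/normal task
    "learning_rate": 0.1,
    "num_iterations": 588,
    "num_leaves": 453,
    "bagging_fraction": 0.8,
    "feature_fraction": 0.5,
    "min_data_in_leaf": 50,
    "max_depth": 15,
}

# Training and prediction would then look like (X_train etc. assumed):
#   import lightgbm as lgb
#   booster = lgb.train(params, lgb.Dataset(X_train, label=y_train))
#   y_pred = (booster.predict(X_test) > 0.5).astype(int)
```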
<p>Bayesian optimization is a useful technique to optimize black-box functions that are expensive to evaluate [<xref ref-type="bibr" rid="ref-28">28</xref>,<xref ref-type="bibr" rid="ref-29">29</xref>]. The optimization problem can be formulated as follows:</p>
<p><disp-formula id="eqn-4"><label>(4)</label><mml:math id="mml-eqn-4" display="block"><mml:mspace width="thinmathspace" /><mml:msup><mml:mrow><mml:mtext>x</mml:mtext></mml:mrow><mml:mrow><mml:mrow><mml:mo>&#x2217;</mml:mo></mml:mrow></mml:mrow></mml:msup><mml:mspace width="thinmathspace" /><mml:mo>=</mml:mo><mml:mspace width="thinmathspace" /><mml:msub><mml:mrow><mml:mtext>argmax</mml:mtext></mml:mrow><mml:mrow><mml:mrow><mml:mtext>x</mml:mtext></mml:mrow><mml:mspace width="thinmathspace" /><mml:mo>&#x2208;</mml:mo><mml:mspace width="thinmathspace" /><mml:mi>&#x03C7;</mml:mi></mml:mrow></mml:msub><mml:mspace width="thinmathspace" /><mml:mrow><mml:mtext>f</mml:mtext></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mtext>x</mml:mtext></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula>where <inline-formula id="ieqn-30"><mml:math id="mml-ieqn-30"><mml:msup><mml:mi>x</mml:mi><mml:mrow><mml:mo>&#x2217;</mml:mo></mml:mrow></mml:msup></mml:math></inline-formula> represents the LightGBM model&#x2019;s hyper-parameters that need to be optimized. The symbol <inline-formula id="ieqn-31"><mml:math id="mml-ieqn-31"><mml:mrow><mml:mi mathvariant="normal">&#x03C7;</mml:mi></mml:mrow></mml:math></inline-formula> denotes the search space for the hyper-parameters. The objective function is denoted by <inline-formula id="ieqn-32"><mml:math id="mml-ieqn-32"><mml:mi>f</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> and indicates how well the LightGBM model performs given the selected hyper-parameters. Accuracy is the measurement criterion we chose to evaluate how well the objective function performed. 
Therefore, the goal of the optimization is to determine the collection of hyper-parameters <inline-formula id="ieqn-33"><mml:math id="mml-ieqn-33"><mml:msup><mml:mi>x</mml:mi><mml:mrow><mml:mo>&#x2217;</mml:mo></mml:mrow></mml:msup></mml:math></inline-formula> with which the function <inline-formula id="ieqn-34"><mml:math id="mml-ieqn-34"><mml:mi>f</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> performs best. The optimization procedure usually involves numerous iterations. The objective function yields an observed result <inline-formula id="ieqn-35"><mml:math id="mml-ieqn-35"><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mi>f</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> that will be added to the historical set <inline-formula id="ieqn-36"><mml:math id="mml-ieqn-36"><mml:mi>D</mml:mi><mml:mo>=</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>,</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> and utilized for updating the surrogate probability model in order to generate the next proposal. The optimization procedure for the LightGBM model is presented in Algorithm 1.</p>
<fig id="fig-4">
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_53163-fig-4.tif"/>
</fig>
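The loop structure of Algorithm 1 can be sketched as follows. This is a simplified stand-in: a real Bayesian optimizer would fit a surrogate probability model on the history D and maximize an acquisition function to propose the next configuration, whereas this sketch proposes configurations at random; the search space and objective below are toy examples:

```python
import random

def sequential_optimize(objective, search_space, n_iter=30, seed=1):
    """Skeleton of the sequential optimization loop: propose a
    configuration, evaluate the objective, record (x_i, y_i) in the
    history D, and keep the best configuration seen so far."""
    rng = random.Random(seed)
    history = []                       # D = {(x_1, y_1), ..., (x_i, y_i)}
    best_x, best_y = None, float("-inf")
    for _ in range(n_iter):
        # Placeholder for the surrogate model + acquisition function:
        x = {name: rng.choice(vals) for name, vals in search_space.items()}
        y = objective(x)               # e.g. validation accuracy
        history.append((x, y))
        if y > best_y:
            best_x, best_y = x, y
    return best_x, best_y, history

# Toy objective favouring one specific configuration.
space = {"num_leaves": [31, 127, 453], "learning_rate": [0.01, 0.1, 0.3]}
obj = lambda x: (1.0 - abs(x["learning_rate"] - 0.1)
                     - abs(x["num_leaves"] - 453) / 1000.0)
best, score, hist = sequential_optimize(obj, space)
```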
<p>After the optimization process is complete and the best hyper-parameters have been selected, the LightGBM model is trained and classifies each Android sample as normal or malware.</p>
</sec>
</sec>
<sec id="s4">
<label>4</label>
<title>Experimental Evaluation</title>
<p>The objective of this section is to evaluate the performance of our approach through several experimental tests. The evaluation compares the classification results of LightGBM and other machine learning models before and after applying the Attention-based feature reduction.</p>
<sec id="s4_1">
<label>4.1</label>
<title>Dataset Description</title>
<p>In order to test and assess our approach, we utilized the CCCS-CIC-AndMal-2020 dataset [<xref ref-type="bibr" rid="ref-21">21</xref>], a recent Android malware dataset created by the Canadian Institute for Cybersecurity (CIC). The dataset comprises 400 K Android applications (200 K benign and 200 K malware). The malware samples are divided into the following 14 categories: Adware, Backdoor, FileInfector, No_Category, Potentially Unwanted Apps (PUA), Ransomware, Riskware, Scareware, Trojan, Banker Trojan, Dropper Trojan, SMS Trojan, Spy Trojan, and Zero-Day. <xref ref-type="table" rid="table-2">Table 2</xref> presents the number of families and samples for each of the 14 malware categories. The Adware category, for example, contains 47,210 samples spread over 48 malware families, including Dowgin, Adflex, Airpush, and Baiduprotect. In the experimental tests, 12 malware categories were used; No_Category and Zero-Day were excluded because their data was incomplete. The features of the dataset cover a lot of information, including:</p>
<p><list list-type="bullet">
<list-item>
<p>Activities: the user interfaces of the Android app.</p></list-item>
<list-item>
<p>Broadcast receivers and providers.</p></list-item>
<list-item>
<p>Metadata: a method for storing information that can be accessed by application elements.</p></list-item>
<list-item>
<p>Permissions indicating the restriction of access to data on the device.</p></list-item>
<list-item>
<p>System features.</p></list-item>
</list></p>

<table-wrap id="table-2">
<label>Table 2</label>
<caption>
<title>CCCS-CIC-AndMal-2020 dataset details</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th>Class</th>
<th>Number of families</th>
<th>Amount of samples</th>
</tr>
</thead>
<tbody>
<tr>
<td>Adware</td>
<td>48</td>
<td>47,210</td>
</tr>
<tr>
<td>No category</td>
<td>&#x2013;</td>
<td>2296</td>
</tr>
<tr>
<td>PUA</td>
<td>8</td>
<td>2051</td>
</tr>
<tr>
<td>Backdoor</td>
<td>11</td>
<td>1538</td>
</tr>
<tr>
<td>Ransomware</td>
<td>8</td>
<td>6202</td>
</tr>
<tr>
<td>File infector</td>
<td>5</td>
<td>669</td>
</tr>
<tr>
<td>Riskware</td>
<td>21</td>
<td>97,349</td>
</tr>
<tr>
<td>Scareware</td>
<td>3</td>
<td>1556</td>
</tr>
<tr>
<td>Dropper trojan</td>
<td>9</td>
<td>2302</td>
</tr>
<tr>
<td>Banker trojan</td>
<td>11</td>
<td>887</td>
</tr>
<tr>
<td>Spy trojan</td>
<td>11</td>
<td>3540</td>
</tr>
<tr>
<td>SMS trojan</td>
<td>11</td>
<td>3125</td>
</tr>
<tr>
<td>Trojan</td>
<td>45</td>
<td>13,559</td>
</tr>
<tr>
<td>Zero day</td>
<td>&#x2013;</td>
<td>13,340</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="s4_2">
<label>4.2</label>
<title>Parameters of the LightGBM Classifier</title>
<p>The hyper-parameters of the LightGBM classifier greatly affect the quality of the classification results. Integrating Bayesian optimization into our approach helped us select the right parameters carefully. To accomplish this, we defined a set of candidate values for each parameter of the LightGBM classifier (learning rate, number of iterations, number of leaves, bagging fraction, feature fraction, min data in leaf, and max depth) and provided them as input to Algorithm 1 described above. The optimization algorithm tries different configurations of these parameters over several iterations, with the aim of finding the configuration that yields the best classification accuracy. <xref ref-type="table" rid="table-3">Table 3</xref> lists the selected hyper-parameters for this classification model.</p>
<table-wrap id="table-3">
<label>Table 3</label>
<caption>
<title>Hyperparameters of the LightGBM classifier</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th>Name of the hyperparameter</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<td>Boosting method</td>
<td>gbdt</td>
</tr>
<tr>
<td>Learning rate</td>
<td>0.1</td>
</tr>
<tr>
<td>Number of iterations</td>
<td>588</td>
</tr>
<tr>
<td>Number of leaves</td>
<td>453</td>
</tr>
<tr>
<td>Bagging fraction</td>
<td>0.8</td>
</tr>
<tr>
<td>Feature fraction</td>
<td>0.5</td>
</tr>
<tr>
<td>Min data in leaf</td>
<td>50</td>
</tr>
<tr>
<td>Max depth</td>
<td>15</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="s4_3">
<label>4.3</label>
<title>Experimental Results of Binary Classification</title>
<p>The classification model in this approach acts as a binary classifier whose goal is to assign Android samples to two classes: normal and malware. For this purpose, we grouped all the samples in the dataset belonging to the different malware families into a single class labeled &#x201C;Malware&#x201D;, while the class labeled &#x201C;Normal&#x201D; contains the benign samples. To train and assess the model, we split the dataset into 80% for training and 20% for testing. The experiments conducted in this work cover 7 classification models (LightGBM, Random Forest, AdaBoost, Naive Bayes, Decision Tree, XGBoost, and K-Nearest-Neighbor). For each model we performed 5 tests: applying the model (i) to all features without reduction (5911 features), and to the (ii) 200, (iii) 300, (iv) 500, and (v) 1000 best features of the dataset selected by our Attention-based feature importance method. The aim is to evaluate the effectiveness of the Attention-based feature importance method in combination with the LightGBM model (as well as the other machine learning algorithms) and to determine whether it improves the performance of the classification models. To evaluate these experiments, we computed 5 measures: accuracy, precision, recall, F1-score, and false alarm rate (FAR), as well as the training time.</p>
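For reference, the five measures can be computed from the confusion-matrix counts as in the following sketch, where malware is the positive class and FAR is the proportion of benign samples flagged as malware:

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, F1-score and false alarm rate
    (FAR = FP / (FP + TN)), with malware as the positive class (1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "far": fp / (fp + tn),
    }

# Toy example with six test samples.
m = binary_metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 1, 0])
```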
<p><xref ref-type="table" rid="table-4">Table 4</xref> illustrates the results of the experiments described above. Concerning LightGBM, the model applied to the top 300 features (in terms of importance) obtained the best results in the 5 evaluation measures with an accuracy of 0.987110, a precision of 0.988937, a recall of 0.982342, an F1-score of 0.985628, and a FAR of 0.008990. These scores slightly exceeded the results obtained from the LightGBM model applied to the totality of features without reduction (5911 features). The latter exhibited an accuracy of 0.986462, a precision of 0.988772, a recall of 0.981053, an F1-score of 0.984897, and a FAR of 0.009114. Furthermore, the LightGBM model with 300 features took less time to complete the training with 28.099 s in comparison with 80.043 s for the LightGBM model without the feature reduction. <xref ref-type="fig" rid="fig-3">Fig. 3</xref> presents the ROC curve and the confusion matrix of the LightGBM model with the top 300 features.</p>
<table-wrap id="table-4">
<label>Table 4</label>
<caption>
<title>Results obtained from the classification models</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th>Classification model</th>
<th>Number of features</th>
<th>Accuracy</th>
<th>Precision</th>
<th>Recall</th>
<th>F1-score</th>
<th>FAR</th>
<th>Training time (s)</th>
</tr>
</thead>
<tbody>
<tr>
<td rowspan="5">LightGBM</td>
<td>Without feature reduction</td>
<td>0.986462</td>
<td>0.988772</td>
<td>0.981053</td>
<td>0.984897</td>
<td>0.009114</td>
<td>80.043</td>
</tr>
<tr>
<td>200 features</td>
<td>0.986598</td>
<td>0.988402</td>
<td>0.981736</td>
<td>0.985058</td>
<td>0.009424</td>
<td>25.056</td>
</tr>
<tr>
<td>300 features</td>
<td>0.987110</td>
<td>0.988937</td>
<td>0.982342</td>
<td>0.985628</td>
<td>0.008990</td>
<td>28.099</td>
</tr>
<tr>
<td>500 features</td>
<td>0.986564</td>
<td>0.988700</td>
<td>0.981357</td>
<td>0.985014</td>
<td>0.009176</td>
<td>34.318</td>
</tr>
<tr>
<td>1000 features</td>
<td>0.986427</td>
<td>0.988696</td>
<td>0.981053</td>
<td>0.984860</td>
<td>0.009176</td>
<td>35.581</td>
</tr>
<tr>
<td rowspan="5">Random forest</td>
<td>Without feature reduction</td>
<td>0.985643</td>
<td>0.990553</td>
<td>0.977416</td>
<td>0.983940</td>
<td>0.007626</td>
<td>410.918</td>
</tr>
<tr>
<td>200 features</td>
<td>0.986086</td>
<td>0.989286</td>
<td>0.979689</td>
<td>0.984464</td>
<td>0.008680</td>
<td>26.052</td>
</tr>
<tr>
<td>300 features</td>
<td>0.985814</td>
<td>0.990030</td>
<td>0.978325</td>
<td>0.984143</td>
<td>0.008060</td>
<td>32.892</td>
</tr>
<tr>
<td>500 features</td>
<td>0.985268</td>
<td>0.988144</td>
<td>0.979007</td>
<td>0.983554</td>
<td>0.009610</td>
<td>39.334</td>
</tr>
<tr>
<td>1000 features</td>
<td>0.985848</td>
<td>0.990256</td>
<td>0.978174</td>
<td>0.984178</td>
<td>0.007874</td>
<td>55.863</td>
</tr>
<tr>
<td rowspan="5">AdaBoost</td>
<td>Without feature reduction</td>
<td>0.957407</td>
<td>0.964897</td>
<td>0.939523</td>
<td>0.952041</td>
<td>0.027962</td>
<td>1197.214</td>
</tr>
<tr>
<td>200 features</td>
<td>0.958839</td>
<td>0.963286</td>
<td>0.944524</td>
<td>0.953813</td>
<td>0.029450</td>
<td>20.893</td>
</tr>
<tr>
<td>300 features</td>
<td>0.959146</td>
<td>0.962455</td>
<td>0.946116</td>
<td>0.954215</td>
<td>0.030194</td>
<td>26.718</td>
</tr>
<tr>
<td>500 features</td>
<td>0.957509</td>
<td>0.957509</td>
<td>0.940205</td>
<td>0.952184</td>
<td>0.028334</td>
<td>38.433</td>
</tr>
<tr>
<td>1000 features</td>
<td>0.957407</td>
<td>0.964897</td>
<td>0.939523</td>
<td>0.952041</td>
<td>0.027962</td>
<td>68.769</td>
</tr>
<tr>
<td rowspan="5">Naive bayes</td>
<td>Without feature reduction</td>
<td>0.824137</td>
<td>0.775236</td>
<td>0.857901</td>
<td>0.814476</td>
<td>0.203484</td>
<td>13.823</td>
</tr>
<tr>
<td>200 features</td>
<td>0.815680</td>
<td>0.764390</td>
<td>0.853429</td>
<td>0.806460</td>
<td>0.215202</td>
<td>0.197</td>
</tr>
<tr>
<td>300 features</td>
<td>0.817078</td>
<td>0.766633</td>
<td>0.853202</td>
<td>0.807604</td>
<td>0.212474</td>
<td>0.203</td>
</tr>
<tr>
<td>500 features</td>
<td>0.821102</td>
<td>0.770724</td>
<td>0.857522</td>
<td>0.811809</td>
<td>0.208692</td>
<td>0.239</td>
</tr>
<tr>
<td>1000 features</td>
<td>0.823353</td>
<td>0.774280</td>
<td>0.857370</td>
<td>0.813709</td>
<td>0.204476</td>
<td>0.530</td>
</tr>
<tr>
<td rowspan="5">Decision tree</td>
<td>Without feature reduction</td>
<td>0.975208</td>
<td>0.970562</td>
<td>0.974460</td>
<td>0.972507</td>
<td>0.024180</td>
<td>88.788</td>
</tr>
<tr>
<td>200 features</td>
<td>0.974355</td>
<td>0.969016</td>
<td>0.974157</td>
<td>0.971580</td>
<td>0.025482</td>
<td>3.736</td>
</tr>
<tr>
<td>300 features</td>
<td>0.974253</td>
<td>0.969434</td>
<td>0.973475</td>
<td>0.973475</td>
<td>0.025110</td>
<td>5.289</td>
</tr>
<tr>
<td>500 features</td>
<td>0.975447</td>
<td>0.970293</td>
<td>0.975294</td>
<td>0.975294</td>
<td>0.024428</td>
<td>9.415</td>
</tr>
<tr>
<td>1000 features</td>
<td>0.975344</td>
<td>0.970428</td>
<td>0.974915</td>
<td>0.972666</td>
<td>0.024304</td>
<td>12.959</td>
</tr>
<tr>
<td rowspan="5">XGBoost</td>
<td>Without feature reduction</td>
<td>0.980835</td>
<td>0.983024</td>
<td>0.974233</td>
<td>0.978608</td>
<td>0.013764</td>
<td>1438.581</td>
</tr>
<tr>
<td>200 features</td>
<td>0.981210</td>
<td>0.982669</td>
<td>0.975445</td>
<td>0.979044</td>
<td>0.014074</td>
<td>55.962</td>
</tr>
<tr>
<td>300 features</td>
<td>0.982062</td>
<td>0.983587</td>
<td>0.976430</td>
<td>0.979995</td>
<td>0.013330</td>
<td>81.079</td>
</tr>
<tr>
<td>500 features</td>
<td>0.980596</td>
<td>0.982350</td>
<td>0.974384</td>
<td>0.978351</td>
<td>0.014322</td>
<td>133.677</td>
</tr>
<tr>
<td>1000 features</td>
<td>0.980971</td>
<td>0.982512</td>
<td>0.975066</td>
<td>0.978775</td>
<td>0.014198</td>
<td>253.358</td>
</tr>
<tr>
<td rowspan="5">K-Nearest-Neighbor</td>
<td>Without feature reduction</td>
<td>0.932376</td>
<td>0.945841</td>
<td>0.901326</td>
<td>0.923047</td>
<td>0.042222</td>
<td>1641.555</td>
</tr>
<tr>
<td>200 features</td>
<td>0.932035</td>
<td>0.945372</td>
<td>0.901023</td>
<td>0.922665</td>
<td>0.042594</td>
<td>74.515</td>
</tr>
<tr>
<td>300 features</td>
<td>0.932172</td>
<td>0.945815</td>
<td>0.900872</td>
<td>0.922796</td>
<td>0.042222</td>
<td>94.963</td>
</tr>
<tr>
<td>500 features</td>
<td>0.932206</td>
<td>0.945748</td>
<td>0.901023</td>
<td>0.922844</td>
<td>0.042284</td>
<td>150.967</td>
</tr>
<tr>
<td>1000 features</td>
<td>0.932410</td>
<td>0.945987</td>
<td>0.901250</td>
<td>0.923077</td>
<td>0.042098</td>
<td>285.270</td>
</tr>
</tbody>
</table>
</table-wrap><fig id="fig-3">
<label>Figure 3</label>
<caption>
<title>ROC curve and confusion matrix of the LightGBM model with the top 300 features. (a) Confusion matrix. (b) ROC curve</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_53163-fig-3.tif"/>
</fig>
<p>Concerning the other classification models, applying our Attention-based feature reduction improved the classification performance of all algorithms except Naive Bayes. For Random Forest, for example, using the 200 best features yielded an accuracy of 0.986086, exceeding the accuracy obtained with all the features of the dataset. Similarly, for AdaBoost, the top 200 and 300 features gave better performance in terms of accuracy, recall, F1-score, and training time. With the Decision Tree algorithm, we obtained the best results with the top 500 features. For XGBoost, selecting the top 300 features gave the best results in terms of accuracy, precision, recall, F1-score, FAR, and training time. For the K-Nearest-Neighbor model, accuracy reached its maximum with the top 1000 features. The only case where the feature reduction technique failed to improve classification performance was the Naive Bayes algorithm.</p>
<p>Comparing the results of all the experiments, we conclude that the LightGBM model trained on the top 300 features is the best-performing model for this Android malware dataset. These comparative results confirm that the Attention-based feature importance technique proposed in this paper improves the classification results of the LightGBM model. Its strength is that it focuses on the features that play an important role in the classification decision and ignores those that are less relevant, which both reduces training time and improves classification results.</p>
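The selection step described above can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: it assumes a trained attention layer has already produced per-sample softmax weights over the features (simulated here with biased random logits), averages them into a global importance score, and keeps the top-n columns of the feature matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features, n = 1000, 20, 5

# Stand-in for the per-sample attention weights a trained attention layer
# would emit: softmax over features, so each row sums to 1.  We simulate
# them with random logits biased toward the first five features.
logits = rng.normal(size=(n_samples, n_features))
logits[:, :5] += 2.0
attn = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

# Global importance of each feature = attention averaged over all samples.
importance = attn.mean(axis=0)

# Keep the n highest-scoring features and drop the rest.
top_idx = np.sort(np.argsort(importance)[::-1][:n])

X = rng.normal(size=(n_samples, n_features))   # toy feature matrix
X_reduced = X[:, top_idx]                      # reduced input for the classifier

print(top_idx)          # the five biased features
print(X_reduced.shape)
```

The reduced matrix `X_reduced` is what would then be passed to the LightGBM classifier in place of the full feature set.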
</sec>
<sec id="s4_4">
<label>4.4</label>
<title>Experimental Results of Malware Category Classification</title>
<p>In this part, we evaluate our approach in a multiclass setting. The goal is to classify the malware samples of the dataset into the 12 malware categories listed in the dataset description. This experiment is a comparative evaluation of (i) the LightGBM model without feature reduction, (ii) the LightGBM model with selection of the best features, as in the previous experiment, and (iii) the deep learning model described by the authors of the dataset paper. The benchmark uses four metrics: precision, recall, and F1-score for each malware category, as well as overall model accuracy.</p>
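The four metrics above can be computed per category from the predicted and true labels. The following is an illustrative sketch on a hypothetical 3-class toy example rather than the paper's 12-category setup; `y_true` and `y_pred` are made-up label arrays.

```python
import numpy as np

# Toy ground-truth and predicted category labels for three classes.
y_true = np.array([0, 0, 1, 1, 1, 2, 2, 2, 2, 0])
y_pred = np.array([0, 1, 1, 1, 2, 2, 2, 2, 0, 0])

def per_class_metrics(y_true, y_pred, c):
    """One-vs-rest precision, recall, and F1 for class c."""
    tp = np.sum((y_pred == c) & (y_true == c))
    fp = np.sum((y_pred == c) & (y_true != c))
    fn = np.sum((y_pred != c) & (y_true == c))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

accuracy = np.mean(y_true == y_pred)   # overall accuracy across all classes
for c in range(3):
    p, r, f = per_class_metrics(y_true, y_pred, c)
    print(f"class {c}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
print(f"accuracy={accuracy:.2f}")
```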
<p><xref ref-type="table" rid="table-5">Table 5</xref> shows the classification results of the models tested in this comparison. This time, the LightGBM model applied to the 500 best features provided the best results, with an accuracy of 0.947711, compared with 0.946933 for LightGBM without feature reduction, 0.946349 with the top 200 features, 0.946960 with the top 300 features, 0.947461 with the top 1000 features, and 0.93 for the deep learning model of the dataset&#x2019;s authors.</p>
<table-wrap id="table-5">
<label>Table 5</label>
<caption>
<title>Results of the malware category classification</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th></th>
<th>Metrics</th>
<th>Adware</th>
<th>Backdoor</th>
<th>Banker trojan</th>
<th>Dropper trojan</th>
<th>File infector</th>
<th>PUA</th>
<th>Ransomware</th>
<th>Riskware</th>
<th>SMS trojan</th>
<th>Scareware</th>
<th>Spy trojan</th>
<th>Trojan</th>
<th>Accuracy</th>
<th>Training time</th>
</tr>
</thead>
<tbody>
<tr>
<td rowspan="3">LightGBM without feature reduction</td>
<td>Precision</td>
<td>0.92</td>
<td>0.85</td>
<td>0.87</td>
<td>0.85</td>
<td>0.94</td>
<td>0.86</td>
<td>0.81</td>
<td>0.98</td>
<td>0.94</td>
<td>0.97</td>
<td>0.92</td>
<td>0.94</td>
<td rowspan="3">0.9469</td>
<td rowspan="3">374.56</td>
</tr>
<tr>
<td>Recall</td>
<td>0.96</td>
<td>0.75</td>
<td>0.84</td>
<td>0.70</td>
<td>0.77</td>
<td>0.66</td>
<td>0.92</td>
<td>0.97</td>
<td>0.94</td>
<td>0.75</td>
<td>0.88</td>
<td>0.91</td>
</tr>
<tr>
<td>F1-score</td>
<td>0.94</td>
<td>0.79</td>
<td>0.85</td>
<td>0.77</td>
<td>0.85</td>
<td>0.74</td>
<td>0.86</td>
<td>0.97</td>
<td>0.94</td>
<td>0.85</td>
<td>0.90</td>
<td>0.92</td>
</tr>
<tr>
<td rowspan="3">LightGBM (200 features)</td>
<td>Precision</td>
<td>0.92</td>
<td>0.86</td>
<td>0.87</td>
<td>0.85</td>
<td>0.94</td>
<td>0.87</td>
<td>0.81</td>
<td>0.98</td>
<td>0.94</td>
<td>0.97</td>
<td>0.92</td>
<td>0.94</td>
<td rowspan="3">0.9463</td>
<td rowspan="3">283.4</td>
</tr>
<tr>
<td>Recall</td>
<td>0.96</td>
<td>0.74</td>
<td>0.85</td>
<td>0.70</td>
<td>0.77</td>
<td>0.66</td>
<td>0.92</td>
<td>0.97</td>
<td>0.94</td>
<td>0.75</td>
<td>0.88</td>
<td>0.91</td>
</tr>
<tr>
<td>F1-score</td>
<td>0.94</td>
<td>0.79</td>
<td>0.86</td>
<td>0.77</td>
<td>0.85</td>
<td>0.75</td>
<td>0.86</td>
<td>0.97</td>
<td>0.94</td>
<td>0.85</td>
<td>0.90</td>
<td>0.92</td>
</tr>
<tr>
<td rowspan="3">LightGBM (300 features)</td>
<td>Precision</td>
<td>0.92</td>
<td>0.85</td>
<td>0.87</td>
<td>0.85</td>
<td>0.94</td>
<td>0.86</td>
<td>0.81</td>
<td>0.98</td>
<td>0.94</td>
<td>0.97</td>
<td>0.92</td>
<td>0.94</td>
<td rowspan="3">0.9467</td>
<td rowspan="3">289.59</td>
</tr>
<tr>
<td>Recall</td>
<td>0.96</td>
<td>0.74</td>
<td>0.85</td>
<td>0.70</td>
<td>0.78</td>
<td>0.66</td>
<td>0.92</td>
<td>0.97</td>
<td>0.94</td>
<td>0.75</td>
<td>0.88</td>
<td>0.91</td>
</tr>
<tr>
<td>F1-score</td>
<td>0.94</td>
<td>0.79</td>
<td>0.86</td>
<td>0.77</td>
<td>0.85</td>
<td>0.75</td>
<td>0.86</td>
<td>0.97</td>
<td>0.94</td>
<td>0.84</td>
<td>0.90</td>
<td>0.92</td>
</tr>
<tr>
<td rowspan="3">LightGBM (500 features)</td>
<td>Precision</td>
<td>0.92</td>
<td>0.86</td>
<td>0.86</td>
<td>0.86</td>
<td>0.93</td>
<td>0.88</td>
<td>0.81</td>
<td>0.98</td>
<td>0.94</td>
<td>0.97</td>
<td>0.92</td>
<td>0.94</td>
<td rowspan="3">0.9477</td>
<td rowspan="3">311.01</td>
</tr>
<tr>
<td>Recall</td>
<td>0.96</td>
<td>0.74</td>
<td>0.85</td>
<td>0.71</td>
<td>0.76</td>
<td>0.67</td>
<td>0.92</td>
<td>0.97</td>
<td>0.93</td>
<td>0.75</td>
<td>0.88</td>
<td>0.91</td>
</tr>
<tr>
<td>F1-score</td>
<td>0.94</td>
<td>0.80</td>
<td>0.86</td>
<td>0.77</td>
<td>0.84</td>
<td>0.76</td>
<td>0.86</td>
<td>0.97</td>
<td>0.94</td>
<td>0.85</td>
<td>0.90</td>
<td>0.93</td>
</tr>
<tr>
<td rowspan="3">LightGBM (1000 features)</td>
<td>Precision</td>
<td>0.92</td>
<td>0.85</td>
<td>0.87</td>
<td>0.85</td>
<td>0.94</td>
<td>0.88</td>
<td>0.81</td>
<td>0.98</td>
<td>0.94</td>
<td>0.97</td>
<td>0.92</td>
<td>0.94</td>
<td rowspan="3">0.9475</td>
<td rowspan="3">350.16</td>
</tr>
<tr>
<td>Recall</td>
<td>0.96</td>
<td>0.75</td>
<td>0.85</td>
<td>0.70</td>
<td>0.78</td>
<td>0.66</td>
<td>0.92</td>
<td>0.97</td>
<td>0.94</td>
<td>0.75</td>
<td>0.88</td>
<td>0.91</td>
</tr>
<tr>
<td>F1-score</td>
<td>0.94</td>
<td>0.79</td>
<td>0.86</td>
<td>0.77</td>
<td>0.85</td>
<td>0.75</td>
<td>0.86</td>
<td>0.97</td>
<td>0.94</td>
<td>0.84</td>
<td>0.90</td>
<td>0.92</td>
</tr>
<tr>
<td rowspan="3">DiDroid [<xref ref-type="bibr" rid="ref-18">18</xref>]</td>
<td>Precision</td>
<td>0.935</td>
<td>0.721</td>
<td>0.759</td>
<td>0.85</td>
<td>0.909</td>
<td>0.677</td>
<td>0.798</td>
<td>0.963</td>
<td>0.917</td>
<td>0.836</td>
<td>0.924</td>
<td>0.895</td>
<td rowspan="3">0.93</td>
<td rowspan="3">&#x2013;</td>
</tr>
<tr>
<td>Recall</td>
<td>0.929</td>
<td>0.643</td>
<td>0.759</td>
<td>0.686</td>
<td>0.789</td>
<td>0.682</td>
<td>0.944</td>
<td>0.967</td>
<td>0.886</td>
<td>0.764</td>
<td>0.835</td>
<td>0.896</td>
</tr>
<tr>
<td>F1-score</td>
<td>0.932</td>
<td>0.68</td>
<td>0.759</td>
<td>0.759</td>
<td>0.845</td>
<td>0.679</td>
<td>0.864</td>
<td>0.965</td>
<td>0.901</td>
<td>0.799</td>
<td>0.877</td>
<td>0.896</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>These results demonstrate once again the effectiveness of the Attention-based feature importance technique. This effectiveness also depends on the choice of the value <inline-formula id="ieqn-45"><mml:math id="mml-ieqn-45"><mml:mi>n</mml:mi></mml:math></inline-formula> that determines the number of important features to select. Choosing this number carefully is essential to avoid discarding important features or retaining features that negatively affect the classification results.</p>
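One practical way to choose this value is to sweep several candidates and keep the one with the best held-out score. The sketch below is illustrative only: it uses synthetic data, assumes the features are already ranked by attention importance, and substitutes a deliberately simple nearest-centroid classifier for LightGBM.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: 10 features, of which only the first three carry signal;
# `ranking` stands in for an attention-based importance ordering.
n_train, n_test = 400, 200
X = rng.normal(size=(n_train + n_test, 10))
y = (X[:, 0] + X[:, 1] - X[:, 2] > 0).astype(int)
ranking = np.arange(10)

def holdout_accuracy(n):
    """Accuracy on a held-out split using only the top-n ranked features."""
    cols = ranking[:n]
    Xtr, Xte = X[:n_train, cols], X[n_train:, cols]
    ytr, yte = y[:n_train], y[n_train:]
    # Nearest-centroid classifier: a simple stand-in for LightGBM.
    c0 = Xtr[ytr == 0].mean(axis=0)
    c1 = Xtr[ytr == 1].mean(axis=0)
    pred = (np.linalg.norm(Xte - c1, axis=1)
            < np.linalg.norm(Xte - c0, axis=1)).astype(int)
    return (pred == yte).mean()

candidates = [1, 2, 3, 5, 10]
scores = {n: holdout_accuracy(n) for n in candidates}
best_n = max(scores, key=scores.get)
print(best_n, scores)
```

In the paper's setting, the same sweep would be run with the actual attention ranking and the LightGBM model, at the cost of one training run per candidate value.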
</sec>
<sec id="s4_5">
<label>4.5</label>
<title>Discussion</title>
<p>In recent years, a number of studies have proposed solutions for identifying and classifying malware in Android mobile environments. <xref ref-type="table" rid="table-6">Table 6</xref> compares several of these studies with the approach proposed in this paper. To ensure a fair comparison, we selected only works evaluated on the CCCS-CIC-AndMal-2020 dataset. These works employ a range of techniques, from classical machine learning algorithms such as Random Forest and SVM to deep learning models such as CNN and LSTM. The comparison shows that our approach outperforms the other works in both binary and multiclass classification accuracy.</p>
<table-wrap id="table-6">
<label>Table 6</label>
<caption>
<title>Comparison with studies using the CCCS-CIC-AndMal-2020 dataset</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th>Paper reference</th>
<th>Year</th>
<th>Used methods</th>
<th colspan="2" align="center">Accuracy</th>
</tr>
<tr>
<th/>
<th/>
<th/>
<th>Binary classification</th>
<th>Multiclass classification</th>
</tr>
</thead>
<tbody>
<tr>
<td>Musikawan et al. [<xref ref-type="bibr" rid="ref-30">30</xref>]</td>
<td>2023</td>
<td>DNN</td>
<td>0.9772</td>
<td>&#x2013;</td>
</tr>
<tr>
<td>Batouche et al. [<xref ref-type="bibr" rid="ref-31">31</xref>]</td>
<td>2021</td>
<td>Random forest</td>
<td>&#x2013;</td>
<td>0.89</td>
</tr>
<tr>
<td>Chopra et al. [<xref ref-type="bibr" rid="ref-32">32</xref>]</td>
<td>2023</td>
<td>CNN, transfer learning</td>
<td>0.9719</td>
<td>&#x2013;</td>
</tr>
<tr>
<td>Ullah et al. [<xref ref-type="bibr" rid="ref-33">33</xref>]</td>
<td>2022</td>
<td>SVM</td>
<td>0.9664</td>
<td>&#x2013;</td>
</tr>
<tr>
<td>DiDroid [<xref ref-type="bibr" rid="ref-21">21</xref>]</td>
<td>2020</td>
<td>CNN</td>
<td>&#x2013;</td>
<td>0.93</td>
</tr>
<tr>
<td>Wang et al. [<xref ref-type="bibr" rid="ref-34">34</xref>]</td>
<td>2023</td>
<td>Bidirectional LSTM</td>
<td>&#x2013;</td>
<td>0.92</td>
</tr>
<tr>
<td>Our approach</td>
<td>2024</td>
<td>Attention mechanism, LightGBM</td>
<td><bold>0.9871</bold></td>
<td><bold>0.9477</bold></td>
</tr>
</tbody>
</table>
</table-wrap>
<p>The experimental results show that our proposed approach is highly effective. It minimizes the execution time of the model by effectively reducing the dimensionality of the data while improving classification accuracy. The findings also show that the LightGBM model outperforms the other machine learning algorithms on this type of dataset.</p>
<p>Although the proposed approach has proven effective on the CCCS-CIC-AndMal-2020 dataset, and although we believe it can be adapted to other types of datasets, it should also be tested on other Android malware samples. This would allow a thorough investigation of the Attention mechanism&#x2019;s performance and verify its ability to analyze other types of features efficiently. A further limitation must be addressed in our future work: the CCCS-CIC-AndMal-2020 dataset contains only static analysis of Android applications. Static analysis is very useful for identifying Android malware because of the large number of features that can be extracted from an application&#x2019;s APK file, but it cannot detect complex Android malware whose malicious actions are observable only at runtime. It is therefore important to extend the current approach to support hybrid analysis that combines static application data with dynamic monitoring of runtime behavior.</p>
</sec>
</sec>
<sec id="s5">
<label>5</label>
<title>Conclusion</title>
<p>In this study, we presented a solution for Android malware detection built on two recent ML techniques: the Attention mechanism and the LightGBM classifier. The Attention mechanism was integrated with a neural network to analyze the dataset&#x2019;s features and identify those that matter for the classification results. The LightGBM algorithm then classifies the dataset samples using the set of features selected according to their importance. The advantage of our solution is its ability to reduce the number of features, thereby minimizing execution time while improving the accuracy of the classification algorithm. Experimental results showed that the feature importance technique raised classification accuracy from 98.64% (without feature reduction) to 98.71% (after feature selection).</p>
<p>Additionally, we evaluated the proposed approach on the CCCS-CIC-AndMal-2020 dataset, which focuses on static analysis of Android applications. In the future, we plan to investigate other malware datasets and extend our approach to support hybrid features obtained from static and dynamic analysis of Android applications. We also intend to develop a new method for determining the key features that characterize the behavior of each malware family, to assist security experts in identifying these malicious applications.</p>
</sec>
</body>
<back>
<ack><p>The authors extend their appreciation to the Deanship of Graduate Studies and Scientific Research at Jouf University for funding this work.</p>
</ack>
<sec><title>Funding Statement</title>
<p>This work was funded by the Deanship of Graduate Studies and Scientific Research at Jouf University under Grant No. (DGSSR-2023-02-02178).</p>
</sec>
<sec sec-type="data-availability"><title>Availability of Data and Materials</title>
<p>The data that support the findings of this study are openly available at <ext-link ext-link-type="uri" xlink:href="https://www.unb.ca/cic/datasets/andmal2020.html">https://www.unb.ca/cic/datasets/andmal2020.html</ext-link> (accessed on 15 January 2024).</p>
</sec>
<sec sec-type="COI-statement"><title>Conflicts of Interest</title>
<p>The authors declare that they have no conflicts of interest to report regarding the present study.</p>
</sec>
<ref-list content-type="authoryear">
<title>References</title>
<ref id="ref-1"><label>[1]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><collab>App Annie</collab></person-group>, &#x201C;<article-title>The state of mobile in 2022</article-title>,&#x201D; <comment>2022. Accessed: Jan. 15, 2024</comment>. [Online]. Available: <ext-link ext-link-type="uri" xlink:href="https://www.data.ai/en/insights/market-data/state-of-mobile-2022/">https://www.data.ai/en/insights/market-data/state-of-mobile-2022/</ext-link></mixed-citation></ref>
<ref id="ref-2"><label>[2]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><collab>StatCounter</collab></person-group>, &#x201C;<article-title>Mobile operating system market share worldwide</article-title>,&#x201D; <comment>2022. Accessed: Jan. 15, 2024</comment>. [Online]. Available: <ext-link ext-link-type="uri" xlink:href="http://gs.statcounter.com/os-market-share/mobile/worldwide">http://gs.statcounter.com/os-market-share/mobile/worldwide</ext-link></mixed-citation></ref>
<ref id="ref-3"><label>[3]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><given-names>T.</given-names> <surname>Shishkova</surname></string-name> and <string-name><given-names>A.</given-names> <surname>Kivva</surname></string-name></person-group>, &#x201C;<article-title>Mobile malware evolution 2021</article-title>,&#x201D; <comment>2022. Accessed: Jan. 15, 2024</comment>. [Online]. Available: <ext-link ext-link-type="uri" xlink:href="https://securelist.com/mobile-malware-evolution-2021/105876/">https://securelist.com/mobile-malware-evolution-2021/105876/</ext-link></mixed-citation></ref>
<ref id="ref-4"><label>[4]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>A.</given-names> <surname>Nazir</surname></string-name> <etal>et al.</etal></person-group>, &#x201C;<article-title>A deep learning-based novel hybrid CNN-LSTM architecture for efficient detection of threats in the IoT ecosystem</article-title>,&#x201D; <source>Ain Shams Eng. J.</source>, vol. <volume>15</volume>, no. <issue>7</issue>, pp. <fpage>102777</fpage>, <year>Apr. 2024</year>. doi: <pub-id pub-id-type="doi">10.1016/j.asej.2024.102777</pub-id>.</mixed-citation></ref>
<ref id="ref-5"><label>[5]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>A.</given-names> <surname>Nazir</surname></string-name> <etal>et al.</etal></person-group>, &#x201C;<article-title>Advancing IoT security: A systematic review of machine learning approaches for the detection of IoT botnets</article-title>,&#x201D; <source>J. King Saud Univ.-Comput. Inf. Sci.</source>, vol. <volume>35</volume>, no. <issue>10</issue>, pp. <fpage>101820</fpage>, <year>Dec. 2023</year>. doi: <pub-id pub-id-type="doi">10.1016/j.jksuci.2023.101820</pub-id>.</mixed-citation></ref>
<ref id="ref-6"><label>[6]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>R. H.</given-names> <surname>Hadi</surname></string-name>, <string-name><given-names>H. N.</given-names> <surname>Hady</surname></string-name>, <string-name><given-names>A. M.</given-names> <surname>Hasan</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Al-Jodah</surname></string-name>, and <string-name><given-names>A. J.</given-names> <surname>Humaidi</surname></string-name></person-group>, &#x201C;<article-title>Improved fault classification for predictive maintenance in industrial IoT based on AutoML: A case study of ball-bearing faults</article-title>,&#x201D; <source>Processes</source>, vol. <volume>11</volume>, no. <issue>5</issue>, pp. <fpage>1507</fpage>, <year>May 2023</year>. doi: <pub-id pub-id-type="doi">10.3390/pr11051507</pub-id>.</mixed-citation></ref>
<ref id="ref-7"><label>[7]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>J.</given-names> <surname>Abawajy</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Darem</surname></string-name>, and <string-name><given-names>A. A.</given-names> <surname>Alhashmi</surname></string-name></person-group>, &#x201C;<article-title>Feature subset selection for malware detection in smart IoT platforms</article-title>,&#x201D; <source>Sensors</source>, vol. <volume>21</volume>, no. <issue>4</issue>, pp. <fpage>1374</fpage>, <year>Feb. 2021</year>. doi: <pub-id pub-id-type="doi">10.3390/s21041374</pub-id>; <pub-id pub-id-type="pmid">33669191</pub-id></mixed-citation></ref>
<ref id="ref-8"><label>[8]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>J.</given-names> <surname>Thiyagarajan</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Akash</surname></string-name>, and <string-name><given-names>B.</given-names> <surname>Murugan</surname></string-name></person-group>, &#x201C;<article-title>Improved real-time permission based malware detection and clustering approach using model independent pruning</article-title>,&#x201D; <source>IET Inf. Secur.</source>, vol. <volume>14</volume>, no. <issue>5</issue>, pp. <fpage>531</fpage>&#x2013;<lpage>541</lpage>, <year>Mar. 2020</year>. doi: <pub-id pub-id-type="doi">10.1049/iet-ifs.2019.0418</pub-id>.</mixed-citation></ref>
<ref id="ref-9"><label>[9]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>L.</given-names> <surname>Onwuzurike</surname></string-name>, <string-name><given-names>E.</given-names> <surname>Mariconti</surname></string-name>, <string-name><given-names>P.</given-names> <surname>Andriotis</surname></string-name>, <string-name><given-names>E. D.</given-names> <surname>Cristofaro</surname></string-name>, <string-name><given-names>G.</given-names> <surname>Ross</surname></string-name> and <string-name><given-names>G.</given-names> <surname>Stringhini</surname></string-name></person-group>, &#x201C;<article-title>MaMaDroid: Detecting Android malware by building markov chains of behavioral models</article-title>,&#x201D; <source>ACM Trans. Priv. Secur.</source>, vol. <volume>22</volume>, no. <issue>2</issue>, pp. <fpage>1</fpage>&#x2013;<lpage>34</lpage>, <year>Apr. 2019</year>. doi: <pub-id pub-id-type="doi">10.1145/3313391</pub-id>.</mixed-citation></ref>
<ref id="ref-10"><label>[10]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>L.</given-names> <surname>Cai</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>Li</surname></string-name>, and <string-name><given-names>Z.</given-names> <surname>Xiong</surname></string-name></person-group>, &#x201C;<article-title>JOWMDroid: Android malware detection based on feature weighting with joint optimization of weight-mapping and classifier parameters</article-title>,&#x201D; <source>Comput. Secur.</source>, vol. <volume>100</volume>, no. <issue>7</issue>, pp. <fpage>102086</fpage>, <year>Jan. 2021</year>. doi: <pub-id pub-id-type="doi">10.1016/j.cose.2020.102086</pub-id>.</mixed-citation></ref>
<ref id="ref-11"><label>[11]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>N.</given-names> <surname>Xie</surname></string-name>, <string-name><given-names>Z.</given-names> <surname>Qin</surname></string-name>, and <string-name><given-names>X.</given-names> <surname>Di</surname></string-name></person-group>, &#x201C;<article-title>GA-StackingMD: Android malware detection method based on genetic algorithm optimized stacking</article-title>,&#x201D; <source>Appl. Sci.</source>, vol. <volume>13</volume>, no. <issue>4</issue>, pp. <fpage>2629</fpage>, <year>Jan. 2023</year>. doi: <pub-id pub-id-type="doi">10.3390/app13042629</pub-id>.</mixed-citation></ref>
<ref id="ref-12"><label>[12]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>H.</given-names> <surname>Bai</surname></string-name>, <string-name><given-names>N.</given-names> <surname>Xie</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Di</surname></string-name>, and <string-name><given-names>Q.</given-names> <surname>Ye</surname></string-name></person-group>, &#x201C;<article-title>FAMD: A fast multifeature Android malware detection framework, design, and implementation</article-title>,&#x201D; <source>IEEE Access</source>, vol. <volume>8</volume>, pp. <fpage>194729</fpage>&#x2013;<lpage>194740</lpage>, <year>2020</year>. doi: <pub-id pub-id-type="doi">10.1109/ACCESS.2020.3033026</pub-id>.</mixed-citation></ref>
<ref id="ref-13"><label>[13]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>Z.</given-names> <surname>Liu</surname></string-name>, <string-name><given-names>R.</given-names> <surname>Wang</surname></string-name>, <string-name><given-names>N.</given-names> <surname>Japkowicz</surname></string-name>, <string-name><given-names>D.</given-names> <surname>Tang</surname></string-name>, <string-name><given-names>W.</given-names> <surname>Zhang</surname></string-name> and <string-name><given-names>J.</given-names> <surname>Zhao</surname></string-name></person-group>, &#x201C;<article-title>Research on unsupervised feature learning for Android malware detection based on restricted Boltzmann machines</article-title>,&#x201D; <source>Future Gener. Comput. Syst.</source>, vol. <volume>120</volume>, no. <issue>5</issue>, pp. <fpage>91</fpage>&#x2013;<lpage>108</lpage>, <year>Jul. 2021</year>. doi: <pub-id pub-id-type="doi">10.1016/j.future.2021.02.015</pub-id>.</mixed-citation></ref>
<ref id="ref-14"><label>[14]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>Y.</given-names> <surname>Wu</surname></string-name> <etal>et al.</etal></person-group>, &#x201C;<article-title>DroidRL: Feature selection for Android malware detection with reinforcement learning</article-title>,&#x201D; <source>Comput. Secur.</source>, vol. <volume>128</volume>, no. <issue>1</issue>, pp. <fpage>103126</fpage>, <year>May 2023</year>. doi: <pub-id pub-id-type="doi">10.1016/j.cose.2023.103126</pub-id>.</mixed-citation></ref>
<ref id="ref-15"><label>[15]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>S.</given-names> <surname>Alam</surname></string-name>, <string-name><given-names>S. A.</given-names> <surname>Alharbi</surname></string-name>, and <string-name><given-names>S.</given-names> <surname>Yildirim</surname></string-name></person-group>, &#x201C;<article-title>Mining nested flow of dominant APIs for detecting Android malware</article-title>,&#x201D; <source>Comput. Netw.</source>, vol. <volume>167</volume>, no. <issue>1</issue>, pp. <fpage>107026</fpage>, <year>Feb. 2020</year>. doi: <pub-id pub-id-type="doi">10.1016/j.comnet.2019.107026</pub-id>.</mixed-citation></ref>
<ref id="ref-16"><label>[16]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>S.</given-names> <surname>Sharma</surname></string-name>, <string-name><given-names>P.</given-names> <surname>Ahlawat</surname></string-name>, and <string-name><given-names>K.</given-names> <surname>Khanna</surname></string-name></person-group>, &#x201C;<article-title>DeepMDFC: A deep learning based Android malware detection and family classification method</article-title>,&#x201D; <source>Secur. Priv.</source>, vol. <volume>7</volume>, no. <issue>2</issue>, pp. <fpage>23</fpage>, <year>Oct. 2023</year>. doi: <pub-id pub-id-type="doi">10.1002/spy2.347</pub-id>.</mixed-citation></ref>
<ref id="ref-17"><label>[17]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>A.</given-names> <surname>Ananya</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Aswathy</surname></string-name>, <string-name><given-names>T. R.</given-names> <surname>Amal</surname></string-name>, <string-name><given-names>P. G.</given-names> <surname>Swathy</surname></string-name>, <string-name><given-names>P.</given-names> <surname>Vinod</surname></string-name> and <string-name><given-names>S.</given-names> <surname>Mohammad</surname></string-name></person-group>, &#x201C;<article-title>SysDroid: A dynamic ML-based Android malware analyzer using system call traces</article-title>,&#x201D; <source>Clust. Comput.</source>, vol. <volume>23</volume>, no. <issue>4</issue>, pp. <fpage>2789</fpage>&#x2013;<lpage>2808</lpage>, <year>Jan. 2020</year>. doi: <pub-id pub-id-type="doi">10.1007/s10586-019-03045-6</pub-id>.</mixed-citation></ref>
<ref id="ref-18"><label>[18]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>B.</given-names> <surname>Wu</surname></string-name> <etal>et al.</etal></person-group>, &#x201C;<article-title>Why an android app is classified as malware</article-title>,&#x201D; <source>ACM Trans. Softw. Eng. Methodol.</source>, vol. <volume>30</volume>, no. <issue>2</issue>, pp. <fpage>1</fpage>&#x2013;<lpage>29</lpage>, <year>Mar. 2021</year>. doi: <pub-id pub-id-type="doi">10.1145/3423096</pub-id>.</mixed-citation></ref>
<ref id="ref-19"><label>[19]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>A.</given-names> <surname>Martin</surname></string-name>, <string-name><given-names>R.</given-names> <surname>Lara-Cabrera</surname></string-name>, and <string-name><given-names>D.</given-names> <surname>Camacho</surname></string-name></person-group>, &#x201C;<article-title>Android malware detection through hybrid features fusion and ensemble classifiers: The AndroPyTool framework and the OmniDroid dataset</article-title>,&#x201D; <source>Inf. Fusion</source>, vol. <volume>52</volume>, no. <issue>7</issue>, pp. <fpage>128</fpage>&#x2013;<lpage>142</lpage>, <year>Dec. 2019</year>. doi: <pub-id pub-id-type="doi">10.1016/j.inffus.2018.12.006</pub-id>.</mixed-citation></ref>
<ref id="ref-20"><label>[20]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>L.</given-names> <surname>Taheri</surname></string-name>, <string-name><given-names>A. F.</given-names> <surname>Kadir</surname></string-name>, and <string-name><given-names>A. H.</given-names> <surname>Lashkari</surname></string-name></person-group>, &#x201C;<article-title>Extensible Android malware detection and family classification using network-flows and API-calls</article-title>,&#x201D; in <conf-name>Int. Carnahan Conf. Secur. Technol. (ICCST)</conf-name>, <publisher-loc>Chennai, India</publisher-loc>, <year>Oct. 2019</year>. doi: <pub-id pub-id-type="doi">10.1109/ccst.2019.8888430</pub-id>.</mixed-citation></ref>
<ref id="ref-21"><label>[21]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>A.</given-names> <surname>Rahali</surname></string-name>, <string-name><given-names>A. H.</given-names> <surname>Lashkari</surname></string-name>, <string-name><given-names>G.</given-names> <surname>Kaur</surname></string-name>, <string-name><given-names>L.</given-names> <surname>Taheri</surname></string-name>, <string-name><given-names>F.</given-names> <surname>Gagnon</surname></string-name>, and <string-name><given-names>F.</given-names> <surname>Massicotte</surname></string-name></person-group>, &#x201C;<article-title>DIDroid: Android malware classification and characterization using deep image learning</article-title>,&#x201D; in <conf-name>2020 the 10th Int. Conf. Commun. Netw. Secur.</conf-name>, <publisher-loc>Tokyo, Japan</publisher-loc>, <year>Nov. 2020</year>. doi: <pub-id pub-id-type="doi">10.1145/3442520.3442522</pub-id>.</mixed-citation></ref>
<ref id="ref-22"><label>[22]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>D.</given-names> <surname>Arp</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Spreitzenbarth</surname></string-name>, <string-name><given-names>M.</given-names> <surname>H&#x00FC;bner</surname></string-name>, <string-name><given-names>H.</given-names> <surname>Gascon</surname></string-name>, and <string-name><given-names>K.</given-names> <surname>Rieck</surname></string-name></person-group>, &#x201C;<article-title>Drebin: Effective and explainable detection of Android malware in your pocket</article-title>,&#x201D; in <conf-name>Proc. 2014 Netw. Distrib. Syst. Secur. Symp.</conf-name>, <publisher-loc>San Diego, CA, USA</publisher-loc>, <year>2014</year>. doi: <pub-id pub-id-type="doi">10.14722/ndss.2014.23247</pub-id>.</mixed-citation></ref>
<ref id="ref-23"><label>[23]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>F.</given-names> <surname>Wei</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>Li</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Roy</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Ou</surname></string-name>, and <string-name><given-names>W.</given-names> <surname>Zhou</surname></string-name></person-group>, &#x201C;<article-title>Deep ground truth analysis of current Android malware</article-title>,&#x201D; in <conf-name>Int. Conf. Detect. Intrusions Malware Vulnerability Assess.</conf-name>, <publisher-loc>Bonn, Germany</publisher-loc>, <year>2017</year>, vol. <volume>10327</volume>. doi: <pub-id pub-id-type="doi">10.1007/978-3-319-60876-1_12</pub-id>.</mixed-citation></ref>
<ref id="ref-24"><label>[24]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>K.</given-names> <surname>Allix</surname></string-name>, <string-name><given-names>T. F.</given-names> <surname>Bissyand&#x00E9;</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Klein</surname></string-name>, and <string-name><given-names>Y. L.</given-names> <surname>Traon</surname></string-name></person-group>, &#x201C;<article-title>AndroZoo: Collecting millions of android apps for the research community</article-title>,&#x201D; in <conf-name>2016 IEEE/ACM 13th Work. Conf. Min. Softw. Repos. (MSR)</conf-name>, <publisher-loc>Austin, TX, USA</publisher-loc>, <year>2016</year>, pp. <fpage>468</fpage>&#x2013;<lpage>471</lpage>.</mixed-citation></ref>
<ref id="ref-25"><label>[25]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>A. H.</given-names> <surname>Lashkari</surname></string-name>, <string-name><given-names>A. F. A.</given-names> <surname>Kadir</surname></string-name>, <string-name><given-names>L.</given-names> <surname>Taheri</surname></string-name>, and <string-name><given-names>A. A.</given-names> <surname>Ghorbani</surname></string-name></person-group>, &#x201C;<article-title>Toward developing a systematic approach to generate benchmark Android malware datasets and classification</article-title>,&#x201D; in <conf-name>2018 Int. Carnahan Conf. Secur. Technol. (ICCST)</conf-name>, <publisher-loc>Montreal, QC, Canada</publisher-loc>, <year>2018</year>, pp. <fpage>1</fpage>&#x2013;<lpage>7</lpage>. doi: <pub-id pub-id-type="doi">10.1109/CCST.2018.8585560</pub-id>.</mixed-citation></ref>
<ref id="ref-26"><label>[26]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>G.</given-names> <surname>Ke</surname></string-name> <etal>et al.</etal></person-group>, &#x201C;<article-title>LightGBM: A highly efficient gradient boosting decision tree</article-title>,&#x201D; in <conf-name>Proc. 31st Int. Conf. Neural Inform. Process. Syst. (NIPS&#x2019;17)</conf-name>, <publisher-loc>Long Beach, CA, USA</publisher-loc>, <year>2017</year>, pp. <fpage>3149</fpage>&#x2013;<lpage>3157</lpage>.</mixed-citation></ref>
<ref id="ref-27"><label>[27]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>A.</given-names> <surname>Ghourabi</surname></string-name></person-group>, &#x201C;<article-title>A security model based on LightGBM and transformer to protect healthcare systems from cyberattacks</article-title>,&#x201D; <source>IEEE Access</source>, vol. <volume>10</volume>, pp. <fpage>48890</fpage>&#x2013;<lpage>48903</lpage>, <year>2022</year>. doi: <pub-id pub-id-type="doi">10.1109/ACCESS.2022.3172432</pub-id>.</mixed-citation></ref>
<ref id="ref-28"><label>[28]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><given-names>E.</given-names> <surname>Brochu</surname></string-name>, <string-name><given-names>V. M.</given-names> <surname>Cora</surname></string-name>, and <string-name><given-names>N.</given-names> <surname>de Freitas</surname></string-name></person-group>, &#x201C;<article-title>A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning</article-title>,&#x201D; <comment>arXiv preprint arXiv:1012.2599</comment>, <year>Dec. 2010</year>.</mixed-citation></ref>
<ref id="ref-29"><label>[29]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><given-names>J.</given-names> <surname>van Hoof</surname></string-name> and <string-name><given-names>J.</given-names> <surname>Vanschoren</surname></string-name></person-group>, &#x201C;<article-title>Hyperboost: Hyperparameter optimization by gradient boosting surrogate models</article-title>,&#x201D; <comment>arXiv preprint arXiv:2101.02289</comment>, <year>2021</year>.</mixed-citation></ref>
<ref id="ref-30"><label>[30]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>P.</given-names> <surname>Musikawan</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>Kongsorot</surname></string-name>, <string-name><given-names>I.</given-names> <surname>You</surname></string-name>, and <string-name><given-names>C.</given-names> <surname>So-In</surname></string-name></person-group>, &#x201C;<article-title>An enhanced deep learning neural network for the detection and identification of Android malware</article-title>,&#x201D; <source>IEEE Internet Things J.</source>, vol. <volume>10</volume>, no. <issue>10</issue>, pp. <fpage>8560</fpage>&#x2013;<lpage>8577</lpage>, <year>May 2023</year>. doi: <pub-id pub-id-type="doi">10.1109/JIOT.2022.3194881</pub-id>.</mixed-citation></ref>
<ref id="ref-31"><label>[31]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><given-names>A.</given-names> <surname>Batouche</surname></string-name> and <string-name><given-names>H.</given-names> <surname>Jahankhani</surname></string-name></person-group>, &#x201C;<chapter-title>A comprehensive approach to Android malware detection using machine learning</chapter-title>,&#x201D; in <source>Information Security Technologies for Controlling Pandemics</source>. <publisher-loc>Cham</publisher-loc>: <publisher-name>Springer</publisher-name>, <year>2021</year>.</mixed-citation></ref>
<ref id="ref-32"><label>[32]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>R.</given-names> <surname>Chopra</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Acharya</surname></string-name>, <string-name><given-names>U.</given-names> <surname>Rawat</surname></string-name>, and <string-name><given-names>R.</given-names> <surname>Bhatnagar</surname></string-name></person-group>, &#x201C;<article-title>An energy efficient, robust, sustainable, and low computational cost method for mobile malware detection</article-title>,&#x201D; <source>Appl. Comput. Intell. Soft Comput.</source>, vol. <volume>2023</volume>, pp. <fpage>e2029064</fpage>, <year>2023</year>. doi: <pub-id pub-id-type="doi">10.1155/2023/2029064</pub-id>.</mixed-citation></ref>
<ref id="ref-33"><label>[33]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>S.</given-names> <surname>Ullah</surname></string-name>, <string-name><given-names>T.</given-names> <surname>Ahmad</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Buriro</surname></string-name>, <string-name><given-names>N.</given-names> <surname>Zara</surname></string-name>, and <string-name><given-names>S.</given-names> <surname>Saha</surname></string-name></person-group>, &#x201C;<article-title>TrojanDetector: A multi-layer hybrid approach for Trojan detection in Android applications</article-title>,&#x201D; <source>Appl. Sci.</source>, vol. <volume>12</volume>, no. <issue>21</issue>, pp. <fpage>10755</fpage>, <year>Jan. 2022</year>. doi: <pub-id pub-id-type="doi">10.3390/app122110755</pub-id>.</mixed-citation></ref>
<ref id="ref-34"><label>[34]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>X.</given-names> <surname>Wang</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Liu</surname></string-name>, and <string-name><given-names>C.</given-names> <surname>Zhang</surname></string-name></person-group>, &#x201C;<article-title>Network intrusion detection based on multi-domain data and ensemble-bidirectional LSTM</article-title>,&#x201D; <source>EURASIP J. Inf. Secur.</source>, vol. <volume>2023</volume>, no. <issue>1</issue>, pp. <fpage>5</fpage>, <year>Jun. 2023</year>. doi: <pub-id pub-id-type="doi">10.1186/s13635-023-00139-y</pub-id>.</mixed-citation></ref>
</ref-list>
</back></article>