<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.1 20151215//EN" "http://jats.nlm.nih.gov/publishing/1.1/JATS-journalpublishing1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" article-type="research-article" dtd-version="1.1">
<front>
<journal-meta>
<journal-id journal-id-type="pmc">CMC</journal-id>
<journal-id journal-id-type="nlm-ta">CMC</journal-id>
<journal-id journal-id-type="publisher-id">CMC</journal-id>
<journal-title-group>
<journal-title>Computers, Materials &#x0026; Continua</journal-title>
</journal-title-group>
<issn pub-type="epub">1546-2226</issn>
<issn pub-type="ppub">1546-2218</issn>
<publisher>
<publisher-name>Tech Science Press</publisher-name>
<publisher-loc>USA</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">30784</article-id>
<article-id pub-id-type="doi">10.32604/cmc.2023.030784</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Intelligent Deep Learning Based Cybersecurity Phishing Email Detection and Classification</article-title>
<alt-title alt-title-type="left-running-head">Intelligent Deep Learning Based Cybersecurity Phishing Email Detection and Classification</alt-title>
<alt-title alt-title-type="right-running-head">Intelligent Deep Learning Based Cybersecurity Phishing Email Detection and Classification</alt-title>
</title-group>
<contrib-group content-type="authors">
<contrib id="author-1" contrib-type="author">
<name name-style="western"><surname>Brindha</surname><given-names>R.</given-names></name><xref ref-type="aff" rid="aff-1">1</xref></contrib>
<contrib id="author-2" contrib-type="author">
<name name-style="western"><surname>Nandagopal</surname><given-names>S.</given-names></name><xref ref-type="aff" rid="aff-2">2</xref></contrib>
<contrib id="author-3" contrib-type="author">
<name name-style="western"><surname>Azath</surname><given-names>H.</given-names></name><xref ref-type="aff" rid="aff-3">3</xref></contrib>
<contrib id="author-4" contrib-type="author">
<name name-style="western"><surname>Sathana</surname><given-names>V.</given-names></name><xref ref-type="aff" rid="aff-4">4</xref></contrib>
<contrib id="author-5" contrib-type="author">
<name name-style="western"><surname>Joshi</surname><given-names>Gyanendra Prasad</given-names></name><xref ref-type="aff" rid="aff-5">5</xref></contrib>
<contrib id="author-6" contrib-type="author" corresp="yes">
<name name-style="western"><surname>Kim</surname><given-names>Sung Won</given-names></name><xref ref-type="aff" rid="aff-6">6</xref><email>swon@yu.ac.kr</email></contrib>
<aff id="aff-1"><label>1</label><institution>Department of Computing Technologies, SRM Institute of Science and Technology</institution>, <addr-line>Kattankulathur, 603203</addr-line>, <country>India</country></aff>
<aff id="aff-2"><label>2</label><institution>Department of Computing Science and Engineering, Nandha College of Technology</institution>, <addr-line>Erode, 638052</addr-line>, <country>India</country></aff>
<aff id="aff-3"><label>3</label><institution>School of Computing Science and Engineering, VIT Bhopal University</institution>, <addr-line>Bhopal, 466114</addr-line>, <country>India</country></aff>
<aff id="aff-4"><label>4</label><institution>Department of Computer Science and Engineering, K.Ramakrishnan College of Engineering</institution>, <addr-line>Tiruchirappalli, 621112</addr-line>, <country>India</country></aff>
<aff id="aff-5"><label>5</label><institution>Department of Computer Science and Engineering, Sejong University</institution>, <addr-line>Seoul, 05006</addr-line>, <country>Korea</country></aff>
<aff id="aff-6"><label>6</label><institution>Department of Information and Communication Engineering, Yeungnam University</institution>, <addr-line>Gyeongsan-si, 38541, Gyeongbuk-do</addr-line>, <country>Korea</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>&#x002A;</label>Corresponding Author: Sung Won Kim. Email: <email>swon@yu.ac.kr</email></corresp>
</author-notes>
<pub-date publication-format="print" date-type="pub" iso-8601-date="2022-12-15"><day>15</day>
<month>12</month>
<year>2022</year></pub-date>
<volume>74</volume>
<issue>3</issue>
<fpage>5901</fpage>
<lpage>5914</lpage>
<history>
<date date-type="received"><day>01</day><month>4</month><year>2022</year></date>
<date date-type="accepted"><day>12</day><month>10</month><year>2022</year></date>
</history>
<permissions>
<copyright-statement>&#x00A9; 2023 Brindha et al.</copyright-statement>
<copyright-year>2023</copyright-year>
<copyright-holder>Brindha et al.</copyright-holder>
<license xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>This work is licensed under a <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</ext-link>, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<self-uri content-type="pdf" xlink:href="TSP_CMC_30784.pdf"></self-uri>
<abstract>
<p>Phishing is a type of cybercrime in which cyber-attackers pose as authorized persons or entities and steal the victims&#x2019; sensitive data. E-mails, instant messages and phone calls are some of the common modes used in such cyberattacks. Though security models are continuously upgraded to prevent cyberattacks, hackers find innovative ways to target the victims. Against this background, a drastic increase has been observed in the number of phishing emails sent to potential targets. This scenario underlines the importance of designing an effective classification model. In addition to the numerous conventional models available in the literature for the classification of phishing emails, Machine Learning (ML) techniques and Deep Learning (DL) models have also been employed. The current study presents an Intelligent Cuckoo Search (CS) Optimization Algorithm with a Deep Learning-based Phishing Email Detection and Classification (ICSOA-DLPEC) model. The aim of the proposed ICSOA-DLPEC model is to effectively distinguish emails as either legitimate or phishing ones. At the initial stage, pre-processing is performed in three stages, namely email cleaning, tokenization and stop-word elimination. Then, the N-gram approach is applied to extract the useful feature vectors. Moreover, the GRU (Gated Recurrent Unit) model is employed to detect and classify phishing emails, and the CS algorithm is used to fine-tune the parameters involved in the GRU model. The performance of the proposed ICSOA-DLPEC model was experimentally validated using a benchmark dataset, and the results were assessed under several dimensions. Extensive comparative studies were conducted, and the results confirmed the superior performance of the proposed ICSOA-DLPEC model over other existing approaches. The proposed model achieved a maximum accuracy of 99.72&#x0025;.</p>
</abstract>
<kwd-group kwd-group-type="author">
<kwd>Phishing email</kwd>
<kwd>data classification</kwd>
<kwd>natural language processing</kwd>
<kwd>deep learning</kwd>
<kwd>cybersecurity</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec id="s1"><label>1</label><title>Introduction</title>
<p>Nowadays, phishing has become a more lucrative way of committing fraud than ever before [<xref ref-type="bibr" rid="ref-1">1</xref>]. According to criminal law, fraud can be defined as a dishonest action of an individual with an intention to gain personal benefit or to damage another individual&#x2019;s image. Online fraudulent activities include misleading individuals into sharing their private data so that the fraudster achieves personal or financial gains. Likewise, phishing is an act in which a cyber-attacker attempts to obtain confidential or sensitive data from users for the purpose of stealing. This is done by mimicking original websites and presenting the copies as the real ones [<xref ref-type="bibr" rid="ref-2">2</xref>]. Usually, a phishing attack is executed through electronic devices (namely, computers and iPads) and computer networks; the cyber-attackers wait for the right opportunity, find the vulnerable points in the system, bypass the security features and steal the valuable data of the end-users. Such vulnerable points are regarded as the weak links in a security chain [<xref ref-type="bibr" rid="ref-3">3</xref>,<xref ref-type="bibr" rid="ref-4">4</xref>]. The phishing attacker interacts with the users through socially-engineered messages and persuades them to disclose their private data. The collected data is then used by fraudsters to obtain illegal access to a user&#x2019;s private and secure data. Phishing thus extracts sensitive data from incautious victims and remains a social engineering-based threat. The communication channels that are frequently utilized to make such attacks include instant messaging, emails and so on. In each case, the attacker appears as a credible and legitimate individual. Since email is commonly used by such attackers, email messages receive particular emphasis in the literature [<xref ref-type="bibr" rid="ref-5">5</xref>].</p>
<p>As per the literature [<xref ref-type="bibr" rid="ref-6">6</xref>], it is challenging to detect phishing messages and emails automatically. Various techniques have been discussed for the detection of phishing emails, including blacklisting, Machine Learning (ML)-based classification algorithms and Deep Learning (DL) methods [<xref ref-type="bibr" rid="ref-7">7</xref>]. The existing blacklist method primarily depends on personal reports, so phishing emails are identified only after considerable time and manual effort have been spent. On the other hand, the ML classification-based phishing detection techniques utilize Artificial Intelligence (AI) methods to detect phishing attacks. Feature engineering is important for the recognition of representative features, but it is not feasible when the information migration scenario is applied [<xref ref-type="bibr" rid="ref-8">8</xref>]. Moreover, the recognition methods established using DL techniques are constrained by the word embeddings used to represent email content. In such cases, the output remains opaque since the technique falls short in recognizing what makes an email a phishing email. So, DL and Natural Language Processing (NLP) technologies must be combined [<xref ref-type="bibr" rid="ref-9">9</xref>]. Following a dramatic expansion in the applications of DL methods, researchers have paid considerable attention to investigating phishing [<xref ref-type="bibr" rid="ref-10">10</xref>]. In contrast to the conventional ML methods, which rely on hand-engineered features, DL techniques learn the features from the data, so that ML specialists can build detection models without requiring in-depth knowledge of cybersecurity.</p>
<p>The authors in [<xref ref-type="bibr" rid="ref-11">11</xref>] validated the performance of Convolutional Neural Network (CNN) techniques in recognizing phishing attacks through content analysis of email messages. These techniques take the embedded text of the email body as input and output the probability that a message is malicious. Pan&#x00A0;et&#x00A0;al.&#x00A0;[<xref ref-type="bibr" rid="ref-12">12</xref>] presented a model termed Semantic Graph Neural Network (SGNN) to overcome the challenges involved in email classifiers. This approach converted the email classification problem into a graph classification problem. In this method, the emails were represented as graphs, and the SGNN method was used for classification. Since the email features were built as a semantic graph, there was no requirement to embed each word as a numerical vector.</p>
<p>Nayak&#x00A0;et&#x00A0;al.&#x00A0;[<xref ref-type="bibr" rid="ref-13">13</xref>] presented a data science technique for Spam Email Detection (SMD) utilizing an ML approach. A hybrid bagging method was utilized in this study for the recognition of spam emails. This approach combined two methods, namely the Na&#x00EF;ve Bayes (NB) method and the J48 Decision Tree (DT) method. When a dataset was fed as input, these methods categorized the data into distinct sets. Hossain&#x00A0;et&#x00A0;al.&#x00A0;[<xref ref-type="bibr" rid="ref-14">14</xref>] presented a method in which e-mails were classified either as spam or harmful. Both Density-Based Clustering (DBSCAN) and Isolation Forest techniques were utilized to identify outlying values that fall outside a particular range. Recursive Feature Elimination, Heatmap, and Chi-Square Feature Selection (FS) approaches were utilized for the selection of effective features. The presented method was executed using both ML and DL techniques to conduct a comparative analysis.</p>
<p>The current study presents an Intelligent Cuckoo Search (CS) Optimization Algorithm with Deep Learning-based Phishing Email Detection and Classification (ICSOA-DLPEC) model. The aim of the proposed ICSOA-DLPEC model is to effectively distinguish emails as either legitimate or phishing ones. The N-gram approach is applied for the extraction of useful feature vectors. Moreover, a Gated Recurrent Unit (GRU) model is employed for the detection and classification of phishing emails, and the CS algorithm is used to fine-tune the parameters of the GRU model. The performance of the proposed ICSOA-DLPEC model was experimentally validated using a benchmark dataset, and the results were assessed under several dimensions. In short, the contributions of the current research study are as follows.
<list list-type="bullet">
<list-item><p>A new ICSOA-DLPEC model is proposed for the detection and classification of phishing emails.</p></list-item>
<list-item><p>Data preprocessing and feature extraction processes are employed to derive the features.</p></list-item>
<list-item><p>The GRU model is applied for the detection and classification of phishing emails.</p></list-item>
<list-item><p>The CS algorithm is employed for optimal fine-tuning of the hyperparameters involved in the GRU model.</p></list-item>
</list></p>
</sec>
<sec id="s2"><label>2</label><title>Materials and Methods</title>
<p>In this study, a new ICSOA-DLPEC technique has been developed to effectively distinguish emails as either legitimate or phishing ones. First, data pre-processing is performed in three phases, namely email cleaning, tokenization and stop-word elimination. Next, the N-gram approach is applied for the extraction of useful feature vectors. Finally, the GRU model, with its hyperparameters tuned by the CS algorithm, is employed to detect and classify phishing emails. <xref ref-type="fig" rid="fig-1">Fig. 1</xref> shows the block diagram of the proposed ICSOA-DLPEC technique.</p>
<fig id="fig-1"><label>Figure 1</label><caption><title>Block diagram of the ICSOA-DLPEC model</title></caption><graphic mimetype="image" mime-subtype="png" xlink:href="CMC_30784-fig-1.png"/></fig>
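<p>The three pre-processing phases named above can be illustrated with a minimal Python sketch. It is not the authors&#x2019; implementation: the regular expressions, the tiny stop-word list <monospace>STOP_WORDS</monospace> and all function names are assumptions introduced here for illustration only, since the study does not specify them.</p>
<code language="python"><![CDATA[
```python
import re

# Hypothetical, deliberately tiny stop-word list; the study does not specify one.
STOP_WORDS = {"a", "an", "the", "to", "of", "and", "is", "in", "at", "your", "you", "for", "on"}


def clean_email(raw):
    """Email cleaning: drop HTML tags and URLs, keep letters only, lowercase."""
    text = re.sub(r"<[^>]+>", " ", raw)         # remove HTML tags
    text = re.sub(r"https?://\S+", " ", text)   # remove URLs
    text = re.sub(r"[^A-Za-z\s]", " ", text)    # remove digits and punctuation
    return text.lower()


def tokenize(text):
    """Tokenization: split the cleaned text on whitespace."""
    return text.split()


def remove_stop_words(tokens):
    """Stop-word elimination: discard tokens found in STOP_WORDS."""
    return [t for t in tokens if t not in STOP_WORDS]


def preprocess(raw):
    """Run the three phases in order and return the remaining tokens."""
    return remove_stop_words(tokenize(clean_email(raw)))


if __name__ == "__main__":
    sample = "<p>Dear user, verify your account at http://example.com/login NOW!</p>"
    print(preprocess(sample))
```
]]></code>
<p>In practice, a curated stop-word list and an email-aware parser would likely replace these simplifications; the sketch only fixes the order of the three phases.</p>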
<sec id="s2_1"><label>2.1</label><title>Feature Engineering</title>
<p>In this study, the N-gram method is applied to extract valuable feature vectors. A pattern is generated by concatenating neighboring tokens into n-grams, in which <inline-formula id="ieqn-1"><mml:math id="mml-ieqn-1"><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mn>2</mml:mn><mml:mo>,</mml:mo><mml:mn>3</mml:mn><mml:mo>&#x2026;</mml:mo><mml:mo>}</mml:mo></mml:mrow></mml:math></inline-formula>. In the simplest case, <italic>n</italic> equals one, and the n-gram is termed a unigram. The n-gram approach thus determines how many neighboring words are considered together when a word in the text is represented with respect to the other words. In general, <inline-formula id="ieqn-2"><mml:math id="mml-ieqn-2"><mml:mi>n</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mrow><mml:mtext mathvariant="italic">grams</mml:mtext></mml:mrow></mml:math></inline-formula> are not extended beyond <inline-formula id="ieqn-3"><mml:math id="mml-ieqn-3"><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:mn>3</mml:mn></mml:math></inline-formula>, since larger values tend to create complex patterns that rarely match.</p>
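<p>As an illustration of the N-gram step, the following Python sketch builds unigrams, bigrams and trigrams from a token list and turns their counts into a feature vector. The function names, the count-based weighting and the example vocabulary are assumptions made here for illustration; the study does not state how the n-gram features are weighted.</p>
<code language="python"><![CDATA[
```python
from collections import Counter


def ngrams(tokens, n):
    """Return all n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(tokens, max_n=3):
    """Count every n-gram for n = 1..max_n (max_n = 3 follows the text above)."""
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(tokens, n))
    return counts


def to_vector(counts, vocabulary):
    """Map n-gram counts onto a fixed, ordered vocabulary (0 if absent)."""
    return [counts.get(term, 0) for term in vocabulary]


if __name__ == "__main__":
    tokens = ["verify", "your", "account", "now"]
    counts = ngram_counts(tokens)
    # Hypothetical vocabulary, chosen only for this demonstration.
    vocabulary = ["verify", "verify your", "your account now", "click here"]
    print(to_vector(counts, vocabulary))
```
]]></code>
<p>A token list shorter than <italic>n</italic> simply yields no n-grams of that order, so short emails are still mapped to a vector of the same length.</p>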
</sec>
<sec id="s2_2"><label>2.2</label><title>GRU Classification</title>
<p>The GRU model is exploited for the detection and classification of phishing emails.</p>
<p>The Elman Recurrent Neural Network (RNN) is a prominent example of a simple RNN in which a context layer functions as a memory. The context layer stores the outputs of the hidden neurons from the preceding time step and is combined with the current input, so that information is propagated to future states and the network can follow the time-varying pattern of the data. The vectorized equations for a simple RNN are provided below.
<disp-formula id="ueqn-1">
<mml:math id="mml-ueqn-1" display="block"><mml:msub><mml:mi>h</mml:mi><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mi>&#x03C3;</mml:mi><mml:mrow><mml:mi>h</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>W</mml:mi><mml:mrow><mml:mi>h</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>U</mml:mi><mml:mrow><mml:mi>h</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mi>h</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>b</mml:mi><mml:mrow><mml:mi>h</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula>
<disp-formula id="eqn-1"><label>(1)</label><mml:math id="mml-eqn-1" display="block"><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mi>&#x03C3;</mml:mi><mml:mrow><mml:mi>y</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>W</mml:mi><mml:mrow><mml:mi>y</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mi>h</mml:mi><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>b</mml:mi><mml:mrow><mml:mi>y</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula></p>
<p>Here, <inline-formula id="ieqn-4"><mml:math id="mml-ieqn-4"><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> refers to the input vector, <inline-formula id="ieqn-5"><mml:math id="mml-ieqn-5"><mml:msub><mml:mi>h</mml:mi><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> signifies the hidden layer vector, <inline-formula id="ieqn-6"><mml:math id="mml-ieqn-6"><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> represents the output vector, <italic>W</italic> denotes the weights of the hidden and the output layers, <italic>U</italic> denotes the weights of the context state, <italic>b</italic> indicates the bias, and <inline-formula id="ieqn-7"><mml:math id="mml-ieqn-7"><mml:msub><mml:mi>&#x03C3;</mml:mi><mml:mrow><mml:mi>h</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula id="ieqn-8"><mml:math id="mml-ieqn-8"><mml:msub><mml:mi>&#x03C3;</mml:mi><mml:mrow><mml:mi>y</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> denote the respective activation functions. Backpropagation Through Time (BPTT) is a prominent approach used to train simple RNNs. In comparison with backpropagation in a simple Neural Network (NN), BPTT propagates the error through the network unrolled over the time steps, so that the hidden states are determined as a function of time.</p>
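<p>The recurrence above can be made concrete with a small, dependency-free Python sketch of one Elman time step. The choice of tanh for the hidden activation and the sigmoid for the output activation is an assumption made here, as are all function names; the text leaves both activation functions unspecified.</p>
<code language="python"><![CDATA[
```python
import math


def matvec(M, v):
    """Matrix-vector product for a nested-list matrix and a list vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def add(*vectors):
    """Element-wise sum of equally long vectors."""
    return [sum(vals) for vals in zip(*vectors)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def elman_step(x_t, h_prev, W_h, U_h, b_h, W_y, b_y):
    """One time step: h_t = tanh(W_h x_t + U_h h_prev + b_h),
    y_t = sigmoid(W_y h_t + b_y). Both activations are assumed choices."""
    h_t = [math.tanh(z) for z in add(matvec(W_h, x_t), matvec(U_h, h_prev), b_h)]
    y_t = [sigmoid(z) for z in add(matvec(W_y, h_t), b_y)]
    return h_t, y_t


if __name__ == "__main__":
    # Toy sizes: 2 inputs, 2 hidden units, 1 output; weights are arbitrary.
    W_h = [[1.0, 0.0], [0.0, 1.0]]
    U_h = [[0.5, 0.0], [0.0, 0.5]]
    b_h = [0.0, 0.0]
    W_y = [[1.0, -1.0]]
    b_y = [0.0]
    h = [0.0, 0.0]
    for x in ([1.0, 0.0], [0.0, 1.0]):
        h, y = elman_step(x, h, W_h, U_h, b_h, W_y, b_y)
        print(h, y)
```
]]></code>
<p>Carrying <monospace>h</monospace> from one call to the next plays the role of the context layer: the hidden output of the previous step is fed back together with the new input.</p>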
<p>The GRU model is a simplified variant of the Long Short-Term Memory (LSTM) model in the RNN family. Unlike the LSTM, the GRU combines the input and forget gates into a single update gate. Let <italic>h</italic> denote the number of hidden units, and let the mini-batch input at time step <italic>t</italic> be <inline-formula id="ieqn-9"><mml:math id="mml-ieqn-9"><mml:msub><mml:mi>X</mml:mi><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>&#x2208;</mml:mo><mml:msup><mml:mrow><mml:mrow><mml:mi mathvariant="double-struck">R</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>n</mml:mi><mml:mrow><mml:mo>&#x2217;</mml:mo></mml:mrow><mml:mi>d</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula>, where <italic>n</italic> is the number of samples and <italic>d</italic> is the number of inputs. The hidden state at the previous time step <italic>t</italic>&#x2212;1 is denoted by <inline-formula id="ieqn-10"><mml:math id="mml-ieqn-10"><mml:msub><mml:mi>H</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>&#x2208;</mml:mo><mml:msup><mml:mrow><mml:mrow><mml:mi mathvariant="double-struck">R</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>n</mml:mi><mml:mrow><mml:mo>&#x2217;</mml:mo></mml:mrow><mml:mi>h</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula>. The resultant hidden state of an individual GRU at the current time step <italic>t</italic> is given below [<xref ref-type="bibr" rid="ref-15">15</xref>]:
<disp-formula id="eqn-2"><label>(2)</label><mml:math id="mml-eqn-2" display="block"><mml:msub><mml:mi>R</mml:mi><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mi>&#x03C3;</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mi>W</mml:mi><mml:mrow><mml:mi>x</mml:mi><mml:mi>r</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>H</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:msub><mml:mi>W</mml:mi><mml:mrow><mml:mi>h</mml:mi><mml:mi>r</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>b</mml:mi><mml:mrow><mml:mi>r</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula>
<disp-formula id="eqn-3"><label>(3)</label><mml:math id="mml-eqn-3" display="block"><mml:msub><mml:mi>Z</mml:mi><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mi>&#x03C3;</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mi>W</mml:mi><mml:mrow><mml:mi>x</mml:mi><mml:mi>z</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>H</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:msub><mml:mi>W</mml:mi><mml:mrow><mml:mi>h</mml:mi><mml:mi>z</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>b</mml:mi><mml:mrow><mml:mi>z</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula>
<disp-formula id="eqn-4"><label>(4)</label><mml:math id="mml-eqn-4" display="block"><mml:msub><mml:mrow><mml:mover><mml:mi>H</mml:mi><mml:mo>&#x007E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mrow><mml:mtext>tanh</mml:mtext></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mi>W</mml:mi><mml:mrow><mml:mi>x</mml:mi><mml:mi>h</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>R</mml:mi><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>&#x2299;</mml:mo><mml:msub><mml:mi>H</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:msub><mml:mi>W</mml:mi><mml:mrow><mml:mi>h</mml:mi><mml:mi>h</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>b</mml:mi><mml:mrow><mml:mi>h</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula>
<disp-formula id="eqn-5"><label>(5)</label><mml:math id="mml-eqn-5" display="block"><mml:msub><mml:mi>H</mml:mi><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x2212;</mml:mo><mml:msub><mml:mi>Z</mml:mi><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2299;</mml:mo><mml:msub><mml:mi>H</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>Z</mml:mi><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>&#x2299;</mml:mo><mml:msub><mml:mrow><mml:mover><mml:mi>H</mml:mi><mml:mo>&#x007E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:math></disp-formula>where <inline-formula id="ieqn-11"><mml:math id="mml-ieqn-11"><mml:mi>&#x03C3;</mml:mi></mml:math></inline-formula> denotes the sigmoid activation function, <inline-formula id="ieqn-12"><mml:math id="mml-ieqn-12"><mml:mi>&#x03C3;</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mn>1</mml:mn><mml:mo>+</mml:mo><mml:msup><mml:mi>e</mml:mi><mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mi>x</mml:mi></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>; <inline-formula><mml:math><mml:msub><mml:mi>W</mml:mi><mml:mrow><mml:mi>h</mml:mi><mml:mi>z</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>, <inline-formula><mml:math><mml:msub><mml:mi>W</mml:mi><mml:mrow><mml:mi>x</mml:mi><mml:mi>r</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>, <inline-formula><mml:math><mml:msub><mml:mi>W</mml:mi><mml:mrow><mml:mi>h</mml:mi><mml:mi>r</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula id="ieqn-1001"><mml:math id="mml-ieqn-1001"><mml:msub><mml:mi>W</mml:mi><mml:mrow><mml:mi>x</mml:mi><mml:mi>z</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> denote the weights connecting the hidden layer and the update gate, the input layer and the reset gate, the hidden layer and the reset gate, and the input layer and the update gate, respectively; <inline-formula id="ieqn-13"><mml:math id="mml-ieqn-13"><mml:msub><mml:mi>b</mml:mi><mml:mrow><mml:mi>r</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula id="ieqn-14"><mml:math id="mml-ieqn-14"><mml:msub><mml:mi>b</mml:mi><mml:mrow><mml:mi>z</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> indicate the bias values of the reset and update gates; <inline-formula id="ieqn-15"><mml:math id="mml-ieqn-15"><mml:msub><mml:mrow><mml:mover><mml:mi>H</mml:mi><mml:mo>&#x007E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> indicates the candidate hidden state at the current time step <italic>t</italic>; <inline-formula id="ieqn-16"><mml:math id="mml-ieqn-16"><mml:mo>&#x2299;</mml:mo></mml:math></inline-formula> denotes the element-wise (Hadamard) product of two matrices; and <inline-formula id="ieqn-17"><mml:math id="mml-ieqn-17"><mml:mrow><mml:mtext>tanh</mml:mtext></mml:mrow></mml:math></inline-formula> indicates the hyperbolic tangent activation function as given below.
<disp-formula id="eqn-6"><label>(6)</label><mml:math id="mml-eqn-6" display="block"><mml:mrow><mml:mtext>tanh</mml:mtext></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x2212;</mml:mo><mml:mstyle displaystyle="true" scriptlevel="0"><mml:mfrac><mml:mn>2</mml:mn><mml:mrow><mml:mn>1</mml:mn><mml:mo>+</mml:mo><mml:msup><mml:mi>e</mml:mi><mml:mrow><mml:mn>2</mml:mn><mml:mi>x</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:mfrac></mml:mstyle></mml:math></disp-formula></p>
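<p>To show how Eqs. (2)&#x2013;(5) fit together, the following dependency-free Python sketch performs one GRU time step on a mini-batch stored as nested lists. It is only an illustration under stated assumptions: the symbol &#x2299; is taken to be the element-wise product, the parameter dictionary <monospace>p</monospace> and the helpers <monospace>matmul</monospace>, <monospace>ew</monospace>, <monospace>affine</monospace> and <monospace>zero_params</monospace> are names introduced here, and no training procedure is shown.</p>
<code language="python"><![CDATA[
```python
import math


def matmul(A, B):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def ew(f, *Ms):
    """Apply f element-wise across matrices of identical shape."""
    return [[f(*vals) for vals in zip(*rows)] for rows in zip(*Ms)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(X, W_x, H, W_h, b):
    """Compute X W_x + H W_h + b, with the bias b broadcast over the rows."""
    XW, HW = matmul(X, W_x), matmul(H, W_h)
    return [[xw + hw + bj for xw, hw, bj in zip(r1, r2, b)] for r1, r2 in zip(XW, HW)]


def gru_step(X_t, H_prev, p):
    """One GRU step for X_t of shape n x d and H_prev of shape n x h."""
    R = ew(sigmoid, affine(X_t, p["W_xr"], H_prev, p["W_hr"], p["b_r"]))      # Eq. (2)
    Z = ew(sigmoid, affine(X_t, p["W_xz"], H_prev, p["W_hz"], p["b_z"]))      # Eq. (3)
    RH = ew(lambda r, h: r * h, R, H_prev)                                    # R_t (.) H_{t-1}
    H_cand = ew(math.tanh, affine(X_t, p["W_xh"], RH, p["W_hh"], p["b_h"]))   # Eq. (4)
    return ew(lambda z, h, c: (1.0 - z) * h + z * c, Z, H_prev, H_cand)       # Eq. (5)


def zero_params(d, h):
    """Hypothetical helper: all-zero parameters, useful only for sanity checks."""
    zeros = lambda r, c: [[0.0] * c for _ in range(r)]
    return {"W_xr": zeros(d, h), "W_hr": zeros(h, h), "b_r": [0.0] * h,
            "W_xz": zeros(d, h), "W_hz": zeros(h, h), "b_z": [0.0] * h,
            "W_xh": zeros(d, h), "W_hh": zeros(h, h), "b_h": [0.0] * h}


if __name__ == "__main__":
    H_t = gru_step([[1.0, 2.0]], [[0.5, -0.5]], zero_params(2, 2))
    print(H_t)
```
]]></code>
<p>With all-zero parameters both gates evaluate to 0.5 and the candidate state to 0, so the new hidden state is half of the previous one, which gives a quick sanity check of the update rule in Eq. (5).</p>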
<p>When a variable is estimated, its value at the current time step is closely linked with its values at the previous and the following time steps [<xref ref-type="bibr" rid="ref-16">16</xref>]. <xref ref-type="fig" rid="fig-2">Fig. 2</xref> depicts the architecture of the GRU model.</p>
<fig id="fig-2"><label>Figure 2</label><caption><title>Structure of the GRU model</title></caption><graphic mimetype="image" mime-subtype="png" xlink:href="CMC_30784-fig-2.png"/></fig>
</sec>
<sec id="s2_3"><label>2.3</label><title>Hyperparameter Optimization</title>
<p>In order to optimally adjust the hyperparameters [<xref ref-type="bibr" rid="ref-17">17</xref>&#x2013;<xref ref-type="bibr" rid="ref-20">20</xref>] involved in the GRU model, the CS algorithm is applied. Like other evolutionary models, the CS algorithm starts with an initial population. In nature, cuckoos lay their eggs in the nests of other birds. Some of these eggs go unnoticed and are hatched and raised by the host bird, whereas others are recognized by the host bird and are not raised. The proportion of eggs that are raised indicates the suitability of the area: the more eggs survive in a region, the more profit that region yields [<xref ref-type="bibr" rid="ref-21">21</xref>]. The survival of the eggs is therefore the quantity that the algorithm seeks to improve, and the cuckoos search for the best area to extend the lifetime of their eggs. After hatching and maturing, the cuckoos form their own societies and communities, and each community has its own habitat. The best habitat among all the communities becomes the next destination of the cuckoos in the other groups.</p>
<p>All the groups immigrate towards the current best area, and each group settles in a region near it. The Egg Laying Radius (ELR) is computed from the number of eggs laid by each cuckoo and its distance from the current best area. Next, the cuckoos lay their eggs at random in nests within their egg-laying radius. This process is repeated until the best position, i.e., the area with the maximal profit, is found; this is the location where the largest number of cuckoos gather. To solve an optimization problem, the problem variables must be arranged as an array. In the Genetic Algorithm (GA) and Particle Swarm Optimization (PSO) techniques, this array is known as a &#x2018;chromosome&#x2019; and a &#x2018;particle position&#x2019;, respectively. However, in the CS algorithm, the array is called a &#x2018;habitat&#x2019;. In an <inline-formula id="ieqn-18"><mml:math id="mml-ieqn-18"><mml:msub><mml:mi>N</mml:mi><mml:mrow><mml:mi>v</mml:mi><mml:mi>a</mml:mi><mml:mi>r</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>-dimensional optimization problem, a habitat is a <inline-formula id="ieqn-19"><mml:math id="mml-ieqn-19"><mml:mn>1</mml:mn><mml:mo>&#x00D7;</mml:mo><mml:msub><mml:mi>N</mml:mi><mml:mrow><mml:mi>v</mml:mi><mml:mi>a</mml:mi><mml:mi>r</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> array that represents the current living position of a cuckoo. This is defined as follows.
<disp-formula id="eqn-7"><label>(7)</label><mml:math id="mml-eqn-7" display="block"><mml:mrow><mml:mtext mathvariant="italic">Habitat</mml:mtext></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>N</mml:mi><mml:mi>v</mml:mi><mml:mi>a</mml:mi><mml:mi>r</mml:mi></mml:mrow></mml:msub><mml:mo>]</mml:mo></mml:mrow></mml:math></disp-formula></p>
<p>The suitability (profit) of the current habitat is obtained by evaluating a profit function on it, as given in the following equation.
<disp-formula id="eqn-8"><label>(8)</label><mml:math id="mml-eqn-8" display="block"><mml:mrow><mml:mtext mathvariant="italic">Profit</mml:mtext></mml:mrow><mml:mo>=</mml:mo><mml:msub><mml:mi>f</mml:mi><mml:mrow><mml:mi>p</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mtext mathvariant="italic">habitat</mml:mtext></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:msub><mml:mi>f</mml:mi><mml:mrow><mml:mi>p</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>N</mml:mi><mml:mi>v</mml:mi><mml:mi>a</mml:mi><mml:mi>r</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula></p>
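<p>For illustration only, Eqs. (7) and (8) can be sketched in Python as shown below. This is a minimal sketch and not the implementation used in this study: the helper names <monospace>make_habitat</monospace> and <monospace>profit</monospace>, the variable bounds of &#x2212;5 and 5, and the negated sum-of-squares profit are all assumptions introduced for the example.</p>
<code language="python">
```python
import random


def make_habitat(n_var, var_low, var_hi, rng):
    """Return a 1 x n_var habitat: one candidate solution [x_1, ..., x_Nvar]."""
    return [rng.uniform(var_low, var_hi) for _ in range(n_var)]


def profit(habitat):
    """Hypothetical profit f_p(habitat): higher is better, maximum 0 at the origin."""
    return -sum(x * x for x in habitat)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
habitat = make_habitat(n_var=4, var_low=-5.0, var_hi=5.0, rng=rng)
print(len(habitat), profit(habitat))
```
</code>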
<p>The CS algorithm maximizes a profit function. To apply it to a minimization problem, the cost function is therefore multiplied by a minus sign. To start the optimization process, a candidate habitat matrix of size <inline-formula id="ieqn-20"><mml:math id="mml-ieqn-20"><mml:msub><mml:mi>N</mml:mi><mml:mrow><mml:mi>p</mml:mi><mml:mi>o</mml:mi><mml:mi>p</mml:mi></mml:mrow></mml:msub><mml:mo>&#x00D7;</mml:mo><mml:msub><mml:mi>N</mml:mi><mml:mrow><mml:mi>v</mml:mi><mml:mi>a</mml:mi><mml:mi>r</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> is generated. Then, a number of eggs is assigned to each of these initial habitats. In nature, a cuckoo lays about 5&#x2013;20 eggs. These values serve as the lower and the upper limits on the number of eggs assigned to each cuckoo. Each cuckoo lays its eggs within a maximum distance from its habitat, and this maximum range is the ELR. In optimization problems, the lower and the upper limits of the variables are denoted by <inline-formula id="ieqn-21"><mml:math id="mml-ieqn-21"><mml:mi>v</mml:mi><mml:mi>a</mml:mi><mml:msub><mml:mi>r</mml:mi><mml:mrow><mml:mi>l</mml:mi><mml:mi>o</mml:mi><mml:mi>w</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula id="ieqn-22"><mml:math id="mml-ieqn-22"><mml:mi>v</mml:mi><mml:mi>a</mml:mi><mml:msub><mml:mi>r</mml:mi><mml:mrow><mml:mi>h</mml:mi><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>, respectively. The ELR of a cuckoo depends on the total number of eggs, the number of eggs of the current cuckoo, and the lower and upper limits of the variables. Hence, the ELR is defined as given below.
<disp-formula id="eqn-9"><label>(9)</label><mml:math id="mml-eqn-9" display="block"><mml:mi>E</mml:mi><mml:mi>L</mml:mi><mml:mi>R</mml:mi><mml:mo>=</mml:mo><mml:mi>&#x03B1;</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mstyle displaystyle="true" scriptlevel="0"><mml:mfrac><mml:mrow><mml:mrow><mml:mtext mathvariant="italic">Number</mml:mtext></mml:mrow><mml:mtext>&#x00A0;</mml:mtext><mml:mi>o</mml:mi><mml:mi>f</mml:mi><mml:mtext>&#x00A0;</mml:mtext><mml:mrow><mml:mtext mathvariant="italic">current</mml:mtext></mml:mrow><mml:mtext>&#x00A0;</mml:mtext><mml:mrow><mml:mtext mathvariant="italic">cuckoo</mml:mtext></mml:mrow><mml:mtext>&#x00A0;</mml:mtext><mml:mi>e</mml:mi><mml:mi>g</mml:mi><mml:mi>g</mml:mi><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="italic">Total</mml:mtext></mml:mrow><mml:mtext>&#x00A0;</mml:mtext><mml:mrow><mml:mtext mathvariant="italic">number</mml:mtext></mml:mrow><mml:mtext>&#x00A0;</mml:mtext><mml:mi>o</mml:mi><mml:mi>f</mml:mi><mml:mtext>&#x00A0;</mml:mtext><mml:mi>e</mml:mi><mml:mi>g</mml:mi><mml:mi>g</mml:mi><mml:mi>s</mml:mi></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>&#x00D7;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>v</mml:mi><mml:mi>a</mml:mi><mml:msub><mml:mi>r</mml:mi><mml:mrow><mml:mi>h</mml:mi><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x2212;</mml:mo><mml:mi>v</mml:mi><mml:mi>a</mml:mi><mml:msub><mml:mi>r</mml:mi><mml:mrow><mml:mi>l</mml:mi><mml:mi>o</mml:mi><mml:mi>w</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula></p>
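<p>As a hedged illustration of Eq. (9), the short Python sketch below assigns between 5 and 20 eggs to each cuckoo and computes the corresponding ELR. The function name <monospace>egg_laying_radius</monospace>, the population size of 5, the value of 1.0 for &#x03B1; and the bounds of &#x2212;5 and 5 are assumptions made for this example and are not taken from the experiments reported here.</p>
<code language="python">
```python
import random


def egg_laying_radius(alpha, current_eggs, total_eggs, var_low, var_hi):
    """Eq. (9): ELR = alpha * (current eggs / total eggs) * (var_hi - var_low)."""
    return alpha * (current_eggs / total_eggs) * (var_hi - var_low)


rng = random.Random(42)  # fixed seed so the sketch is repeatable
n_pop = 5                # assumed population size
# Each cuckoo is assigned between 5 and 20 eggs, the range quoted in the text.
eggs = [rng.randint(5, 20) for _ in range(n_pop)]
total = sum(eggs)
radii = [egg_laying_radius(1.0, e, total, -5.0, 5.0) for e in eggs]
for e, r in zip(eggs, radii):
    print(e, round(r, 3))
```
</code>
<p>With &#x03B1; set to 1, the radii of all the cuckoos add up to the width of the variable range, so a cuckoo with more eggs receives a proportionally larger radius.</p>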
<p>Here, <inline-formula id="ieqn-23"><mml:math id="mml-ieqn-23"><mml:mi>&#x03B1;</mml:mi></mml:math></inline-formula> is a parameter that controls the maximum value of the ELR. Each cuckoo lays its eggs randomly in host birds&#x2019; nests within its ELR. Afterwards, the eggs that are less similar to the host birds&#x2019; own eggs are recognized and thrown away. Accordingly, after the egg-laying process, <inline-formula id="ieqn-24"><mml:math id="mml-ieqn-24"><mml:mi>p</mml:mi></mml:math></inline-formula>&#x0025; of all the eggs (usually 10&#x0025;), namely those with the lowest profit values, are discarded. The remaining chicks in the host nests are then fed and raised.</p>
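<p>The egg-laying and egg-elimination steps can be sketched in Python as follows. This is a simplified sketch under stated assumptions and not the implementation used in this study: each egg is placed by perturbing every coordinate uniformly within the ELR (a box rather than a true radius), the profit is a hypothetical negated sum of squares, and the helper names <monospace>lay_eggs</monospace> and <monospace>discard_worst</monospace> are introduced only for the example.</p>
<code language="python">
```python
import random


def lay_eggs(habitat, n_eggs, elr, var_low, var_hi, rng):
    """Place n_eggs new habitats within +/- elr of the cuckoo, clipped to the bounds."""
    eggs = []
    for _ in range(n_eggs):
        egg = [min(var_hi, max(var_low, x + rng.uniform(-elr, elr))) for x in habitat]
        eggs.append(egg)
    return eggs


def discard_worst(eggs, profit, p=0.10):
    """Drop the fraction p of eggs with the lowest profit and return the survivors."""
    n_drop = round(len(eggs) * p)
    ranked = sorted(eggs, key=profit, reverse=True)  # best profit first
    return ranked[:len(ranked) - n_drop]


def profit(habitat):
    # Hypothetical profit: negated sum of squares, so the origin is the best point.
    return -sum(x * x for x in habitat)


rng = random.Random(1)  # fixed seed so the sketch is repeatable
eggs = lay_eggs([1.0, -2.0], n_eggs=20, elr=0.5, var_low=-5.0, var_hi=5.0, rng=rng)
survivors = discard_worst(eggs, profit, p=0.10)
print(len(eggs), len(survivors))
```
</code>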
</sec>
</sec>
<sec id="s3"><label>3</label><title>Results and Discussion</title>
<p>In this section, the performance of the proposed ICSOA-DLPEC model was experimentally validated using the Enron email dataset [<xref ref-type="bibr" rid="ref-22">22</xref>], which can be accessed at <uri xlink:href="https://www.cs.cmu.edu/~enron/">https://www.cs.cmu.edu/~enron/</uri>. It includes 7,781 legitimate emails and 999 phishing emails.</p>
<p><xref ref-type="fig" rid="fig-3">Fig. 3</xref> exhibits the four confusion matrices generated by the proposed ICSOA-DLPEC model on phishing email classification with distinct training (TR) and testing (TS) splits. With a TR/TS of 90:10, the proposed ICSOA-DLPEC model correctly classified 773 and 99 samples under legitimate and phishing classes, respectively. Similarly, with a TR/TS of 80:20, the model correctly classified 1,543 and 208 samples under legitimate and phishing classes, respectively. Moreover, with a TR/TS of 70:30, the ICSOA-DLPEC model correctly classified 2,333 and 284 samples under legitimate and phishing classes, respectively.</p>
<fig id="fig-3"><label>Figure 3</label><caption><title>Confusion matrices of the ICSOA-DLPEC model</title></caption><graphic mimetype="image" mime-subtype="png" xlink:href="CMC_30784-fig-3.png"/></fig>
<p><xref ref-type="table" rid="table-1">Table 1</xref> and <xref ref-type="fig" rid="fig-4">Fig. 4</xref> show the overall classification outcomes accomplished by the proposed ICSOA-DLPEC model on phishing emails. With 90:10 TR/TS data, the proposed ICSOA-DLPEC model attained average <inline-formula id="ieqn-25"><mml:math id="mml-ieqn-25"><mml:mi>a</mml:mi><mml:mi>c</mml:mi><mml:mi>c</mml:mi><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mi>y</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>, <inline-formula id="ieqn-26"><mml:math id="mml-ieqn-26"><mml:mi>p</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:msub><mml:mi>c</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>, <inline-formula id="ieqn-27"><mml:math id="mml-ieqn-27"><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:msub><mml:mi>a</mml:mi><mml:mrow><mml:mi>l</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula id="ieqn-28"><mml:math id="mml-ieqn-28"><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mrow><mml:mtext mathvariant="italic">score</mml:mtext></mml:mrow></mml:mrow></mml:msub></mml:math></inline-formula> values of 99.32&#x0025;, 97.53&#x0025;, 99.18&#x0025; and 98.34&#x0025;, respectively.
Besides, with 80:20 TR/TS data, the proposed ICSOA-DLPEC model attained average <inline-formula id="ieqn-29"><mml:math id="mml-ieqn-29"><mml:mi>a</mml:mi><mml:mi>c</mml:mi><mml:mi>c</mml:mi><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mi>y</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>, <inline-formula id="ieqn-30"><mml:math id="mml-ieqn-30"><mml:mi>p</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:msub><mml:mi>c</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>, <inline-formula id="ieqn-31"><mml:math id="mml-ieqn-31"><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:msub><mml:mi>a</mml:mi><mml:mrow><mml:mi>l</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula id="ieqn-32"><mml:math id="mml-ieqn-32"><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mrow><mml:mtext mathvariant="italic">score</mml:mtext></mml:mrow></mml:mrow></mml:msub></mml:math></inline-formula> values of 99.72&#x0025;, 99.02&#x0025;, 99.63&#x0025; and 99.33&#x0025;, respectively.
Moreover, with 70:30 TR/TS data, the proposed ICSOA-DLPEC model achieved average <inline-formula id="ieqn-33"><mml:math id="mml-ieqn-33"><mml:mi>a</mml:mi><mml:mi>c</mml:mi><mml:mi>c</mml:mi><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mi>y</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>, <inline-formula id="ieqn-34"><mml:math id="mml-ieqn-34"><mml:mi>p</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:msub><mml:mi>c</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>, <inline-formula id="ieqn-35"><mml:math id="mml-ieqn-35"><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:msub><mml:mi>a</mml:mi><mml:mrow><mml:mi>l</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula id="ieqn-36"><mml:math id="mml-ieqn-36"><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mrow><mml:mtext mathvariant="italic">score</mml:mtext></mml:mrow></mml:mrow></mml:msub></mml:math></inline-formula> values of 99.35&#x0025;, 98.44&#x0025;, 98.29&#x0025; and 98.37&#x0025;, respectively.
Furthermore, with 60:40 TR/TS data, the proposed ICSOA-DLPEC model attained average <inline-formula id="ieqn-37"><mml:math id="mml-ieqn-37"><mml:mi>a</mml:mi><mml:mi>c</mml:mi><mml:mi>c</mml:mi><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mi>y</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>, <inline-formula id="ieqn-38"><mml:math id="mml-ieqn-38"><mml:mi>p</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:msub><mml:mi>c</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>, <inline-formula id="ieqn-39"><mml:math id="mml-ieqn-39"><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:msub><mml:mi>a</mml:mi><mml:mrow><mml:mi>l</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula id="ieqn-40"><mml:math id="mml-ieqn-40"><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mrow><mml:mtext mathvariant="italic">score</mml:mtext></mml:mrow></mml:mrow></mml:msub></mml:math></inline-formula> values of 98.46&#x0025;, 97.04&#x0025;, 95.15&#x0025; and 96.07&#x0025;, respectively.</p>
<table-wrap id="table-1"><label>Table 1</label><caption><title>Overall classification results of the ICSOA-DLPEC model</title></caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th align="left">Class labels</th>
<th align="left">Accuracy</th>
<th align="left">Precision</th>
<th align="left">Recall</th>
<th align="left">F-score</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center" colspan="5">Training/Testing (90:10)</td>
</tr>
<tr>
<td align="left">Legitimate</td>
<td align="left">99.32</td>
<td align="left">99.87</td>
<td align="left">99.36</td>
<td align="left">99.61</td>
</tr>
<tr>
<td align="left">Phishing</td>
<td align="left">99.32</td>
<td align="left">95.19</td>
<td align="left">99.00</td>
<td align="left">97.06</td>
</tr>
<tr>
<td align="left">Average</td>
<td align="left">99.32</td>
<td align="left">97.53</td>
<td align="left">99.18</td>
<td align="left">98.34</td>
</tr>
<tr>
<td align="center" colspan="5">Training/Testing (80:20)</td>
</tr>
<tr>
<td align="left">Legitimate</td>
<td align="left">99.72</td>
<td align="left">99.94</td>
<td align="left">99.74</td>
<td align="left">99.84</td>
</tr>
<tr>
<td align="left">Phishing</td>
<td align="left">99.72</td>
<td align="left">98.11</td>
<td align="left">99.52</td>
<td align="left">98.81</td>
</tr>
<tr>
<td align="left">Average</td>
<td align="left">99.72</td>
<td align="left">99.02</td>
<td align="left">99.63</td>
<td align="left">99.33</td>
</tr>
<tr>
<td align="center" colspan="5">Training/Testing (70:30)</td>
</tr>
<tr>
<td align="left">Legitimate</td>
<td align="left">99.35</td>
<td align="left">99.62</td>
<td align="left">99.66</td>
<td align="left">99.64</td>
</tr>
<tr>
<td align="left">Phishing</td>
<td align="left">99.35</td>
<td align="left">97.26</td>
<td align="left">96.93</td>
<td align="left">97.09</td>
</tr>
<tr>
<td align="left">Average</td>
<td align="left">99.35</td>
<td align="left">98.44</td>
<td align="left">98.29</td>
<td align="left">98.37</td>
</tr>
<tr>
<td align="center" colspan="5">Training/Testing (60:40)</td>
</tr>
<tr>
<td align="left">Legitimate</td>
<td align="left">98.46</td>
<td align="left">98.85</td>
<td align="left">99.42</td>
<td align="left">99.14</td>
</tr>
<tr>
<td align="left">Phishing</td>
<td align="left">98.46</td>
<td align="left">95.23</td>
<td align="left">90.89</td>
<td align="left">93.01</td>
</tr>
<tr>
<td align="left">Average</td>
<td align="left">98.46</td>
<td align="left">97.04</td>
<td align="left">95.15</td>
<td align="left">96.07</td>
</tr>
</tbody>
</table>
</table-wrap><fig id="fig-4"><label>Figure 4</label><caption><title>Phishing email classification outcomes of the ICSOA-DLPEC model</title></caption><graphic mimetype="image" mime-subtype="png" xlink:href="CMC_30784-fig-4.png"/></fig>
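<p>As a worked check on how the per-class values in <xref ref-type="table" rid="table-1">Table 1</xref> relate to confusion-matrix counts, the Python sketch below recomputes the 90:10 row. The diagonal counts (773 and 99) are those quoted in the text for <xref ref-type="fig" rid="fig-3">Fig. 3</xref>. The off-diagonal counts (5 and 1) are not stated in the text; they are assumed here because they reproduce the tabulated values, and they should be verified against the figure. The helper name <monospace>class_metrics</monospace> is likewise introduced only for this example.</p>
<code language="python">
```python
def class_metrics(tp, fp, fn):
    """Precision, recall and F-score (in percent) for one class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_score = 2 * precision * recall / (precision + recall)
    return tuple(round(100 * v, 2) for v in (precision, recall, f_score))


# 90:10 split. Diagonal counts (773, 99) are quoted in the text;
# off-diagonal counts (5, 1) are ASSUMED, chosen to match Table 1.
legit_ok, legit_as_phish = 773, 5
phish_ok, phish_as_legit = 99, 1

legit = class_metrics(tp=legit_ok, fp=phish_as_legit, fn=legit_as_phish)
phish = class_metrics(tp=phish_ok, fp=legit_as_phish, fn=phish_as_legit)
total = legit_ok + legit_as_phish + phish_ok + phish_as_legit
accuracy = round(100 * (legit_ok + phish_ok) / total, 2)

print(accuracy, legit, phish)
```
</code>
<p>Under these assumed counts, the sketch yields an accuracy of 99.32&#x0025;, and precision, recall and F-score values of 99.87&#x0025;, 99.36&#x0025; and 99.61&#x0025; for the legitimate class and 95.19&#x0025;, 99.00&#x0025; and 97.06&#x0025; for the phishing class, which agree with the 90:10 rows of <xref ref-type="table" rid="table-1">Table 1</xref>.</p>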
<p>A Receiver Operating Characteristic (ROC) analysis was conducted on the ICSOA-DLPEC model for phishing email classification, and the results are shown in <xref ref-type="fig" rid="fig-5">Fig. 5</xref>. The figure shows that the proposed ICSOA-DLPEC model attained high ROC values for both the phishing and legitimate class labels.</p>
<fig id="fig-5"><label>Figure 5</label><caption><title>ROC examination results of the ICSOA-DLPEC model</title></caption><graphic mimetype="image" mime-subtype="png" xlink:href="CMC_30784-fig-5.png"/></fig>
<p>The Training Accuracy (TA) and Validation Accuracy (VA) values attained by the proposed ICSOA-DLPEC model on phishing email classification are demonstrated in <xref ref-type="fig" rid="fig-6">Fig. 6</xref>. The experimental outcomes imply that the proposed ICSOA-DLPEC model attained high TA and VA values, with the VA values being higher than the TA values. Likewise, the Training Loss (TL) and Validation Loss (VL) values achieved by the proposed ICSOA-DLPEC model on phishing email classification are shown in <xref ref-type="fig" rid="fig-7">Fig. 7</xref>. The experimental outcomes show that the proposed ICSOA-DLPEC model accomplished low TL and VL values, with the VL values being lower than the TL values.</p>
<fig id="fig-6"><label>Figure 6</label><caption><title>TA/VA examination results of the ICSOA-DLPEC model</title></caption><graphic mimetype="image" mime-subtype="png" xlink:href="CMC_30784-fig-6.png"/></fig><fig id="fig-7"><label>Figure 7</label><caption><title>TL/VL examination results of the ICSOA-DLPEC model</title></caption><graphic mimetype="image" mime-subtype="png" xlink:href="CMC_30784-fig-7.png"/></fig>
<p><xref ref-type="fig" rid="fig-8">Fig. 8</xref> shows the precision-recall curve obtained by the ICSOA-DLPEC model on phishing email classification. The results indicate that the proposed ICSOA-DLPEC model gained the maximum precision-recall values in both classes.</p>
<fig id="fig-8"><label>Figure 8</label><caption><title>Precision recall examination results of the ICSOA-DLPEC model</title></caption><graphic mimetype="image" mime-subtype="png" xlink:href="CMC_30784-fig-8.png"/></fig>
<p>To validate the improved performance of the proposed ICSOA-DLPEC model on phishing email classification, a comparative examination was conducted against recent models, and the results are shown in <xref ref-type="table" rid="table-2">Table 2</xref>.</p>
<table-wrap id="table-2"><label>Table 2</label><caption><title>Comparative study results of the ICSOA-DLPEC model and other recent models</title></caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th align="left">Methods</th>
<th align="left">Accuracy</th>
<th align="left">Precision</th>
<th align="left">Recall</th>
<th align="left">F-score</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">LSTM model</td>
<td align="left">98.16</td>
<td align="left">93.85</td>
<td align="left">83.21</td>
<td align="left">88.18</td>
</tr>
<tr>
<td align="left">CNN model</td>
<td align="left">97.58</td>
<td align="left">86.27</td>
<td align="left">87.53</td>
<td align="left">85.19</td>
</tr>
<tr>
<td align="left">THEMIS</td>
<td align="left">99.34</td>
<td align="left">98.96</td>
<td align="left">99.12</td>
<td align="left">99.03</td>
</tr>
<tr>
<td align="left">ICSOA-DLPEC</td>
<td align="left">99.72</td>
<td align="left">99.02</td>
<td align="left">99.63</td>
<td align="left">99.33</td>
</tr>
</tbody>
</table>
</table-wrap>
<p><xref ref-type="fig" rid="fig-9">Fig. 9</xref> shows the <inline-formula id="ieqn-41"><mml:math id="mml-ieqn-41"><mml:mi>a</mml:mi><mml:mi>c</mml:mi><mml:mi>c</mml:mi><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mi>y</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula id="ieqn-42"><mml:math id="mml-ieqn-42"><mml:mi>p</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:msub><mml:mi>c</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> results of the ICSOA-DLPEC model and other existing models. The figure reveals that the CNN model achieved the lowest <inline-formula id="ieqn-43"><mml:math id="mml-ieqn-43"><mml:mi>a</mml:mi><mml:mi>c</mml:mi><mml:mi>c</mml:mi><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mi>y</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula id="ieqn-44"><mml:math id="mml-ieqn-44"><mml:mi>p</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:msub><mml:mi>c</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> values of 97.58&#x0025; and 86.27&#x0025;, respectively. The LSTM model gained slightly higher <inline-formula id="ieqn-45"><mml:math id="mml-ieqn-45"><mml:mi>a</mml:mi><mml:mi>c</mml:mi><mml:mi>c</mml:mi><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mi>y</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula id="ieqn-46"><mml:math id="mml-ieqn-46"><mml:mi>p</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:msub><mml:mi>c</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> values of 98.16&#x0025; and 93.85&#x0025;, respectively.
Though the THEMIS model gained reasonable <inline-formula id="ieqn-47"><mml:math id="mml-ieqn-47"><mml:mi>a</mml:mi><mml:mi>c</mml:mi><mml:mi>c</mml:mi><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mi>y</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula id="ieqn-48"><mml:math id="mml-ieqn-48"><mml:mi>p</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:msub><mml:mi>c</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> values of 99.34&#x0025; and 98.96&#x0025;, respectively, the proposed ICSOA-DLPEC model achieved the best outcomes with the maximum <inline-formula id="ieqn-49"><mml:math id="mml-ieqn-49"><mml:mi>a</mml:mi><mml:mi>c</mml:mi><mml:mi>c</mml:mi><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mi>y</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula id="ieqn-50"><mml:math id="mml-ieqn-50"><mml:mi>p</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:msub><mml:mi>c</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> values of 99.72&#x0025; and 99.02&#x0025;, respectively.</p>
<fig id="fig-9"><label>Figure 9</label><caption><title>Comparative <inline-formula id="ieqn-61"><mml:math id="mml-ieqn-61"><mml:mi>a</mml:mi><mml:mi>c</mml:mi><mml:mi>c</mml:mi><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mi>y</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula id="ieqn-62"><mml:math id="mml-ieqn-62"><mml:mi>p</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:msub><mml:mi>c</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> investigation results of the ICSOA-DLPEC model</title></caption><graphic mimetype="image" mime-subtype="png" xlink:href="CMC_30784-fig-9.png"/></fig>
<p><xref ref-type="fig" rid="fig-10">Fig. 10</xref> portrays the <inline-formula id="ieqn-51"><mml:math id="mml-ieqn-51"><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:msub><mml:mi>a</mml:mi><mml:mrow><mml:mi>l</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula id="ieqn-52"><mml:math id="mml-ieqn-52"><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mrow><mml:mtext mathvariant="italic">score</mml:mtext></mml:mrow></mml:mrow></mml:msub></mml:math></inline-formula> results accomplished by the proposed ICSOA-DLPEC model and other existing models. The CNN model attained <inline-formula id="ieqn-53"><mml:math id="mml-ieqn-53"><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:msub><mml:mi>a</mml:mi><mml:mrow><mml:mi>l</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula id="ieqn-54"><mml:math id="mml-ieqn-54"><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mrow><mml:mtext mathvariant="italic">score</mml:mtext></mml:mrow></mml:mrow></mml:msub></mml:math></inline-formula> values of 87.53&#x0025; and 85.19&#x0025;, respectively, the latter being the lowest among the compared models. The LSTM model attained <inline-formula id="ieqn-55"><mml:math id="mml-ieqn-55"><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:msub><mml:mi>a</mml:mi><mml:mrow><mml:mi>l</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula id="ieqn-56"><mml:math id="mml-ieqn-56"><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mrow><mml:mtext mathvariant="italic">score</mml:mtext></mml:mrow></mml:mrow></mml:msub></mml:math></inline-formula> values of 83.21&#x0025; and 88.18&#x0025;, respectively; its recall is the lowest among the compared models, although its F-score is slightly higher than that of the CNN model.
Though the THEMIS model gained reasonable <inline-formula id="ieqn-57"><mml:math id="mml-ieqn-57"><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:msub><mml:mi>a</mml:mi><mml:mrow><mml:mi>l</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula id="ieqn-58"><mml:math id="mml-ieqn-58"><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mrow><mml:mtext mathvariant="italic">score</mml:mtext></mml:mrow></mml:mrow></mml:msub></mml:math></inline-formula> values of 99.12&#x0025; and 99.03&#x0025;, respectively, the proposed ICSOA-DLPEC model achieved the best outcomes with the maximum <inline-formula id="ieqn-59"><mml:math id="mml-ieqn-59"><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:msub><mml:mi>a</mml:mi><mml:mrow><mml:mi>l</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula id="ieqn-60"><mml:math id="mml-ieqn-60"><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mrow><mml:mtext mathvariant="italic">score</mml:mtext></mml:mrow></mml:mrow></mml:msub></mml:math></inline-formula> values of 99.63&#x0025; and 99.33&#x0025;, respectively. Based on these results, the proposed ICSOA-DLPEC model is effective in the detection and classification of phishing emails.</p>
<fig id="fig-10"><label>Figure 10</label><caption><title>Comparative <inline-formula id="ieqn-63"><mml:math id="mml-ieqn-63"><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:msub><mml:mi>a</mml:mi><mml:mrow><mml:mi>l</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula id="ieqn-64"><mml:math id="mml-ieqn-64"><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mrow><mml:mtext mathvariant="italic">score</mml:mtext></mml:mrow></mml:mrow></mml:msub></mml:math></inline-formula> investigation results of the ICSOA-DLPEC model</title></caption><graphic mimetype="image" mime-subtype="png" xlink:href="CMC_30784-fig-10.png"/></fig>
</sec>
<sec id="s4"><label>4</label><title>Conclusion</title>
<p>In this study, a new ICSOA-DLPEC technique has been developed to effectively classify emails as legitimate or phishing. First, data pre-processing is performed in three steps, namely email cleaning, tokenization and stop-word elimination. Following this, the N-gram approach is applied to extract useful feature vectors. In addition, the GRU model is employed to detect and classify phishing emails. Finally, the CS algorithm is applied to fine-tune the parameters involved in the GRU model. The performance of the proposed ICSOA-DLPEC model was experimentally validated using a benchmark dataset, and the results were assessed under several measures. The comparative analysis showed that the ICSOA-DLPEC model outperformed the recent approaches considered, with a maximum accuracy of 99.72&#x0025;. In the future, hybrid DL models can be explored to increase the detection rate further.</p>
</sec>
</body>
<back>
<fn-group>
<fn fn-type="other"><p><bold>Funding Statement:</bold> This research was supported in part by Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (NRF-2021R1A6A1A03039493), and in part by the NRF grant funded by the Korea government (MSIT) (NRF-2022R1A2C1004401).</p></fn>
<fn fn-type="conflict"><p><bold>Conflicts of Interest:</bold> The authors declare that they have no conflicts of interest to report regarding the present study.</p></fn>
</fn-group>
<ref-list content-type="authoryear">
<title>References</title>
<ref id="ref-1"><label>[1]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>S.</given-names> <surname>Salloum</surname></string-name>, <string-name><given-names>T.</given-names> <surname>Gaber</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Vadera</surname></string-name> and <string-name><given-names>K.</given-names> <surname>Shaalan</surname></string-name></person-group>, &#x201C;<article-title>Phishing email detection using natural language processing techniques: A literature survey</article-title>,&#x201D; <source>Procedia Computer Science</source>, vol. <volume>189</volume>, pp. <fpage>19</fpage>&#x2013;<lpage>28</lpage>, <year>2021</year>.</mixed-citation></ref>
<ref id="ref-2"><label>[2]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>T.</given-names> <surname>Gangavarapu</surname></string-name>, <string-name><given-names>C. D.</given-names> <surname>Jaidhar</surname></string-name> and <string-name><given-names>B.</given-names> <surname>Chanduka</surname></string-name></person-group>, &#x201C;<article-title>Applicability of machine learning in spam and phishing email filtering: Review and approaches</article-title>,&#x201D; <source>Artificial Intelligence Review</source>, vol. <volume>53</volume>, no. <issue>7</issue>, pp. <fpage>5019</fpage>&#x2013;<lpage>5081</lpage>, <year>2020</year>.</mixed-citation></ref>
<ref id="ref-3"><label>[3]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>Y.</given-names> <surname>Fang</surname></string-name>, <string-name><given-names>C.</given-names> <surname>Zhang</surname></string-name>, <string-name><given-names>C.</given-names> <surname>Huang</surname></string-name>, <string-name><given-names>L.</given-names> <surname>Liu</surname></string-name> and <string-name><given-names>Y.</given-names> <surname>Yang</surname></string-name></person-group>, &#x201C;<article-title>Phishing email detection using improved rcnn model with multilevel vectors and attention mechanism</article-title>,&#x201D; <source>IEEE Access</source>, vol. <volume>7</volume>, pp. <fpage>56329</fpage>&#x2013;<lpage>56340</lpage>, <year>2019</year>.</mixed-citation></ref>
<ref id="ref-4"><label>[4]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>A.</given-names> <surname>Karim</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Azam</surname></string-name>, <string-name><given-names>B.</given-names> <surname>Shanmugam</surname></string-name>, <string-name><given-names>K.</given-names> <surname>Kannoorpatti</surname></string-name> and <string-name><given-names>M.</given-names> <surname>Alazab</surname></string-name></person-group>, &#x201C;<article-title>A comprehensive survey for intelligent spam email detection</article-title>,&#x201D; <source>IEEE Access</source>, vol. <volume>7</volume>, pp. <fpage>168261</fpage>&#x2013;<lpage>168295</lpage>, <year>2019</year>.</mixed-citation></ref>
<ref id="ref-5"><label>[5]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>R.</given-names> <surname>Vinayakumar</surname></string-name>, <string-name><given-names>B.</given-names> <surname>Ganesh</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Kumar</surname></string-name>, <string-name><given-names>K. P.</given-names> <surname>Soman</surname></string-name> and <string-name><given-names>P.</given-names> <surname>Poornachandran</surname></string-name></person-group>, &#x201C;<article-title>DeepAnti-PhishNet: Applying deep neural networks for phishing email detection</article-title>,&#x201D; in <conf-name>Proc. of the 1st AntiPhishing Shared Pilot at 4th ACM Int. Workshop on Security and Privacy Analytics (IWSPA 2018)</conf-name>, <conf-loc>Tempe, Arizona, USA</conf-loc>, pp. <fpage>1</fpage>&#x2013;<lpage>11</lpage>, <year>2018</year>.</mixed-citation></ref>
<ref id="ref-6"><label>[6]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>A.</given-names> <surname>Vazhayil</surname></string-name>, <string-name><given-names>N. B.</given-names> <surname>Harikrishnan</surname></string-name>, <string-name><given-names>R.</given-names> <surname>Vinayakumar</surname></string-name>, <string-name><given-names>K. P.</given-names> <surname>Soman</surname></string-name> and <string-name><given-names>A. D. R.</given-names> <surname>Verma</surname></string-name></person-group>, &#x201C;<article-title>PED-ML: Phishing email detection using classical machine learning techniques</article-title>,&#x201D; in <conf-name>Proc. of the 1st AntiPhishing Shared Pilot at 4th ACM Int. Workshop on Security and Privacy Analytics (IWSPA 2018)</conf-name>, <conf-loc>Tempe, Arizona, USA</conf-loc>, pp. <fpage>1</fpage>&#x2013;<lpage>8</lpage>, <year>2018</year>.</mixed-citation></ref>
<ref id="ref-7"><label>[7]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>S.</given-names> <surname>Kumar Birthriya</surname></string-name> and <string-name><given-names>A. K.</given-names> <surname>Jain</surname></string-name></person-group>, &#x201C;<article-title>A comprehensive survey of phishing email detection and protection techniques</article-title>,&#x201D; <source>Information Security Journal: A Global Perspective</source>, pp. <fpage>1</fpage>&#x2013;<lpage>30</lpage>, <year>2021</year>. <uri xlink:href="https://doi.org/10.1080/19393555.2021.1959678">https://doi.org/10.1080/19393555.2021.1959678</uri>.</mixed-citation></ref>
<ref id="ref-8"><label>[8]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>J.</given-names> <surname>Lee</surname></string-name>, <string-name><given-names>F.</given-names> <surname>Tang</surname></string-name>, <string-name><given-names>P.</given-names> <surname>Ye</surname></string-name>, <string-name><given-names>F.</given-names> <surname>Abbasi</surname></string-name>, <string-name><given-names>P.</given-names> <surname>Hay</surname></string-name> <etal>et al.,</etal></person-group> &#x201C;<article-title>D-fence: A flexible, efficient, and comprehensive phishing email detection system</article-title>,&#x201D; in <conf-name>2021 IEEE European Symp. on Security and Privacy (EuroS&#x0026;P)</conf-name>, <conf-loc>Vienna, Austria</conf-loc>, pp. <fpage>578</fpage>&#x2013;<lpage>597</lpage>, <year>2021</year>.</mixed-citation></ref>
<ref id="ref-9"><label>[9]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>J.</given-names> <surname>Rastenis</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Ramanauskait&#x0117;</surname></string-name>, <string-name><given-names>I.</given-names> <surname>Suzdalev</surname></string-name>, <string-name><given-names>K.</given-names> <surname>Tunaityt&#x0117;</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Janulevi&#x010D;ius</surname></string-name> <etal>et al.,</etal></person-group> &#x201C;<article-title>Multi-language spam/phishing classification by email body text: Toward automated security incident investigation</article-title>,&#x201D; <source>Electronics</source>, vol. <volume>10</volume>, no. <issue>6</issue>, pp. <fpage>1</fpage>&#x2013;<lpage>10</lpage>, <year>2021</year>.</mixed-citation></ref>
<ref id="ref-10"><label>[10]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>A.</given-names> <surname>Sundararaj</surname></string-name> and <string-name><given-names>G.</given-names> <surname>Kul</surname></string-name></person-group>, &#x201C;<article-title>Impact analysis of training data characteristics for phishing email classification</article-title>,&#x201D; <source>Journal of Wireless Mobile Networks, Ubiquitous Computing, and Dependable Applications</source>, vol. <volume>12</volume>, no. <issue>2</issue>, pp. <fpage>85</fpage>&#x2013;<lpage>98</lpage>, <year>2021</year>.</mixed-citation></ref>
<ref id="ref-11"><label>[11]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>C.</given-names> <surname>McGinley</surname></string-name> and <string-name><given-names>S. A. S.</given-names> <surname>Monroy</surname></string-name></person-group>, &#x201C;<article-title>Convolutional neural network optimization for phishing email classification</article-title>,&#x201D; in <conf-name>2021 IEEE Int. Conf. on Big Data (Big Data)</conf-name>, <conf-loc>Orlando, FL, USA</conf-loc>, pp. <fpage>5609</fpage>&#x2013;<lpage>5613</lpage>, <year>2021</year>.</mixed-citation></ref>
<ref id="ref-12"><label>[12]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>W.</given-names> <surname>Pan</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Li</surname></string-name>, <string-name><given-names>L.</given-names> <surname>Gao</surname></string-name>, <string-name><given-names>L.</given-names> <surname>Yue</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>Yang</surname></string-name> <etal>et al.,</etal></person-group> &#x201C;<article-title>Semantic graph neural network: A conversion from spam email classification to graph classification</article-title>,&#x201D; <source>Scientific Programming</source>, vol. <volume>2022</volume>, pp. <fpage>1</fpage>&#x2013;<lpage>8</lpage>, <year>2022</year>.</mixed-citation></ref>
<ref id="ref-13"><label>[13]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>R.</given-names> <surname>Nayak</surname></string-name>, <string-name><given-names>S. A.</given-names> <surname>Jiwani</surname></string-name> and <string-name><given-names>B.</given-names> <surname>Rajitha</surname></string-name></person-group>, &#x201C;<article-title>Spam email detection using machine learning algorithm</article-title>,&#x201D; <source>Materials Today: Proceedings</source>, pp. <fpage>1</fpage>&#x2013;<lpage>5</lpage>, <year>2021</year>. <uri xlink:href="https://doi.org/10.1016/j.matpr.2021.03.147">https://doi.org/10.1016/j.matpr.2021.03.147</uri>.</mixed-citation></ref>
<ref id="ref-14"><label>[14]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>F.</given-names> <surname>Hossain</surname></string-name>, <string-name><given-names>M. N.</given-names> <surname>Uddin</surname></string-name> and <string-name><given-names>R. K.</given-names> <surname>Halder</surname></string-name></person-group>, &#x201C;<article-title>Analysis of optimized machine learning and deep learning techniques for spam detection</article-title>,&#x201D; in <conf-name>2021 IEEE Int. IOT, Electronics and Mechatronics Conf. (IEMTRONICS)</conf-name>, <conf-loc>Toronto, ON, Canada</conf-loc>, pp. <fpage>1</fpage>&#x2013;<lpage>7</lpage>, <year>2021</year>.</mixed-citation></ref>
<ref id="ref-15"><label>[15]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>Y.</given-names> <surname>Zhang</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Tang</surname></string-name>, <string-name><given-names>Z.</given-names> <surname>He</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Tan</surname></string-name> and <string-name><given-names>C.</given-names> <surname>Li</surname></string-name></person-group>, &#x201C;<article-title>A novel displacement prediction method using gated recurrent unit model with time series analysis in the Erdaohe landslide</article-title>,&#x201D; <source>Natural Hazards</source>, vol. <volume>105</volume>, no. <issue>1</issue>, pp. <fpage>783</fpage>&#x2013;<lpage>813</lpage>, <year>2021</year>.</mixed-citation></ref>
<ref id="ref-16"><label>[16]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>C.</given-names> <surname>Xu</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Shen</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Du</surname></string-name> and <string-name><given-names>F.</given-names> <surname>Zhang</surname></string-name></person-group>, &#x201C;<article-title>An intrusion detection system using a deep neural network with gated recurrent units</article-title>,&#x201D; <source>IEEE Access</source>, vol. <volume>6</volume>, pp. <fpage>48697</fpage>&#x2013;<lpage>48707</lpage>, <year>2018</year>.</mixed-citation></ref>
<ref id="ref-17"><label>[17]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>K.</given-names> <surname>Shankar</surname></string-name>, <string-name><given-names>E.</given-names> <surname>Perumal</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Elhoseny</surname></string-name>, <string-name><given-names>F.</given-names> <surname>Taher</surname></string-name>, <string-name><given-names>B. B.</given-names> <surname>Gupta</surname></string-name> <etal>et al.,</etal></person-group> &#x201C;<article-title>Synergic deep learning for smart health diagnosis of COVID-19 for connected living and smart cities</article-title>,&#x201D; <source>ACM Transactions on Internet Technology</source>, vol. <volume>22</volume>, no. <issue>3</issue>, pp. <fpage>1</fpage>&#x2013;<lpage>14</lpage>, <year>2022</year>.</mixed-citation></ref>
<ref id="ref-18"><label>[18]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>N.</given-names> <surname>Metawa</surname></string-name>, <string-name><given-names>I. V.</given-names> <surname>Pustokhina</surname></string-name>, <string-name><given-names>D. A.</given-names> <surname>Pustokhin</surname></string-name>, <string-name><given-names>K.</given-names> <surname>Shankar</surname></string-name> and <string-name><given-names>M.</given-names> <surname>Elhoseny</surname></string-name></person-group>, &#x201C;<article-title>Computational intelligence-based financial crisis prediction model using feature subset selection with optimal deep belief network</article-title>,&#x201D; <source>Big Data</source>, vol. <volume>9</volume>, no. <issue>2</issue>, pp. <fpage>100</fpage>&#x2013;<lpage>115</lpage>, <year>2021</year>.</mixed-citation></ref>
<ref id="ref-19"><label>[19]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>R.</given-names> <surname>Saravanakumar</surname></string-name>, <string-name><given-names>N.</given-names> <surname>Krishnaraj</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Venkatraman</surname></string-name>, <string-name><given-names>B.</given-names> <surname>Sivakumar</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Prasanna</surname></string-name> <etal>et al.,</etal></person-group> &#x201C;<article-title>Hierarchical symbolic analysis and particle swarm optimization based fault diagnosis model for rotating machineries with deep neural networks</article-title>,&#x201D; <source>Measurement</source>, vol. <volume>171</volume>, pp. <fpage>1</fpage>&#x2013;<lpage>25</lpage>, <year>2021</year>.</mixed-citation></ref>
<ref id="ref-20"><label>[20]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>M.</given-names> <surname>Elhoseny</surname></string-name>, <string-name><given-names>M. M.</given-names> <surname>Selim</surname></string-name> and <string-name><given-names>K.</given-names> <surname>Shankar</surname></string-name></person-group>, &#x201C;<article-title>Optimal deep learning based convolution neural network for digital forensics face sketch synthesis in Internet of Things (IoT)</article-title>,&#x201D; <source>International Journal of Machine Learning and Cybernetics</source>, vol. <volume>12</volume>, no. <issue>11</issue>, pp. <fpage>3249</fpage>&#x2013;<lpage>3260</lpage>, <year>2021</year>.</mixed-citation></ref>
<ref id="ref-21"><label>[21]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>Y.</given-names> <surname>Xu</surname></string-name>, <string-name><given-names>N.</given-names> <surname>Chen</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Shen</surname></string-name>, <string-name><given-names>L.</given-names> <surname>Xu</surname></string-name>, <string-name><given-names>Z.</given-names> <surname>Pan</surname></string-name> <etal>et al.,</etal></person-group> &#x201C;<article-title>Proposal and experimental case study on building ventilating fan fault diagnosis based on cuckoo search algorithm optimized extreme learning machine</article-title>,&#x201D; <source>Sustainable Energy Technologies and Assessments</source>, vol. <volume>45</volume>, pp. <fpage>1</fpage>&#x2013;<lpage>15</lpage>, <year>2021</year>.</mixed-citation></ref>
<ref id="ref-22"><label>[22]</label><mixed-citation publication-type="web">The First Security and Privacy Analytics Anti-Phishing Shared Task. [Online]. Available: <uri xlink:href="https://dasavisha.github.io/IWSPA-sharedtask/?tdsourcetag=s_pctim_aiomsg">https://dasavisha.github.io/IWSPA-sharedtask/?tdsourcetag=s_pctim_aiomsg</uri>.</mixed-citation></ref>
</ref-list>
</back>
</article>