<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.1 20151215//EN" "http://jats.nlm.nih.gov/publishing/1.1/JATS-journalpublishing1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xml:lang="en" article-type="research-article" dtd-version="1.1">
<front>
<journal-meta>
<journal-id journal-id-type="pmc">JCS</journal-id>
<journal-id journal-id-type="nlm-ta">JCS</journal-id>
<journal-id journal-id-type="publisher-id">JCS</journal-id>
<journal-title-group>
<journal-title>Journal of Cyber Security</journal-title>
</journal-title-group>
<issn pub-type="epub">2579-0064</issn>
<issn pub-type="ppub">2579-0072</issn>
<publisher>
<publisher-name>Tech Science Press</publisher-name>
<publisher-loc>USA</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">48332</article-id>
<article-id pub-id-type="doi">10.32604/jcs.2024.048332</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Securing Web by Predicting Malicious URLs</article-title>
<alt-title alt-title-type="left-running-head">Securing Web by Predicting Malicious URLs</alt-title>
<alt-title alt-title-type="right-running-head">Securing Web by Predicting Malicious URLs</alt-title>
</title-group>
<contrib-group>
<contrib id="author-1" contrib-type="author">
<name name-style="western"><surname>Khan</surname><given-names>Imran</given-names></name></contrib>
<contrib id="author-2" contrib-type="author" corresp="yes">
<name name-style="western"><surname>Megavarnam</surname><given-names>Meenakshi</given-names></name><email>meenaaksharadn@gmail.com</email></contrib>
<aff><institution>School of Computer Science, University of Hertfordshire</institution>, <addr-line>Hatfield, AL10 9AB</addr-line>, <country>UK</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>&#x002A;</label>Corresponding Author: Meenakshi Megavarnam. Email: <email>meenaaksharadn@gmail.com</email></corresp>
</author-notes>
<pub-date date-type="collection" publication-format="electronic">
<year>2024</year></pub-date>
<pub-date date-type="pub" publication-format="electronic">
<day>06</day>
<month>12</month>
<year>2024</year>
</pub-date>
<volume>6</volume>
<issue>1</issue>
<fpage>117</fpage>
<lpage>130</lpage>
<history>
<date date-type="received">
<day>05</day>
<month>12</month>
<year>2023</year>
</date>
<date date-type="accepted">
<day>08</day>
<month>11</month>
<year>2024</year>
</date>
</history>
<permissions>
<copyright-statement>&#x00A9; 2024 The Authors.</copyright-statement>
<copyright-year>2024</copyright-year>
<copyright-holder>Published by Tech Science Press.</copyright-holder>
<license xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>This work is licensed under a <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</ext-link>, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<self-uri content-type="pdf" xlink:href="TSP_JCS_48332.pdf"></self-uri>
<abstract>
<p>A URL (Uniform Resource Locator) is used to locate a digital resource. Attackers create malicious URLs to gain access to an organization&#x2019;s systems or sensitive information, and the resulting attacks can have serious consequences for both individuals and organizations. It is therefore crucial to secure individuals and organizations against these malicious URLs. In this study, a combination of machine learning and deep learning was used to predict malicious URLs. The proposed model integrates the accuracy of machine learning with the swiftness of deep learning by combining Random Forest (RF) and Multilayer Perceptron (MLP), offering a balanced solution for cybersecurity. The combined RF and MLP model achieved an accuracy of 81% with a training time of 33.78 s.</p>
</abstract>
<kwd-group kwd-group-type="author">
<kwd>Malicious URLs</kwd>
<kwd>prediction</kwd>
<kwd>machine learning</kwd>
<kwd>deep learning</kwd>
<kwd>random forest</kwd>
<kwd>multilayer perceptron</kwd>
<kwd>securing web</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec id="s1">
<label>1</label>
<title>Introduction</title>
<p>One cannot avoid the use of the internet in today&#x2019;s world. Therefore, web security has become critical in protecting individuals and organizations from cyber threats. By implementing stringent and proactive web security measures, companies can protect their online environment and provide a safe place for their consumers [<xref ref-type="bibr" rid="ref-1">1</xref>]. Because workers may have access to dangerous files and websites, a firewall, intrusion prevention system, URL filtration, and access restrictions can be implemented to reduce the company&#x2019;s risk [<xref ref-type="bibr" rid="ref-2">2</xref>]. Online attacks take many forms, including cross-site scripting, SQL injection, phishing, and denial of service. The main goal of cyber criminals, who pose as legitimate websites, is to access the sensitive data and systems of a company or individual for financial gain.</p>
<p>Malicious URLs are one of the important factors through which many cyber-attacks occur. A Uniform Resource Locator (URL) is the address of a resource. These resources can be of any kind, such as files, audio, and images. The protocol named in the URL controls how the information is transmitted over the network, and the resource identifier is placed after it [<xref ref-type="bibr" rid="ref-3">3</xref>]. The URL is used to find information from a specific place on the World Wide Web. The primary objective of URL filtering is to prevent online attacks, thereby strengthening cybersecurity for organizations and individuals [<xref ref-type="bibr" rid="ref-4">4</xref>].</p>
<p>The motivation for this research comes from the ever-evolving landscape of cyber threats, particularly those involving malicious URLs. Malicious URLs pose a significant challenge, threatening individuals and organizations with data breaches, financial losses, and reputational damage. Attacks of this kind can be mitigated by a model that predicts malicious URLs. This research contributes to the field of cybersecurity by focusing on preventing malicious URLs: the proposed fusion of RF (Random Forest) and MLP (Multilayer Perceptron) offers a balanced solution for predicting them.</p>
</sec>
<sec id="s2">
<label>2</label>
<title>Literature Review</title>
<p>The research paper titled &#x201C;Detecting Malicious URLs Using Binary Classification Through Adaboost Algorithm&#x201D; [<xref ref-type="bibr" rid="ref-5">5</xref>] uses machine learning to create a comprehensive prototype to predict malicious URLs. It explores the most suitable formulation for finding malicious URLs using machine learning and introduces an approach based on the Adaboost algorithm. Adaboost was selected for its flexibility, as it can be combined with other machine learning algorithms. <xref ref-type="fig" rid="fig-1">Fig. 1</xref>, taken from that paper, shows the number of malware occurrences classified into different categories.</p>
<fig id="fig-1">
<label>Figure 1</label>
<caption>
<title>Results of Khan, F. (Reprinted from Reference [<xref ref-type="bibr" rid="ref-5">5</xref>])</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="JCS_48332-fig-1.tif"/>
</fig>
<p>The research paper, &#x201C;An Enhanced Deep Learning-Based Phishing Detection Mechanism to Effectively Identify Malicious URLs Using Variational Autoencoders&#x201D; [<xref ref-type="bibr" rid="ref-6">6</xref>], introduces a deep learning-based model for predicting phishing. It combines the strengths of Variational Autoencoders (VAE) and Deep Neural Networks (DNN) to capture the intrinsic features of URLs, thereby enhancing the model&#x2019;s ability to identify phishing URLs. The dataset used in that research contained 100,000 URLs from two open sources: the ISCX-URL-2016 dataset and the Kaggle dataset. The proposed model achieved an accuracy of 97.45%, with a response time of 1.9 s, which is superior to all other evaluated models.</p>
<p><xref ref-type="fig" rid="fig-2">Fig. 2</xref>, taken from that research, reports the performance of all the models in terms of precision, recall, F1 score, and accuracy. Among all the models, the VAE-DNN model outperformed the others with an accuracy of 97.45%.</p>
<fig id="fig-2">
<label>Figure 2</label>
<caption>
<title>Results of Prabakaran, M. K. (Reprinted from Reference [<xref ref-type="bibr" rid="ref-6">6</xref>])</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="JCS_48332-fig-2.tif"/>
</fig>
<p>The research paper by Aljabri et al. [<xref ref-type="bibr" rid="ref-7">7</xref>] focused on the literature review of existing papers related to finding malicious URLs using machine learning in both Arabic and non-Arabic content. It mainly focused on key findings, specifically the use of lexical features in a URL to predict malicious content. It was also found that there was a recurrent use of Support Vector Machine (SVM), Random Forest (RF), and Na&#x000EF;ve Bayes (NB) in the reviewed papers. Additionally, the performance of the Convolutional Neural Network (CNN) and XGBoost models was exceptional, with an accuracy of 99.98%.</p>
<p>In the research paper, &#x201C;Detecting Malicious URLs via a Keyword-Based Convolutional Gated-Recurrent-Unit Neural Network&#x201D; [<xref ref-type="bibr" rid="ref-8">8</xref>], a Convolutional Gated Recurrent Unit (CGRU) neural network was introduced to predict malicious URLs. This model features character-based text classification. The traditional pooling layer is replaced with the Gated Recurrent Unit (GRU) to enhance the capturing of temporal features in the URL. <xref ref-type="fig" rid="fig-3">Fig. 3</xref> shows the comparison results of all the models on the test set. The comparison reveals that the CGRU model has the highest accuracy of 99.6%, outperforming all other models.</p>
<fig id="fig-3">
<label>Figure 3</label>
<caption>
<title>Results of Yang, W. (Reprinted from Reference [<xref ref-type="bibr" rid="ref-8">8</xref>])</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="JCS_48332-fig-3.tif"/>
</fig>
<p>The research paper, &#x201C;A Malicious URLs Detection System Using Optimization and Machine Learning Classifiers&#x201D; [<xref ref-type="bibr" rid="ref-9">9</xref>], aims to assess the efficiency of machine learning models in predicting malicious URLs. It uses a bio-inspired algorithm, Particle Swarm Optimization (PSO), which is a feature optimization method to select critical URL attributes for detecting malicious URLs. By combining machine learning and static analysis, this study improves prediction accuracy.</p>
<p><xref ref-type="fig" rid="fig-4">Fig. 4</xref> demonstrates the detection performance of the five classifiers used in this research. The performance is evaluated in terms of accuracy, true positive rate (TPR), false positive rate (FPR), precision, recall, and F-measure. <xref ref-type="fig" rid="fig-4">Fig. 4</xref> indicates that the Na&#x00EF;ve Bayes and SVM have the highest precision in predicting the malicious dataset.</p>
<fig id="fig-4">
<label>Figure 4</label>
<caption>
<title>Results of Lee, O. V. (Reprinted from Reference [<xref ref-type="bibr" rid="ref-9">9</xref>])</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="JCS_48332-fig-4.tif"/>
</fig>
<p>In the research papers listed in <xref ref-type="table" rid="table-1">Table 1</xref>, the authors have used either machine learning or deep learning to detect malicious URLs. However, this research paper combines both machine learning and deep learning algorithms to leverage the strengths of both: the accuracy of machine learning and the swiftness of deep learning. Thus, this proposed method can be more efficient in detecting malicious URLs.</p>
<table-wrap id="table-1">
<label>Table 1</label>
<caption>
<title>Literature review comparison</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left" />
<col align="left" />
<col align="left" />
</colgroup>
<thead>
<tr>
<th>Reference</th>
<th>Approach/model</th>
<th>Accuracy</th>
</tr>
</thead>
<tbody>
<tr>
<td>[<xref ref-type="bibr" rid="ref-5">5</xref>]</td>
<td>AdaBoost algorithm</td>
<td>Higher than other ML algorithms</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-6">6</xref>]</td>
<td>VAE and deep neural network</td>
<td>97.45%</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-7">7</xref>]</td>
<td>CNN with XGBoost</td>
<td>99.98%</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-8">8</xref>]</td>
<td>CGRU neural network</td>
<td>99.61%</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-9">9</xref>]</td>
<td>Machine learning &#x002B; Static analysis (NB, SVM)</td>
<td>99%</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="s3">
<label>3</label>
<title>Aim</title>
<p>This research paper aims to implement an efficient model to predict malicious URLs in the categories of phishing, malware, and defacement. It will use different machine learning and deep learning algorithms to build a system for predicting malicious URLs. Furthermore, it will also combine machine learning and deep learning to determine if this hybrid approach is more efficient than individual algorithms.</p>
</sec>
<sec id="s4">
<label>4</label>
<title>Research Question</title>
<p>Which algorithm is best at identifying URLs in the categories of benign, phishing, malware, and defacement?</p>
</sec>
<sec id="s5">
<label>5</label>
<title>Proposed Method</title>
<p>This research paper presents a combination of Random Forest (machine learning) and Multilayer Perceptron (deep learning) algorithms to predict malicious URLs in the categories of benign, phishing, malware, and defacement. This combination leverages the accuracy of machine learning and the swiftness of deep learning to create a more efficient model for detecting malicious URLs. This study emphasizes the benefits of the combined model.</p>
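<p>The text does not state how the outputs of the two models are merged. One common way to combine two classifiers is soft voting, in which the per-class probabilities of both models are averaged and the class with the highest average is chosen. The following is a minimal sketch under that assumption only; the function name <monospace>soft_vote</monospace>, the equal weighting, and the probability values are hypothetical and are not taken from this study.</p>
<code language="python"><![CDATA[
```python
def soft_vote(prob_a, prob_b, weight_a=0.5):
    """Average two class-probability vectors.

    Returns (predicted_class_index, averaged_probabilities).
    Soft voting is an assumed combination rule, not the paper's stated method.
    """
    if len(prob_a) != len(prob_b):
        raise ValueError("probability vectors must have the same length")
    weight_b = 1.0 - weight_a
    combined = [weight_a * a + weight_b * b for a, b in zip(prob_a, prob_b)]
    # Index of the largest averaged probability is the predicted class.
    predicted = max(range(len(combined)), key=combined.__getitem__)
    return predicted, combined


# Hypothetical probabilities for a single URL, ordered as
# [benign, defacement, malware, phishing]; illustrative values only.
rf_probs = [0.55, 0.05, 0.05, 0.35]
mlp_probs = [0.30, 0.05, 0.05, 0.60]

label, probs = soft_vote(rf_probs, mlp_probs)
print(label, [round(p, 3) for p in probs])
```
]]></code>
<p>In this illustrative case the two models disagree, and the averaged probabilities decide the final class. Other combination schemes, such as hard voting or stacking, would be equally plausible readings of the proposed method.</p>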
</sec>
<sec id="s6">
<label>6</label>
<title>Dataset</title>
<p>This study utilizes the dataset from Kaggle [<xref ref-type="bibr" rid="ref-10">10</xref>], which is open-source. The dataset comprises 651,191 URLs categorized as benign, malware, defacement, and phishing. Machine learning and deep learning models are trained using this dataset to detect malicious URLs, thereby enhancing web security. The dataset consists of two columns: URL and type.</p>
</sec>
<sec id="s7">
<label>7</label>
<title>Preprocessing</title>
<p>The dataset was preprocessed by removing all null values, duplicate values, and stop words before usage. Additionally, label encoding was applied to replace the &#x201C;type&#x201D; column with integers, as illustrated in <xref ref-type="fig" rid="fig-5">Fig. 5</xref>.</p>
<fig id="fig-5">
<label>Figure 5</label>
<caption>
<title>Label encoded malicious URL dataset</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="JCS_48332-fig-5.tif"/>
</fig>
<p>The type column is encoded as follows:</p>
<p>0&#x2014;benign</p>
<p>1&#x2014;defacement</p>
<p>2&#x2014;malware</p>
<p>3&#x2014;phishing</p>
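<p>The null removal, duplicate removal, and label encoding steps can be sketched as follows. This is an illustrative sketch, not the code used in this study: the sample rows are made up, stop-word removal is omitted, and assigning integers in alphabetical order of the class names is an assumption that happens to reproduce the mapping listed above.</p>
<code language="python"><![CDATA[
```python
# Made-up sample rows in (url, type) form; the real dataset has the same two columns.
rows = [
    ("example.org/index.html", "benign"),
    ("example.net/login.php", "phishing"),
    ("example.com/setup.exe", "malware"),
    ("example.org/index.php?option=com_content", "defacement"),
    ("example.org/index.html", "benign"),  # exact duplicate
    (None, "benign"),                      # missing URL
]


def preprocess(rows):
    """Drop rows with missing values and exact duplicates, keeping first occurrences."""
    seen = set()
    cleaned = []
    for url, label in rows:
        if url is None or label is None:
            continue
        if (url, label) in seen:
            continue
        seen.add((url, label))
        cleaned.append((url, label))
    return cleaned


def build_label_map(labels):
    """Assign consecutive integers to class names in alphabetical order (assumed rule)."""
    return {name: index for index, name in enumerate(sorted(set(labels)))}


cleaned = preprocess(rows)
label_map = build_label_map(label for _, label in cleaned)
encoded = [(url, label_map[label]) for url, label in cleaned]
print(label_map)
```
]]></code>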
<p>The preprocessed dataset is used for all experiments in this research.</p>
<p>Visualizations of the URL types are also presented in <xref ref-type="fig" rid="fig-6">Fig. 6</xref>.</p>
<fig id="fig-6">
<label>Figure 6</label>
<caption>
<title>Visualization of URL types</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="JCS_48332-fig-6.tif"/>
</fig>
</sec>
<sec id="s8">
<label>8</label>
<title>Evaluation Metrics</title>
<p>This section focuses on evaluating different algorithms using key metrics such as accuracy, F1 score, precision, and recall.</p>
<sec id="s8_1">
<label>8.1</label>
<title>Confusion Matrix</title>
<p>The confusion matrix shows the true positives, true negatives, false positives, and false negatives generated by the models. It illustrates how effectively the model distinguishes between true and false malicious URLs [<xref ref-type="bibr" rid="ref-11">11</xref>]. <xref ref-type="fig" rid="fig-7">Fig. 7</xref> presents the confusion matrix, which visually represents the performance of binary classification.</p>
<fig id="fig-7">
<label>Figure 7</label>
<caption>
<title>Confusion matrix for performance evaluation (Reprinted from Reference [<xref ref-type="bibr" rid="ref-12">12</xref>])</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="JCS_48332-fig-7.tif"/>
</fig>
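<p>For the four URL classes used in this study, the same idea extends to a 4 &#x00D7; 4 matrix. The sketch below shows one way to build such a matrix from true and predicted labels; the function name and the label values are illustrative assumptions, not results from this paper.</p>
<code language="python"><![CDATA[
```python
CLASSES = [0, 1, 2, 3]  # 0 benign, 1 defacement, 2 malware, 3 phishing


def confusion_matrix(y_true, y_pred, classes=CLASSES):
    """Return matrix where matrix[i][j] counts samples of true class i predicted as class j."""
    index = {c: k for k, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


# Made-up labels for illustration only.
y_true = [0, 0, 0, 1, 1, 2, 2, 3, 3, 3]
y_pred = [0, 0, 3, 1, 0, 2, 2, 3, 0, 0]

cm = confusion_matrix(y_true, y_pred)
for row in cm:
    print(row)
```
]]></code>
<p>Diagonal entries count correct predictions; off-diagonal entries in the last row, for example, count phishing URLs that were assigned to another class.</p>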
</sec>
<sec id="s8_2">
<label>8.2</label>
<title>Precision</title>
<p>Precision is the proportion of URLs predicted as a given class (benign, phishing, malware, or defacement) that actually belong to that class. High precision means few false positives for that class.</p>
</sec>
<sec id="s8_3">
<label>8.3</label>
<title>Recall</title>
<p>Recall is the proportion of URLs that actually belong to a given class that the model correctly identifies. High recall means few false negatives, which is crucial for ensuring that malicious URLs are not missed [<xref ref-type="bibr" rid="ref-13">13</xref>].</p>
</sec>
<sec id="s8_4">
<label>8.4</label>
<title>F1 Score</title>
<p>The F1 score is the harmonic mean of precision and recall and provides a single overall evaluation of the machine learning and deep learning models. It reflects how well a model balances correctness of its predictions (precision) against completeness (recall) [<xref ref-type="bibr" rid="ref-14">14</xref>].</p>
</sec>
<sec id="s8_5">
<label>8.5</label>
<title>Accuracy</title>
<p>Accuracy measures the overall correctness of the model in classifying URLs. It is the ratio of correctly classified URLs to the total number of URLs classified [<xref ref-type="bibr" rid="ref-15">15</xref>].</p>
<p><xref ref-type="fig" rid="fig-8">Fig. 8</xref> displays the mathematical formulas for these performance metrics, which are used to compare the results of different models.</p>
<fig id="fig-8">
<label>Figure 8</label>
<caption>
<title>Performance metrics (Reprinted from Reference [<xref ref-type="bibr" rid="ref-16">16</xref>])</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="JCS_48332-fig-8.tif"/>
</fig>
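<p>As an illustration of how these metrics follow from a multi-class confusion matrix, the sketch below computes per-class precision, recall, and F1 score, together with overall accuracy. The function names and the example counts are assumptions made for illustration; only the standard definitions are used.</p>
<code language="python"><![CDATA[
```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def per_class_metrics(cm):
    """Return a list of (precision, recall, f1) tuples, one per class.

    cm[i][j] counts samples whose true class is i and predicted class is j.
    """
    n = len(cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, truly another class
        fn = sum(cm[k]) - tp                       # truly k, predicted another class
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        results.append((precision, recall, f1_score(precision, recall)))
    return results


def accuracy(cm):
    """Fraction of all samples that lie on the diagonal of the matrix."""
    correct = sum(cm[k][k] for k in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


# Made-up counts for illustration only (rows: true class, columns: predicted class).
cm = [[2, 0, 0, 1], [1, 1, 0, 0], [0, 0, 2, 0], [2, 0, 0, 1]]
metrics = per_class_metrics(cm)
print(round(accuracy(cm), 2))
print(round(f1_score(0.96, 0.73), 2))
```
]]></code>
<p>Applying <monospace>f1_score</monospace> to a precision of 0.96 and a recall of 0.73 gives approximately 0.83 after rounding.</p>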
</sec>
<sec id="s8_6">
<label>8.6</label>
<title>Training Time</title>
<p>Training time is the time taken by the deep learning and machine learning models to learn from the data. It gives an insight into how practical the model is to build and retrain [<xref ref-type="bibr" rid="ref-17">17</xref>].</p>
</sec>
<sec id="s8_7">
<label>8.7</label>
<title>Validation Time</title>
<p>Validation time refers to the duration taken to evaluate the overall performance of the trained model on the validation dataset.</p>
</sec>
<sec id="s8_8">
<label>8.8</label>
<title>Testing Time</title>
<p>Testing time refers to the duration taken by the model to predict malicious URLs from new data. This metric is crucial for evaluating the responsiveness of machine learning or deep learning models in real-time scenarios [<xref ref-type="bibr" rid="ref-18">18</xref>].</p>
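<p>The three timing metrics can be collected by wrapping the fit and predict calls with a monotonic clock. The sketch below is one possible way to do this and is not the measurement code used in this study; <monospace>MajorityClassifier</monospace> is a toy stand-in for a real model, and all data values are made up.</p>
<code language="python"><![CDATA[
```python
import time


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds) measured with a monotonic clock."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


class MajorityClassifier:
    """Toy stand-in for a real model: always predicts the most common training label."""

    def fit(self, X, y):
        self.label_ = max(set(y), key=y.count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


# Made-up data: feature values are placeholders, labels use the 0-3 encoding.
X_train, y_train = ["a", "b", "c", "d"], [0, 0, 3, 0]
X_val, X_test = ["e", "f"], ["g", "h", "i"]

model = MajorityClassifier()
_, training_time = timed(model.fit, X_train, y_train)
val_pred, validation_time = timed(model.predict, X_val)
test_pred, testing_time = timed(model.predict, X_test)
print(f"training {training_time:.6f} s, "
      f"validation {validation_time:.6f} s, testing {testing_time:.6f} s")
```
]]></code>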
</sec>
</sec>
<sec id="s9">
<label>9</label>
<title>Results and Evaluation</title>
<p>This section discusses the results, evaluation, and comparison of different models.</p>
<sec id="s9_1">
<label>9.1</label>
<title>Machine Learning Algorithm</title>
<sec id="s9_1_1">
<label>9.1.1</label>
<title>Decision Tree</title>
<p>According to <xref ref-type="table" rid="table-2">Tables 2</xref> and <xref ref-type="table" rid="table-3">3</xref>, the Decision Tree (DT) algorithm performs well in predicting benign and malware URLs. However, its performance slightly diminishes when identifying defacement URLs, and it shows very low performance in predicting phishing URLs. <xref ref-type="fig" rid="fig-9">Fig. 9</xref> illustrates the confusion matrix for the validation and testing phases of the DT algorithm.</p>
<table-wrap id="table-2">
<label>Table 2</label>
<caption>
<title>DT validation results</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
</colgroup>
<thead>
<tr>
<th>URL type</th>
<th>Precision</th>
<th>Recall</th>
<th>F1 score</th>
<th>Accuracy</th>
<th>Training time (s)</th>
<th>Validation time (s)</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0.85</td>
<td>0.85</td>
<td>0.85</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>1</td>
<td>0.72</td>
<td>0.72</td>
<td>0.72</td>
<td>0.78</td>
<td>7.22</td>
<td>0.49</td>
</tr>
<tr>
<td>2</td>
<td>0.85</td>
<td>0.86</td>
<td>0.86</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>3</td>
<td>0.52</td>
<td>0.51</td>
<td>0.51</td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>
</table-wrap><table-wrap id="table-3">
<label>Table 3</label>
<caption>
<title>DT testing results</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
</colgroup>
<thead>
<tr>
<th>URL type</th>
<th>Precision</th>
<th>Recall</th>
<th>F1 score</th>
<th>Accuracy</th>
<th>Testing time (s)</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0.84</td>
<td>0.86</td>
<td>0.85</td>
<td></td>
<td></td>
</tr>
<tr>
<td>1</td>
<td>0.73</td>
<td>0.72</td>
<td>0.72</td>
<td>0.78</td>
<td>0.33</td>
</tr>
<tr>
<td>2</td>
<td>0.85</td>
<td>0.86</td>
<td>0.86</td>
<td></td>
<td></td>
</tr>
<tr>
<td>3</td>
<td>0.52</td>
<td>0.49</td>
<td>0.51</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>
</table-wrap><fig id="fig-9">
<label>Figure 9</label>
<caption>
<title>Confusion matrix during validation and testing</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="JCS_48332-fig-9.tif"/>
</fig>
<p>The results from the validation and testing phases are nearly identical, indicating that the model is not overfitting.</p>
</sec>
<sec id="s9_1_2">
<label>9.1.2</label>
<title>Na&#x000EF;ve Bayes</title>
<p>Based on <xref ref-type="table" rid="table-4">Tables 4</xref> and <xref ref-type="table" rid="table-5">5</xref>, Na&#x000EF;ve Bayes achieves an accuracy of 68% on both the validation and testing datasets. However, its effectiveness varies notably across URL types. The model performs well in predicting benign URLs (Type 0), but its performance decreases significantly for the other classes: precision is low for defacement (Type 1) and malware (Type 2) URLs, at 0.45 and 0.41 in validation, and recall is low for phishing (Type 3) URLs, at 0.39. This suggests that the Na&#x000EF;ve Bayes algorithm may fail to identify certain potentially harmful types of URLs.</p>
<table-wrap id="table-4">
<label>Table 4</label>
<caption>
<title>NB validation results</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
</colgroup>
<thead>
<tr>
<th>URL type</th>
<th>Precision</th>
<th>Recall</th>
<th>F1 score</th>
<th>Accuracy</th>
<th>Training time (s)</th>
<th>Validation time (s)</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0.81</td>
<td>0.76</td>
<td>0.78</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>1</td>
<td>0.45</td>
<td>0.63</td>
<td>0.52</td>
<td>0.68</td>
<td>0.29</td>
<td>0.57</td>
</tr>
<tr>
<td>2</td>
<td>0.41</td>
<td>0.61</td>
<td>0.49</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>3</td>
<td>0.56</td>
<td>0.39</td>
<td>0.46</td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>
</table-wrap><table-wrap id="table-5">
<label>Table 5</label>
<caption>
<title>NB testing results</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
</colgroup>
<thead>
<tr>
<th>URL type</th>
<th>Precision</th>
<th>Recall</th>
<th>F1 score</th>
<th>Accuracy</th>
<th>Testing time (s)</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0.81</td>
<td>0.76</td>
<td>0.78</td>
<td></td>
<td></td>
</tr>
<tr>
<td>1</td>
<td>0.45</td>
<td>0.63</td>
<td>0.52</td>
<td>0.68</td>
<td>0.41</td>
</tr>
<tr>
<td>2</td>
<td>0.42</td>
<td>0.62</td>
<td>0.50</td>
<td></td>
<td></td>
</tr>
<tr>
<td>3</td>
<td>0.56</td>
<td>0.39</td>
<td>0.46</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>
</table-wrap>
<p>The confusion matrix in <xref ref-type="fig" rid="fig-10">Fig. 10</xref> illustrates that the Na&#x000EF;ve Bayes algorithm incorrectly labels defacement, malware, and phishing URLs during testing. These results indicate that Na&#x000EF;ve Bayes is less accurate than the Decision Tree algorithm (68% versus 78%), although it trains faster.</p>
<fig id="fig-10">
<label>Figure 10</label>
<caption>
<title>Confusion matrix during validation and testing</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="JCS_48332-fig-10.tif"/>
</fig>
</sec>
<sec id="s9_1_3">
<label>9.1.3</label>
<title>Random Forest</title>
<p>The results below show the performance of the Random Forest (RF) algorithm in both validation and testing phases. <xref ref-type="table" rid="table-6">Tables 6</xref> and <xref ref-type="table" rid="table-7">7</xref> indicate that the RF model performs well overall, achieving a high accuracy of 87% in both validation and testing.</p>
<table-wrap id="table-6">
<label>Table 6</label>
<caption>
<title>RF validation results</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
</colgroup>
<thead>
<tr>
<th>URL type</th>
<th>Precision</th>
<th>Recall</th>
<th>F1 score</th>
<th>Accuracy</th>
<th>Training time (s)</th>
<th>Validation time (s)</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0.85</td>
<td>0.98</td>
<td>0.91</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>1</td>
<td>0.96</td>
<td>0.73</td>
<td>0.83</td>
<td>0.87</td>
<td>1238.28</td>
<td>18.86</td>
</tr>
<tr>
<td>2</td>
<td>0.99</td>
<td>0.88</td>
<td>0.93</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>3</td>
<td>0.88</td>
<td>0.49</td>
<td>0.63</td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>
</table-wrap><table-wrap id="table-7">
<label>Table 7</label>
<caption>
<title>RF testing results</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
</colgroup>
<thead>
<tr>
<th>URL type</th>
<th>Precision</th>
<th>Recall</th>
<th>F1 score</th>
<th>Accuracy</th>
<th>Testing time (s)</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0.85</td>
<td>0.98</td>
<td>0.91</td>
<td></td>
<td></td>
</tr>
<tr>
<td>1</td>
<td>0.96</td>
<td>0.72</td>
<td>0.83</td>
<td>0.87</td>
<td>24.76</td>
</tr>
<tr>
<td>2</td>
<td>0.99</td>
<td>0.88</td>
<td>0.93</td>
<td></td>
<td></td>
</tr>
<tr>
<td>3</td>
<td>0.87</td>
<td>0.49</td>
<td>0.63</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>
</table-wrap>
<p>Specifically, the RF model exhibits strong performance in predicting defacement URLs, achieving a precision of 96%, a recall of 73%, and an F1 score of 83% in validation. This may be due to characteristics that defacement URLs typically share.</p>
<p>For malware URLs, the RF model achieves an F1 score of 93%, a recall of 88%, and a precision of 99%.</p>
<p><xref ref-type="fig" rid="fig-11">Fig. 11</xref> presents the confusion matrix for the RF algorithm in both validation and testing phases. Despite its overall strong performance, the model shows lower precision for benign URLs (0.85) and markedly lower recall for phishing URLs (0.49), as observed from <xref ref-type="table" rid="table-6">Tables 6</xref> and <xref ref-type="table" rid="table-7">7</xref>. Nonetheless, Random Forest proves to be effective in identifying harmful URLs.</p>
<fig id="fig-11">
<label>Figure 11</label>
<caption>
<title>Confusion matrix during validation and testing</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="JCS_48332-fig-11.tif"/>
</fig>
</sec>
</sec>
<sec id="s9_2">
<label>9.2</label>
<title>Deep Learning</title>
<sec id="s9_2_1">
<label>9.2.1</label>
<title>Multi-Layer Perceptron</title>
<p>Based on <xref ref-type="table" rid="table-8">Tables 8</xref> and <xref ref-type="table" rid="table-9">9</xref>, the Multilayer Perceptron (MLP) demonstrates strong performance in predicting malicious URLs, with an accuracy of 82% on both the validation and testing datasets. However, its effectiveness varies across URL types. The model achieves a precision above 80% for benign, defacement, and malware URLs, but its performance is lower for phishing URLs, with a precision of 0.75 and a recall of 0.40.</p>
<table-wrap id="table-8">
<label>Table 8</label>
<caption>
<title>MLP validation results</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
</colgroup>
<thead>
<tr>
<th>URL type</th>
<th>Precision</th>
<th>Recall</th>
<th>F1 score</th>
<th>Accuracy</th>
<th>Training time (s)</th>
<th>Validation time (s)</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0.82</td>
<td>0.95</td>
<td>0.88</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>1</td>
<td>0.82</td>
<td>0.67</td>
<td>0.74</td>
<td>0.82</td>
<td>975.76</td>
<td>0.84</td>
</tr>
<tr>
<td>2</td>
<td>0.87</td>
<td>0.78</td>
<td>0.82</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>3</td>
<td>0.75</td>
<td>0.40</td>
<td>0.52</td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>
</table-wrap><table-wrap id="table-9">
<label>Table 9</label>
<caption>
<title>MLP testing results</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
</colgroup>
<thead>
<tr>
<th>URL type</th>
<th>Precision</th>
<th>Recall</th>
<th>F1 score</th>
<th>Accuracy</th>
<th>Testing time (s)</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0.82</td>
<td>0.95</td>
<td>0.88</td>
<td></td>
<td></td>
</tr>
<tr>
<td>1</td>
<td>0.82</td>
<td>0.66</td>
<td>0.73</td>
<td>0.82</td>
<td>0.79</td>
</tr>
<tr>
<td>2</td>
<td>0.86</td>
<td>0.79</td>
<td>0.83</td>
<td></td>
<td></td>
</tr>
<tr>
<td>3</td>
<td>0.75</td>
<td>0.40</td>
<td>0.52</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>
</table-wrap>
<p>The confusion matrix in <xref ref-type="fig" rid="fig-12">Fig. 12</xref> indicates that the model may misclassify phishing URLs as benign. This could be due to characteristics that phishing and benign URLs have in common.</p>
<fig id="fig-12">
<label>Figure 12</label>
<caption>
<title>Confusion matrix during validation and testing</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="JCS_48332-fig-12.tif"/>
</fig>
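<p>To make the link between a confusion matrix and the per-class scores explicit, the sketch below derives precision, recall, and overall accuracy from a matrix whose rows are true classes and whose columns are predicted classes. The counts in <monospace>CONFUSION</monospace> are hypothetical and chosen only for illustration; they are not the values shown in <xref ref-type="fig" rid="fig-12">Fig. 12</xref>, and the function names are assumptions introduced here.</p>
<code language="python">
```python
# Hypothetical counts for illustration only; these are NOT the values in Fig. 12.
# Rows are true classes, columns are predicted classes.
CONFUSION = [
    [90, 5, 3, 2],
    [20, 70, 6, 4],
    [10, 5, 80, 5],
    [50, 5, 5, 40],
]


def per_class_metrics(matrix):
    """Return a list of (precision, recall) pairs, one per class."""
    metrics = []
    for k in range(len(matrix)):
        true_positive = matrix[k][k]
        actual = sum(matrix[k])                    # row sum: samples of class k
        predicted = sum(row[k] for row in matrix)  # column sum: predictions of k
        precision = true_positive / predicted if predicted else 0.0
        recall = true_positive / actual if actual else 0.0
        metrics.append((precision, recall))
    return metrics


def accuracy(matrix):
    """Share of samples on the diagonal, i.e. correctly classified."""
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


if __name__ == "__main__":
    for k, (precision, recall) in enumerate(per_class_metrics(CONFUSION)):
        print(f"class {k}: precision={precision:.2f} recall={recall:.2f}")
    print(f"accuracy={accuracy(CONFUSION):.2f}")
```
</code>
<p>In this toy matrix, half of the samples in the last row fall into the first column, so that class has low recall while the class in the first column loses precision. This is the pattern to look for when one URL type is often predicted as another.</p>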
<p>MLP has a training time of 975.76 s and a testing time of 0.79 s, both longer than those of Decision Tree and Na&#x00EF;ve Bayes (<xref ref-type="table" rid="table-10">Table 10</xref>). Despite this, the findings suggest that MLP is an effective algorithm for predicting malicious URLs. It is worth noting that Multilayer Perceptron (DL) requires less training and testing time than Random Forest (ML).</p>
</sec>
</sec>
<sec id="s9_3">
<label>9.3</label>
<title>Performance of Individual Algorithms and ML-DL Combined Algorithms</title>
<p><xref ref-type="table" rid="table-10">Table 10</xref> presents the performance of the individual algorithms and of different combinations of machine learning (ML) and deep learning (DL) algorithms. The following combinations in <xref ref-type="table" rid="table-10">Table 10</xref> achieve a testing accuracy of 80%: (a) DT, NB, RF &#x0026; MLP; (b) DT, RF &#x0026; MLP; (c) RF &#x0026; MLP. The RF and MLP combination has a training time of 33.78 s, a validation time of 9.61 s, and a testing time of 9.41 s.</p>
<table-wrap id="table-10">
<label>Table 10</label>
<caption>
<title>Results of individual algorithms and ML-DL combined algorithms</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
</colgroup>
<thead>
<tr>
<th>Individual/Combined</th>
<th>Algorithms</th>
<th align="center" colspan="2">Accuracy</th>
<th align="center" colspan="3">Time taken (s)</th>
</tr>
<tr>
<th></th>
<th></th>
<th>Validation</th>
<th>Testing</th>
<th>Training</th>
<th>Validation</th>
<th>Testing</th>
</tr>
</thead>
<tbody>
<tr>
<td>Individual</td>
<td>DT (ML)</td>
<td>0.78</td>
<td>0.78</td>
<td>7.22</td>
<td>0.49</td>
<td>0.33</td>
</tr>
<tr>
<td></td>
<td>NB (ML)</td>
<td>0.68</td>
<td>0.68</td>
<td>0.29</td>
<td>0.57</td>
<td>0.41</td>
</tr>
<tr>
<td></td>
<td>RF (ML)</td>
<td>0.87</td>
<td>0.87</td>
<td>1238.28</td>
<td>18.86</td>
<td>24.76</td>
</tr>
<tr>
<td></td>
<td>MLP (DL)</td>
<td>0.82</td>
<td>0.82</td>
<td>975.76</td>
<td>0.84</td>
<td>0.79</td>
</tr>
<tr>
<td>Combined</td>
<td>RF &#x0026; MLP</td>
<td>0.81</td>
<td>0.80</td>
<td>33.78</td>
<td>9.61</td>
<td>9.41</td>
</tr>
<tr>
<td></td>
<td>MLP &#x0026; DT</td>
<td>0.77</td>
<td>0.77</td>
<td>17.05</td>
<td>1.19</td>
<td>1.20</td>
</tr>
<tr>
<td></td>
<td>NB &#x0026; MLP</td>
<td>0.78</td>
<td>0.78</td>
<td>16.08</td>
<td>1.30</td>
<td>2.04</td>
</tr>
<tr>
<td></td>
<td>DT, NB, RF &#x0026; MLP</td>
<td>0.80</td>
<td>0.80</td>
<td>33.88</td>
<td>10.61</td>
<td>13.69</td>
</tr>
<tr>
<td></td>
<td>NB, RF &#x0026; MLP</td>
<td>0.79</td>
<td>0.79</td>
<td>35.80</td>
<td>9.26</td>
<td>10.37</td>
</tr>
<tr>
<td></td>
<td>DT, NB &#x0026; MLP</td>
<td>0.78</td>
<td>0.78</td>
<td>16.27</td>
<td>1.35</td>
<td>1.33</td>
</tr>
<tr>
<td></td>
<td>DT, RF &#x0026; MLP</td>
<td>0.80</td>
<td>0.80</td>
<td>36.12</td>
<td>10.00</td>
<td>9.79</td>
</tr>
</tbody>
</table>
</table-wrap>
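<p>Combinations that exclude RF, namely MLP &#x0026; DT, NB &#x0026; MLP, and DT, NB &#x0026; MLP, have slightly lower accuracy (0.77 to 0.78) but much shorter training, validation, and testing times. NB, RF &#x0026; MLP reaches an accuracy of 0.79, with times comparable to those of the other RF-based combinations.</p>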
<p>The combinations with the best testing accuracy (0.80) are RF &#x0026; MLP; DT, RF &#x0026; MLP; and DT, NB, RF &#x0026; MLP. Of these, RF &#x0026; MLP has the shortest training, validation, and testing times, so if speed is crucial it stands out as the most effective combination for identifying malicious URLs.</p>
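<p>The selection reasoning above can be reproduced directly from the combined rows of <xref ref-type="table" rid="table-10">Table 10</xref>: first keep the combinations with the highest testing accuracy, then pick the one with the lowest total time. The sketch below is illustrative only; the <monospace>Result</monospace> type, the helper names, and the use of the summed training, validation, and testing time as the tie-breaker are assumptions introduced here, not a criterion stated by the authors.</p>
<code language="python">
```python
from typing import List, NamedTuple


class Result(NamedTuple):
    name: str
    val_acc: float
    test_acc: float
    train_s: float
    val_s: float
    test_s: float


# Combined rows copied from Table 10 ("+" stands for the ampersand in the table).
COMBINED = [
    Result("RF + MLP", 0.81, 0.80, 33.78, 9.61, 9.41),
    Result("MLP + DT", 0.77, 0.77, 17.05, 1.19, 1.20),
    Result("NB + MLP", 0.78, 0.78, 16.08, 1.30, 2.04),
    Result("DT, NB, RF + MLP", 0.80, 0.80, 33.88, 10.61, 13.69),
    Result("NB, RF + MLP", 0.79, 0.79, 35.80, 9.26, 10.37),
    Result("DT, NB + MLP", 0.78, 0.78, 16.27, 1.35, 1.33),
    Result("DT, RF + MLP", 0.80, 0.80, 36.12, 10.00, 9.79),
]


def total_time(result: Result) -> float:
    """Summed training, validation and testing time in seconds."""
    return result.train_s + result.val_s + result.test_s


def top_accuracy(results: List[Result]) -> List[Result]:
    """All results that share the highest testing accuracy."""
    best = max(r.test_acc for r in results)
    return [r for r in results if r.test_acc == best]


def fastest(results: List[Result]) -> Result:
    """The result with the lowest total time."""
    return min(results, key=total_time)


if __name__ == "__main__":
    leaders = top_accuracy(COMBINED)
    print("Highest testing accuracy:", [r.name for r in leaders])
    print("Fastest among them:", fastest(leaders).name)
    print("Fastest overall:", fastest(COMBINED).name)
```
</code>
<p>Under these assumptions the three leaders are RF &#x0026; MLP, DT, NB, RF &#x0026; MLP, and DT, RF &#x0026; MLP, and RF &#x0026; MLP has the lowest total time among them. Ignoring accuracy, the fastest combination overall is DT, NB &#x0026; MLP.</p>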
</sec>
</sec>
<sec id="s10">
<label>10</label>
<title>Conclusion</title>
<p>This work focused on developing the best model for predicting malicious URLs using machine learning and deep learning. Three machine learning algorithms, namely Decision Tree, Random Forest, and Na&#x00EF;ve Bayes, and one deep learning algorithm, namely Multilayer Perceptron, were used in this paper. All the machine learning and deep learning algorithms were evaluated individually and in different combinations using various evaluation metrics. The combination of Random Forest (RF) and Multilayer Perceptron (MLP) was found to be the best combined model, with a validation accuracy of 81% and a testing accuracy of 80%. It balances accuracy, training time, and testing time, making this combination the most efficient among all the algorithms.</p>
</sec>
</body>
<back>
<ack>
<p>None.</p>
</ack>
<sec><title>Funding Statement</title>
<p>The authors received no specific funding for this study.</p>
</sec>
<sec><title>Author Contributions</title>
<p>Study conception and design, Imran Khan, Meenakshi Megavarnam; Data collection, Meenakshi Megavarnam; Analysis and interpretation of results, Imran Khan; Supervision, Imran Khan; Manuscript preparation, Meenakshi Megavarnam. All authors reviewed the results and approved the final version of the manuscript.</p>
</sec>
<sec sec-type="data-availability"><title>Availability of Data and Materials</title>
<p>The authors confirm that the data supporting the findings of this study are available within the article.</p>
</sec>
<sec><title>Ethics Approval</title>
<p>Not applicable.</p>
</sec>
<sec sec-type="COI-statement"><title>Conflicts of Interest</title>
<p>The authors declare no conflicts of interest to report regarding the present study.</p>
</sec>
<ref-list content-type="authoryear">
<title>References</title>
<ref id="ref-1"><label>[1]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><given-names>L.</given-names> <surname>Andrew</surname></string-name></person-group>, &#x201C;<chapter-title>The vulnerability of vital systems: How &#x2018;critical infrastructure&#x2019; became a security problem</chapter-title>,&#x201D; in <source>Securing &#x2018;the Homeland&#x2019;</source>. <publisher-loc>London, UK</publisher-loc>: <publisher-name>Routledge</publisher-name>, <year>2020</year>, pp. <fpage>17</fpage>&#x2013;<lpage>39</lpage>.</mixed-citation></ref>
<ref id="ref-2"><label>[2]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>C.</given-names> <surname>Iwendi</surname></string-name>, <string-name><given-names>J. H.</given-names> <surname>Anajemba</surname></string-name>, <string-name><given-names>C.</given-names> <surname>Biamba</surname></string-name>, and <string-name><given-names>D.</given-names> <surname>Ngubo</surname></string-name></person-group>, &#x201C;<article-title>Security of things intrusion detection system for smart healthcare</article-title>,&#x201D; <source>Electronics</source>, vol. <volume>10</volume>, no. <issue>12</issue>, <year>Jun. 2021</year>, <comment>Art. no. 1375</comment>. doi: <pub-id pub-id-type="doi">10.3390/electronics10121375</pub-id>.</mixed-citation></ref>
<ref id="ref-3"><label>[3]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>X.</given-names> <surname>Wu</surname></string-name> <etal>et al.</etal></person-group>, &#x201C;<article-title>Threat analysis for space information network based on network security attributes: A review</article-title>,&#x201D; <source>Complex Intell. Syst.</source>, vol. <volume>9</volume>, no. <issue>3</issue>, pp. <fpage>3429</fpage>&#x2013;<lpage>3468</lpage>, <year>Nov. 2022</year>. doi: <pub-id pub-id-type="doi">10.1007/s40747-022-00899-z</pub-id>.</mixed-citation></ref>
<ref id="ref-4"><label>[4]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>L.</given-names> <surname>Tan</surname></string-name>, <string-name><given-names>N.</given-names> <surname>Shi</surname></string-name>, <string-name><given-names>K.</given-names> <surname>Yu</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Aloqaily</surname></string-name>, and <string-name><given-names>Y.</given-names> <surname>Jararweh</surname></string-name></person-group>, &#x201C;<article-title>A chain-empowered access control framework for smart devices in green internet of things</article-title>,&#x201D; <source>ACM Trans. Internet Technol.</source>, vol. <volume>21</volume>, no. <issue>3</issue>, pp. <fpage>1</fpage>&#x2013;<lpage>20</lpage>, <year>Jun. 2021</year>.</mixed-citation></ref>
<ref id="ref-5"><label>[5]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>F.</given-names> <surname>Khan</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Ahamed</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Kadry</surname></string-name>, and <string-name><given-names>L. K.</given-names> <surname>Ramasamy</surname></string-name></person-group>, &#x201C;<article-title>Detecting malicious URLs using binary classification through adaboost algorithm</article-title>,&#x201D; <source>Int. J. Electr. Comput. Eng.</source>, vol. <volume>10</volume>, no. <issue>1</issue>, pp. <fpage>997</fpage>&#x2013;<lpage>1005</lpage>, <year>Feb. 2020</year>. doi: <pub-id pub-id-type="doi">10.11591/ijece.v10i1.pp997-1005</pub-id>.</mixed-citation></ref>
<ref id="ref-6"><label>[6]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>M. K.</given-names> <surname>Prabakaran</surname></string-name>, <string-name><given-names>P.</given-names> <surname>Meenakshi Sundaram</surname></string-name>, and <string-name><given-names>A. D.</given-names> <surname>Chandrasekar</surname></string-name></person-group>, &#x201C;<article-title>An enhanced deep learning-based phishing detection mechanism to effectively identify malicious URLs using variational autoencoders</article-title>,&#x201D; <source>IET Inf. Secur.</source>, vol. <volume>17</volume>, no. <issue>3</issue>, pp. <fpage>423</fpage>&#x2013;<lpage>440</lpage>, <year>Jan. 2023</year>.</mixed-citation></ref>
<ref id="ref-7"><label>[7]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>M. M.</given-names> <surname>Aljabri</surname></string-name> <etal>et al.</etal></person-group>, &#x201C;<article-title>An assessment of lexical, network, and content-based features for detecting malicious URLs using machine learning and deep learning models</article-title>,&#x201D; <source>Comput. Intell. Neurosci.</source>, vol. <volume>2022</volume>, no. <issue>1</issue>, <year>2022</year>, <comment>Art. no. 3241216</comment>. doi: <pub-id pub-id-type="doi">10.1155/2022/3241216</pub-id>; <pub-id pub-id-type="pmid">36059391</pub-id>.</mixed-citation></ref>
<ref id="ref-8"><label>[8]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>W.</given-names> <surname>Yang</surname></string-name>, <string-name><given-names>W.</given-names> <surname>Zuo</surname></string-name>, and <string-name><given-names>B.</given-names> <surname>Cui</surname></string-name></person-group>, &#x201C;<article-title>Detecting malicious URLs via a keyword-based convolutional gated-recurrent-unit neural network</article-title>,&#x201D; <source>IEEE Access</source>, vol. <volume>7</volume>, pp. <fpage>29891</fpage>&#x2013;<lpage>29900</lpage>, <year>Jan. 2019</year>. doi: <pub-id pub-id-type="doi">10.1109/ACCESS.2019.2895751</pub-id>.</mixed-citation></ref>
<ref id="ref-9"><label>[9]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>O. V.</given-names> <surname>Lee</surname></string-name> <etal>et al.</etal></person-group>, &#x201C;<article-title>A malicious URLs detection system using optimization and machine learning classifiers</article-title>,&#x201D; <source>Indones. J. Electr. Eng. Comput. Sci.</source>, vol. <volume>17</volume>, no. <issue>3</issue>, pp. <fpage>1210</fpage>&#x2013;<lpage>1214</lpage>, <year>Mar. 2020</year>. doi: <pub-id pub-id-type="doi">10.11591/ijeecs.v17.i3.pp1210-1214</pub-id>.</mixed-citation></ref>
<ref id="ref-10"><label>[10]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><given-names>N.</given-names> <surname>Bhadouria</surname></string-name></person-group>, &#x201C;<article-title>Malicious_URL&#x2019;s_Dataset</article-title>,&#x201D; <comment>2022. Accessed: Jun. 5, 2023</comment>. [Online]. Available: <ext-link ext-link-type="uri" xlink:href="https://www.kaggle.com/datasets/naveenbhadouria/malicious">https://www.kaggle.com/datasets/naveenbhadouria/malicious</ext-link></mixed-citation></ref>
<ref id="ref-11"><label>[11]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>A.</given-names> <surname>Alanazi</surname></string-name> and <string-name><given-names>A.</given-names> <surname>Gumaei</surname></string-name></person-group>, &#x201C;<article-title>A decision-fusion-based ensemble approach for malicious websites detection</article-title>,&#x201D; <source>Appl. Sci.</source>, vol. <volume>13</volume>, no. <issue>18</issue>, <year>Sep. 2023</year>, <comment>Art. no. 10260</comment>. doi: <pub-id pub-id-type="doi">10.3390/app131810260</pub-id>.</mixed-citation></ref>
<ref id="ref-12"><label>[12]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>T.</given-names> <surname>Le</surname></string-name>, <string-name><given-names>M. Y.</given-names> <surname>Lee</surname></string-name>, <string-name><given-names>J. R.</given-names> <surname>Park</surname></string-name>, and <string-name><given-names>S. W.</given-names> <surname>Baik</surname></string-name></person-group>, &#x201C;<article-title>Oversampling techniques for bankruptcy prediction: Novel features from a transaction dataset</article-title>,&#x201D; <source>Symmetry</source>, vol. <volume>10</volume>, no. <issue>4</issue>, <year>Mar. 2018</year>, <comment>Art. no. 79</comment>.</mixed-citation></ref>
<ref id="ref-13"><label>[13]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>A.</given-names> <surname>Anagnostis</surname></string-name> <etal>et al.</etal></person-group>, &#x201C;<article-title>A deep learning approach for anthracnose infected trees classification in walnut orchards</article-title>,&#x201D; <source>Comput. Electron. Agric.</source>, vol. <volume>182</volume>, <year>Mar. 2021</year>, <comment>Art. no. 105998</comment>. doi: <pub-id pub-id-type="doi">10.1016/j.compag.2021.105998</pub-id>.</mixed-citation></ref>
<ref id="ref-14"><label>[14]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>M.</given-names> <surname>Aljabri</surname></string-name> <etal>et al.</etal></person-group>, &#x201C;<article-title>Detecting malicious URLs using machine learning techniques: Review and research directions</article-title>,&#x201D; <source>IEEE Access</source>, vol. <volume>10</volume>, pp. <fpage>121395</fpage>&#x2013;<lpage>121417</lpage>, <year>Aug. 2022</year>. doi: <pub-id pub-id-type="doi">10.1109/ACCESS.2022.3222307</pub-id>.</mixed-citation></ref>
<ref id="ref-15"><label>[15]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>Y.</given-names> <surname>Mourtaji</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Bouhorma</surname></string-name>, <string-name><given-names>D.</given-names> <surname>Alghazzawi</surname></string-name>, <string-name><given-names>G.</given-names> <surname>Aldabbagh</surname></string-name>, and <string-name><given-names>A.</given-names> <surname>Alghamdi</surname></string-name></person-group>, &#x201C;<article-title>Hybrid rule-based solution for phishing URL detection using convolutional neural network</article-title>,&#x201D; <source>Wirel. Commun. Mob. Comput.</source>, vol. <volume>2021</volume>, no. <issue>1</issue>, pp. <fpage>1</fpage>&#x2013;<lpage>24</lpage>, <year>2021</year>, <comment>Art. no. 8241104</comment>. doi: <pub-id pub-id-type="doi">10.1155/2021/8241104</pub-id>.</mixed-citation></ref>
<ref id="ref-16"><label>[16]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>R.</given-names> <surname>Bayraktar</surname></string-name>, <string-name><given-names>B.</given-names> <surname>Haznedar</surname></string-name>, <string-name><given-names>K. S.</given-names> <surname>Bayram</surname></string-name>, and <string-name><given-names>M. F.</given-names> <surname>Haso&#x011F;lu</surname></string-name></person-group>, &#x201C;<article-title>Plant disease detection by using adaptive neuro-fuzzy inference system</article-title>,&#x201D; <source>Tamap J. Eng.</source>, vol. <volume>2021</volume>, no. <issue>125</issue>, pp. <fpage>1</fpage>&#x2013;<lpage>10</lpage>, <year>Sep. 2021</year>.</mixed-citation></ref>
<ref id="ref-17"><label>[17]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>H.</given-names> <surname>Liu</surname></string-name> and <string-name><given-names>B.</given-names> <surname>Lang</surname></string-name></person-group>, &#x201C;<article-title>Machine learning and deep learning methods for intrusion detection systems: A survey</article-title>,&#x201D; <source>Appl. Sci.</source>, vol. <volume>9</volume>, no. <issue>20</issue>, <year>Oct. 2019</year>, <comment>Art. no. 4396</comment>.</mixed-citation></ref>
<ref id="ref-18"><label>[18]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>A. R.</given-names> <surname>Mohammed</surname></string-name>, <string-name><given-names>S. A.</given-names> <surname>Mohammed</surname></string-name>, and <string-name><given-names>S.</given-names> <surname>Shirmohammadi</surname></string-name></person-group>, &#x201C;<article-title>Machine learning and deep learning based traffic classification and prediction in software defined networking</article-title>,&#x201D; in <conf-name>IEEE Int. Symp. Meas. Netw. (M&#x0026;N)</conf-name>, <publisher-loc>Catania, Italy</publisher-loc>, <year>Jul. 8&#x2013;10, 2019</year>, pp. <fpage>1</fpage>&#x2013;<lpage>6</lpage>.</mixed-citation></ref>
</ref-list>
</back></article>