<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.1 20151215//EN" "http://jats.nlm.nih.gov/publishing/1.1/JATS-journalpublishing1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xml:lang="en" article-type="research-article" dtd-version="1.1">
<front>
<journal-meta>
<journal-id journal-id-type="pmc">IASC</journal-id>
<journal-id journal-id-type="nlm-ta">IASC</journal-id>
<journal-id journal-id-type="publisher-id">IASC</journal-id>
<journal-title-group>
<journal-title>Intelligent Automation &#x0026; Soft Computing</journal-title>
</journal-title-group>
<issn pub-type="epub">2326-005X</issn>
<issn pub-type="ppub">1079-8587</issn>
<publisher>
<publisher-name>Tech Science Press</publisher-name>
<publisher-loc>USA</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">50452</article-id>
<article-id pub-id-type="doi">10.32604/iasc.2024.050452</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>A Hierarchical Two-Level Feature Fusion Approach for SMS Spam Filtering</article-title>
<alt-title alt-title-type="left-running-head">A Hierarchical Two-Level Feature Fusion Approach for SMS Spam Filtering</alt-title>
<alt-title alt-title-type="right-running-head">A Hierarchical Two-Level Feature Fusion Approach for SMS Spam Filtering</alt-title>
</title-group>
<contrib-group>
<contrib id="author-1" contrib-type="author">
<name name-style="western"><surname>Al-Kabbi</surname><given-names>Hussein Alaa</given-names></name><xref ref-type="aff" rid="aff-1">1</xref><xref ref-type="aff" rid="aff-2">2</xref></contrib>
<contrib id="author-2" contrib-type="author" corresp="yes">
<name name-style="western"><surname>Feizi-Derakhshi</surname><given-names>Mohammad-Reza</given-names></name><xref ref-type="aff" rid="aff-1">1</xref><email>mfeizi@tabrizu.ac.ir</email></contrib>
<contrib id="author-3" contrib-type="author">
<name name-style="western"><surname>Pashazadeh</surname><given-names>Saeed</given-names></name><xref ref-type="aff" rid="aff-3">3</xref></contrib>
<aff id="aff-1"><label>1</label><institution>Computerized Intelligence Systems Laboratory, Department of Computer Engineering, University of Tabriz</institution>, <addr-line>Tabriz, 51368</addr-line>, <country>Iran</country></aff>
<aff id="aff-2"><label>2</label><institution>Ministry of Education Iraq, General Direction of Vocational Education, Al-Najaf, 54001</institution>, <country>Iraq</country></aff>
<aff id="aff-3"><label>3</label><institution>Department of Computer Engineering, University of Tabriz</institution>, <addr-line>Tabriz, 51368</addr-line>, <country>Iran</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>&#x002A;</label>Corresponding Author: Mohammad-Reza Feizi-Derakhshi. Email: <email>mfeizi@tabrizu.ac.ir</email></corresp>
</author-notes>
<pub-date date-type="collection" publication-format="electronic">
<year>2024</year></pub-date>
<pub-date date-type="pub" publication-format="electronic"><day>06</day><month>09</month><year>2024</year></pub-date>
<volume>39</volume>
<issue>4</issue>
<fpage>665</fpage>
<lpage>682</lpage>
<history>
<date date-type="received">
<day>07</day>
<month>2</month>
<year>2024</year>
</date>
<date date-type="accepted">
<day>27</day>
<month>5</month>
<year>2024</year>
</date>
</history>
<permissions>
<copyright-statement>&#x00A9; 2024 The Authors.</copyright-statement>
<copyright-year>2024</copyright-year>
<copyright-holder>Published by Tech Science Press.</copyright-holder>
<license xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>This work is licensed under a <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</ext-link>, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<self-uri content-type="pdf" xlink:href="TSP_IASC_50452.pdf"></self-uri>
<abstract>
<p>SMS spam poses a significant challenge to maintaining user privacy and security. Recently, spammers have employed fraudulent writing styles to bypass spam detection systems. To address the challenge of sophisticated SMS spam, this paper introduces a novel two-level detection system that utilizes deep learning techniques for effective spam identification. The system comprises five steps, beginning with the preprocessing of SMS data. RoBERTa word embedding is then applied to convert text into a numerical format for deep learning analysis. Feature extraction is performed using a Convolutional Neural Network (CNN) for word-level analysis and a Bidirectional Long Short-Term Memory (BiLSTM) network for sentence-level analysis. The two-level feature extraction enables a complete understanding of individual words and sentence structure. The novel part of the proposed approach is the Hierarchical Attention Network (HAN), which fuses and selects features at two levels through an attention mechanism. The HAN operates on both words and sentences to focus on the most pertinent aspects of messages for spam detection. This network is effective in capturing meaningful features, considering both word-level and sentence-level semantics. In the classification step, the model classifies the messages into spam and ham. This hybrid deep learning method improves the feature representation and enhances the model&#x2019;s spam detection capabilities. By significantly reducing the incidence of SMS spam, our model contributes to a safer mobile communication environment, protecting users against potential phishing attacks and scams, and aiding in compliance with privacy and security regulations. The model&#x2019;s performance was evaluated using the SMS Spam Collection Dataset from the UCI Machine Learning Repository. Cross-validation is employed to account for the dataset&#x2019;s imbalanced nature, ensuring a reliable evaluation. The proposed model achieved an accuracy of 99.48%, underscoring its effectiveness in identifying SMS spam.</p>
</abstract>
<kwd-group kwd-group-type="author">
<kwd>SMS spam detection</kwd>
<kwd>hierarchical attention network</kwd>
<kwd>text classification</kwd>
<kwd>natural language processing</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec id="s1">
<label>1</label>
<title>Introduction</title>
<p>Short Message Service (SMS) spam has become prevalent in mobile communication systems. Based on a recent publication by Slicktext [<xref ref-type="bibr" rid="ref-1">1</xref>], the usage of SMS is widespread, with approximately five billion people utilizing this communication channel. The number of SMS users is projected to reach 5.9 billion by 2025. Unfortunately, the increased prevalence of SMS usage has also resulted in a surge in malicious activities such as spam and smishing. These activities inconvenience users and pose significant financial risks to individuals and businesses [<xref ref-type="bibr" rid="ref-2">2</xref>]. The primary objective of the senders behind these spam messages is to illicitly acquire personal or financial information via SMS, often through the inclusion of fraudulent content, malicious links, or malware.</p>
<p>Three main categories of SMS-based spam can be distinguished: (i) general SMS spam, which includes unwanted messages used for bulk marketing and spreading false information; (ii) premium rate scams that deceive individuals into dialing high-cost numbers or registering for expensive services under false pretenses; and (iii) phishing or smishing tactics, in which recipients are sent texts prompting them to contact specific numbers as a ploy to obtain sensitive data for nefarious objectives [<xref ref-type="bibr" rid="ref-3">3</xref>]. The detection of SMS spam is therefore essential to maintain user privacy and security. Traditional rule-based and keyword-based methods for SMS spam detection have not proved sufficient to handle the evolving nature of spam messages, which often employ computationally inexpensive techniques to evade detection [<xref ref-type="bibr" rid="ref-4">4</xref>].</p>
<p>Artificial intelligence tools have been developed to assist in various fields, including healthcare [<xref ref-type="bibr" rid="ref-5">5</xref>], social networks [<xref ref-type="bibr" rid="ref-6">6</xref>], network security [<xref ref-type="bibr" rid="ref-7">7</xref>], and other real-life applications. Natural Language Processing (NLP) and deep learning advancements have opened up new opportunities for improving SMS spam detection in recent years. Deep learning models, such as recurrent neural networks (RNN) and transformer-based architectures [<xref ref-type="bibr" rid="ref-8">8</xref>], have performed remarkably in various text classification tasks. These models can effectively capture text messages&#x2019; semantic and contextual information, enabling more accurate spam detection [<xref ref-type="bibr" rid="ref-9">9</xref>]. Spammers have begun adopting novel writing styles to evade SMS spam detection approaches. By modifying linguistic patterns, grammar usage, and content structure, spammers aim to create messages that bypass filters [<xref ref-type="bibr" rid="ref-10">10</xref>]. This dynamic shift in writing techniques presents a significant challenge for existing spam detection methods, as they often rely on historical data and recognizable patterns. In response to these evolving tactics, this paper presents a novel approach to addressing this challenge through a two-level SMS spam detection system based on words and sentences. Recognizing that words make sentences and sentences make documents [<xref ref-type="bibr" rid="ref-11">11</xref>], our approach allows us to detect all attempts by fraudsters to bypass spam detection tools, such as using different words or changing the location of words within sentences.</p>
<p>The proposed two-level SMS spam detection method consists of five steps: preprocessing, RoBERTa word embedding, two-level feature extraction using a CNN at the word level and a BiLSTM at the sentence level, feature fusion and selection through a Hierarchical Attention Network (HAN), and classification. The innovative part of our approach is the use of the HAN [<xref ref-type="bibr" rid="ref-12">12</xref>], which integrates the features extracted at the word level and the sentence level. This synthesis is not merely combinative but competitively evaluative, with the HAN&#x2019;s attention mechanisms assessing the prominence of each linguistic element. We evaluate the performance of the proposed model on the UCI SMS dataset [<xref ref-type="bibr" rid="ref-13">13</xref>], which contains more than 5000 labeled SMS messages. The results demonstrate the effectiveness of the proposed method in accurately detecting SMS spam. To summarize the research insights, this study focuses on the following Research Questions (RQs):
<list list-type="bullet">
<list-item>
<p><bold>RQ1:</bold> How does the integration of a two-level feature fusion approach improve the accuracy and robustness of SMS spam detection, especially for short texts lacking contextual information?</p></list-item>
<list-item>
<p><bold>RQ2:</bold> How can we confirm that the two-level method excels in SMS spam detection?</p></list-item>
</list></p>
<p>The main contributions of this paper and possible Answers to the Research Questions (ARQs) are summarized as follows:
<list list-type="bullet">
<list-item>
<p><bold>ARQ1:</bold> Hierarchical Feature Integration&#x2014;The proposed two-level model innovatively combines feature extraction at both the word and sentence levels through a competitive fusion mechanism within the Hierarchical Attention Network. This allows for a more nuanced representation of textual features, which is critical for accurate SMS spam detection.</p></list-item>
<list-item>
<p><bold>ARQ2:</bold> Superior Performance&#x2014;Empirical evaluations on the UCI SMS dataset demonstrate that the model achieves state-of-the-art performance, with an accuracy of 99.48%, indicating its capability to manage various types of SMS spam.</p></list-item>
</list></p>
<p>The remainder of this paper is organized as follows: <xref ref-type="sec" rid="s2">Section 2</xref> provides an overview of related work in SMS spam detection. <xref ref-type="sec" rid="s3">Section 3</xref> briefly explains the techniques used in the proposed method. <xref ref-type="sec" rid="s4">Section 4</xref> describes the methodology in detail. <xref ref-type="sec" rid="s5">Section 5</xref> presents the experimental setup, and <xref ref-type="sec" rid="s6">Section 6</xref> presents the evaluation results. <xref ref-type="sec" rid="s7">Section 7</xref> discusses the results. Finally, <xref ref-type="sec" rid="s8">Section 8</xref> concludes the paper, highlighting the contributions and future work directions.</p>
</sec>
<sec id="s2">
<label>2</label>
<title>Related Work</title>
<p>The field of SMS spam detection has evolved significantly, witnessing a transition from rule-based methods to deep learning techniques. In the early stages, rule-based approaches utilized predefined patterns and keywords to identify spam messages. However, these methods had limitations in adapting to evolving spam tactics and handling noisy data. We divided the related work into two groups.</p>
<sec id="s2_1">
<label>2.1</label>
<title>Traditional Methods</title>
<p>SMS spam filtering has been a long-standing research subject, with traditional machine learning methods like SVM [<xref ref-type="bibr" rid="ref-14">14</xref>], Naive Bayes [<xref ref-type="bibr" rid="ref-15">15</xref>], and decision trees [<xref ref-type="bibr" rid="ref-16">16</xref>] being proposed. However, these approaches need complex feature engineering and have difficulty dealing with noisy or imbalanced data [<xref ref-type="bibr" rid="ref-17">17</xref>]. Most of these studies aimed to improve the classifier&#x2019;s architecture rather than giving priority to feature extraction. A new method was recently presented by Ali et al. [<xref ref-type="bibr" rid="ref-18">18</xref>], who proposed a unique combination of traditional machine-learning techniques for SMS spam detection. Specifically, it uses Multiple Linear Regression (MLR) for feature weighting and an Extreme Learning Machine (ELM) for classification. This method achieved an accuracy of 98.7% on the UCI SMS dataset. Also, Hosseinpour et al. [<xref ref-type="bibr" rid="ref-19">19</xref>] proposed an ensemble learning method based on logistic regression and random forest algorithms. The ensemble learning approach achieved an accuracy of 98.06%. Pudasaini et al. [<xref ref-type="bibr" rid="ref-20">20</xref>] combined Relevance Vector Machine (RVM), SVM, Naive Bayes, and KNN, with a majority vote determining the final output. This research conducts a thorough comparative analysis of text classification algorithms for effective spam detection, emphasizing TF-IDF vectorization for preprocessing. The RVM stands out, achieving an F1-score of 97.51% in the UCI spam SMS dataset.</p>
</sec>
<sec id="s2_2">
<label>2.2</label>
<title>Deep Learning Methods</title>
<p>With the emergence of deep learning, researchers have explored new approaches to tackle SMS spam detection challenges. Liu et al. [<xref ref-type="bibr" rid="ref-21">21</xref>] introduced a modified version of the Transformer for SMS spam detection. Their comprehensive analysis encompassed various existing methods and evaluated them against datasets like SMS Spam Collection v.1 and UtkMl&#x2019;s Twitter dataset [<xref ref-type="bibr" rid="ref-22">22</xref>]. This method achieved an accuracy of 98.92% for the SMS spam dataset. Srinivasarao et al. [<xref ref-type="bibr" rid="ref-23">23</xref>] present a new model in text mining for spam and ham message differentiation. It introduces a fuzzy-based recurrent neural network with Harris Hawk optimization (FRNN-HHO) for classification. Post-classification sentiment analysis is performed to improve accuracy. In experimental evaluation using SMS, Email, and spam-assassin datasets, this method achieved an accuracy of 98.61% for SMS spam detection. Giri et al. [<xref ref-type="bibr" rid="ref-24">24</xref>] propose four neural network models (CNN BUNOW, CNN-LSTM BUNOW, CNN GloVe, and CNN-LSTM GloVe) for distinguishing spam from non-spam messages using SMS Spam Collection v.1 dataset. The models are trained and tested on different train-test splits. CNN-LSTM BUNOW performs best among four models with an accuracy of 99.04%, 99.01%, 98.92%, and 98.44% for 85%&#x2013;15%, 80%&#x2013;20%, 75%&#x2013;25%, and 70%&#x2013;30% train-test splits, respectively.</p>
<p>Debnath et al. [<xref ref-type="bibr" rid="ref-25">25</xref>] aim to address the SMS spam issue and improve detection accuracy. Various machine learning and deep learning models, including LSTM and BERT, are utilized on a UCI dataset to classify SMS spam. The proposed deep learning approach achieves high accuracy rates of 99.28% with BERT and 98.84% with LSTM. Ghourabi et al. [<xref ref-type="bibr" rid="ref-26">26</xref>] propose a deep learning model, CNN-LSTM, to detect SMS spam messages effectively. It combines CNN and LSTM to handle text messages for Arabic and English datasets. Experimental results demonstrate its performance, achieving an accuracy of 98.37%. Abayomi-Alli et al. [<xref ref-type="bibr" rid="ref-27">27</xref>] propose a deep learning approach that utilizes a Bidirectional Long Short-Term Memory (BiLSTM) model for SMS spam detection. The study involves two datasets: the ExAIS_SMS [<xref ref-type="bibr" rid="ref-28">28</xref>], a unique indigenous dataset, and the well-known UCI dataset. The proposed method leverages the distinctive characteristics of BiLSTM to achieve a high classification rate in detecting spam SMS messages. The BiLSTM attained an accuracy of 98.6% on the UCI SMS dataset. Wei et al. [<xref ref-type="bibr" rid="ref-29">29</xref>] propose a lightweight deep neural model called Lightweight Gated Recurrent Unit (LGRU) for SMS spam detection. They incorporate enhancing semantics retrieved from external knowledge (WordNet) to augment the understanding of SMS text inputs for better classification. LGRU achieved an accuracy of 98.87%. Ardeshir-Larijanie et al. [<xref ref-type="bibr" rid="ref-30">30</xref>] introduce a novel integration of hybrid classical-quantum transfer learning with NLP utilizing a pre-trained BERT model and a variational quantum circuit for text classification. This approach achieved an overall AUC-ROC of 95%.</p>
<p>Some papers deal with SMS spam detection using private datasets such as [<xref ref-type="bibr" rid="ref-31">31</xref>] or using local language datasets [<xref ref-type="bibr" rid="ref-32">32</xref>,<xref ref-type="bibr" rid="ref-33">33</xref>].</p>
</sec>
</sec>
<sec id="s3">
<label>3</label>
<title>Background</title>
<p>The &#x201C;Background&#x201D; section of the paper provides a concise overview of the deep learning approaches integral to the proposed model.</p>
<sec id="s3_1">
<label>3.1</label>
<title>RoBERTa</title>
<p>The Robustly Optimized BERT Pretraining Approach (RoBERTa) is a transformer-based pretrained language model that refines the pretraining procedure of BERT. It was developed by Facebook AI and released in 2019 [<xref ref-type="bibr" rid="ref-34">34</xref>]. RoBERTa is based on the BERT architecture but with several modifications to pretraining, including more training data and longer training sequences. As a result of these modifications, RoBERTa has been shown to outperform BERT on several downstream tasks, including natural language inference (NLI), question answering (QA), and sentiment analysis [<xref ref-type="bibr" rid="ref-35">35</xref>]. As an encoder model, RoBERTa produces contextual representations that can be used for various language understanding tasks, such as text classification and question answering.</p>
</sec>
<sec id="s3_2">
<label>3.2</label>
<title>Word-Level CNN</title>
<p>A word-level CNN is a convolutional neural network (CNN) that operates on sequences of word embeddings for text classification tasks. CNNs are well suited to text classification because they can learn to extract features from text data that are relevant to the task [<xref ref-type="bibr" rid="ref-36">36</xref>]. They can learn complex features from word embeddings, which makes them a powerful tool for text classification, and they have been shown to achieve state-of-the-art results on various tasks, including sentiment analysis, spam detection, and topic classification. Word-level CNNs are effective at detecting SMS spam because they focus on capturing features from the crucial words in the message. They capture the significance of specific words and their combinations, allowing them to differentiate between spam and legitimate messages based on these word-level features. Additionally, word embeddings enhance their ability to interpret the meaning of words in the context of SMS content [<xref ref-type="bibr" rid="ref-37">37</xref>].</p>
</sec>
<sec id="s3_3">
<label>3.3</label>
<title>BiLSTM</title>
<p>Bidirectional LSTM (BiLSTM) is a type of recurrent neural network (RNN) that can be used for text classification tasks. RNNs are well suited to text classification because they can learn long-range dependencies in text data. Unlike unidirectional RNNs, a BiLSTM processes text data in both forward and backward directions, allowing for a more comprehensive extraction of information [<xref ref-type="bibr" rid="ref-38">38</xref>]. The text is first converted into a sequence of word embeddings, which are dense vector representations of words that capture the semantic and syntactic relationships between words in the sentence. The output of the BiLSTM is a sequence of hidden states, which contain the information the BiLSTM has learned about the text data.</p>
</sec>
<sec id="s3_4">
<label>3.4</label>
<title>Hierarchical Attention Network (HAN)</title>
<p>The Hierarchical Attention Network (HAN) is a pivotal component of the proposed model for text understanding. HAN operates at both word and sentence levels, allowing it to capture hierarchical relationships within text data. By employing attention mechanisms, the model highlights the essential words and sentences within a message, enabling the acquisition of crucial information [<xref ref-type="bibr" rid="ref-39">39</xref>]. HAN utilizes a GRU (Gated Recurrent Unit) at the word level to capture sequential dependencies among words within a sentence. Attention mechanisms are subsequently applied to assign varying weights to individual words, reflecting their significance in the overall message representation. At the sentence level, another attention mechanism is deployed to assess the importance of each sentence in the message. This dual-level attention mechanism enables the model to prioritize sentences containing substantial information while filtering out irrelevant or less informative ones.</p>
<p>The collaboration of word-level and sentence-level attention within HAN provides a hierarchical representation that comprehensively captures the text&#x2019;s context and semantics at various levels. This makes HAN exceptionally effective in tasks such as sentiment analysis, fake news detection, and text summarization, where the hierarchical structure of the text is paramount for accurate predictions and meaningful interpretations. <xref ref-type="fig" rid="fig-1">Fig. 1</xref> illustrates the HAN architecture.</p>
<fig id="fig-1">
<label>Figure 1</label>
<caption>
<title>HAN architecture. Reprinted with permission from reference [<xref ref-type="bibr" rid="ref-39">39</xref>]. Copyright 2019, ACM</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="IASC_50452-fig-1.tif"/>
</fig>
</sec>
</sec>
<sec id="s4">
<label>4</label>
<title>Proposed Method</title>
<p>The proposed method in SMS spam detection consists of five main steps to achieve accurate classification. The method utilizes a combination of techniques to enhance performance. <xref ref-type="fig" rid="fig-2">Fig. 2</xref> shows the proposed model steps.</p>
<fig id="fig-2">
<label>Figure 2</label>
<caption>
<title>Proposed method framework structure</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="IASC_50452-fig-2.tif"/>
</fig>
<sec id="s4_1">
<label>4.1</label>
<title>Preprocessing</title>
<p>The initial stage of our SMS spam detection method involves data preprocessing. This crucial process prepares raw text messages so that machine learning and deep learning models can analyze them effectively [<xref ref-type="bibr" rid="ref-9">9</xref>]. To enhance data quality, we undertake the following preprocessing steps:
<list list-type="bullet">
<list-item>
<p>Punctuation removal: Unnecessary punctuation marks and symbols are removed from the text messages. Removing punctuation enhances the focus on content and meaning, simplifying data representation and decreasing vocabulary size.</p></list-item>
<list-item>
<p>Lowercasing: All text words are converted to lowercase for consistency, ensuring uniformity and standardized vocabulary usage, eliminating variations due to capitalization [<xref ref-type="bibr" rid="ref-40">40</xref>].</p></list-item>
<list-item>
<p>Stop-word removal: Frequently occurring words such as &#x201C;the,&#x201D; &#x201C;is,&#x201D; and &#x201C;a&#x201D; (stop words) are excluded as they often lack substantial meaning. This step reduces data dimensionality and prevents interference in spam detection, emphasizing words with higher discriminatory significance [<xref ref-type="bibr" rid="ref-41">41</xref>].</p></list-item>
<list-item>
<p>Symbol removal: We remove symbols or characters that play no essential role in the message&#x2019;s content, so that the model stays focused on meaningful information. This includes the removal of hashtags and other extraneous symbols [<xref ref-type="bibr" rid="ref-42">42</xref>].</p></list-item>
<list-item>
<p>Normalization: Text normalization assumes paramount significance in Short Message Services (SMS), where character limits are stringent, and senders often resort to shortcuts to economize on space and costs. This procedure involves transforming word variations, such as &#x201C;u&#x201D; to &#x201C;you&#x201D; and &#x201C;2&#x201D; to &#x201C;to,&#x201D; into their standardized equivalents [<xref ref-type="bibr" rid="ref-43">43</xref>].</p></list-item>
</list></p>
<p>Executing these preprocessing procedures makes the text data suitable for deep learning models. The cleaned text is then converted into numerical representations in the embedding step, which enables practical analysis and enhances our SMS spam detection approach.</p>
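<p>The preprocessing steps listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function name <monospace>preprocess</monospace>, the tiny stop-word set, and the two-entry normalization map are hypothetical and only reuse the examples given in the list, and the order in which the steps are applied is one possible choice.</p>
<p><code language="python" xml:space="preserve">
```python
import re
import string

# Hypothetical lookup tables for illustration only; they reuse the
# examples from the text and are not the lists used by the authors.
STOP_WORDS = {"the", "is", "a"}
NORMALIZATION_MAP = {"u": "you", "2": "to"}


def preprocess(message):
    """Clean one SMS message with the five steps, in one possible order."""
    # Lowercasing
    text = message.lower()
    # Punctuation removal (this also strips the '#' of hashtags)
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Symbol removal: anything left that is not a letter, digit, or space
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    tokens = text.split()
    # Normalization of SMS shortcuts to their standard forms
    tokens = [NORMALIZATION_MAP.get(token, token) for token in tokens]
    # Stop-word removal
    tokens = [token for token in tokens if token not in STOP_WORDS]
    return " ".join(tokens)


if __name__ == "__main__":
    print(preprocess("U have WON a prize!!! Text #WIN 2 claim"))
```
</code></p>
<p>Normalization is applied before stop-word removal here so that expanded shortcuts are also checked against the stop-word set. A blanket mapping such as &#x201C;2&#x201D; to &#x201C;to&#x201D; would also rewrite genuine numbers, so a practical normalizer would likely need context-aware rules.</p>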
</sec>
<sec id="s4_2">
<label>4.2</label>
<title>Embedding</title>
<p>In this paper, we leverage RoBERTa, a state-of-the-art transformer-based language model, for the word embedding phase of the proposed SMS spam detection methodology. RoBERTa excels in understanding the contextual relationships between words. It achieves this by representing words as dense vectors in a continuous vector space, capturing rich semantic information. Specifically, RoBERTa employs a bidirectional approach, considering each word&#x2019;s left and right contexts during pretraining, resulting in highly contextualized word embeddings. In our methodology, RoBERTa transforms the individual words within SMS messages into these context-aware vector representations. These embeddings, which encapsulate nuanced word meanings and contextual information, serve as the foundation for the subsequent stages of our model, such as feature extraction using convolutional neural network (CNN) and bidirectional long short-term memory (BiLSTM) networks. Ultimately, they enable our Hierarchical Attention Network to discern spam from legitimate messages by considering the relationships between words and sentences.</p>
</sec>
<sec id="s4_3">
<label>4.3</label>
<title>Parallel Feature Extraction</title>
<p>The proposed framework employs a dual-branch architecture for concurrent feature extraction at the word level and sentence level, thereby enabling a multifaceted analysis of textual data.</p>
<sec id="s4_3_1">
<label>4.3.1</label>
<title>Word Level</title>
<p>We utilize a Convolutional Neural Network (CNN) to refine word-level feature extraction. The efficacy of the CNN lies in its ability to capture localized textual patterns by applying convolutional filters across the word embeddings. These filters, of varying receptive field sizes, are adept at discerning salient word combinations and syntactic structures indicative of spam content. This process is important for distinguishing between legitimate and spam SMS messages, as it facilitates the detection of specific linguistic markers that may be obscured in isolated word embeddings. The resultant feature map from the word-level CNN encapsulates refined contextual insights, which are instrumental for the subsequent stages of our spam detection methodology.</p>
</sec>
<sec id="s4_3_2">
<label>4.3.2</label>
<title>Sentence Level</title>
<p>Building upon the local features identified by the word-level CNN, we use a sentence-level Bidirectional Long Short-Term Memory (BiLSTM) network. The BiLSTM strengthens the proposed framework by modeling the extended contextual information inherent within the SMS data. Its bidirectional processing enables it to relate each word to both the preceding and following contexts, thus capturing the nuanced semantic relationships in sentences. This enriched representation of sentence-level features is crucial for accurate spam discrimination. The features derived from the BiLSTM are passed to our Hierarchical Attention Network, which synthesizes and evaluates the textual data at both granular and holistic levels, thereby improving the precision of our SMS spam detection framework.</p>
</sec>
</sec>
<sec id="s4_4">
<label>4.4</label>
<title>Fusion and Selection</title>
<p>The HAN is a pivotal component designed to integrate the features extracted at both the word and sentence levels. It leverages hierarchical attention mechanisms to weigh the importance of words within sentences and of sentences within messages. By doing so, it discerns which words and sentences are most informative for spam detection, effectively filtering out irrelevant or redundant information. At the word level, attention is employed to identify significant words and their contextual importance within sentences. At the sentence level, the network determines the relevance of each sentence within the entire SMS message. This hierarchical attention mechanism ensures that our model concentrates on the most pertinent elements of the text, enhancing the effectiveness of spam classification. Furthermore, the HAN performs feature fusion by combining the weighted word and sentence embeddings to create a comprehensive and contextually rich representation of the SMS message. This fused representation forms the basis for the final spam classification, enabling the model to make informed decisions based on local and global cues in the text data.</p>
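<p>The attention-weighted fusion described above can be illustrated with a simplified sketch in plain Python. It is an assumption-laden toy rather than the network used in this work: the helper names (<monospace>softmax</monospace>, <monospace>attention_pool</monospace>, <monospace>fuse</monospace>) are hypothetical, the context vectors that a trained model would learn are passed in as fixed inputs, the learned projection that usually precedes the scoring step is omitted, and the two pooled vectors are fused by simple concatenation.</p>
<p><code language="python" xml:space="preserve">
```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def attention_pool(vectors, context):
    """Score each vector against a context vector and return the
    attention-weighted sum together with the attention weights."""
    scores = [sum(v * c for v, c in zip(vector, context)) for vector in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [
        sum(weight * vector[d] for weight, vector in zip(weights, vectors))
        for d in range(dim)
    ]
    return pooled, weights


def fuse(word_features, sentence_features, word_context, sentence_context):
    """Pool each level with its own attention, then concatenate (assumed fusion)."""
    word_vector, _ = attention_pool(word_features, word_context)
    sentence_vector, _ = attention_pool(sentence_features, sentence_context)
    return word_vector + sentence_vector


if __name__ == "__main__":
    words = [[1.0, 0.0], [0.0, 1.0]]
    sentences = [[0.5, 0.5]]
    print(fuse(words, sentences, [10.0, 0.0], [1.0, 1.0]))
```
</code></p>
<p>In this sketch, a feature vector that aligns strongly with the context vector receives almost all of the attention weight, which is the sense in which features compete for influence on the fused representation.</p>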
</sec>
<sec id="s4_5">
<label>4.5</label>
<title>Classification</title>
<p>The final stage of the proposed SMS spam detection model is classification. Once the features have been extracted and selected according to their weights by the Hierarchical Attention Network, the model makes a binary decision regarding the nature of each incoming message. For this purpose, a fully connected layer is used as the classifier. This fully connected layer uses the fused features to perform the classification task [<xref ref-type="bibr" rid="ref-44">44</xref>]. By applying a combination of linear transformations and non-linear activation functions, it maps the complex feature space to a decision boundary that separates spam messages from legitimate ones. The output of this layer is a probability score indicating the likelihood that the input message is spam. The sigmoid activation function is applied to ensure that the output falls within the [0, 1] range, allowing for straightforward probability interpretation. The classification decision is made by comparing this probability score to a predefined threshold, and the message is labeled as either &#x2018;spam&#x2019; or &#x2018;non-spam&#x2019; accordingly, thus concluding the SMS spam detection process.</p>
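<p>The decision rule of the classification step can be sketched as follows. This is a minimal illustration, not the trained classifier: it assumes a single linear layer with hypothetical weights and bias, and a threshold of 0.5, a value that the text does not specify.</p>
<p><code language="python" xml:space="preserve">
```python
import math


def sigmoid(z):
    """Logistic function mapping any real number into the (0, 1) range."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)  # avoids overflow for large negative inputs
    return exp_z / (1.0 + exp_z)


def classify(features, weights, bias, threshold=0.5):
    """Apply one linear layer and a sigmoid, then threshold the probability.

    The weights, bias, and default threshold are illustrative assumptions.
    """
    logit = sum(w * x for w, x in zip(weights, features)) + bias
    probability = sigmoid(logit)
    label = "spam" if probability >= threshold else "non-spam"
    return label, probability


if __name__ == "__main__":
    print(classify([1.0, 2.0], [0.5, 0.5], 0.0))
```
</code></p>
<p>Raising the threshold makes the classifier more conservative about labeling a message as spam, which trades recall for precision on an imbalanced dataset.</p>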
</sec>
</sec>
<sec id="s5">
<label>5</label>
<title>Evaluation</title>
<sec id="s5_1">
<label>5.1</label>
<title>Dataset</title>
<p>The Machine Learning Repository, overseen by the University of California, Irvine (UCI), is a well-known online platform that provides many datasets for machine learning and data analysis. One of the datasets available in this repository is the UCI SMS Spam Collection [<xref ref-type="bibr" rid="ref-11">11</xref>], which consists of 5574 text messages. This dataset is classified into two categories: legitimate messages, making up 4827 instances (86.6%), and spam messages, accounting for 747 instances (13.4%). The distribution of the dataset is shown in <xref ref-type="table" rid="table-1">Table 1</xref>.</p>
<table-wrap id="table-1">
<label>Table 1</label>
<caption>
<title>Dataset statistics</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th>Class</th>
<th>Number of messages</th>
<th>Percentage</th>
</tr>
</thead>
<tbody>
<tr>
<td>Ham</td>
<td>4827</td>
<td>86.6%</td>
</tr>
<tr>
<td>Spam</td>
<td>747</td>
<td>13.4%</td>
</tr>
<tr>
<td>Total</td>
<td>5574</td>
<td>100%</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="s5_2">
<label>5.2</label>
<title>Performance Metrics</title>
<p>When evaluating deep learning models, choosing suitable performance metrics is essential. Various metrics are available, and the right choice depends on the application. A single metric may not give a complete picture, especially with imbalanced data, in which case a combination of metrics is needed. We used the well-known metrics accuracy, precision, recall, and F1-score for our evaluation [<xref ref-type="bibr" rid="ref-45">45</xref>]. Before defining these metrics, we introduce four key terms, as shown in <xref ref-type="table" rid="table-2">Table 2</xref>.</p>
<table-wrap id="table-2">
<label>Table 2</label>
<caption>
<title>Definitions of evaluation key terms for spam detection</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th>Term</th>
<th>Explanation</th>
</tr>
</thead>
<tbody>
<tr>
<td>TP (True positive)</td>
<td>Accurately predicted spam messages.</td>
</tr>
<tr>
<td>TN (True negative)</td>
<td>Accurately predicted legitimate messages.</td>
</tr>
<tr>
<td>FP (False positive)</td>
<td>Mistakenly classified legitimate messages as spam.</td>
</tr>
<tr>
<td>FN (False negative)</td>
<td>Mistakenly classified spam messages as legitimate.</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>Here is an explanation of each metric:
<list list-type="order">
<list-item><p><bold>Accuracy:</bold> This metric is the fraction of all messages that the model classifies correctly. It counts the correct predictions of spam (true positives) and of legitimate messages (true negatives) and divides them by the total number of predictions (TP, TN, FP, and FN). It is computed as
<disp-formula id="eqn-1">
<label>(1)</label>
<mml:math id="mml-eqn-1" display="block"><mml:mtext>Accuracy</mml:mtext><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mtext>True Positives (TP)</mml:mtext><mml:mo>+</mml:mo><mml:mtext>True Negatives (TN)</mml:mtext></mml:mrow><mml:mrow><mml:mtext>Total Number of Samples&#x00A0;</mml:mtext><mml:mo stretchy="false">(</mml:mo><mml:mi>TP</mml:mi><mml:mo>+</mml:mo><mml:mi>FP</mml:mi><mml:mo>+</mml:mo><mml:mi>TN</mml:mi><mml:mo>+</mml:mo><mml:mi>FN</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mfrac></mml:math></disp-formula></p></list-item>
<list-item><p><bold>Precision:</bold> This metric measures the accuracy of positive predictions, that is, the proportion of messages predicted as spam that are indeed spam. It matters when false positives (legitimate messages flagged as spam) are costly. The calculation divides the true positives by the total number of messages predicted as positive (both true and false positives). It is computed as
<disp-formula id="eqn-2">
<label>(2)</label>
<mml:math id="mml-eqn-2" display="block"><mml:mtext>Precision</mml:mtext><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mtext>True Positives (TP)</mml:mtext></mml:mrow><mml:mrow><mml:mtext>True Positives (TP)</mml:mtext><mml:mo>+</mml:mo><mml:mtext>False Positives (FP)</mml:mtext></mml:mrow></mml:mfrac></mml:math></disp-formula></p></list-item>
<list-item><p><bold>Recall:</bold> This metric evaluates how well the model detects all actual positive cases (spam messages). It matters when missing actual spam is costly, since every false negative lowers recall. The calculation divides the true positives by the total number of actual positive cases (true positives plus false negatives). It is computed as
<disp-formula id="eqn-3">
<label>(3)</label>
<mml:math id="mml-eqn-3" display="block"><mml:mtext>Recall</mml:mtext><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mtext>True Positives (TP)</mml:mtext></mml:mrow><mml:mrow><mml:mtext>True Positives (TP)</mml:mtext><mml:mo>+</mml:mo><mml:mtext>False Negatives (FN)</mml:mtext></mml:mrow></mml:mfrac></mml:math></disp-formula></p></list-item>
<list-item><p><bold>F1-score:</bold> This metric combines precision and recall into a single value, which is especially useful when the two must be balanced. Because it is the harmonic mean of precision and recall, a low value of either metric pulls the F1-score down, so one cannot mask the other. It is computed as
<disp-formula id="eqn-4">
<label>(4)</label>
<mml:math id="mml-eqn-4" display="block"><mml:mtext>F1-score</mml:mtext><mml:mo>=</mml:mo><mml:mn>2</mml:mn><mml:mo>&#x00D7;</mml:mo><mml:mfrac><mml:mrow><mml:mtext>Recall</mml:mtext><mml:mo>&#x00D7;</mml:mo><mml:mtext>Precision</mml:mtext></mml:mrow><mml:mrow><mml:mtext>Recall</mml:mtext><mml:mo>+</mml:mo><mml:mtext>Precision</mml:mtext></mml:mrow></mml:mfrac></mml:math></disp-formula></p></list-item>
</list></p>
<p>Together, these metrics assess the model&#x2019;s performance in SMS spam filtering, covering both its overall accuracy and its ability to separate spam from ham messages.</p>
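<p>To illustrate why a single metric can mislead on imbalanced data, the sketch below computes the four metrics from confusion counts and applies them to a hypothetical baseline that labels every message as legitimate. The function name, the zero-denominator convention, and the baseline itself are assumptions introduced for illustration; only the class sizes (4827 ham and 747 spam) come from the dataset description.</p>
<preformat preformat-type="code"><![CDATA[
```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and F1 from confusion counts.

    Ratios with a zero denominator are reported as 0.0 (a common convention).
    """
    def safe_div(num, den):
        return num / den if den else 0.0

    accuracy = safe_div(tp + tn, tp + tn + fp + fn)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    f1 = safe_div(2 * precision * recall, precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical baseline that labels every message as legitimate (ham):
# all 4827 ham messages are true negatives, all 747 spam messages are missed.
baseline = classification_metrics(tp=0, tn=4827, fp=0, fn=747)
print(baseline)
```
]]></preformat>
<p>This baseline reaches an accuracy of about 86.6% while its recall and F1-score are zero, which is why accuracy is reported here together with precision, recall, and F1-score.</p>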
</sec>
<sec id="s5_3">
<label>5.3</label>
<title>Model Configuration and Hyperparameters</title>
<p>Our model&#x2019;s architecture is specifically designed for effective SMS spam detection, leveraging advanced deep learning techniques for comprehensive textual analysis. It transforms textual data into a numerical format using RoBERTa word embedding (roberta-base), chosen for its optimal balance of computational efficiency and performance.</p>
<p>In the word-level feature extraction phase, a CNN with two layers is used, each with 128 filters and a kernel size of 3. This configuration captures immediate contextual relationships within the text. The CNN is trained with the Adam optimizer at an initial learning rate of 0.001. For sentence-level analysis, we use a BiLSTM network with two layers of 128 units each, at the same learning rate of 0.001. This setup captures sentence dynamics by processing text sequences in both forward and reverse directions.</p>
<p>A crucial component of our approach is the Hierarchical Attention Network (HAN), which intricately combines and evaluates features at both the word and sentence levels, employing a dedicated layer of attention for each, fine-tuned at a learning rate of 0.001. This mechanism significantly enhances the model&#x2019;s ability to focus on the most relevant text segments for precise spam detection. The architecture includes a 256-unit fully connected layer with ReLU activation, followed by a 0.5 dropout layer to prevent overfitting [<xref ref-type="bibr" rid="ref-46">46</xref>]. Another fully connected layer with 128 units and ReLU activation precedes an additional dropout layer at the same rate. The final classification layer, equipped with 2 units and utilizing a SoftMax activation function, distinguishes between spam and non-spam messages.</p>
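<p>The hyperparameters stated in this section can be collected in a single configuration object, sketched below. The class and field names are hypothetical, and applying one optimizer setting to every component is an assumption. The values are those given in the text.</p>
<preformat preformat-type="code"><![CDATA[
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DenseBlock:
    units: int
    activation: str
    dropout: float = 0.0  # dropout applied after the layer; 0.0 means none


@dataclass(frozen=True)
class ModelConfig:
    embedding: str = "roberta-base"
    cnn_layers: int = 2
    cnn_filters: int = 128
    cnn_kernel_size: int = 3
    bilstm_layers: int = 2
    bilstm_units: int = 128
    optimizer: str = "adam"  # stated for the CNN; assumed for the rest
    learning_rate: float = 0.001
    # Classification head, in order from the fused features to the output.
    head: tuple = (
        DenseBlock(256, "relu", dropout=0.5),
        DenseBlock(128, "relu", dropout=0.5),
        DenseBlock(2, "softmax"),
    )


config = ModelConfig()
for block in config.head:
    print(block.units, block.activation, block.dropout)
```
]]></preformat>
<p>Keeping the configuration immutable makes it easier to log the exact settings used in each training run.</p>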
</sec>
</sec>
<sec id="s6">
<label>6</label>
<title>Results and Analysis</title>
<p>We compared the proposed two-level SMS spam detection model with other recent approaches, including the modified Transformer (M-Transformer), FRNN-HHO, CNN-LSTM BUNOW, BERT, LGRU, and CNN-LSTM, using accuracy, precision, recall, and F1-score. Our model outperformed all of the compared models, achieving an accuracy of 99.48%, a precision of 99.8%, a recall of 99.7%, and an F1-score of 99.8%. These results indicate that our model classifies spam and non-spam messages with very few false positives or false negatives, as shown in <xref ref-type="table" rid="table-3">Table 3</xref>.</p>
<table-wrap id="table-3">
<label>Table 3</label>
<caption>
<title>Performance comparison of the SMS spam detection models</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th>Method</th>
<th>Accuracy</th>
<th>Precision</th>
<th>Recall</th>
<th>F1-score</th>
</tr>
</thead>
<tbody>
<tr>
<td>CNN-LSTM BUNOW</td>
<td>99.04%</td>
<td>97.3%</td>
<td>95.5%</td>
<td>96.4%</td>
</tr>
<tr>
<td>M-Transformer</td>
<td>98.92%</td>
<td>97.8%</td>
<td>94.5%</td>
<td>96.1%</td>
</tr>
<tr>
<td>FRNN-HHO</td>
<td>98.61%</td>
<td>99.7%</td>
<td>98.1%</td>
<td>98.9%</td>
</tr>
<tr>
<td>CNN-LSTM</td>
<td>98.37%</td>
<td>95.3%</td>
<td>87.8%</td>
<td>91.4%</td>
</tr>
<tr>
<td>BERT</td>
<td>99.28%</td>
<td>99.6%</td>
<td>99.2%</td>
<td>99.3%</td>
</tr>
<tr>
<td>LGRU</td>
<td>98.87%</td>
<td>99.7%</td>
<td>99%</td>
<td>99.4%</td>
</tr>
<tr>
<td>Proposed method</td>
<td>99.48%</td>
<td>99.8%</td>
<td>99.7%</td>
<td>99.8%</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>Because the dataset is imbalanced, we used 10-fold cross-validation to evaluate our SMS spam detection model [<xref ref-type="bibr" rid="ref-47">47</xref>]. Unlike a single random split, this method is well suited to imbalanced datasets. It ensures that spam and legitimate messages are represented in each evaluation cycle, making the results more trustworthy. The reported accuracy is the average over these ten cycles, offering a more robust and dependable measure of how well the model works in practical situations. <xref ref-type="fig" rid="fig-3">Fig. 3</xref> shows the accuracy assessment using cross-validation.</p>
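<p>One way to obtain folds in which both classes are represented is stratified assignment, sketched below. The stratification, the function name, and the random seed are assumptions introduced for illustration. Only the class sizes come from the dataset description.</p>
<preformat preformat-type="code"><![CDATA[
```python
import random
from collections import Counter, defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Split sample indices into k folds with near-equal class proportions."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin across the folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


# Labels with the class sizes reported for the dataset (order is arbitrary).
labels = ["ham"] * 4827 + ["spam"] * 747
folds = stratified_folds(labels, k=10)

for fold_id, test_indices in enumerate(folds):
    counts = Counter(labels[i] for i in test_indices)
    print(fold_id, counts["ham"], counts["spam"])
```
]]></preformat>
<p>Each fold serves once as the test set while the remaining nine are used for training, and the per-fold scores are then averaged.</p>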
<fig id="fig-3">
<label>Figure 3</label>
<caption>
<title>Accuracy assessment using cross-validation</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="IASC_50452-fig-3.tif"/>
</fig>
<p>We present a comprehensive performance evaluation of the proposed method through a series of visual figures, each specifically dedicated to a critical metric, as shown in the four parts of <xref ref-type="fig" rid="fig-4">Fig. 4</xref>.</p>
<fig id="fig-4">
<label>Figure 4</label>
<caption>
<title>Performance evaluation of SMS spam detection models across various metrics</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="IASC_50452-fig-4.tif"/>
</fig>
<p>Part (a) of <xref ref-type="fig" rid="fig-4">Fig. 4</xref> illustrates accuracy results, providing a clear overview of how each approach performs in terms of accuracy. Notably, the two-level SMS spam detection method is distinctly highlighted, showcasing its exceptional accuracy compared to other methods.</p>
<p>Part (b) focuses on precision, comparing the evaluated approaches on this metric. The proposed approach&#x2019;s precision stands out, reaffirming its effectiveness in correctly identifying spam messages.</p>
<p>Part (c) visualizes the recall metric, detailing the ability of each method to identify all spam messages. Our method&#x2019;s superior recall performance is evident, reflecting its proficiency in comprehensively capturing spam instances.</p>
<p>Part (d) delves into the F1-score, a balanced metric considering precision and recall. The figure illustrates how our method strikes a notable equilibrium between these aspects, further substantiating its robust performance.</p>
<p>The four segments of <xref ref-type="fig" rid="fig-4">Fig. 4</xref> offer an insightful comparison of the hierarchical two-level SMS spam detection method against other methodologies. This graphical representation effectively underscores the strengths of our approach, demonstrating its capacity to excel across multiple crucial evaluation metrics.</p>
<p>This study used the confusion matrix to evaluate the performance of the SMS spam detection model. It revealed the model&#x2019;s high precision and reliability in distinguishing between spam and non-spam messages. The matrix demonstrated the model&#x2019;s effectiveness in minimizing false positives and false negatives, underscoring its robustness in accurate message categorization, as shown in <xref ref-type="fig" rid="fig-5">Fig. 5</xref>.</p>
<fig id="fig-5">
<label>Figure 5</label>
<caption>
<title>Confusion matrix</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="IASC_50452-fig-5.tif"/>
</fig>
<p>The performance of the proposed model for SMS spam detection was evaluated using a confusion matrix, as presented in <xref ref-type="fig" rid="fig-5">Fig. 5</xref>. This matrix displays the model&#x2019;s ability to differentiate between spam and legitimate (ham) messages, showing strong precision and minimal misclassification. With 158 correctly identified non-spam messages and 889 correctly identified spam messages, the model demonstrates a high degree of precision in differentiating between the categories. The matrix also indicates a low rate of misclassification, with only 3 instances where spam was mislabeled as non-spam and 3 instances where non-spam was incorrectly identified as spam. These minimal errors indicate the proposed model&#x2019;s capability for accurate message categorization. Examining the model&#x2019;s errors [<xref ref-type="bibr" rid="ref-48">48</xref>] to understand where and why incorrect predictions occur is important for gaining deeper insight into the model&#x2019;s performance and the nature of the errors relative to the corpus.</p>
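<p>As a worked example, the counts quoted above for <xref ref-type="fig" rid="fig-5">Fig. 5</xref> can be turned into the four metrics directly. The sketch takes spam as the positive class, as in Table 2. Because it uses a single confusion matrix, the resulting values need not coincide with the cross-validated averages.</p>
<preformat preformat-type="code"><![CDATA[
```python
# Counts as quoted for the confusion matrix, with spam as the positive class.
tp, tn, fp, fn = 889, 158, 3, 3

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.4f} precision={precision:.4f} "
      f"recall={recall:.4f} f1={f1:.4f}")
```
]]></preformat>
<p>Because the numbers of false positives and false negatives are equal here, precision, recall, and F1-score take the same value.</p>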
</sec>
<sec id="s7">
<label>7</label>
<title>Discussion</title>
<p>This study developed an SMS spam filtering framework based on Hierarchical Two-Level Feature Fusion. The key characteristics and findings are highlighted below:</p>
<list list-type="simple">
<list-item><label>&#x2022;</label>
<p>This study marks the first application of a Hierarchical Two-Level Feature Fusion approach for SMS spam detection, addressing both word-level and sentence-level analysis within the constrained context of short text messages. By achieving an exceptional accuracy rate of 99.48%, our method significantly addresses the challenges posed by advanced spam techniques, thereby enhancing the security and privacy of mobile communications. This achievement underscores the method&#x2019;s effectiveness in parsing the limited textual content typical of SMS, a critical advantage in identifying and filtering spam.</p></list-item>
<list-item><label>&#x2022;</label>
<p>Our model introduces a pioneering integration of pre-trained deep learning frameworks with a Hierarchical Attention Network (HAN), setting it apart from existing methods such as BERT, CNN-LSTM, and others. It demonstrates superior performance metrics, including accuracy, precision, recall, and F1-score, illustrating the benefits of our two-level feature fusion approach.</p></list-item>
</list>
</sec>
<sec id="s8">
<label>8</label>
<title>Conclusions</title>
<p>This paper introduced the innovative Two-Level SMS Spam Detection Method, which leverages hybrid deep learning and advanced text analysis techniques to establish an accurate framework for spam detection. By integrating the power of the Hierarchical Attention Network (HAN), our method has demonstrated exceptional performance in distinguishing between spam and legitimate SMS messages. The two-level hierarchical approach adeptly captures nuanced patterns within SMS content, combining word-level features with a comprehensive understanding at the sentence level. The proposed model achieved a remarkable accuracy rate of 99.48% on the UCI SMS dataset, significantly outperforming existing methods. For future research, we envision expanding the evaluation of the Two-Level SMS Spam Detection Method to include multilingual databases, showcasing its adaptability across different languages and cultural contexts. Additionally, the integration of external contextual information, such as sender reputation or network attributes, may further enhance the model&#x2019;s accuracy in detecting spam. Also, we plan to enhance our spam detection model by incorporating transfer learning and external knowledge sources. This approach will utilize the rich data from pre-trained models and broaden our understanding of spam indicators, aiming to improve accuracy and adaptability to new spam trends. Exploring these techniques represents a promising direction to advance our model, ensuring it remains effective and efficient in the evolving landscape of spam detection. This study not only advances the field of cybersecurity but also lays the groundwork for broader applications in various natural language processing domains.</p>
</sec>
</body>
<back>
<ack><p>Thanks are due to Ali-Reza Feizi-Derakhshi for the valuable technical support.</p>
</ack>
<sec><title>Funding Statement</title>
<p>The authors received no specific funding for this study.</p>
</sec>
<sec><title>Author Contributions</title>
<p>The authors confirm their contribution to the paper as follows: Study conception and design: Hussein Alaa Al-Kabbi, interpretation of results: Mohammad-Reza Feizi-Derakhshi, draft manuscript preparation: Hussein Alaa Al-Kabbi, Mohammad-Reza Feizi-Derakhshi, and Saeed Pashazadeh. All authors reviewed the results and approved the final version of the manuscript.</p>
</sec>
<sec sec-type="data-availability"><title>Availability of Data and Materials</title>
<p>Not applicable.</p>
</sec>
<sec sec-type="COI-statement"><title>Conflicts of Interest</title>
<p>The authors declare that they have no conflicts of interest to report regarding the present study.</p>
</sec>
<ref-list content-type="authoryear">
<title>References</title>
<ref id="ref-1"><label>[1]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><collab>SlickText</collab></person-group>, &#x201C;<article-title>44 mind-blowing SMS marketing and texting statistics</article-title>,&#x201D; <comment>2023. Accessed: May 15, 2023.</comment> <year>2023</year>. [Online]. Available: <ext-link ext-link-type="uri" xlink:href="https://www.slicktext.com/blog/2018/11/44-mind-blowing-sms-marketing-and-texting-statistics/">https://www.slicktext.com/blog/2018/11/44-mind-blowing-sms-marketing-and-texting-statistics/</ext-link></mixed-citation></ref>
<ref id="ref-2"><label>[2]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>G.</given-names> <surname>Sonowal</surname></string-name> and <string-name><given-names>K. S.</given-names> <surname>Kuppusamy</surname></string-name></person-group>, &#x201C;<article-title>SmiDCA: An anti-smishing model with machine learning approach</article-title>,&#x201D; <source>Comput. J.</source>, vol. <volume>61</volume>, no. <issue>8</issue>, pp. <fpage>1143</fpage>&#x2013;<lpage>1157</lpage>, <year>Aug. 2018</year>. doi: <pub-id pub-id-type="doi">10.1093/comjnl/bxy039</pub-id>.</mixed-citation></ref>
<ref id="ref-3"><label>[3]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>V.</given-names> <surname>Dharani</surname></string-name>, <string-name><given-names>D.</given-names> <surname>Hegde</surname></string-name>, and <string-name><surname>Mohana</surname></string-name></person-group>, &#x201C;<article-title>Spam SMS (or) email detection and classification using machine learning</article-title>,&#x201D; in <conf-name>2023 5th Int. Conf. Smart Syst. Inventive Technol. (ICSSIT)</conf-name>, <conf-loc>Tirunelveli, India</conf-loc>, <year>2023</year>, pp. <fpage>1104</fpage>&#x2013;<lpage>1108</lpage>. doi: <pub-id pub-id-type="doi">10.1109/ICSSIT55814.2023.10060908</pub-id>.</mixed-citation></ref>
<ref id="ref-4"><label>[4]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>S. J.</given-names> <surname>Delany</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Buckley</surname></string-name>, and <string-name><given-names>D.</given-names> <surname>Greene</surname></string-name></person-group>, &#x201C;<article-title>SMS spam filtering: Methods and data</article-title>,&#x201D; <source>Expert Syst. Appl.</source>, vol. <volume>39</volume>, no. <issue>10</issue>, pp. <fpage>9899</fpage>&#x2013;<lpage>9908</lpage>, <year>2012</year>.</mixed-citation></ref>
<ref id="ref-5"><label>[5]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>M.</given-names> <surname>Wo&#x017A;niak</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Si&#x0142;ka</surname></string-name>, and <string-name><given-names>M.</given-names> <surname>Wieczorek</surname></string-name></person-group>, &#x201C;<article-title>Deep neural network correlation learning mechanism for CT brain tumor detection</article-title>,&#x201D; <source>Neural Comput. Appl.</source>, vol. <volume>35</volume>, no. <issue>20</issue>, pp. <fpage>14611</fpage>&#x2013;<lpage>14626</lpage>, <year>2023</year>. doi: <pub-id pub-id-type="doi">10.1007/s00521-021-05841-x</pub-id>.</mixed-citation></ref>
<ref id="ref-6"><label>[6]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>A. H. J.</given-names> <surname>Almarashy</surname></string-name>, <string-name><given-names>M. R.</given-names> <surname>Feizi-Derakhshi</surname></string-name>, and <string-name><given-names>P.</given-names> <surname>Salehpour</surname></string-name></person-group>, &#x201C;<article-title>Enhancing fake news detection by multi-feature classification</article-title>,&#x201D; <source>IEEE Access</source>, vol. <volume>11</volume>, pp. <fpage>139601</fpage>&#x2013;<lpage>139613</lpage>, <year>2023</year>. doi: <pub-id pub-id-type="doi">10.1109/ACCESS.2023.3339621</pub-id>.</mixed-citation></ref>
<ref id="ref-7"><label>[7]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>H.</given-names> <surname>Wu</surname></string-name>, <string-name><given-names>H.</given-names> <surname>Han</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Wang</surname></string-name>, and <string-name><given-names>S.</given-names> <surname>Sun</surname></string-name></person-group>, &#x201C;<article-title>Research on artificial intelligence enhancing internet of things security: A survey</article-title>,&#x201D; <source>IEEE Access</source>, vol. <volume>8</volume>, pp. <fpage>153826</fpage>&#x2013;<lpage>153848</lpage>, <year>2020</year>. doi: <pub-id pub-id-type="doi">10.1109/ACCESS.2020.3018170</pub-id>.</mixed-citation></ref>
<ref id="ref-8"><label>[8]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>M.</given-names> <surname>Bani-Almarjeh</surname></string-name> and <string-name><given-names>M. B.</given-names> <surname>Kurdy</surname></string-name></person-group>, &#x201C;<article-title>Arabic abstractive text summarization using RNN-based and transformer-based architectures</article-title>,&#x201D; <source>Inf. Process. Manage.</source>, vol. <volume>60</volume>, no. <issue>2</issue>, pp. <fpage>103227</fpage>, <year>2023</year>. doi: <pub-id pub-id-type="doi">10.1016/j.ipm.2022.103227</pub-id>.</mixed-citation></ref>
<ref id="ref-9"><label>[9]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>A.</given-names> <surname>Gasparetto</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Marcuzzo</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Zangari</surname></string-name>, and <string-name><given-names>A.</given-names> <surname>Albarelli</surname></string-name></person-group>, &#x201C;<article-title>A survey on text classification algorithms: From text to predictions</article-title>,&#x201D; <source>Information</source>, vol. <volume>13</volume>, no. <issue>2</issue>, pp. <fpage>83</fpage>, <year>2022</year>. doi: <pub-id pub-id-type="doi">10.3390/info13020083</pub-id>.</mixed-citation></ref>
<ref id="ref-10"><label>[10]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>F.</given-names> <surname>J&#x00E1;&#x00F1;ez-Martino</surname></string-name>, <string-name><given-names>R.</given-names> <surname>Alaiz-Rodr&#x00ED;guez</surname></string-name>, <string-name><given-names>V.</given-names> <surname>Gonz&#x00E1;lez-Castro</surname></string-name>, <string-name><given-names>E.</given-names> <surname>Fidalgo</surname></string-name>, and <string-name><given-names>E.</given-names> <surname>Alegre</surname></string-name></person-group>, &#x201C;<article-title>A review of spam email detection: Analysis of spammer strategies and the dataset shift problem</article-title>,&#x201D; <source>Artif. Intell. Rev.</source>, vol. <volume>56</volume>, no. <issue>2</issue>, pp. <fpage>1145</fpage>&#x2013;<lpage>1173</lpage>, <year>2023</year>. doi: <pub-id pub-id-type="doi">10.1007/s10462-022-10195-4</pub-id>.</mixed-citation></ref>
<ref id="ref-11"><label>[11]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>Q.</given-names> <surname>Le</surname></string-name> and <string-name><given-names>T.</given-names> <surname>Mikolov</surname></string-name></person-group>, &#x201C;<article-title>Distributed representations of sentences and documents</article-title>,&#x201D; in <conf-name>Proc. Int. Conf. Mach. Learn.</conf-name>, <conf-loc>Beijing, China</conf-loc>, <year>Jun. 2014</year>, pp. <fpage>1188</fpage>&#x2013;<lpage>1196</lpage>.</mixed-citation></ref>
<ref id="ref-12"><label>[12]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>Z.</given-names> <surname>Yang</surname></string-name>, <string-name><given-names>D.</given-names> <surname>Yang</surname></string-name>, <string-name><given-names>C.</given-names> <surname>Dyer</surname></string-name>, <string-name><given-names>X.</given-names> <surname>He</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Smola</surname></string-name>, and <string-name><given-names>E.</given-names> <surname>Hovy</surname></string-name></person-group>, &#x201C;<article-title>Hierarchical attention networks for document classification</article-title>,&#x201D; in <conf-name>Proc. 2016 Conf. North Am. Chapter Assoc. Comput. Linguistics: Hum. Lang. Technol. (NAACL HLT 2016)</conf-name>, <conf-loc>San Diego, CA, USA</conf-loc>, <year>2016</year>, pp. <fpage>1480</fpage>&#x2013;<lpage>1489</lpage>.</mixed-citation></ref>
<ref id="ref-13"><label>[13]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>T. A.</given-names> <surname>Almeida</surname></string-name>, <string-name><given-names>J. M. G.</given-names> <surname>Hidalgo</surname></string-name>, and <string-name><given-names>A.</given-names> <surname>Yamakami</surname></string-name></person-group>, &#x201C;<article-title>Contributions to the study of SMS spam filtering: New collection and results</article-title>,&#x201D; in <conf-name>Proc. 11th ACM Symp. Doc. Eng. (DocEng &#x2019;11)</conf-name>, <conf-loc>New York, NY, USA</conf-loc>, <year>2011</year>, pp. <fpage>259</fpage>&#x2013;<lpage>262</lpage>.</mixed-citation></ref>
<ref id="ref-14"><label>[14]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>S. Y.</given-names> <surname>Yerima</surname></string-name> and <string-name><given-names>A.</given-names> <surname>Bashar</surname></string-name></person-group>, &#x201C;<article-title>Semi-supervised novelty detection with one class SVM for SMS spam detection</article-title>,&#x201D; in <conf-name>Proc. 29th Int. Conf. Syst., Signals Image Process. (IWSSIP)</conf-name>, <conf-loc>Sofia, Bulgaria</conf-loc>, <year>2022</year>, pp. <fpage>1</fpage>&#x2013;<lpage>4</lpage>. doi: <pub-id pub-id-type="doi">10.1109/IWSSIP55020.2022.9854496</pub-id>.</mixed-citation></ref>
<ref id="ref-15"><label>[15]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>D. D.</given-names> <surname>Arifin</surname></string-name>, <string-name><surname>Shaufiah</surname></string-name>, and <string-name><given-names>M. A.</given-names> <surname>Bijaksana</surname></string-name></person-group>, &#x201C;<article-title>Enhancing spam detection on mobile phone short message service (SMS) performance using FP-growth and naive Bayes classifier</article-title>,&#x201D; in <conf-name>Proc. IEEE Asia Pacific Conf. Wireless Mobile (APWiMob)</conf-name>, <conf-loc>Bandung, Indonesia</conf-loc>, <year>2016</year>, pp. <fpage>80</fpage>&#x2013;<lpage>84</lpage>. doi: <pub-id pub-id-type="doi">10.1109/APWiMob.2016.7811442</pub-id>.</mixed-citation></ref>
<ref id="ref-16"><label>[16]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>M.</given-names> <surname>Gupta</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Bakliwal</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Agarwal</surname></string-name>, and <string-name><given-names>P.</given-names> <surname>Mehndiratta</surname></string-name></person-group>, &#x201C;<article-title>A comparative study of spam SMS detection using machine learning classifiers</article-title>,&#x201D; in <conf-name>2018 Eleventh Int. Conf. Contemp. Comput. (IC3)</conf-name>, <conf-loc>Noida, India</conf-loc>, <year>2018</year>, pp. <fpage>1</fpage>&#x2013;<lpage>7</lpage>.</mixed-citation></ref>
<ref id="ref-17"><label>[17]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>Z.</given-names> <surname>Gao</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Feng</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Song</surname></string-name>, and <string-name><given-names>X.</given-names> <surname>Wu</surname></string-name></person-group>, &#x201C;<article-title>Target-dependent sentiment classification with BERT</article-title>,&#x201D; <source>IEEE Access</source>, vol. <volume>7</volume>, pp. <fpage>154290</fpage>&#x2013;<lpage>154299</lpage>, <year>2019</year>. doi: <pub-id pub-id-type="doi">10.1109/ACCESS.2019.2946594</pub-id>.</mixed-citation></ref>
<ref id="ref-18"><label>[18]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>Z. H.</given-names> <surname>Ali</surname></string-name>, <string-name><given-names>H. M.</given-names> <surname>Salman</surname></string-name>, and <string-name><given-names>A. H.</given-names> <surname>Harif</surname></string-name></person-group>, &#x201C;<article-title>SMS spam detection using multiple linear regression and extreme learning machines</article-title>,&#x201D; <source>Iraqi J. Sci.</source>, vol. <volume>64</volume>, no. <issue>10</issue>, pp. <fpage>6342</fpage>&#x2013;<lpage>6351</lpage>, <year>2023</year>. doi: <pub-id pub-id-type="doi">10.24996/ijs.2023.64.10.45</pub-id>.</mixed-citation></ref>
<ref id="ref-19"><label>[19]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>S.</given-names> <surname>Hosseinpour</surname></string-name> and <string-name><given-names>H.</given-names> <surname>Shakibian</surname></string-name></person-group>, &#x201C;<article-title>An ensemble learning approach for SMS spam detection</article-title>,&#x201D; in <conf-name>2023 9th Int. Conf. Web Res. (ICWR)</conf-name>, <conf-loc>Tehran, Islamic Republic of Iran</conf-loc>, <year>2023</year>, pp. <fpage>125</fpage>&#x2013;<lpage>128</lpage>. doi: <pub-id pub-id-type="doi">10.1109/ICWR57742.2023.10139070</pub-id>.</mixed-citation></ref>
<ref id="ref-20"><label>[20]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>S.</given-names> <surname>Pudasaini</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Shakya</surname></string-name>, <string-name><given-names>S. P.</given-names> <surname>Pandey</surname></string-name>, <string-name><given-names>P.</given-names> <surname>Paudel</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Ghimire</surname></string-name>, and <string-name><given-names>P.</given-names> <surname>Ale</surname></string-name></person-group>, &#x201C;<article-title>SMS spam detection using relevance vector machine</article-title>,&#x201D; in <source>Procedia Comput. Sci.</source>, <comment>Halifax, NS, Canada, 2023,</comment> vol. <volume>230</volume>, pp. <fpage>337</fpage>&#x2013;<lpage>346</lpage>.</mixed-citation></ref>
<ref id="ref-21"><label>[21]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>X.</given-names> <surname>Liu</surname></string-name>, <string-name><given-names>H.</given-names> <surname>Lu</surname></string-name>, and <string-name><given-names>A.</given-names> <surname>Nayak</surname></string-name></person-group>, &#x201C;<article-title>A spam transformer model for SMS spam detection</article-title>,&#x201D; <source>IEEE Access</source>, vol. <volume>9</volume>, pp. <fpage>80253</fpage>&#x2013;<lpage>80263</lpage>, <year>2021</year>. doi: <pub-id pub-id-type="doi">10.1109/ACCESS.2021.3081479</pub-id>.</mixed-citation></ref>
<ref id="ref-22"><label>[22]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>T.</given-names> <surname>Xia</surname></string-name> and <string-name><given-names>X.</given-names> <surname>Chen</surname></string-name></person-group>, &#x201C;<article-title>A discrete hidden Markov model for SMS spam detection</article-title>,&#x201D; <source>Appl. Sci.</source>, vol. <volume>10</volume>, no. <issue>14</issue>, p. <fpage>5011</fpage>, <year>2020</year>. doi: <pub-id pub-id-type="doi">10.3390/app10145011</pub-id>.</mixed-citation></ref>
<ref id="ref-23"><label>[23]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>U.</given-names> <surname>Srinivasarao</surname></string-name> and <string-name><given-names>A.</given-names> <surname>Sharaff</surname></string-name></person-group>, &#x201C;<article-title>SMS sentiment classification using an evolutionary optimization based fuzzy recurrent neural network</article-title>,&#x201D; <source>Multimed. Tools Appl.</source>, vol. <volume>82</volume>, no. <issue>27</issue>, pp. <fpage>42207</fpage>&#x2013;<lpage>42238</lpage>, <year>2023</year>. doi: <pub-id pub-id-type="doi">10.1007/s11042-023-15206-2</pub-id>; <pub-id pub-id-type="pmid">37362691</pub-id>.</mixed-citation></ref>
<ref id="ref-24"><label>[24]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>S.</given-names> <surname>Giri</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Das</surname></string-name>, <string-name><given-names>S. B.</given-names> <surname>Das</surname></string-name>, and <string-name><given-names>S.</given-names> <surname>Banerjee</surname></string-name></person-group>, &#x201C;<article-title>SMS spam classification&#x2013;simple deep learning models with higher accuracy using BUNOW and GloVe word embedding</article-title>,&#x201D; <source>J. Appl. Sci. Eng.</source>, vol. <volume>26</volume>, pp. <fpage>1501</fpage>&#x2013;<lpage>1511</lpage>, <year>2023</year>.</mixed-citation></ref>
<ref id="ref-25"><label>[25]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>K.</given-names> <surname>Debnath</surname></string-name> and <string-name><given-names>N.</given-names> <surname>Kar</surname></string-name></person-group>, &#x201C;<article-title>SMS spam detection using deep learning approach</article-title>,&#x201D; in <conf-name>Proc. ICHCSC 2022</conf-name>, <conf-loc>Singapore</conf-loc>, <publisher-name>Springer Nature Singapore</publisher-name>, <year>2022</year>, pp. <fpage>337</fpage>&#x2013;<lpage>347</lpage>.</mixed-citation></ref>
<ref id="ref-26"><label>[26]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>A.</given-names> <surname>Ghourabi</surname></string-name>, <string-name><given-names>M. A.</given-names> <surname>Mahmood</surname></string-name>, and <string-name><given-names>Q. M.</given-names> <surname>Alzubi</surname></string-name></person-group>, &#x201C;<article-title>A hybrid CNN-LSTM model for SMS spam detection in Arabic and English messages</article-title>,&#x201D; <source>Future Internet</source>, vol. <volume>12</volume>, no. <issue>9</issue>, p. <fpage>156</fpage>, <year>2020</year>. doi: <pub-id pub-id-type="doi">10.3390/fi12090156</pub-id>.</mixed-citation></ref>
<ref id="ref-27"><label>[27]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>O.</given-names> <surname>Abayomi-Alli</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Misra</surname></string-name>, and <string-name><given-names>A.</given-names> <surname>Abayomi-Alli</surname></string-name></person-group>, &#x201C;<article-title>A deep learning method for automatic SMS spam classification: Performance of learning algorithms on indigenous dataset</article-title>,&#x201D; <source>Concurr. Comput.: Pract. Exp.</source>, vol. <volume>34</volume>, no. <issue>17</issue>, p. <fpage>e6989</fpage>, <year>2022</year>. doi: <pub-id pub-id-type="doi">10.1002/cpe.6989</pub-id>.</mixed-citation></ref>
<ref id="ref-28"><label>[28]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>A. S.</given-names> <surname>Onashoga</surname></string-name>, <string-name><given-names>O. O.</given-names> <surname>Abayomi-Alli</surname></string-name>, <string-name><given-names>A. S.</given-names> <surname>Sodiya</surname></string-name>, and <string-name><given-names>D. A.</given-names> <surname>Ojo</surname></string-name></person-group>, &#x201C;<article-title>An adaptive and collaborative server-side SMS spam filtering scheme using artificial immune system</article-title>,&#x201D; <source>Inf. Sec. J.: A Global Perspect.</source>, vol. <volume>24</volume>, no. <issue>4&#x2013;6</issue>, pp. <fpage>133</fpage>&#x2013;<lpage>145</lpage>, <year>2015</year>. doi: <pub-id pub-id-type="doi">10.1080/19393555.2015.1078017</pub-id>.</mixed-citation></ref>
<ref id="ref-29"><label>[29]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>F.</given-names> <surname>Wei</surname></string-name> and <string-name><given-names>T.</given-names> <surname>Nguyen</surname></string-name></person-group>, &#x201C;<article-title>A lightweight deep neural model for SMS spam detection</article-title>,&#x201D; in <conf-name>2020 Int. Symp. Netw., Comput. Commun. (ISNCC)</conf-name>, <conf-loc>Montreal, QC, Canada</conf-loc>, <year>2020</year>, pp. <fpage>1</fpage>&#x2013;<lpage>6</lpage>. doi: <pub-id pub-id-type="doi">10.1109/ISNCC49221.2020.9297350</pub-id>.</mixed-citation></ref>
<ref id="ref-30"><label>[30]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>E.</given-names> <surname>Ardeshir-Larijani</surname></string-name> and <string-name><given-names>M. M.</given-names> <surname>Nasiri Fatmehsari</surname></string-name></person-group>, &#x201C;<article-title>Hybrid classical-quantum transfer learning for text classification</article-title>,&#x201D; <source>Quantum Mach. Intell.</source>, vol. <volume>6</volume>, p. <fpage>19</fpage>, <year>2024</year>. doi: <pub-id pub-id-type="doi">10.1007/s42484-024-00147-2</pub-id>.</mixed-citation></ref>
<ref id="ref-31"><label>[31]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>M. R.</given-names> <surname>Julis</surname></string-name> and <string-name><given-names>S.</given-names> <surname>Alagesan</surname></string-name></person-group>, &#x201C;<article-title>Spam detection in SMS using machine learning through text mining</article-title>,&#x201D; <source>Int. J. Sci. Technol. Res.</source>, vol. <volume>9</volume>, no. <issue>2</issue>, pp. <fpage>171</fpage>&#x2013;<lpage>175</lpage>, <year>2020</year>.</mixed-citation></ref>
<ref id="ref-32"><label>[32]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>I. S.</given-names> <surname>Mambina</surname></string-name>, <string-name><given-names>J. D.</given-names> <surname>Ndibwile</surname></string-name>, <string-name><given-names>D.</given-names> <surname>Uwimpuhwe</surname></string-name>, and <string-name><given-names>K. F.</given-names> <surname>Michael</surname></string-name></person-group>, &#x201C;<article-title>Uncovering SMS spam in Swahili text using deep learning approaches</article-title>,&#x201D; <source>IEEE Access</source>, vol. <volume>12</volume>, pp. <fpage>25164</fpage>&#x2013;<lpage>25175</lpage>, <year>2024</year>.</mixed-citation></ref>
<ref id="ref-33"><label>[33]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>E.</given-names> <surname>Zafarani-Moattar</surname></string-name>, <string-name><given-names>M. R.</given-names> <surname>Kangavari</surname></string-name>, and <string-name><given-names>A. M.</given-names> <surname>Rahmani</surname></string-name></person-group>, &#x201C;<article-title>Neural network meaningful learning theory and its application for deep text clustering</article-title>,&#x201D; <source>IEEE Access</source>, vol. <volume>12</volume>, pp. <fpage>42411</fpage>&#x2013;<lpage>42422</lpage>, <year>2024</year>.</mixed-citation></ref>
<ref id="ref-34"><label>[34]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>P.</given-names> <surname>Rajapaksha</surname></string-name>, <string-name><given-names>R.</given-names> <surname>Farahbakhsh</surname></string-name>, and <string-name><given-names>N.</given-names> <surname>Crespi</surname></string-name></person-group>, &#x201C;<article-title>BERT, XLNet or RoBERTa: The best transfer learning model to detect clickbaits</article-title>,&#x201D; <source>IEEE Access</source>, vol. <volume>9</volume>, pp. <fpage>154704</fpage>&#x2013;<lpage>154716</lpage>, <year>2021</year>. doi: <pub-id pub-id-type="doi">10.1109/ACCESS.2021.3128742</pub-id>.</mixed-citation></ref>
<ref id="ref-35"><label>[35]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>K. L.</given-names> <surname>Tan</surname></string-name>, <string-name><given-names>C. P.</given-names> <surname>Lee</surname></string-name>, and <string-name><given-names>K. M.</given-names> <surname>Lim</surname></string-name></person-group>, &#x201C;<article-title>RoBERTa-GRU: A hybrid deep learning model for enhanced sentiment analysis</article-title>,&#x201D; <source>Appl. Sci.</source>, vol. <volume>13</volume>, no. <issue>6</issue>, p. <fpage>3915</fpage>, <year>2023</year>. doi: <pub-id pub-id-type="doi">10.3390/app13063915</pub-id>.</mixed-citation></ref>
<ref id="ref-36"><label>[36]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>F.</given-names> <surname>Tajaddodianfar</surname></string-name>, <string-name><given-names>J. W.</given-names> <surname>Stokes</surname></string-name>, and <string-name><given-names>A.</given-names> <surname>Gururajan</surname></string-name></person-group>, &#x201C;<article-title>Texception: A character/word-level deep learning model for phishing URL detection</article-title>,&#x201D; in <conf-name>Proc. 2020 IEEE Int. Conf. Acoust., Speech, and Signal Process. (ICASSP)</conf-name>, <conf-loc>Barcelona, Spain</conf-loc>, <year>2020</year>, pp. <fpage>2857</fpage>&#x2013;<lpage>2861</lpage>.</mixed-citation></ref>
<ref id="ref-37"><label>[37]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>N. S. M.</given-names> <surname>Nafis</surname></string-name> and <string-name><given-names>S.</given-names> <surname>Awang</surname></string-name></person-group>, &#x201C;<article-title>The evaluation of accuracy performance in an enhanced embedded feature selection for unstructured text classification</article-title>,&#x201D; <source>Iraqi J. Sci.</source>, vol. <volume>61</volume>, no. <issue>12</issue>, pp. <fpage>3397</fpage>&#x2013;<lpage>3407</lpage>, <year>2020</year>.</mixed-citation></ref>
<ref id="ref-38"><label>[38]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>G. A.</given-names> <surname>Vlad</surname></string-name>, <string-name><given-names>M. A.</given-names> <surname>Tanase</surname></string-name>, <string-name><given-names>C.</given-names> <surname>Onose</surname></string-name>, and <string-name><given-names>D. C.</given-names> <surname>Cercel</surname></string-name></person-group>, &#x201C;<article-title>Sentence-level propaganda detection in news articles with transfer learning and BERT-BiLSTM-capsule model</article-title>,&#x201D; in <conf-name>Proc. Second Workshop Nat. Lang. Process. Internet Freedom: Censorship, Disinformation, and Propaganda</conf-name>, <conf-loc>Hong Kong, China</conf-loc>, <publisher-name>Association for Computational Linguistics</publisher-name>, <year>2019</year>, pp. <fpage>148</fpage>&#x2013;<lpage>154</lpage>.</mixed-citation></ref>
<ref id="ref-39"><label>[39]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>W.</given-names> <surname>Huang</surname></string-name> <etal>et al.</etal></person-group>, &#x201C;<article-title>Hierarchical multi-label text classification: An attention-based recurrent network approach</article-title>,&#x201D; in <conf-name>Proc. 28th ACM Int. Conf. Inform. Knowl. Manag. (CIKM &#x2019;19)</conf-name>, <conf-loc>New York, NY, USA</conf-loc>, <year>2019</year>, pp. <fpage>1051</fpage>&#x2013;<lpage>1060</lpage>.</mixed-citation></ref>
<ref id="ref-40"><label>[40]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>K. L.</given-names> <surname>Tan</surname></string-name>, <string-name><given-names>C. P.</given-names> <surname>Lee</surname></string-name>, and <string-name><given-names>K. M.</given-names> <surname>Lim</surname></string-name></person-group>, &#x201C;<article-title>A survey of sentiment analysis: Approaches, datasets, and future research</article-title>,&#x201D; <source>Appl. Sci.</source>, vol. <volume>13</volume>, no. <issue>7</issue>, p. <fpage>4550</fpage>, <year>2023</year>. doi: <pub-id pub-id-type="doi">10.3390/app13074550</pub-id>.</mixed-citation></ref>
<ref id="ref-41"><label>[41]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>I. K. S.</given-names> <surname>Al-Tameemi</surname></string-name>, <string-name><given-names>M. R.</given-names> <surname>Feizi-Derakhshi</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Pashazadeh</surname></string-name>, and <string-name><given-names>M.</given-names> <surname>Asadpour</surname></string-name></person-group>, &#x201C;<article-title>An efficient sentiment classification method with the help of neighbors and a hybrid of RNN models</article-title>,&#x201D; <source>Complexity</source>, vol. <volume>2023</volume>, pp. <fpage>1</fpage>&#x2013;<lpage>14</lpage>, <year>2023</year>. doi: <pub-id pub-id-type="doi">10.1155/2023/1896556</pub-id>.</mixed-citation></ref>
<ref id="ref-42"><label>[42]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>C. D. P.</given-names> <surname>Laureate</surname></string-name>, <string-name><given-names>W.</given-names> <surname>Buntine</surname></string-name>, and <string-name><given-names>H.</given-names> <surname>Linger</surname></string-name></person-group>, &#x201C;<article-title>A systematic review of the use of topic models for short text social media analysis</article-title>,&#x201D; <source>Artif. Intell. Rev.</source>, vol. <volume>56</volume>, no. <issue>12</issue>, pp. <fpage>1</fpage>&#x2013;<lpage>33</lpage>, <year>2023</year>. doi: <pub-id pub-id-type="doi">10.1007/s10462-023-10471-x</pub-id>; <pub-id pub-id-type="pmid">37362887</pub-id>.</mixed-citation></ref>
<ref id="ref-43"><label>[43]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>F.</given-names> <surname>Sakketou</surname></string-name> and <string-name><given-names>N.</given-names> <surname>Ampazis</surname></string-name></person-group>, &#x201C;<article-title>A constrained optimization algorithm for learning GloVe embeddings with semantic lexicons</article-title>,&#x201D; <source>Knowl. Based Syst.</source>, vol. <volume>195</volume>, p. <fpage>105737</fpage>, <year>2020</year>.</mixed-citation></ref>
<ref id="ref-44"><label>[44]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>K.</given-names> <surname>Kowsari</surname></string-name>, <string-name><given-names>K. J.</given-names> <surname>Meimandi</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Heidarysafa</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Mendu</surname></string-name>, <string-name><given-names>L.</given-names> <surname>Barnes</surname></string-name>, and <string-name><given-names>D.</given-names> <surname>Brown</surname></string-name></person-group>, &#x201C;<article-title>Text classification algorithms: A survey</article-title>,&#x201D; <source>Information</source>, vol. <volume>10</volume>, no. <issue>4</issue>, p. <fpage>150</fpage>, <year>2019</year>. doi: <pub-id pub-id-type="doi">10.3390/info10040150</pub-id>.</mixed-citation></ref>
<ref id="ref-45"><label>[45]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>C. M.</given-names> <surname>Fuller</surname></string-name>, <string-name><given-names>D. P.</given-names> <surname>Biros</surname></string-name>, and <string-name><given-names>D.</given-names> <surname>Delen</surname></string-name></person-group>, &#x201C;<article-title>An investigation of data and text mining methods for real world deception detection</article-title>,&#x201D; <source>Expert Syst. Appl.</source>, vol. <volume>38</volume>, no. <issue>7</issue>, pp. <fpage>8392</fpage>&#x2013;<lpage>8398</lpage>, <year>2011</year>.</mixed-citation></ref>
<ref id="ref-46"><label>[46]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>T.</given-names> <surname>Prexawanprasut</surname></string-name> and <string-name><given-names>P.</given-names> <surname>Chaipornkaew</surname></string-name></person-group>, &#x201C;<article-title>An analytical study on email classification using 10-fold cross-validation</article-title>,&#x201D; in <conf-name>2019 5th Int. Conf. Sci. Inform. Technol. (ICSITech)</conf-name>, <conf-loc>Yogyakarta, Indonesia</conf-loc>, <year>2019</year>, pp. <fpage>38</fpage>&#x2013;<lpage>43</lpage>.</mixed-citation></ref>
<ref id="ref-47"><label>[47]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>I. K. S.</given-names> <surname>Al-Tameemi</surname></string-name>, <string-name><given-names>M. R.</given-names> <surname>Feizi-Derakhshi</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Pashazadeh</surname></string-name>, and <string-name><given-names>M.</given-names> <surname>Asadpour</surname></string-name></person-group>, &#x201C;<article-title>Interpretable multimodal sentiment classification using deep multi-view attentive network of image and text data</article-title>,&#x201D; <source>IEEE Access</source>, vol. <volume>11</volume>, pp. <fpage>91060</fpage>&#x2013;<lpage>91081</lpage>, <year>2023</year>.</mixed-citation></ref>
<ref id="ref-48"><label>[48]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>M. R.</given-names> <surname>Hossain</surname></string-name>, <string-name><given-names>M. M.</given-names> <surname>Hoque</surname></string-name>, and <string-name><given-names>N.</given-names> <surname>Siddique</surname></string-name></person-group>, &#x201C;<article-title>Leveraging the meta-embedding for text classification in a resource-constrained language</article-title>,&#x201D; <source>Eng. Appl. Artif. Intell.</source>, vol. <volume>124</volume>, no. <issue>1</issue>, p. <fpage>106586</fpage>, <year>2023</year>. doi: <pub-id pub-id-type="doi">10.1016/j.engappai.2023.106586</pub-id>.</mixed-citation></ref>
</ref-list>
</back></article>