<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.1 20151215//EN" "http://jats.nlm.nih.gov/publishing/1.1/JATS-journalpublishing1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xml:lang="en" article-type="research-article" dtd-version="1.1">
<front>
<journal-meta>
<journal-id journal-id-type="pmc">CMES</journal-id>
<journal-id journal-id-type="nlm-ta">CMES</journal-id>
<journal-id journal-id-type="publisher-id">CMES</journal-id>
<journal-title-group>
<journal-title>Computer Modeling in Engineering &#x0026; Sciences</journal-title>
</journal-title-group>
<issn pub-type="epub">1526-1506</issn>
<issn pub-type="ppub">1526-1492</issn>
<publisher>
<publisher-name>Tech Science Press</publisher-name>
<publisher-loc>USA</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">66580</article-id>
<article-id pub-id-type="doi">10.32604/cmes.2025.066580</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Enhancing 3D U-Net with Residual and Squeeze-and-Excitation Attention Mechanisms for Improved Brain Tumor Segmentation in Multimodal MRI</article-title>
<alt-title alt-title-type="left-running-head">Enhancing 3D U-Net with Residual and Squeeze-and-Excitation Attention Mechanisms for Improved Brain Tumor Segmentation in Multimodal MRI</alt-title>
<alt-title alt-title-type="right-running-head">Enhancing 3D U-Net with Residual and Squeeze-and-Excitation Attention Mechanisms for Improved Brain Tumor Segmentation in Multimodal MRI</alt-title>
</title-group>
<contrib-group>
<contrib id="author-1" contrib-type="author">
<name name-style="western"><surname>Chen</surname><given-names>Yao-Tien</given-names></name><xref ref-type="aff" rid="aff-1">1</xref></contrib>
<contrib id="author-2" contrib-type="author" corresp="yes">
<name name-style="western"><surname>Ahmad</surname><given-names>Nisar</given-names></name><xref ref-type="aff" rid="aff-1">1</xref><email>d101gp006@mail2.mcut.edu.tw</email></contrib>
<contrib id="author-3" contrib-type="author">
<name name-style="western"><surname>Aurangzeb</surname><given-names>Khursheed</given-names></name><xref ref-type="aff" rid="aff-2">2</xref></contrib>
<aff id="aff-1"><label>1</label><institution>International Ph.D. Program in Innovative Technology of Biomedical Engineering and Medical Devices, Ming Chi University of Technology</institution>, <addr-line>New Taipei City, 243303</addr-line>, <country>Taiwan</country></aff>
<aff id="aff-2"><label>2</label><institution>Department of Computer Engineering, College of Computer and Information Sciences, King Saud University</institution>, <addr-line>P.O. Box 51178, Riyadh, 11543</addr-line>, <country>Saudi Arabia</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>&#x002A;</label>Corresponding Author: Nisar Ahmad. Email: <email>d101gp006@mail2.mcut.edu.tw</email></corresp>
</author-notes>
<pub-date date-type="collection" publication-format="electronic">
<year>2025</year>
</pub-date>
<pub-date date-type="pub" publication-format="electronic">
<day>31</day><month>07</month><year>2025</year>
</pub-date>
<volume>144</volume>
<issue>1</issue>
<fpage>1197</fpage>
<lpage>1224</lpage>
<history>
<date date-type="received">
<day>11</day>
<month>4</month>
<year>2025</year>
</date>
<date date-type="accepted">
<day>26</day>
<month>6</month>
<year>2025</year>
</date>
</history>
<permissions>
<copyright-statement>&#x00A9; 2025 The Authors.</copyright-statement>
<copyright-year>2025</copyright-year>
<copyright-holder>Published by Tech Science Press.</copyright-holder>
<license xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>This work is licensed under a <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</ext-link>, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<self-uri content-type="pdf" xlink:href="TSP_CMES_66580.pdf"></self-uri>
<abstract>
<p>Accurate and efficient brain tumor segmentation is essential for early diagnosis, treatment planning, and clinical decision-making. However, the complex structure of brain anatomy and the heterogeneous nature of tumors present significant challenges for precise anomaly detection. While U-Net-based architectures have demonstrated strong performance in medical image segmentation, there remains room for improvement in feature extraction and localization accuracy. In this study, we propose a novel hybrid model designed to enhance 3D brain tumor segmentation. The architecture combines a 3D ResNet encoder, which mitigates the vanishing gradient problem, with a 3D U-Net decoder. To enhance the model&#x2019;s generalization ability, a Squeeze-and-Excitation (SE) attention mechanism is integrated. We also introduce Gabor filter banks into the encoder to strengthen the model&#x2019;s ability to extract robust and transformation-invariant features from the complex and irregular shapes typical in medical imaging. This approach, which is not well explored in current U-Net-based segmentation frameworks, enhances texture-aware feature representation. Specifically, Gabor filters help extract distinctive low-level texture features, reducing the effects of texture interference and facilitating faster convergence during the early stages of training. Our model achieved Dice scores of 0.881, 0.846, and 0.819 for Whole Tumor (WT), Tumor Core (TC), and Enhancing Tumor (ET), respectively, on the BraTS 2020 dataset. Cross-validation on the BraTS 2021 dataset further confirmed the model&#x2019;s robustness, yielding Dice scores of 0.887 for WT, 0.856 for TC, and 0.824 for ET. The proposed model outperforms several state-of-the-art models, particularly in accurately identifying small and complex tumor regions. Extensive evaluations suggest that integrating advanced preprocessing with an attention-augmented hybrid architecture offers significant potential for reliable and clinically valuable brain tumor segmentation.</p>
</abstract>
<kwd-group kwd-group-type="author">
<kwd>3D MRI</kwd>
<kwd>artificial intelligence</kwd>
<kwd>deep learning</kwd>
<kwd>AI in healthcare</kwd>
<kwd>attention mechanism</kwd>
<kwd>U-Net</kwd>
<kwd>medical image analysis</kwd>
<kwd>brain tumor segmentation</kwd>
<kwd>BraTS 2021</kwd>
<kwd>BraTS 2020</kwd>
</kwd-group>
<funding-group>
<award-group id="awg1">
<funding-source>National Science and Technology Council</funding-source>
<award-id>112-2637-M-131-001</award-id>
</award-group>
</funding-group>
</article-meta>
</front>
<body>
<sec id="s1">
<label>1</label>
<title>Introduction</title>
<p>Tumors in the brain result from uncontrolled cellular growth, which can interfere with neural processes and harm adjacent healthy tissues. Because the brain is essential for regulating bodily activities, such growths can greatly affect its functioning, making them some of the most critical health threats to individuals. The incidence of malignant brain tumors is currently high, impacting both individuals and society as a whole [<xref ref-type="bibr" rid="ref-1">1</xref>]. The most common type of brain tumor is glioma, which exhibits varying degrees of aggressiveness. Gliomas can present with different symptoms and affect different brain sub-regions. These sub-regions can be categorized into peritumoral edema, necrotic core, and enhancing and non-enhancing tumors [<xref ref-type="bibr" rid="ref-2">2</xref>]. Magnetic Resonance Imaging (MRI) sequences are highly beneficial when assessing gliomas because they provide complementary information [<xref ref-type="bibr" rid="ref-3">3</xref>], and radiologists commonly rely on MRI scans for diagnosing and assessing brain tumors. The complementary MRI modalities are T1-weighted (T1), contrast-enhanced T1-weighted (T1CE), T2-weighted (T2), and Fluid Attenuated Inversion Recovery (FLAIR). These scans differ in their repetition and echo times, so they can be used together to identify different tumor subregions [<xref ref-type="bibr" rid="ref-4">4</xref>&#x2013;<xref ref-type="bibr" rid="ref-6">6</xref>].</p>
<p>Identifying brain tumor sub-regions manually from MRI data is subjective, time-consuming, and prone to errors. Radiologists may face challenges distinguishing brain cell nuclei from the MRI image background, adding complexity to the medical interpretation [<xref ref-type="bibr" rid="ref-7">7</xref>]. The complex shapes and positions of brain tumors in multimodal images make tumor identification in brain MRI scans challenging. However, accurate tumor segmentation and delineation are essential for diagnosing and characterizing brain tumors [<xref ref-type="bibr" rid="ref-8">8</xref>]. Accurate segmentation enables the extraction of qualitative and quantitative data that help distinguish between benign and malignant tumors. This information aids in tailoring optimal therapies for patients and assists healthcare providers in devising more effective treatment strategies. Simplifying image analysis and segmentation facilitates efficient tumor identification.</p>
<p>Given the complexity of tumor segmentation in MRI images, numerous algorithms and techniques, ranging from manual to fully automated approaches, have been developed to address this challenge. Automated segmentation of gliomas from multimodal MRI scans can assist clinicians in surgical planning and diagnosis. Furthermore, it provides a reliable and consistent method for future tumor research and monitoring [<xref ref-type="bibr" rid="ref-9">9</xref>]. Computer-Aided Detection (CAD) systems, particularly those using deep learning and Convolutional Neural Networks (CNNs), have shown strong potential in brain tumor identification from MRI scans [<xref ref-type="bibr" rid="ref-10">10</xref>]. Studies have demonstrated that AI systems can surpass human performance in various medical imaging tasks, including segmentation and diagnosis [<xref ref-type="bibr" rid="ref-11">11</xref>]. CAD systems offer numerous benefits, including improving radiologists&#x2019; subjective judgment and accelerating the screening process. As in other medical imaging fields, machine learning and Artificial Intelligence (AI) play significant roles in CAD systems for brain tumor classification and identification. Several CAD approaches have been presented in the literature for diagnosing and categorizing brain tumors.</p>
<p>Emerging deep learning techniques can address the limitations of traditional machine learning methods [<xref ref-type="bibr" rid="ref-12">12</xref>]. Their capacity for self-learning may help create new imaging features that are beneficial for statistical brain MRI analysis. Numerous research studies have concentrated on utilizing CNNs for delineating brain tumors. Deep Convolutional Neural Networks (DCNNs) have proven capable of segmenting brain tumors in both real-world and clinical image datasets [<xref ref-type="bibr" rid="ref-13">13</xref>]. U-Net [<xref ref-type="bibr" rid="ref-14">14</xref>] and Fully Convolutional Networks (FCN) [<xref ref-type="bibr" rid="ref-15">15</xref>] stand out as the most commonly employed deep learning-based methods for medical image segmentation. Among them, U-Net has proven to be the most effective in terms of performance. While U-Nets have demonstrated accuracy comparable to human performance in segmenting 2D images, their application to volumetric medical images requires treating 3D images as multiple 2D slices. This approach obstructs the capture of connections between adjacent slices. Consequently, several subsequent studies have pursued volumetric extensions of the U-Net to achieve finer localization. The creators of U-Net proposed a practical solution to the volumetric segmentation challenge, known as 3D U-Net [<xref ref-type="bibr" rid="ref-16">16</xref>], which replaces the U-Net&#x2019;s 2D convolutions with their 3D counterparts.</p>
<p>Various advanced filtering methods have been utilized to extract meaningful image representations. Among them, deformable filters [<xref ref-type="bibr" rid="ref-17">17</xref>] improve the model&#x2019;s capability to handle geometric transformations by adapting their shape to the input features. However, this flexibility comes at the cost of increased model complexity and higher computational demands during training. Another notable approach involves the application of rotating filters. For instance, Zhou et al. [<xref ref-type="bibr" rid="ref-18">18</xref>] proposed actively rotating filters that dynamically rotate during convolution, enabling the generation of feature maps that explicitly encode spatial position and orientation. Despite their innovation, these filters are more effective when applied to small and relatively simple filter configurations.</p>
<p>In computer vision and image processing, the Gabor filter [<xref ref-type="bibr" rid="ref-19">19</xref>] is one of the most renowned texture analysis and feature extraction mechanisms. A recent study [<xref ref-type="bibr" rid="ref-20">20</xref>] showed that the meaningful features extracted by the Gabor filter enhance the accuracy of the model and also help modulate learned representations, thereby improving the network&#x2019;s interpretability.</p>
<p>Many researchers have adapted attention mechanisms [<xref ref-type="bibr" rid="ref-21">21</xref>], initially developed in natural language processing (NLP), for machine vision tasks. This integration aims to enhance the capability of CNNs in the analysis of images, particularly for precision-demanding tasks such as brain tumor segmentation. In the complex task of predicting 3D medical image segmentation, it is essential to consider both local and global features. Hatamizadeh et al. [<xref ref-type="bibr" rid="ref-22">22</xref>] introduced the U-NET Transformer (UNETR), an architecture that employs Transformers as encoders to learn sequential representations from input volumes. This design efficiently extracts global multiscale information while retaining an effective &#x201C;U-shaped&#x201D; encoder-decoder design. SwinBTS [<xref ref-type="bibr" rid="ref-23">23</xref>] combines transformers, CNNs, and an encoder-decoder architecture for 3D medical image segmentation.</p>
<p>In dense prediction tasks like segmentation, capturing both local and global information is highly significant. However, splitting images into patches overlooks local structures. This limitation is particularly significant in medical volumetric data, such as 3D MRI scans, where modeling local features across continuous slices (the depth dimension) is essential for accurate segmentation [<xref ref-type="bibr" rid="ref-24">24</xref>]. Therefore, an efficient model that can capture local and global features without overlooking the significant details in volumetric segmentation is indispensable. Traditional image segmentation approaches are limited in their ability to delineate objects with high precision.</p>
<p>As neural networks deepen, the vanishing gradient problem often arises during training, where the gradient norms in early layers diminish toward zero. In conventional U-Net architectures, down-sampling operations tend to suppress low-level features essential for accurate segmentation in favor of high-level semantic information. Consequently, important local and positional details are progressively lost in deeper layers due to successive convolutional and non-linear operations. Although incorporating residual layers in the encoder helps mitigate the vanishing gradient issue by enabling gradient flow through shortcut connections, this strategy alone is not sufficient to achieve optimal segmentation performance. The network must also dynamically emphasize the most important features across different channels, which enhances the model&#x2019;s generalization ability. This is where the Squeeze-and-Excitation (SE) mechanism becomes essential. By recalibrating channel-wise feature responses, the SE mechanism enhances the network&#x2019;s ability to emphasize significant features and suppress less important ones. This adaptive feature prioritization helps retain crucial spatial and contextual information, leading to more accurate segmentation results.</p>
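<p>The channel recalibration performed by an SE block can be illustrated with a minimal sketch in plain Python. It is not the authors&#x2019; implementation: the function name se_recalibrate, the weight matrices w1 and w2 (stand-ins for learned parameters), and the toy feature maps are all hypothetical, and each channel is flattened to a list of voxel values for readability.</p>
<code language="python"><![CDATA[
```python
import math


def se_recalibrate(channels, w1, w2):
    """Squeeze-and-Excitation on a list of channels (each a flat list of voxels).

    w1: reduction weights, shape [hidden][num_channels]
    w2: expansion weights, shape [num_channels][hidden]
    Both weight matrices are hypothetical stand-ins for learned parameters.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [sum(ch) / len(ch) for ch in channels]

    # Excitation, step 1: bottleneck fully connected layer followed by ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]

    # Excitation, step 2: expansion layer followed by a sigmoid gate in (0, 1).
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]

    # Scale: every voxel in a channel is multiplied by that channel's gate.
    scaled = [[g * v for v in ch] for g, ch in zip(gates, channels)]
    return scaled, gates


if __name__ == "__main__":
    feats = [[1.0, 3.0], [0.0, 0.0], [2.0, 2.0]]   # 3 toy channels, 2 voxels each
    w1 = [[0.5, 0.5, 0.5]]                         # reduce 3 channels to 1 hidden unit
    w2 = [[1.0], [-1.0], [0.0]]                    # expand back to 3 gates
    out, gates = se_recalibrate(feats, w1, w2)
    print(gates)
```
]]></code>
<p>In this toy example the first channel receives a gate above 0.5 and the second a gate below 0.5, which is the emphasize-and-suppress behavior attributed to the SE mechanism; in a trained network the gates would depend on learned weights rather than the hand-picked values used here.</p>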
<p>Incorporating the SE mechanism with Residual networks minimizes the vanishing gradient problem and improves the network&#x2019;s capacity to generalize better by focusing on the most relevant features. This dual approach ensures that low-level and high-level features are effectively utilized, enhancing the model&#x2019;s overall performance in brain tumor segmentation tasks.</p>
<p>Although prior studies have individually explored residual connections, attention mechanisms, or texture-based filtering in medical imaging, no previous study has combined a ResNet-based encoder, SE attention mechanisms with skip connections, and Gabor filtering into a unified 3D U-Net framework for brain tumor segmentation. This architecture is specifically designed to address the complexity of tumor subregion delineation in 3D MRI, and our experimental results on the BraTS datasets confirm its effectiveness over conventional methods.</p>
<p>This novel study emphasizes the importance of addressing the vanishing gradient problem and the need for adaptive feature prioritization through integrating Residual Networks and SE attention mechanisms, ensuring enhanced generalization and improved accuracy in brain tumor segmentation. To address the limitations of current segmentation approaches, we propose a novel architecture named dSEAT-UNet, which integrates deep residual encoding, SE Attention, and a 3D UNet-based decoder for effective brain tumor segmentation in multimodal MRI scans.</p>
<p>To achieve accurate and efficient brain tumor segmentation, we have implemented the following steps, which represent the significant contributions of this study:
<list list-type="bullet">
<list-item>
<p>A conventional spectral technique is integrated into the multimodal segmentation model: a fixed and optimized Gabor filter supplies texture feature maps that ease the segmentation of complex tumor shapes.</p></list-item>
<list-item>
<p>A Hybrid 3D model is designed based on the residual encoder and U-Net decoder. The residual encoder effectively addresses the issue of vanishing gradients. Skip connections are employed to accelerate the training process.</p></list-item>
<list-item>
<p>A Squeeze-and-Excitation based attention mechanism is introduced in the skip connections making the model automatically focus solely on significant features crucial for brain tumor segmentation.</p></list-item>
<list-item>
<p>To address the class imbalance problem in brain tumor images, we incorporate a combined dice loss and focal loss in the total loss function.</p></list-item>
<list-item>
<p>The developed hybrid 3D model is compared in depth with state-of-the-art brain tumor segmentation methods on Whole Tumor (WT), Tumor Core (TC), and Enhancing Tumor (ET) to highlight its improved performance.</p></list-item>
</list></p>
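<p>The combined Dice and focal loss named in the contributions can be sketched as follows for a single binary class. This is an illustrative sketch under stated assumptions, not the authors&#x2019; loss: the focusing parameter gamma, the equal weighting of the two terms, and the function names are hypothetical, and predictions and targets are flattened to plain lists.</p>
<code language="python"><![CDATA[
```python
import math


def dice_loss(probs, targets, eps=1e-6):
    """Soft Dice loss for one class; probs and targets are flat lists."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + eps) / (denom + eps)


def focal_loss(probs, targets, gamma=2.0, eps=1e-7):
    """Binary focal loss averaged over voxels (gamma is an assumed setting)."""
    total = 0.0
    for p, t in zip(probs, targets):
        p_t = p if t == 1 else 1.0 - p          # probability of the true class
        p_t = min(max(p_t, eps), 1.0 - eps)     # clamp to keep log() finite
        total += -((1.0 - p_t) ** gamma) * math.log(p_t)
    return total / len(probs)


def combined_loss(probs, targets, dice_weight=1.0, focal_weight=1.0):
    """Weighted sum of Dice and focal loss; the weights are hypothetical."""
    return (dice_weight * dice_loss(probs, targets)
            + focal_weight * focal_loss(probs, targets))
```
]]></code>
<p>The Dice term rewards overlap regardless of how few foreground voxels there are, while the focal term down-weights voxels that are already classified confidently, which is why the pairing is a common response to class imbalance.</p>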
</sec>
<sec id="s2">
<label>2</label>
<title>Related Work</title>
<p>This section reviews related work on CNN architectures for brain tumor segmentation and on the integration of attention mechanisms into deep neural networks. The two areas are presented separately in the following subsections.</p>
<sec id="s2_1">
<label>2.1</label>
<title>CNN Architectures in Brain Tumor Segmentation</title>
<p>In recent years, automated segmentation of brain tumors from multimodal MRI scans has gained significant attention within the healthcare imaging sector. Various advanced AI models have been proposed to mitigate the complexities regarding segmentation of brain tumors. One such approach, proposed by Raza et al. [<xref ref-type="bibr" rid="ref-25">25</xref>], introduced the dResU-Net model for 3D brain tumor segmentation. The dResU-Net architecture is based on the U-Net framework, enhanced with residual connections to improve segmentation accuracy. By leveraging multimodal MRI data, including T1, T1CE, T2, and FLAIR images, the model aims to provide robust brain tumor segmentation results. Incorporating residual connections enables efficient information flow and gradient propagation, facilitating the segmentation of complex brain tumor structures. While the dResU-Net architecture significantly advances brain tumor segmentation, ongoing research further explores novel deep-learning approaches and data augmentation techniques to improve segmentation accuracy and robustness.</p>
<p>Wang et al. [<xref ref-type="bibr" rid="ref-26">26</xref>] introduced the TransBTS architecture, which effectively integrates a transformer into a three-dimensional convolutional neural network structured around an encoding-decoding pipeline. Initially, a 3D convolutional backbone is employed to capture detailed local features and spatial representations. These extracted features are then fed to a transformer to extract global features. Subsequently, the decoder module combines these local and global features during upsampling to generate segmentation results. The experiments were conducted using the BraTS 2019 and 2020 datasets, demonstrating comparable performance.</p>
<p>Chen et al. [<xref ref-type="bibr" rid="ref-27">27</xref>] proposed a separable 3D U-Net architecture to overcome the limitations of traditional 2D CNNs, which often fail to fully capture the spatial context of volumetric data. To reduce memory consumption while preserving spatial information, their model replaces standard 3D convolutions with a sequence of two layers: a 2D convolution for extracting spatial features and a 1D convolution for capturing temporal dependencies. This approach efficiently processes 3D brain volumes by decomposing convolutions into three separate branches. Additionally, separable temporal convolutions were integrated into a residual inception framework. The model was independently trained on axial, sagittal, and coronal views, with the outputs from each orientation combined using a multi-view fusion strategy to boost performance. The effectiveness of this design was demonstrated through strong results on the BraTS 2018 test dataset. In segmentation tasks, capturing both local and global features is essential for accurate predictions. However, as the network depth increases, the gradients associated with low-level features such as edges, boundaries, and fine textures tend to vanish, reducing their influence during training. Maji et al. [<xref ref-type="bibr" rid="ref-28">28</xref>] presented a ResUNet model that incorporates attention mechanisms alongside a decoder guided by auxiliary features for brain tumor segmentation. This model guided the learning process at each decoder layer. Benefiting from attention mechanisms, the model focused on significant features rather than including all features, thereby reducing the introduction of noisy features into the decoder for segmentation mapping. The proposed model outperformed the other methods when evaluated on the BraTS 2019 dataset.</p>
</sec>
<sec id="s2_2">
<label>2.2</label>
<title>Integrating Attention Mechanism in CNN Architectures</title>
<p>Researchers suggest that integrating the attention mechanism into CNNs could enhance the expression of local features and improve region-level segmentation performance. Because some features are more important than others for accurate segmentation, attention mechanisms have become valuable tools for highlighting the essential features while minimizing the impact of less important ones. Attention mechanisms have also demonstrated strong potential in broader decision-making applications beyond medical imaging. Kia [<xref ref-type="bibr" rid="ref-29">29</xref>] applied attention-guided deep learning to multi-criteria decision analysis, highlighting the effectiveness of attention modules in directing computational focus toward the most relevant information. This cross-domain success further supports the growing adoption of attention-based strategies in brain tumor segmentation tasks.</p>
<p>Zhang et al. [<xref ref-type="bibr" rid="ref-30">30</xref>] proposed the AResU-Net, a model designed to perform volumetric brain tumor segmentation. Their approach incorporated attention mechanisms and residual units into the up-sampling and down-sampling layers. The enhancement was intended to strengthen local feature responses during down-sampling and to improve feature restoration during up-sampling. Given the constraints in computational power, the evaluations were conducted using 2D images from two datasets, BraTS 2017 and 2018. Cao et al. [<xref ref-type="bibr" rid="ref-31">31</xref>] introduced a novel architecture named MBANet, which incorporates a multi-branch attention mechanism into 3D CNNs. Building upon this approach, this study emphasizes the importance of integrating attention mechanisms into brain tumor segmentation networks. This integration aims to minimize the focus on irrelevant data while enhancing the precise identification of brain tumor regions. Akbar et al. [<xref ref-type="bibr" rid="ref-32">32</xref>] presented a modified U-Net method that integrates attention-based skip connections. Additionally, they developed the Multi-path Residual Attention Block (MRAB), which combines two deeply convolutional sequences linked with an attention block and a residual path. Zhang et al. [<xref ref-type="bibr" rid="ref-24">24</xref>] introduced a brain tumor segmentation method that examines the effect of using an attention mechanism. By incorporating the attention mechanism into U-Net, their method makes segmentation more robust, enhances local feature expression, and improves medical image segmentation performance. Liu et al. [<xref ref-type="bibr" rid="ref-33">33</xref>] presented a lightweight 3D method integrating an attention mechanism. 
This feature enables the network to autonomously concentrate on the tumor region, thereby strengthening the correlation between the whole tumor and tumor core, resulting in improved segmentation accuracy. Li et al. [<xref ref-type="bibr" rid="ref-34">34</xref>] developed an intelligent method for brain tumor identification and classification from MRI data. Their approach includes a preprocessing step to remove image background and identify brain tissue, followed by a novel segmentation technique based on parallel CNNs to classify tumor types. Yuan et al. [<xref ref-type="bibr" rid="ref-35">35</xref>] implemented channel attention as an SE network to improve the efficiency of the T2T-ViT backbone to implement a transformer-based application for image classification.</p>
</sec>
</sec>
<sec id="s3">
<label>3</label>
<title>Methodology</title>
<p>This section presents details on the dataset used in the study, the preprocessing strategies applied, and the architectural design of the proposed model.</p>
<sec id="s3_1">
<label>3.1</label>
<title>Brain Tumor Dataset</title>
<p>In this study, we used the benchmark BraTS (Brain Tumor Segmentation) 2020 dataset [<xref ref-type="bibr" rid="ref-36">36</xref>&#x2013;<xref ref-type="bibr" rid="ref-38">38</xref>] for training and testing the designed model. This multimodal dataset contains four channels (T1, T1CE, T2, and T2-FLAIR), each of which highlights different tissue properties. T1 predominantly shows healthy tissues, T2 highlights tumor regions, T1CE emphasizes tumor borders, and FLAIR scans assist in distinguishing edema from Cerebrospinal Fluid (CSF) [<xref ref-type="bibr" rid="ref-39">39</xref>,<xref ref-type="bibr" rid="ref-40">40</xref>]. T1 images are obtained through sagittal or axial 2D acquisitions with slice thicknesses ranging from 1 to 6 mm. T1CE images are contrast-enhanced T1 images acquired via 3D acquisitions with an isotropic voxel size of 1 mm. T2 images are acquired through axial 2D acquisitions, with slice thicknesses varying from 2 to 6 mm. FLAIR images are T2-weighted FLAIR images acquired in axial, coronal, or sagittal 2D acquisitions, with slice thicknesses ranging from 2 to 6 mm. The mask contains four labels: Background, Edema (ED), Enhancing Tumor (ET), and Non-Enhancing Tumor (NET). Sample images from the BraTS 2020 dataset are displayed in <xref ref-type="fig" rid="fig-1">Fig. 1</xref>.</p>
<fig id="fig-1">
<label>Figure 1</label>
<caption>
<title>Multimodal dataset samples shown here are provided in the BraTS 2020 benchmark for four modalities, i.e., T1, T1CE, T2, and FLAIR (in actual dimensions)</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_66580-fig-1.tif"/>
</fig>
</sec>
<sec id="s3_2">
<label>3.2</label>
<title>Dataset Preprocessing</title>
<p>Segmenting brain tumors in MRI is a difficult task due to the brain&#x2019;s complex structure, different types of tissues, and varying image quality. Even though deep learning models can handle some noise, proper data preprocessing is still essential to enhance segmentation accuracy. We performed data preprocessing on the original BraTS 2020 dataset to make it suitable for the model. The following sections provide an explanation of these steps.</p>
<p>The BraTS 2020 dataset contains 369 images, each with an original resolution of 240 &#x00D7; 240 &#x00D7; 155. Brain tumor sample images for all modalities and the respective ground truth are presented in <xref ref-type="fig" rid="fig-2">Fig. 2</xref>. In our analysis, we resized the images to 160 &#x00D7; 160 &#x00D7; 128. While some studies have resized the images to 128 &#x00D7; 128 &#x00D7; 128, we observed that this resolution did not cover the entire tumor in some cases. Therefore, we chose 160 &#x00D7; 160 &#x00D7; 128 to ensure better coverage of the tumor regions, which improves the accuracy and reliability of our segmentation results.</p>
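<p>One common way to reduce a 240 &#x00D7; 240 &#x00D7; 155 volume to 160 &#x00D7; 160 &#x00D7; 128 is a centered crop, sketched below. The text does not state whether the resizing is done by cropping or by interpolation, nor where any crop window is placed, so the centered window and the function names center_crop_bounds and crop_volume are assumptions made purely for illustration.</p>
<code language="python"><![CDATA[
```python
def center_crop_bounds(shape, target):
    """Return (start, stop) index pairs for a centered crop along each axis."""
    bounds = []
    for size, want in zip(shape, target):
        if want > size:
            raise ValueError("target larger than source along an axis")
        start = (size - want) // 2
        bounds.append((start, start + want))
    return bounds


def crop_volume(volume, bounds):
    """Crop a nested-list volume indexed as volume[x][y][z]."""
    (x0, x1), (y0, y1), (z0, z1) = bounds
    return [[row[z0:z1] for row in plane[y0:y1]] for plane in volume[x0:x1]]


if __name__ == "__main__":
    # Prints [(40, 200), (40, 200), (13, 141)]
    print(center_crop_bounds((240, 240, 155), (160, 160, 128)))
```
]]></code>
<p>Under this assumption, 40 voxels are discarded on each side in-plane and 13 or 14 slices at either end of the depth axis; the same bounds would be applied to every modality and to the mask so that they stay aligned.</p>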
<fig id="fig-2">
<label>Figure 2</label>
<caption>
<title>Samples of multimodal MRI BraTS 2020 images with segmentation mask (ground truth)</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_66580-fig-2.tif"/>
</fig>
<p>As shown in <xref ref-type="fig" rid="fig-1">Fig. 1</xref>, the original images contain extra background voxels, which are removed to avoid computational overload. The resized dataset, at 160 &#x00D7; 160 &#x00D7; 128 voxels, retains the required region of interest. <xref ref-type="fig" rid="fig-3">Fig. 3</xref> shows the resized dataset samples with channels and masks. The dataset is divided into 75% for training, 15% for validation, and 10% for testing.</p>
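<p>The 75/15/10 split can be sketched as a seeded shuffle of case identifiers. The exact per-subset counts are not given in the text, so the rounding rule, the seed, and the function name split_cases below are assumptions; the remainder after rounding is assigned to the test subset.</p>
<code language="python"><![CDATA[
```python
import random


def split_cases(case_ids, fractions=(0.75, 0.15, 0.10), seed=0):
    """Shuffle case IDs and split them into train/validation/test lists."""
    ids = list(case_ids)
    random.Random(seed).shuffle(ids)          # seeded for reproducibility
    n_train = round(fractions[0] * len(ids))
    n_val = round(fractions[1] * len(ids))
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]              # remainder goes to the test set
    return train, val, test


if __name__ == "__main__":
    train, val, test = split_cases(range(369))
    print(len(train), len(val), len(test))
```
]]></code>
<p>With this rounding rule, 369 cases yield 277 training, 55 validation, and 37 test cases. Splitting by case rather than by slice keeps all data from one patient inside a single subset.</p>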
<fig id="fig-3">
<label>Figure 3</label>
<caption>
<title>Resized BraTS 2020 sample MRI images with mask (ground truth)</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_66580-fig-3.tif"/>
</fig>
<p>We employed min-max scaling, also called normalization, which is regarded as one of the simplest scaling techniques. Improper feature scaling can lead the model to give too much importance to features with larger numerical values. To avoid this, normalization is applied to transform the data into a standard range between 0 and 1 [<xref ref-type="bibr" rid="ref-41">41</xref>]. This process rescales each feature using its minimum and maximum values, standardizing the distribution and preventing scale-related bias during training.</p>
<p>This scaling ensures fair comparisons between images and enables each pixel to contribute proportionally to the overall image, a critical step emphasized in recent segmentation frameworks that combine preprocessing with optimized model pipelines [<xref ref-type="bibr" rid="ref-42">42</xref>]. Image normalization is then performed so that the scaled pixel values have a mean of 0 and a standard deviation of 1. The mean is subtracted from each scaled value, and the result is divided by the standard deviation [<xref ref-type="bibr" rid="ref-20">20</xref>]. Normalization is computed using <xref ref-type="disp-formula" rid="eqn-1">Eq. (1)</xref>.
<disp-formula id="eqn-1"><label>(1)</label><mml:math id="mml-eqn-1" display="block"><mml:msub><mml:mi>z</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mi>m</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:msub><mml:mi>V</mml:mi><mml:mrow><mml:mi>s</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub><mml:mo>&#x2212;</mml:mo><mml:mi>&#x03BC;</mml:mi></mml:mrow><mml:mi>&#x03C3;</mml:mi></mml:mfrac></mml:math></disp-formula>where <inline-formula id="ieqn-1"><mml:math id="mml-ieqn-1"><mml:msub><mml:mi>z</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mi>m</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> is the calculated normalized value for each image patch, <inline-formula id="ieqn-2"><mml:math id="mml-ieqn-2"><mml:msub><mml:mi>V</mml:mi><mml:mrow><mml:mi>s</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> is the scaled image value obtained from the first step of min-max scaling. <inline-formula id="ieqn-3"><mml:math id="mml-ieqn-3"><mml:mi>&#x03BC;</mml:mi></mml:math></inline-formula> and <inline-formula id="ieqn-4"><mml:math id="mml-ieqn-4"><mml:mi>&#x03C3;</mml:mi></mml:math></inline-formula> are the mean and standard deviation of the scaled pixel values.</p>
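<p>The two preprocessing steps, min-max scaling followed by Eq. (1), can be sketched with NumPy as follows. The small constant <monospace>eps</monospace> and the synthetic input are additions for illustration, and the statistics are computed over the whole array passed in; whether they should be taken per patch or per modality depends on the pipeline.</p>

```python
import numpy as np

def min_max_scale(volume, eps=1e-8):
    """Rescale intensities to the range [0, 1]."""
    v_min, v_max = volume.min(), volume.max()
    return (volume - v_min) / (v_max - v_min + eps)

def z_score(scaled, eps=1e-8):
    """Apply Eq. (1): subtract the mean and divide by the standard deviation."""
    return (scaled - scaled.mean()) / (scaled.std() + eps)

rng = np.random.default_rng(0)
volume = rng.uniform(0, 4000, size=(16, 16, 8))  # synthetic intensities
scaled = min_max_scale(volume)
normalized = z_score(scaled)
print(normalized.mean(), normalized.std())  # close to 0 and 1
```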
<p>The BraTS 2020 dataset contains four classes with the following labels: Background (Label 0), NET (Label 1), ED (Label 2), and ET (Label 4). Label 3 is missing; therefore, for better data handling, label 4 is remapped to label 3, as presented in <xref ref-type="table" rid="table-1">Table 1</xref>.</p>
<table-wrap id="table-1">
<label>Table 1</label>
<caption>
<title>Information related to tumor classes with labels (before and after re-arrangement)</title>
</caption>
<table>
<colgroup>
<col/>
<col/>
<col/>
<col/>
</colgroup>
<thead>
<tr>
<th colspan="2">Actual</th>
<th colspan="2">Re-arranged</th>
</tr>
<tr>
<th>Tumor class</th>
<th>Label</th>
<th>Tumor class</th>
<th>Label</th>
</tr>
</thead>
<tbody>
<tr>
<td>Background</td>
<td>0</td>
<td>Background</td>
<td>0</td>
</tr>
<tr>
<td>Necrosis/Non-enhancing tumor (NET)</td>
<td>1</td>
<td>Necrosis/Non-enhancing tumor (NET)</td>
<td>1</td>
</tr>
<tr>
<td>Edema (ED)</td>
<td>2</td>
<td>Edema (ED)</td>
<td>2</td>
</tr>
<tr>
<td>Enhancing tumor (ET)</td>
<td>4</td>
<td>Enhancing tumor (ET)</td>
<td>3</td>
</tr>
</tbody>
</table>
</table-wrap>
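<p>The re-arrangement in Table 1 amounts to replacing label 4 with label 3 so that the labels are contiguous. A minimal sketch, with a hypothetical helper name <monospace>remap_labels</monospace>:</p>

```python
import numpy as np

def remap_labels(mask):
    """Return a copy of the mask with label 4 (ET) replaced by label 3."""
    remapped = mask.copy()
    remapped[remapped == 4] = 3
    return remapped

mask = np.array([0, 1, 2, 4, 4, 0], dtype=np.uint8)
print(remap_labels(mask))  # [0 1 2 3 3 0]
```

<p>Contiguous labels 0 to 3 allow the mask to be one-hot encoded into exactly four channels, matching the four-channel softmax output.</p>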
</sec>
<sec id="s3_3">
<label>3.3</label>
<title>Gabor Filter</title>
<p>In computer vision, image filtering is used to emphasize specific features in images. One such method is the Gabor filter, a sinusoidal wave modulated by a Gaussian envelope, which can be tuned to different frequencies and orientations to capture texture details [<xref ref-type="bibr" rid="ref-43">43</xref>]. The mathematical formulation of the Gabor filter, including its equation, variable definitions, and parameter design considerations, is detailed in Appendix A.</p>
<p>Before application, the filter parameters, wavelength (&#x03BB;), orientation (&#x03B8;), phase offset (&#x03C8;), standard deviation (&#x03C3;), and aspect ratio (&#x03B3;), were empirically optimized. The selected value ranges and their functional roles are summarized in <xref ref-type="table" rid="table-8">Table A1</xref> (Appendix A).</p>
<p>For 3D brain tumor segmentation, we implemented a volumetric Gabor filter with a kernel size of (3, 3, 3). Sample slices of this 3D Gabor filter across the <italic>x</italic>, <italic>y</italic>, and <italic>z</italic> axes are illustrated in <xref ref-type="fig" rid="fig-4">Fig. 4</xref>, showing how varying parameter combinations affect texture response. These are arranged in a 3 &#x00D7; 5 grid to visually demonstrate the diversity and directional sensitivity of the filter design.</p>
<fig id="fig-4">
<label>Figure 4</label>
<caption>
<title>Gabor filter banks applied to a sample input image and the resulting filtered images. After experimenting with different parameter values, the most suitable parameters were selected</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_66580-fig-4.tif"/>
</fig>
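<p>For illustration, a 3 &#x00D7; 3 &#x00D7; 3 Gabor kernel can be sketched as a Gaussian envelope multiplied by a cosine carrier. The extension of the usual 2D parametrization to 3D used here (carrier rotated in the x-y plane, isotropic envelope along z), the default parameter values, and the function name <monospace>gabor_kernel_3d</monospace> are all assumptions; the formulation and parameter ranges actually used are those given in Appendix A.</p>

```python
import numpy as np

def gabor_kernel_3d(size=3, lam=2.0, theta=0.0, psi=0.0, sigma=1.0, gamma=0.5):
    """Build a size^3 Gabor kernel: Gaussian envelope times a cosine carrier.

    The carrier propagates in the x-y plane at angle theta; the envelope is
    isotropic along z. This 3D extension is an illustrative assumption.
    """
    half = size // 2
    coords = np.arange(-half, half + 1)
    x, y, z = np.meshgrid(coords, coords, coords, indexing="ij")
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    y_rot = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_rot**2 + (gamma * y_rot)**2 + z**2) / (2.0 * sigma**2))
    carrier = np.cos(2.0 * np.pi * x_rot / lam + psi)
    return envelope * carrier

kernel = gabor_kernel_3d()
print(kernel.shape)  # (3, 3, 3)
```

<p>A filter bank is obtained by evaluating the function over a grid of wavelengths and orientations and convolving each kernel with the input volume.</p>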
<p>Gabor filters are widely acknowledged for their ability to identify spatial and frequency domain characteristics, making them a preferred tool in numerous pattern analysis tasks.</p>
<p><xref ref-type="fig" rid="fig-5">Fig. 5</xref> illustrates the architecture of the system and its main modules. In the first stage, the brain tumor dataset is preprocessed using normalization, min-max scaling, and dimensionality reduction. Following this, Gabor filter operations are applied to enhance spatial frequency features and improve texture representation in the MRI images. The enhanced images are then used to train the hybrid 3D model after tuning the appropriate hyperparameters. Finally, the trained model is tested on unseen MRI images, and the segmented brain tumor regions are produced in the output stage.</p>
<fig id="fig-5">
<label>Figure 5</label>
<caption>
<title>An overview of the system architecture: dataset preprocessing, implementation of Gabor filters, model integration, and illustration of tumor segmentation with tumor classes</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_66580-fig-5.tif"/>
</fig>
</sec>
<sec id="s3_4">
<label>3.4</label>
<title>Model Architecture</title>
<p>The proposed dSEAT-UNet architecture for 3D brain tumor segmentation combines elements of a 3D ResNet encoder and a 3D U-Net decoder [<xref ref-type="bibr" rid="ref-17">17</xref>]. It resembles a typical U-Net with encoder and decoder sections interconnected through skip connections that incorporate SE attention. The model takes as input a 4-channel 3D MRI scan, where each channel represents a different MRI modality: T1, T2, T1CE, and FLAIR. This image has dimensions of 160 &#x00D7; 160 &#x00D7; 128 voxels. The ResNet-based encoder extracts features from this multimodal data, reducing spatial resolution (160 &#x00D7; 160 &#x00D7; 128 to 16 &#x00D7; 16 &#x00D7; 8) while increasing feature map depth.</p>
<p>Residual connections and SE attention within the encoder boost feature learning and focus on informative channels. The complete model design is displayed in <xref ref-type="fig" rid="fig-6">Fig. 6</xref>. The U-Net decoder utilizes transposed convolutions to expand feature maps (16 &#x00D7; 16 &#x00D7; 8 to 160 &#x00D7; 160 &#x00D7; 128) and incorporates skip connections for detailed segmentation. SE attention refines feature maps throughout the network, aiding in essential area classification.</p>
<fig id="fig-6">
<label>Figure 6</label>
<caption>
<title>The proposed architecture of dSEAT-UNet</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_66580-fig-6.tif"/>
</fig>
<p>The final output is a segmented volume in which each voxel is assigned to one of four classes:
<list list-type="bullet">
<list-item>
<p>Background</p></list-item>
<list-item>
<p>Necrosis/Non-enhancing tumor (NET)</p></list-item>
<list-item>
<p>Edema (ED)</p></list-item>
<list-item>
<p>Enhancing tumor (ET)</p></list-item>
</list></p>
<p>This hybrid approach leverages residual connections, SE attention, and skip connections for improved brain tumor segmentation.</p>
<sec id="s3_4_1">
<label>3.4.1</label>
<title>3D ResNet Encoder and Feature Learning</title>
<p>In the 3D ResNet encoder, we begin with a 3D convolutional layer to process the input data, utilizing 16 filters and a kernel size of (3, 3, 3). This initial layer extracts basic features from the input volume. The encoder consists of multiple residual stages, each containing residual blocks for feature extraction. In each stage, the number of filters is doubled to capture increasingly complex features. The residual blocks play a key role in learning hierarchical representations of the input data. Each block is composed of two 3D convolutional layers, each followed by a ReLU activation function and then batch normalization. To improve computational efficiency, 3D max pooling with a kernel size of (2, 2, 2) is applied after each residual stage, reducing the spatial dimensions of the feature maps.</p>
<p>The feature learning part reduces the size of the input data using residual (ResNet) blocks. These blocks use a shortcut connection, which adds the original input to the output after it has passed through some layers. This shortcut helps improve training speed and accuracy without adding more parameters. As mentioned above, each ResNet block in the encoder has two convolution layers, each with an activation function and batch normalization. Our design is based on four feature learning modules before feeding the features to the bottleneck module, as shown in <xref ref-type="fig" rid="fig-6">Fig. 6</xref>. The operation of the residual block [<xref ref-type="bibr" rid="ref-28">28</xref>,<xref ref-type="bibr" rid="ref-44">44</xref>] can be expressed by <xref ref-type="disp-formula" rid="eqn-2">Eq. (2)</xref>.
<disp-formula id="eqn-2"><label>(2)</label><mml:math id="mml-eqn-2" display="block"><mml:mi>y</mml:mi><mml:mo>=</mml:mo><mml:mi>F</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mo fence="false" stretchy="false">{</mml:mo><mml:msub><mml:mi>W</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo fence="false" stretchy="false">}</mml:mo><mml:mo>)</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:mi>x</mml:mi></mml:math></disp-formula>where <inline-formula id="ieqn-5"><mml:math id="mml-ieqn-5"><mml:mi>x</mml:mi></mml:math></inline-formula> and <inline-formula id="ieqn-6"><mml:math id="mml-ieqn-6"><mml:mi>y</mml:mi></mml:math></inline-formula> are input and output vector, and function <inline-formula id="ieqn-7"><mml:math id="mml-ieqn-7"><mml:mi>F</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mo fence="false" stretchy="false">{</mml:mo><mml:msub><mml:mi>W</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo fence="false" stretchy="false">}</mml:mo><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is the mapping function for the residual path. The resultant dimensions of both input <inline-formula id="ieqn-8"><mml:math id="mml-ieqn-8"><mml:mi>x</mml:mi></mml:math></inline-formula> and function <inline-formula id="ieqn-9"><mml:math id="mml-ieqn-9"><mml:mi>F</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mo fence="false" stretchy="false">{</mml:mo><mml:msub><mml:mi>W</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo fence="false" stretchy="false">}</mml:mo><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> should be the same. The shortcut method in the residual block helps avoid the problem of gradients vanishing and speeds up the network&#x2019;s learning. It efficiently combines detailed local features with broader global features. The structure of the residual block is shown in <xref ref-type="fig" rid="fig-7">Fig. 7</xref>. 
This block includes convolution operations (3 &#x00D7; 3 &#x00D7; 3) followed by batch normalization and the ReLU activation function. Finally, the shortcut connection is merged with the residual path by element-wise addition to produce the output of the residual block.</p>
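<p>The relation y = F(x, {W<sub>i</sub>}) + x in Eq. (2) can be illustrated with a small NumPy sketch. It is a simplification under stated assumptions: batch normalization is omitted, the input and output channel counts are equal so that no projection is needed on the shortcut, and the helper names <monospace>conv3d_same</monospace> and <monospace>residual_block</monospace> are hypothetical. It is not the Keras implementation used in this study.</p>

```python
import numpy as np

def conv3d_same(x, w):
    """3x3x3 convolution with zero padding; x is (H, W, D, C_in), w is (3, 3, 3, C_in, C_out)."""
    h, wd, d, _ = x.shape
    padded = np.pad(x, ((1, 1), (1, 1), (1, 1), (0, 0)))
    out = np.zeros((h, wd, d, w.shape[-1]))
    for i in range(3):
        for j in range(3):
            for k in range(3):
                out += padded[i:i + h, j:j + wd, k:k + d, :] @ w[i, j, k]
    return out

def residual_block(x, w1, w2):
    """Compute y = F(x, {W_i}) + x with F = conv, ReLU, conv (batch norm omitted)."""
    f = np.maximum(conv3d_same(x, w1), 0.0)
    f = conv3d_same(f, w2)
    return f + x  # shapes must match for the element-wise addition

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 8, 4, 16))
w1 = rng.normal(scale=0.05, size=(3, 3, 3, 16, 16))
w2 = rng.normal(scale=0.05, size=(3, 3, 3, 16, 16))
y = residual_block(x, w1, w2)
print(y.shape)  # (8, 8, 4, 16)
```

<p>With all weights set to zero the block returns its input unchanged, which is the identity path that keeps gradients flowing through deep stacks.</p>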
<fig id="fig-7">
<label>Figure 7</label>
<caption>
<title>Residual block architecture: demonstrating the modules processing the input to generate the output</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_66580-fig-7.tif"/>
</fig>
</sec>
<sec id="s3_4_2">
<label>3.4.2</label>
<title>3D U-Net Decoder</title>
<p>The 3D U-Net decoder utilizes skip connections to combine low-level features from the encoder with high-level features from the decoder, facilitating precise localization and segmentation. We start the decoder with a contracting path comprising two 3D convolutional layers with ReLU activation and max pooling, followed by dropout regularization to prevent overfitting. The expansive path consists of transposed convolutional layers (Conv3DTranspose) that up-sample the feature maps and recover the spatial resolution lost during encoding. Skip connections are established between the corresponding encoder and decoder layers, allowing the model to access local and global features. Finally, the output layer is a 3D convolutional layer with softmax activation that generates probability maps for each of the four classes, i.e., Background, NET, ED, and ET.</p>
</sec>
<sec id="s3_4_3">
<label>3.4.3</label>
<title>Attention Mechanism</title>
<p>We integrated the SE attention mechanism into our model. This mechanism is known for its compact, lightweight design and high efficiency. The SE block consists of two primary operations, i.e., squeeze and excitation.</p>
<p>As shown in <xref ref-type="fig" rid="fig-8">Fig. 8</xref>, in the initial phase, feature maps <inline-formula id="ieqn-10"><mml:math id="mml-ieqn-10"><mml:mi>F</mml:mi><mml:mo>&#x2208;</mml:mo><mml:msup><mml:mi>R</mml:mi><mml:mrow><mml:mi>H</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>W</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>D</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>C</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> are provided as input, where H, W, D, and C represent height, width, depth, and number of channels, respectively. The global average pooling operation squeezes the global spatial information of these features into a 1 &#x00D7; 1 &#x00D7; 1 &#x00D7; C tensor, generating channel-wise statistics. The mathematical formulation of the squeeze operation is presented in Appendix B. This operation reduces spatial dimensions while retaining essential channel-specific information.</p>
<fig id="fig-8">
<label>Figure 8</label>
<caption>
<title>Architecture of the squeeze-and-excitation mechanism. Global average pooling performs the squeeze operation, aggregating global information for each channel over the entire image. The squeeze operation is followed by the excitation phase, which consists of two fully connected layers with a ReLU activation between them, followed by a sigmoid activation before scaling</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_66580-fig-8.tif"/>
</fig>
<p>In the excitation phase, the process begins with a fully connected layer that applies a reduction factor <italic>r</italic>, followed by a ReLU activation. This is then followed by an additional fully connected layer, with a sigmoid activation to generate the excitation output. Finally, a scaling operation integrates this refined channel information to enhance feature selectivity.</p>
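<p>The squeeze, excitation, and scaling steps described above can be sketched in NumPy as follows. Biases are omitted, the reduction factor <italic>r</italic> = 4 and the random weights are illustrative assumptions, and the function name <monospace>se_block</monospace> is hypothetical.</p>

```python
import numpy as np

def se_block(features, w1, w2):
    """Squeeze-and-excitation on a (H, W, D, C) feature map.

    w1 has shape (C // r, C) and w2 has shape (C, C // r); biases are omitted.
    """
    squeezed = features.mean(axis=(0, 1, 2))           # squeeze: (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)            # FC + ReLU: (C // r,)
    weights = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))     # FC + sigmoid: (C,)
    return features * weights                          # channel-wise scaling

rng = np.random.default_rng(0)
channels, r = 16, 4  # r = 4 is an illustrative reduction factor
features = rng.normal(size=(8, 8, 4, channels))
w1 = rng.normal(size=(channels // r, channels))
w2 = rng.normal(size=(channels, channels // r))
out = se_block(features, w1, w2)
print(out.shape)  # (8, 8, 4, 16)
```

<p>Because the sigmoid output lies between 0 and 1, each channel is attenuated in proportion to its estimated importance rather than being switched on or off.</p>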
</sec>
</sec>
<sec id="s3_5">
<label>3.5</label>
<title>Model Integration</title>
<p>The hybrid model architecture integrates the strengths of both ResNet and U-Net architectures, leveraging residual connections for feature extraction and skip connections for accurate segmentation. By combining these components, our model achieves promising 3D brain tumor segmentation results. Additionally, we incorporated the SE mechanism, which contributes significantly to the performance of the model. The SE mechanism dynamically recalibrates the feature maps, allowing the model to focus on the most informative features while suppressing less useful ones. This results in improved representation learning and better segmentation accuracy. Overall, enhanced with the SE mechanism, this hybrid architecture enables accurate and efficient brain tumor segmentation in 3D MRI images.</p>
<p>The dropout rate is specified as a parameter when a dropout layer is added to a neural network model. In the designed model, the dropout rate is set to 0.1 for the contracting path (encoder) and 0.2 for the expansive path (decoder). This means that 10% of the input units are randomly set to 0 during training in the contracting path and 20% in the expansive path. These dropout rates were chosen through experimentation to prevent overfitting and improve the generalization ability of the model.</p>
</sec>
</sec>
<sec id="s4">
<label>4</label>
<title>Implementation</title>
<p>This study implements the model in Python using the Keras library with TensorFlow 2.8.0, and the computations were executed on an NVIDIA RTX 3080 GPU with 12 GB of memory. For model training, the Adam optimizer is used with a learning rate of 0.0001. ReLU is used as the activation function and batch normalization as the normalization technique. The model is trained for 150 epochs. The image dimension is 160 &#x00D7; 160 &#x00D7; 128 to locate the tumor area properly. Initially, we tried to train the model at the original dimension of 240 &#x00D7; 240 &#x00D7; 155, but this placed an intensive load on the GPU and halted execution with GPU out-of-memory errors. A batch size of 1 is selected so that the 3D images fit within the memory constraints of the GPU. In the encoder, a dropout of 0.1 is applied after the convolution operation, while in the decoder, a dropout of 0.2 is applied after each convolution operation. This configuration helps regularize the model and prevent overfitting during training. As the decoder is responsible for generating the final segmentation output, it typically has more parameters and may be prone to overfitting. Therefore, a higher dropout rate can help regularize the decoder&#x2019;s parameters and prevent it from fitting noise in the training data too closely. Hyperparameter values are listed in <xref ref-type="table" rid="table-2">Table 2</xref>.</p>
<table-wrap id="table-2">
<label>Table 2</label>
<caption>
<title>Hyperparameter values for the proposed model</title>
</caption>
<table>
<colgroup>
<col/>
<col/>
</colgroup>
<thead>
<tr>
<th>Hyperparameter</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<td>Image input size</td>
<td>160 &#x00D7; 160 &#x00D7; 128 &#x00D7; 4</td>
</tr>
<tr>
<td>Batch size</td>
<td>1</td>
</tr>
<tr>
<td>Activation function (Hidden layer)</td>
<td>ReLU</td>
</tr>
<tr>
<td>Learning rate</td>
<td>1 &#x00D7; 10<sup>&#x2212;4</sup></td>
</tr>
<tr>
<td>Optimizer</td>
<td>Adam</td>
</tr>
<tr>
<td>Epochs</td>
<td>150</td>
</tr>
<tr>
<td>Loss</td>
<td>Dice Loss &#x002B; Focal Loss</td>
</tr>
<tr>
<td>Dropout rate (Enc.)</td>
<td>0.1</td>
</tr>
<tr>
<td>Dropout rate (Dec.)</td>
<td>0.2</td>
</tr>
<tr>
<td>Activation function (Output)</td>
<td>Softmax</td>
</tr>
<tr>
<td>Segmented output size</td>
<td>160 &#x00D7; 160 &#x00D7; 128 &#x00D7; 4</td>
</tr>
</tbody>
</table>
</table-wrap>
<sec id="s4_1">
<label>4.1</label>
<title>Combined Loss Function</title>
<p>Choosing a suitable loss function is highly important for deep learning models, especially for brain tumor segmentation. Recent research suggests that no single loss function works well for all segmentation tasks, and a deep learning model&#x2019;s performance depends on selecting a suitable one [<xref ref-type="bibr" rid="ref-45">45</xref>]. Combined loss functions, which integrate two or more types of losses, have proven robust and effective in various situations [<xref ref-type="bibr" rid="ref-46">46</xref>]. We aim to mitigate class imbalance by combining two loss functions, i.e., dice loss and focal loss, so that the model benefits from their complementary effects. Together, they provide a robust training objective that enables the model to learn effectively from both majority and minority classes, leading to better segmentation results on imbalanced datasets like BraTS [<xref ref-type="bibr" rid="ref-47">47</xref>].</p>
<p>Using <xref ref-type="disp-formula" rid="eqn-3">Eq. (3)</xref>, we can calculate the dice loss over all classes: Background, NET, ED, and ET.
<disp-formula id="eqn-3"><label>(3)</label><mml:math id="mml-eqn-3" display="block"><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>d</mml:mi><mml:mi>i</mml:mi><mml:mi>c</mml:mi><mml:mi>e</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>a</mml:mi><mml:mo>,</mml:mo><mml:mi>b</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x2212;</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mi>N</mml:mi></mml:mfrac><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>c</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:munderover><mml:mfrac><mml:mrow><mml:mn>2</mml:mn><mml:mo>&#x00D7;</mml:mo><mml:munder><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>m</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:munder><mml:msub><mml:mi>a</mml:mi><mml:mrow><mml:mi>c</mml:mi><mml:mi>m</mml:mi><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mi>b</mml:mi><mml:mrow><mml:mi>c</mml:mi><mml:mi>m</mml:mi><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mspace width="thinmathspace" /><mml:mo>+</mml:mo><mml:mspace width="thinmathspace" /><mml:mi>&#x03B5;</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:munder><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>m</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:munder><mml:msub><mml:mrow><mml:msup><mml:mi>a</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:mrow><mml:mrow><mml:mi>c</mml:mi><mml:mi>m</mml:mi><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:munder><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>m</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:munder><mml:msub><mml:mrow><mml:msup><mml:mi>b</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:mrow><mml:mrow><mml:mi>c</mml:mi><mml:mi>m</mml:mi><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:mi>&#x03B5;</mml:mi></mml:mrow></mml:mfrac></mml:math></disp-formula>where <italic>a</italic> and <italic>b</italic> represent the predicted output and its ground-truth mask, respectively.
<italic>m</italic> denotes the voxel, <italic>c</italic> is the class, <italic>n</italic> denotes the total number of tumor classes, and &#x03B5; is a small constant used to avoid division by zero.
<disp-formula id="eqn-4"><label>(4)</label><mml:math id="mml-eqn-4" display="block"><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>f</mml:mi><mml:mi>o</mml:mi><mml:mi>c</mml:mi><mml:mi>a</mml:mi><mml:mi>l</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mo>&#x2212;</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mi>N</mml:mi></mml:mfrac><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:munderover><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>c</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:munderover><mml:msup><mml:mrow><mml:mo>(</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x2212;</mml:mo><mml:msub><mml:mi>a</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:mi>&#x03B3;</mml:mi></mml:mrow></mml:msup><mml:msub><mml:mi>b</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub><mml:mi>log</mml:mi><mml:mo>&#x2061;</mml:mo><mml:msub><mml:mi>a</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub></mml:math></disp-formula>where <italic>a</italic> represents predicted output, <italic>b</italic> is the ground truth, <italic>c</italic> represents the class, and <italic>n</italic> is the total number of classes. <xref ref-type="disp-formula" rid="eqn-5">Eq. (5)</xref> represents the combined loss used in this study.
<disp-formula id="eqn-5"><label>(5)</label><mml:math id="mml-eqn-5" display="block"><mml:mi>L</mml:mi><mml:mi>o</mml:mi><mml:mi>s</mml:mi><mml:mi>s</mml:mi><mml:mo>=</mml:mo><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>d</mml:mi><mml:mi>i</mml:mi><mml:mi>c</mml:mi><mml:mi>e</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>f</mml:mi><mml:mi>o</mml:mi><mml:mi>c</mml:mi><mml:mi>a</mml:mi><mml:mi>l</mml:mi></mml:mrow></mml:msub></mml:math></disp-formula>where <inline-formula id="ieqn-11"><mml:math id="mml-ieqn-11"><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>d</mml:mi><mml:mi>i</mml:mi><mml:mi>c</mml:mi><mml:mi>e</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula id="ieqn-12"><mml:math id="mml-ieqn-12"><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>f</mml:mi><mml:mi>o</mml:mi><mml:mi>c</mml:mi><mml:mi>a</mml:mi><mml:mi>l</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> are dice and focal losses.</p>
<p>In this study, the total loss is computed by adding the dice loss and the focal loss. The dice term rewards overlap for every class regardless of its size, while the focal term emphasizes hard-to-classify voxels, so the model is encouraged to attend to small but clinically significant tumor regions, which is vital for accurate segmentation.</p>
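<p>A NumPy sketch of Eqs. (3) to (5) on flattened, one-hot encoded data is given below. Several choices are assumptions rather than details from the text: the focusing parameter is set to &#x03B3; = 2 because its value is not stated, the normalizing factor in Eq. (3) is read as a mean over classes, and predictions are clipped before the logarithm for numerical safety.</p>

```python
import numpy as np

def dice_loss(pred, target, eps=1e-6):
    """Eq. (3): one minus the mean per-class Dice; inputs have shape (voxels, classes)."""
    intersection = (pred * target).sum(axis=0)
    denominator = (pred**2).sum(axis=0) + (target**2).sum(axis=0)
    dice_per_class = (2.0 * intersection + eps) / (denominator + eps)
    return 1.0 - dice_per_class.mean()

def focal_loss(pred, target, gamma=2.0, eps=1e-7):
    """Eq. (4): cross-entropy weighted by (1 - p)^gamma, averaged over voxels."""
    p = np.clip(pred, eps, 1.0 - eps)
    per_voxel = -((1.0 - p) ** gamma * target * np.log(p)).sum(axis=1)
    return per_voxel.mean()

def combined_loss(pred, target):
    """Eq. (5): unweighted sum of the Dice and focal losses."""
    return dice_loss(pred, target) + focal_loss(pred, target)

labels = np.array([0, 0, 0, 1, 2, 3])
target = np.eye(4)[labels]             # one-hot ground truth
uniform = np.full_like(target, 0.25)   # uninformative prediction
print(combined_loss(target, target), combined_loss(uniform, target))
```

<p>A perfect prediction drives both terms to approximately zero, while the uninformative prediction is penalized by both, which illustrates how the two losses act together.</p>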
</sec>
<sec id="s4_2">
<label>4.2</label>
<title>Performance Evaluation</title>
<p>Our study used the Dice Similarity Coefficient (DSC), sensitivity, and specificity to measure the model&#x2019;s effectiveness. DSC is the most widely used evaluation metric in brain tumor segmentation studies. The DSC measures the overlap between the predicted segmentation and the actual tumor area and ranges between 0 and 1, where 0 represents no overlap and 1 represents complete overlap between the actual and predicted tumor regions. The DSC is calculated using <xref ref-type="disp-formula" rid="eqn-6">Eq. (6)</xref>.
<disp-formula id="eqn-6"><label>(6)</label><mml:math id="mml-eqn-6" display="block"><mml:mi>D</mml:mi><mml:mi>S</mml:mi><mml:mi>C</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>A</mml:mi><mml:mo>,</mml:mo><mml:mi>B</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mn>2</mml:mn><mml:mspace width="thinmathspace" /><mml:mrow><mml:mo>|</mml:mo><mml:mi>A</mml:mi><mml:mo>&#x2229;</mml:mo><mml:mi>B</mml:mi><mml:mo>|</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:mi>A</mml:mi><mml:mo>|</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:mrow><mml:mo>|</mml:mo><mml:mi>B</mml:mi><mml:mo>|</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:math></disp-formula>where <italic>A</italic> and <italic>B</italic> represent predicted and ground truth values.</p>
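<p>Eq. (6) can be evaluated on binary masks as in the following sketch. Returning 1.0 when both masks are empty is a convention chosen here for illustration and is not taken from the text.</p>

```python
import numpy as np

def dice_coefficient(pred, truth):
    """Eq. (6) on binary masks: 2 * |A intersect B| / (|A| + |B|)."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    total = pred.sum() + truth.sum()
    if total == 0:
        return 1.0  # both masks empty: treated as perfect agreement (a convention)
    return 2.0 * np.logical_and(pred, truth).sum() / total

pred = np.array([1, 1, 0, 0, 1])
truth = np.array([1, 0, 0, 1, 1])
print(dice_coefficient(pred, truth))  # 2 * 2 / (3 + 3) = 0.666...
```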
<p>Sensitivity measures the proportion of actual tumor area that is correctly predicted. Specificity measures the proportion of actual healthy area that is correctly predicted as non-tumor. Both sensitivity and specificity can be calculated using <xref ref-type="disp-formula" rid="eqn-7">Eqs. (7)</xref> and <xref ref-type="disp-formula" rid="eqn-8">(8)</xref>.
<disp-formula id="eqn-7"><label>(7)</label><mml:math id="mml-eqn-7" display="block"><mml:mtable columnalign="right left right left right left right left right left right left" rowspacing="3pt" columnspacing="0em 2em 0em 2em 0em 2em 0em 2em 0em 2em 0em" displaystyle="true"><mml:mtr><mml:mtd /><mml:mtd><mml:mi>S</mml:mi><mml:mi>e</mml:mi><mml:mi>n</mml:mi><mml:mi>s</mml:mi><mml:mi>i</mml:mi><mml:mi>t</mml:mi><mml:mi>i</mml:mi><mml:mi>v</mml:mi><mml:mi>i</mml:mi><mml:mi>t</mml:mi><mml:mi>y</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi><mml:mo>+</mml:mo><mml:mi>F</mml:mi><mml:mi>N</mml:mi></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="eqn-8"><label>(8)</label><mml:math id="mml-eqn-8" display="block"><mml:mtable columnalign="right left right left right left right left right left right left" rowspacing="3pt" columnspacing="0em 2em 0em 2em 0em 2em 0em 2em 0em 2em 0em" displaystyle="true"><mml:mtr><mml:mtd /><mml:mtd><mml:mi>S</mml:mi><mml:mi>p</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:mi>i</mml:mi><mml:mi>f</mml:mi><mml:mi>i</mml:mi><mml:mi>c</mml:mi><mml:mi>i</mml:mi><mml:mi>t</mml:mi><mml:mi>y</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>T</mml:mi><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi><mml:mi>N</mml:mi><mml:mo>+</mml:mo><mml:mi>F</mml:mi><mml:mi>P</mml:mi></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>where the terms <italic>TP</italic>, <italic>TN</italic>, <italic>FP</italic>, and <italic>FN</italic> represent true positive, true negative, false positive, and false negative, respectively. <xref ref-type="fig" rid="fig-9">Fig. 9</xref> shows separate tumor classes and areas for better understanding when analyzing segmentation results.</p>
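<p>Eqs. (7) and (8) can be computed from voxel-wise counts on binary masks, as sketched below. The function name <monospace>sensitivity_specificity</monospace> is hypothetical, and the sketch assumes that each mask contains at least one positive and one negative voxel so that neither denominator is zero.</p>

```python
import numpy as np

def sensitivity_specificity(pred, truth):
    """Eqs. (7) and (8) on binary masks, counted voxel by voxel."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    tp = np.logical_and(pred, truth).sum()
    tn = np.logical_and(~pred, ~truth).sum()
    fp = np.logical_and(pred, ~truth).sum()
    fn = np.logical_and(~pred, truth).sum()
    return tp / (tp + fn), tn / (tn + fp)

pred = np.array([1, 1, 0, 0, 1, 0, 0, 0])
truth = np.array([1, 0, 0, 1, 1, 0, 0, 0])
sens, spec = sensitivity_specificity(pred, truth)
print(sens, spec)  # 2/3 and 4/5
```

<p>For the WT, TC, and ET regions, the binary masks are first derived from the label map; in the standard BraTS convention, WT is the union of all tumor labels, TC combines NET and ET, and ET is the enhancing label alone. Because background voxels dominate a brain volume, specificity is typically close to 1 and is less discriminative than sensitivity or DSC.</p>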
<fig id="fig-9">
<label>Figure 9</label>
<caption>
<title>Brain tumor segmentation method depicting the tumor classes (NCR/NET, ET, ED) and tumor regions (WT, TC, ET). For better analysis and understanding of the segmentation results, the tumor classes are displayed separately</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_66580-fig-9.tif"/>
</fig>
</sec>
</sec>
<sec id="s5">
<label>5</label>
<title>Experiments and Results</title>
<p>dSEAT-UNet was trained, validated, and tested on the benchmark BraTS 2020 MRI dataset with a split of 75%, 15%, and 10% for training, validation, and testing, respectively. While testing the model on random samples from the test set, we observed that dSEAT-UNet accurately generated the 3D segmentation volumes with tumor classes. The prediction results demonstrate the model&#x2019;s generalization ability. Analysis of the results for the three tumor regions, i.e., WT, TC, and ET, shows high similarity between ground truth and predictions. As shown in <xref ref-type="fig" rid="fig-9">Fig. 9</xref>, tumor classes and regions are displayed separately for a better understanding of the segmentation results, highlighting the WT, TC, and ET tumor regions.</p>

<p>To optimize the performance of our proposed model, we conducted empirical hyperparameter tuning using a combination of grid search and validation-based selection. The learning rate was varied across a range from 1 &#x00D7; 10<sup>&#x2212;3</sup> to 1 &#x00D7; 10<sup>&#x2212;5</sup> and the optimal value of 1 &#x00D7; 10<sup>&#x2212;4</sup> was selected based on validation dice scores. Dropout rates were tested in the range of 0.0 to 0.3, with 0.1 and 0.2 providing the best balance between regularization and model capacity. The Adam optimizer was chosen due to its stable convergence behavior, and the ReLU activation function was selected after comparing it with GELU and LeakyReLU, as it provided slightly better performance and training stability. These hyperparameters were validated on a subset of the BraTS 2020 dataset to ensure generalization and stability before full-scale training. Training stability was monitored across epochs using validation loss and dice score to ensure smooth convergence and prevent overfitting.</p>
<p>The performance metrics, including dice score, specificity, and sensitivity, are computed and presented in <xref ref-type="table" rid="table-3">Table 3</xref>. <xref ref-type="fig" rid="fig-10">Fig. 10</xref> displays the training and validation losses for each of the 150 epochs, providing insight into the model&#x2019;s learning progress and performance. In the initial stages, both training and validation losses decrease sharply, indicating effective learning and the model&#x2019;s ability to identify data patterns. As training progresses, the training loss continues to decrease consistently, reflecting the model&#x2019;s ability to reduce error on the training data. Although the validation loss experiences some fluctuations, it generally trends downward, suggesting that the model is enhancing its ability to generalize to unseen data. The overall trend in validation loss highlights the model&#x2019;s ability to adapt and learn relevant data features, even though it faces typical challenges such as minor overfitting. The consistent decline in training loss and the ultimate stabilization of validation loss reflect the model&#x2019;s robustness and efficiency in handling the training process over a long period.</p>
<table-wrap id="table-3">
<label>Table 3</label>
<caption>
<title>dSEAT-UNet dice score, specificity, and sensitivity for WT, TC, and ET</title>
</caption>
<table>
<colgroup>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
</colgroup>
<thead>
<tr>
<th rowspan="2">Method</th>
<th colspan="3">DSC</th>
<th colspan="3">Specificity</th>
<th colspan="3">Sensitivity</th>
</tr>
<tr>
<th>WT</th>
<th>TC</th>
<th>ET</th>
<th>WT</th>
<th>TC</th>
<th>ET</th>
<th>WT</th>
<th>TC</th>
<th>ET</th>
</tr>
</thead>
<tbody>
<tr>
<td>dSEAT-UNet</td>
<td>0.881</td>
<td>0.846</td>
<td>0.819</td>
<td>0.996</td>
<td>0.991</td>
<td>0.987</td>
<td>0.985</td>
<td>0.978</td>
<td>0.967</td>
</tr>
</tbody>
</table>
</table-wrap><fig id="fig-10">
<label>Figure 10</label>
<caption>
<title>dSEAT-UNet training and validation loss</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_66580-fig-10.tif"/>
</fig>
<p><xref ref-type="fig" rid="fig-11">Fig. 11</xref> presents a box plot of the dice scores for the three tumor regions: WT in blue, TC in green, and ET in purple. Each box indicates the interquartile range, with the median marked by the red line inside the box. The whiskers extend to the smallest and largest values lying within 1.5 times the interquartile range of the quartiles, and outliers are shown as separate dots. The plot shows that WT has the highest median dice score, followed by TC and ET, indicating better segmentation performance for WT than for the other regions. The presence of outliers suggests variability in segmentation accuracy across different samples.</p>
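<p>The quantities drawn in such a box plot can be computed directly from the per-case dice scores. The following sketch assumes the common 1.5 &#x00D7; IQR whisker rule with linearly interpolated quartiles; the scores in the example are hypothetical and are not taken from our experiments.</p>

```python
import numpy as np

def box_plot_stats(values, whisker=1.5):
    """Median, quartiles, whisker ends, and outliers under the 1.5 x IQR rule."""
    values = np.asarray(values, dtype=float)
    q1, median, q3 = np.percentile(values, [25, 50, 75])
    iqr = q3 - q1
    low_fence, high_fence = q1 - whisker * iqr, q3 + whisker * iqr
    inside_mask = np.logical_and(values >= low_fence, high_fence >= values)
    inside = values[inside_mask]
    return {
        "median": float(median),
        "q1": float(q1),
        "q3": float(q3),
        "whisker_low": float(inside.min()),    # smallest value inside the fences
        "whisker_high": float(inside.max()),   # largest value inside the fences
        "outliers": values[np.logical_not(inside_mask)].tolist(),
    }

# Hypothetical per-case dice scores, for illustration only.
scores = [0.90, 0.88, 0.91, 0.87, 0.89, 0.92, 0.45]
stats = box_plot_stats(scores)
```

<p>In this toy example the single poorly segmented case (0.45) falls below the lower fence and would be drawn as a separate dot, while the whiskers span the remaining cases.</p>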
<fig id="fig-11">
<label>Figure 11</label>
<caption>
<title>Box plot for the dSEAT-UNet dice score corresponding to WT, TC, and ET</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_66580-fig-11.tif"/>
</fig>
<sec id="s5_1">
<label>5.1</label>
<title>Importance of Suitable Data Preprocessing Technique</title>
<p>Besides designing a reliable and robust deep learning architecture, choosing a suitable data preprocessing technique is a major factor in segmentation performance. With the Gabor filter optimization described in <xref ref-type="sec" rid="s3_3">Section 3.3</xref>, we observed a notable improvement in the segmentation performance of the model. We evaluated the proposed model on both the regular and the Gabor-filtered BraTS 2020 datasets. As shown in <xref ref-type="table" rid="table-4">Table 4</xref>, Gabor-filtered input improved the dice score for all three tumor regions: from 0.856 to 0.881 for WT, from 0.827 to 0.846 for TC, and from 0.796 to 0.819 for ET.</p>
<table-wrap id="table-4">
<label>Table 4</label>
<caption>
<title>Choosing a suitable data preprocessing technique</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
</colgroup>
<thead>
<tr>
<th align="center">Configuration</th>
<th align="center">WT</th>
<th align="center">TC</th>
<th align="center">ET</th>
<th align="center">Preprocessing techniques</th>
</tr>
</thead>
<tbody>
<tr>
<td>Proposed model without Gabor filtered data</td>
<td>0.856</td>
<td>0.827</td>
<td>0.796</td>
<td>Few preprocessing operations were used.</td>
</tr>
<tr>
<td>Proposed model with Gabor filtered data</td>
<td>0.881</td>
<td>0.846</td>
<td>0.819</td>
<td>Gabor filter with multiple techniques, as discussed in the preprocessing section.</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="s5_2">
<label>5.2</label>
<title>Comparison with State-of-the-Art Methods</title>
<p>We compared our dSEAT-UNet model with state-of-the-art methods on the BraTS 2020 dataset and present a quantitative analysis of the results. We also cross-verified the performance of our model on BraTS 2021 dataset samples. The following sections contain the proposed model&#x2019;s evaluation on both datasets, i.e., BraTS 2020 and BraTS 2021.</p>
<sec id="s5_2_1">
<label>5.2.1</label>
<title>Comparison with State-of-the-Art Methods on the BraTS 2020 Dataset</title>
<p>Segmenting brain tumors accurately is a complex task, particularly for smaller regions such as the tumor core and the enhancing tumor. To assess its performance, the proposed dSEAT-UNet was compared with benchmark methods on the BraTS 2020 benchmark dataset. The comparison is based on the published results of these methods. Our model outperformed most previously published studies in terms of TC and ET segmentation performance, as shown in <xref ref-type="table" rid="table-5">Table 5</xref>. The highest DSC values for each tumor class are highlighted in bold. <xref ref-type="fig" rid="fig-12">Fig. 12</xref> presents the segmentation results of the model on the BraTS 2020 dataset. The clinical value of a segmentation model depends on how precisely it delineates the tumor regions. The segmentation results for each tumor region show accurate boundary delineation, especially for the tumor core and enhancing tumor, and the model offers competitive performance compared to the state-of-the-art methods. <xref ref-type="table" rid="table-5">Table 5</xref> shows that the study by Wang et al. [<xref ref-type="bibr" rid="ref-26">26</xref>] achieves the highest dice score of 0.900 for the whole tumor, exceeding our score of 0.881. However, our model improves on Wang et al.&#x2019;s study [<xref ref-type="bibr" rid="ref-26">26</xref>] by 2.9 percentage points for TC (0.846 vs. 0.817) and 3.2 percentage points for ET (0.819 vs. 0.787). Without using any postprocessing technique, our model surpasses all other studies in the table regarding TC and ET, with values of 0.846 and 0.819, respectively. Our model has 30.63 M parameters, fewer than the model by Wang et al. [<xref ref-type="bibr" rid="ref-26">26</xref>], which has 32.99 M parameters. The baseline 3D U-Net [<xref ref-type="bibr" rid="ref-17">17</xref>] has 19.06 M parameters, and Raza et al.&#x2019;s model [<xref ref-type="bibr" rid="ref-24">24</xref>] has 30.47 M parameters. With this parameter budget, our model obtained a DSC of 0.881 for WT, 0.846 for TC, and 0.819 for ET.</p>
<table-wrap id="table-5">
<label>Table 5</label>
<caption>
<title>Comparison with state-of-the-art methods on the BraTS 2020 dataset</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
</colgroup>
<thead>
<tr>
<th align="center">Comparison study</th>
<th align="center">Model</th>
<th align="center">Image size</th>
<th align="center">WT</th>
<th align="center">TC</th>
<th align="center">ET</th>
<th align="center">Dataset</th>
</tr>
</thead>
<tbody>
<tr>
<td>&#x00C7;i&#x00E7;ek et al. [<xref ref-type="bibr" rid="ref-16">16</xref>]</td>
<td>3D UNet</td>
<td>128 &#x00D7; 128 &#x00D7; 128</td>
<td>0.841</td>
<td>0.790</td>
<td>0.687</td>
<td>BraTS 2020</td>
</tr>
<tr>
<td>Ballestar [<xref ref-type="bibr" rid="ref-48">48</xref>]</td>
<td>3D CNN</td>
<td>64 &#x00D7; 64 &#x00D7; 64</td>
<td>0.846</td>
<td>0.752</td>
<td>0.621</td>
<td>BraTS 2020</td>
</tr>
<tr>
<td>Wang et al. [<xref ref-type="bibr" rid="ref-26">26</xref>]</td>
<td>TransBTS</td>
<td>128 &#x00D7; 128 &#x00D7; 128</td>
<td><bold>0.900</bold></td>
<td>0.817</td>
<td>0.787</td>
<td>BraTS 2020</td>
</tr>
<tr>
<td>Messaoudi et al. [<xref ref-type="bibr" rid="ref-49">49</xref>]</td>
<td>Eff.Net-3D UNet</td>
<td>192 &#x00D7; 160 &#x00D7; 108</td>
<td>0.806</td>
<td>0.752</td>
<td>0.695</td>
<td>BraTS 2020</td>
</tr>
<tr>
<td>Wang et al. [<xref ref-type="bibr" rid="ref-50">50</xref>]</td>
<td>3D UNet</td>
<td>128 &#x00D7; 128 &#x00D7; 128</td>
<td>0.852</td>
<td>0.798</td>
<td>0.778</td>
<td>BraTS 2019</td>
</tr>
<tr>
<td>Zhang et al. [<xref ref-type="bibr" rid="ref-24">24</xref>]</td>
<td>Att.gate-ResUnet</td>
<td>128 &#x00D7; 128 &#x00D7; 128</td>
<td>0.870</td>
<td>0.777</td>
<td>0.709</td>
<td>BraTS 2019</td>
</tr>
<tr>
<td>Raza et al. [<xref ref-type="bibr" rid="ref-25">25</xref>]</td>
<td>dResU-NET</td>
<td>128 &#x00D7; 128 &#x00D7; 128</td>
<td>0.866</td>
<td>0.835</td>
<td>0.800</td>
<td>BraTS 2020</td>
</tr>
<tr>
<td>Colman et al. [<xref ref-type="bibr" rid="ref-51">51</xref>]</td>
<td>DR-UNet</td>
<td>240 &#x00D7; 240</td>
<td>0.867</td>
<td>0.798</td>
<td>0.751</td>
<td>BraTS 2020</td>
</tr>
<tr>
<td>Tang et al. [<xref ref-type="bibr" rid="ref-52">52</xref>]</td>
<td>MultiResUNet</td>
<td>80 &#x00D7; 96 &#x00D7; 64</td>
<td>0.892</td>
<td>0.789</td>
<td>0.703</td>
<td>BraTS 2020</td>
</tr>
<tr>
<td>Jiang et al. [<xref ref-type="bibr" rid="ref-23">23</xref>]</td>
<td>SwinBTS</td>
<td>240 &#x00D7; 240 &#x00D7; 155</td>
<td>0.890</td>
<td>0.803</td>
<td>0.772</td>
<td>BraTS 2020</td>
</tr>
<tr>
<td>Abd-Ellah et al. [<xref ref-type="bibr" rid="ref-53">53</xref>]</td>
<td>TPCUAR-Net</td>
<td>128 &#x00D7; 128</td>
<td>0.870</td>
<td>0.830</td>
<td>0.760</td>
<td>BraTS 2017</td>
</tr>
<tr>
<td>Ren et al. [<xref ref-type="bibr" rid="ref-54">54</xref>]</td>
<td>Optimized 3D UNet</td>
<td>240 &#x00D7; 240 &#x00D7; 155</td>
<td>0.675</td>
<td>0.721</td>
<td>0.715</td>
<td>BraTS 2023</td>
</tr>
<tr>
<td>Proposed</td>
<td>dSEAT-UNet</td>
<td>160 &#x00D7; 160 &#x00D7; 128</td>
<td>0.881</td>
<td><bold>0.846</bold></td>
<td><bold>0.819</bold></td>
<td>BraTS 2020</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="table-5fn1" fn-type="other">
<p>Note: The highest DSC values for each tumor class are highlighted in bold.</p>
</fn>
</table-wrap-foot>
</table-wrap><fig id="fig-12">
<label>Figure 12</label>
<caption>
<title>Qualitative visualization of model segmentation results when randomly selecting four test sample images from the dataset. Prediction shows separate illustrations of segmentations for WT, TC, and ET</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_66580-fig-12.tif"/>
</fig>
</sec>
<sec id="s5_2_2">
<label>5.2.2</label>
<title>Model Cross-Validation on BraTS 2021 Dataset</title>
<p>We also evaluated our model on the BraTS 2021 dataset. The BraTS 2021 data were preprocessed in the same way as BraTS 2020 to make them suitable for testing. Testing a model on another dataset gives a less biased measure of its performance [<xref ref-type="bibr" rid="ref-55">55</xref>]. Sample preprocessed images of the BraTS 2021 dataset are shown in <xref ref-type="fig" rid="fig-13">Fig. 13</xref>. Our model showed competitive results on unseen test samples from BraTS 2021, with qualitative results shown in <xref ref-type="fig" rid="fig-14">Fig. 14</xref> and quantitative results in <xref ref-type="table" rid="table-6">Table 6</xref>. This highlights the effectiveness and robustness of our model, which outperformed other state-of-the-art studies with DSC values of 0.856 for TC and 0.824 for ET.</p>
<fig id="fig-13">
<label>Figure 13</label>
<caption>
<title>Resized BraTS 2021 sample MRI images with mask (ground truth)</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_66580-fig-13.tif"/>
</fig><fig id="fig-14">
<label>Figure 14</label>
<caption>
<title>Brain tumor segmentation results on the axial axis. Prediction results show separate tumor regions for WT, TC, and ET</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_66580-fig-14.tif"/>
</fig><table-wrap id="table-6">
<label>Table 6</label>
<caption>
<title>Cross-validation of the model on the BraTS 2021 dataset</title>
</caption>
<table>
<colgroup>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
</colgroup>
<thead>
<tr>
<th>Model</th>
<th colspan="3">Dice Score</th>
<th colspan="3">Specificity</th>
<th colspan="3">Sensitivity</th>
</tr>
<tr>
<th/>
<th>WT</th>
<th>TC</th>
<th>ET</th>
<th>WT</th>
<th>TC</th>
<th>ET</th>
<th>WT</th>
<th>TC</th>
<th>ET</th>
</tr>
</thead>
<tbody>
<tr>
<td>Model cross-validation (dSEAT-UNet)</td>
<td>0.887</td>
<td>0.856</td>
<td>0.824</td>
<td>0.991</td>
<td>0.988</td>
<td>0.990</td>
<td>0.991</td>
<td>0.984</td>
<td>0.972</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
</sec>
<sec id="s5_3">
<label>5.3</label>
<title>Ablation Study</title>
<p>The dSEAT-UNet offers competitive segmentation of the WT, TC, and ET brain tumor regions compared with state-of-the-art methods. An ablation study was also conducted to assess the contribution of combining residual blocks with SE mechanisms in the skip connections. The full network is compared with two networks that do not contain squeeze-and-excitation mechanisms: the proposed residual-encoder network without attention and the baseline 3D U-Net. The dSEAT-UNet achieves the highest performance of the three. The dice score, specificity, and sensitivity values for the three methods are shown in <xref ref-type="table" rid="table-7">Table 7</xref>. A comparison chart of the dice scores of the baseline 3D U-Net, the proposed model without attention mechanism, and dSEAT-UNet for the three tumor regions, i.e., WT, TC, and ET, is presented in <xref ref-type="fig" rid="fig-15">Fig. 15</xref>.</p>
<table-wrap id="table-7">
<label>Table 7</label>
<caption>
<title>Ablation study performed on BraTS 2020 dataset</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
</colgroup>
<thead>
<tr>
<th align="center">Method</th>
<th colspan="3">DSC</th>
<th colspan="3">Specificity</th>
<th colspan="3">Sensitivity</th>
</tr>
<tr>
<th/>
<th>WT</th>
<th>TC</th>
<th>ET</th>
<th>WT</th>
<th>TC</th>
<th>ET</th>
<th>WT</th>
<th>TC</th>
<th>ET</th>
</tr>
</thead>
<tbody>
<tr>
<td>Baseline 3D U-Net</td>
<td>0.834</td>
<td>0.796</td>
<td>0.752</td>
<td>0.971</td>
<td>0.973</td>
<td>0.894</td>
<td>0.942</td>
<td>0.937</td>
<td>0.842</td>
</tr>
<tr>
<td>Proposed model (without Attention Mechanism)</td>
<td>0.848</td>
<td>0.813</td>
<td>0.784</td>
<td>0.973</td>
<td>0.971</td>
<td>0.966</td>
<td>0.947</td>
<td>0.933</td>
<td>0.914</td>
</tr>
<tr>
<td>dSEAT-UNet</td>
<td>0.881</td>
<td>0.846</td>
<td>0.819</td>
<td>0.996</td>
<td>0.991</td>
<td>0.987</td>
<td>0.985</td>
<td>0.978</td>
<td>0.967</td>
</tr>
</tbody>
</table>
</table-wrap><fig id="fig-15">
<label>Figure 15</label>
<caption>
<title>Comparison of dice scores on baseline 3D U-Net, proposed model without attention, and dSEAT-UNet for WT, TC, and ET</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_66580-fig-15.tif"/>
</fig>
<p>The quantitative analysis shows that dSEAT-UNet, which incorporates the SE attention mechanism, consistently outperforms both the baseline 3D U-Net and the proposed model without the attention mechanism across all three tumor regions. The figures below are relative dice score improvements. For the WT region, dSEAT-UNet shows an improvement of 5.63% over the baseline and 3.89% over the proposed model without the attention mechanism, highlighting its superior ability to segment the entire tumor region accurately. For the TC region, the 6.28% improvement over the baseline and 4.06% over the proposed model without the attention mechanism underscores the attention mechanism&#x2019;s impact on accurately identifying the tumor core. The largest improvement is observed in the ET region, where dSEAT-UNet achieves an 8.91% improvement over the baseline and 4.46% over the proposed model without the attention mechanism (computed from the rounded values in <xref ref-type="table" rid="table-7">Table 7</xref>), demonstrating the attention mechanism&#x2019;s critical role in capturing the enhancing tumor region, which is often more challenging to segment due to its variability and complexity.</p>
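<p>The gains discussed in this subsection can be recomputed from the DSC values in <xref ref-type="table" rid="table-7">Table 7</xref>. The short sketch below distinguishes relative improvement (in percent of the reference score) from absolute improvement (in percentage points); the helper names are hypothetical, and because the table values are rounded to three decimals, the results may differ slightly from figures derived from unrounded scores.</p>

```python
# DSC values copied from Table 7 (rounded to three decimals).
DSC = {
    "baseline": {"WT": 0.834, "TC": 0.796, "ET": 0.752},
    "no_attention": {"WT": 0.848, "TC": 0.813, "ET": 0.784},
    "dSEAT-UNet": {"WT": 0.881, "TC": 0.846, "ET": 0.819},
}

def relative_gain(new, old):
    """Relative improvement of `new` over `old`, in percent."""
    return 100.0 * (new - old) / old

def absolute_gain(new, old):
    """Absolute improvement of `new` over `old`, in percentage points."""
    return 100.0 * (new - old)

for region in ("WT", "TC", "ET"):
    full = DSC["dSEAT-UNet"][region]
    for reference in ("baseline", "no_attention"):
        old = DSC[reference][region]
        print(f"{region} vs {reference}: "
              f"{relative_gain(full, old):.2f}% relative, "
              f"{absolute_gain(full, old):.1f} points absolute")
```

<p>Reporting both forms avoids ambiguity: a gain of about 3.3 percentage points in WT dice corresponds to a relative gain of about 3.9% over the model without attention.</p>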
<p>To further quantify the contribution of the SE attention mechanism in dSEAT-UNet, we compared the model without SE attention to the full dSEAT-UNet. This comparison isolates the effect of the attention mechanism on segmentation accuracy. In absolute terms, integrating SE attention increases the dice score by approximately 3.3 percentage points for the WT region, 3.3 percentage points for the TC region, and 3.5 percentage points for the ET region. These improvements underscore the effectiveness of the SE module in enhancing feature recalibration and model sensitivity, particularly for complex and challenging tumor subregions.</p>
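<p>To make the notion of feature recalibration concrete, the following NumPy sketch applies a generic squeeze-and-excitation step to a single 3D feature map. It is an illustration under stated assumptions and not the implementation used in dSEAT-UNet: biases are omitted, the weights are random rather than learned, and the channel count and reduction ratio are arbitrary.</p>

```python
import numpy as np

def se_recalibrate(features, w1, w2):
    """Apply squeeze-and-excitation to a (C, D, H, W) feature map.

    w1 has shape (C // r, C) and w2 has shape (C, C // r), where r is the
    reduction ratio; both are assumed to be learned elsewhere.
    """
    squeezed = features.mean(axis=(1, 2, 3))          # squeeze: global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)           # excitation: FC + ReLU
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))      # FC + sigmoid -> one gate per channel
    return features * gates[:, None, None, None], gates

rng = np.random.default_rng(0)
channels, reduction = 8, 4   # illustrative values, not taken from the paper
features = rng.standard_normal((channels, 4, 4, 4))
w1 = rng.standard_normal((channels // reduction, channels))
w2 = rng.standard_normal((channels, channels // reduction))
recalibrated, gates = se_recalibrate(features, w1, w2)
```

<p>Each channel is scaled by a gate between 0 and 1, so informative channels can be emphasized and less useful ones suppressed before the features pass through a skip connection.</p>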
<p>Overall, the gains across all tumor regions demonstrate the importance of the attention mechanism for segmentation accuracy and confirm dSEAT-UNet as the best-performing of the three models evaluated in the ablation study.</p>
</sec>
<sec id="s5_4">
<label>5.4</label>
<title>Limitations, Risks and Computational Trade-Offs</title>
<p>While the proposed dSEAT-UNet demonstrates strong segmentation accuracy on benchmark datasets, several challenges and deployment considerations must be addressed to ensure its real-world applicability in clinical settings:
<list list-type="bullet">
<list-item>
<p><bold>Computational Resource Constraints:</bold> The model architecture, integrating a deep ResNet encoder, SE attention mechanisms, and Gabor filters, improves segmentation accuracy but increases computational demands. Deploying such a model in clinical environments, especially those with limited GPU availability, can be challenging. Reducing the input volume size to 160 &#x00D7; 160 &#x00D7; 128 helped mitigate GPU memory overflow during training, but further optimizations such as model pruning, quantization, or knowledge distillation could enhance deployability without significant performance loss.</p></list-item>
<list-item>
<p><bold>Latency and Real-Time Processing:</bold> The depth and complexity of dSEAT-UNet, particularly in processing high-resolution 3D MRI scans, may introduce latency in inference. In time-sensitive clinical workflows, this could limit the model&#x2019;s usability. Efficient model variants or hybrid encoder-decoder designs with fewer parameters could reduce inference time while preserving segmentation accuracy.</p></list-item>
<list-item>
<p><bold>Sensitivity to Variability in MRI Scans:</bold> Real-world MRI data often contain artifacts, inter-slice inconsistencies, or intensity variations across devices and institutions. These inconsistencies can degrade model performance, especially when segmenting small or diffuse tumor regions. Enhanced preprocessing strategies such as bias field correction and deep-learning-based denoising are essential to mitigate these issues in deployment scenarios.</p></list-item>
<list-item>
<p><bold>Generalization across Institutions:</bold> While the model performed well on the BraTS 2020 and 2021 datasets, its robustness on unseen clinical data from diverse scanners and acquisition protocols remains to be validated. Incorporating more heterogeneous training data and domain adaptation strategies can help improve the model&#x2019;s generalization and reliability.</p></list-item>
</list></p>
<p>Addressing these challenges is critical for transitioning dSEAT-UNet from a research prototype to a practical tool for automated brain tumor segmentation in clinical environments.</p>
</sec>
<sec id="s5_5">
<label>5.5</label>
<title>Segmentation Challenges Analysis</title>
<p>While the proposed model demonstrates strong overall performance, several challenging cases highlight its limitations, particularly in segmenting ET regions and, more critically, NET regions, as illustrated in <xref ref-type="fig" rid="fig-16">Fig. 16</xref>.</p>
<fig id="fig-16">
<label>Figure 16</label>
<caption>
<title>Segmentation challenges faced by dSEAT-UNet when segmenting narrow boundaries and small tumor regions</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_66580-fig-16.tif"/>
</fig>
<p>NET regions are especially difficult to segment due to their small size, diffuse and infiltrative nature, and low contrast against surrounding tissues. These characteristics often result in misclassification or under-segmentation. Unlike ET regions, NETs lack distinct intensity information in conventional MRI modalities, making them difficult to distinguish even for human experts.</p>
<p>The model&#x2019;s underperformance in these regions may also come from its limited ability to capture fine intensity variations or its insufficient sensitivity to weak boundaries. In some cases, these regions were completely missed or partially segmented, which can critically impact the overall tumor characterization.</p>
<p>To address these challenges, future improvements could include:
<list list-type="bullet">
<list-item>
<p>Refining the attention mechanisms to enhance sensitivity to weak gradients and boundary regions.</p></list-item>
<list-item>
<p>Incorporating domain-specific priors, such as anatomical constraints or tumor growth patterns.</p></list-item>
<list-item>
<p>Applying post-processing techniques like Conditional Random Fields (CRFs) or morphological operations to sharpen segmentation outputs.</p></list-item>
<list-item>
<p>Training with dedicated loss functions, e.g., boundary loss or focal loss, suitable for underrepresented regions.</p></list-item>
</list></p>
<p>By including and analyzing these failure cases, we provide a more realistic picture of the model&#x2019;s behavior in clinical settings and identify areas where further optimization is needed for robust, tumor-subregion-level segmentation.</p>
</sec>
<sec id="s5_6">
<label>5.6</label>
<title>Clinical Applicability and Future Integration</title>
<p>While dSEAT-UNet demonstrates promising results on benchmark datasets, its deployment in clinical settings presents several challenges that require future exploration. Integration into radiology workflows would require compatibility with clinical imaging systems such as PACS, robust inference speed suitable for real-time use, and output formats interpretable by clinicians. Enhancing the model&#x2019;s explainability, through visual interpretability tools like attention heatmaps, could further facilitate clinician trust and adoption.</p>
<p>In addition, rigorous validation across diverse clinical cohorts and institutions is essential to ensure generalizability. Both retrospective analyses using real patient scans and prospective evaluation within clinical workflows will be needed to assess model robustness, safety, and usability. These steps, along with adherence to regulatory guidelines (e.g., FDA, CE), will be key to translating this research into a deployable clinical solution. Future work will focus on addressing these practical and regulatory challenges.</p>
</sec>
</sec>
<sec id="s6">
<label>6</label>
<title>Conclusion</title>
<p>This study proposed dSEAT-UNet, a novel 3D brain tumor segmentation model that enhances the conventional 3D U-Net by integrating a deep ResNet-based encoder and squeeze-and-excitation (SE) attention mechanisms within the skip connections. The ResNet encoder improves semantic representation while maintaining stable training, and the SE blocks adaptively recalibrate features, leading to better generalization. These design enhancements collectively contributed to significant performance gains, especially in segmenting complex tumor regions across benchmark datasets.</p>
<p>Additionally, the integration of Gabor filter banks into the encoder contributed to improved texture-aware feature extraction. This enhancement enabled the model to capture low-level, transformation-invariant features and mitigate texture interference, particularly in irregular tumor boundaries. These contributions facilitated faster convergence during early training and improved segmentation accuracy in small and complex tumor regions.</p>
<p>Our experiments on the BraTS 2020 dataset demonstrated that dSEAT-UNet obtained dice scores of 0.881 for Whole tumor (WT), 0.846 for Tumor core (TC), and 0.819 for Enhancing tumor (ET), outperforming several state-of-the-art models. The model&#x2019;s generalizability was further confirmed on the BraTS 2021 dataset, maintaining strong performance across test cases.</p>
<p>Despite these promising results, challenges remain in accurately segmenting small or diffuse subregions, particularly in the presence of real-world imaging artifacts such as noise, motion blur, and inter-slice inconsistencies. Moreover, the model&#x2019;s computational demand is considerable, given the complexity of its components, which limited training to a batch size of one and constrained the input volume size due to GPU memory limitations.</p>
<p>To improve practical applicability, future work will explore testing dSEAT-UNet on external clinical datasets from local hospitals to evaluate performance under varying MRI conditions, scanner types, and artifact scenarios. This validation step is critical for understanding the model&#x2019;s real-world robustness and clinical reliability. Furthermore, lightweight versions of the model will be investigated using techniques such as model pruning, quantization, and knowledge distillation to reduce computational overhead and make the model more suitable for deployment in resource-limited clinical environments.</p>
<p>In terms of architectural innovation, future extensions will explore deformable convolutions, boundary-aware refinement modules, and hybrid CNN-transformer designs. These additions aim to enhance global context modeling and boundary localization. Further improvements in preprocessing, including bias field correction, deep-learning-based denoising, and advanced data augmentation, will support the model&#x2019;s ability to handle diverse clinical data.</p>
<p>Overall, the proposed dSEAT-UNet demonstrates strong potential as a reliable and accurate solution for 3D brain tumor segmentation. Ongoing efforts to validate and optimize the model across diverse settings will be crucial to its successful integration into real-world clinical workflows.</p>
</sec>
</body>
<back>
<ack>
<p>The authors thank the National Science and Technology Council (NSTC) of the Republic of China, Taiwan, for financially supporting this research.</p>
</ack>
<sec>
<title>Funding Statement</title>
<p>The authors thank the National Science and Technology Council (NSTC) of the Republic of China, Taiwan, for financially supporting this research under Contract No. NSTC 112-2637-M-131-001.</p>
</sec>
<sec>
<title>Author Contributions</title>
<p>The authors confirm contribution to the paper as follows: Study Conceptualization: Nisar Ahmad, Yao-Tien Chen; Data Collection, Experiments: Nisar Ahmad; Results Analysis and Discussion: Nisar Ahmad, Yao-Tien Chen, Khursheed Aurangzeb; Supervision: Yao-Tien Chen; Writing, Editing: Nisar Ahmad; Draft Review: Yao-Tien Chen, Nisar Ahmad, Khursheed Aurangzeb. All authors reviewed the results and approved the final version of the manuscript.</p>
</sec>
<sec sec-type="data-availability">
<title>Availability of Data and Materials</title>
<p>The datasets used in this study are two publicly available brain tumor datasets, BraTS 2020 (<ext-link ext-link-type="uri" xlink:href="https://www.kaggle.com/datasets/awsaf49/brats20-dataset-training-validation">https://www.kaggle.com/datasets/awsaf49/brats20-dataset-training-validation</ext-link>) and BraTS 2021 (<ext-link ext-link-type="uri" xlink:href="https://www.kaggle.com/datasets/dschettler8845/brats-2021-task1">https://www.kaggle.com/datasets/dschettler8845/brats-2021-task1</ext-link>) (accessed on 20 June 2025), which were downloaded from Kaggle. The preprocessed data versions utilized in this study are not publicly available but are available from the corresponding author upon request.</p>
</sec>
<sec>
<title>Ethics Approval</title>
<p>Not applicable.</p>
</sec>
<sec sec-type="COI-statement">
<title>Conflicts of Interest</title>
<p>The authors declare no conflicts of interest to report regarding the present study.</p>
</sec>
<app-group id="appg-1">
<app id="app-1">
<title>Appendix A Gabor Filter Mathematical Formulation and Parameters</title>
<p>This appendix provides the mathematical background and configuration details of the Gabor filter used in this study. The filter&#x2019;s core formulas are shown in <xref ref-type="disp-formula" rid="eqn-A1">Eqs. (A1)</xref>&#x2013;<xref ref-type="disp-formula" rid="eqn-A3">(A3)</xref>. <xref ref-type="table" rid="table-8">Table A1</xref> defines the parameters and their roles, along with the optimized values selected during experimentation.</p>
<table-wrap id="table-8">
<label>Table A1</label>
<caption>
<title>Gabor wavelet parameter optimization. For each parameter, the optimized value(s) that gave the most suitable results are presented</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
</colgroup>
<thead>
<tr>
<th align="center">Parameter</th>
<th align="center">Description</th>
<th align="center">Optimized value range</th>
</tr>
</thead>
<tbody>
<tr>
<td><bold>Lambda</bold> (<bold>&#x03BB;</bold>)</td>
<td>Wavelength of the sinusoidal factor (spatial period of the cosine wave).</td>
<td>{1: 10}</td>
</tr>
<tr>
<td><bold>Theta</bold> (<bold>&#x03B8;</bold>)</td>
<td>Orientation of the normal to the parallel stripes of the Gabor function (typically in radians).</td>
<td>{0, &#x03C0;/4, &#x03C0;/2, 3&#x03C0;/4, &#x03C0;}</td>
</tr>
<tr>
<td><bold>Psi</bold> (<bold>&#x03C8;</bold>)</td>
<td>Phase offset of the sinusoidal function, controlling the symmetry of the filter.</td>
<td>{0}</td>
</tr>
<tr>
<td><bold>Sigma</bold> (<bold>&#x03C3;</bold>)</td>
<td>Standard deviation of the Gaussian envelope, determining the spatial extent of the filter.</td>
<td>{1, 3}</td>
</tr>
<tr>
<td><bold>Gamma</bold> (<bold>&#x03B3;</bold>)</td>
<td>Spatial aspect ratio, specifying the ellipticity of the Gabor function (i.e., the ratio between the x and y axes of the Gaussian).</td>
<td>{1}</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>These wavelets can be described mathematically in the following <xref ref-type="disp-formula" rid="eqn-A1">Eqs. (A1)</xref>&#x2013;<xref ref-type="disp-formula" rid="eqn-A3">(A3)</xref>.
<disp-formula id="eqn-A1"><label>(A1)</label><mml:math id="mml-eqn-A1" display="block"><mml:mtable columnalign="right left right left right left right left right left right left" rowspacing="3pt" columnspacing="0em 2em 0em 2em 0em 2em 0em 2em 0em 2em 0em" displaystyle="true"><mml:mtr><mml:mtd /><mml:mtd><mml:mi>g</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo>;</mml:mo><mml:mi>&#x03BB;</mml:mi><mml:mo>,</mml:mo><mml:mi>&#x03B8;</mml:mi><mml:mo>,</mml:mo><mml:mi>&#x03C8;</mml:mi><mml:mo>,</mml:mo><mml:mi>&#x03C3;</mml:mi><mml:mo>,</mml:mo><mml:mi>&#x03B3;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mi>exp</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mo>&#x2212;</mml:mo><mml:mfrac><mml:mrow><mml:msup><mml:mi>x</mml:mi><mml:mrow><mml:mi mathvariant="normal">&#x2032;</mml:mi><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mo>+</mml:mo><mml:msup><mml:mi>&#x03B3;</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:msup><mml:mi>y</mml:mi><mml:mrow><mml:mi mathvariant="normal">&#x2032;</mml:mi><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:mrow><mml:mrow><mml:mn>2</mml:mn><mml:msup><mml:mi>&#x03C3;</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:mfrac><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x00D7;</mml:mo><mml:mi>cos</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mn>2</mml:mn><mml:mi>&#x03C0;</mml:mi><mml:mfrac><mml:msup><mml:mi>x</mml:mi><mml:mrow><mml:mi mathvariant="normal">&#x2032;</mml:mi></mml:mrow></mml:msup><mml:mi>&#x03BB;</mml:mi></mml:mfrac><mml:mo>+</mml:mo><mml:mi>&#x03C8;</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="eqn-A2"><label>(A2)</label><mml:math id="mml-eqn-A2" display="block"><mml:mtable columnalign="right left right left right left right left right left right left" rowspacing="3pt" columnspacing="0em 2em 0em 2em 0em 2em 0em 2em 0em 2em 0em" displaystyle="true"><mml:mtr><mml:mtd /><mml:mtd><mml:msup><mml:mi>x</mml:mi><mml:mrow><mml:mi mathvariant="normal">&#x2032;</mml:mi></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:mi>x</mml:mi><mml:mi>cos</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03B8;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:mi>y</mml:mi><mml:mi>sin</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03B8;</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="eqn-A3"><label>(A3)</label><mml:math id="mml-eqn-A3" display="block"><mml:mtable columnalign="right left right left right left right left right left right left" rowspacing="3pt" columnspacing="0em 2em 0em 2em 0em 2em 0em 2em 0em 2em 0em" displaystyle="true"><mml:mtr><mml:mtd /><mml:mtd><mml:msup><mml:mi>y</mml:mi><mml:mrow><mml:mi mathvariant="normal">&#x2032;</mml:mi></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:mo>&#x2212;</mml:mo><mml:mi>x</mml:mi><mml:mi>sin</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03B8;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:mi>y</mml:mi><mml:mi>cos</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03B8;</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula></p>
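<p>As a minimal illustrative sketch (not the authors&#x2019; implementation), the Gabor function of <xref ref-type="disp-formula" rid="eqn-A1">Eq. (A1)</xref> can be sampled on a small grid as shown here. The sketch assumes the standard rotated coordinates <italic>x</italic>&#x2032; = <italic>x</italic> cos(&#x03B8;) + <italic>y</italic> sin(&#x03B8;) and <italic>y</italic>&#x2032; = &#x2212;<italic>x</italic> sin(&#x03B8;) + <italic>y</italic> cos(&#x03B8;). The function names gabor_value and gabor_kernel, the odd kernel size, and all parameter values except &#x03B3; = 1 are hypothetical choices made only for illustration.</p>
<code language="python">
```python
import math


def gabor_value(x, y, lam, theta, psi, sigma, gamma):
    """Evaluate the real-valued Gabor function of Eq. (A1) at one point (x, y)."""
    # Rotated coordinates (assumed standard rotation by the angle theta).
    x_rot = x * math.cos(theta) + y * math.sin(theta)
    y_rot = -x * math.sin(theta) + y * math.cos(theta)
    # Gaussian envelope with spatial aspect ratio gamma and scale sigma.
    envelope = math.exp(-(x_rot ** 2 + gamma ** 2 * y_rot ** 2) / (2.0 * sigma ** 2))
    # Cosine carrier with wavelength lam and phase offset psi.
    carrier = math.cos(2.0 * math.pi * x_rot / lam + psi)
    return envelope * carrier


def gabor_kernel(size, lam, theta, psi, sigma, gamma):
    """Sample the function on a size x size grid centred at the origin.

    The size is assumed to be odd so that the grid has a central sample.
    """
    half = size // 2
    return [
        [gabor_value(x, y, lam, theta, psi, sigma, gamma)
         for x in range(-half, half + 1)]
        for y in range(-half, half + 1)
    ]


# Illustrative parameter values (hypothetical, apart from gamma = 1).
kernel = gabor_kernel(size=5, lam=4.0, theta=0.0, psi=0.0, sigma=2.0, gamma=1.0)
```
</code>
<p>With &#x03B8; = 0 and &#x03C8; = 0, the sampled kernel equals 1 at its centre and is symmetric along the <italic>x</italic> axis, which gives a quick sanity check on the envelope and carrier terms.</p>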
</app>
<app id="app-2">
<title>Appendix B Squeeze-and-Excitation (SE) Attention Formulation</title>
<p>This appendix details the squeeze phase of the SE attention mechanism. <xref ref-type="table" rid="table-9">Table A2</xref> defines the symbols used, and the formula is given in <xref ref-type="disp-formula" rid="eqn-A4">Eq. (A4)</xref>.</p>
<table-wrap id="table-9">
<label>Table A2</label>
<caption>
<title>Variable definitions for the squeeze operation in the SE attention mechanism</title>
</caption>
<table>
<colgroup>
<col/>
<col/>
</colgroup>
<thead>
<tr>
<th>Symbol</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>s</td>
<td>Output of the squeeze operation for channel <italic>c</italic></td>
</tr>
<tr>
<td><inline-formula id="ieqn-13"><mml:math id="mml-ieqn-13"><mml:msub><mml:mrow><mml:mtext mathvariant="bold">F</mml:mtext></mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">sq</mml:mtext></mml:mrow></mml:mrow></mml:msub></mml:math></inline-formula></td>
<td>Squeeze function that aggregates spatial information</td>
</tr>
<tr>
<td><inline-formula id="ieqn-14"><mml:math id="mml-ieqn-14"><mml:msub><mml:mrow><mml:mtext mathvariant="bold">u</mml:mtext></mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">c</mml:mtext></mml:mrow></mml:mrow></mml:msub></mml:math></inline-formula></td>
<td>Feature map of channel <italic>c</italic>, evaluated at position (<italic>i</italic>, <italic>j</italic>, <italic>k</italic>)</td>
</tr>
<tr>
<td><inline-formula id="ieqn-15"><mml:math id="mml-ieqn-15"><mml:mrow><mml:mtext mathvariant="bold">i</mml:mtext></mml:mrow><mml:mo>,</mml:mo><mml:mrow><mml:mtext mathvariant="bold">j</mml:mtext></mml:mrow><mml:mo>,</mml:mo><mml:mrow><mml:mtext mathvariant="bold">k</mml:mtext></mml:mrow></mml:math></inline-formula></td>
<td>Indices over height (H), width (W), and depth (D) dimensions</td>
</tr>
<tr>
<td><inline-formula id="ieqn-16"><mml:math id="mml-ieqn-16"><mml:mrow><mml:mtext mathvariant="bold">H</mml:mtext></mml:mrow></mml:math></inline-formula></td>
<td>Height of the input feature map</td>
</tr>
<tr>
<td><inline-formula id="ieqn-17"><mml:math id="mml-ieqn-17"><mml:mrow><mml:mtext mathvariant="bold">W</mml:mtext></mml:mrow></mml:math></inline-formula></td>
<td>Width of the input feature map</td>
</tr>
<tr>
<td><inline-formula id="ieqn-18"><mml:math id="mml-ieqn-18"><mml:mrow><mml:mtext mathvariant="bold">D</mml:mtext></mml:mrow></mml:math></inline-formula></td>
<td>Depth of the input feature map (number of slices)</td>
</tr>
<tr>
<td><inline-formula id="ieqn-19"><mml:math id="mml-ieqn-19"><mml:mrow><mml:mtext mathvariant="bold">C</mml:mtext></mml:mrow></mml:math></inline-formula></td>
<td>Number of channels in the feature map</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>The squeeze operation that aggregates global spatial information is defined as,
<disp-formula id="eqn-A4"><label>(A4)</label><mml:math id="mml-eqn-A4" display="block"><mml:mi>s</mml:mi><mml:mo>=</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mi>s</mml:mi><mml:mi>q</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mrow><mml:mi>H</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>W</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>D</mml:mi></mml:mrow></mml:mfrac><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>H</mml:mi></mml:mrow></mml:munderover><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>W</mml:mi></mml:mrow></mml:munderover><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>D</mml:mi></mml:mrow></mml:munderover><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi><mml:mo>,</mml:mo><mml:mi>k</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula></p>
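<p>The squeeze operation of <xref ref-type="disp-formula" rid="eqn-A4">Eq. (A4)</xref> amounts to global average pooling over the three spatial dimensions of each channel. The following is a minimal sketch under stated assumptions rather than the implementation used in this work: the feature map is assumed to be a nested list indexed as [<italic>c</italic>][<italic>i</italic>][<italic>j</italic>][<italic>k</italic>], and the function name squeeze and the toy input are hypothetical.</p>
<code language="python">
```python
def squeeze(feature_map):
    """Global average pooling per channel, following Eq. (A4).

    feature_map is assumed to be a nested list of shape C x H x W x D,
    indexed as feature_map[c][i][j][k]. Returns one descriptor per channel.
    """
    descriptors = []
    for channel in feature_map:
        height = len(channel)
        width = len(channel[0])
        depth = len(channel[0][0])
        # Triple sum over i, j, k of u_c(i, j, k).
        total = sum(value for plane in channel for row in plane for value in row)
        # Normalise by H x W x D to obtain the channel mean.
        descriptors.append(total / (height * width * depth))
    return descriptors


# Toy input (hypothetical): C = 2 channels with H = W = D = 2.
u = [
    [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]],
    [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 4.0]]],
]
s = squeeze(u)  # s is [4.5, 0.5]
```
</code>
<p>Each entry of the result summarizes one channel by a single scalar, so a feature map with <italic>C</italic> channels is reduced to a vector of length <italic>C</italic>.</p>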

</app>
</app-group>
<ref-list content-type="authoryear">
<title>References</title>
<ref id="ref-1"><label>[1]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Liu</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Tong</surname> <given-names>L</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>L</given-names></string-name>, <string-name><surname>Jiang</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Zhou</surname> <given-names>F</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>Q</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Deep learning based brain tumor segmentation: a survey</article-title>. <source>Complex Intell Syst</source>. <year>2023</year>;<volume>9</volume>(<issue>1</issue>):<fpage>1001</fpage>&#x2013;<lpage>26</lpage>. doi:<pub-id pub-id-type="doi">10.1007/s40747-022-00815-5</pub-id>.</mixed-citation></ref>
<ref id="ref-2"><label>[2]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Feng</surname> <given-names>X</given-names></string-name>, <string-name><surname>Tustison</surname> <given-names>NJ</given-names></string-name>, <string-name><surname>Patel</surname> <given-names>SH</given-names></string-name>, <string-name><surname>Meyer</surname> <given-names>CH</given-names></string-name></person-group>. <article-title>Brain tumor segmentation using an ensemble of 3D U-Nets and overall survival prediction using radiomic features</article-title>. <source>Front Comput Neurosci</source>. <year>2020</year>;<volume>14</volume>:<fpage>25</fpage>. doi:<pub-id pub-id-type="doi">10.3389/fncom.2020.00025</pub-id>; <pub-id pub-id-type="pmid">32322196</pub-id></mixed-citation></ref>
<ref id="ref-3"><label>[3]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zhang</surname> <given-names>D</given-names></string-name>, <string-name><surname>Huang</surname> <given-names>G</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>Q</given-names></string-name>, <string-name><surname>Han</surname> <given-names>J</given-names></string-name>, <string-name><surname>Han</surname> <given-names>J</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>Y</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Exploring task structure for brain tumor segmentation from multi-modality MR images</article-title>. <source>IEEE Trans Image Process</source>. <year>2020</year>;<volume>29</volume>:<fpage>9032</fpage>&#x2013;<lpage>43</lpage>. doi:<pub-id pub-id-type="doi">10.1109/TIP.2020.3023609</pub-id>; <pub-id pub-id-type="pmid">32941137</pub-id></mixed-citation></ref>
<ref id="ref-4"><label>[4]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Imtiaz</surname> <given-names>T</given-names></string-name>, <string-name><surname>Rifat</surname> <given-names>S</given-names></string-name>, <string-name><surname>Fattah</surname> <given-names>SA</given-names></string-name>, <string-name><surname>Wahid</surname> <given-names>KA</given-names></string-name></person-group>. <article-title>Automated brain tumor segmentation based on multi-planar superpixel level features extracted from 3D MR images</article-title>. <source>IEEE Access</source>. <year>2020</year>;<volume>8</volume>:<fpage>25335</fpage>&#x2013;<lpage>49</lpage>. doi:<pub-id pub-id-type="doi">10.1109/access.2019.2961630</pub-id>.</mixed-citation></ref>
<ref id="ref-5"><label>[5]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Shen</surname> <given-names>S</given-names></string-name>, <string-name><surname>Sandham</surname> <given-names>W</given-names></string-name>, <string-name><surname>Granat</surname> <given-names>M</given-names></string-name>, <string-name><surname>Sterr</surname> <given-names>A</given-names></string-name></person-group>. <article-title>MRI fuzzy segmentation of brain tissue using neighborhood attraction with neural-network optimization</article-title>. <source>IEEE Trans Inf Technol Biomed</source>. <year>2005</year>;<volume>9</volume>(<issue>3</issue>):<fpage>459</fpage>&#x2013;<lpage>67</lpage>. doi:<pub-id pub-id-type="doi">10.1109/titb.2005.847500</pub-id>; <pub-id pub-id-type="pmid">16167700</pub-id></mixed-citation></ref>
<ref id="ref-6"><label>[6]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Wang</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>J</given-names></string-name>, <string-name><surname>Bai</surname> <given-names>X</given-names></string-name></person-group>. <article-title>Gradient-assisted deep model for brain tumor segmentation by multi-modality MRI volumes</article-title>. <source>Biomed Signal Process Control</source>. <year>2023</year>;<volume>85</volume>:<fpage>105066</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.bspc.2023.105066</pub-id>.</mixed-citation></ref>
<ref id="ref-7"><label>[7]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Gooya</surname> <given-names>A</given-names></string-name>, <string-name><surname>Pohl</surname> <given-names>KM</given-names></string-name>, <string-name><surname>Bilello</surname> <given-names>M</given-names></string-name>, <string-name><surname>Cirillo</surname> <given-names>L</given-names></string-name>, <string-name><surname>Biros</surname> <given-names>G</given-names></string-name>, <string-name><surname>Melhem</surname> <given-names>ER</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>GLISTR: glioma image segmentation and registration</article-title>. <source>IEEE Trans Med Imaging</source>. <year>2012</year>;<volume>31</volume>(<issue>10</issue>):<fpage>1941</fpage>&#x2013;<lpage>54</lpage>. doi:<pub-id pub-id-type="doi">10.1109/tmi.2012.2210558</pub-id>; <pub-id pub-id-type="pmid">22907965</pub-id></mixed-citation></ref>
<ref id="ref-8"><label>[8]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zhou</surname> <given-names>S</given-names></string-name>, <string-name><surname>Nie</surname> <given-names>D</given-names></string-name>, <string-name><surname>Adeli</surname> <given-names>E</given-names></string-name>, <string-name><surname>Yin</surname> <given-names>J</given-names></string-name>, <string-name><surname>Lian</surname> <given-names>J</given-names></string-name>, <string-name><surname>Shen</surname> <given-names>D</given-names></string-name></person-group>. <article-title>High-resolution encoder-decoder networks for low-contrast medical image segmentation</article-title>. <source>IEEE Trans Image Process</source>. <year>2020</year>;<volume>29</volume>:<fpage>461</fpage>&#x2013;<lpage>75</lpage>. doi:<pub-id pub-id-type="doi">10.1109/TIP.2019.2919937</pub-id>; <pub-id pub-id-type="pmid">31226074</pub-id></mixed-citation></ref>
<ref id="ref-9"><label>[9]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zhou</surname> <given-names>T</given-names></string-name>, <string-name><surname>Canu</surname> <given-names>S</given-names></string-name>, <string-name><surname>Vera</surname> <given-names>P</given-names></string-name>, <string-name><surname>Ruan</surname> <given-names>S</given-names></string-name></person-group>. <article-title>Latent correlation representation learning for brain tumor segmentation with missing MRI modalities</article-title>. <source>IEEE Trans Image Process</source>. <year>2021</year>;<volume>30</volume>:<fpage>4263</fpage>&#x2013;<lpage>74</lpage>. doi:<pub-id pub-id-type="doi">10.1109/TIP.2021.3070752</pub-id>; <pub-id pub-id-type="pmid">33830924</pub-id></mixed-citation></ref>
<ref id="ref-10"><label>[10]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Kia</surname> <given-names>M</given-names></string-name>, <string-name><surname>Sadeghi</surname> <given-names>S</given-names></string-name>, <string-name><surname>Safarpour</surname> <given-names>H</given-names></string-name>, <string-name><surname>Sadeghi</surname> <given-names>E</given-names></string-name>, <string-name><surname>Shamsi</surname> <given-names>M</given-names></string-name>, <string-name><surname>Moghaddam</surname> <given-names>HS</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Innovative fusion of VGG16, MobileNet, EfficientNet, AlexNet, and ResNet50 for MRI-based brain tumor identification</article-title>. <source>Iran J Comput Sci</source>. <year>2025</year>;<volume>8</volume>:<fpage>185</fpage>&#x2013;<lpage>215</lpage>. doi:<pub-id pub-id-type="doi">10.1007/s42044-024-00216-6</pub-id>.</mixed-citation></ref>
<ref id="ref-11"><label>[11]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Wu</surname> <given-names>B</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>F</given-names></string-name>, <string-name><surname>Xu</surname> <given-names>L</given-names></string-name>, <string-name><surname>Shen</surname> <given-names>S</given-names></string-name>, <string-name><surname>Shao</surname> <given-names>P</given-names></string-name>, <string-name><surname>Sun</surname> <given-names>M</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Modality preserving U-Net for segmentation of multimodal medical images</article-title>. <source>Quant Imaging Med Surg</source>. <year>2023</year>;<volume>13</volume>(<issue>8</issue>):<fpage>5242</fpage>&#x2013;<lpage>57</lpage>. doi:<pub-id pub-id-type="doi">10.21037/qims-22-1367</pub-id>; <pub-id pub-id-type="pmid">37581055</pub-id></mixed-citation></ref>
<ref id="ref-12"><label>[12]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>He</surname> <given-names>K</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>X</given-names></string-name>, <string-name><surname>Ren</surname> <given-names>S</given-names></string-name>, <string-name><surname>Sun</surname> <given-names>J</given-names></string-name></person-group>. <article-title>Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification</article-title>. In: <conf-name>Proceedings of the IEEE International Conference on Computer Vision</conf-name>; <year>2015 Dec 7&#x2013;13</year>; <publisher-loc>Santiago, Chile</publisher-loc>.</mixed-citation></ref>
<ref id="ref-13"><label>[13]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Mzoughi</surname> <given-names>H</given-names></string-name>, <string-name><surname>Njeh</surname> <given-names>I</given-names></string-name>, <string-name><surname>Slima</surname> <given-names>MB</given-names></string-name>, <string-name><surname>Ben Hamida</surname> <given-names>A</given-names></string-name>, <string-name><surname>Mhiri</surname> <given-names>C</given-names></string-name>, <string-name><surname>Mahfoudh</surname> <given-names>KB</given-names></string-name></person-group>. <article-title>Towards a computer aided diagnosis (CAD) for brain MRI glioblastomas tumor exploration based on a deep convolutional neuronal networks (D-CNN) architectures</article-title>. <source>Multimed Tools Appl</source>. <year>2021</year>;<volume>80</volume>(<issue>1</issue>):<fpage>899</fpage>&#x2013;<lpage>919</lpage>. doi:<pub-id pub-id-type="doi">10.1007/s11042-020-09786-6</pub-id>.</mixed-citation></ref>
<ref id="ref-14"><label>[14]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><surname>Ronneberger</surname> <given-names>O</given-names></string-name>, <string-name><surname>Fischer</surname> <given-names>P</given-names></string-name>, <string-name><surname>Brox</surname> <given-names>T</given-names></string-name></person-group>. <chapter-title>U-Net: convolutional networks for biomedical image segmentation</chapter-title>. In: <person-group person-group-type="editor"><string-name><surname>Navab</surname> <given-names>N</given-names></string-name>, <string-name><surname>Hornegger</surname> <given-names>J</given-names></string-name>, <string-name><surname>Wells</surname> <given-names>W</given-names></string-name>, <string-name><surname>Frangi</surname> <given-names>A</given-names></string-name></person-group>, editors. <source>Medical image computing and computer-assisted intervention-MICCAI 2015 (lecture notes in computer science)</source>. Vol. <volume>9351</volume>. <publisher-loc>Cham, Switzerland</publisher-loc>: <publisher-name>Springer</publisher-name>; <year>2015</year>. p. <fpage>234</fpage>&#x2013;<lpage>41</lpage>. doi:<pub-id pub-id-type="doi">10.1007/978-3-319-24574-4_28</pub-id>.</mixed-citation></ref>
<ref id="ref-15"><label>[15]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Long</surname> <given-names>J</given-names></string-name>, <string-name><surname>Shelhamer</surname> <given-names>E</given-names></string-name>, <string-name><surname>Darrell</surname> <given-names>T</given-names></string-name></person-group>. <article-title>Fully convolutional networks for semantic segmentation</article-title>. In: <conf-name>Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)</conf-name>; <year>2015 Jun 7&#x2013;12</year>; <publisher-loc>Boston, MA, USA</publisher-loc>. doi:<pub-id pub-id-type="doi">10.1109/CVPR.2015.7298965</pub-id>.</mixed-citation></ref>
<ref id="ref-16"><label>[16]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><surname>&#x00C7;i&#x00E7;ek</surname> <given-names>&#x00D6;</given-names></string-name>, <string-name><surname>Abdulkadir</surname> <given-names>A</given-names></string-name>, <string-name><surname>Lienkamp</surname> <given-names>SS</given-names></string-name>, <string-name><surname>Brox</surname> <given-names>T</given-names></string-name>, <string-name><surname>Ronneberger</surname> <given-names>O</given-names></string-name></person-group>. <chapter-title>3D U-Net: learning dense volumetric segmentation from sparse annotation</chapter-title>. In: <person-group person-group-type="editor"><string-name><surname>Ourselin</surname> <given-names>S</given-names></string-name>, <string-name><surname>Joskowicz</surname> <given-names>L</given-names></string-name>, <string-name><surname>Sabuncu</surname> <given-names>M</given-names></string-name>, <string-name><surname>Unal</surname> <given-names>G</given-names></string-name>, <string-name><surname>Wells</surname> <given-names>W</given-names></string-name></person-group>, editors. <source>Medical image computing and computer-assisted intervention-MICCAI 2016 (lecture notes in computer science)</source>. Vol. <volume>9901</volume>. <publisher-loc>Cham, Switzerland</publisher-loc>: <publisher-name>Springer</publisher-name>; <year>2016</year>. p. <fpage>424</fpage>&#x2013;<lpage>32</lpage>. doi:<pub-id pub-id-type="doi">10.1007/978-3-319-46723-8_49</pub-id>.</mixed-citation></ref>
<ref id="ref-17"><label>[17]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Dai</surname> <given-names>J</given-names></string-name>, <string-name><surname>Qi</surname> <given-names>H</given-names></string-name>, <string-name><surname>Xiong</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Li</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>G</given-names></string-name>, <string-name><surname>Hu</surname> <given-names>H</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Deformable convolutional networks</article-title>. In: <conf-name>Proceedings of the IEEE International Conference on Computer Vision</conf-name>; <year>2017 Oct 22&#x2013;29</year>; <publisher-loc>Venice, Italy</publisher-loc>. doi:<pub-id pub-id-type="doi">10.1109/ICCV.2017.89</pub-id>.</mixed-citation></ref>
<ref id="ref-18"><label>[18]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Zhou</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Ye</surname> <given-names>Q</given-names></string-name>, <string-name><surname>Qiu</surname> <given-names>Q</given-names></string-name>, <string-name><surname>Jiao</surname> <given-names>J</given-names></string-name></person-group>. <article-title>Oriented response networks</article-title>. In: <conf-name>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</conf-name>; <year>2017 Jul 21&#x2013;26</year>; <publisher-loc>Honolulu, HI, USA</publisher-loc>.</mixed-citation></ref>
<ref id="ref-19"><label>[19]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Gabor</surname> <given-names>D</given-names></string-name></person-group>. <article-title>Theory of communication. Part 1: the analysis of information</article-title>. <source>J Inst Electr Eng</source>. <year>1946</year>;<volume>93</volume>(<issue>26</issue>):<fpage>429</fpage>&#x2013;<lpage>41</lpage>. doi:<pub-id pub-id-type="doi">10.1049/ji-3-2.1946.0074</pub-id>.</mixed-citation></ref>
<ref id="ref-20"><label>[20]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Gong</surname> <given-names>X</given-names></string-name>, <string-name><surname>Xia</surname> <given-names>X</given-names></string-name>, <string-name><surname>Zhu</surname> <given-names>W</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>B</given-names></string-name>, <string-name><surname>Doermann</surname> <given-names>D</given-names></string-name>, <string-name><surname>Zhuo</surname> <given-names>L</given-names></string-name></person-group>. <article-title>Deformable Gabor feature networks for biomedical image classification</article-title>. In: <conf-name>Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision</conf-name>; <year>2021 Jan 3&#x2013;8</year>; <publisher-loc>Waikoloa, HI, USA</publisher-loc>.</mixed-citation></ref>
<ref id="ref-21"><label>[21]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><surname>Vaswani</surname> <given-names>A</given-names></string-name>, <string-name><surname>Shazeer</surname> <given-names>N</given-names></string-name>, <string-name><surname>Parmar</surname> <given-names>N</given-names></string-name>, <string-name><surname>Uszkoreit</surname> <given-names>J</given-names></string-name>, <string-name><surname>Jones</surname> <given-names>L</given-names></string-name>, <string-name><surname>Gomez</surname> <given-names>AN</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Attention is all you need</article-title>. <comment>arXiv:1706.03762. 2017</comment>.</mixed-citation></ref>
<ref id="ref-22"><label>[22]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Hatamizadeh</surname> <given-names>A</given-names></string-name>, <string-name><surname>Tang</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Nath</surname> <given-names>V</given-names></string-name>, <string-name><surname>Yang</surname> <given-names>D</given-names></string-name>, <string-name><surname>Myronenko</surname> <given-names>A</given-names></string-name>, <string-name><surname>Landman</surname> <given-names>B</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>UNETR: transformers for 3D medical image segmentation</article-title>. In: <conf-name>Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision</conf-name>; <year>2022 Jan 3&#x2013;8</year>; <publisher-loc>Waikoloa, HI, USA</publisher-loc>.</mixed-citation></ref>
<ref id="ref-23"><label>[23]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Jiang</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Lin</surname> <given-names>X</given-names></string-name>, <string-name><surname>Dong</surname> <given-names>J</given-names></string-name>, <string-name><surname>Cheng</surname> <given-names>T</given-names></string-name>, <string-name><surname>Liang</surname> <given-names>J</given-names></string-name></person-group>. <article-title>SwinBTS: a method for 3D multimodal brain tumor segmentation using Swin transformer</article-title>. <source>Brain Sci</source>. <year>2022</year>;<volume>12</volume>(<issue>6</issue>):<fpage>797</fpage>. doi:<pub-id pub-id-type="doi">10.3390/brainsci12060797</pub-id>; <pub-id pub-id-type="pmid">35741682</pub-id></mixed-citation></ref>
<ref id="ref-24"><label>[24]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zhang</surname> <given-names>J</given-names></string-name>, <string-name><surname>Jiang</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Dong</surname> <given-names>J</given-names></string-name>, <string-name><surname>Hou</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Liu</surname> <given-names>B</given-names></string-name></person-group>. <article-title>Attention gate ResU-Net for automatic MRI brain tumor segmentation</article-title>. <source>IEEE Access</source>. <year>2020</year>;<volume>8</volume>:<fpage>58533</fpage>&#x2013;<lpage>45</lpage>. doi:<pub-id pub-id-type="doi">10.1109/ACCESS.2020.2983075</pub-id>.</mixed-citation></ref>
<ref id="ref-25"><label>[25]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Raza</surname> <given-names>R</given-names></string-name>, <string-name><surname>Ijaz Bajwa</surname> <given-names>U</given-names></string-name>, <string-name><surname>Mehmood</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Waqas Anwar</surname> <given-names>M</given-names></string-name>, <string-name><surname>Hassan Jamal</surname> <given-names>M</given-names></string-name></person-group>. <article-title>dResU-Net: 3D deep residual U-Net based brain tumor segmentation from multimodal MRI</article-title>. <source>Biomed Signal Process Control</source>. <year>2022</year>;<volume>79</volume>(<issue>4</issue>):<fpage>103861</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.bspc.2022.103861</pub-id>.</mixed-citation></ref>
<ref id="ref-26"><label>[26]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Wang</surname> <given-names>W</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>C</given-names></string-name>, <string-name><surname>Ding</surname> <given-names>M</given-names></string-name>, <string-name><surname>Yu</surname> <given-names>H</given-names></string-name>, <string-name><surname>Zha</surname> <given-names>S</given-names></string-name>, <string-name><surname>Li</surname> <given-names>J</given-names></string-name></person-group>. <article-title>TransBTS: multimodal brain tumor segmentation using transformer</article-title>. In: <conf-name>Proceedings of the 24th International Conference-Medical Image Computing and Computer Assisted Intervention&#x2014;MICCAI 2021</conf-name>; <year>2021 Sep 27&#x2013;Oct 1</year>; <publisher-loc>Strasbourg, France</publisher-loc>. doi:<pub-id pub-id-type="doi">10.1007/978-3-030-87193-2_11</pub-id>.</mixed-citation></ref>
<ref id="ref-27"><label>[27]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><surname>Chen</surname> <given-names>W</given-names></string-name>, <string-name><surname>Liu</surname> <given-names>B</given-names></string-name>, <string-name><surname>Peng</surname> <given-names>S</given-names></string-name>, <string-name><surname>Sun</surname> <given-names>J</given-names></string-name>, <string-name><surname>Qiao</surname> <given-names>X</given-names></string-name></person-group>. <chapter-title>S3D-UNet: separable 3D U-Net for brain tumor segmentation</chapter-title>. In: <person-group person-group-type="editor"><string-name><surname>Crimi</surname> <given-names>A</given-names></string-name>, <string-name><surname>Bakas</surname> <given-names>S</given-names></string-name>, <string-name><surname>Kuijf</surname> <given-names>H</given-names></string-name>, <string-name><surname>Keyvan</surname> <given-names>F</given-names></string-name>, <string-name><surname>Reyes</surname> <given-names>M</given-names></string-name>, <string-name><surname>van Walsum</surname> <given-names>T</given-names></string-name></person-group>, editors. <source>Brainlesion: glioma, multiple sclerosis, stroke and traumatic brain injuries (lecture notes in computer science)</source>. Vol. <volume>11384</volume>. <publisher-loc>Cham, Switzerland</publisher-loc>: <publisher-name>Springer</publisher-name>; <year>2019</year>. doi:<pub-id pub-id-type="doi">10.1007/978-3-030-11726-9_32</pub-id>.</mixed-citation></ref>
<ref id="ref-28"><label>[28]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Maji</surname> <given-names>D</given-names></string-name>, <string-name><surname>Sigedar</surname> <given-names>P</given-names></string-name>, <string-name><surname>Singh</surname> <given-names>M</given-names></string-name></person-group>. <article-title>Attention Res-UNet with guided decoder for semantic segmentation of brain tumors</article-title>. <source>Biomed Signal Process Control</source>. <year>2022</year>;<volume>71</volume>:<fpage>103077</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.bspc.2021.103077</pub-id>.</mixed-citation></ref>
<ref id="ref-29"><label>[29]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Kia</surname> <given-names>M</given-names></string-name></person-group>. <article-title>Attention-guided deep learning for effective customer loyalty management and multi-criteria decision analysis</article-title>. <source>Iran J Comput Sci</source>. <year>2024</year>;<volume>8</volume>(<issue>1</issue>):<fpage>163</fpage>&#x2013;<lpage>84</lpage>. doi:<pub-id pub-id-type="doi">10.1007/s42044-024-00215-7</pub-id>.</mixed-citation></ref>
<ref id="ref-30"><label>[30]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zhang</surname> <given-names>J</given-names></string-name>, <string-name><surname>Lv</surname> <given-names>X</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>H</given-names></string-name>, <string-name><surname>Liu</surname> <given-names>B</given-names></string-name></person-group>. <article-title>AResU-Net: attention residual U-Net for brain tumor segmentation</article-title>. <source>Symmetry</source>. <year>2020</year>;<volume>12</volume>(<issue>5</issue>):<fpage>1</fpage>&#x2013;<lpage>15</lpage>. doi:<pub-id pub-id-type="doi">10.3390/SYM12050721</pub-id>.</mixed-citation></ref>
<ref id="ref-31"><label>[31]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Cao</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Zhou</surname> <given-names>W</given-names></string-name>, <string-name><surname>Zang</surname> <given-names>M</given-names></string-name>, <string-name><surname>An</surname> <given-names>D</given-names></string-name>, <string-name><surname>Feng</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Yu</surname> <given-names>B</given-names></string-name></person-group>. <article-title>MBANet: a 3D convolutional neural network with multi-branch attention for brain tumor segmentation from MRI images</article-title>. <source>Biomed Signal Process Control</source>. <year>2023</year>;<volume>80</volume>(<issue>2</issue>):<fpage>104296</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.bspc.2022.104296</pub-id>.</mixed-citation></ref>
<ref id="ref-32"><label>[32]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Akbar</surname> <given-names>AS</given-names></string-name>, <string-name><surname>Fatichah</surname> <given-names>C</given-names></string-name>, <string-name><surname>Suciati</surname> <given-names>N</given-names></string-name></person-group>. <article-title>Single level UNet3D with multipath residual attention block for brain tumor segmentation</article-title>. <source>J King Saud Univ Comput Inf Sci</source>. <year>2022</year>;<volume>34</volume>(<issue>6</issue>):<fpage>3247</fpage>&#x2013;<lpage>58</lpage>. doi:<pub-id pub-id-type="doi">10.1016/j.jksuci.2022.03.022</pub-id>.</mixed-citation></ref>
<ref id="ref-33"><label>[33]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Liu</surname> <given-names>H</given-names></string-name>, <string-name><surname>Huo</surname> <given-names>G</given-names></string-name>, <string-name><surname>Li</surname> <given-names>Q</given-names></string-name>, <string-name><surname>Guan</surname> <given-names>X</given-names></string-name>, <string-name><surname>Tseng</surname> <given-names>M-L</given-names></string-name></person-group>. <article-title>Multiscale lightweight 3D segmentation algorithm with attention mechanism: brain tumor image segmentation</article-title>. <source>Expert Syst Appl</source>. <year>2023</year>;<volume>214</volume>(<issue>1</issue>):<fpage>119166</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.eswa.2022.119166</pub-id>.</mixed-citation></ref>
<ref id="ref-34"><label>[34]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Li</surname> <given-names>C</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>F</given-names></string-name>, <string-name><surname>Du</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Li</surname> <given-names>H</given-names></string-name></person-group>. <article-title>Classification of brain tumor types through MRIs using parallel CNNs and firefly optimization</article-title>. <source>Sci Rep</source>. <year>2024</year>;<volume>14</volume>(<issue>1</issue>):<fpage>12420</fpage>. doi:<pub-id pub-id-type="doi">10.1038/s41598-024-65714-w</pub-id>; <pub-id pub-id-type="pmid">38956224</pub-id></mixed-citation></ref>
<ref id="ref-35"><label>[35]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Yuan</surname> <given-names>L</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>T</given-names></string-name>, <string-name><surname>Yu</surname> <given-names>W</given-names></string-name>, <string-name><surname>Shi</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Jiang</surname> <given-names>Z</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Tokens-to-token ViT: training vision transformers from scratch on ImageNet</article-title>. In: <conf-name>Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV)</conf-name>; <year>2021 Oct 10&#x2013;17</year>; <publisher-loc>Montreal, QC, Canada</publisher-loc>. doi:<pub-id pub-id-type="doi">10.1109/iccv48922.2021.00060</pub-id>.</mixed-citation></ref>
<ref id="ref-36"><label>[36]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Menze</surname> <given-names>BH</given-names></string-name>, <string-name><surname>Jakab</surname> <given-names>A</given-names></string-name>, <string-name><surname>Bauer</surname> <given-names>S</given-names></string-name>, <string-name><surname>Kalpathy-Cramer</surname> <given-names>J</given-names></string-name>, <string-name><surname>Farahani</surname> <given-names>K</given-names></string-name>, <string-name><surname>Kirby</surname> <given-names>J</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>The multimodal brain tumor image segmentation benchmark (BRATS)</article-title>. <source>IEEE Trans Med Imaging</source>. <year>2015</year>;<volume>34</volume>(<issue>10</issue>):<fpage>1993</fpage>&#x2013;<lpage>2024</lpage>. doi:<pub-id pub-id-type="doi">10.1109/TMI.2014.2377694</pub-id>; <pub-id pub-id-type="pmid">25494501</pub-id></mixed-citation></ref>
<ref id="ref-37"><label>[37]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Bakas</surname> <given-names>S</given-names></string-name>, <string-name><surname>Akbari</surname> <given-names>H</given-names></string-name>, <string-name><surname>Sotiras</surname> <given-names>A</given-names></string-name>, <string-name><surname>Bilello</surname> <given-names>M</given-names></string-name>, <string-name><surname>Rozycki</surname> <given-names>M</given-names></string-name>, <string-name><surname>Kirby</surname> <given-names>JS</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Advancing the cancer genome atlas glioma MRI collections with expert segmentation labels and radiomic features</article-title>. <source>Sci Data</source>. <year>2017</year>;<volume>4</volume>(<issue>1</issue>):<fpage>170117</fpage>. doi:<pub-id pub-id-type="doi">10.1038/sdata.2017.117</pub-id>; <pub-id pub-id-type="pmid">28872634</pub-id></mixed-citation></ref>
<ref id="ref-38"><label>[38]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><surname>Bakas</surname> <given-names>S</given-names></string-name>, <string-name><surname>Reyes</surname> <given-names>M</given-names></string-name>, <string-name><surname>Jakab</surname> <given-names>A</given-names></string-name>, <string-name><surname>Bauer</surname> <given-names>S</given-names></string-name>, <string-name><surname>Rempfler</surname> <given-names>M</given-names></string-name>, <string-name><surname>Crimi</surname> <given-names>A</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Identifying the best machine learning algorithms for brain tumor segmentation, progression assessment, and overall survival prediction in the BRATS challenge</article-title>. <comment>arXiv:1811.02629. 2018</comment>.</mixed-citation></ref>
<ref id="ref-39"><label>[39]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Bauer</surname> <given-names>S</given-names></string-name>, <string-name><surname>Wiest</surname> <given-names>R</given-names></string-name>, <string-name><surname>Nolte</surname> <given-names>L-P</given-names></string-name>, <string-name><surname>Reyes</surname> <given-names>M</given-names></string-name></person-group>. <article-title>A survey of MRI-based medical image analysis for brain tumor studies</article-title>. <source>Phys Med Biol</source>. <year>2013</year>;<volume>58</volume>(<issue>13</issue>):<fpage>R97</fpage>&#x2013;<lpage>129</lpage>. doi:<pub-id pub-id-type="doi">10.1088/0031-9155/58/13/R97</pub-id>; <pub-id pub-id-type="pmid">23743802</pub-id></mixed-citation></ref>
<ref id="ref-40"><label>[40]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Hu</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Zhuang</surname> <given-names>Q</given-names></string-name>, <string-name><surname>Xiao</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Wu</surname> <given-names>G</given-names></string-name>, <string-name><surname>Shi</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>L</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>MIL normalization&#x2014;prerequisites for accurate MRI radiomics analysis</article-title>. <source>Comput Biol Med</source>. <year>2021</year>;<volume>133</volume>:<fpage>104403</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.compbiomed.2021.104403</pub-id>; <pub-id pub-id-type="pmid">33932645</pub-id></mixed-citation></ref>
<ref id="ref-41"><label>[41]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Patro</surname> <given-names>SGK</given-names></string-name>, <string-name><surname>Sahu</surname> <given-names>KK</given-names></string-name></person-group>. <article-title>Normalization: a preprocessing stage</article-title>. <source>Int Adv Res J Sci Eng Technol</source>. <year>2015</year>;<volume>2</volume>(<issue>3</issue>):<fpage>20</fpage>&#x2013;<lpage>2</lpage>. doi:<pub-id pub-id-type="doi">10.17148/iarjset.2015.2305</pub-id>.</mixed-citation></ref>
<ref id="ref-42"><label>[42]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Shao</surname> <given-names>J</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>S</given-names></string-name>, <string-name><surname>Zhou</surname> <given-names>J</given-names></string-name>, <string-name><surname>Zhu</surname> <given-names>H</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Brown</surname> <given-names>M</given-names></string-name></person-group>. <article-title>Application of U-Net and optimized clustering in medical image segmentation: a review</article-title>. <source>Comput Model Eng Sci</source>. <year>2023</year>;<volume>136</volume>(<issue>3</issue>):<fpage>2173</fpage>&#x2013;<lpage>219</lpage>. doi:<pub-id pub-id-type="doi">10.32604/cmes.2023.025499</pub-id>.</mixed-citation></ref>
<ref id="ref-43"><label>[43]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><surname>Reyes</surname> <given-names>AA</given-names></string-name>, <string-name><surname>Paheding</surname> <given-names>S</given-names></string-name>, <string-name><surname>Deo</surname> <given-names>M</given-names></string-name>, <string-name><surname>Audette</surname> <given-names>M</given-names></string-name></person-group>. <chapter-title>Gabor filter-embedded U-Net with transformer-based encoding for biomedical image segmentation</chapter-title>. In: <person-group person-group-type="editor"><string-name><surname>Li</surname> <given-names>X</given-names></string-name>, <string-name><surname>Lv</surname> <given-names>J</given-names></string-name>, <string-name><surname>Huo</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Dong</surname> <given-names>B</given-names></string-name>, <string-name><surname>Leahy</surname> <given-names>RM</given-names></string-name>, <string-name><surname>Li</surname> <given-names>Q</given-names></string-name></person-group>, editors. <source>Multiscale multimodal medical imaging. MMMI 2022 (lecture notes in computer science)</source>. Vol. <volume>13594</volume>. <publisher-loc>Cham, Switzerland</publisher-loc>: <publisher-name>Springer</publisher-name>; <year>2022</year>. doi:<pub-id pub-id-type="doi">10.1007/978-3-031-18814-5_8</pub-id>.</mixed-citation></ref>
<ref id="ref-44"><label>[44]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>He</surname> <given-names>K</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>X</given-names></string-name>, <string-name><surname>Ren</surname> <given-names>S</given-names></string-name>, <string-name><surname>Sun</surname> <given-names>J</given-names></string-name></person-group>. <article-title>Deep residual learning for image recognition</article-title>. In: <conf-name>Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)</conf-name>; <year>2016 Jun 27&#x2013;30</year>; <publisher-loc>Las Vegas, NV, USA</publisher-loc>. doi:<pub-id pub-id-type="doi">10.1109/cvpr.2016.90</pub-id>.</mixed-citation></ref>
<ref id="ref-45"><label>[45]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Sudre</surname> <given-names>CH</given-names></string-name>, <string-name><surname>Li</surname> <given-names>W</given-names></string-name>, <string-name><surname>Vercauteren</surname> <given-names>T</given-names></string-name>, <string-name><surname>Ourselin</surname> <given-names>S</given-names></string-name>, <string-name><surname>Cardoso</surname> <given-names>MJ</given-names></string-name></person-group>. <article-title>Generalised dice overlap as a deep learning loss function for highly unbalanced segmentations</article-title>. In: <conf-name>Proceedings of the Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support: Third International Workshop, DLMIA 2017, and 7th International Workshop, ML-CDS 2017</conf-name>; <year>2017 Sep 14</year>; <publisher-loc>Qu&#x00E9;bec City, QC, Canada</publisher-loc>. doi:<pub-id pub-id-type="doi">10.1007/978-3-319-67558-9_28</pub-id>; <pub-id pub-id-type="pmid">34104926</pub-id></mixed-citation></ref>
<ref id="ref-46"><label>[46]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Bakas</surname> <given-names>S</given-names></string-name>, <string-name><surname>Akbari</surname> <given-names>H</given-names></string-name>, <string-name><surname>Sotiras</surname> <given-names>A</given-names></string-name>, <string-name><surname>Bilello</surname> <given-names>M</given-names></string-name>, <string-name><surname>Rozycki</surname> <given-names>M</given-names></string-name>, <string-name><surname>Kirby</surname> <given-names>J</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Segmentation labels and radiomic features for the pre-operative scans of the TCGA-LGG collection</article-title>. <source>The Cancer Imaging Archive</source>. <year>2017</year>;<volume>286</volume>. doi:<pub-id pub-id-type="doi">10.7937/K9/TCIA.2017.GJQ7R0EF</pub-id>.</mixed-citation></ref>
<ref id="ref-47"><label>[47]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Lin</surname> <given-names>T-Y</given-names></string-name>, <string-name><surname>Goyal</surname> <given-names>P</given-names></string-name>, <string-name><surname>Girshick</surname> <given-names>R</given-names></string-name>, <string-name><surname>He</surname> <given-names>K</given-names></string-name>, <string-name><surname>Doll&#x00E1;r</surname> <given-names>P</given-names></string-name></person-group>. <article-title>Focal loss for dense object detection</article-title>. <source>IEEE Trans Pattern Anal Mach Intell</source>. <year>2020</year>;<volume>42</volume>(<issue>2</issue>):<fpage>318</fpage>&#x2013;<lpage>27</lpage>. doi:<pub-id pub-id-type="doi">10.1109/TPAMI.2018.2858826</pub-id>; <pub-id pub-id-type="pmid">30040631</pub-id></mixed-citation></ref>
<ref id="ref-48"><label>[48]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><surname>Ballestar</surname> <given-names>LM</given-names></string-name>, <string-name><surname>Vilaplana</surname> <given-names>V</given-names></string-name></person-group>. <article-title>Brain tumor segmentation using 3D-CNNs with uncertainty estimation</article-title>. <comment>arXiv:2009.12188. 2020</comment>.</mixed-citation></ref>
<ref id="ref-49"><label>[49]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><surname>Messaoudi</surname> <given-names>H</given-names></string-name>, <string-name><surname>Belaid</surname> <given-names>A</given-names></string-name>, <string-name><surname>Allaoui</surname> <given-names>ML</given-names></string-name>, <string-name><surname>Zetout</surname> <given-names>A</given-names></string-name>, <string-name><surname>Allili</surname> <given-names>MS</given-names></string-name>, <string-name><surname>Tliba</surname> <given-names>S</given-names></string-name>, <etal>et al</etal></person-group>. <chapter-title>Efficient embedding network for 3D brain tumor segmentation</chapter-title>. In: <person-group person-group-type="editor"><string-name><surname>Crimi</surname> <given-names>A</given-names></string-name>, <string-name><surname>Bakas</surname> <given-names>S</given-names></string-name></person-group>, editors. <source>Brainlesion: glioma, multiple sclerosis, stroke and traumatic brain injuries. BrainLes 2020. Lecture notes in computer science</source>. Vol. <volume>12658</volume>. <publisher-loc>Cham, Switzerland</publisher-loc>: <publisher-name>Springer</publisher-name>; <year>2021</year>. doi:<pub-id pub-id-type="doi">10.1007/978-3-030-72084-1_23</pub-id>.</mixed-citation></ref>
<ref id="ref-50"><label>[50]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Wang</surname> <given-names>F</given-names></string-name>, <string-name><surname>Jiang</surname> <given-names>R</given-names></string-name>, <string-name><surname>Zheng</surname> <given-names>L</given-names></string-name>, <string-name><surname>Meng</surname> <given-names>C</given-names></string-name>, <string-name><surname>Biswal</surname> <given-names>B</given-names></string-name></person-group>. <article-title>3D U-net based brain tumor segmentation and survival days prediction</article-title>. <source>Lect Notes Comput Sci</source>. <year>2020</year>;<volume>11992</volume>:<fpage>131</fpage>&#x2013;<lpage>41</lpage>. doi:<pub-id pub-id-type="doi">10.1007/978-3-030-46640-4_13</pub-id>.</mixed-citation></ref>
<ref id="ref-51"><label>[51]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Colman</surname> <given-names>J</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>L</given-names></string-name>, <string-name><surname>Duan</surname> <given-names>W</given-names></string-name>, <string-name><surname>Ye</surname> <given-names>X</given-names></string-name></person-group>. <article-title>DR-Unet104 for multimodal MRI brain tumor segmentation</article-title>. In: <conf-name>Proceedings of the Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries: 6th International Workshop, BrainLes 2020, Held in Conjunction with MICCAI 2020</conf-name>; <year>2020 Oct 4</year>; <publisher-loc>Lima, Peru</publisher-loc>. doi:<pub-id pub-id-type="doi">10.1007/978-3-030-72087-2_36</pub-id>.</mixed-citation></ref>
<ref id="ref-52"><label>[52]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><surname>Tang</surname> <given-names>J</given-names></string-name>, <string-name><surname>Li</surname> <given-names>T</given-names></string-name>, <string-name><surname>Shu</surname> <given-names>H</given-names></string-name>, <string-name><surname>Zhu</surname> <given-names>H</given-names></string-name></person-group>. <chapter-title>Variational-autoencoder regularized 3D MultiResUNet for the BraTS 2020 brain tumor segmentation</chapter-title>. In: <person-group person-group-type="editor"><string-name><surname>Crimi</surname> <given-names>A</given-names></string-name>, <string-name><surname>Bakas</surname> <given-names>S</given-names></string-name></person-group>, editors. <source>Brainlesion: glioma, multiple sclerosis, stroke and traumatic brain injuries. BrainLes 2020. Lecture notes in computer science</source>. Vol. <volume>12659</volume>. <publisher-loc>Cham, Switzerland</publisher-loc>: <publisher-name>Springer</publisher-name>; <year>2021</year>. doi:<pub-id pub-id-type="doi">10.1007/978-3-030-72087-2_38</pub-id>.</mixed-citation></ref>
<ref id="ref-53"><label>[53]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Abd-Ellah</surname> <given-names>MK</given-names></string-name>, <string-name><surname>Awad</surname> <given-names>AI</given-names></string-name>, <string-name><surname>Khalaf</surname> <given-names>AAM</given-names></string-name>, <string-name><surname>Ibraheem</surname> <given-names>AM</given-names></string-name></person-group>. <article-title>Automatic brain-tumor diagnosis using cascaded deep convolutional neural networks with symmetric U-Net and asymmetric residual-blocks</article-title>. <source>Sci Rep</source>. <year>2024</year>;<volume>14</volume>(<issue>1</issue>):<fpage>9501</fpage>. doi:<pub-id pub-id-type="doi">10.1038/s41598-024-59566-7</pub-id>; <pub-id pub-id-type="pmid">38664436</pub-id></mixed-citation></ref>
<ref id="ref-54"><label>[54]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><surname>Ren</surname> <given-names>T</given-names></string-name>, <string-name><surname>Honey</surname> <given-names>E</given-names></string-name>, <string-name><surname>Rebala</surname> <given-names>H</given-names></string-name>, <string-name><surname>Sharma</surname> <given-names>A</given-names></string-name>, <string-name><surname>Chopra</surname> <given-names>A</given-names></string-name>, <string-name><surname>Kurt</surname> <given-names>M</given-names></string-name></person-group>. <article-title>An optimization framework for processing and transfer learning for the brain tumor segmentation</article-title>. <comment>arXiv:2402.07008. 2024</comment>.</mixed-citation></ref>
<ref id="ref-55"><label>[55]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Kleppe</surname> <given-names>A</given-names></string-name>, <string-name><surname>Skrede</surname> <given-names>OJ</given-names></string-name>, <string-name><surname>De Raedt</surname> <given-names>S</given-names></string-name>, <string-name><surname>Liest&#x00F8;l</surname> <given-names>K</given-names></string-name>, <string-name><surname>Kerr</surname> <given-names>DJ</given-names></string-name>, <string-name><surname>Danielsen</surname> <given-names>HE</given-names></string-name></person-group>. <article-title>Designing deep learning studies in cancer diagnostics</article-title>. <source>Nat Rev Cancer</source>. <year>2021</year>;<volume>21</volume>(<issue>3</issue>):<fpage>199</fpage>&#x2013;<lpage>211</lpage>. doi:<pub-id pub-id-type="doi">10.1038/s41568-020-00327-9</pub-id>; <pub-id pub-id-type="pmid">33514930</pub-id></mixed-citation></ref>
</ref-list>
</back></article>