<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.1 20151215//EN" "http://jats.nlm.nih.gov/publishing/1.1/JATS-journalpublishing1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xml:lang="en" article-type="research-article" dtd-version="1.1">
<front>
<journal-meta>
<journal-id journal-id-type="pmc">CMC</journal-id>
<journal-id journal-id-type="nlm-ta">CMC</journal-id>
<journal-id journal-id-type="publisher-id">CMC</journal-id>
<journal-title-group>
<journal-title>Computers, Materials &#x0026; Continua</journal-title>
</journal-title-group>
<issn pub-type="epub">1546-2226</issn>
<issn pub-type="ppub">1546-2218</issn>
<publisher>
<publisher-name>Tech Science Press</publisher-name>
<publisher-loc>USA</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">62923</article-id>
<article-id pub-id-type="doi">10.32604/cmc.2025.062923</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Dynamic Spatial Focus in Alzheimer&#x2019;s Disease Diagnosis via Multiple CNN Architectures and Dynamic GradNet</article-title>
<alt-title alt-title-type="left-running-head">Dynamic Spatial Focus in Alzheimer&#x2019;s Disease Diagnosis via Multiple CNN Architectures and Dynamic GradNet</alt-title>
<alt-title alt-title-type="right-running-head">Dynamic Spatial Focus in Alzheimer&#x2019;s Disease Diagnosis via Multiple CNN Architectures and Dynamic GradNet</alt-title>
</title-group>
<contrib-group>
<contrib id="author-1" contrib-type="author" corresp="yes">
<name name-style="western"><surname>Almotiri</surname><given-names>Jasem</given-names></name><xref ref-type="aff" rid="aff-1"></xref><email>j.jasem@tu.edu.sa</email></contrib>
<aff id="aff-1"><institution>Department of Computer Science, College of Computers and Information Technology, Taif University</institution>, <addr-line>Taif, 21944</addr-line>, <country>Saudi Arabia</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>&#x002A;</label>Corresponding Author: Jasem Almotiri. Email: <email>j.jasem@tu.edu.sa</email></corresp>
</author-notes>
<pub-date date-type="collection" publication-format="electronic">
<year>2025</year>
</pub-date>
<pub-date date-type="pub" publication-format="electronic">
<day>16</day><month>04</month><year>2025</year>
</pub-date>
<volume>83</volume>
<issue>2</issue>
<fpage>2109</fpage>
<lpage>2142</lpage>
<history>
<date date-type="received">
<day>31</day>
<month>12</month>
<year>2024</year>
</date>
<date date-type="accepted">
<day>17</day>
<month>3</month>
<year>2025</year>
</date>
</history>
<permissions>
<copyright-statement>&#x00A9; 2025 The Author.</copyright-statement>
<copyright-year>2025</copyright-year>
<copyright-holder>Published by Tech Science Press.</copyright-holder>
<license xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>This work is licensed under a <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</ext-link>, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<self-uri content-type="pdf" xlink:href="TSP_CMC_62923.pdf"></self-uri>
<abstract>
<p>The evolving field of Alzheimer&#x2019;s disease (AD) diagnosis has greatly benefited from deep learning models for analyzing brain magnetic resonance (MR) images. This study introduces Dynamic GradNet, a novel deep learning model designed to increase diagnostic accuracy and interpretability for multiclass AD classification. Initially, four state-of-the-art convolutional neural network (CNN) architectures, the self-regulated network (RegNet), residual network (ResNet), densely connected convolutional network (DenseNet), and efficient network (EfficientNet), were comprehensively compared via a unified preprocessing pipeline to ensure a fair evaluation. Among these models, EfficientNet consistently demonstrated superior performance in terms of accuracy, precision, recall, and F1 score. As a result, EfficientNet was selected as the foundation for implementing Dynamic GradNet. Dynamic GradNet incorporates gradient weighted class activation mapping (GradCAM) into the training process, facilitating dynamic adjustments that focus on critical brain regions associated with early dementia detection. These adjustments are particularly effective in identifying subtle changes associated with very mild dementia, enabling early diagnosis and intervention. The model was evaluated with the OASIS dataset, which contains greater than 80,000 brain MR images categorized into four distinct stages of AD progression. The proposed model outperformed the baseline architectures, achieving remarkable generalizability across all stages. This finding was especially evident in early-stage dementia detection, where Dynamic GradNet significantly reduced false positives and enhanced classification metrics. These findings highlight the potential of Dynamic GradNet as a robust and scalable approach for AD diagnosis, providing a promising alternative to traditional attention-based models. 
The model&#x2019;s ability to dynamically adjust spatial focus offers a powerful tool in artificial intelligence (AI) assisted precision medicine, particularly in the early detection of neurodegenerative diseases.</p>
</abstract>
<kwd-group kwd-group-type="author">
<kwd>Spatial focus</kwd>
<kwd>GradCAM</kwd>
<kwd>medical image classification</kwd>
<kwd>deep learning</kwd>
<kwd>early dementia detection</kwd>
<kwd>neurodegenerative disease</kwd>
<kwd>MRI analysis</kwd>
<kwd>Alzheimer&#x2019;s</kwd>
<kwd>attention</kwd>
<kwd>CNN</kwd>
</kwd-group>
<funding-group>
<award-group id="awg1">
<funding-source>Deanship of Graduate Studies and Scientific Research, Taif University</funding-source>
</award-group>
</funding-group>
</article-meta>
</front>
<body>
<sec id="s1">
<label>1</label>
<title>Introduction</title>
<p>Magnetic resonance imaging (MRI) has become an essential tool for diagnosing neurodegenerative diseases such as Alzheimer&#x2019;s disease (AD) because of its ability to provide detailed images of brain structures. MRI is particularly effective in detecting early and subtle changes in brain regions vulnerable to AD, such as the hippocampus, which plays a critical role in memory and cognitive function. Detecting these changes is crucial for early diagnosis and monitoring the progression of the disease [<xref ref-type="bibr" rid="ref-1">1</xref>,<xref ref-type="bibr" rid="ref-2">2</xref>]. Given the noninvasive nature of MRI, it is widely used in both clinical and research settings to detect biomarkers associated with AD. However, the manual interpretation of MRI images remains a challenging and time-consuming task that is prone to human error, especially when dealing with large datasets and complex brain structures. To address these challenges, automated methods&#x2014;particularly those leveraging machine learning and deep learning models&#x2014;have been developed to increase the accuracy and efficiency of MRI-based diagnoses [<xref ref-type="bibr" rid="ref-3">3</xref>].</p>
<p>Deep learning models, a type of machine learning model, have demonstrated significant potential in the field of medical image analysis [<xref ref-type="bibr" rid="ref-4">4</xref>]. Their ability to automatically extract relevant features from raw images without manual intervention provides a substantial advantage over traditional methods. Convolutional neural networks (CNNs) have gained widespread adoption in medical imaging tasks because of their ability to capture spatial hierarchies and patterns through convolutional layers [<xref ref-type="bibr" rid="ref-5">5</xref>]. This powerful feature learning capability has allowed deep learning approaches to set new performance benchmarks across a wide range of artificial intelligence applications [<xref ref-type="bibr" rid="ref-6">6</xref>]. Recent advancements in CNN architectures have led to models capable of capturing both local and global features within images. Among the most prominent of these architectures are the residual network (ResNet), self-regulated network (RegNet), densely connected convolutional network (DenseNet), and efficient network (EfficientNet), each offering unique advantages for medical imaging tasks, including AD detection.</p>
<p>As deep learning frameworks continue to evolve, attention has shifted toward developing efficient and scalable neural network architectures that can be applied across a diverse range of tasks, particularly in computer vision. Traditionally, designing individual networks such as ResNet or EfficientNet has been the primary approach to achieve state-of-the-art performance. However, Radosavovic et al. (2020) introduced RegNet and emphasized focusing on network design spaces that parameterize entire populations of networks rather than individual instances [<xref ref-type="bibr" rid="ref-7">7</xref>]. In AD studies, RegNet has shown promise in analyzing amyloid deposition via medical images, demonstrating its potential in detecting subtle changes associated with the disease. The ability of RegNet to process complex imaging data efficiently makes it a valuable tool for enhancing diagnostic accuracy in AD [<xref ref-type="bibr" rid="ref-8">8</xref>].</p>
<p>ResNet, introduced by He et al. [<xref ref-type="bibr" rid="ref-9">9</xref>], is widely recognized for its ability to train deep networks by addressing the vanishing gradient problem. Through the use of skip connections, ResNet allows gradients to flow more effectively through the network, enabling the training of very deep architectures. Thus, ResNet has been particularly effective in image classification tasks, including medical imaging applications, where it has been successfully applied to AD diagnosis via MR images [<xref ref-type="bibr" rid="ref-10">10</xref>].</p>
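<p>The skip connection underlying ResNet can be sketched in a few lines. The following NumPy illustration is a simplified toy example, not the implementation used in this study; a linear transformation stands in for the convolutional layers:</p>

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, transform):
    """Illustrative residual (skip) connection: the transformed features are
    added back to the input, so gradients can bypass the transformation and
    flow directly to earlier layers."""
    return relu(transform(x) + x)

# Toy transformation standing in for a pair of convolutional layers.
x = np.array([1.0, -2.0, 3.0])
out = residual_block(x, lambda v: 0.5 * v)  # -> [1.5, 0.0, 4.5]
```

<p>Because the identity path is untouched, the block cannot degrade the signal even when the learned transformation contributes little, which is the intuition behind training very deep networks.</p>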
<p>DenseNet, proposed by Huang et al. [<xref ref-type="bibr" rid="ref-11">11</xref>], adopted a different approach by connecting each layer to all subsequent layers in a feed-forward manner. This dense connectivity promotes feature reuse, leading to more efficient learning and reducing the number of parameters required. DenseNet&#x2019;s efficient design makes it particularly suitable for medical imaging tasks, especially in cases where computational resources are limited. Moreover, DenseNet has demonstrated strong performance in tasks that require detailed anatomical understanding, such as detecting subtle changes in MR images for AD diagnosis [<xref ref-type="bibr" rid="ref-12">12</xref>].</p>
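<p>Dense connectivity can be illustrated with a short NumPy sketch (a toy example, not the paper&#x2019;s implementation), in which each &#x201C;layer&#x201D; receives the concatenation of all preceding feature maps along the channel axis:</p>

```python
import numpy as np

def dense_block(x, num_layers, growth_rate, seed=0):
    """Toy dense block: layer i consumes the concatenation of the input and
    all previous layers' outputs, and emits `growth_rate` new channels."""
    rng = np.random.default_rng(seed)
    features = [x]  # channels-first layout: (channels, height, width)
    for _ in range(num_layers):
        concat = np.concatenate(features, axis=0)
        # Stand-in for a convolution producing `growth_rate` new channels.
        w = rng.standard_normal((growth_rate, concat.shape[0]))
        new = np.maximum(np.tensordot(w, concat, axes=([1], [0])), 0.0)
        features.append(new)
    return np.concatenate(features, axis=0)

x = np.ones((4, 8, 8))
out = dense_block(x, num_layers=3, growth_rate=2)
# Channel count grows linearly: 4 + 3 * 2 = 10 channels.
```

<p>Feature reuse is explicit here: the original input channels survive unchanged in the output, and every later layer can read them directly, which is why DenseNet requires comparatively few parameters.</p>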
<p>EfficientNet, introduced by Tan et al. [<xref ref-type="bibr" rid="ref-13">13</xref>], employs a compound scaling method to uniformly scale the depth, width, and resolution of the network, optimizing performance while minimizing computational cost. Whereas traditional methods scale these dimensions independently, often leading to suboptimal performance or high computational overhead, EfficientNet scales them jointly, balancing accuracy with efficiency and making it highly suitable for both high-performance and resource-constrained environments. This architecture has achieved state-of-the-art results in various medical imaging tasks, outperforming prior CNNs across multiple benchmarks.</p>
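<p>Compound scaling can be made concrete in a few lines of Python. The coefficients below (alpha = 1.2, beta = 1.1, gamma = 1.15) are the baseline values reported in the EfficientNet paper; the constraint alpha &#x00B7; beta&#x00B2; &#x00B7; gamma&#x00B2; &#x2248; 2 means that each increment of the compound coefficient phi roughly doubles the network&#x2019;s FLOPs:</p>

```python
def compound_scaling(phi, alpha=1.2, beta=1.1, gamma=1.15):
    """Joint scaling of depth, width, and resolution by a single coefficient
    phi, using the baseline coefficients from the EfficientNet paper."""
    depth_mult = alpha ** phi        # scales the number of layers
    width_mult = beta ** phi         # scales the number of channels
    resolution_mult = gamma ** phi   # scales the input image resolution
    return depth_mult, width_mult, resolution_mult

d, w, r = compound_scaling(1)
# FLOPs grow roughly as (alpha * beta**2 * gamma**2) ** phi, i.e., about 2 ** phi.
flops_growth = d * w ** 2 * r ** 2
```

<p>Scaling all three dimensions with one knob is what distinguishes compound scaling from the earlier practice of enlarging depth, width, or resolution in isolation.</p>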
<p>While CNN based architectures such as RegNet, ResNet, DenseNet, and EfficientNet have consistently demonstrated strong performance in medical imaging tasks, understanding why these models make certain predictions is critical in clinical applications. A widely adopted technique in this context is gradient weighted class activation mapping (GradCAM) [<xref ref-type="bibr" rid="ref-14">14</xref>], which provides visual explanations by highlighting the regions in the input image that are most important for model prediction. By leveraging GradCAM, researchers and clinicians can gain insights into which areas of the brain the model focuses on when diagnosing AD from MR images. This interpretability is essential for validating model predictions and ensuring that the network focuses on clinically relevant brain regions, such as the hippocampus or other areas susceptible to early AD related atrophy. The incorporation of GradCAM thus enhances the transparency of deep learning models, increasing the trustworthiness and reliability of artificial intelligence (AI) tools for medical imaging.</p>
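<p>The core GradCAM computation is compact: channel weights are obtained by spatially averaging the gradients of the class score with respect to the final convolutional feature maps, and the heatmap is the ReLU of the weighted sum of those maps. The NumPy sketch below is illustrative only, using synthetic feature maps and gradients rather than a trained network:</p>

```python
import numpy as np

def grad_cam(feature_maps, gradients):
    """Minimal GradCAM: `feature_maps` and `gradients` both have shape
    (channels, height, width); gradients are d(class score)/d(feature map)."""
    weights = gradients.mean(axis=(1, 2))          # one weight per channel
    cam = np.einsum('k,khw->hw', weights, feature_maps)
    cam = np.maximum(cam, 0.0)                     # keep positive influence only
    if cam.max() > 0:
        cam = cam / cam.max()                      # normalize to [0, 1]
    return cam

# Synthetic example: channel 0 activates in the image center and carries
# positive gradients, so the resulting map highlights the center.
fmaps = np.zeros((2, 4, 4))
fmaps[0, 1:3, 1:3] = 1.0
grads = np.zeros((2, 4, 4))
grads[0] = 1.0
cam = grad_cam(fmaps, grads)
```

<p>Overlaying such a normalized map on the input MR slice yields the familiar GradCAM heatmap that clinicians can inspect for plausibility.</p>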
<p>In computer aided diagnosis (CAD), medical image analysis plays a critical and challenging role in identifying anatomical or pathological structures across various imaging modalities, such as magnetic resonance imaging (MRI), computed tomography (CT), and positron emission tomography (PET). However, automating this process presents several challenges, including low contrast of soft tissue, variability in anatomical structures, and limited availability of annotated datasets for model training. Inspired by the ability of the human visual system to focus on relevant areas and ignore background noise, attention mechanisms have been introduced to assign adaptive weights to different regions in an image, enabling neural networks to prioritize the most important regions related to the task and disregard irrelevant areas. This capability enhances the model&#x2019;s ability to capture complex semantic relationships, making attention mechanisms particularly useful for improving both accuracy and interpretability in medical image analysis [<xref ref-type="bibr" rid="ref-15">15</xref>].
<list list-type="bullet">
<list-item>
<p><bold>Key Contributions:</bold></p></list-item>
</list>
<list list-type="simple">
<list-item>
<label>1.</label><p><bold> Dynamic Adjustments for Complex Spatial Patterns:</bold></p></list-item>
</list></p>
<p>A key strength of Dynamic GradNet is its ability to dynamically adjust focus during training on the basis of the complexity of spatial patterns in the brain. The model emphasizes critical brain regions affected by AD, such as the hippocampus and amygdala [<xref ref-type="bibr" rid="ref-16">16</xref>], without overfitting to irrelevant features. This dynamic adjustment process is especially crucial in detecting &#x201C;very mild dementia&#x201D;, where subtle changes in these regions are the earliest indicators of cognitive decline. By capturing these early patterns, the model significantly improves diagnostic accuracy, making it highly effective in identifying patients at the earliest stages of the disease.
<list list-type="simple">
<list-item>
<label>2.</label><p><bold> Controlled Spatial Focus during Training:</bold></p></list-item>
</list></p>
<p>Dynamic GradNet incorporates a mechanism to provide precise control over the brain regions that the model focuses on during training, ensuring that the model emphasizes key areas affected by AD, such as the hippocampus, temporal lobe, parietal lobe, frontal lobe, amygdala, and cerebral cortex [<xref ref-type="bibr" rid="ref-16">16</xref>]. By incorporating this spatial focus approach into training, the model prioritizes these regions, resulting in more accurate and reliable classifications, particularly for early-stage dementia.
<list list-type="simple">
<list-item>
<label>3.</label><p><bold> Unified Preprocessing for Fair Comparison of Powerful CNN Architectures:</bold></p></list-item>
</list></p>
<p>To ensure fairness in model evaluation, a unified preprocessing pipeline was applied to all neural network architectures: RegNet, ResNet, DenseNet, and EfficientNet. These CNN architectures are known for their strong performance in image classification tasks. The standardized preprocessing approach ensures consistent data handling across models, so that performance differences can be attributed directly to the architectures. Following this fair comparison, EfficientNet emerged as the best-performing model across key metrics such as precision, recall, F1 score, and accuracy, making it the ideal foundation for Dynamic GradNet, which in turn reduced false positives and improved generalizability.</p>
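<p>As a hypothetical illustration of such standardization (the study&#x2019;s actual pipeline may include additional operations such as resizing and augmentation; this sketch shows only per-image intensity normalization, one common step that would be applied identically before every architecture):</p>

```python
import numpy as np

def normalize_intensity(image):
    """Per-image intensity normalization: min-max scale to [0, 1], then
    standardize to zero mean and unit variance. Applying the same function
    to every model's inputs keeps the comparison architecture-only."""
    img = image.astype(np.float64)
    img = (img - img.min()) / (img.max() - img.min() + 1e-8)
    return (img - img.mean()) / (img.std() + 1e-8)

slice_2d = np.arange(16.0).reshape(4, 4)  # stand-in for one MR slice
out = normalize_intensity(slice_2d)
```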
<p>By focusing on spatially significant regions and avoiding noise or irrelevant areas, Dynamic GradNet reduces the likelihood of false positives. This controlled spatial learning approach ensures that the model emphasizes the most important brain regions, improving its generalizability across different stages of AD.</p>
<p>As a result, the model generates more reliable predictions, particularly in early-stage diagnoses, where detecting subtle structural changes is critical for timely intervention.
<list list-type="simple">
<list-item>
<label>4.</label><p><bold> Clinical Relevance and Interpretability:</bold></p></list-item>
</list></p>
<p>Integrating Dynamic GradNet into the training process enhances both the accuracy and interpretability of the model&#x2019;s predictions. By focusing on the brain regions most affected by AD, the model&#x2019;s decision-making process becomes more aligned with clinical observations, making it suitable for real-world medical applications. This approach ensures that the model is not only accurate but also interpretable, providing clinicians with a valuable tool for early diagnosis and monitoring of AD progression.</p>
<p>This study bridges the gap between deep learning interpretability and clinical applicability by ensuring that the model focuses on diagnostically relevant brain regions. The proposed approach enhances both classification accuracy and explainability, making it more accessible for radiologists and neurologists in real-world diagnostic workflows.</p>
</sec>
<sec id="s2">
<label>2</label>
<title>Related Work</title>
<p>The application of deep learning approaches in medical image analysis, particularly for AD detection, has attracted substantial attention in recent years. Early studies in this domain relied primarily on traditional machine learning techniques that required manual feature extraction from MR images. For example, methods such as structural MRI feature extraction and hippocampal volume measurement were commonly used to assess brain atrophy in regions associated with AD [<xref ref-type="bibr" rid="ref-17">17</xref>]. However, these techniques are limited by their dependence on handcrafted features, which can cause subtle patterns critical for early diagnosis to be overlooked.</p>
<p>The advent of deep learning techniques, particularly CNNs, has revolutionized this field by enabling automatic feature extraction directly from raw MR images. CNN based approaches have been successfully employed in various AD detection studies using structural MRI data. For example, Liu et al. [<xref ref-type="bibr" rid="ref-18">18</xref>] demonstrated the effectiveness of CNNs in classifying MRI data for AD diagnosis, achieving high accuracy by focusing on key brain regions susceptible to AD related atrophy. Similarly, Suk et al. [<xref ref-type="bibr" rid="ref-19">19</xref>] introduced a deep sparse multitask learning framework that combines MRI and PET data to further enhance AD detection performance.</p>
<p>In addition to CNN based methods, machine learning techniques have been widely explored for Alzheimer&#x2019;s disease (AD) diagnosis, incorporating various approaches to enhance classification performance. Traditional machine learning classifiers such as Random Forests have been applied to structural MRI data, often combined with feature selection techniques to improve early-stage detection [<xref ref-type="bibr" rid="ref-20">20</xref>].</p>
<p>More recently, hybrid models integrating deep learning with ensemble learning strategies have gained attention in AD diagnosis. Studies have shown that combining multiple CNN architectures or fusing imaging modalities, such as MRI and PET scans, can significantly improve model robustness [<xref ref-type="bibr" rid="ref-21">21</xref>]. Additionally, Transformer based architectures, particularly Vision Transformers (ViTs), have emerged as promising alternatives due to their ability to capture long range dependencies in medical images [<xref ref-type="bibr" rid="ref-22">22</xref>]. These advancements highlight the growing impact of machine learning in AD detection and the ongoing efforts to enhance interpretability and reliability in clinical settings.</p>
<sec id="s2_1">
<label>2.1</label>
<title>Deep Learning and Spatial Attention Mechanisms for Alzheimer&#x2019;s Disease Classification</title>
<p>Recent advancements in CNN architectures, such as ResNet, RegNet, DenseNet, and EfficientNet, have significantly improved the accuracy and computational efficiency of AD detection systems [<xref ref-type="bibr" rid="ref-22">22</xref>&#x2013;<xref ref-type="bibr" rid="ref-24">24</xref>].</p>
<sec id="s2_1_1">
<label>2.1.1</label>
<title>ResNet</title>
<p>ResNet, introduced by He et al. [<xref ref-type="bibr" rid="ref-9">9</xref>], has been widely adopted for AD detection because of its ability to train deep networks without suffering from the vanishing gradient problem. An improved ResNet model was proposed for the early diagnosis of AD via MRI scans [<xref ref-type="bibr" rid="ref-25">25</xref>]. This model, which is based on ResNet-50, incorporates several enhancements, including the Mish activation function, spatial transformer network (STN), and a nonlocal attention mechanism. These improvements enabled the model to capture long range correlations in MRI data while retaining critical spatial information, addressing the limitations often encountered by traditional CNNs. The model achieved a classification accuracy of 0.97 with the ADNI dataset, surpassing other algorithms in terms of macro precision, recall, and F1 score.</p>
<p>Additionally, a novel multistage deep learning framework based on residual functions was introduced for AD detection [<xref ref-type="bibr" rid="ref-26">26</xref>]. Inspired by the success of ResNet in image classification tasks, the framework employs five stages to enhance feature extraction while maintaining depth. Following the feature extraction phase, machine learning classifiers such as support vector machines (SVMs), random forests (RFs), and softmax classifiers were applied for classification. The model achieved excellent accuracy on three benchmark datasets (ADNI, MIRIAD, and OASIS), with accuracy rates reaching as high as 0.99, outperforming existing systems.</p>
</sec>
<sec id="s2_1_2">
<label>2.1.2</label>
<title>RegNet</title>
<p>Recent advancements have also highlighted the potential of RegNet in medical imaging applications, including AD detection. In one study [<xref ref-type="bibr" rid="ref-3">3</xref>], RegNet X064 was employed to predict amyloid deposition in PET scans for AD prognosis, achieving notably high performance. When combined with gradient boosting decision trees (GBDTs), the model exhibited reduced error margins and faster prediction times than human experts did. These findings underscore the effectiveness and scalability of RegNet in AD imaging tasks, making it a promising tool for clinical use in detecting neurodegenerative diseases.</p>
</sec>
<sec id="s2_1_3">
<label>2.1.3</label>
<title>DenseNet</title>
<p>DenseNet has also shown significant promise in the field of medical imaging, particularly in the analysis of MRI brain scans. Compared with traditional CNNs, DenseNet&#x2019;s densely connected convolutional architecture facilitates more efficient feature extraction with fewer parameters, leading to improved performance in medical image analysis tasks [<xref ref-type="bibr" rid="ref-27">27</xref>]. In brain MRI analysis, DenseNet has demonstrated high accuracy in capturing intricate brain structures, making it a valuable tool in both clinical applications and research settings. This model&#x2019;s ability to handle complex medical images efficiently demonstrates its potential to increase diagnostic accuracy and support advancements in neuroimaging techniques.</p>
<p>Several studies have explored the use of DenseNet in AD classification tasks. For example, in [<xref ref-type="bibr" rid="ref-28">28</xref>], a transfer learning based model utilizing DenseNet was introduced for classifying AD into three categories. This model achieved an accuracy of 0.96 and an AUC (Area Under Curve) of 0.99 with MRI datasets. This study demonstrated that DenseNet outperforms other traditional models in managing high dimensional MR data, particularly when combined with data augmentation techniques, addressing the issue of limited dataset availability. Moreover, the integration of a healthcare decision support system (HDSS) alongside the DenseNet model provided valuable insights for clinical decision making. These advancements highlight DenseNet&#x2019;s potential to improve diagnostic accuracy in AD classification within clinical settings.</p>
</sec>
<sec id="s2_1_4">
<label>2.1.4</label>
<title>EfficientNet</title>
<p>EfficientNet has also emerged as a powerful deep learning model for medical imaging tasks, particularly those involving MRI scans. For example, in [<xref ref-type="bibr" rid="ref-29">29</xref>], a fine-tuned EfficientNet architecture was used for brain tumor classification and achieved superior performance across multiple datasets. EfficientNet&#x2019;s efficient feature extraction and reduced computational complexity have proven highly beneficial for analyzing high resolution MR images, particularly in complex brain imaging tasks. This study underscores the potential of transfer learning based EfficientNet models to increase diagnostic accuracy in medical imaging applications, as these models outperform state-of-the-art methods.</p>
<p>In AD related tasks, recent studies have demonstrated the effectiveness of EfficientNet. For example, EfficientNet-B0 was employed to classify brain MR images for early AD detection [<xref ref-type="bibr" rid="ref-30">30</xref>]. This approach integrates UNet for brain tissue segmentation and EfficientNet-B0 for feature extraction and classification. This model achieved an accuracy of 0.98, with high sensitivity and precision scores. These findings demonstrate EfficientNet&#x2019;s ability to handle the complexity of brain MRI data, particularly in distinguishing between healthy and diseased brain tissues. The integration of EfficientNet into AD diagnosis systems has the potential to increase diagnostic accuracy and support early intervention, aligning with the growing body of research that positions EfficientNet as a robust tool for neurodegenerative disease classification.</p>
</sec>
<sec id="s2_1_5">
<label>2.1.5</label>
<title>GradCAM</title>
<p>While deep learning models such as CNNs have significantly advanced medical imaging tasks such as brain MRI classification, the interpretability of these models remains a critical challenge. However, techniques such as GradCAM have emerged as powerful tools to increase the transparency and interpretability of CNN models by generating visual explanations of the regions in an image that contribute most to a model&#x2019;s output. In one study [<xref ref-type="bibr" rid="ref-31">31</xref>], the effectiveness of GradCAM in interpreting CNN models trained to classify different types of multiple sclerosis (MS) was demonstrated using brain MR images. The results showed that GradCAM provided superior localization of discriminative brain regions, making it an invaluable tool for understanding CNNs&#x2019; decision making processes in medical contexts. These results emphasize the importance of integrating interpretability techniques such as GradCAM to improve the reliability and clinical applicability of deep learning models in medical imaging.</p>
<p>In the context of AD diagnosis, recent studies have explored the combination of deep learning models with interpretability techniques such as GradCAM. For example, an Inception ResNet model was applied to differentiate between AD patients and healthy controls (HCs) via T1-weighted MR brain images [<xref ref-type="bibr" rid="ref-32">32</xref>] and achieved competitive performance. GradCAM was employed to visualize the most discriminative brain regions, with the results indicating that the lateral ventricles in the mid-axial slice were key in distinguishing AD patients. This integration of GradCAM not only enhanced the transparency of the model&#x2019;s decisions but also demonstrated its potential for assisting in diagnosis with minimal medical expertise. These findings highlight the importance of GradCAM in improving the interpretability and clinical relevance of deep learning models for AD diagnosis.</p>
</sec>
<sec id="s2_1_6">
<label>2.1.6</label>
<title>Spatial Focus</title>
<p>Krishnan et al. (2024) integrated a spatial attention mechanism into a CNN architecture to improve the classification of AD using MRI data. The spatial attention layer aids in guiding the model to focus on critical brain regions, such as those affected by AD, leading to a validation accuracy of 0.99. This approach, which assigns adaptive weights to important regions of the brain, highlights the potential of spatial focus techniques to increase both the accuracy and interpretability of deep learning models in AD diagnosis [<xref ref-type="bibr" rid="ref-33">33</xref>].</p>
<p>Sun, Wang, and He (2022) proposed a temporal and spatial analysis framework for improving AD diagnosis via the use of resting state functional MRI (fMRI) data. The authors employed a CNN with residual connections combined with a multilayer long short term memory network to classify AD patients and predict the progression of mild cognitive impairment (MCI) to AD. By constructing a functional connectivity (FC) network matrix based on regions of interest (ROIs) in the brain, the model was able to focus on critical brain areas while analyzing temporal changes over multiple time points. This approach aligns with the concept of spatial focus, as it directs the model&#x2019;s attention to diagnostically relevant brain regions, enhancing the ability to distinguish AD patients from healthy controls and to predict progressive MCI (pMCI) patients <italic>vs</italic>. stable MCI (sMCI) patients. Their method achieved classification accuracy of 0.93 for AD patients <italic>vs</italic>. HCs and 0.75 for pMCI patients <italic>vs</italic>. sMCI patients, demonstrating the effectiveness of spatial and temporal focus in improving diagnostic accuracy [<xref ref-type="bibr" rid="ref-34">34</xref>].</p>
</sec>
</sec>
<sec id="s2_2">
<label>2.2</label>
<title>Problem Statement</title>
<p>The studies discussed in this section highlight the advancements and challenges in Alzheimer&#x2019;s disease (AD) diagnosis using deep learning. While previous research has extensively leveraged GradCAM as a post-hoc interpretability tool, the majority of these approaches focus solely on visualizing model attention after prediction, rather than incorporating it into the learning process. Consequently, existing methods fail to dynamically adjust their focus on clinically significant brain regions during training, limiting their effectiveness in capturing subtle yet crucial biomarkers for early-stage detection.</p>
<p>In contrast, this study introduces an innovative approach by dynamically integrating GradCAM into the training pipeline. Rather than merely using GradCAM to observe model attention retrospectively, our method actively refines the feature learning process, ensuring that the model prioritizes clinically relevant regions while suppressing irrelevant background features. This dynamic reweighting mechanism enhances both classification accuracy and model interpretability, addressing one of the critical limitations in existing methods.</p>
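<p>One way to picture this reweighting is as a spatial gate derived from the normalized class activation map and applied to the feature maps during the forward pass. The sketch below conveys the general idea only; the gating form and the strength parameter are illustrative assumptions, not the exact mechanism of Dynamic GradNet:</p>

```python
import numpy as np

def cam_reweight(features, cam, strength=0.5):
    """Schematic GradCAM-guided reweighting: regions the activation map marks
    as relevant are amplified, implicitly suppressing background features.
    `features` is (channels, height, width); `cam` is (height, width).
    `strength` is an illustrative hyperparameter, not from the paper."""
    gate = cam / (cam.max() + 1e-8)            # normalize map to [0, 1]
    return features * (1.0 + strength * gate)  # broadcast across channels

feats = np.ones((2, 4, 4))
cam = np.zeros((4, 4))
cam[1:3, 1:3] = 2.0                            # "relevant" central region
out = cam_reweight(feats, cam)
```

<p>Because the gate depends on the current activation map, the emphasis shifts as training progresses, in the spirit of the dynamic adjustment described above.</p>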
<p>Moreover, unlike prior research where GradCAM is employed only to explain model predictions, our approach directly influences how the model learns important features, thereby reducing false positive rates in early-stage AD detection. This improvement is particularly crucial for cases such as very mild dementia, where subtle biomarkers are often overlooked. To validate our findings, we conducted a rigorous comparison across four well-established CNN architectures (RegNet, ResNet, DenseNet, and EfficientNet-B0) using a standardized evaluation framework. The results demonstrated that EfficientNet-B0 outperformed the other architectures across all performance metrics, including accuracy, recall, and F1 score, making it the optimal backbone for the proposed Dynamic GradNet framework.</p>
<p>By addressing both the challenges of interpretability and early-stage detection, this work presents a paradigm shift in AD diagnosis, demonstrating that integrating GradCAM beyond post-hoc analysis can significantly improve deep learning models for medical imaging. The proposed method advances the field by bridging the gap between model explainability and performance, offering a more reliable and clinically interpretable solution for AD classification.</p>
</sec>
</sec>
<sec id="s3">
<label>3</label>
<title>Methodology</title>
<sec id="s3_1">
<label>3.1</label>
<title>Dataset Preparation and Overview</title>
<p>The OASIS MRI dataset [<xref ref-type="bibr" rid="ref-35">35</xref>&#x2013;<xref ref-type="bibr" rid="ref-38">38</xref>] used in this study contains more than 80,000 brain MR images categorized into four classes on the basis of AD progression: moderate dementia, very mild dementia, mild dementia, and nondemented. These images were obtained from 461 patients, offering a robust dataset for AD detection and analysis. Patient classification was based on clinical dementia rating (CDR) values, resulting in four distinct classes, as shown in <xref ref-type="table" rid="table-1">Table 1</xref>. This classification enables the study of AD progression across different stages. <xref ref-type="fig" rid="fig-1">Fig. 1</xref> presents sample MRI images from the dataset, showcasing representative examples from each of the four categories. These samples highlight the visual differences and subtle patterns associated with each stage of AD progression.</p>
<table-wrap id="table-1">
<label>Table 1</label>
<caption>
<title>Categories and number of images per category</title>
</caption>
<table>
<colgroup>
<col/>
<col/>
</colgroup>
<thead>
<tr>
<th>Category</th>
<th>Number of images</th>
</tr>
</thead>
<tbody>
<tr>
<td>Mild dementia</td>
<td>5002</td>
</tr>
<tr>
<td>Moderate dementia</td>
<td>488</td>
</tr>
<tr>
<td>Nondemented</td>
<td>67,222</td>
</tr>
<tr>
<td>Very mild dementia</td>
<td>13,725</td>
</tr>
<tr>
<td>Total number of images</td>
<td>86,437</td>
</tr>
</tbody>
</table>
</table-wrap><fig id="fig-1">
<label>Figure 1</label>
<caption>
<title>Representative MRI images for Alzheimer&#x2019;s disease stages</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_62923-fig-1.tif"/>
</fig>
</sec>
<sec id="s3_2">
<label>3.2</label>
<title>Preprocessing Stage</title>
<p>In this section, we describe the unified preprocessing pipeline applied to all four neural network architectures: RegNet, ResNet, DenseNet, and EfficientNet. A consistent and standardized preprocessing approach ensures that no unique or model-specific transformations are applied to one architecture over another, providing a fair basis for performance comparison.</p>
<sec id="s3_2_1">
<label>3.2.1</label>
<title>Image Resizing</title>
<p>The images are resized to 224 &#x00D7; 224 pixels to align with the standard input dimensions of various CNN architectures. This size ensures compatibility with pre-trained models, balances computational efficiency with feature preservation, and maintains anatomical integrity. Additionally, using a fixed resolution across all models eliminates input size variability, enabling a fair and consistent performance comparison.</p>
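As a minimal, framework-free sketch of this step (the actual pipeline would typically use torchvision's resizing transform with interpolation; the function name here is ours), nearest-neighbor resizing to the fixed 224 &#x00D7; 224 grid can be written as:

```python
def resize_nearest(img, out_h=224, out_w=224):
    """Resize a 2D grayscale image (list of rows) via nearest-neighbor sampling."""
    in_h, in_w = len(img), len(img[0])
    return [
        [img[(y * in_h) // out_h][(x * in_w) // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]

# Any input resolution maps onto the fixed 224 x 224 grid expected by the CNNs.
resized = resize_nearest([[0, 1], [2, 3]])
```

Because every architecture receives the same 224 &#x00D7; 224 input, no model gains an advantage from input-size differences.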
</sec>
<sec id="s3_2_2">
<label>3.2.2</label>
<title>Data Augmentation</title>
<p>To improve the generalizability of the models and prevent overfitting, several data augmentation techniques were applied. These transformations introduce variations into the training dataset, allowing the models to learn more robust features. The following augmentations were used:
<list list-type="bullet">
<list-item>
<p>Random Horizontal Flip: With a probability of 0.50, each image is flipped horizontally. This transformation helps the model become invariant to the left&#x2013;right orientation, which is especially useful in medical imaging tasks where structural symmetry is common.</p></list-item>
<list-item>
<p>Random Rotation: A random rotation of &#x00B1;10 degrees is applied to improve model generalization and reduce sensitivity to slight orientation differences in MRI scans. This range introduces variability while preserving critical brain structures, preventing overfitting to specific spatial patterns.</p></list-item>
</list></p>
<p>These data augmentation techniques are crucial in enhancing the variability of the training data, which contributes to the robustness of the models during inference.</p>
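The two augmentations above can be sketched in plain Python (a hedged illustration only; the actual pipeline uses torchvision's random flip and rotation transforms, and full image rotation is omitted here since it requires an imaging library, so only the angle sampling is shown):

```python
import random

def random_horizontal_flip(img, p=0.5, rng=random):
    """Flip each row left-to-right with probability p (img is a list of rows)."""
    if rng.random() < p:
        return [row[::-1] for row in img]
    return img

def sample_rotation_angle(rng=random, max_deg=10.0):
    """Draw a rotation angle uniformly from [-10, +10] degrees."""
    return rng.uniform(-max_deg, max_deg)

rng = random.Random(0)
flipped = random_horizontal_flip([[1, 2], [3, 4]], p=1.0, rng=rng)  # p=1: always flips
angle = sample_rotation_angle(rng)                                   # lies in [-10, 10]
```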
</sec>
<sec id="s3_2_3">
<label>3.2.3</label>
<title>Tensor Conversion</title>
<p>Once the augmentation techniques are applied, the images are directly converted to tensors via the transforms.ToTensor() function. This transformation function scales the pixel values (which originally range from [0, 255]) to a range of [0, 1] by dividing each pixel by 255, preparing the images for input into the neural networks.</p>
</sec>
<sec id="s3_2_4">
<label>3.2.4</label>
<title>Normalization</title>
<p>After the images are converted to tensors, a normalization approach is applied using the mean and standard deviation values. Since the images are grayscale, the values used for normalization are specific to images with a single channel. The values used for normalization in this study are mean &#x003D; 0.165 and standard deviation &#x003D; 0.176.</p>
<p>Normalization ensures that the pixel values are scaled such that the resulting tensors have a mean of 0 and a standard deviation of 1. This process helps improve the convergence of the model during training by standardizing the input values, leading to more stable gradient updates and faster training.</p>
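The scaling and normalization steps can be illustrated with the study's grayscale statistics (a minimal sketch operating on flat pixel lists; function names are ours):

```python
MEAN, STD = 0.165, 0.176  # grayscale statistics used in this study

def to_unit_range(pixels):
    """Scale raw 8-bit pixel values from [0, 255] to [0, 1]."""
    return [p / 255.0 for p in pixels]

def normalize(values, mean=MEAN, std=STD):
    """Standardize inputs: (x - mean) / std."""
    return [(v - mean) / std for v in values]

# A black pixel (0) and a white pixel (255) after both steps.
out = normalize(to_unit_range([0, 255]))
```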
</sec>
<sec id="s3_2_5">
<label>3.2.5</label>
<title>Data Splitting</title>
<p>The dataset is split into three subsets: training, validation, and testing. A stratified split is applied to ensure that the class distribution in the subsets reflects the overall class distribution in the dataset.
<list list-type="bullet">
<list-item>
<p>Training Set: 0.80 of the dataset is used for training the models.</p></list-item>
<list-item>
<p>Validation Set: 0.10 of the dataset is used for validation, ensuring that the models do not overfit during the training process.</p></list-item>
<list-item>
<p>Test Set: 0.10 of the dataset is reserved for evaluating the final performance of the models.</p></list-item>
</list></p>
<p>This stratified split ensures a balanced distribution of classes across all subsets, maintaining a proportional representation of each class in the training, validation, and test sets.</p>
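A minimal version of such a stratified split (illustrative only; in practice library routines such as scikit-learn's train_test_split with its stratify argument are commonly used) is:

```python
import random
from collections import defaultdict

def stratified_split(labels, fracs=(0.8, 0.1, 0.1), seed=0):
    """Split sample indices into train/val/test, preserving per-class proportions."""
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)
    rng = random.Random(seed)
    train, val, test = [], [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        n_tr = int(len(idxs) * fracs[0])
        n_va = int(len(idxs) * fracs[1])
        train += idxs[:n_tr]
        val += idxs[n_tr:n_tr + n_va]
        test += idxs[n_tr + n_va:]
    return train, val, test

# Toy imbalanced labels: each subset keeps the 80/20 class ratio.
labels = ["nondemented"] * 80 + ["very_mild"] * 20
tr, va, te = stratified_split(labels)
```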
</sec>
<sec id="s3_2_6">
<label>3.2.6</label>
<title>Handling Class Imbalance</title>
<p>The OASIS dataset presents a significant class imbalance, particularly between the nondemented and moderate dementia categories. To address this issue, we implemented a comprehensive approach that combines Weighted Random Sampling, a Weighted Loss Function, and Class-wise Performance Analysis to ensure that the model does not disproportionately favor the majority class.</p>
<p>To balance the dataset during training, Weighted Random Sampling was applied, ensuring that underrepresented classes were sampled more frequently. The class weights were computed using the formula:
<disp-formula id="eqn-1"><label>(1)</label><mml:math id="mml-eqn-1" display="block"><mml:mi>C</mml:mi><mml:mi>l</mml:mi><mml:mi>a</mml:mi><mml:mi>s</mml:mi><mml:mi>s</mml:mi><mml:mi>W</mml:mi><mml:mi>e</mml:mi><mml:mi>i</mml:mi><mml:mi>g</mml:mi><mml:mi>h</mml:mi><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mi>N</mml:mi><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mfrac></mml:math></disp-formula>where <italic>N</italic> represents the total dataset size, and <inline-formula id="ieqn-1"><mml:math id="mml-ieqn-1"><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> is the number of samples in class i. This method ensured that all classes contributed equally during training, preventing the model from being biased toward the dominant nondemented category.</p>
<p>In addition to the sampling adjustments, a Weighted Cross-Entropy Loss Function was incorporated to further counteract the imbalance. The loss function was modified such that higher penalties were assigned to misclassified samples from underrepresented classes, ensuring that the model paid more attention to the minority classes. The loss function is formulated as follows:
<disp-formula id="eqn-2"><label>(2)</label><mml:math id="mml-eqn-2" display="block"><mml:mi>L</mml:mi><mml:mo>=</mml:mo><mml:mo>&#x2212;</mml:mo><mml:mo>&#x2211;</mml:mo><mml:msub><mml:mi>w</mml:mi><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msub><mml:mi>log</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>p</mml:mi><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula>where <inline-formula id="ieqn-2"><mml:math id="mml-ieqn-2"><mml:msub><mml:mi>w</mml:mi><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> represents the weight assigned to class c, <italic>y</italic><sub>c</sub> is the one-hot ground-truth label, and <italic>p</italic><sub>c</sub> is the predicted probability for class c. The weight scales the contribution of each class in the loss calculation, so classes with fewer samples have a stronger influence on the optimization process, thereby improving the model&#x2019;s ability to distinguish between dementia stages.</p>
<p>To evaluate the effectiveness of these techniques, a Class-wise Performance Analysis was conducted, where metrics such as precision, recall, F1 score, and accuracy were calculated separately for each class. This analysis validated that the model&#x2019;s improvements were not driven solely by the majority class but were distributed more equitably across all dementia stages. The results confirmed that the integration of weighted sampling and loss functions led to more balanced predictions, reducing bias and improving overall classification stability.</p>
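Eqs. (1) and (2) can be sketched as follows, using the class counts from Table 1 for illustration (function names are ours, and the loss is shown for a single one-hot sample rather than a batch):

```python
import math

def class_weights(counts):
    """Eq. (1): weight_c = N / n_c, where N is the total number of samples."""
    total = sum(counts.values())
    return {c: total / n for c, n in counts.items()}

def weighted_cross_entropy(probs, true_class, weights):
    """Eq. (2) for one one-hot sample: L = -w_c * log(p_c) for the true class c."""
    return -weights[true_class] * math.log(probs[true_class])

# Class counts from Table 1 (total 86,437 images).
counts = {"nondemented": 67222, "very_mild": 13725, "mild": 5002, "moderate": 488}
w = class_weights(counts)
# Misclassifying a rare 'moderate' sample is penalized far more heavily
# than misclassifying a 'nondemented' one at the same predicted probability.
loss_rare = weighted_cross_entropy({"moderate": 0.5}, "moderate", w)
```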
</sec>
<sec id="s3_2_7">
<label>3.2.7</label>
<title>DataLoader Configuration</title>
<p>A DataLoader is used to load the data in batches during the training process. The batch size is set to 32 across all the models, and the &#x201C;WeightedRandomSampler&#x201D; is incorporated into the training DataLoader to handle class imbalance. For the validation and test sets, the data are loaded without sampling, and shuffling is disabled to maintain consistency during evaluation.</p>
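This configuration can be illustrated without the framework (in PyTorch it corresponds to a WeightedRandomSampler over per-sample inverse-frequency weights and a DataLoader with batch_size=32; the helper names below are ours):

```python
def sampler_weights(labels):
    """One weight per sample: the inverse frequency of that sample's class."""
    counts = {}
    for lab in labels:
        counts[lab] = counts.get(lab, 0) + 1
    return [1.0 / counts[lab] for lab in labels]

def batches(indices, batch_size=32):
    """Yield fixed-size batches in order, as a (non-shuffling) loader would."""
    for start in range(0, len(indices), batch_size):
        yield indices[start:start + batch_size]

# Samples of the rare class receive proportionally larger sampling weights.
weights = sampler_weights(["a"] * 90 + ["b"] * 10)
batch_sizes = [len(b) for b in batches(list(range(100)))]
```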
</sec>
<sec id="s3_2_8">
<label>3.2.8</label>
<title>Summary of Preprocessing Pipeline</title>
<p>The complete preprocessing pipeline applied to every image can be summarized as:
<disp-formula id="eqn-3"><label>(3)</label><mml:math id="mml-eqn-3" display="block"><mml:msub><mml:mi>I</mml:mi><mml:mrow><mml:mi>P</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mi>N</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>T</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>R</mml:mi><mml:mrow><mml:mi>&#x03B8;</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mi>p</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>S</mml:mi><mml:mrow><mml:mi>d</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>I</mml:mi><mml:mrow><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mi>i</mml:mi><mml:mi>g</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">)</mml:mo><mml:mo stretchy="false">)</mml:mo><mml:mo stretchy="false">)</mml:mo><mml:mo stretchy="false">)</mml:mo><mml:mo>,</mml:mo><mml:mi>&#x03BC;</mml:mi><mml:mo>,</mml:mo><mml:mi>&#x03C3;</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:math></disp-formula>
<list list-type="bullet">
<list-item>
<p><inline-formula id="ieqn-3"><mml:math id="mml-ieqn-3"><mml:msub><mml:mrow><mml:mtext>I</mml:mtext></mml:mrow><mml:mrow><mml:mrow><mml:mtext>Pre</mml:mtext></mml:mrow></mml:mrow></mml:msub></mml:math></inline-formula>: Represents the image after applying all preprocessing steps.</p></list-item>
<list-item>
<p><inline-formula id="ieqn-4"><mml:math id="mml-ieqn-4"><mml:msub><mml:mi>S</mml:mi><mml:mrow><mml:mi>d</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>I</mml:mi><mml:mrow><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mi>i</mml:mi><mml:mi>g</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:math></inline-formula>: Refers to the scaling operation (resize) applied to the original image <inline-formula id="ieqn-5"><mml:math id="mml-ieqn-5"><mml:msub><mml:mi>I</mml:mi><mml:mrow><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mi>i</mml:mi><mml:mi>g</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> to fixed dimensions d &#x003D; (224,224) pixels.
<disp-formula id="eqn-4"><label>(4)</label><mml:math id="mml-eqn-4" display="block"><mml:mrow><mml:mtext>Sd</mml:mtext></mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mrow><mml:mtext>I</mml:mtext></mml:mrow><mml:mrow><mml:mrow><mml:mtext>orig</mml:mtext></mml:mrow></mml:mrow></mml:msub><mml:mo stretchy="false">)</mml:mo><mml:mo>=</mml:mo><mml:mrow><mml:mtext>&#xA0;Resize&#xA0;</mml:mtext></mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mrow><mml:mtext>I</mml:mtext></mml:mrow><mml:mrow><mml:mrow><mml:mtext>orig</mml:mtext></mml:mrow></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mo stretchy="false">(</mml:mo><mml:mn>224</mml:mn><mml:mo>,</mml:mo><mml:mn>224</mml:mn><mml:mo stretchy="false">)</mml:mo><mml:mo stretchy="false">)</mml:mo></mml:math></disp-formula></p></list-item>
<list-item>
<p><inline-formula id="ieqn-6"><mml:math id="mml-ieqn-6"><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mi>p</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mo>&#x22C5;</mml:mo><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x003A;</mml:mo></mml:math></inline-formula> Refers to the random horizontal flip operation applied to the image with a probability <italic>p</italic> &#x003D; 0.5.
<disp-formula id="ueqn-5"><mml:math id="mml-ueqn-5" display="block"><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mi>p</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>I</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mtable columnalign="left left" rowspacing=".2em" columnspacing="1em" displaystyle="false"><mml:mtr><mml:mtd><mml:mrow><mml:mtext>Flip&#x00A0;horizontally,</mml:mtext></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mtext>with&#x00A0;probability&#x00A0;</mml:mtext></mml:mrow><mml:mi>p</mml:mi></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mi>I</mml:mi><mml:mo>,</mml:mo></mml:mtd><mml:mtd><mml:mrow><mml:mtext>with&#x00A0;probability&#x00A0;</mml:mtext></mml:mrow><mml:mn>1</mml:mn><mml:mo>&#x2212;</mml:mo><mml:mi>p</mml:mi></mml:mtd></mml:mtr></mml:mtable><mml:mo fence="true" stretchy="true" symmetric="true"></mml:mo></mml:mrow></mml:math></disp-formula></p></list-item>
<list-item>
<p><inline-formula id="ieqn-7"><mml:math id="mml-ieqn-7"><mml:msub><mml:mi>R</mml:mi><mml:mrow><mml:mi>&#x03B8;</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">(</mml:mo><mml:mo>&#x22C5;</mml:mo><mml:mo stretchy="false">)</mml:mo></mml:math></inline-formula>: Refers to the random rotation applied to the image. The angle &#x03B8; is drawn from a uniform distribution <inline-formula id="ieqn-8"><mml:math id="mml-ieqn-8"><mml:mi>&#x03B8;</mml:mi><mml:mo>&#x223C;</mml:mo><mml:mi>U</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mo>&#x2212;</mml:mo><mml:msup><mml:mn>10</mml:mn><mml:mrow><mml:mo>&#x2218;</mml:mo></mml:mrow></mml:msup><mml:mo>,</mml:mo><mml:msup><mml:mn>10</mml:mn><mml:mrow><mml:mo>&#x2218;</mml:mo></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow><mml:mo>.</mml:mo></mml:math></inline-formula>
<disp-formula id="ueqn-6"><mml:math id="mml-ueqn-6" display="block"><mml:msub><mml:mi>R</mml:mi><mml:mrow><mml:mi>&#x03B8;</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>I</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mtext>&#xA0;Rotate&#xA0;</mml:mtext></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mtext>I</mml:mtext></mml:mrow><mml:mo>,</mml:mo><mml:mi>&#x03B8;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:mi>&#x03B8;</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mo>&#x2212;</mml:mo><mml:msup><mml:mn>10</mml:mn><mml:mrow><mml:mo>&#x2218;</mml:mo></mml:mrow></mml:msup><mml:mo>,</mml:mo><mml:msup><mml:mn>10</mml:mn><mml:mrow><mml:mo>&#x2218;</mml:mo></mml:mrow></mml:msup><mml:mo>]</mml:mo></mml:mrow></mml:math></disp-formula></p></list-item>
<list-item>
<p><inline-formula id="ieqn-9"><mml:math id="mml-ieqn-9"><mml:mrow><mml:mtext>Tensor</mml:mtext></mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mo>&#x22C5;</mml:mo><mml:mo stretchy="false">)</mml:mo></mml:math></inline-formula>: Refers to the conversion to a tensor, where the pixel values are scaled from the range [0, 255] to [0, 1].
<disp-formula id="eqn-5"><label>(5)</label><mml:math id="mml-eqn-5" display="block"><mml:mrow><mml:mtext>Tensor</mml:mtext></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mtext>I</mml:mtext></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mi>I</mml:mi><mml:mn>255</mml:mn></mml:mfrac></mml:math></disp-formula></p></list-item>
<list-item>
<p><inline-formula id="ieqn-10"><mml:math id="mml-ieqn-10"><mml:mi>N</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>I</mml:mi><mml:mo>,</mml:mo><mml:mi>&#x03BC;</mml:mi><mml:mo>,</mml:mo><mml:mi>&#x03C3;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x003A;</mml:mo></mml:math></inline-formula> Refers to the normalization operation applied to the image I, where each pixel is normalized using the mean &#x03BC; &#x003D; [0.165] and standard deviation &#x03C3; &#x003D; [0.176] for a single grayscale channel.
<disp-formula id="eqn-6"><label>(6)</label><mml:math id="mml-eqn-6" display="block"><mml:mi>N</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>I</mml:mi><mml:mo>,</mml:mo><mml:mi>&#x03BC;</mml:mi><mml:mo>,</mml:mo><mml:mi>&#x03C3;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>I</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mi>&#x03BC;</mml:mi></mml:mrow><mml:mi>&#x03C3;</mml:mi></mml:mfrac></mml:math></disp-formula></p></list-item>
</list></p>
<p>In summary, the uniform preprocessing pipeline applied to all four models, RegNet, ResNet, DenseNet, and EfficientNet, serves as a cornerstone of this research. By ensuring that the same transformations and data preparation steps are consistently applied across all the architectures, we eliminate any potential biases that could arise from model-specific preprocessing. This consistency guarantees that the results obtained from each model are directly comparable. Any observed differences in performance can thus be attributed solely to the inherent architecture of the models rather than to variations in the input preparation. This careful control of preprocessing conditions is a critical contribution of our work, ensuring a fair and rigorous evaluation of each model&#x2019;s capabilities.</p>
<p>The unified preprocessing pipeline can be expressed mathematically as follows:
<disp-formula id="eqn-7"><label>(7)</label><mml:math id="mml-eqn-7" display="block"><mml:msub><mml:mi>I</mml:mi><mml:mrow><mml:mi>P</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mi>N</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>T</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>R</mml:mi><mml:mrow><mml:mi>&#x03B8;</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mi>p</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>S</mml:mi><mml:mrow><mml:mi>d</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>I</mml:mi><mml:mrow><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mi>i</mml:mi><mml:mi>g</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">)</mml:mo><mml:mo stretchy="false">)</mml:mo><mml:mo stretchy="false">)</mml:mo><mml:mo stretchy="false">)</mml:mo><mml:mo>,</mml:mo><mml:mi>&#x03BC;</mml:mi><mml:mo>,</mml:mo><mml:mi>&#x03C3;</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:math></disp-formula>where:
<list list-type="bullet">
<list-item>
<p><inline-formula id="ieqn-11"><mml:math id="mml-ieqn-11"><mml:msub><mml:mi>I</mml:mi><mml:mrow><mml:mi>P</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> represents the image after all preprocessing steps.</p></list-item>
<list-item>
<p><inline-formula id="ieqn-12"><mml:math id="mml-ieqn-12"><mml:msub><mml:mi>S</mml:mi><mml:mrow><mml:mi>d</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>I</mml:mi><mml:mrow><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mi>i</mml:mi><mml:mi>g</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:math></inline-formula> refers to resizing the original image <inline-formula id="ieqn-13"><mml:math id="mml-ieqn-13"><mml:msub><mml:mi>I</mml:mi><mml:mrow><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mi>i</mml:mi><mml:mi>g</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> to fixed dimensions of 224 &#x00D7; 224 pixels.</p></list-item>
<list-item>
<p><inline-formula id="ieqn-14"><mml:math id="mml-ieqn-14"><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mi>p</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">(</mml:mo><mml:mo>&#x22C5;</mml:mo><mml:mo stretchy="false">)</mml:mo></mml:math></inline-formula> denotes the random horizontal flip operation applied with a probability <italic>p</italic> &#x003D; 0.5.</p></list-item>
<list-item>
<p><inline-formula id="ieqn-15"><mml:math id="mml-ieqn-15"><mml:msub><mml:mi>R</mml:mi><mml:mrow><mml:mi>&#x03B8;</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">(</mml:mo><mml:mo>&#x22C5;</mml:mo><mml:mo stretchy="false">)</mml:mo></mml:math></inline-formula> represents the random rotation applied, where the angle &#x03B8; is drawn from a uniform distribution <inline-formula id="ieqn-16"><mml:math id="mml-ieqn-16"><mml:mo stretchy="false">[</mml:mo><mml:mo>&#x2212;</mml:mo><mml:msup><mml:mn>10</mml:mn><mml:mrow><mml:mo>&#x2218;</mml:mo></mml:mrow></mml:msup><mml:mo>,</mml:mo><mml:msup><mml:mn>10</mml:mn><mml:mrow><mml:mo>&#x2218;</mml:mo></mml:mrow></mml:msup><mml:mo stretchy="false">]</mml:mo></mml:math></inline-formula>.</p></list-item>
<list-item>
<p>Tensor(<inline-formula id="ieqn-17"><mml:math id="mml-ieqn-17"><mml:mo>&#x22C5;</mml:mo></mml:math></inline-formula>) refers to the conversion to a tensor, scaling pixel values from the range [0, 255] to [0, 1].</p></list-item>
<list-item>
<p><inline-formula id="ieqn-18"><mml:math id="mml-ieqn-18"><mml:mi>N</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>I</mml:mi><mml:mo>,</mml:mo><mml:mi>&#x03BC;</mml:mi><mml:mo>,</mml:mo><mml:mi>&#x03C3;</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> represents normalization using the mean &#x03BC; &#x003D; [0.165] and standard deviation &#x03C3; &#x003D; [0.176] for a single grayscale channel.</p></list-item>
</list></p>
<p>Following this preprocessing pipeline, the image is passed through each of the four models:
<disp-formula id="eqn-8"><label>(8)</label><mml:math id="mml-eqn-8" display="block"><mml:msubsup><mml:mi>I</mml:mi><mml:mrow><mml:mi>o</mml:mi><mml:mi>u</mml:mi><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>R</mml:mi><mml:mi>e</mml:mi><mml:mi>g</mml:mi><mml:mi>N</mml:mi><mml:mi>e</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mi>R</mml:mi><mml:mi>e</mml:mi><mml:mi>g</mml:mi><mml:mi>N</mml:mi><mml:mi>e</mml:mi><mml:mi>t</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>I</mml:mi><mml:mrow><mml:mi>P</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:math></disp-formula>
<disp-formula id="eqn-9"><label>(9)</label><mml:math id="mml-eqn-9" display="block"><mml:msubsup><mml:mi>I</mml:mi><mml:mrow><mml:mi>o</mml:mi><mml:mi>u</mml:mi><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>R</mml:mi><mml:mi>e</mml:mi><mml:mi>s</mml:mi><mml:mi>N</mml:mi><mml:mi>e</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mi>R</mml:mi><mml:mi>e</mml:mi><mml:mi>s</mml:mi><mml:mi>N</mml:mi><mml:mi>e</mml:mi><mml:mi>t</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>I</mml:mi><mml:mrow><mml:mi>P</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:math></disp-formula>
<disp-formula id="eqn-10"><label>(10)</label><mml:math id="mml-eqn-10" display="block"><mml:msubsup><mml:mi>I</mml:mi><mml:mrow><mml:mi>o</mml:mi><mml:mi>u</mml:mi><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mtext>DenseNet</mml:mtext></mml:mrow></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mrow><mml:mtext>DenseNet</mml:mtext></mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>I</mml:mi><mml:mrow><mml:mi>P</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:math></disp-formula>
<disp-formula id="eqn-11"><label>(11)</label><mml:math id="mml-eqn-11" display="block"><mml:msubsup><mml:mi>I</mml:mi><mml:mrow><mml:mi>o</mml:mi><mml:mi>u</mml:mi><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mtext>EfficientNet</mml:mtext></mml:mrow></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mrow><mml:mtext>EfficientNet</mml:mtext></mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>I</mml:mi><mml:mrow><mml:mi>P</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:math></disp-formula>
<list list-type="bullet">
<list-item>
<p><inline-formula id="ieqn-19"><mml:math id="mml-ieqn-19"><mml:msubsup><mml:mi>I</mml:mi><mml:mrow><mml:mi>o</mml:mi><mml:mi>u</mml:mi><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>M</mml:mi><mml:mi>o</mml:mi><mml:mi>d</mml:mi><mml:mi>e</mml:mi><mml:mi>l</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> represents the final output from each model.</p></list-item>
<list-item>
<p>RegNet, ResNet, DenseNet and EfficientNet are the four different models applied in this study.</p></list-item>
</list></p>
<p>By applying this uniform preprocessing pipeline followed by each specific model architecture, we ensure that any observed differences in performance are a direct result of the model&#x2019;s architecture, as the preprocessing steps are consistent across all the models.</p>
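Eq. (7) can be read as a composition of the operations defined above. The following sketch chains the steps in one function, with the random operations fixed to the identity for determinism (the simplifications and the function name are ours, not the paper's implementation):

```python
def preprocess(img, mean=0.165, std=0.176):
    """I_pre = N(Tensor(R_theta(F_p(S_d(I)))), mu, sigma), with the random flip
    and rotation treated as the identity in this deterministic sketch."""
    # S_d: nearest-neighbor resize to 224 x 224
    in_h, in_w = len(img), len(img[0])
    img = [[img[(y * in_h) // 224][(x * in_w) // 224] for x in range(224)]
           for y in range(224)]
    # F_p and R_theta: identity under the deterministic settings above.
    # Tensor: scale [0, 255] -> [0, 1]; N: standardize via (x - mean) / std.
    return [[((v / 255.0) - mean) / std for v in row] for row in img]

out = preprocess([[255, 0], [0, 255]])
```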
</sec>
</sec>
<sec id="s3_3">
<label>3.3</label>
<title>Convolutional Neural Network (CNN) Architectures</title>
<p>The decision-making process in CNNs such as RegNet, ResNet, DenseNet, and EfficientNet is driven by their ability to learn and extract hierarchical features from input data. Each architecture introduces distinct structural innovations that optimize feature extraction and classification, making these models well suited to complex image-based diagnostics. The task in this study is to classify four distinct stages of AD: nondemented, very mild dementia, mild dementia, and moderate dementia; these four labels constitute the classes variable. The final layer of each neural network model (RegNet, ResNet, DenseNet, and EfficientNet) is therefore adjusted to output predictions for these four classes, ensuring that the models can differentiate between the stages of dementia on the basis of the input medical images.</p>
<sec id="s3_3_1">
<label>3.3.1</label>
<title>ResNet</title>
<p>ResNet [<xref ref-type="bibr" rid="ref-16">16</xref>] is a deep CNN that employs residual connections to address the vanishing gradient problem, enabling the training of deeper architectures without performance loss. This design makes ResNet particularly effective for tasks such as medical image classification.</p>
<p>In this study, a pretrained ResNet50 was fine-tuned to classify the stages of AD. The final fully connected layer was modified to output predictions for four classes, allowing the model to distinguish between different stages of dementia.</p>
</sec>
<sec id="s3_3_2">
<label>3.3.2</label>
<title>RegNet</title>
<p>RegNet [<xref ref-type="bibr" rid="ref-14">14</xref>] is a CNN architecture designed to optimize network design spaces for improved scalability and performance. Unlike traditional architectures, RegNet employs a regularized approach to adjust model size and complexity, resulting in a more balanced and efficient network. This scalability makes RegNet particularly effective for a wide range of tasks, including medical image classification, while maintaining computational efficiency.</p>
<p>In this study, a pretrained RegNet-Y-400MF model was fine-tuned for AD classification. The final fully connected layer was adapted to output predictions for four distinct classes, allowing the model to effectively differentiate between the various stages of dementia.</p>
</sec>
<sec id="s3_3_3">
<label>3.3.3</label>
<title>DenseNet</title>
<p>DenseNet [<xref ref-type="bibr" rid="ref-18">18</xref>] is a CNN that connects each layer to every other layer in a feedforward manner, ensuring maximum information and gradient flow throughout the network. This dense connectivity mitigates the vanishing gradient problem and improves feature reuse, leading to efficient learning with fewer parameters. Its ability to enhance feature propagation makes it particularly effective for tasks such as medical image classification.</p>
<p>In this study, a pretrained DenseNet121 model was fine-tuned to classify AD stages. The original classifier layer was replaced to output predictions for four classes, enabling the model to effectively distinguish between different stages of dementia.</p>
</sec>
<sec id="s3_3_4">
<label>3.3.4</label>
<title>EfficientNet</title>
<p>EfficientNet [<xref ref-type="bibr" rid="ref-20">20</xref>] is a CNN that scales the dimensions of depth, width, and resolution in a balanced manner via a compound scaling method. This approach enables the network to achieve better accuracy and efficiency than traditional models do, making it highly effective for tasks requiring computational efficiency, such as medical image classification.</p>
<p>In this study, a pretrained EfficientNet-B0 model was fine-tuned to classify AD stages. The original classifier layer was modified to output predictions for four classes, allowing the model to accurately differentiate between the various stages of dementia.</p>
</sec>
<sec id="s3_3_5">
<label>3.3.5</label>
<title>Final Layer Modification for AD Classification</title>
<p>For each model (ResNet, RegNet, DenseNet, and EfficientNet), the final fully connected layer was replaced to output four classes corresponding to AD stages (mild dementia, moderate dementia, nondemented, very mild dementia). This modification ensures that each model can classify input images into one of these four stages.</p>
<p>In all cases, the output layer was followed by a softmax activation function to convert logits into class probabilities, ensuring that the sum of the probabilities was 1 for each prediction.</p>
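<p>For illustration only, the softmax step described above can be sketched in NumPy as follows; the logit values are hypothetical and do not come from the trained models:</p>

```python
import numpy as np

# Hypothetical logits for the four AD classes:
# [mild dementia, moderate dementia, nondemented, very mild dementia]
logits = np.array([2.0, -1.0, 0.5, 0.1])

def softmax(z):
    """Numerically stable softmax: subtract the max before exponentiating."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

probs = softmax(logits)
# The resulting class probabilities are nonnegative and sum to 1,
# as required for each prediction.
```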
</sec>
<sec id="s3_3_6">
<label>3.3.6</label>
<title>Training Parameters</title>
<p>The key training parameters applied consistently across all CNN models (ResNet, RegNet, DenseNet, and EfficientNet) are summarized in <xref ref-type="table" rid="table-2">Table 2</xref>.</p>
<table-wrap id="table-2">
<label>Table 2</label>
<caption>
<title>Training parameters for the CNN models</title>
</caption>
<table>
<colgroup>
<col/>
<col/>
</colgroup>
<thead>
<tr>
<th>Parameter</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<td>Epochs</td>
<td>40 (with early stopping after 6 epochs of no improvement)</td>
</tr>
<tr>
<td>Optimizer</td>
<td>AdamW (Adam with weight decay)</td>
</tr>
<tr>
<td>Batch size</td>
<td>32</td>
</tr>
<tr>
<td>Activation</td>
<td>GELU (in MLP layers), softmax (for probabilities)</td>
</tr>
<tr>
<td>Loss</td>
<td>CrossEntropyLoss (with class weights)</td>
</tr>
</tbody>
</table>
</table-wrap>
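<p>The early-stopping criterion in <xref ref-type="table" rid="table-2">Table 2</xref> (patience of 6 epochs without improvement) can be sketched as the following minimal class; the class name and interface are illustrative assumptions, not the authors&#x2019; implementation:</p>

```python
class EarlyStopping:
    """Stop training when the validation loss fails to improve
    for `patience` consecutive epochs (Table 2 uses patience = 6).
    Illustrative sketch, not the authors' code."""

    def __init__(self, patience=6):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True to stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

# Usage: two improving epochs, then six epochs with no improvement.
stopper = EarlyStopping(patience=6)
stop_flags = [stopper.step(v) for v in [1.0, 0.9] + [0.9] * 6]
# Training stops only after the sixth non-improving epoch.
```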
</sec>
</sec>
<sec id="s3_4">
<label>3.4</label>
<title>EfficientNet-B0 and GradCAM Integration</title>
<p>The EfficientNet-B0 model was further enhanced through the integration of GradCAM for dynamic weighting during training. This approach introduces an innovative dynamic weighting mechanism into the loss function, guided by the spatial importance of image regions as determined by GradCAM.</p>
<p>Among the tested models (ResNet, RegNet, DenseNet, and EfficientNet), EfficientNet demonstrated the best performance in terms of precision, recall, F1 score, and accuracy. As a result, it was selected for the integration of GradCAM (dynamic weighting) to further improve its performance during training.</p>
<p>By integrating GradCAM into the EfficientNet-B0 architecture, the model dynamically adjusts its focus to critical brain regions, improving both interpretability and diagnostic accuracy. The framework of our proposed method, including the preprocessing pipeline, the EfficientNet model, and the GradCAM integration for dynamic spatial focus, is illustrated in <xref ref-type="fig" rid="fig-2">Fig. 2</xref>.</p>
<fig id="fig-2">
<label>Figure 2</label>
<caption>
<title>Dynamic GradNet framework for enhanced spatial attention</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_62923-fig-2.tif"/>
</fig>
<sec id="s3_4_1">
<label>3.4.1</label>
<title>Model and Training Parameters</title>
<p>To ensure a fair comparison between the models, the following training parameters were applied consistently across all models, with the only difference being the loss function used in the EfficientNet-B0 model (<xref ref-type="table" rid="table-3">Table 3</xref>).</p>
<table-wrap id="table-3">
<label>Table 3</label>
<caption>
<title>Model and training parameters for baseline models <italic>vs</italic>. dynamic weighting (EfficientNet-B0)</title>
</caption>
<table>
<colgroup>
<col/>
<col/>
<col/>
</colgroup>
<thead>
<tr>
<th>Parameter</th>
<th align="center">Baseline models (ResNet, RegNet, DenseNet, and EfficientNet)</th>
<th align="center">Dynamic weighting (EfficientNet-B0)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Epochs</td>
<td>40 (with early stopping after 6 epochs of no improvement)</td>
<td>40 (with early stopping after 6 epochs of no improvement)</td>
</tr>
<tr>
<td>Optimizer</td>
<td>AdamW (Adam with weight decay)</td>
<td>AdamW (Adam with weight decay)</td>
</tr>
<tr>
<td>Batch size</td>
<td>32</td>
<td>32</td>
</tr>
<tr>
<td>Activation</td>
<td>GELU (in MLP layers), softmax (for probabilities)</td>
<td>GELU (in MLP layers), softmax (for probabilities)</td>
</tr>
<tr>
<td>Loss</td>
<td>CrossEntropyLoss (with class weights)</td>
<td>CrossEntropyLoss (with dynamic weighting based on GradCAM)</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>The key distinction is that while the baseline models (ResNet, RegNet, DenseNet, and EfficientNet) utilize CrossEntropyLoss with class weights to manage class imbalance, the EfficientNet-B0 model employs CrossEntropyLoss with dynamic weighting based on GradCAM.</p>
</sec>
<sec id="s3_4_2">
<label>3.4.2</label>
<title>GradCAM Integration</title>
<p>GradCAM was applied to the final convolutional block of EfficientNet-B0, before the classifier, to generate heatmaps that highlight the most relevant image regions for each class prediction. These heatmaps guided the dynamic weighting in the loss function, ensuring that the model focuses on the most critical regions of the image during training. The 3D GradCAM visualization further highlights the most significant brain regions involved in the classification process, providing a crucial understanding of the spatial focus in the model&#x2019;s predictions (<xref ref-type="fig" rid="fig-3">Fig. 3</xref>).</p>
<fig id="fig-3">
<label>Figure 3</label>
<caption>
<title>Critical brain regions mapped by 3D GradCAM</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_62923-fig-3.tif"/>
</fig>
</sec>
<sec id="s3_4_3">
<label>3.4.3</label>
<title>Heatmap Generation</title>
<p>For each input image, GradCAM generated heatmaps that highlighted the most relevant features for classification. These heatmaps were computed from the gradients of the output with respect to the activation maps in the target layer. The activations were weighted by the gradients and averaged spatially to produce the heatmap, which indicates the importance of each region in the input image for the model&#x2019;s decision.</p>
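<p>The heatmap computation described above can be sketched in NumPy following the standard GradCAM formulation (channel-wise spatial averaging of the gradients, a gradient-weighted sum of activation maps, and a ReLU); the array shapes and random inputs are hypothetical, not taken from the trained model:</p>

```python
import numpy as np

def gradcam_heatmap(activations, gradients):
    """Compute a GradCAM heatmap.

    activations: (C, H, W) feature maps from the target convolutional layer.
    gradients:   (C, H, W) gradients of the class score w.r.t. those maps.
    Follows the standard GradCAM formulation; illustrative sketch only.
    """
    # Channel importance scores: spatial average of the gradients.
    alphas = gradients.mean(axis=(1, 2))                       # shape (C,)
    # Gradient-weighted sum of activation maps; ReLU keeps positive evidence.
    cam = np.maximum((alphas[:, None, None] * activations).sum(axis=0), 0.0)
    # Normalize to [0, 1] for visualization (guard against an all-zero map).
    if cam.max() > 0:
        cam = cam / cam.max()
    return cam

rng = np.random.default_rng(0)
acts = rng.random((8, 7, 7))   # hypothetical 8-channel 7x7 feature maps
grads = rng.random((8, 7, 7))
heatmap = gradcam_heatmap(acts, grads)
```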
</sec>
<sec id="s3_4_4">
<label>3.4.4</label>
<title>Dynamic Weight Calculation</title>
<p>The GradCAM heatmaps were averaged across spatial dimensions to compute scalar weights representing the importance of the highlighted regions. These weights were then applied to the model outputs to scale them, ensuring that the network focused more on the regions of greater importance. This process allows the model to adapt its predictions by focusing on the most important parts of the image while ignoring less critical regions.</p>
</sec>
<sec id="s3_4_5">
<label>3.4.5</label>
<title>Loss Function Calculation</title>
<p>To enhance the model&#x2019;s ability to focus on critical brain regions during training, GradCAM-based dynamic weighting was integrated into the CrossEntropy Loss function. The GradCAM heatmaps generated for each class were used to compute spatial importance weights, which were then applied to the model&#x2019;s predictions before loss computation. This mechanism ensures that regions with higher clinical significance contribute more to the training process, allowing the model to refine its decision-making based on essential anatomical features.</p>
</sec>
<sec id="s3_4_6">
<label>3.4.6</label>
<title>Dynamic Weight Computation from GradCAM</title>
<p>After each forward pass, the GradCAM activation maps were computed from the final convolutional layer. These heatmaps highlight the most influential areas in the brain MRI scans, with activation intensities reflecting their relative importance for each class.</p>
<p>For each class c, the weight <inline-formula id="ieqn-20"><mml:math id="mml-ieqn-20"><mml:msub><mml:mi>w</mml:mi><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> was computed by spatially averaging the heatmap activations, formulated as follows:
<disp-formula id="eqn-12"><label>(12)</label><mml:math id="mml-eqn-12" display="block"><mml:msub><mml:mi>w</mml:mi><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mi>Z</mml:mi></mml:mfrac><mml:munder><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:munder><mml:msubsup><mml:mi>&#x03B1;</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msubsup><mml:msubsup><mml:mi>A</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msubsup></mml:math></disp-formula>where:
<list list-type="bullet">
<list-item>
<p><inline-formula id="ieqn-21"><mml:math id="mml-ieqn-21"><mml:msubsup><mml:mi>A</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> represents the activation at location (<italic>i</italic>,<italic>j</italic>) in the GradCAM for class c.</p></list-item>
<list-item>
<p><inline-formula id="ieqn-22"><mml:math id="mml-ieqn-22"><mml:msubsup><mml:mi>&#x03B1;</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> is the importance score (derived from backpropagated gradients) at location (<italic>i</italic>,<italic>j</italic>).</p></list-item>
<list-item>
<p><italic>Z</italic> is a normalization factor ensuring stability in weight scaling.</p></list-item>
</list></p>
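<p>Eq. (12) translates directly into NumPy; note that taking <italic>Z</italic> as the number of spatial locations is an assumption made here for the sketch, since the text only describes <italic>Z</italic> as a normalization factor:</p>

```python
import numpy as np

def class_weight(alpha, A):
    """Eq. (12): w_c = (1/Z) * sum_{i,j} alpha_{i,j}^c * A_{i,j}^c.

    alpha: (H, W) importance scores from backpropagated gradients.
    A:     (H, W) GradCAM activations for class c.
    Z is taken as the number of spatial locations H*W (an assumption;
    the paper only calls Z a normalization factor).
    """
    Z = alpha.size
    return float((alpha * A).sum() / Z)

# Hypothetical 4x4 maps: every term alpha*A equals 1.0, so w_c = 1.0.
alpha = np.full((4, 4), 0.5)
A = np.full((4, 4), 2.0)
w = class_weight(alpha, A)
```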
</sec>
<sec id="s3_4_7">
<label>3.4.7</label>
<title>Weighted Model Output Modification</title>
<p>Before computing the loss, the predicted probability for each class <inline-formula id="ieqn-23"><mml:math id="mml-ieqn-23"><mml:msub><mml:mrow><mml:mover><mml:mi>y</mml:mi><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> was adjusted dynamically using the computed GradCAM weight <inline-formula id="ieqn-24"><mml:math id="mml-ieqn-24"><mml:msub><mml:mi>w</mml:mi><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>, ensuring the model prioritizes the most informative regions:
<disp-formula id="eqn-13"><label>(13)</label><mml:math id="mml-eqn-13" display="block"><mml:msub><mml:mrow><mml:mover><mml:mi>y</mml:mi><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>c</mml:mi><mml:mrow><mml:mtext>&#xA0;weighted</mml:mtext></mml:mrow></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mi>w</mml:mi><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msub><mml:mo>&#x22C5;</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mover><mml:mi>y</mml:mi><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></disp-formula>where:
<list list-type="bullet">
<list-item>
<p><inline-formula id="ieqn-25"><mml:math id="mml-ieqn-25"><mml:msub><mml:mrow><mml:mover><mml:mi>y</mml:mi><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> is the raw model prediction for class c.</p></list-item>
<list-item>
<p><inline-formula id="ieqn-26"><mml:math id="mml-ieqn-26"><mml:msub><mml:mi>w</mml:mi><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> amplifies or attenuates <inline-formula id="ieqn-27"><mml:math id="mml-ieqn-27"><mml:msub><mml:mrow><mml:mover><mml:mi>y</mml:mi><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> based on spatial importance.</p></list-item>
</list></p>
</sec>
<sec id="s3_4_8">
<label>3.4.8</label>
<title>Final Loss Function with GradCAM Integration</title>
<p>Using the modified outputs, the CrossEntropy Loss was calculated as:
<disp-formula id="eqn-14"><label>(14)</label><mml:math id="mml-eqn-14" display="block"><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">dynamic</mml:mtext></mml:mrow></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mo>&#x2212;</mml:mo><mml:munder><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:munder><mml:msub><mml:mi>w</mml:mi><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msub><mml:mo>&#x22C5;</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msub><mml:mi>log</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mrow><mml:mover><mml:mi>y</mml:mi><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>c</mml:mi><mml:mrow><mml:mtext>&#xA0;weighted</mml:mtext></mml:mrow></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula>where:
<list list-type="bullet">
<list-item>
<p><inline-formula id="ieqn-28"><mml:math id="mml-ieqn-28"><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> represents the ground-truth label for class c.</p></list-item>
<list-item>
<p><inline-formula id="ieqn-29"><mml:math id="mml-ieqn-29"><mml:msub><mml:mrow><mml:mover><mml:mi>y</mml:mi><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>c</mml:mi><mml:mrow><mml:mtext>&#xA0;weighted</mml:mtext></mml:mrow></mml:mrow></mml:msub></mml:math></inline-formula> is the GradCAM weighted model output for class c.</p></list-item>
<list-item>
<p><inline-formula id="ieqn-30"><mml:math id="mml-ieqn-30"><mml:msub><mml:mi>w</mml:mi><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> adjusts the loss contribution dynamically based on spatial importance.</p></list-item>
</list></p>
<p>This integration ensures that the loss function penalizes errors more in clinically relevant regions, guiding the model to prioritize learning from the most diagnostically significant areas in MRI images.</p>
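<p>A minimal NumPy sketch of the dynamically weighted loss in Eq. (14) follows; the epsilon guarding the logarithm and the example vectors are illustrative assumptions. With all weights equal to 1, the expression reduces to the standard cross-entropy loss:</p>

```python
import numpy as np

def dynamic_loss(y_true, y_prob, w):
    """Eq. (14): L_dynamic = -sum_c w_c * y_c * log(w_c * y_hat_c).

    y_true: one-hot ground-truth vector.
    y_prob: predicted class probabilities (softmax outputs).
    w:      per-class GradCAM weights w_c from Eq. (12).
    A small epsilon stabilizes the logarithm; illustrative sketch only.
    """
    eps = 1e-12
    return float(-(w * y_true * np.log(w * y_prob + eps)).sum())

# Hypothetical example: true class is index 1, predicted with prob. 0.7.
y_true = np.array([0.0, 1.0, 0.0, 0.0])
y_prob = np.array([0.1, 0.7, 0.1, 0.1])
w = np.ones(4)  # unit weights recover plain cross-entropy, -log(0.7)
loss = dynamic_loss(y_true, y_prob, w)
```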
</sec>
<sec id="s3_4_9">
<label>3.4.9</label>
<title>Representation of Spatial Importance through GradCAM Weights</title>
<p>The weights obtained from GradCAM represent the contribution of each activation map to the model&#x2019;s final decision. These weights are used to determine the spatial importance of regions within the image.</p>
<p>The activation maps produced by the convolutional layers contain spatial information about the input image. Each pixel in the activation map corresponds to a specific region in the original image. By weighting the activation map using the gradients with respect to the class score, the regions that contribute more significantly to the model&#x2019;s decision are highlighted. Regions with higher gradients are assigned greater importance in the final prediction, which is reflected in the GradCAM heatmap.</p>
</sec>
<sec id="s3_4_10">
<label>3.4.10</label>
<title>GradCAM in Training</title>
<p>When GradCAM is used to guide the model during training, the following steps are applied:
<list list-type="bullet">
<list-item>
<p><bold>GradCAM Calculation</bold></p>
<p>After each training step, GradCAM is computed on the basis of the model&#x2019;s current outputs. The gradients with respect to the target layer&#x2019;s activations are calculated and used to generate the heatmap.</p></list-item>
<list-item>
<p><bold>Modifying Outputs with the Weights</bold></p>
<p>The model&#x2019;s outputs are adjusted using the weights derived from the GradCAM heatmaps. These weights emphasize regions identified as having greater importance by GradCAM, thereby ensuring that these regions have greater influence during the loss calculation, guiding the model to focus on these areas.</p></list-item>
<list-item>
<p><bold>Loss Calculation</bold></p>
<p>The loss is calculated on the basis of the modified outputs. By incorporating dynamic weights, the model is encouraged to focus on the regions identified as important by GradCAM. This approach helps the model learn more effectively by concentrating on the most relevant spatial features during training.</p></list-item>
</list></p>
<p>By incorporating GradCAM-derived weights during training, the model is guided to focus on the regions of the input image most critical for classification, enhancing its ability to generalize and make accurate decisions. This approach is particularly beneficial in tasks where spatial information is crucial for understanding the image content.</p>
</sec>
</sec>
<sec id="s3_5">
<label>3.5</label>
<title>Computational Requirements</title>
<p>The proposed model was trained on a system equipped with an Intel Core i9-13900 processor, 32 GB RAM, and an NVIDIA GTX 1650 (4 GB GDDR6) GPU. This hardware configuration was sufficient to train the model effectively while maintaining a balance between computational efficiency and performance. The model&#x2019;s architecture was designed to be lightweight, ensuring feasibility for deployment in real-world medical imaging applications without requiring high-end computational resources. Furthermore, the model can be integrated into clinical workflows using moderate hardware specifications, making it accessible for a broader range of medical institutions.</p>
</sec>
<sec id="s3_6">
<label>3.6</label>
<title>Summary</title>
<p>This methodology demonstrates how EfficientNet-B0, enhanced with GradCAM for dynamic weighting, was trained for AD classification. The consistent preprocessing pipeline applied across all models ensures a fair comparison, while the dynamic weighting mechanism enables the EfficientNet-B0 model to focus on the most relevant areas of the images, improving its performance on this critical task. The integration of GradCAM into the loss function represents a novel approach for improving model interpretability and accuracy in medical image classification.</p>
</sec>
<sec id="s3_7">
<label>3.7</label>
<title>Evaluation</title>
<p>Below are the mathematical equations and explanations for the metrics used to evaluate the performance of the proposed model. These metrics provide insights into the model&#x2019;s ability to correctly classify instances in a multiclass setting.</p>
<p>The precision metric measures the proportion of true positives (correctly predicted positive instances) out of all the predicted positives. It is given by the following formula:
<disp-formula id="eqn-15"><label>(15)</label><mml:math id="mml-eqn-15" display="block"><mml:mi>P</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:mi>i</mml:mi><mml:mi>s</mml:mi><mml:mi>i</mml:mi><mml:mi>o</mml:mi><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi><mml:mo>+</mml:mo><mml:mi>F</mml:mi><mml:mi>P</mml:mi></mml:mrow></mml:mfrac></mml:math></disp-formula>where <italic>TP</italic> refers to the number of true positive cases and <italic>FP</italic> refers to the number of false positive cases.</p>
<p>The recall metric, also known as sensitivity, measures the proportion of actual positive cases that the model correctly identified. It is calculated as:
<disp-formula id="eqn-16"><label>(16)</label><mml:math id="mml-eqn-16" display="block"><mml:mi>R</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:mi>a</mml:mi><mml:mi>l</mml:mi><mml:mi>l</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi><mml:mo>+</mml:mo><mml:mi>F</mml:mi><mml:mi>N</mml:mi></mml:mrow></mml:mfrac></mml:math></disp-formula>where <italic>FN</italic> refers to the number of false negatives, i.e., actual positive cases that were incorrectly predicted as negative.</p>
<p>The F1 score is the harmonic mean of precision and recall, providing a balance between the two metrics, especially in cases of class imbalance. It is given by the following formula:
<disp-formula id="eqn-17"><label>(17)</label><mml:math id="mml-eqn-17" display="block"><mml:mrow><mml:mtext>F</mml:mtext></mml:mrow><mml:mn>1</mml:mn><mml:mrow><mml:mtext>&#xA0;score</mml:mtext></mml:mrow><mml:mo>=</mml:mo><mml:mn>2</mml:mn><mml:mo>&#x00D7;</mml:mo><mml:mfrac><mml:mrow><mml:mrow><mml:mtext>Precision</mml:mtext></mml:mrow><mml:mo>&#x00D7;</mml:mo><mml:mrow><mml:mtext>Recall</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mtext>Precision</mml:mtext></mml:mrow><mml:mo>+</mml:mo><mml:mrow><mml:mtext>Recall</mml:mtext></mml:mrow></mml:mrow></mml:mfrac></mml:math></disp-formula></p>
<p>The accuracy metric measures the overall proportion of correct predictions (both positive and negative) out of the total number of predictions. It is computed as: <disp-formula id="eqn-18"><label>(18)</label><mml:math id="mml-eqn-18" display="block"><mml:mrow><mml:mtext>Accuracy</mml:mtext></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi><mml:mo>+</mml:mo><mml:mi>T</mml:mi><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi><mml:mo>+</mml:mo><mml:mi>T</mml:mi><mml:mi>N</mml:mi><mml:mo>+</mml:mo><mml:mi>F</mml:mi><mml:mi>P</mml:mi><mml:mo>+</mml:mo><mml:mi>F</mml:mi><mml:mi>N</mml:mi></mml:mrow></mml:mfrac></mml:math></disp-formula>where <italic>TN</italic> represents the number of true negatives, i.e., actual negative cases that were correctly predicted.</p>
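<p>Eqs. (15)&#x2013;(18) can be sketched as plain Python functions operating on confusion-matrix counts; the counts in the usage example are hypothetical, representing one class in a one-vs-rest evaluation:</p>

```python
def precision(tp, fp):
    """Eq. (15): TP / (TP + FP)."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Eq. (16): TP / (TP + FN)."""
    return tp / (tp + fn)

def f1_score(p, r):
    """Eq. (17): harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)

def accuracy(tp, tn, fp, fn):
    """Eq. (18): (TP + TN) / all predictions."""
    return (tp + tn) / (tp + tn + fp + fn)

# Hypothetical counts for one class (one-vs-rest).
tp, tn, fp, fn = 90, 880, 10, 20
p = precision(tp, fp)           # 90/100  = 0.9
r = recall(tp, fn)              # 90/110
f1 = f1_score(p, r)             # harmonic mean of p and r
acc = accuracy(tp, tn, fp, fn)  # 970/1000 = 0.97
```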
</sec>
</sec>
<sec id="s4">
<label>4</label>
<title>Results</title>
<sec id="s4_1">
<label>4.1</label>
<title>Comparative Analysis of Model Performance</title>
<p>In this section, we compare the performance of five models (ResNet, RegNet, DenseNet, EfficientNet, and EfficientNet with Dynamic GradNet) across four categories of AD diagnosis: mild dementia, moderate dementia, nondemented, and very mild dementia. The evaluation is based on key metrics, namely accuracy, precision, recall, and the F1 score, providing insight into each model&#x2019;s strengths at different stages of dementia.</p>
<sec id="s4_1_1">
<label>4.1.1</label>
<title>Performance in the Mild Dementia Class</title>
<p><xref ref-type="table" rid="table-4">Table 4</xref> highlights the performance of the different models in classifying the mild dementia category. Among these models, EfficientNet with Dynamic GradNet yielded the highest overall F1 score (0.9881), reflecting a balanced trade-off between precision and recall. However, EfficientNet achieved the highest accuracy (0.9976), indicating its superior ability to correctly classify both positive and negative samples in this category.</p>
<table-wrap id="table-4">
<label>Table 4</label>
<caption>
<title>Performance in the mild dementia class</title>
</caption>
<table>
<colgroup>
<col/>
<col/>
<col/>
<col/>
<col/>
</colgroup>
<thead>
<tr>
<th>Model</th>
<th>Precision</th>
<th>Recall</th>
<th>F1 score</th>
<th>Accuracy</th>
</tr>
</thead>
<tbody>
<tr>
<td>ResNet</td>
<td>0.9012</td>
<td>0.9873</td>
<td>0.9423</td>
<td>0.9930</td>
</tr>
<tr>
<td>RegNet</td>
<td>0.9126</td>
<td>0.9982</td>
<td>0.9535</td>
<td>0.9944</td>
</tr>
<tr>
<td>DenseNet</td>
<td>0.8499</td>
<td>0.9809</td>
<td>0.9107</td>
<td>0.9889</td>
</tr>
<tr>
<td>EfficientNet</td>
<td>0.9656</td>
<td>0.9945</td>
<td>0.9799</td>
<td>0.9976</td>
</tr>
<tr>
<td>EfficientNet with dynamic GradNet</td>
<td>0.9915</td>
<td>0.9847</td>
<td>0.9881</td>
<td>0.9847</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="s4_1_2">
<label>4.1.2</label>
<title>Performance in the Moderate Dementia Class</title>
<p>In the moderate dementia class, DenseNet achieved the highest accuracy (0.9999) and F1 score (0.9910), indicating its strong performance in correctly identifying both true positives and true negatives. Moreover, EfficientNet with Dynamic GradNet achieved the highest precision (0.9946), highlighting its robustness in minimizing false positives in this class, as shown in <xref ref-type="table" rid="table-5">Table 5</xref>.</p>
<table-wrap id="table-5">
<label>Table 5</label>
<caption>
<title>Performance in the moderate dementia class</title>
</caption>
<table>
<colgroup>
<col/>
<col/>
<col/>
<col/>
<col/>
</colgroup>
<thead>
<tr>
<th>Model</th>
<th>Precision</th>
<th>Recall</th>
<th>F1 score</th>
<th>Accuracy</th>
</tr>
</thead>
<tbody>
<tr>
<td>ResNet</td>
<td>0.9764</td>
<td>0.9952</td>
<td>0.9857</td>
<td>0.9998</td>
</tr>
<tr>
<td>RegNet</td>
<td>0.9459</td>
<td>0.9906</td>
<td>0.9677</td>
<td>0.9996</td>
</tr>
<tr>
<td>DenseNet</td>
<td>0.9865</td>
<td>0.9955</td>
<td>0.9910</td>
<td>0.9999</td>
</tr>
<tr>
<td>EfficientNet</td>
<td>0.9587</td>
<td>0.9858</td>
<td>0.9721</td>
<td>0.9997</td>
</tr>
<tr>
<td>EfficientNet with dynamic GradNet</td>
<td>0.9946</td>
<td>0.9941</td>
<td>0.9949</td>
<td>0.9940</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="s4_1_3">
<label>4.1.3</label>
<title>Performance in the Nondemented Class</title>
<p>For the nondemented class, EfficientNet with Dynamic GradNet achieved the highest recall (0.9950), F1 score (0.9949), and accuracy (0.9953), making it the most effective model for correctly identifying nondemented individuals. ResNet and EfficientNet exhibited the highest precision values (0.9973 and 0.9971, respectively), indicating their strong ability to minimize false positives in this category, as shown in <xref ref-type="table" rid="table-6">Table 6</xref>.</p>
<table-wrap id="table-6">
<label>Table 6</label>
<caption>
<title>Performance in the nondemented class</title>
</caption>
<table>
<colgroup>
<col/>
<col/>
<col/>
<col/>
<col/>
</colgroup>
<thead>
<tr>
<th>Model</th>
<th>Precision</th>
<th>Recall</th>
<th>F1 score</th>
<th>Accuracy</th>
</tr>
</thead>
<tbody>
<tr>
<td>ResNet</td>
<td>0.9973</td>
<td>0.9259</td>
<td>0.9602</td>
<td>0.9404</td>
</tr>
<tr>
<td>RegNet</td>
<td>0.9967</td>
<td>0.9461</td>
<td>0.9708</td>
<td>0.9557</td>
</tr>
<tr>
<td>DenseNet</td>
<td>0.9901</td>
<td>0.9485</td>
<td>0.9689</td>
<td>0.9526</td>
</tr>
<tr>
<td>EfficientNet</td>
<td>0.9971</td>
<td>0.9737</td>
<td>0.9853</td>
<td>0.9773</td>
</tr>
<tr>
<td>EfficientNet with dynamic GradNet</td>
<td>0.9945</td>
<td>0.9950</td>
<td>0.9949</td>
<td>0.9953</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="s4_1_4">
<label>4.1.4</label>
<title>Performance in the Very Mild Dementia Class</title>
<p>In the very mild dementia class, EfficientNet with Dynamic GradNet outperformed all the other models, with the highest precision (0.9779), F1 score (0.9773), and accuracy (0.9767). This finding indicates the superior ability of EfficientNet with Dynamic GradNet to accurately detect early-stage dementia while maintaining a lower rate of false positives, as shown in <xref ref-type="table" rid="table-7">Table 7</xref>.</p>
<table-wrap id="table-7">
<label>Table 7</label>
<caption>
<title>Performance in the very mild dementia class</title>
</caption>
<table>
<colgroup>
<col/>
<col/>
<col/>
<col/>
<col/>
</colgroup>
<thead>
<tr>
<th>Model</th>
<th>Precision</th>
<th>Recall</th>
<th>F1 score</th>
<th>Accuracy</th>
</tr>
</thead>
<tbody>
<tr>
<td>ResNet</td>
<td>0.7505</td>
<td>0.9871</td>
<td>0.8527</td>
<td>0.9459</td>
</tr>
<tr>
<td>RegNet</td>
<td>0.8112</td>
<td>0.9839</td>
<td>0.8893</td>
<td>0.9611</td>
</tr>
<tr>
<td>DenseNet</td>
<td>0.8320</td>
<td>0.9563</td>
<td>0.8898</td>
<td>0.9624</td>
</tr>
<tr>
<td>EfficientNet</td>
<td>0.8951</td>
<td>0.9873</td>
<td>0.9389</td>
<td>0.9796</td>
</tr>
<tr>
<td>EfficientNet with dynamic GradNet</td>
<td>0.9779</td>
<td>0.9761</td>
<td>0.9773</td>
<td>0.9767</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
</sec>
<sec id="s4_2">
<label>4.2</label>
<title>Comprehensive Model Comparison</title>
<p><xref ref-type="fig" rid="fig-4">Fig. 4</xref> presents the training and validation loss and accuracy curves over epochs for EfficientNet with Dynamic GradNet. These curves illustrate the model&#x2019;s convergence behavior, highlighting its stability and generalization across the training and validation sets.</p>
<fig id="fig-4">
<label>Figure 4</label>
<caption>
<title>Training and validation performance curves for EfficientNet with dynamic GradNet</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_62923-fig-4.tif"/>
</fig>
<sec id="s4_2_1">
<label>4.2.1</label>
<title>Average Performance across All Classes</title>
<p><xref ref-type="table" rid="table-8">Table 8</xref> presents a summary of the average performance of each model across all classes. EfficientNet with Dynamic GradNet yielded the highest average scores across all the metrics, including precision (0.9896), recall (0.9878), F1 score (0.9887), and accuracy (0.9903). This finding indicates that EfficientNet with Dynamic GradNet not only performs well on individual classes but also generalizes effectively across the entire dataset, as illustrated in <xref ref-type="fig" rid="fig-5">Fig. 5</xref>.</p>
<table-wrap id="table-8">
<label>Table 8</label>
<caption>
<title>Average performance across all classes</title>
</caption>
<table>
<colgroup>
<col/>
<col/>
<col/>
<col/>
<col/>
</colgroup>
<thead>
<tr>
<th>Model</th>
<th>Avg. precision</th>
<th>Avg. recall</th>
<th>Avg. F1 score</th>
<th>Avg. accuracy</th>
</tr>
</thead>
<tbody>
<tr>
<td>ResNet</td>
<td>0.9063</td>
<td>0.9714</td>
<td>0.9453</td>
<td>0.9710</td>
</tr>
<tr>
<td>RegNet</td>
<td>0.9166</td>
<td>0.9797</td>
<td>0.9528</td>
<td>0.9765</td>
</tr>
<tr>
<td>DenseNet</td>
<td>0.9146</td>
<td>0.9728</td>
<td>0.9437</td>
<td>0.9740</td>
</tr>
<tr>
<td>EfficientNet</td>
<td>0.9553</td>
<td>0.9853</td>
<td>0.9700</td>
<td>0.9874</td>
</tr>
<tr>
<td>EfficientNet with dynamic GradNet</td>
<td>0.9896</td>
<td>0.9878</td>
<td>0.9887</td>
<td>0.9903</td>
</tr>
</tbody>
</table>
</table-wrap><fig id="fig-5">
<label>Figure 5</label>
<caption>
<title>Average performance across all classes</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_62923-fig-5.tif"/>
</fig>
<p>The integration of EfficientNet with Dynamic GradNet has notably enhanced the model&#x2019;s ability to focus on critical regions, as illustrated in <xref ref-type="fig" rid="fig-6">Fig. 6</xref>. The GradCAM visualizations demonstrate a progressive refinement in attention across four stages of Alzheimer&#x2019;s disease progression (moderate dementia, mild dementia, very mild dementia, and nondemented). Each stage is represented by a sequence of four images, showing how the model&#x2019;s focus improves over training, gradually capturing more relevant brain regions in MRI scans. <xref ref-type="fig" rid="fig-6">Fig. 6</xref> also presents failure cases from the other models, which failed to emphasize critical areas and missed brain regions essential to Alzheimer&#x2019;s diagnosis. These failure cases, drawn from models without dynamic weighting, highlight the limitations of such models and further underscore the effectiveness of the proposed approach in improving robustness and reliability for MRI-based Alzheimer&#x2019;s disease classification.</p>
<fig id="fig-6">
<label>Figure 6</label>
<caption>
<title>Model focus analysis</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_62923-fig-6.tif"/>
</fig>
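The paper applies GradCAM rather than defining it, so as context for readers, the core computation behind heatmaps of this kind can be sketched in a few lines. This is a minimal NumPy rendering of the standard Grad-CAM formulation, not the study&#x2019;s actual implementation; the function name and array shapes are illustrative assumptions.

```python
import numpy as np

def grad_cam(feature_maps: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Compute a Grad-CAM heatmap for one convolutional layer.

    feature_maps: (K, H, W) activations A_k of the chosen layer.
    gradients:    (K, H, W) gradients of the class score w.r.t. A_k.
    Returns an (H, W) map normalized to [0, 1].
    """
    # Channel weights alpha_k: global-average-pool the gradients.
    weights = gradients.mean(axis=(1, 2))                            # (K,)
    # Weighted sum of the feature maps, then ReLU.
    cam = np.maximum((weights[:, None, None] * feature_maps).sum(axis=0), 0.0)
    # Normalize for visualization; guard against an all-zero map.
    if cam.max() > 0:
        cam = cam / cam.max()
    return cam
```

Upsampling the resulting map to the input resolution and overlaying it on the MRI slice yields visualizations of the kind shown in Fig. 6.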
</sec>
<sec id="s4_2_2">
<label>4.2.2</label>
<title>Performance in the Mild Dementia Class with Standard Deviation</title>
<p>The results for the mild dementia class are further detailed with standard deviations, highlighting the consistency of each model. EfficientNet with Dynamic GradNet not only achieved the highest precision and F1 score but also maintained a low variance, indicating its stable and reliable performance across different test samples.</p>
<p><xref ref-type="table" rid="table-9">Table 9</xref> reports the standard deviation (STD) for each metric, providing insight into the consistency and reliability of the model&#x2019;s performance. The low variance across multiple runs further supports the robustness of the proposed EfficientNet with Dynamic GradNet framework.</p>
<table-wrap id="table-9">
<label>Table 9</label>
<caption>
<title>Performance in the mild dementia class with standard deviation</title>
</caption>
<table>
<colgroup>
<col/>
<col/>
<col/>
<col/>
<col/>
</colgroup>
<thead>
<tr>
<th>Model</th>
<th>Precision (&#x00B1; STD)</th>
<th>Recall (&#x00B1; STD)</th>
<th>F1 score (&#x00B1; STD)</th>
<th>Accuracy (&#x00B1; STD)</th>
</tr>
</thead>
<tbody>
<tr>
<td>ResNet</td>
<td>0.9012 (&#x00B1; 0.03)</td>
<td>0.9873 (&#x00B1; 0.01)</td>
<td>0.9423 (&#x00B1; 0.02)</td>
<td>0.9930 (&#x00B1; 0.01)</td>
</tr>
<tr>
<td>RegNet</td>
<td>0.9126 (&#x00B1; 0.02)</td>
<td>0.9982 (&#x00B1; 0.01)</td>
<td>0.9535 (&#x00B1; 0.02)</td>
<td>0.9944 (&#x00B1; 0.01)</td>
</tr>
<tr>
<td>DenseNet</td>
<td>0.8499 (&#x00B1; 0.05)</td>
<td>0.9809 (&#x00B1; 0.02)</td>
<td>0.9107 (&#x00B1; 0.03)</td>
<td>0.9889 (&#x00B1; 0.01)</td>
</tr>
<tr>
<td>EfficientNet</td>
<td>0.9656 (&#x00B1; 0.01)</td>
<td>0.9945 (&#x00B1; 0.01)</td>
<td>0.9799 (&#x00B1; 0.01)</td>
<td>0.9976 (&#x00B1; 0.01)</td>
</tr>
<tr>
<td>EfficientNet with dynamic GradNet</td>
<td>0.9915 (&#x00B1; 0.01)</td>
<td>0.9847 (&#x00B1; 0.01)</td>
<td>0.9881 (&#x00B1; 0.01)</td>
<td>0.9847 (&#x00B1; 0.01)</td>
</tr>
</tbody>
</table>
</table-wrap>
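The (&#x00B1; STD) entries in Table 9 summarize variability over repeated evaluation runs. As a minimal sketch of how such entries are typically produced (the per-run values below are hypothetical, not the study&#x2019;s raw data):

```python
import numpy as np

# Hypothetical precision values from five evaluation runs of one model
# on one class (illustrative numbers, not the study's raw data).
runs = np.array([0.990, 0.992, 0.993, 0.991, 0.989])

mean = runs.mean()
std = runs.std(ddof=1)  # sample standard deviation across runs
print(f"precision: {mean:.4f} (± {std:.4f})")
```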
</sec>
<sec id="s4_2_3">
<label>4.2.3</label>
<title>Sensitivity and Specificity Analysis</title>
<p><xref ref-type="table" rid="table-10">Table 10</xref> compares each model&#x2019;s sensitivity (recall) and specificity across the four classes. EfficientNet with Dynamic GradNet performed consistently well across all categories, achieving particularly high specificity in the nondemented and moderate dementia classes, which indicates a strong ability to minimize false positives. It likewise achieved high sensitivity in detecting true positives, especially in the very mild dementia class, which is crucial for early diagnosis. The improvements in specificity and sensitivity achieved by EfficientNet with Dynamic GradNet over the other models, including DenseNet, RegNet, and the baseline EfficientNet, are illustrated in <xref ref-type="fig" rid="fig-7">Figs. 7</xref> through <xref ref-type="fig" rid="fig-14">14</xref>.</p>
<table-wrap id="table-10">
<label>Table 10</label>
<caption>
<title>Sensitivity (Recall) and specificity across classes</title>
</caption>
<table>
<colgroup>
<col/>
<col/>
<col/>
<col/>
</colgroup>
<thead>
<tr>
<th>Model</th>
<th>Class</th>
<th>Sensitivity (Recall)</th>
<th>Specificity</th>
</tr>
</thead>
<tbody>
<tr>
<td rowspan="4">ResNet</td>
<td>Mild dementia</td>
<td>0.9873</td>
<td>0.9602</td>
</tr>
<tr>
<td>Moderate dementia</td>
<td>0.9952</td>
<td>0.9791</td>
</tr>
<tr>
<td>Nondemented</td>
<td>0.9259</td>
<td>0.9985</td>
</tr>
<tr>
<td>Very mild dementia</td>
<td>0.9871</td>
<td>0.8541</td>
</tr>
<tr>
<td rowspan="4">RegNet</td>
<td>Mild dementia</td>
<td>0.9982</td>
<td>0.9714</td>
</tr>
<tr>
<td>Moderate dementia</td>
<td>0.9906</td>
<td>0.9843</td>
</tr>
<tr>
<td>Nondemented</td>
<td>0.9461</td>
<td>0.9976</td>
</tr>
<tr>
<td>Very mild dementia</td>
<td>0.9839</td>
<td>0.9022</td>
</tr>
<tr>
<td rowspan="4">DenseNet</td>
<td>Mild dementia</td>
<td>0.9809</td>
<td>0.9480</td>
</tr>
<tr>
<td>Moderate dementia</td>
<td>0.9955</td>
<td>0.9890</td>
</tr>
<tr>
<td>Nondemented</td>
<td>0.9485</td>
<td>0.9967</td>
</tr>
<tr>
<td>Very mild dementia</td>
<td>0.9563</td>
<td>0.8890</td>
</tr>
<tr>
<td rowspan="4">EfficientNet</td>
<td>Mild dementia</td>
<td>0.9945</td>
<td>0.9823</td>
</tr>
<tr>
<td>Moderate dementia</td>
<td>0.9858</td>
<td>0.9897</td>
</tr>
<tr>
<td>Nondemented</td>
<td>0.9737</td>
<td>0.9972</td>
</tr>
<tr>
<td>Very mild dementia</td>
<td>0.9873</td>
<td>0.9208</td>
</tr>
<tr>
<td rowspan="4">EfficientNet with dynamic GradNet</td>
<td>Mild dementia</td>
<td>0.9847</td>
<td>0.9951</td>
</tr>
<tr>
<td>Moderate dementia</td>
<td>0.9941</td>
<td>0.9964</td>
</tr>
<tr>
<td>Nondemented</td>
<td>0.9950</td>
<td>0.9991</td>
</tr>
<tr>
<td>Very mild dementia</td>
<td>0.9761</td>
<td>0.9333</td>
</tr>
</tbody>
</table>
</table-wrap>
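The sensitivity and specificity values in Table 10 follow the standard one-vs-rest confusion-matrix definitions: sensitivity = TP/(TP + FN) and specificity = TN/(TN + FP). A minimal sketch, in which the helper name and toy labels are illustrative assumptions rather than the study&#x2019;s data:

```python
import numpy as np

def sens_spec(y_true, y_pred, cls):
    """One-vs-rest sensitivity (recall) and specificity for class `cls`."""
    tp = np.sum((y_true == cls) & (y_pred == cls))  # true positives
    fn = np.sum((y_true == cls) & (y_pred != cls))  # false negatives
    tn = np.sum((y_true != cls) & (y_pred != cls))  # true negatives
    fp = np.sum((y_true != cls) & (y_pred == cls))  # false positives
    return tp / (tp + fn), tn / (tn + fp)

# Toy labels over four classes (0..3); illustrative only.
y_true = np.array([0, 0, 1, 1, 2, 2, 3, 3])
y_pred = np.array([0, 0, 1, 2, 2, 2, 3, 1])
sens, spec = sens_spec(y_true, y_pred, cls=2)  # sens = 1.0, spec = 5/6
```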
<fig id="fig-7">
<label>Figure 7</label>
<caption>
<title>Specificity EfficientNet vs. EfficientNet with dynamic GradNet</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_62923-fig-7.tif"/>
</fig>
<fig id="fig-8">
<label>Figure 8</label>
<caption>
<title>Sensitivity EfficientNet vs. EfficientNet with dynamic GradNet</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_62923-fig-8.tif"/>
</fig>
<fig id="fig-9">
<label>Figure 9</label>
<caption>
<title>Specificity DenseNet vs. EfficientNet with Dynamic GradNet</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_62923-fig-9.tif"/>
</fig>
<fig id="fig-10">
<label>Figure 10</label>
<caption>
<title>Sensitivity DenseNet vs. EfficientNet with Dynamic GradNet</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_62923-fig-10.tif"/>
</fig>
<fig id="fig-11">
<label>Figure 11</label>
<caption>
<title>Specificity RegNet vs. EfficientNet with dynamic GradNet</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_62923-fig-11.tif"/>
</fig>
<fig id="fig-12">
<label>Figure 12</label>
<caption>
<title>Sensitivity RegNet vs. EfficientNet with dynamic GradNet</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_62923-fig-12.tif"/>
</fig>
<fig id="fig-13">
<label>Figure 13</label>
<caption>
<title>Specificity ResNet vs. EfficientNet with dynamic GradNet</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_62923-fig-13.tif"/>
</fig>
<fig id="fig-14">
<label>Figure 14</label>
<caption>
<title>Sensitivity ResNet vs. EfficientNet with dynamic GradNet</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_62923-fig-14.tif"/>
</fig>
</sec>
</sec>
</sec>
<sec id="s5">
<label>5</label>
<title>Discussion</title>
<sec id="s5_1">
<label>5.1</label>
<title>Comparative Analysis of Model Performance: ResNet, RegNet, DenseNet, EfficientNet, and EfficientNet with Dynamic GradNet</title>
<p>The comparative analysis yields significant insights into the performance of ResNet, RegNet, DenseNet, EfficientNet, and the proposed EfficientNet with Dynamic GradNet in the context of AD classification. These models were evaluated across four distinct categories: mild dementia, moderate dementia, nondemented, and very mild dementia. The discussion below highlights key trends and observations, focusing on the tradeoffs among precision, recall, F1 score, and accuracy, as well as the models&#x2019; generalizability.</p>
<sec id="s5_1_1">
<label>5.1.1</label>
<title>Mild Dementia Classification</title>
<p>In the mild dementia class, EfficientNet with Dynamic GradNet achieved the highest F1 score (0.9881), indicating a well-balanced performance between precision and recall. This outcome suggests that the proposed model is highly effective at maintaining a low false positive rate while still capturing the majority of true positives. Interestingly, EfficientNet achieved the highest accuracy (0.9976), indicating its overall proficiency in correct classification. However, its lower F1 score (0.9799) relative to that of EfficientNet with Dynamic GradNet suggests that EfficientNet may exhibit a slight imbalance between precision and recall, possibly leading to a greater number of false negatives in this category.</p>
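As a quick arithmetic check, the F1 score is the harmonic mean of precision and recall, which is why it penalizes any imbalance between the two; substituting the precision and recall reported for EfficientNet with Dynamic GradNet in this class reproduces the stated value:

```python
# Precision and recall for EfficientNet with Dynamic GradNet on the
# mild dementia class, as reported in Table 9.
precision, recall = 0.9915, 0.9847

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.9881, matching the reported F1 score
```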
<p>The consistently high performance of EfficientNet with Dynamic GradNet can be attributed to the integration of the Dynamic GradNet mechanism, which likely enhances the model&#x2019;s robustness by dynamically adjusting gradient updates, leading to more stable and accurate predictions. The marginal yet meaningful improvements in precision and recall over other models highlight the advantage of this approach in handling the complex and nuanced features associated with early-stage dementia.</p>
</sec>
<sec id="s5_1_2">
<label>5.1.2</label>
<title>Moderate Dementia Classification</title>
<p>For the moderate dementia class, EfficientNet with Dynamic GradNet achieved the highest F1 score (0.9949), demonstrating a well-balanced performance between precision and recall. Although DenseNet attained the highest accuracy (0.9999), the F1 score remains a more comprehensive indicator of a model&#x2019;s robustness, particularly in cases with class imbalance.</p>
<p>The results suggest that while DenseNet excels in capturing both positive and negative samples with high accuracy, EfficientNet with Dynamic GradNet may be more suitable when the primary concern is reducing false positive rates, which is often a priority in clinical settings. This tradeoff between accuracy and precision must be carefully considered when selecting a model for deployment in real-world diagnostic systems.</p>
</sec>
<sec id="s5_1_3">
<label>5.1.3</label>
<title>Nondemented Classification</title>
<p>In the nondemented class, EfficientNet with Dynamic GradNet demonstrated superior performance across multiple metrics, achieving the highest recall (0.9950), F1 score (0.9949), and accuracy (0.9953). This finding underscores its ability to accurately identify individuals who are not suffering from dementia, with minimal false negatives and false positives. High recall in this category is crucial, as misclassifying nondemented individuals could result in unnecessary anxiety and further diagnostic procedures.</p>
<p>Interestingly, ResNet and EfficientNet achieved the highest precision values (0.9973 and 0.9971, respectively), indicating their effectiveness in minimizing false positives in this class. However, their lower recall values relative to EfficientNet with Dynamic GradNet suggest a potential shortcoming in correctly identifying all nondemented individuals, possibly leading to a greater number of false negatives. This trade-off between precision and recall is again evident, with EfficientNet with Dynamic GradNet offering a more balanced approach in this category.</p>
</sec>
<sec id="s5_1_4">
<label>5.1.4</label>
<title>Very Mild Dementia Classification</title>
<p>Early detection of dementia is crucial for timely intervention, and in the very mild dementia class, EfficientNet with Dynamic GradNet outperformed all other models in terms of precision (0.9779), F1 score (0.9773), and accuracy (0.9767). This finding highlights the model&#x2019;s ability to accurately detect early-stage dementia while maintaining a low false positive rate. Given the subtle nature of early dementia symptoms, this performance is particularly noteworthy, as it demonstrates the model&#x2019;s ability to distinguish between very mild cognitive impairment and normal aging processes.</p>
<p>Moreover, EfficientNet with Dynamic GradNet exhibited exceptional performance in the very mild dementia category, surpassing all other models in terms of precision, recall, and F1 score. Its remarkable ability to detect early-stage dementia with minimal false positives and near-perfect sensitivity makes it a highly reliable tool for early diagnosis. The model&#x2019;s superior F1 score highlights its robustness in accurately identifying true cases while maintaining a low misclassification rate. As a result, EfficientNet with Dynamic GradNet stands out as the most powerful and efficient model for detecting this crucial early stage of dementia.</p>
<p>The comparatively weak performance of the other models in this class, particularly ResNet (F1 score: 0.8527) and RegNet (F1 score: 0.8893), suggests that these architectures may struggle with the fine-grained distinctions required for early dementia classification. The enhanced performance of EfficientNet with Dynamic GradNet can likely be attributed to its dynamic gradient adjustment mechanism, which allows for better generalizability in categories with subtle and overlapping features.</p>
</sec>
</sec>
<sec id="s5_2">
<label>5.2</label>
<title>Comprehensive Model Comparison</title>
<sec id="s5_2_1">
<label>5.2.1</label>
<title>Generalization across All Classes</title>
<p>When comparing the models&#x2019; average performance across all classes, it is evident that EfficientNet with Dynamic GradNet stands out, achieving the highest average precision (0.9896), recall (0.9878), F1 score (0.9887), and accuracy (0.9903). This finding suggests that the proposed model not only excels in individual categories but also generalizes effectively across the entire dataset. The consistency in its performance, coupled with low variance in key metrics, indicates that EfficientNet with Dynamic GradNet offers a highly reliable solution for multiclass dementia classification.</p>
<p>The superior performance of EfficientNet with Dynamic GradNet across all classes can be attributed to its ability to adapt dynamically to different data distributions and class imbalances, which is a common challenge in medical datasets. The model&#x2019;s ability to maintain high sensitivity (recall) and specificity across all categories further reinforces its potential as a robust tool for clinical applications, where both high true positive rates and low false positive rates are critical.</p>
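The class-averaged metrics discussed here correspond to macro averaging: an unweighted mean over the four classes, so each class contributes equally regardless of its sample count. Using the per-class F1 scores quoted in Section 5.1, this reproduces the reported average up to rounding of the per-class values:

```python
import numpy as np

# Per-class F1 scores for EfficientNet with Dynamic GradNet,
# as quoted in the per-class discussions of Section 5.1.
f1_per_class = {
    "mild dementia": 0.9881,
    "moderate dementia": 0.9949,
    "nondemented": 0.9949,
    "very mild dementia": 0.9773,
}

# Macro average: unweighted mean over classes.
macro_f1 = np.mean(list(f1_per_class.values()))
print(round(macro_f1, 4))  # 0.9888 (reported average F1: 0.9887)
```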
</sec>
<sec id="s5_2_2">
<label>5.2.2</label>
<title>Sensitivity and Specificity</title>
<p>The sensitivity (recall) and specificity analysis across classes further highlights the strengths of EfficientNet with Dynamic GradNet. For the nondemented and moderate dementia classes, the model achieved the highest specificity values (0.9991 and 0.9964, respectively), indicating its strong ability to minimize false positives.</p>
<p>This balance between sensitivity and specificity underscores the clinical applicability of EfficientNet with Dynamic GradNet, as it has the ability to correctly identify true positives while minimizing false positives, thereby reducing the risk of misdiagnosis and unnecessary treatments.</p>
<p>The proposed EfficientNet with Dynamic GradNet model consistently outperformed existing architectures across multiple dementia classification tasks. Its superior precision, recall, F1 score, and accuracy&#x2014;coupled with its ability to generalize well across different classes&#x2014;make it an ideal candidate for clinical deployment in AD diagnosis. The model&#x2019;s dynamic gradient adjustment mechanism allows it to handle the inherent complexities and imbalances in medical datasets, ensuring both high sensitivity and specificity. These findings suggest that EfficientNet with Dynamic GradNet offers a valuable contribution to the field of medical image analysis, particularly in the early detection of dementia, where accurate and timely diagnosis is paramount.</p>
</sec>
</sec>
<sec id="s5_3">
<label>5.3</label>
<title>Generalizability of Dynamic GradNet</title>
<p>The proposed Dynamic GradNet model has been trained and evaluated on MRI images from the OASIS dataset. Given that GradCAM operates on feature maps extracted from convolutional layers, the model&#x2019;s design is theoretically adaptable to other medical imaging modalities such as CT and PET scans. Convolutional neural networks (CNNs), including EfficientNet, have demonstrated flexibility in handling different imaging data, making it feasible to apply transfer learning techniques for adapting Dynamic GradNet to alternative modalities.</p>
<p>However, MRI, CT, and PET scans exhibit substantial differences in contrast, resolution, and feature representation, which may require modifications to preprocessing steps such as normalization and intensity scaling. While the core model architecture remains unchanged, adjustments to image standardization and augmentation strategies would be necessary to optimize performance across modalities. Future research may focus on fine-tuning Dynamic GradNet on multi-modal datasets and evaluating its robustness in cross-modality diagnostic settings. This extension would ensure the model&#x2019;s broader applicability in clinical decision support systems, enhancing its real-world utility beyond MRI-based Alzheimer&#x2019;s diagnosis.</p>
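The preprocessing adjustments mentioned above often begin with intensity standardization. A minimal sketch of two common options follows; the function names are illustrative, and real modality-specific pipelines (e.g., for CT Hounsfield units or PET SUV maps) involve additional steps:

```python
import numpy as np

def zscore_normalize(scan):
    """Z-score intensity normalization, a common MRI preprocessing step."""
    return (scan - scan.mean()) / (scan.std() + 1e-8)

def rescale_to_unit(scan):
    """Min-max rescaling to [0, 1]; one option for other intensity ranges."""
    lo, hi = scan.min(), scan.max()
    return (scan - lo) / (hi - lo + 1e-8)
```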
</sec>
<sec id="s5_4">
<label>5.4</label>
<title>Potential Integration into Clinical Diagnostic Workflows</title>
<p>The proposed Dynamic GradNet framework not only enhances classification performance but also represents a significant step toward integrating deep learning models into clinical diagnostic workflows. Unlike conventional post-hoc interpretability approaches, our method actively refines the learning process, ensuring that the model consistently prioritizes clinically relevant brain regions. This improvement is particularly valuable for computer-aided diagnosis (CAD) systems, where reliability and transparency are crucial for adoption in real-world medical imaging applications. By reducing false positives, particularly in early-stage dementia detection, and improving model interpretability, Dynamic GradNet aligns well with the needs of radiologists and neurologists. Furthermore, its adaptability to existing MRI-based diagnostic tools suggests potential applicability in automated diagnostic systems and future AI-assisted medical imaging software. These findings highlight the feasibility of integrating Dynamic GradNet into clinical settings, paving the way for its implementation in decision support systems for Alzheimer&#x2019;s disease diagnosis.</p>
<p>By integrating GradCAM into the training process, this framework not only improves model interpretability but also enhances its practical applicability in clinical settings, where explainability is essential for trust and adoption by healthcare professionals. This advancement facilitates its integration into decision-support systems, aiding radiologists and neurologists in Alzheimer&#x2019;s disease diagnosis.</p>
</sec>
</sec>
<sec id="s6">
<label>6</label>
<title>Conclusion and Future Work</title>
<p>This study underscores the importance of dynamic spatial focus in the diagnosis of AD and introduces the EfficientNet with Dynamic GradNet model as an effective approach for improving diagnostic accuracy and interpretability. A comparison of four state-of-the-art CNN architectures revealed that EfficientNet outperformed the others in terms of accuracy, precision, recall, and F1 score, particularly in the early stages of dementia. The integration of Dynamic GradNet further enhanced the model by enabling it to focus on the most relevant regions in medical images while reducing the impact of less critical areas such as image edges or irrelevant organ sections. This focus not only provides better interpretability but also improves the model&#x2019;s learning process during the training phase by emphasizing anatomically significant regions.</p>
<p>In medical imaging, some portions of images, such as peripheral areas or non-crucial regions, may not contribute significantly to the classification process. Dynamic GradNet ensures that the model focuses on the critical areas required for accurate diagnosis, minimizing the influence of irrelevant portions of images. This capability is essential both for enhancing model interpretability and for guiding learning in the training phase, where the model needs to prioritize the most important patterns.</p>
<p>One of the key contributions of this study is highlighting the importance of dynamic spatial focus in medical diagnostics. This approach could serve as a foundation for future studies on spatial prioritization, where precise anatomical localization plays a crucial role in improving diagnostic accuracy, particularly in neurodegenerative diseases. By demonstrating the impact of spatial focus on performance, this study opens the door for further exploration into how spatial attention mechanisms can be tailored for other medical conditions, potentially leading to more refined diagnostic tools that leverage the most critical regions in medical images.</p>
<p>In the future, several avenues can be explored. First, researchers could focus on improving the model&#x2019;s performance in classifying complex cases, such as mixed or ambiguous forms of dementia. Comparative studies with other imaging modalities, such as CT or PET, could be performed to assess the model&#x2019;s adaptability across different types of medical imaging.</p>
<p>To further evaluate the generalizability of Dynamic GradNet, future work will explore its adaptation to different imaging modalities, such as computed tomography (CT) and positron emission tomography (PET). While the model&#x2019;s architecture remains unchanged, modifications to preprocessing strategies, including normalization, intensity scaling, and domain-specific augmentations, will be considered to optimize performance across imaging techniques. Additionally, training and validation on multi-modal datasets will provide insights into the model&#x2019;s robustness and effectiveness in cross-modality diagnostic settings. This extension will help establish Dynamic GradNet as a more versatile tool for clinical decision support, ensuring its applicability beyond MRI-based Alzheimer&#x2019;s diagnosis.</p>
<p>Finally, incorporating more advanced interpretability frameworks, such as explainable AI (XAI) techniques, could provide deeper insights into the model&#x2019;s decision-making process, fostering greater trust in AI-assisted diagnostic tools.</p>
<p>Overall, EfficientNet with Dynamic GradNet has significant potential in supporting AD diagnosis. Continued refinement and innovation in this direction can further increase the accuracy and clinical applicability of this model, meaningfully contributing to the future of AI technologies in medical diagnostics. Moreover, the importance of dynamic spatial focus could serve as a key stepping stone for further studies, driving advancements in how medical imaging models prioritize and interpret critical spatial information across various diseases.</p>
</sec>
</body>
<back>
<ack>
<p>The author would like to thank the editors and reviewers for their valuable work and constructive feedback. The author also extends gratitude to the Open Access Series of Imaging Studies (OASIS) project and Kaggle for providing the OASIS MRI dataset, which includes over 80,000 brain MR images categorized by Alzheimer&#x2019;s disease progression. The author would further like to acknowledge the Deanship of Graduate Studies and Scientific Research, Taif University, for funding this work.</p>
</ack>
<sec>
<title>Funding Statement</title>
<p>This research was funded by Taif University, Saudi Arabia. The author would like to acknowledge Deanship of Graduate Studies and Scientific Research, Taif University for funding this work.</p>
</sec>
<sec sec-type="data-availability">
<title>Availability of Data and Materials</title>
<p>The dataset used in this study is publicly available. The author acknowledges the Open Access Series of Imaging Studies (OASIS) project and Kaggle for providing the OASIS MRI dataset, which includes over 80,000 brain MR images categorized by Alzheimer&#x2019;s disease progression. The dataset can be accessed at <ext-link ext-link-type="uri" xlink:href="https://www.kaggle.com/datasets/ninadaithal/imagesoasis/data">https://www.kaggle.com/datasets/ninadaithal/imagesoasis/data</ext-link> (accessed on 19 November 2024).</p>
</sec>
<sec>
<title>Ethics Approval</title>
<p>Not applicable.</p>
</sec>
<sec sec-type="COI-statement">
<title>Conflicts of Interest</title>
<p>The author declares no conflicts of interest to report regarding the present study.</p>
</sec>
<ref-list content-type="authoryear">
<title>References</title>
<ref id="ref-1"><label>[1]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Aramadaka</surname> <given-names>S</given-names></string-name>, <string-name><surname>Mannam</surname> <given-names>R</given-names></string-name>, <string-name><surname>Sankara Narayanan</surname> <given-names>R</given-names></string-name>, <string-name><surname>Bansal</surname> <given-names>A</given-names></string-name>, <string-name><surname>Yanamaladoddi</surname> <given-names>VR</given-names></string-name>, <string-name><surname>Sarvepalli</surname> <given-names>SS</given-names></string-name>, <string-name><surname>Vemula</surname> <given-names>SL</given-names></string-name></person-group>. <article-title>Neuroimaging in Alzheimer&#x2019;s disease for early diagnosis: a comprehensive review</article-title>. <source>Cureus</source>. <year>2023</year>;<volume>15</volume>(<issue>5</issue>):<fpage>e38544</fpage>. doi:<pub-id pub-id-type="doi">10.7759/cureus.38544</pub-id>; <pub-id pub-id-type="pmid">37273363</pub-id></mixed-citation></ref>
<ref id="ref-2"><label>[2]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Frisoni</surname> <given-names>GB</given-names></string-name>, <string-name><surname>Fox</surname> <given-names>NC</given-names></string-name>, <string-name><surname>Jack</surname> <given-names>CR</given-names></string-name>, <string-name><surname>Scheltens</surname> <given-names>P</given-names></string-name>, <string-name><surname>Thompson</surname> <given-names>PM</given-names></string-name></person-group>. <article-title>The clinical use of structural MRI in Alzheimer disease</article-title>. <source>Nat Rev Neurol</source>. <year>2010</year>;<volume>6</volume>(<issue>2</issue>):<fpage>67</fpage>&#x2013;<lpage>77</lpage>. doi:<pub-id pub-id-type="doi">10.1038/nrneurol.2009.215</pub-id>; <pub-id pub-id-type="pmid">20139996</pub-id></mixed-citation></ref>
<ref id="ref-3"><label>[3]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Feng</surname> <given-names>X</given-names></string-name>, <string-name><surname>Provenzano</surname> <given-names>FA</given-names></string-name>, <string-name><surname>Small</surname> <given-names>SA</given-names></string-name></person-group>. <article-title>A deep learning MRI approach outperforms other biomarkers of prodromal Alzheimer&#x2019;s disease</article-title>. <source>Alzheimer&#x2019;s Res Ther</source>. <year>2022</year>;<volume>14</volume>(<issue>1</issue>):<fpage>45</fpage>&#x2013;<lpage>5</lpage>. doi:<pub-id pub-id-type="doi">10.1186/s13195-022-00985-x</pub-id>; <pub-id pub-id-type="pmid">35351193</pub-id></mixed-citation></ref>
<ref id="ref-4"><label>[4]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Litjens</surname> <given-names>G</given-names></string-name>, <string-name><surname>Kooi</surname> <given-names>T</given-names></string-name>, <string-name><surname>Bejnordi</surname> <given-names>BE</given-names></string-name>, <string-name><surname>Setio</surname> <given-names>AA</given-names></string-name>, <string-name><surname>Ciompi</surname> <given-names>F</given-names></string-name>, <string-name><surname>Ghafoorian</surname> <given-names>M</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>A survey on deep learning in medical image analysis</article-title>. <source>Med Image Anal</source>. <year>2017</year>;<volume>42</volume>(<issue>13</issue>):<fpage>60</fpage>&#x2013;<lpage>88</lpage>. doi:<pub-id pub-id-type="doi">10.1016/j.media.2017.07.005</pub-id>; <pub-id pub-id-type="pmid">28778026</pub-id></mixed-citation></ref>
<ref id="ref-5"><label>[5]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Shen</surname> <given-names>D</given-names></string-name>, <string-name><surname>Wu</surname> <given-names>G</given-names></string-name>, <string-name><surname>Suk</surname> <given-names>HI</given-names></string-name></person-group>. <article-title>Deep learning in medical image analysis</article-title>. <source>Annu Rev Biomed Eng</source>. <year>2017</year>;<volume>19</volume>(<issue>1</issue>):<fpage>221</fpage>&#x2013;<lpage>48</lpage>. doi:<pub-id pub-id-type="doi">10.1146/annurev-bioeng-071516-044442</pub-id>; <pub-id pub-id-type="pmid">28301734</pub-id></mixed-citation></ref>
<ref id="ref-6"><label>[6]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Andrej</surname> <given-names>K</given-names></string-name>, <string-name><surname>Li</surname> <given-names>F</given-names></string-name></person-group>. <article-title>Deep visual-semantic alignments for generating image descriptions</article-title>. In: <conf-name>2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)</conf-name>; <year>2015</year>; <publisher-loc>Boston, MA, USA</publisher-loc>. p. <fpage>3128</fpage>&#x2013;<lpage>37</lpage>. doi:<pub-id pub-id-type="doi">10.1109/CVPR.2015.7298932</pub-id>.</mixed-citation></ref>
<ref id="ref-7"><label>[7]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Radosavovic</surname> <given-names>I</given-names></string-name>, <string-name><surname>Kosaraju</surname> <given-names>RP</given-names></string-name>, <string-name><surname>Girshick</surname> <given-names>R</given-names></string-name>, <string-name><surname>He</surname> <given-names>K</given-names></string-name>, <string-name><surname>Doll&#x00E1;r</surname> <given-names>P</given-names></string-name></person-group>. <article-title>Designing network design spaces</article-title>. In: <conf-name>2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)</conf-name>; <year>2020</year>; <publisher-loc>Seattle, WA, USA</publisher-loc>. p. <fpage>10428</fpage>&#x2013;<lpage>36</lpage>. doi:<pub-id pub-id-type="doi">10.1109/cvpr42600.2020.01044</pub-id>.</mixed-citation></ref>
<ref id="ref-8"><label>[8]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Maddury</surname> <given-names>S</given-names></string-name>, <string-name><surname>Desai</surname> <given-names>K</given-names></string-name></person-group>. <article-title>DeepAD: a deep learning application for predicting amyloid standardized uptake value ratio through PET for Alzheimer&#x2019;s prognosis</article-title>. <source>Front Artif Intell</source>. <year>2023</year>;<volume>6</volume>:<fpage>1091506</fpage>. doi:<pub-id pub-id-type="doi">10.3389/frai.2023.1091506</pub-id>; <pub-id pub-id-type="pmid">36815006</pub-id></mixed-citation></ref>
<ref id="ref-9"><label>[9]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>He</surname> <given-names>K</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>X</given-names></string-name>, <string-name><surname>Ren</surname> <given-names>S</given-names></string-name>, <string-name><surname>Sun</surname> <given-names>J</given-names></string-name></person-group>. <article-title>Deep residual learning for image recognition</article-title>. In: <conf-name>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)</conf-name>; <year>2016</year>; <publisher-loc>Los Alamitos, CA, USA</publisher-loc>: <publisher-name>IEEE Computer Society</publisher-name>. p. <fpage>770</fpage>&#x2013;<lpage>8</lpage>. doi:<pub-id pub-id-type="doi">10.1109/cvpr.2016.90</pub-id>.</mixed-citation></ref>
<ref id="ref-10"><label>[10]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Shi</surname> <given-names>C</given-names></string-name>, <string-name><surname>Yao</surname> <given-names>X</given-names></string-name>, <string-name><surname>Luo</surname> <given-names>S</given-names></string-name>, <string-name><surname>Zhou</surname> <given-names>L</given-names></string-name>, <string-name><surname>Wu</surname> <given-names>T</given-names></string-name></person-group>. <article-title>Classification of Alzheimer&#x2019;s disease via deep residual network</article-title>. In: <conf-name>International Conference on Image, Vision and Intelligent Systems</conf-name>; <year>2023 Aug 16&#x2013;18</year>; <publisher-loc>Baoding, China; 2024</publisher-loc>. p. <fpage>557</fpage>&#x2013;<lpage>64</lpage>. doi:<pub-id pub-id-type="doi">10.1007/978-981-97-0855-0_53</pub-id>.</mixed-citation></ref>
<ref id="ref-11"><label>[11]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Huang</surname> <given-names>G</given-names></string-name>, <string-name><surname>Liu</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Van Der Maaten</surname> <given-names>L</given-names></string-name>, <string-name><surname>Weinberger</surname> <given-names>KQ</given-names></string-name></person-group>. <article-title>Densely connected convolutional networks</article-title>. In: <conf-name>2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)</conf-name>; <year>2017</year>; <publisher-loc>Honolulu, HI, USA</publisher-loc>. doi:<pub-id pub-id-type="doi">10.1109/CVPR.2017.243</pub-id>.</mixed-citation></ref>
<ref id="ref-12"><label>[12]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zhou</surname> <given-names>T</given-names></string-name>, <string-name><surname>Ye</surname> <given-names>X</given-names></string-name>, <string-name><surname>Lu</surname> <given-names>H</given-names></string-name>, <string-name><surname>Zheng</surname> <given-names>X</given-names></string-name>, <string-name><surname>Qiu</surname> <given-names>S</given-names></string-name>, <string-name><surname>Liu</surname> <given-names>Y</given-names></string-name></person-group>. <article-title>Dense convolutional network and its application in medical image analysis</article-title>. <source>Biomed Res Int</source>. <year>2022</year>;<volume>2022</volume>(<issue>1</issue>):<fpage>2384830</fpage>. doi:<pub-id pub-id-type="doi">10.1155/2022/2384830</pub-id>; <pub-id pub-id-type="pmid">35509707</pub-id></mixed-citation></ref>
<ref id="ref-13"><label>[13]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Tan</surname> <given-names>M</given-names></string-name>, <string-name><surname>Le</surname> <given-names>Q</given-names></string-name></person-group>. <article-title>EfficientNet: rethinking model scaling for convolutional neural networks</article-title>. In: <conf-name>Proceedings of the 36th International Conference on Machine Learning (ICML)</conf-name>; <year>2019 May</year>; <publisher-loc>Long Beach, CA, USA</publisher-loc>: <publisher-name>PMLR</publisher-name>. p. <fpage>6105</fpage>&#x2013;<lpage>14</lpage>. doi:<pub-id pub-id-type="doi">10.48550/arXiv.1905.11946</pub-id>.</mixed-citation></ref>
<ref id="ref-14"><label>[14]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Hoang</surname> <given-names>GM</given-names></string-name>, <string-name><surname>Kim</surname> <given-names>UH</given-names></string-name>, <string-name><surname>Kim</surname> <given-names>JG</given-names></string-name></person-group>. <article-title>Vision transformers for the prediction of mild cognitive impairment to Alzheimer&#x2019;s disease progression using mid-sagittal sMRI</article-title>. <source>Front Aging Neurosci</source>. <year>2023</year>;<volume>15</volume>:<fpage>1102869</fpage>. doi:<pub-id pub-id-type="doi">10.3389/fnagi.2023.1102869</pub-id>; <pub-id pub-id-type="pmid">37122374</pub-id></mixed-citation></ref>
<ref id="ref-15"><label>[15]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><surname>Xie</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Yang</surname> <given-names>B</given-names></string-name>, <string-name><surname>Guan</surname> <given-names>Q</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>J</given-names></string-name>, <string-name><surname>Wu</surname> <given-names>Q</given-names></string-name>, <string-name><surname>Xia</surname> <given-names>Y</given-names></string-name></person-group>. <article-title>Attention mechanisms in medical image segmentation: a survey</article-title>. <comment>arXiv:2305.17937</comment>. <year>2023</year>.</mixed-citation></ref>
<ref id="ref-16"><label>[16]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Serrano-Pozo</surname> <given-names>A</given-names></string-name>, <string-name><surname>Frosch</surname> <given-names>MP</given-names></string-name>, <string-name><surname>Masliah</surname> <given-names>E</given-names></string-name>, <string-name><surname>Hyman</surname> <given-names>BT</given-names></string-name></person-group>. <article-title>Neuropathological alterations in Alzheimer disease</article-title>. <source>Cold Spring Harb Perspect Med</source>. <year>2011</year>;<volume>1</volume>(<issue>1</issue>):<fpage>a006189</fpage>. doi:<pub-id pub-id-type="doi">10.1101/cshperspect.a006189</pub-id>; <pub-id pub-id-type="pmid">22229116</pub-id></mixed-citation></ref>
<ref id="ref-17"><label>[17]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Liu</surname> <given-names>M</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>D</given-names></string-name>, <string-name><surname>Shen</surname> <given-names>D</given-names></string-name></person-group>. <article-title>Hierarchical fusion of features and classifier decisions for Alzheimer&#x2019;s disease diagnosis</article-title>. <source>Hum Brain Mapp</source>. <year>2015</year>;<volume>36</volume>(<issue>3</issue>):<fpage>1202</fpage>&#x2013;<lpage>16</lpage>. doi:<pub-id pub-id-type="doi">10.1002/hbm.22254</pub-id>; <pub-id pub-id-type="pmid">23417832</pub-id></mixed-citation></ref>
<ref id="ref-18"><label>[18]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Liu</surname> <given-names>M</given-names></string-name>, <string-name><surname>Cheng</surname> <given-names>D</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>K</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>Y</given-names></string-name></person-group>. <article-title>Multi-modality cascaded convolutional neural networks for Alzheimer&#x2019;s disease diagnosis</article-title>. <source>Neuroinformatics</source>. <year>2018</year>;<volume>16</volume>:<fpage>295</fpage>&#x2013;<lpage>308</lpage>. doi:<pub-id pub-id-type="doi">10.1007/s12021-018-9370-4</pub-id>; <pub-id pub-id-type="pmid">29572601</pub-id></mixed-citation></ref>
<ref id="ref-19"><label>[19]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Suk</surname> <given-names>HI</given-names></string-name>, <string-name><surname>Lee</surname> <given-names>SW</given-names></string-name>, <string-name><surname>Shen</surname> <given-names>D</given-names></string-name>, <collab>Alzheimer&#x2019;s Disease Neuroimaging Initiative</collab></person-group>. <article-title>Deep sparse multi-task learning for feature selection in Alzheimer&#x2019;s disease diagnosis</article-title>. <source>Brain Struct Funct</source>. <year>2016</year>;<volume>221</volume>(<issue>5</issue>):<fpage>2569</fpage>&#x2013;<lpage>87</lpage>. doi:<pub-id pub-id-type="doi">10.1007/s00429-015-1059-y</pub-id>; <pub-id pub-id-type="pmid">25993900</pub-id></mixed-citation></ref>
<ref id="ref-20"><label>[20]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Li</surname> <given-names>R</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>X</given-names></string-name>, <string-name><surname>Lawler</surname> <given-names>K</given-names></string-name>, <string-name><surname>Garg</surname> <given-names>S</given-names></string-name>, <string-name><surname>Bai</surname> <given-names>Q</given-names></string-name>, <string-name><surname>Alty</surname> <given-names>J</given-names></string-name></person-group>. <article-title>Applications of artificial intelligence to aid early detection of dementia: a scoping review on current capabilities and future directions</article-title>. <source>J Biomed Inform</source>. <year>2022</year>;<volume>127</volume>:<fpage>104030</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.jbi.2022.104030</pub-id>; <pub-id pub-id-type="pmid">35183766</pub-id></mixed-citation></ref>
<ref id="ref-21"><label>[21]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Fathi</surname> <given-names>S</given-names></string-name>, <string-name><surname>Ahmadi</surname> <given-names>A</given-names></string-name>, <string-name><surname>Dehnad</surname> <given-names>A</given-names></string-name>, <string-name><surname>Almasi-Dooghaee</surname> <given-names>M</given-names></string-name>, <string-name><surname>Sadegh</surname> <given-names>M</given-names></string-name>, <collab>Alzheimer&#x2019;s Disease Neuroimaging Initiative</collab></person-group>. <article-title>A deep learning-based ensemble method for early diagnosis of Alzheimer&#x2019;s disease using MRI images</article-title>. <source>Neuroinformatics</source>. <year>2024</year>;<volume>22</volume>(<issue>1</issue>):<fpage>89</fpage>&#x2013;<lpage>105</lpage>. doi:<pub-id pub-id-type="doi">10.1007/s12021-023-09646-2</pub-id>; <pub-id pub-id-type="pmid">38042764</pub-id></mixed-citation></ref>
<ref id="ref-22"><label>[22]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Yaqoob</surname> <given-names>N</given-names></string-name>, <string-name><surname>Khan</surname> <given-names>MA</given-names></string-name>, <string-name><surname>Masood</surname> <given-names>S</given-names></string-name>, <string-name><surname>Albarakati</surname> <given-names>HM</given-names></string-name>, <string-name><surname>Hamza</surname> <given-names>A</given-names></string-name>, <string-name><surname>Alhayan</surname> <given-names>F</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Prediction of Alzheimer&#x2019;s disease stages based on ResNet-Self-attention architecture with Bayesian optimization and best features selection</article-title>. <source>Front Comput Neurosci</source>. <year>2024</year>;<volume>18</volume>:<fpage>1393849</fpage>. doi:<pub-id pub-id-type="doi">10.3389/fncom.2024.1393849</pub-id>; <pub-id pub-id-type="pmid">38725868</pub-id></mixed-citation></ref>
<ref id="ref-23"><label>[23]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Agarwal</surname> <given-names>D</given-names></string-name>, <string-name><surname>Berb&#x00ED;s</surname> <given-names>MA</given-names></string-name>, <string-name><surname>Luna</surname> <given-names>A</given-names></string-name>, <string-name><surname>Lipari</surname> <given-names>V</given-names></string-name>, <string-name><surname>Ballester</surname> <given-names>JB</given-names></string-name>, <string-name><surname>de la Torre-D&#x00ED;ez</surname> <given-names>I</given-names></string-name></person-group>. <article-title>Automated medical diagnosis of Alzheimer&#x2019;s disease using an efficient net convolutional neural network</article-title>. <source>J Med Syst</source>. <year>2023</year>;<volume>47</volume>(<issue>1</issue>):<fpage>57</fpage>. doi:<pub-id pub-id-type="doi">10.1007/s10916-023-01941-4</pub-id>; <pub-id pub-id-type="pmid">37129723</pub-id></mixed-citation></ref>
<ref id="ref-24"><label>[24]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Dhinagar</surname> <given-names>NJ</given-names></string-name>, <string-name><surname>Thomopoulos</surname> <given-names>SI</given-names></string-name>, <string-name><surname>Laltoo</surname> <given-names>E</given-names></string-name>, <string-name><surname>Thompson</surname> <given-names>PM</given-names></string-name></person-group>. <article-title>Efficiently training vision transformers on structural MRI scans for Alzheimer&#x2019;s disease detection</article-title>. In: <conf-name>2023 45th Annual International Conference of the IEEE Engineering in Medicine &#x0026; Biology Society (EMBC)</conf-name>; <year>2023 Jul</year>; <publisher-loc>Sydney, Australia</publisher-loc>: <publisher-name>IEEE</publisher-name>. p. <fpage>1</fpage>&#x2013;<lpage>6</lpage>. doi:<pub-id pub-id-type="doi">10.1109/EMBC40787.2023.10341190</pub-id>; <pub-id pub-id-type="pmid">38083552</pub-id></mixed-citation></ref>
<ref id="ref-25"><label>[25]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Sun</surname> <given-names>H</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>A</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>W</given-names></string-name>, <string-name><surname>Liu</surname> <given-names>C</given-names></string-name></person-group>. <article-title>An improved deep residual network prediction model for the early diagnosis of Alzheimer&#x2019;s disease</article-title>. <source>Sensors</source>. <year>2021</year>;<volume>21</volume>(<issue>12</issue>):<fpage>4182</fpage>. doi:<pub-id pub-id-type="doi">10.3390/s21124182</pub-id>; <pub-id pub-id-type="pmid">34207145</pub-id></mixed-citation></ref>
<ref id="ref-26"><label>[26]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Hassan</surname> <given-names>N</given-names></string-name>, <string-name><surname>Miah</surname> <given-names>ASM</given-names></string-name>, <string-name><surname>Shin</surname> <given-names>J</given-names></string-name></person-group>. <article-title>Residual-based multi-stage deep learning framework for computer-aided Alzheimer&#x2019;s disease detection</article-title>. <source>J Imag</source>. <year>2024</year>;<volume>10</volume>(<issue>6</issue>):<fpage>141</fpage>. doi:<pub-id pub-id-type="doi">10.3390/jimaging10060141</pub-id>; <pub-id pub-id-type="pmid">38921618</pub-id></mixed-citation></ref>
<ref id="ref-27"><label>[27]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Gottapu</surname> <given-names>RD</given-names></string-name>, <string-name><surname>Dagli</surname> <given-names>CH</given-names></string-name></person-group>. <article-title>DenseNet for anatomical brain segmentation</article-title>. <source>Procedia Comput Sci</source>. <year>2018</year>;<volume>140</volume>(<issue>4</issue>):<fpage>179</fpage>&#x2013;<lpage>85</lpage>. doi:<pub-id pub-id-type="doi">10.1016/j.procs.2018.10.327</pub-id>.</mixed-citation></ref>
<ref id="ref-28"><label>[28]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Saleh</surname> <given-names>AW</given-names></string-name>, <string-name><surname>Gupta</surname> <given-names>G</given-names></string-name>, <string-name><surname>Khan</surname> <given-names>SB</given-names></string-name>, <string-name><surname>Alkhaldi</surname> <given-names>NA</given-names></string-name>, <string-name><surname>Verma</surname> <given-names>A</given-names></string-name></person-group>. <article-title>An Alzheimer&#x2019;s disease classification model using transfer learning DenseNet with embedded healthcare decision support system</article-title>. <source>Decis Anal J</source>. <year>2023</year>;<volume>9</volume>(<issue>45</issue>):<fpage>100348</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.dajour.2023.100348</pub-id>.</mixed-citation></ref>
<ref id="ref-29"><label>[29]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Priyadarshini</surname> <given-names>P</given-names></string-name>, <string-name><surname>Kanungo</surname> <given-names>P</given-names></string-name>, <string-name><surname>Kar</surname> <given-names>T</given-names></string-name></person-group>. <article-title>Multigrade brain tumor classification in MRI images using fine-tuned EfficientNet</article-title>. <source>e-Prime&#x2014;Adv Electr Eng Electron Energy</source>. <year>2024</year>;<volume>8</volume>(<issue>20</issue>):<fpage>100498</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.prime.2024.100498</pub-id>.</mixed-citation></ref>
<ref id="ref-30"><label>[30]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Aborokbah</surname> <given-names>M</given-names></string-name></person-group>. <article-title>Alzheimer&#x2019;s disease MRI classification using EfficientNet: a deep learning model</article-title>. In: <conf-name>2024 4th International Conference on Applied Artificial Intelligence (ICAPAI)</conf-name>; <year>2024</year>; <publisher-loc>Halden, Norway</publisher-loc>. p. <fpage>1</fpage>&#x2013;<lpage>8</lpage>. doi:<pub-id pub-id-type="doi">10.1109/ICAPAI61893.2024.10541281</pub-id>.</mixed-citation></ref>
<ref id="ref-31"><label>[31]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zhang</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Hong</surname> <given-names>D</given-names></string-name>, <string-name><surname>Mcclement</surname> <given-names>D</given-names></string-name>, <string-name><surname>Oladosu</surname> <given-names>O</given-names></string-name>, <string-name><surname>Pridham</surname> <given-names>G</given-names></string-name>, <string-name><surname>Slaney</surname> <given-names>G</given-names></string-name></person-group>. <article-title>Grad-CAM helps interpret the deep learning models trained to classify multiple sclerosis types using clinical brain magnetic resonance imaging</article-title>. <source>J Neurosci Methods</source>. <year>2021</year>;<volume>353</volume>:<fpage>109098</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.jneumeth.2021.109098</pub-id>; <pub-id pub-id-type="pmid">33582174</pub-id></mixed-citation></ref>
<ref id="ref-32"><label>[32]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Shaji</surname> <given-names>S</given-names></string-name>, <string-name><surname>Ganapathy</surname> <given-names>N</given-names></string-name>, <string-name><surname>Swaminathan</surname> <given-names>R</given-names></string-name></person-group>. <article-title>Classification of Alzheimer&#x2019;s condition using MR brain images and inception-residual network model</article-title>. <source>Curr Dir Biomed Eng</source>. <year>2021</year>;<volume>7</volume>(<issue>2</issue>):<fpage>763</fpage>&#x2013;<lpage>6</lpage>. doi:<pub-id pub-id-type="doi">10.1515/cdbme-2021-2195</pub-id>.</mixed-citation></ref>
<ref id="ref-33"><label>[33]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Krishnan</surname> <given-names>D</given-names></string-name>, <string-name><surname>Bishnoi</surname> <given-names>A</given-names></string-name>, <string-name><surname>Bansal</surname> <given-names>S</given-names></string-name>, <string-name><surname>Ravi</surname> <given-names>V</given-names></string-name>, <string-name><surname>Ravi</surname> <given-names>P</given-names></string-name></person-group>. <article-title>Enhancing classification of Alzheimer&#x2019;s disease using spatial attention mechanism</article-title>. <source>Open Neuroimag J</source>. <year>2024</year>;(<issue>1</issue>):<fpage>17</fpage>&#x2013;<lpage>7</lpage>. doi:<pub-id pub-id-type="doi">10.2174/0118744400305746240626043912</pub-id>.</mixed-citation></ref>
<ref id="ref-34"><label>[34]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Sun</surname> <given-names>H</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>A</given-names></string-name>, <string-name><surname>He</surname> <given-names>S</given-names></string-name></person-group>. <article-title>Temporal and spatial analysis of Alzheimer&#x2019;s disease based on an improved convolutional neural network and a resting-state FMRI brain functional network</article-title>. <source>Int J Environ Res Public Health</source>. <year>2022</year>;<volume>19</volume>(<issue>8</issue>):<fpage>4508</fpage>. doi:<pub-id pub-id-type="doi">10.3390/ijerph19084508</pub-id>; <pub-id pub-id-type="pmid">35457373</pub-id></mixed-citation></ref>
<ref id="ref-35"><label>[35]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Marcus</surname> <given-names>DS</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>TH</given-names></string-name>, <string-name><surname>Parker</surname> <given-names>J</given-names></string-name>, <string-name><surname>Csernansky</surname> <given-names>JG</given-names></string-name>, <string-name><surname>Morris</surname> <given-names>JC</given-names></string-name>, <string-name><surname>Buckner</surname> <given-names>RL</given-names></string-name></person-group>. <article-title>Open Access Series of Imaging Studies (OASIS): cross-sectional MRI data in young, middle-aged, nondemented, and demented older adults</article-title>. <source>J Cogn Neurosci</source>. <year>2007</year>;<volume>19</volume>(<issue>9</issue>):<fpage>1498</fpage>&#x2013;<lpage>507</lpage>. doi:<pub-id pub-id-type="doi">10.1162/jocn.2007.19.9.1498</pub-id>; <pub-id pub-id-type="pmid">17714011</pub-id></mixed-citation></ref>
<ref id="ref-36"><label>[36]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Marcus</surname> <given-names>DS</given-names></string-name>, <string-name><surname>Fotenos</surname> <given-names>AF</given-names></string-name>, <string-name><surname>Csernansky</surname> <given-names>JG</given-names></string-name>, <string-name><surname>Morris</surname> <given-names>JC</given-names></string-name>, <string-name><surname>Buckner</surname> <given-names>RL</given-names></string-name></person-group>. <article-title>Open access series of imaging studies: longitudinal MRI data in nondemented and demented older adults</article-title>. <source>J Cogn Neurosci</source>. <year>2010</year>;<volume>22</volume>(<issue>12</issue>):<fpage>2677</fpage>&#x2013;<lpage>84</lpage>. doi:<pub-id pub-id-type="doi">10.1162/jocn.2009.21407</pub-id>; <pub-id pub-id-type="pmid">19929323</pub-id></mixed-citation></ref>
<ref id="ref-37"><label>[37]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><surname>Lamontagne</surname> <given-names>PJ</given-names></string-name>, <string-name><surname>Benzinger</surname> <given-names>TL</given-names></string-name>, <string-name><surname>Morris</surname> <given-names>JC</given-names></string-name>, <string-name><surname>Keefe</surname> <given-names>S</given-names></string-name>, <string-name><surname>Hornbeck</surname> <given-names>R</given-names></string-name>, <string-name><surname>Xiong</surname> <given-names>C</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>OASIS-3: longitudinal neuroimaging, clinical, and cognitive dataset for normal aging and Alzheimer disease</article-title>. <comment>medRxiv</comment>; <year>2019</year>. doi:<pub-id pub-id-type="doi">10.1101/2019.12.13.19014902</pub-id>.</mixed-citation></ref>
<ref id="ref-38"><label>[38]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><surname>Aithal</surname> <given-names>N</given-names></string-name></person-group>. <article-title>OASIS Alzheimer&#x2019;s detection [data set]</article-title>. <comment>Kaggle; (n.d.)</comment>. [cited 2025 Feb 10]. Available from: <ext-link ext-link-type="uri" xlink:href="https://www.kaggle.com/datasets/ninadaithal/imagesoasis">https://www.kaggle.com/datasets/ninadaithal/imagesoasis</ext-link>.</mixed-citation></ref>
</ref-list>
</back></article>