<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.1 20151215//EN" "http://jats.nlm.nih.gov/publishing/1.1/JATS-journalpublishing1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" article-type="research-article" dtd-version="1.1">
<front>
<journal-meta>
<journal-id journal-id-type="pmc">CMC</journal-id>
<journal-id journal-id-type="nlm-ta">CMC</journal-id>
<journal-id journal-id-type="publisher-id">CMC</journal-id>
<journal-title-group>
<journal-title>Computers, Materials &#x0026; Continua</journal-title>
</journal-title-group>
<issn pub-type="epub">1546-2226</issn>
<issn pub-type="ppub">1546-2218</issn>
<publisher>
<publisher-name>Tech Science Press</publisher-name>
<publisher-loc>USA</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">30698</article-id>
<article-id pub-id-type="doi">10.32604/cmc.2022.030698</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Cartesian Product Based Transfer Learning Implementation for Brain Tumor Classification</article-title>
<alt-title alt-title-type="left-running-head">Cartesian Product Based Transfer Learning Implementation for Brain Tumor Classification</alt-title>
<alt-title alt-title-type="right-running-head">Cartesian Product Based Transfer Learning Implementation for Brain Tumor Classification</alt-title>
</title-group>
<contrib-group content-type="authors">
<contrib id="author-1" contrib-type="author" corresp="yes">
<name name-style="western"><surname>Usmani</surname><given-names>Irfan Ahmed</given-names>
</name><xref ref-type="aff" rid="aff-1">1</xref><email>iausmani@ssuet.edu.pk</email></contrib>
<contrib id="author-2" contrib-type="author">
<name name-style="western"><surname>Qadri</surname><given-names>Muhammad Tahir</given-names>
</name><xref ref-type="aff" rid="aff-1">1</xref></contrib>
<contrib id="author-3" contrib-type="author">
<name name-style="western"><surname>Zia</surname><given-names>Razia</given-names>
</name><xref ref-type="aff" rid="aff-1">1</xref></contrib>
<contrib id="author-4" contrib-type="author">
<name name-style="western"><surname>Aziz</surname><given-names>Asif</given-names>
</name><xref ref-type="aff" rid="aff-2">2</xref></contrib>
<contrib id="author-5" contrib-type="author">
<name name-style="western"><surname>Saeed</surname><given-names>Farheen</given-names>
</name><xref ref-type="aff" rid="aff-3">3</xref></contrib>
<aff id="aff-1"><label>1</label><institution>Electronic Engineering Department, Sir Syed University of Engineering &#x0026; Technology</institution>, <addr-line>Karachi, 75300</addr-line>, <country>Pakistan</country></aff>
<aff id="aff-2"><label>2</label><institution>Department of Computer Science, Bahria University Karachi Campus</institution>, <addr-line>Karachi, 75260</addr-line>, <country>Pakistan</country></aff>
<aff id="aff-3"><label>3</label><institution>Christus Trinity Clinic</institution>, <addr-line>Texas</addr-line>, <country>USA</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>&#x002A;</label>Corresponding Author: Irfan Ahmed Usmani. Email: <email>iausmani@ssuet.edu.pk</email></corresp>
</author-notes>
<pub-date pub-type="epub" date-type="pub" iso-8601-date="2022-06-14"><day>14</day>
<month>06</month>
<year>2022</year></pub-date>
<volume>73</volume>
<issue>2</issue>
<fpage>4369</fpage>
<lpage>4392</lpage>
<history>
<date date-type="received">
<day>31</day>
<month>3</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>12</day>
<month>5</month>
<year>2022</year>
</date>
</history>
<permissions>
<copyright-statement>&#x00A9; 2022 Usmani et al.</copyright-statement>
<copyright-year>2022</copyright-year>
<copyright-holder>Usmani et al.</copyright-holder>
<license xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>This work is licensed under a <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</ext-link>, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<self-uri content-type="pdf" xlink:href="TSP_CMC_30698.pdf"></self-uri>
<abstract>
<p>Knowledge-based transfer learning techniques have shown good performance for brain tumor classification, especially with small datasets. However, to obtain an optimized model for targeted brain tumor classification, it is challenging to select a pre-trained deep learning (DL) model, optimal values of hyperparameters, and an optimization algorithm (solver). This paper first presents a brief review of recent literature related to brain tumor classification. Secondly, a robust framework for implementing the transfer learning technique is proposed. In the proposed framework, a Cartesian product matrix is generated to determine the optimal values of two important hyperparameters: batch size and learning rate. An extensive exercise consisting of 435 simulations of 11 state-of-the-art pre-trained DL models was performed, feeding each model the 16 hyperparameter pairs from the Cartesian product matrix under the three most popular solvers (stochastic gradient descent with momentum (SGDM), adaptive moment estimation (ADAM), and root mean squared propagation (RMSProp)). The 16 pairs were formed from individual hyperparameter values taken from the literature, which generally addressed only one hyperparameter for optimization rather than forming a grid over a particular range. The proposed framework was assessed using a publicly available multi-class dataset consisting of glioma, meningioma, and pituitary tumors. Performance assessment shows that ResNet18 outperforms all other models in terms of accuracy, precision, specificity, and recall (sensitivity). The results are also compared with existing state-of-the-art research that used the same dataset. The comparison was based mainly on the performance metric &#x201C;accuracy,&#x201D; supported by three other parameters: &#x201C;precision,&#x201D; &#x201C;recall,&#x201D; and &#x201C;specificity.&#x201D; The comparison shows that the transfer learning technique, implemented through our proposed framework for brain tumor classification, outperformed all existing approaches. To the best of our knowledge, the proposed framework is efficient: it reduced the computational complexity and the time needed to attain optimal values of the two important hyperparameters and, consequently, produced an optimized model with an accuracy of 99.56%.</p>
</abstract>
<kwd-group kwd-group-type="author">
<kwd>Deep transfer learning</kwd>
<kwd>Cartesian product</kwd>
<kwd>hyperparameter optimization</kwd>
<kwd>magnetic resonance imaging (MRI)</kwd>
<kwd>brain tumor classification</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec id="s1">
<label>1</label>
<title>Introduction</title>
<p>One of the most widely known causes of the increase in deaths among adults and children is brain tumors, which emerge as groups of anomalous cells developing inside or around the brain [<xref ref-type="bibr" rid="ref-1">1</xref>]. A precise and early brain tumor diagnosis plays a significant role in successful therapy. Among imaging modalities, MRI is the most extensively utilized non-invasive approach that assists radiologists and physicians in the discernment, diagnosis, and classification of brain tumors [<xref ref-type="bibr" rid="ref-2">2</xref>&#x2013;<xref ref-type="bibr" rid="ref-4">4</xref>]. Radiologists approach brain tumor classification in two ways: (i) categorizing normal and anomalous magnetic resonance (MR) images and (ii) scrutinizing the types and stages of the anomalous MR images [<xref ref-type="bibr" rid="ref-2">2</xref>].</p>
<p>Brain tumors show a high level of variation in size, shape, and intensity [<xref ref-type="bibr" rid="ref-5">5</xref>], and tumors of various pathological types may show comparatively similar appearances [<xref ref-type="bibr" rid="ref-6">6</xref>]; therefore, classification into different types and stages has become quite a wide research topic [<xref ref-type="bibr" rid="ref-7">7</xref>,<xref ref-type="bibr" rid="ref-8">8</xref>]. Manual classification of comparatively similar-appearing brain tumor MR images is quite a challenging task that relies upon the accessibility and capability of radiologists. Regardless of a radiologist&#x2019;s skills, the human visual system always bounds the analysis, as the knowledge contained in an MR image surpasses the perception of the human visual system. Thus, the computer is used as a second eye to understand MR images.</p>
<p>Computer vision-based image analysis methods usually encompass several sub-processes: preprocessing, segmentation, feature extraction, and classification. Preprocessing, a low-level process, consists of operations such as image sharpening, contrast enhancement, and noise reduction [<xref ref-type="bibr" rid="ref-8">8</xref>]. Segmentation and classification are in the mid-level process domain. Image segmentation is used to extract the anomalous region from MR images, which helps to accurately locate the tumor and determine its size. The classification results depend upon suitable feature extraction from the delineated segmented region. Feature extraction generally depends on the knowledge of a domain expert, which makes it quite challenging for an unskilled person to use in traditional image processing as well as in machine learning. The manual feature extraction problem can be eliminated by using DL approaches, which learn features hierarchically in a self-learning fashion.</p>
<p>Deep learning, especially convolutional neural networks (CNNs), outperforms many machine learning (ML) approaches in different areas, such as generating text [<xref ref-type="bibr" rid="ref-9">9</xref>], natural language processing [<xref ref-type="bibr" rid="ref-10">10</xref>], speech recognition [<xref ref-type="bibr" rid="ref-11">11</xref>], face verification [<xref ref-type="bibr" rid="ref-12">12</xref>], object detection [<xref ref-type="bibr" rid="ref-13">13</xref>], image description [<xref ref-type="bibr" rid="ref-14">14</xref>], machine translation [<xref ref-type="bibr" rid="ref-15">15</xref>], and the game of Go [<xref ref-type="bibr" rid="ref-16">16</xref>]. In particular, improved performance in computer vision boosted the utilization of DL methods for brain tumor MR image analysis [<xref ref-type="bibr" rid="ref-17">17</xref>,<xref ref-type="bibr" rid="ref-18">18</xref>]. CNNs have been utilized for decades, but they gained popularity in 2012 when Krizhevsky et al. [<xref ref-type="bibr" rid="ref-19">19</xref>] won the ImageNet Large-Scale Visual Recognition Challenge (ILSVRC) with &#x201C;AlexNet,&#x201D; a DL model trained on the ImageNet dataset [<xref ref-type="bibr" rid="ref-20">20</xref>]. Another similar but deeper visual geometry group (VGG) network, known as VGGNet, was presented by Simonyan and Zisserman [<xref ref-type="bibr" rid="ref-21">21</xref>] for classification in the 2014 ILSVRC. In fact, DL progressed with the availability of big data [<xref ref-type="bibr" rid="ref-20">20</xref>,<xref ref-type="bibr" rid="ref-22">22</xref>,<xref ref-type="bibr" rid="ref-23">23</xref>], advanced learning algorithms [<xref ref-type="bibr" rid="ref-24">24</xref>&#x2013;<xref ref-type="bibr" rid="ref-28">28</xref>], and powerful graphical processing units (GPUs).</p>
<p>Algorithms based on deep learning for a certain classification task are difficult to reuse and generalize effectively. Therefore, a new algorithm has to be rebuilt from scratch even for a similar task, which requires considerable computational power and time. At the same time, if sufficient data are not available for similar tasks, the developed algorithm may have difficulty attaining the desired performance or might even fail to complete the tasks. In case of a shortage of data, for example, in brain tumor classification, the concept of the knowledge-based transfer learning technique may be employed by using pre-trained DL models that have already been trained for other classification problems. However, selecting a pre-trained DL model, optimal hyperparameter values, and an optimization algorithm (solver) are challenging tasks in obtaining an optimized model for targeted brain tumor classification. This research aims to provide a robust framework for implementing knowledge-based transfer learning techniques for brain tumor classification with a small-sized dataset. The research contributions of this study are as follows:
<list list-type="bullet">
<list-item>
<p>A robust framework is proposed for brain tumor classification that describes a complete approach to utilizing the knowledge of a pre-trained DL model and re-training it for brain tumor classification with a small dataset.</p></list-item>
<list-item>
<p>Following the framework, the knowledge transfer technique is deployed using 11 state-of-the-art pre-trained DL models to select an appropriate model for brain tumor classification.</p></list-item>
<list-item>
<p>The concept of the Cartesian product matrix is introduced to find the most suitable pair of two important hyperparameters: batch size and learning rate. The Cartesian product matrix is formed from two initialized sets of hyperparameter values. Each pre-trained DL model was evaluated for the three most popular solvers (SGDM, ADAM, RMSProp) to obtain the most appropriate set consisting of a solver, one batch size, and one learning rate.</p></list-item>
<list-item>
<p>To investigate the model performance, a comprehensive comparative analysis of each model for brain tumor classification was conducted.</p></list-item>
</list></p>
<p>The rest of the paper is divided into five sections. Section 2 presents a comprehensive literature review related to brain tumor classification with a focus on the approaches used. Section 3 describes the proposed framework for implementing the transfer learning technique, covering the dataset used, preprocessing, augmentation, pre-trained networks, fine-tuning, and performance assessment. Section 4 discusses the results and analysis of the proposed framework. Finally, the conclusion and future work are discussed in Section 5.</p>
</sec>
<sec id="s2">
<label>2</label>
<title>Literature Review</title>
<p>To classify brain tumors, many research efforts have contributed to different subfields of the detection and classification processes. Different techniques have been presented for the segmentation of anomalous regions in MR images [<xref ref-type="bibr" rid="ref-29">29</xref>&#x2013;<xref ref-type="bibr" rid="ref-32">32</xref>]. After segmentation, MR images are classified into different types and grades. In [<xref ref-type="bibr" rid="ref-33">33</xref>&#x2013;<xref ref-type="bibr" rid="ref-35">35</xref>], binary classifiers were used for tumor classification into malignant and benign classes.</p>
<p>Abdolmaleki&#x00A0;et&#x00A0;al.&#x00A0;[<xref ref-type="bibr" rid="ref-33">33</xref>] extracted 13 different features to differentiate malignant and benign tumors using a three-level neural network. The features were extracted with the help of the radiologists. The proposed methods were applied to a dataset of 165 patients&#x2019; MRIs, which helped to achieve accuracies of 94% and 91% for benign and malignant tumors, respectively. In [<xref ref-type="bibr" rid="ref-34">34</xref>], the author categorized brain tumors as malignant or benign by using a hybrid scheme consisting of a genetic algorithm (GA) and a support vector machine (SVM). Furthermore, Papageorgiou&#x00A0;et&#x00A0;al.&#x00A0;[<xref ref-type="bibr" rid="ref-35">35</xref>] introduced fuzzy cognitive maps (FCM) to classify tumors into low-grade and high-grade gliomas. Papageorgiou&#x2019;s FCM model used 100 cases and achieved 93.22% accuracy for high-grade gliomas and 90.26% accuracy for low-grade gliomas.</p>
<p>In addition to the binary classification of brain tumors, Zacharaki&#x00A0;et&#x00A0;al.&#x00A0;[<xref ref-type="bibr" rid="ref-36">36</xref>] proposed a technique using SVM and K-nearest neighbor (KNN) for multi-classification of brain tumors into primary gliomas, meningiomas, metastases, and glioblastomas. This research encompasses sub-processes, such as segmentation and feature extraction, to perform multi-classification. Accuracies of 88% and 85% for binary classification and multi-classification, respectively, were achieved. Hsieh&#x00A0;et&#x00A0;al.&#x00A0;[<xref ref-type="bibr" rid="ref-37">37</xref>] also proposed a technique based on extracted features from a dataset of 107 MR images, consisting of 73 and 34 low-grade and high-grade glioma MRIs respectively, to measure malignancy. The proposed technique produced accuracies of 83%, 76%, and 88% using local, global, and fused features, respectively. Sachdeva&#x00A0;et&#x00A0;al.&#x00A0;[<xref ref-type="bibr" rid="ref-38">38</xref>] presented a technique that depends on optimal features, based on color and texture, extracted from segmented regions, and used GA in combination with SVM and artificial neural network (ANN). The technique achieved accuracies of 91.7% and 94.9% for GA-SVM and GA-ANN, respectively.</p>
<p>Cheng&#x00A0;et&#x00A0;al.&#x00A0;[<xref ref-type="bibr" rid="ref-5">5</xref>] presented a framework based on the following approaches to extract features: intensity histogram, gray level co-occurrence matrix (GLCM), and bag-of-words (BoW). In the domain of classification, this research is considered the first significant work towards brain tumor multi-classification using the largest and most challenging publicly available dataset on figshare [<xref ref-type="bibr" rid="ref-39">39</xref>], which consists of glioma, meningioma, and pituitary brain tumor types. The approach depends on a wide range of features extracted from a manually defined segmented region and applied as input to different classifiers. The authors used sensitivity, specificity, and classification accuracy as measurement parameters and obtained the best results using SVM for a particular set of features. In [<xref ref-type="bibr" rid="ref-40">40</xref>], Ismael and Abdel Qadir worked on the same challenging publicly available dataset as [<xref ref-type="bibr" rid="ref-5">5</xref>] and proposed an algorithm for brain tumor classification. The algorithm uses the Gabor filter and discrete wavelet transform (DWT) to extract the statistical features used to train the classifier. The authors randomly selected 70% and 30% of MR images for training and validation of the classifier, respectively. Feature extraction from a segmented region of interest (ROI), together with appropriate feature selection, is significant in establishing the best learning of the classifier. These handcrafted features must be extracted by an expert with sound knowledge and skills to determine the most significant features. Furthermore, the feature extraction process requires a significant amount of time and is susceptible to errors when dealing with big data [<xref ref-type="bibr" rid="ref-41">41</xref>].</p>
<p>In contrast to ML, DL algorithms do not require handcrafted features. DL requires a preprocessed dataset and applies a self-learning approach to determine the significant features [<xref ref-type="bibr" rid="ref-42">42</xref>]. Eventually, many CNNs, such as AlexNet [<xref ref-type="bibr" rid="ref-19">19</xref>], ResNet [<xref ref-type="bibr" rid="ref-43">43</xref>], and VGGNet [<xref ref-type="bibr" rid="ref-21">21</xref>], were deployed for classification after being trained on the large ImageNet dataset in ILSVRC [<xref ref-type="bibr" rid="ref-20">20</xref>]. These CNNs are recognized as state-of-the-art DL models [<xref ref-type="bibr" rid="ref-19">19</xref>,<xref ref-type="bibr" rid="ref-21">21</xref>,<xref ref-type="bibr" rid="ref-43">43</xref>]. Afshar&#x00A0;et&#x00A0;al.&#x00A0;[<xref ref-type="bibr" rid="ref-44">44</xref>] proposed a method for brain tumor classification based on the CapsNet model; the method relies on adopting CapsNets, exploiting their capabilities, analyzing overfitting, and setting up output visualization patterns. Furthermore, Zia&#x00A0;et&#x00A0;al.&#x00A0;[<xref ref-type="bibr" rid="ref-45">45</xref>] presented a brain tumor classification technique that relied on rectangular-window image cropping. The technique used DWT for feature extraction, principal component analysis for dimensionality reduction, and SVM for classification. Hossam&#x00A0;et&#x00A0;al.&#x00A0;[<xref ref-type="bibr" rid="ref-46">46</xref>] presented a CNN-based DL model to classify different types of brain tumors using two datasets. They used one dataset for three-class brain tumor classification, while the other was used for glioma tumor classification into grades II, III, and IV. The proposed methodology proved useful for the multi-classification of brain tumors. Jia&#x00A0;et&#x00A0;al.&#x00A0;[<xref ref-type="bibr" rid="ref-47">47</xref>] achieved an accuracy of 98.51% for normal and anomalous brain tissue classification while evaluating MR images. The authors performed fully automatic heterogeneous segmentation of brain tumors using a support vector machine built on the extreme learning machine (ELM) deep learning technique. They proposed a fully automatic algorithm supported by structural, relaxometry, and morphological details to obtain optimum features.</p>
<p>The increase in deep CNN performance, owing to the concept of hierarchical feature extraction, motivated researchers to transfer the knowledge that pre-trained networks acquire during training on millions of images to new classification tasks with small amounts of data, taking advantage of their learned parameters, specifically weights. <xref ref-type="table" rid="table-1">Tab. 1</xref> summarizes a review of research performed in the area of brain tumor classification based on knowledge-based transfer learning for the period 2017&#x2013;2022. In [<xref ref-type="bibr" rid="ref-48">48</xref>], the authors fine-tuned the pre-trained classification models ResNet [<xref ref-type="bibr" rid="ref-49">49</xref>] and VGG [<xref ref-type="bibr" rid="ref-21">21</xref>] to distinguish between high-grade and low-grade brain tumors. They implemented the concept of transfer learning and achieved an accuracy of 97.19%. Talo&#x00A0;et&#x00A0;al.&#x00A0;[<xref ref-type="bibr" rid="ref-50">50</xref>] presented a binary classifier based on the transfer learning concept for classifying brain MR images into normal and anomalous. The authors fine-tuned a pre-trained DL model, ResNet34, and claimed that theirs is the first work on brain MRI classification using the deep transfer learning approach. They used a dataset containing 613 images [<xref ref-type="bibr" rid="ref-51">51</xref>] and obtained better performance than other DL-based approaches using the same dataset. Sajjad&#x00A0;et&#x00A0;al.&#x00A0;[<xref ref-type="bibr" rid="ref-52">52</xref>] used a pre-trained DL model and customized it for the grading of brain tumors. The authors evaluated the proposed system on both original and segmented data. The results show convincing performance in comparison with benchmark systems. Swati&#x00A0;et&#x00A0;al.&#x00A0;[<xref ref-type="bibr" rid="ref-53">53</xref>] used VGG-19 [<xref ref-type="bibr" rid="ref-21">21</xref>], a pre-trained DL model, to transfer knowledge. The authors performed a manual block-wise fine-tuning approach for the more challenging task of multi-class brain tumor classification. Following the work in [<xref ref-type="bibr" rid="ref-53">53</xref>], Swati et al. proposed a method to retrieve similar brain tumor images based on content-based retrieval [<xref ref-type="bibr" rid="ref-54">54</xref>]. For similarity measurements, the authors used VGG-19 [<xref ref-type="bibr" rid="ref-21">21</xref>] features. Deepak&#x00A0;et&#x00A0;al.&#x00A0;[<xref ref-type="bibr" rid="ref-55">55</xref>] utilized the knowledge of GoogleNet (ImageNet) [<xref ref-type="bibr" rid="ref-56">56</xref>] as a pre-trained DL model for brain tumor classification into different classes.</p>
<table-wrap id="table-1">
<label>Table 1</label>
<caption>
<title>Literature summary related to MR image processing using transfer learning techniques</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th>Reference</th>
<th>Year</th>
<th>Study Domain</th>
<th>Dataset</th>
<th>Model Used for Classification</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>[<xref ref-type="bibr" rid="ref-57">57</xref>]</td>
<td>2017</td>
<td>Segmentation of Brain Tumor</td>
<td>Radboud University Nijmegen Diffusion Tensor and Magnetic Resonance Cohort (RUN DMC)</td>
<td>15 layers CNN <break/>without pooling layer</td>
<td>They trained a CNN model on brain MRI and then assessed it with images from different domains.</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-58">58</xref>]</td>
<td>2018</td>
<td>Brain Tumor Detection &#x0026; Classification</td>
<td>The Cancer Genome<break/> Atlas Glioblastoma Multiforme (TCGA-GBM) &#x0026; The Cancer Genome Atlas Low Grade Glioma (TCGA-LGG)</td>
<td>VGG 16 and<break/> ResNet 50</td>
<td>The authors implemented the concept<break/> of transfer learning by utilizing two pre-trained convolutional networks to distinguish high grade glioma (HGG)<break/> and low grade glioma (LGG).</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-50">50</xref>]</td>
<td>2019</td>
<td>Brain abnormality classification</td>
<td>Harvard Medical<break/> School Data</td>
<td>ResNet34</td>
<td>In this work, a pre-trained ResNet model is used for the classification of normal and anomalous brain MRI scans.</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-53">53</xref>]</td>
<td>2019</td>
<td>Brain tumor classification</td>
<td>Brain tumor public dataset (Figshare)</td>
<td>VGG-19</td>
<td>Performed block-wise fine-tuning for<break/> a more challenging classification of multi-class brain tumors.</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-54">54</xref>]</td>
<td>2019</td>
<td>Content-based Image retrieval</td>
<td>Brain tumor public dataset (Figshare)</td>
<td>VGG-19</td>
<td>VGG-19 features were used to retrieve similar brain tumor images. </td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-55">55</xref>]</td>
<td>2019</td>
<td>Brain tumor classification</td>
<td>Brain tumor public dataset (Figshare)</td>
<td>GoogleNet </td>
<td>A pre-trained model GoogleNet was used for brain tumor classification into three different classes.</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-59">59</xref>]</td>
<td>2020</td>
<td>Grading based Brain tumor classification</td>
<td>Multiple Brain tumor dataset</td>
<td>AlexNet</td>
<td>Brain tumor grading based on a pre-trained AlexNet model was proposed.</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-60">60</xref>]</td>
<td>2020</td>
<td>Multi-grade Classification of Brain Tumor</td>
<td>Radiopaedia dataset</td>
<td>VGGNet</td>
<td>The paper investigates CNN models for multi-grade brain tumor classification.</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-61">61</xref>]</td>
<td>2020</td>
<td>Brain MRI Reconstruction for classification</td>
<td>The Molecular Interactive Display and Simulation (MIDAS) Dataset</td>
<td>Custom CNN</td>
<td>The authors trained a customized CNN model using a public dataset. They fine-tuned the model for the reconstruction of brain MR Images.</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-62">62</xref>]</td>
<td>2020</td>
<td>MRI based Brain tumor diagnosis</td>
<td>Brain tumor public dataset (Figshare) &#x0026; Harvard Medical School Data</td>
<td>Inception V3, DenseNet201</td>
<td>The authors concatenated features extracted from multiple layers of the pre-trained deep learning models and then utilized those features to classify the brain tumors using a softmax classifier.</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-63">63</xref>]</td>
<td>2021</td>
<td>Brain Tumor Classification</td>
<td>Brain tumor public dataset (Figshare)</td>
<td>Inception V3, Xception, and multiple ML Algorithm</td>
<td>The authors used the Inception V3 &#x0026; Xception models to extract features and then applied different deep and ML classifiers for classification. Their main contribution is an ensemble model based on the features extracted from the deep learning models.</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-64">64</xref>]</td>
<td>2021</td>
<td>Brain Tumor Classification using GoogleNet features and ML algorithm</td>
<td>Brain tumor public dataset (Figshare) &#x0026; Harvard Medical School Data</td>
<td>GoogleNet</td>
<td>In this paper, the authors used the pre-trained GoogleNet to extract features that were then input to the ML-based classifiers K-NN and SVM.</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-65">65</xref>]</td>
<td>2021</td>
<td>Multimodal Brain Tumor Classification</td>
<td>Brain Tumor Segmentation (BraTS) 2018 Dataset</td>
<td>VGG-19</td>
<td>First, the authors extracted features from two different dense layers of the VGG-19 pre-trained deep learning model and fused them to obtain more informative feature knowledge. Second, they used the IPSO algorithm to select the optimum features.</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-66">66</xref>]</td>
<td>2021</td>
<td>Brain Tumor classification using Ensemble of Deep features and ML algorithm</td>
<td>Two different brain tumor datasets: a Kaggle dataset consisting of MR images with and without tumors &#x0026; the brain tumor public dataset (Figshare)</td>
<td>DenseNet-160, InceptionV3, ResNet50 for feature extraction and multiple ML algorithms for classification</td>
<td>The authors used ensemble features extracted from DenseNet-160, InceptionV3, and ResNet50 and applied multiple ML algorithms, such as K-NN, SVM, and AdaBoost, to classify MR images into different classes.</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-67">67</xref>]</td>
<td>2022</td>
<td>Brain Tumor Classification</td>
<td>Two different brain tumor datasets from BraTS and Figshare</td>
<td>Features extracted from Xception model</td>
<td>The authors extracted features using the Xception network and proposed a multi-level attention network (MANet) by designing a spatial attention module &#x0026; a long short-term memory (LSTM)-based cross-channel attention module.</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="s3">
<label>3</label>
<title>Proposed Framework</title>
<p>This section explains all the details related to the proposed framework, presented in <xref ref-type="fig" rid="fig-1">Fig. 1</xref>, for implementing the transfer learning technique utilizing pre-trained DL models for the classification task. Any pre-trained classification model with its learned parameters can be used after customization. The proposed framework is based on a novel idea to input hyperparameters in the form of ordered pairs (batch size and learning rate). The ordered pair can be defined as a 2-tuple element of a matrix constructed using the concept of the Cartesian product of two initialized sets of batch size and learning rate. The following subsections discuss the step-by-step implementation of the transfer learning technique using the proposed framework.</p>
<fig id="fig-1">
<label>Figure 1</label>
<caption>
<title>Proposed framework to implement transfer learning technique</title>
</caption>
<graphic mimetype="image" mime-subtype="png" xlink:href="CMC_30698-fig-1.png"/>
</fig>
<sec id="s3_1">
<label>3.1</label>
<title>Dataset</title>
<p>A publicly accessible brain tumor dataset of 233 patients, consisting of 3064 MR images [<xref ref-type="bibr" rid="ref-39">39</xref>], was used. Three types of brain tumor MR images are available in this dataset: 1426 slices of glioma, 708 slices of meningioma, and 930 slices of pituitary tumors. <xref ref-type="fig" rid="fig-2">Fig. 2</xref> illustrates the percentage of each tumor type in the dataset. The data are available in .mat file format (MATLAB data format); each file contains a label for the image, the patient ID, the image in the form of a 512 &#x00D7; 512 matrix, a tumor mask, and the coordinates of discrete points on the tumor border.</p>
<p>The dataset statistics clearly show that the classes in the dataset are imbalanced. An imbalanced dataset is itself a challenging issue, in which correctly predicting the smaller class is more critical than the larger one. In this research, a model-based approach, the transfer learning technique, is used to tackle this issue. Experiments in previous research [<xref ref-type="bibr" rid="ref-50">50</xref>,<xref ref-type="bibr" rid="ref-53">53</xref>&#x2013;<xref ref-type="bibr" rid="ref-55">55</xref>,<xref ref-type="bibr" rid="ref-57">57</xref>&#x2013;<xref ref-type="bibr" rid="ref-64">64</xref>] have supported the use of a model-based approach, provided that the tuning parameters are chosen carefully. In fact, the transfer learning technique takes advantage of the features extracted from the source domain at different levels and compensates for the overall lack of samples in the training data of the target domain, ending up with a well-trained model for an imbalanced small dataset.</p>
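<p>For illustration, a minimal Python sketch of reading one record from this dataset is given below. It is not part of the original pipeline: the group and field names (cjdata, label, PID, image, tumorMask, tumorBorder) and the use of h5py (for HDF5-based v7.3 MAT files) are assumptions based on the fields described above.</p>
<preformat>
# Hedged sketch: field names and the HDF5-based MAT version are assumptions.
import h5py
import numpy as np

with h5py.File('1.mat', 'r') as f:            # hypothetical file name
    cj = f['cjdata']                          # assumed top-level group
    label = int(np.array(cj['label'])[0, 0])  # tumor type label (assumed 1 x 1 dataset)
    pid = np.array(cj['PID'])                 # patient ID
    image = np.array(cj['image'])             # 512 x 512 intensity matrix
    mask = np.array(cj['tumorMask'])          # binary tumor mask
    border = np.array(cj['tumorBorder'])      # discrete tumor-border coordinates
</preformat>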
<fig id="fig-2">
<label>Figure 2</label>
<caption>
<title>Percentage of different types of tumors in the dataset</title>
</caption>
<graphic mimetype="image" mime-subtype="png" xlink:href="CMC_30698-fig-2.png"/>
</fig>
</sec>
<sec id="s3_2">
<label>3.2</label>
<title>Preprocessing</title>
<p>Medical image analysis requires data preprocessing, which includes contrast enhancement and standardization. First, the intensity values of the dataset images were normalized and then mapped to the 256 grayscale levels using the relationship described in <xref ref-type="disp-formula" rid="eqn-1">Eq. (1)</xref>.</p>
<p><disp-formula id="eqn-1"><label>(1)</label><mml:math id="mml-eqn-1" display="block"><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msub><mml:mo>&#x2212;</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>m</mml:mi><mml:mi>i</mml:mi><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>m</mml:mi><mml:mi>a</mml:mi><mml:mi>x</mml:mi></mml:mrow></mml:msub><mml:mo>&#x2212;</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>m</mml:mi><mml:mi>i</mml:mi><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac><mml:mo>&#x00D7;</mml:mo><mml:msup><mml:mn>2</mml:mn><mml:mrow><mml:mn>8</mml:mn></mml:mrow></mml:msup></mml:math></disp-formula>where <inline-formula id="ieqn-1"><mml:math id="mml-ieqn-1"><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msub></mml:math></inline-formula> corresponds to any one of the 8-bit grayscale pixel values between 0 and 255 against <inline-formula id="ieqn-2"><mml:math id="mml-ieqn-2"><mml:mi>x</mml:mi></mml:math></inline-formula> at position<inline-formula id="ieqn-3"><mml:math id="mml-ieqn-3"><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. <inline-formula id="ieqn-4"><mml:math id="mml-ieqn-4"><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>m</mml:mi><mml:mi>a</mml:mi><mml:mi>x</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula id="ieqn-5"><mml:math id="mml-ieqn-5"><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>m</mml:mi><mml:mi>i</mml:mi><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> are the maximum pixel intensity and minimum pixel intensity in the original image, respectively. <xref ref-type="fig" rid="fig-3">Fig. 3</xref> shows the original and enhanced images.</p>
<fig id="fig-3">
<label>Figure 3</label>
<caption>
<title>Original and enhanced image</title>
</caption>
<graphic mimetype="image" mime-subtype="png" xlink:href="CMC_30698-fig-3.png"/>
</fig>
<p>The enhanced resultant images are resized to the standard input image size of each pre-trained DL model and replicated three times to create the three input channels. <xref ref-type="table" rid="table-2">Tab. 2</xref> lists the models used in this research and their standard input sizes.</p>
<table-wrap id="table-2">
<label>Table 2</label>
<caption>
<title>Pre-trained models used and their standard input image sizes</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th>Model Name</th>
<th>Standard Input Image Size</th>
</tr>
</thead>
<tbody>
<tr>
<td>AlexNet, SqueezeNet</td>
<td><inline-formula id="ieqn-6"><mml:math id="mml-ieqn-6"><mml:mn>227</mml:mn><mml:mo>&#x00D7;</mml:mo><mml:mn>227</mml:mn><mml:mo>&#x00D7;</mml:mo><mml:mn>3</mml:mn></mml:math></inline-formula></td>
</tr>
<tr>
<td>GoogleNet (ImageNet), GoogleNet (Places365), VGG 16, VGG 19, ResNet 18, ResNet 50, ResNet 101, MobileNet</td>
<td><inline-formula id="ieqn-7"><mml:math id="mml-ieqn-7"><mml:mn>224</mml:mn><mml:mo>&#x00D7;</mml:mo><mml:mn>224</mml:mn><mml:mo>&#x00D7;</mml:mo><mml:mn>3</mml:mn></mml:math></inline-formula></td>
</tr>
<tr>
<td>Inception V3</td>
<td><inline-formula id="ieqn-8"><mml:math id="mml-ieqn-8"><mml:mn>299</mml:mn><mml:mo>&#x00D7;</mml:mo><mml:mn>299</mml:mn><mml:mo>&#x00D7;</mml:mo><mml:mn>3</mml:mn></mml:math></inline-formula></td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="s3_3">
<label>3.3</label>
<title>Data Augmentation</title>
<p>High-quality big data plays a significant role in the effective training of any DL model. However, collecting sufficient medical data for training a classifier is undoubtedly expensive. In general, data augmentation techniques are used to enlarge the dataset, providing a larger input sample space to achieve the desired accuracy and reduce overfitting.</p>
<p>In this research, extensive data augmentation is performed not only to increase the data size but also to check the effectiveness of the knowledge-based transfer learning technique with and without data augmentation, particularly for our brain tumor classification task. A total of eight augmentation techniques with 32 different parameter settings were implemented to extend each data sample into 32 samples. Of the eight, four techniques (flipping, rotation, shearing, and skewness) provide geometric transformation invariance, and the remaining four (sharpening, Gaussian blur, emboss, and edge detection) provide noise invariance [<xref ref-type="bibr" rid="ref-52">52</xref>]. The details of the total dataset size and each class size before and after augmentation are listed in <xref ref-type="table" rid="table-3">Tab. 3</xref>, and a sketch of this step is given after the table.</p>
<table-wrap id="table-3">
<label>Table 3</label>
<caption>
<title>Brain tumor dataset statistics with and without augmentation</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th rowspan="2">Types of tumors</th>
<th colspan="2">Dataset statistics</th>
</tr>
<tr>
<th>Original dataset</th>
<th>Augmented dataset</th>
</tr>
</thead>
<tbody>
<tr>
<td>Glioma</td>
<td>1426</td>
<td>45632</td>
</tr>
<tr>
<td>Meningioma</td>
<td>708</td>
<td>22656</td>
</tr>
<tr>
<td>Pituitary</td>
<td>930</td>
<td>29760</td>
</tr>
<tr>
<td>Total</td>
<td>3064</td>
<td>98048</td>
</tr>
</tbody>
</table>
</table-wrap>
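<p>A hedged Python sketch of the augmentation step is given below. The eight technique families are those listed above, while the specific parameter values (rotation angles, shear factors, blur radii) are illustrative assumptions rather than the exact settings used in the experiments; in practice each family is parameterized so that every image yields 32 augmented samples.</p>
<preformat>
# Illustrative sketch using Pillow; parameter values are assumptions.
from PIL import Image, ImageFilter, ImageOps

def augment(img):
    variants = []
    variants += [ImageOps.mirror(img), ImageOps.flip(img)]            # flipping
    variants += [img.rotate(a) for a in (90, 180, 270)]               # rotation
    variants += [img.transform(img.size, Image.Transform.AFFINE,
                               (1, s, 0, 0, 1, 0)) for s in (0.1, 0.2)]   # shear / skew
    variants += [img.filter(ImageFilter.GaussianBlur(r)) for r in (1, 2)] # Gaussian blur
    variants += [img.filter(f) for f in (ImageFilter.SHARPEN,         # sharpening,
                                         ImageFilter.EMBOSS,          # emboss, and
                                         ImageFilter.FIND_EDGES)]     # edge detection
    return variants  # extend each family's parameter list to reach 32 samples per image
</preformat>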
</sec>
<sec id="s3_4">
<label>3.4</label>
<title>Pre-Trained Networks</title>
<p>Many pre-trained CNN models are available for the classification task. In this research, the idea behind the proposed framework is domain adaptation, in which transfer learning allows us to utilize the network and knowledge, in terms of network weights, of pre-trained DL models from a source domain and re-train them with new training data for another classification task in the target domain. The data size and the similarity between the target and source domain tasks are important parameters for pre-trained model selection. Because almost all existing pre-trained DL models are trained on millions of natural images, choosing one pre-trained model directly to implement the transfer learning technique for brain tumor classification is quite difficult. For this reason, we used 11 contemporary pre-trained DL models, as shown in <xref ref-type="table" rid="table-2">Tab. 2</xref>, and fine-tuned them to find the optimum model for our classification task. All 11 models were selected based on their learned rich feature representations, as they were trained on the ImageNet database, consisting of 1000 object categories, except one GoogleNet variant that was trained on images from the Places365 database, consisting of 365 image categories. Other than the &#x201C;similarity&#x201D; between the source and target domains, the selected models&#x2019; performance and efficiency differ because of many other characteristics, such as the nature of the network architecture (sequential or directed acyclic graph (DAG)), the depth of the network (number of layers), and the number of learned parameters (weights). In this research, the selected models cover all these types of networks, that is, sequential, DAG, and advanced compact CNN models.</p>
</sec>
<sec id="s3_5">
<label>3.5</label>
<title>Fine-Tuning</title>
<p>In the proposed framework, all 11 selected pre-trained deep learning models were fine-tuned for brain tumor multi-classification. In general, deep learning models consist of different layers, including convolutional, max pooling, fully connected (FC), and softmax layers, and a final cross-entropy-based classification layer. Using the concept of transfer learning, we retained all the network layers with their learned parameters, particularly weights, except that the last FC layer was replaced with a new FC layer with an output size of three and the network&#x2019;s final cross-entropy-based classification layer was replaced with a new one. The retained layers help the network use the low-level features extracted by the pre-trained model, while the replaced layers facilitate high-level feature learning. The modified network is then fine-tuned to obtain an optimum model by training it with our brain tumor dataset. For training, the transfer learning technique uses two types of parameters: learned parameters (e.g., weights) from the original pre-trained deep learning model and hyperparameters, for example, batch size and learning rate; the latter need to be optimized. Because the choice of hyperparameters and their values depends on the targeted task and the types and size of the dataset, it is quite difficult to select specific optimal hyperparameter values that work for all pre-trained models.</p>
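<p>A minimal PyTorch sketch of this layer replacement is given below for ResNet18; the experiments are not tied to this library, so it should be read as one possible realization of the described modification, not the implementation used.</p>
<preformat>
# Sketch of the transfer-learning modification described above (PyTorch).
import torch.nn as nn
from torchvision import models

# Load ImageNet-learned weights, then swap the final FC layer for a
# three-output layer matching the three brain tumor classes.
net = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
net.fc = nn.Linear(net.fc.in_features, 3)

# The new cross-entropy-based classification layer.
criterion = nn.CrossEntropyLoss()
</preformat>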
<p>In general, while training a CNN, a back-propagation algorithm is employed to minimize the cost function. <xref ref-type="disp-formula" rid="eqn-2">Eq. (2)</xref> defines the cost function:</p>
<p><disp-formula id="eqn-2"><label>(2)</label><mml:math id="mml-eqn-2" display="block"><mml:mi>C</mml:mi><mml:mo>=</mml:mo><mml:mo>&#x2212;</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mi>m</mml:mi></mml:mfrac><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>m</mml:mi></mml:mrow></mml:munderover><mml:mi>ln</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>p</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msup><mml:mi>y</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msup><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:msup><mml:mi>X</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula>where <inline-formula id="ieqn-9"><mml:math id="mml-ieqn-9"><mml:mi>m</mml:mi></mml:math></inline-formula> represents the total number of samples (images) for training, <inline-formula id="ieqn-10"><mml:math id="mml-ieqn-10"><mml:msup><mml:mi>X</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> represents <inline-formula id="ieqn-11"><mml:math id="mml-ieqn-11"><mml:msup><mml:mi>i</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mi>h</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> image sample with a <inline-formula id="ieqn-12"><mml:math id="mml-ieqn-12"><mml:msup><mml:mi>y</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> label, and <inline-formula id="ieqn-13"><mml:math id="mml-ieqn-13"><mml:mi>p</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msup><mml:mi>y</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msup><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:msup><mml:mi>X</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> represents the probability of true classification. Optimization algorithms (solvers), such as stochastic gradient descent, are used to perform learning for mini-batches of size <inline-formula id="ieqn-14"><mml:math id="mml-ieqn-14"><mml:mi>N</mml:mi></mml:math></inline-formula> using the gradient, computed using back-propagation, which results in minimizing the cost function <inline-formula id="ieqn-15"><mml:math id="mml-ieqn-15"><mml:mi>C</mml:mi></mml:math></inline-formula>. Considering, for a convolutional layer <inline-formula id="ieqn-16"><mml:math id="mml-ieqn-16"><mml:mi>l</mml:mi></mml:math></inline-formula>, the weights <inline-formula id="ieqn-17"><mml:math id="mml-ieqn-17"><mml:msubsup><mml:mi>W</mml:mi><mml:mrow><mml:mi>l</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> at iteration <inline-formula id="ieqn-18"><mml:math id="mml-ieqn-18"><mml:mi>t</mml:mi></mml:math></inline-formula> and mini-batch cost <inline-formula id="ieqn-19"><mml:math id="mml-ieqn-19"><mml:mrow><mml:mover><mml:mi>C</mml:mi><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow></mml:math></inline-formula> to update the weights in the following iteration, using <xref ref-type="disp-formula" rid="eqn-3">Eqs. (3)</xref>, <xref ref-type="disp-formula" rid="eqn-4">(4)</xref>, and <xref ref-type="disp-formula" rid="eqn-5">(5)</xref>.</p>
<p><disp-formula id="eqn-3"><label>(3)</label><mml:math id="mml-eqn-3" display="block"><mml:msup><mml:mi>&#x03B3;</mml:mi><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:msup><mml:mi>&#x03B3;</mml:mi><mml:mrow><mml:mrow><mml:mo>&#x230A;</mml:mo><mml:mstyle displaystyle="true" scriptlevel="0"><mml:mfrac><mml:mrow><mml:mi>t</mml:mi><mml:mi>N</mml:mi></mml:mrow><mml:mi>m</mml:mi></mml:mfrac></mml:mstyle><mml:mo>&#x230B;</mml:mo></mml:mrow></mml:mrow></mml:msup></mml:math></disp-formula></p>
<p><disp-formula id="eqn-4"><label>(4)</label><mml:math id="mml-eqn-4" display="block"><mml:msubsup><mml:mi>V</mml:mi><mml:mrow><mml:mi>l</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mi>&#x03BC;</mml:mi><mml:msubsup><mml:mi>V</mml:mi><mml:mrow><mml:mi>l</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msubsup><mml:mo>&#x2212;</mml:mo><mml:msup><mml:mi>&#x03B3;</mml:mi><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msup><mml:msub><mml:mi>a</mml:mi><mml:mrow><mml:mi>l</mml:mi></mml:mrow></mml:msub><mml:mfrac><mml:mrow><mml:mi mathvariant="normal">&#x2202;</mml:mi><mml:mrow><mml:mover><mml:mi>C</mml:mi><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow></mml:mrow><mml:mrow><mml:mi mathvariant="normal">&#x2202;</mml:mi><mml:msub><mml:mi>W</mml:mi><mml:mrow><mml:mi>l</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac></mml:math></disp-formula></p>
<p><disp-formula id="eqn-5"><label>(5)</label><mml:math id="mml-eqn-5" display="block"><mml:msubsup><mml:mi>W</mml:mi><mml:mrow><mml:mi>l</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:msubsup><mml:mi>W</mml:mi><mml:mrow><mml:mi>l</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msubsup><mml:mo>+</mml:mo><mml:msubsup><mml:mi>V</mml:mi><mml:mrow><mml:mi>l</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup></mml:math></disp-formula>where <inline-formula id="ieqn-20"><mml:math id="mml-ieqn-20"><mml:msub><mml:mi>a</mml:mi><mml:mrow><mml:mi>l</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> represents the layer <inline-formula id="ieqn-21"><mml:math id="mml-ieqn-21"><mml:mi>l</mml:mi></mml:math></inline-formula> learning rate, <inline-formula id="ieqn-22"><mml:math id="mml-ieqn-22"><mml:mi>&#x03B3;</mml:mi></mml:math></inline-formula> represents the scheduling rate that affects the learning rate, and <inline-formula id="ieqn-23"><mml:math id="mml-ieqn-23"><mml:mi>&#x03BC;</mml:mi></mml:math></inline-formula> represents the momentum that affects the weights, updated previously, in the current iteration. It is clear from the above equations that among all hyper-parameters, batch size and learning rate are the two most important hyper-parameters and their optimal values, side by side, may help in solving different issues such as convergence problems, convergence time, and overfitting, and ultimately improve the accuracy of the pre-trained DL model while using them to implement the transfer learning technique. In this research, we used the following optimization strategy to obtain the optimal pair rather than the individual optimal values of the two most important hyper-parameters: learning rate and batch size.</p>
<p>We initialized two different sets <inline-formula id="ieqn-24"><mml:math id="mml-ieqn-24"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula id="ieqn-25"><mml:math id="mml-ieqn-25"><mml:mi>Y</mml:mi></mml:math></inline-formula> for the two hyperparameters, consisting of possible values based on those used in various studies [<xref ref-type="bibr" rid="ref-46">46</xref>,<xref ref-type="bibr" rid="ref-50">50</xref>,<xref ref-type="bibr" rid="ref-53">53</xref>,<xref ref-type="bibr" rid="ref-62">62</xref>,<xref ref-type="bibr" rid="ref-68">68</xref>,<xref ref-type="bibr" rid="ref-69">69</xref>]. We define <inline-formula id="ieqn-26"><mml:math id="mml-ieqn-26"><mml:mi>X</mml:mi><mml:mo>=</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mn>7</mml:mn><mml:mo>,</mml:mo><mml:mn>10</mml:mn><mml:mo>,</mml:mo><mml:mn>32</mml:mn><mml:mo>,</mml:mo><mml:mn>128</mml:mn><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula> and <inline-formula id="ieqn-27"><mml:math id="mml-ieqn-27"><mml:mi>Y</mml:mi><mml:mo>=</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mn>0.01</mml:mn><mml:mo>,</mml:mo><mml:mn>0.001</mml:mn><mml:mo>,</mml:mo><mml:mn>0.0001</mml:mn><mml:mo>,</mml:mo><mml:mn>0.00001</mml:mn><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula> for the batch size and learning rate, respectively. A <inline-formula id="ieqn-28"><mml:math id="mml-ieqn-28"><mml:mn>2</mml:mn><mml:mo>&#x2212;</mml:mo><mml:mrow><mml:mtext>dimensional</mml:mtext></mml:mrow></mml:math></inline-formula> matrix of size <inline-formula id="ieqn-29"><mml:math id="mml-ieqn-29"><mml:mn>4</mml:mn><mml:mo>&#x00D7;</mml:mo><mml:mn>4</mml:mn></mml:math></inline-formula>, containing <inline-formula id="ieqn-30"><mml:math id="mml-ieqn-30"><mml:mn>2</mml:mn><mml:mo>&#x2212;</mml:mo></mml:math></inline-formula>tuple elements, was generated by taking the Cartesian product of the two initialized sets <inline-formula id="ieqn-31"><mml:math id="mml-ieqn-31"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula id="ieqn-32"><mml:math id="mml-ieqn-32"><mml:mi>Y</mml:mi></mml:math></inline-formula>. The Cartesian product of two sets <inline-formula id="ieqn-33"><mml:math id="mml-ieqn-33"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula id="ieqn-34"><mml:math id="mml-ieqn-34"><mml:mi>Y</mml:mi></mml:math></inline-formula>, the set of all ordered pairs <inline-formula id="ieqn-35"><mml:math id="mml-ieqn-35"><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, can be defined as</p>
<p><disp-formula id="eqn-6"><label>(6)</label><mml:math id="mml-eqn-6" display="block"><mml:mi>X</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>Y</mml:mi><mml:mo>=</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mi>x</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mi>X</mml:mi><mml:mrow><mml:mtext>&#xA0;and&#xA0;</mml:mtext></mml:mrow><mml:mrow><mml:mtext>y</mml:mtext></mml:mrow><mml:mo>&#x2208;</mml:mo><mml:mrow><mml:mtext>Y</mml:mtext></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:math></disp-formula></p>
<p>This can be generalized to the <inline-formula id="ieqn-36"><mml:math id="mml-ieqn-36"><mml:mi>n</mml:mi></mml:math></inline-formula>-ary Cartesian product over <inline-formula id="ieqn-37"><mml:math id="mml-ieqn-37"><mml:mi>n</mml:mi></mml:math></inline-formula> sets <inline-formula id="ieqn-38"><mml:math id="mml-ieqn-38"><mml:msub><mml:mi>X</mml:mi><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mo>&#x22EF;</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> of different hyperparameters:</p>
<p><disp-formula id="eqn-7"><label>(7)</label><mml:math id="mml-eqn-7" display="block"><mml:msub><mml:mi>X</mml:mi><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>&#x00D7;</mml:mo><mml:mo>&#x22EF;</mml:mo><mml:mo>&#x00D7;</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mo>&#x22EF;</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x2208;</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mtext>&#xA0;for every&#xA0;</mml:mtext></mml:mrow><mml:mi>i</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mo>&#x22EF;</mml:mo><mml:mo>,</mml:mo><mml:mrow><mml:mtext>n</mml:mtext></mml:mrow><mml:mo>}</mml:mo></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:math></disp-formula></p>
<p>In our case, we arranged the Cartesian product set as a matrix for better readability, as described in <xref ref-type="disp-formula" rid="eqn-8">Eq. (8)</xref>:</p>
<p><disp-formula id="eqn-8"><label>(8)</label><mml:math id="mml-eqn-8" display="block"><mml:mi>X</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>Y</mml:mi><mml:mo>=</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mtable columnalign="center center center center" rowspacing="4pt" columnspacing="1em"><mml:mtr><mml:mtd><mml:mrow><mml:mo>(</mml:mo><mml:mn>7</mml:mn><mml:mo>,</mml:mo><mml:mn>0.01</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mo>(</mml:mo><mml:mn>7</mml:mn><mml:mo>,</mml:mo><mml:mn>0.001</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mo>(</mml:mo><mml:mn>7</mml:mn><mml:mo>,</mml:mo><mml:mn>0.0001</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mo>(</mml:mo><mml:mn>7</mml:mn><mml:mo>,</mml:mo><mml:mn>0.00001</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mrow><mml:mo>(</mml:mo><mml:mn>10</mml:mn><mml:mo>,</mml:mo><mml:mn>0.01</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mo>(</mml:mo><mml:mn>10</mml:mn><mml:mo>,</mml:mo><mml:mn>0.001</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mo>(</mml:mo><mml:mn>10</mml:mn><mml:mo>,</mml:mo><mml:mn>0.0001</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mo>(</mml:mo><mml:mn>10</mml:mn><mml:mo>,</mml:mo><mml:mn>0.00001</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mrow><mml:mo>(</mml:mo><mml:mn>32</mml:mn><mml:mo>,</mml:mo><mml:mn>0.01</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mo>(</mml:mo><mml:mn>32</mml:mn><mml:mo>,</mml:mo><mml:mn>0.001</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mo>(</mml:mo><mml:mn>32</mml:mn><mml:mo>,</mml:mo><mml:mn>0.0001</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mo>(</mml:mo><mml:mn>32</mml:mn><mml:mo>,</mml:mo><mml:mn>0.00001</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mrow><mml:mo>(</mml:mo><mml:mn>128</mml:mn><mml:mo>,</mml:mo><mml:mn>0.01</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mo>(</mml:mo><mml:mn>128</mml:mn><mml:mo>,</mml:mo><mml:mn>0.001</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mo>(</mml:mo><mml:mn>128</mml:mn><mml:mo>,</mml:mo><mml:mn>0.0001</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mo>(</mml:mo><mml:mn>128</mml:mn><mml:mo>,</mml:mo><mml:mn>0.00001</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable><mml:mo>]</mml:mo></mml:mrow></mml:math></disp-formula></p>
<p>Each element of the Cartesian product matrix was applied as an input pair of the two hyperparameters (batch size, learning rate) to retrain the modified network architecture of each pre-trained deep learning model on our dataset for the brain tumor classification task. Each modified network architecture was evaluated with the three most popular solvers: SGDM, ADAM, and RMSProp. An extensive comparative assessment was then conducted in terms of accuracy to obtain the optimal batch size and learning rate, along with the most appropriate solver.</p>
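<p>For illustration, this grid enumeration can be sketched in a few lines of Python (a minimal sketch only; the experiments in this study were run in MATLAB, as described in Section 4, and retrain_model below is a hypothetical placeholder, not part of our implementation):</p>
<preformat>
from itertools import product

batch_sizes = [7, 10, 32, 128]                   # X: candidate batch sizes
learning_rates = [0.01, 0.001, 0.0001, 0.00001]  # Y: candidate learning rates
solvers = ["sgdm", "adam", "rmsprop"]            # the three solvers compared

# X x Y from Eq. (8): sixteen (batch size, learning rate) pairs
grid = list(product(batch_sizes, learning_rates))

for solver in solvers:
    for batch_size, learning_rate in grid:
        # accuracy = retrain_model(solver, batch_size, learning_rate)
        print(solver, batch_size, learning_rate)
</preformat>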
</sec>
<sec id="s3_6">
<label>3.6</label>
<title>Performance Assessment</title>
<p>The performance assessment of the proposed framework was carried out using the same performance metrics used in related references [<xref ref-type="bibr" rid="ref-5">5</xref>,<xref ref-type="bibr" rid="ref-40">40</xref>,<xref ref-type="bibr" rid="ref-52">52</xref>,<xref ref-type="bibr" rid="ref-53">53</xref>,<xref ref-type="bibr" rid="ref-55">55</xref>,<xref ref-type="bibr" rid="ref-62">62</xref>,<xref ref-type="bibr" rid="ref-64">64</xref>,<xref ref-type="bibr" rid="ref-70">70</xref>]. A classifier can be assessed using four outcome counts: true positive (TP), where the model correctly predicts the positive class; true negative (TN), where the model correctly predicts the negative class; false positive (FP), where the model incorrectly predicts the positive class; and false negative (FN), where the model incorrectly predicts the negative class. These counts can be extracted from the confusion matrix to compute the performance metrics: accuracy, precision, specificity, and sensitivity (recall).</p>
<p><bold>Precision:</bold></p>
<p>Precision represents the ratio of correctly predicted positive instances to the total number of predicted positive instances. Mathematically,</p>
<p><disp-formula id="eqn-9"><label>(9)</label><mml:math id="mml-eqn-9" display="block"><mml:mrow><mml:mtext>Precision</mml:mtext></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi><mml:mo>+</mml:mo><mml:mi>F</mml:mi><mml:mi>P</mml:mi></mml:mrow></mml:mfrac></mml:math></disp-formula></p>
<p><bold>Specificity:</bold></p>
<p>Specificity measures the ratio of correctly predicted negative instances to all actual negative instances. Mathematically,</p>
<p><disp-formula id="eqn-10"><label>(10)</label><mml:math id="mml-eqn-10" display="block"><mml:mrow><mml:mtext>Specificity</mml:mtext></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>T</mml:mi><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi><mml:mi>N</mml:mi><mml:mo>+</mml:mo><mml:mi>F</mml:mi><mml:mi>P</mml:mi></mml:mrow></mml:mfrac></mml:math></disp-formula></p>
<p><bold>Sensitivity:</bold></p>
<p>Sensitivity measures the ratio of correctly predicted positive instances to all actual positive instances. Mathematically,</p>
<p><disp-formula id="eqn-11"><label>(11)</label><mml:math id="mml-eqn-11" display="block"><mml:mrow><mml:mtext>Sensitivity</mml:mtext></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mtext>Recall</mml:mtext></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi><mml:mo>+</mml:mo><mml:mi>F</mml:mi><mml:mi>N</mml:mi></mml:mrow></mml:mfrac></mml:math></disp-formula></p>
<p><bold>Accuracy:</bold></p>
<p>Accuracy measures the ratio of correctly classified instances to the total number of instances. Mathematically,</p>
<p><disp-formula id="eqn-12"><label>(12)</label><mml:math id="mml-eqn-12" display="block"><mml:mrow><mml:mtext>Accuracy</mml:mtext></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi><mml:mo>+</mml:mo><mml:mi>T</mml:mi><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi><mml:mo>+</mml:mo><mml:mi>T</mml:mi><mml:mi>N</mml:mi><mml:mo>+</mml:mo><mml:mi>F</mml:mi><mml:mi>P</mml:mi><mml:mo>+</mml:mo><mml:mi>F</mml:mi><mml:mi>N</mml:mi></mml:mrow></mml:mfrac></mml:math></disp-formula></p>
</sec>
</sec>
<sec id="s4">
<label>4</label>
<title>Results and Analysis of Framework</title>
<p>As discussed in <xref ref-type="table" rid="table-1">Tab. 1</xref>, many researchers have explored transfer learning techniques for brain tumor classification by utilizing the knowledge of pre-trained DL models, fine-tuning them with different hyperparameters. However, the literature review reveals that pre-trained models have not been deeply investigated for brain tumor classification, particularly with respect to multiple hyperparameters simultaneously. As discussed earlier, each hyperparameter affects the model&#x2019;s performance, depending on the targeted domain and task. To analyze the proposed framework, we performed an extensive comparative analysis to determine the optimal values of the hyperparameters (learning rate and batch size) by applying their different values, as 2-tuple inputs from a Cartesian product matrix, to 11 different deep learning models: AlexNet, GoogleNet, GoogleNet (Places365) [<xref ref-type="bibr" rid="ref-71">71</xref>], ResNet18, ResNet50, ResNet101, VGG16, VGG19, SqueezeNet [<xref ref-type="bibr" rid="ref-72">72</xref>], MobileNet [<xref ref-type="bibr" rid="ref-73">73</xref>], and InceptionV3 [<xref ref-type="bibr" rid="ref-74">74</xref>]. The proposed framework was implemented and investigated for brain tumor classification on a system equipped with an NVIDIA GeForce GTX 1080 (8 GB) GPU and MATLAB 2020. The dataset was divided into 70%, 15%, and 15% for training, validation, and testing of the model, respectively. After customizing the pre-trained deep learning models, a total of 435 experiments were performed, applying each input pair from the Cartesian product matrix of batch size and learning rate under each of the three most popular solvers. All the results in terms of accuracy are presented in <xref ref-type="fig" rid="fig-4">Fig. 4</xref>.</p>
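<p>A minimal Python sketch of the 70/15/15 split is shown below; our experiments used MATLAB, and the dummy labels and the use of stratification are assumptions of this illustration, since the framework prescribes only the split ratios:</p>
<preformat>
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(1000).reshape(-1, 1)                               # dummy samples
y = np.random.RandomState(0).choice(["G", "M", "P"], size=1000)  # dummy labels

# 70% train; the remaining 30% is split equally into validation and test
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.50, stratify=y_rest, random_state=0)
</preformat>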
<fig id="fig-4">
<label>Figure 4</label>
<caption>
<title>Performance evaluation in terms of accuracy for the pre-trained models using hyperparameter pairs</title>
</caption>
<graphic mimetype="image" mime-subtype="png" xlink:href="CMC_30698-fig-4.png"/>
</fig>
<p>To determine the optimal parameters, we first separated the maximum-accuracy result for each pre-trained model and summarized it in <xref ref-type="table" rid="table-4">Tab. 4</xref> together with other parameters observed during training: the number of epochs needed for convergence, the number of iterations, the validation accuracy, the training time, and the confusion matrix. To determine the optimal model for our brain tumor classification problem, all experimental results were carefully investigated. Starting with AlexNet, out of 48 combinations of {Solver type, Batch Size, Learning Rate}, our proposed framework performed best for the combination {SGDM, 32, 0.001}, with an accuracy of 97.6%. AlexNet also performed well for the combination {Adam, 128, 0.0001}, with an accuracy of 97.82%, but its training time was much longer, and the model had not converged when training was force-stopped after 100 epochs. The experimental results make clear that the training time for AlexNet is considerably shorter than that of the other pre-trained networks because of its sequential architecture. The results of GoogleNet (ImageNet), GoogleNet (Places365), and SqueezeNet are almost the same as those of AlexNet, even though they have more complex, DAG-based architectures. The modified VGG16, VGG19, MobileNet, and InceptionV3 models, when re-trained using our proposed framework, performed better than the models discussed earlier. All three ResNet variants, especially ResNet18, outperformed all other networks; ResNet18 achieved 99.56% accuracy with the parameters {SGDM, 32, 0.01}. We attribute this to ResNet&#x2019;s working principle of building a deeper network than the other architectures while simultaneously mitigating the vanishing gradient problem. <xref ref-type="fig" rid="fig-5">Figs. 5(a)</xref> and <xref ref-type="fig" rid="fig-5">5(b)</xref> depict the training-validation accuracy and loss curves and the confusion matrix obtained while training, validating, and testing ResNet18, the best-performing model. In addition to accuracy, the proposed framework was further evaluated using three other measures: precision, recall, and specificity. <xref ref-type="table" rid="table-5">Tab. 5</xref> summarizes these metrics for each class separately and averaged over all classes, for all of the deep learning networks presented in <xref ref-type="table" rid="table-4">Tab. 4</xref>. The comparison shows that ResNet18 outperforms all the others on every measure. The strategy of our proposed framework has thus proven quite effective, with an accuracy of 99.56%.</p>
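<p>The per-model selection step described above amounts to keeping the maximum-accuracy run for each network; runs that reached similar accuracy at a much higher training cost, such as AlexNet with {Adam, 128, 0.0001}, were then screened out manually as described above. A minimal sketch follows (the record fields and the three example rows, taken from <xref ref-type="table" rid="table-4">Tab. 4</xref>, are illustrative rather than the full set of 435 results):</p>
<preformat>
# Each record summarizes one experiment: model, solver, batch size,
# learning rate, and test accuracy (values from Tab. 4).
results = [
    {"model": "AlexNet",  "solver": "SGDM", "batch": 32, "lr": 0.001, "acc": 97.60},
    {"model": "ResNet18", "solver": "SGDM", "batch": 32, "lr": 0.01,  "acc": 99.56},
    {"model": "ResNet50", "solver": "SGDM", "batch": 7,  "lr": 0.001, "acc": 99.56},
]

best = {}
for r in results:
    if r["model"] not in best or r["acc"] > best[r["model"]]["acc"]:
        best[r["model"]] = r   # keep the maximum-accuracy run per model
</preformat>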
<table-wrap id="table-4">
<label>Table 4</label>
<caption>
<title>Comparative study of models with their optimal parameters</title>
</caption>
<table frame="hsides">
<colgroup>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
</colgroup>
<thead>
<tr>
<th>Pre-Trained Model</th>
<th colspan="2">Confusion Matrix</th>
<th colspan="3">Predicted<break/>Class</th>
<th>Solver</th>
<th>Batch Size</th>
<th>Learning<break/>Rate</th>
<th>Epoch</th>
<th>Itera<break/>tions</th>
<th>Validation Accuracy (%)</th>
<th>Testing Accuracy (%)</th>
<th>Training Time</th>
</tr>
<tr>
<th/>
<th/>
<th/>
<th>G</th>
<th>M</th>
<th>P</th>
<th/>
<th/>
<th/>
<th/>
<th/>
<th/>
<th/>
<th/>
</tr>
</thead>
<tbody>
<tr>
<td>AlexNet</td>
<td>True Class</td>
<td>G</td>
<td>210</td>
<td>3</td>
<td>1</td>
<td>SGDM</td>
<td>32</td>
<td>0.001</td>
<td>54</td>
<td>3600</td>
<td>97.17</td>
<td>97.6</td>
<td>0:14:44</td>
</tr>
<tr>
<td/>
<td/>
<td>M</td>
<td>2</td>
<td>101</td>
<td>3</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td/>
<td>P</td>
<td>0</td>
<td>2</td>
<td>137</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td>GoogleNet (ImageNet)</td>
<td>True Class</td>
<td>G</td>
<td>210</td>
<td>4</td>
<td>0</td>
<td>Adam</td>
<td>10</td>
<td>0.0001</td>
<td>16</td>
<td>3300</td>
<td>98.4</td>
<td>97.39</td>
<td>00:16:03</td>
</tr>
<tr>
<td/>
<td/>
<td>M</td>
<td>2</td>
<td>101</td>
<td>3</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td/>
<td>P</td>
<td>1</td>
<td>2</td>
<td>136</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td>GoogleNet (Places365)</td>
<td>True Class</td>
<td>G</td>
<td>210</td>
<td>4</td>
<td>0</td>
<td>SGDM</td>
<td>10</td>
<td>0.001</td>
<td>20</td>
<td>4200</td>
<td>98.26</td>
<td>97.17</td>
<td>00:14:42</td>
</tr>
<tr>
<td/>
<td/>
<td>M</td>
<td>6</td>
<td>99</td>
<td>1</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td/>
<td>P</td>
<td>1</td>
<td>1</td>
<td>137</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td>ResNet-50</td>
<td>True Class</td>
<td>G</td>
<td>213</td>
<td>1</td>
<td>0</td>
<td>SGDM</td>
<td>7</td>
<td>0.001</td>
<td>17</td>
<td>5100</td>
<td>98.26</td>
<td>99.56</td>
<td>0:24:46</td>
</tr>
<tr>
<td/>
<td/>
<td>M</td>
<td>1</td>
<td>105</td>
<td>0</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td/>
<td>P</td>
<td>0</td>
<td>0</td>
<td>139</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td>ResNet-101</td>
<td>True Class</td>
<td>G</td>
<td>213</td>
<td>1</td>
<td>0</td>
<td>SGDM</td>
<td>10</td>
<td>0.001</td>
<td>23</td>
<td>4800</td>
<td>98.26</td>
<td>99.35</td>
<td>0:51:04</td>
</tr>
<tr>
<td/>
<td/>
<td>M</td>
<td>1</td>
<td>105</td>
<td>0</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td/>
<td>P</td>
<td>1</td>
<td>0</td>
<td>138</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td>ResNet-18</td>
<td>True Class</td>
<td>G</td>
<td>213</td>
<td>1</td>
<td>0</td>
<td>SGDM</td>
<td>32</td>
<td>0.01</td>
<td>54</td>
<td>3600</td>
<td>98.48</td>
<td>99.56</td>
<td>0:19:25</td>
</tr>
<tr>
<td/>
<td/>
<td>M</td>
<td>0</td>
<td>105</td>
<td>1</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td/>
<td>P</td>
<td>0</td>
<td>0</td>
<td>139</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td>VGG16</td>
<td>True Class</td>
<td>G</td>
<td>214</td>
<td>0</td>
<td>0</td>
<td>SGDM</td>
<td>7</td>
<td>0.0001</td>
<td>11</td>
<td>3300</td>
<td>96.74</td>
<td>98.26</td>
<td>0:14:41</td>
</tr>
<tr>
<td/>
<td/>
<td>M</td>
<td>6</td>
<td>98</td>
<td>2</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td/>
<td>P</td>
<td>0</td>
<td>0</td>
<td>139</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td>VGG19</td>
<td>True Class</td>
<td>G</td>
<td>211</td>
<td>3</td>
<td>0</td>
<td>SGDM</td>
<td>7</td>
<td>0.0001</td>
<td>15</td>
<td>4500</td>
<td>97.17</td>
<td>98.69</td>
<td>0:21:24</td>
</tr>
<tr>
<td/>
<td/>
<td>M</td>
<td>1</td>
<td>105</td>
<td>0</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td/>
<td>P</td>
<td>0</td>
<td>2</td>
<td>137</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td>SqueezeNet</td>
<td>True Class</td>
<td>G</td>
<td>208</td>
<td>5</td>
<td>1</td>
<td>SGDM</td>
<td>32</td>
<td>0.001</td>
<td>36</td>
<td>2400</td>
<td>97.39</td>
<td>97.39</td>
<td>0:11:18</td>
</tr>
<tr>
<td/>
<td/>
<td>M</td>
<td>1</td>
<td>103</td>
<td>2</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td/>
<td>P</td>
<td>2</td>
<td>1</td>
<td>136</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td>MobileNet</td>
<td>True Class</td>
<td>G</td>
<td>213</td>
<td>1</td>
<td>0</td>
<td>SGDM</td>
<td>32</td>
<td>0.01</td>
<td>54</td>
<td>3600</td>
<td>97.61</td>
<td>98.91</td>
<td>0:43:46</td>
</tr>
<tr>
<td/>
<td/>
<td>M</td>
<td>1</td>
<td>103</td>
<td>2</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td/>
<td>P</td>
<td>0</td>
<td>1</td>
<td>138</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td>Inception V3</td>
<td>True Class</td>
<td>G</td>
<td>211</td>
<td>3</td>
<td>0</td>
<td>RMS-Prop</td>
<td>10</td>
<td>0.0001</td>
<td>20</td>
<td>4200</td>
<td>98.04</td>
<td>98.26</td>
<td>0:58:39</td>
</tr>
<tr>
<td/>
<td/>
<td>M</td>
<td>1</td>
<td>103</td>
<td>2</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td/>
<td>P</td>
<td>1</td>
<td>1</td>
<td>137</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
</tbody>
</table>
</table-wrap>
<fig id="fig-5">
<label>Figure 5</label>
<caption>
<title>(a) Training-Validation accuracy and loss for best performing model, (b) Confusion matrix for best performing model</title>
</caption>
<graphic mimetype="image" mime-subtype="png" xlink:href="CMC_30698-fig-5a.png"/><graphic mimetype="image" mime-subtype="png" xlink:href="CMC_30698-fig-5b.png"/>
</fig>
<table-wrap id="table-5">
<label>Table 5</label>
<caption>
<title>Comparative study of models in terms of performance metrics</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th>Fine-Tune Models</th>
<th>Precision Per Class</th>
<th>Average Precision</th>
<th>Sensitivity Per Class</th>
<th>Average Sensitivity</th>
<th>Specificity Per Class</th>
<th>Average Specificity</th>
</tr>
</thead>
<tbody>
<tr>
<td rowspan="3">AlexNet</td>
<td>99.06%</td>
<td rowspan="3">97.61%</td>
<td>98.13%</td>
<td rowspan="3">97.60%</td>
<td>99.18%</td>
<td rowspan="3">98.91%</td>
</tr>
<tr>
<td>95.28%</td>
<td>95.28%</td>
<td>98.58%</td>
</tr>
<tr>
<td>97.16%</td>
<td>98.56%</td>
<td>98.75%</td>
</tr>
<tr>
<td rowspan="3">GoogleNet (ImageNet)</td>
<td>98.11%</td>
<td rowspan="3">96.97%</td>
<td>97.20%</td>
<td rowspan="3">96.95%</td>
<td>98.37%</td>
<td rowspan="3">98.53%</td>
</tr>
<tr>
<td>92.59%</td>
<td>94.34%</td>
<td>97.73%</td>
</tr>
<tr>
<td>98.56%</td>
<td>98.56%</td>
<td>99.38%</td>
</tr>
<tr>
<td rowspan="3">GoogleNet (Places365)</td>
<td>96.77%</td>
<td rowspan="3">97.17%</td>
<td>98.13%</td>
<td rowspan="3">97.17%</td>
<td>97.14%</td>
<td rowspan="3">98.25%</td>
</tr>
<tr>
<td>95.19%</td>
<td>93.40%</td>
<td>98.58%</td>
</tr>
<tr>
<td>99.28%</td>
<td>98.56%</td>
<td>99.69%</td>
</tr>
<tr>
<td rowspan="3">ResNet50</td>
<td>99.53%</td>
<td rowspan="3">99.56%</td>
<td>99.53%</td>
<td rowspan="3">99.56%</td>
<td>99.59%</td>
<td rowspan="3">99.74%</td>
</tr>
<tr>
<td>99.06%</td>
<td>99.06%</td>
<td>99.72%</td>
</tr>
<tr>
<td>100.00%</td>
<td>100.00%</td>
<td>100.00%</td>
</tr>
<tr>
<td rowspan="3">ResNet101</td>
<td>99.07%</td>
<td rowspan="3">99.35%</td>
<td>99.53%</td>
<td rowspan="3">99.35%</td>
<td>99.18%</td>
<td rowspan="3">99.55%</td>
</tr>
<tr>
<td>99.06%</td>
<td>99.06%</td>
<td>99.72%</td>
</tr>
<tr>
<td>100.00%</td>
<td>99.28%</td>
<td>100.00%</td>
</tr>
<tr>
<td rowspan="3">ResNet18</td>
<td>100.00%</td>
<td rowspan="3">99.57%</td>
<td>99.53%</td>
<td rowspan="3">99.56%</td>
<td>100.00%</td>
<td rowspan="3">99.84%</td>
</tr>
<tr>
<td>99.06%</td>
<td>99.06%</td>
<td>99.72%</td>
</tr>
<tr>
<td>99.29%</td>
<td>100.00%</td>
<td>99.69%</td>
</tr>
<tr>
<td rowspan="3">VGG16</td>
<td>97.27%</td>
<td rowspan="3">98.30%</td>
<td>100.00%</td>
<td rowspan="3">98.26%</td>
<td>97.55%</td>
<td rowspan="3">98.67%</td>
</tr>
<tr>
<td>100.00%</td>
<td>92.45%</td>
<td>100.00%</td>
</tr>
<tr>
<td>98.58%</td>
<td>100.00%</td>
<td>99.38%</td>
</tr>
<tr>
<td rowspan="3">VGG19</td>
<td>99.53%</td>
<td rowspan="3">98.73%</td>
<td>98.60%</td>
<td rowspan="3">98.69%</td>
<td>99.59%</td>
<td rowspan="3">99.48%</td>
</tr>
<tr>
<td>95.45%</td>
<td>99.06%</td>
<td>98.58%</td>
</tr>
<tr>
<td>100.00%</td>
<td>98.56%</td>
<td>100.00%</td>
</tr>
<tr>
<td rowspan="3">SqueezeNet</td>
<td>98.58%</td>
<td rowspan="3">97.41%</td>
<td>97.20%</td>
<td rowspan="3">97.39%</td>
<td>98.78%</td>
<td rowspan="3">98.75%</td>
</tr>
<tr>
<td>94.50%</td>
<td>97.17%</td>
<td>98.30%</td>
</tr>
<tr>
<td>97.84%</td>
<td>97.84%</td>
<td>99.06%</td>
</tr>
<tr>
<td rowspan="3">MobileNet</td>
<td>99.53%</td>
<td rowspan="3">98.91%</td>
<td>99.53%</td>
<td rowspan="3">98.91%</td>
<td>99.59%</td>
<td rowspan="3">99.49%</td>
</tr>
<tr>
<td>98.10%</td>
<td>97.17%</td>
<td>99.43%</td>
</tr>
<tr>
<td>98.57%</td>
<td>99.28%</td>
<td>99.38%</td>
</tr>
<tr>
<td rowspan="3">InceptionV3</td>
<td>99.06%</td>
<td rowspan="3">98.26%</td>
<td>98.60%</td>
<td rowspan="3">98.26%</td>
<td>99.18%</td>
<td rowspan="3">99.17%</td>
</tr>
<tr>
<td>96.26%</td>
<td>97.17%</td>
<td>98.87%</td>
</tr>
<tr>
<td>98.56%</td>
<td>98.56%</td>
<td>99.38%</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>However, do we need an augmentation technique to increase the data size? To answer this question, the data size was increased using extensive data augmentation techniques, and the framework was re-evaluated, for each pre-trained model, only at the optimal hyperparameter values that had provided the highest accuracy without augmentation. <xref ref-type="table" rid="table-6">Tab. 6</xref> shows a minimum improvement of 0.09% for the GoogleNet (Places365) model and a maximum improvement of 0.95% for the SqueezeNet model, i.e., results quite similar to those obtained on the original (non-augmented) data. This suggests that the transfer learning technique, if implemented with a proper framework for the selection of optimal hyperparameters, may not require data augmentation.</p>
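<p>For reference, a minimal Python sketch of this kind of augmentation is given below; the specific operations (right-angle rotations and horizontal flips) are common MRI augmentations and are shown as an assumption-labeled illustration, not as the exact operations used in our pipeline:</p>
<preformat>
import numpy as np

rng = np.random.default_rng(0)

def augment(mri_slice):
    """Return a randomly rotated and possibly flipped copy of a 2-D slice."""
    rotated = np.rot90(mri_slice, k=int(rng.integers(0, 4)))
    if rng.random() > 0.5:
        rotated = np.fliplr(rotated)
    return rotated

mri_slice = rng.random((224, 224))                  # placeholder image
augmented = [augment(mri_slice) for _ in range(9)]  # expand 1 slice to 10
</preformat>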
<table-wrap id="table-6">
<label>Table 6</label>
<caption>
<title>Comparative study of models with their optimal parameters using augmented dataset</title>
</caption>
<table frame="hsides">
<colgroup>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
</colgroup>
<thead>
<tr>
<th>Pre-Trained Model</th>
<th colspan="2">Confusion Matrix</th>
<th colspan="3">Predicted<break/>Class</th>
<th>Solver</th>
<th>Batch Size</th>
<th>Learning<break/>Rate</th>
<th>Epoch</th>
<th>Iter<break/>ations</th>
<th>Validation<break/>Accuracy (%)</th>
<th>Testing Accuracy (%)</th>
<th>Training Time</th>
</tr>
<tr>
<th/>
<th/>
<th/>
<th>G</th>
<th>M</th>
<th>P</th>
<th/>
<th/>
<th/>
<th/>
<th/>
<th/>
<th/>
<th/>
</tr>
</thead>
<tbody>
<tr>
<td>AlexNet</td>
<td>True Class</td>
<td>G</td>
<td>6699</td>
<td>114</td>
<td>32</td>
<td>SGDM</td>
<td>32</td>
<td>0.001</td>
<td>4</td>
<td>8100</td>
<td>98.4</td>
<td>98.27</td>
<td>1:10:41</td>
</tr>
<tr>
<td/>
<td/>
<td>M</td>
<td>45</td>
<td>3316</td>
<td>37</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td/>
<td>P</td>
<td>3</td>
<td>24</td>
<td>4435</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td>Google Net (ImageNet)</td>
<td>True Class</td>
<td>G</td>
<td>6664</td>
<td>142</td>
<td>39</td>
<td>Adam</td>
<td>10</td>
<td>0.0001</td>
<td>1</td>
<td>5100</td>
<td>97.6</td>
<td>97.97</td>
<td>1:04:27</td>
</tr>
<tr>
<td/>
<td/>
<td>M</td>
<td>28</td>
<td>3320</td>
<td>50</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td/>
<td>P</td>
<td>10</td>
<td>30</td>
<td>4422</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td>GoogleNet (Places365)</td>
<td>True Class</td>
<td>G</td>
<td>6820</td>
<td>19</td>
<td>6</td>
<td>SGDM</td>
<td>10</td>
<td>0.001</td>
<td>2</td>
<td>6900</td>
<td>97.61</td>
<td>97.26</td>
<td>1:12:04</td>
</tr>
<tr>
<td/>
<td/>
<td>M</td>
<td>238</td>
<td>3097</td>
<td>63</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td/>
<td>P</td>
<td>63</td>
<td>14</td>
<td>4385</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td>ResNet-50</td>
<td>True Class</td>
<td>G</td>
<td>6748</td>
<td>78</td>
<td>19</td>
<td>SGDM</td>
<td>7</td>
<td>0.001</td>
<td>01</td>
<td>7800</td>
<td>98.8</td>
<td>98.87</td>
<td>01:50:01</td>
</tr>
<tr>
<td/>
<td/>
<td>M</td>
<td>19</td>
<td>3371</td>
<td>8</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td/>
<td>P</td>
<td>7</td>
<td>35</td>
<td>4420</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td>ResNet-101</td>
<td>True Class</td>
<td>G</td>
<td>6831</td>
<td>11</td>
<td>3</td>
<td>SGDM</td>
<td>10</td>
<td>0.001</td>
<td>2</td>
<td>9000</td>
<td>99.15</td>
<td>99.20</td>
<td>03:50:23</td>
</tr>
<tr>
<td/>
<td/>
<td>M</td>
<td>58</td>
<td>3305</td>
<td>35</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td/>
<td>P</td>
<td>4</td>
<td>6</td>
<td>4452</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td>ResNet-18</td>
<td>True Class</td>
<td>G</td>
<td>6806</td>
<td>35</td>
<td>4</td>
<td>SGDM</td>
<td>32</td>
<td>0.01</td>
<td>4</td>
<td>7200</td>
<td>99.23</td>
<td>99.23</td>
<td>1:29:40</td>
</tr>
<tr>
<td/>
<td/>
<td>M</td>
<td>30</td>
<td>3363</td>
<td>5</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td/>
<td>P</td>
<td>8</td>
<td>31</td>
<td>4423</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td>SqueezeNet</td>
<td>True Class</td>
<td>G</td>
<td>6822</td>
<td>18</td>
<td>5</td>
<td>SGDM</td>
<td>32</td>
<td>0.001</td>
<td>4</td>
<td>7200</td>
<td>98.33</td>
<td>98.34</td>
<td>1:29:59</td>
</tr>
<tr>
<td/>
<td/>
<td>M</td>
<td>153</td>
<td>3224</td>
<td>21</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td/>
<td>P</td>
<td>23</td>
<td>24</td>
<td>4415</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td>MobileNet</td>
<td>True Class</td>
<td>G</td>
<td>6825</td>
<td>18</td>
<td>2</td>
<td>SGDM</td>
<td>32</td>
<td>0.01</td>
<td>3</td>
<td>6000</td>
<td>99.21</td>
<td>99.31</td>
<td>1:56:17</td>
</tr>
<tr>
<td/>
<td/>
<td>M</td>
<td>42</td>
<td>3341</td>
<td>15</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td/>
<td>P</td>
<td>9</td>
<td>16</td>
<td>4437</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td>Inception V3</td>
<td>True Class</td>
<td>G</td>
<td>6838</td>
<td>4</td>
<td>3</td>
<td>RMS-<break/>Prop</td>
<td>10</td>
<td>0.0001</td>
<td>2</td>
<td>7500</td>
<td>99.01</td>
<td>98.92</td>
<td>0:58:39</td>
</tr>
<tr>
<td/>
<td/>
<td>M</td>
<td>73</td>
<td>3283</td>
<td>42</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td/>
<td>P</td>
<td>24</td>
<td>13</td>
<td>4425</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
</tbody>
</table>
</table-wrap>
<p>A performance comparison between our work and other existing state-of-the-art studies that used the same brain tumor dataset for multi-type tumor classification is presented in <xref ref-type="table" rid="table-7">Tab. 7</xref>. The comparison is based mainly on the performance metric &#x201C;accuracy,&#x201D; supported by three other metrics: &#x201C;precision,&#x201D; &#x201C;recall,&#x201D; and &#x201C;specificity.&#x201D; It shows that the transfer learning technique, implemented through our proposed framework for brain tumor classification, outperformed all existing approaches based on traditional image processing [<xref ref-type="bibr" rid="ref-5">5</xref>,<xref ref-type="bibr" rid="ref-40">40</xref>], CNNs [<xref ref-type="bibr" rid="ref-44">44</xref>,<xref ref-type="bibr" rid="ref-70">70</xref>], and transfer learning [<xref ref-type="bibr" rid="ref-52">52</xref>,<xref ref-type="bibr" rid="ref-53">53</xref>,<xref ref-type="bibr" rid="ref-55">55</xref>,<xref ref-type="bibr" rid="ref-60">60</xref>,<xref ref-type="bibr" rid="ref-62">62</xref>,<xref ref-type="bibr" rid="ref-64">64</xref>,<xref ref-type="bibr" rid="ref-69">69</xref>].</p>
<table-wrap id="table-7">
<label>Table 7</label>
<caption>
<title>Comparison of proposed framework with the related work based on same dataset</title>
</caption>
<table frame="hsides">
<colgroup>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
</colgroup>
<thead>
<tr>
<th rowspan="2">Related Work</th>
<th rowspan="2">Approach</th>
<th rowspan="2">Accuracy</th>
<th align="center" colspan="4">Precision</th>
<th colspan="4">Recall</th>
<th colspan="4">Specificity</th>
</tr>
<tr>
<th>G</th>
<th>M</th>
<th>P</th>
<th>Average</th>
<th>G</th>
<th>M</th>
<th>P</th>
<th>Average</th>
<th>G</th>
<th>M</th>
<th>P</th>
<th>Average</th>
</tr>
</thead>
<tbody>
<tr>
<td>[<xref ref-type="bibr" rid="ref-5">5</xref>]</td>
<td>BoW-SVM</td>
<td>91.28</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>96.4</td>
<td>86</td>
<td>87.3</td>
<td>&#x2013;</td>
<td>96.3</td>
<td>95.5</td>
<td>95.3</td>
<td>&#x2013;</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-40">40</xref>]</td>
<td>DWT-Gabor-NN</td>
<td>91.90</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>95.1</td>
<td>86.9</td>
<td>91.2</td>
<td>&#x2013;</td>
<td>96.3</td>
<td>96</td>
<td>95.7</td>
<td>&#x2013;</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-44">44</xref>]</td>
<td>CapsNet</td>
<td>90.89</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-70">70</xref>]</td>
<td>CNN-ELM</td>
<td>93.68</td>
<td>91</td>
<td>94.5</td>
<td>98.3</td>
<td>&#x2013;</td>
<td>97.5</td>
<td>76.8</td>
<td>100</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-52">52</xref>]</td>
<td>fine-tuned VGG19</td>
<td>94.58</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>88.41</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>96.12</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-53">53</xref>]</td>
<td>fine-tuned VGG19</td>
<td>94.82</td>
<td>93</td>
<td>87.97</td>
<td>87.34</td>
<td>89.52</td>
<td>95.97</td>
<td>89.98</td>
<td>96.81</td>
<td>94.25</td>
<td>93.79</td>
<td>96.42</td>
<td>93.93</td>
<td>94.69</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-55">55</xref>]</td>
<td>GoogleNet-SVM</td>
<td>97.10</td>
<td>99</td>
<td>94.7</td>
<td>98</td>
<td>&#x2013;</td>
<td>97.9</td>
<td>96</td>
<td>98.9</td>
<td>&#x2013;</td>
<td>99.4</td>
<td>98.4</td>
<td>99.1</td>
<td>&#x2013;</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-69">69</xref>]</td>
<td>fine-tuned VGG16</td>
<td>98.69</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-60">60</xref>]</td>
<td>VGGNet</td>
<td>94.00</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-62">62</xref>]</td>
<td>DenseNet</td>
<td>99.51</td>
<td>99</td>
<td>99</td>
<td>100</td>
<td>&#x2013;</td>
<td>100</td>
<td>99</td>
<td>99</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
<td>&#x2013;</td>
</tr>
<tr>
<td>[<xref ref-type="bibr" rid="ref-64">64</xref>]</td>
<td>GoogleNet-KNN</td>
<td>98.30</td>
<td>98</td>
<td>95.55</td>
<td>97.78</td>
<td>&#x2013;</td>
<td>98.02</td>
<td>94.57</td>
<td>99.1</td>
<td>&#x2013;</td>
<td>98.63</td>
<td>98.65</td>
<td>99.01</td>
<td>&#x2013;</td>
</tr>
<tr>
<td>Proposed</td>
<td>ResNet18</td>
<td>99.56</td>
<td>100</td>
<td>99.06</td>
<td>99.29</td>
<td>99.45</td>
<td>99.53</td>
<td>99.06</td>
<td>100</td>
<td>99.53</td>
<td>100</td>
<td>99.72</td>
<td>99.69</td>
<td>99.8</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="s5">
<label>5</label>
<title>Conclusion and Future Work</title>
<p>This research presents a comprehensive literature review, along with a robust framework for implementing the transfer learning technique. The review reveals the need for a systematic way to select an appropriate pre-trained deep learning model and optimal hyperparameter values for such an implementation. Our proposed framework not only solves the model selection issue but also helps determine the optimal hyperparameter values. To identify the appropriate pre-trained deep learning model, 11 state-of-the-art pre-trained models were used. A Cartesian product matrix was created to obtain all possible pairs from the initialized sets of hyperparameters (batch size and learning rate). To evaluate the performance of the proposed framework, all pairs were applied one-by-one as inputs to each pre-trained deep learning model, which was re-trained on our brain tumor dataset under each of the three most popular solvers. The simulation work for the framework&#x2019;s assessment reveals that the transfer learning technique is quite effective even with a small, imbalanced dataset, and that augmented data may not be needed if the technique is implemented within a proper framework with an appropriate selection of hyperparameters and solvers. Further, the results reveal a tradeoff between batch size and learning rate, although it depends on the model&#x2019;s architecture type and complexity. The assessment shows that the proposed framework can effectively assist radiologists and physicians in classifying diverse tumor types, and it can also be applied to other classification problems.</p>
<p>In the future, this work can be broadened by increasing the dimensions of the Cartesian product matrix to obtain optimal values for additional hyperparameters. Further, in-depth investigation is required for the few pre-trained DL models that failed to retrain on our dataset for selected pairs of the Cartesian product matrix.</p>
</sec>
</body>
<back>
<fn-group>
<fn fn-type="other"><p><bold>Funding Statement:</bold> The authors received no specific funding for this study.</p>
</fn>
<fn fn-type="conflict"><p><bold>Conflicts of Interest:</bold> The authors declare that they have no conflicts of interest to report regarding the present study.</p>
</fn>
</fn-group>
<ref-list content-type="authoryear">
<title>References</title>
<ref id="ref-1"><label>[1]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>K.</given-names> <surname>Selvanayaki</surname></string-name> and <string-name><given-names>M.</given-names> <surname>Karnan</surname></string-name></person-group>, &#x201C;<article-title>CAD system for automatic detection of brain tumor through magnetic resonance image&#x2014;A review</article-title>,&#x201D; <source>International Journal of Engineering Science and Technology</source>, vol. <volume>2</volume>, no. <issue>10</issue>, pp. <fpage>5890</fpage>&#x2013;<lpage>5901</lpage>, <year>2010</year>.</mixed-citation></ref>
<ref id="ref-2"><label>[2]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>K. M.</given-names> <surname>Brindle</surname></string-name>, <string-name><given-names>J. L.</given-names> <surname>Izquierdo-Garcia</surname></string-name>, <string-name><given-names>D. Y.</given-names> <surname>Lewis</surname></string-name>, <string-name><given-names>R. J.</given-names> <surname>Mair</surname></string-name> and <string-name><given-names>A. J.</given-names> <surname>Wright</surname></string-name></person-group>, &#x201C;<article-title>Brain tumor imaging</article-title>,&#x201D; <source>Journal of Clinical Oncology</source>, vol. <volume>35</volume>, no. <issue>21</issue>, pp. <fpage>2432</fpage>&#x2013;<lpage>2438</lpage>, <year>2017</year>.</mixed-citation></ref>
<ref id="ref-3"><label>[3]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>P. Y.</given-names> <surname>Wen</surname></string-name>, <string-name><given-names>D. R.</given-names> <surname>Macdonald</surname></string-name>, <string-name><given-names>D. A.</given-names> <surname>Reardon</surname></string-name>, <string-name><given-names>T. F.</given-names> <surname>Cloughesy</surname></string-name>, <string-name><given-names>A. G.</given-names> <surname>Sorensen</surname></string-name> <etal>et al.</etal></person-group><italic>,</italic> &#x201C;<article-title>Updated response assessment criteria for high-grade gliomas: Response assessment in neuro-oncology working group</article-title>,&#x201D; <source>Journal of Clinical Oncology</source>, vol. <volume>28</volume>, no. <issue>11</issue>, pp. <fpage>1963</fpage>&#x2013;<lpage>1972</lpage>, <year>2010</year>.</mixed-citation></ref>
<ref id="ref-4"><label>[4]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><given-names>A.</given-names> <surname>Drevelegas</surname></string-name> and <string-name><given-names>N.</given-names> <surname>Papanikolaou</surname></string-name></person-group>, &#x201C;<chapter-title>Imaging modalities in brain tumors</chapter-title>,&#x201D; in <source>Imaging of Brain Tumors with Histological Correlations</source>, <edition>2<sup><roman>nd</roman></sup></edition> ed., <publisher-loc>Berlin Heidelberg</publisher-loc>: <publisher-name>Springer</publisher-name>, pp. <fpage>13</fpage>&#x2013;<lpage>33</lpage>, <year>2011</year>.</mixed-citation></ref>
<ref id="ref-5"><label>[5]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>J.</given-names> <surname>Cheng</surname></string-name>, <string-name><given-names>W.</given-names> <surname>Huang</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Cao</surname></string-name>, <string-name><given-names>R.</given-names> <surname>Yang</surname></string-name>, <string-name><given-names>W.</given-names> <surname>Yang</surname></string-name> <etal>et al.</etal></person-group><italic>,</italic> &#x201C;<article-title>Enhanced performance of brain tumor classification via tumor region augmentation and partition</article-title>,&#x201D; <source>PloS One</source>, vol. <volume>10</volume>, no. <issue>12</issue>, pp. <fpage>e0144479</fpage>, <year>2015</year>.</mixed-citation></ref>
<ref id="ref-6"><label>[6]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>J.</given-names> <surname>Cheng</surname></string-name>, <string-name><given-names>W.</given-names> <surname>Yang</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Huang</surname></string-name>, <string-name><given-names>W.</given-names> <surname>Huang</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Jiang</surname></string-name> <etal>et al.</etal></person-group><italic>,</italic> &#x201C;<article-title>Retrieval of brain tumors by adaptive spatial pooling and fisher vector representation</article-title>,&#x201D; <source>PloS One</source>, vol. <volume>11</volume>, no. <issue>6</issue>, pp. <fpage>e0157112</fpage>, <year>2016</year>.</mixed-citation></ref>
<ref id="ref-7"><label>[7]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>S.</given-names> <surname>Kumar</surname></string-name>, <string-name><given-names>C.</given-names> <surname>Dabas</surname></string-name> and <string-name><given-names>S.</given-names> <surname>Godara</surname></string-name></person-group>, &#x201C;<article-title>Classification of brain MRI tumor images: A hybrid approach</article-title>,&#x201D; <source>Procedia Computer Science</source>, vol. <volume>122</volume>, pp. <fpage>510</fpage>&#x2013;<lpage>517</lpage>, <year>2017</year>.</mixed-citation></ref>
<ref id="ref-8"><label>[8]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>G.</given-names> <surname>Mohan</surname></string-name> and <string-name><given-names>M. M.</given-names> <surname>Subashini</surname></string-name></person-group>, &#x201C;<article-title>MRI based medical image analysis: Survey on brain tumor grade classification</article-title>,&#x201D; <source>Biomedical Signal Processing and Control</source>, vol. <volume>39</volume>, pp. <fpage>139</fpage>&#x2013;<lpage>161</lpage>, <year>2018</year>.</mixed-citation></ref>
<ref id="ref-9"><label>[9]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>I.</given-names> <surname>Sutskever</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Martens</surname></string-name> and <string-name><given-names>G. E.</given-names> <surname>Hinton</surname></string-name></person-group>, &#x201C;<article-title>Generating text with recurrent neural networks</article-title>,&#x201D; in <conf-name>Proc. ICML</conf-name>, <publisher-loc>Bellevue, WA, USA</publisher-loc>, <year>2011</year>. </mixed-citation></ref>
<ref id="ref-10"><label>[10]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>R.</given-names> <surname>Collobert</surname></string-name> and <string-name><given-names>J.</given-names> <surname>Weston</surname></string-name></person-group>, &#x201C;<article-title>A unified architecture for natural language processing: Deep neural networks with multitask learning</article-title>,&#x201D; in <conf-name>Proc. 25th Int. Conf. on Machine learning</conf-name>, <publisher-loc>Helsinki, Finland</publisher-loc>, pp. <fpage>160</fpage>&#x2013;<lpage>167</lpage>, <year>2008</year>. </mixed-citation></ref>
<ref id="ref-11"><label>[11]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><given-names>N.</given-names> <surname>Jaitly</surname></string-name> and <string-name><given-names>G. E.</given-names> <surname>Hinton</surname></string-name></person-group>, &#x201C;<chapter-title>Vocal tract length perturbation (VTLP) improves speech recognition</chapter-title>,&#x201D; in <source>Proc. ICML Workshop on Deep Learning for Audio, Speech and Language</source>, vol. <volume>117</volume>, <publisher-loc>Atlanta, Georgia, USA</publisher-loc>, pp. <fpage>21</fpage>, <year>2013</year>.</mixed-citation></ref>
<ref id="ref-12"><label>[12]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>Y.</given-names> <surname>Taigman</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Yang</surname></string-name>, <string-name><given-names>M. A.</given-names> <surname>Ranzato</surname></string-name> and <string-name><given-names>L.</given-names> <surname>Wolf</surname></string-name></person-group>, &#x201C;<article-title>Deepface: Closing the gap to human-level performance in face verification</article-title>,&#x201D; in <conf-name>Proc. IEEE Computer Vision and Pattern Recognition (CVPR)</conf-name>, <publisher-loc>Columbus, OH, USA</publisher-loc>, pp. <fpage>1701</fpage>&#x2013;<lpage>1708</lpage>, <year>2014</year>. </mixed-citation></ref>
<ref id="ref-13"><label>[13]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><given-names>C.</given-names> <surname>Szegedy</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Toshev</surname></string-name> and <string-name><given-names>D.</given-names> <surname>Erhan</surname></string-name></person-group>, &#x201C;<chapter-title>Deep neural networks for object detection</chapter-title>,&#x201D; in <source>Advances in Neural Information Processing Systems</source>. Vol. <volume>26</volume>, pp. <fpage>2553</fpage>&#x2013;<lpage>2561</lpage>, <year>2013</year>.</mixed-citation></ref>
<ref id="ref-14"><label>[14]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>A.</given-names> <surname>Karpathy</surname></string-name> and <string-name><given-names>L.</given-names> <surname>Fei-Fei</surname></string-name></person-group>, &#x201C;<article-title>Deep visual-semantic alignments for generating image descriptions</article-title>,&#x201D; in <conf-name>Proc. IEEE Computer Vision and Pattern Recognition</conf-name>, <conf-loc>Boston, MA, USA</conf-loc>, pp. <fpage>3128</fpage>&#x2013;<lpage>3137</lpage>, <year>2015</year>. </mixed-citation></ref>
<ref id="ref-15"><label>[15]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>J.</given-names> <surname>Zhang</surname></string-name> and <string-name><given-names>C.</given-names> <surname>Zong</surname></string-name></person-group>, &#x201C;<article-title>Deep neural networks in machine translation: An overview</article-title>,&#x201D; <source>IEEE Intelligent Systems</source>, vol. <volume>30</volume>, no. <issue>5</issue>, pp. <fpage>16</fpage>&#x2013;<lpage>25</lpage>, <year>2015</year>.</mixed-citation></ref>
<ref id="ref-16"><label>[16]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>D.</given-names> <surname>Silver</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Huang</surname></string-name>, <string-name><given-names>C. J.</given-names> <surname>Maddison</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Guez</surname></string-name>, <string-name><given-names>L.</given-names> <surname>Sifre</surname></string-name> <etal>et al.</etal></person-group><italic>,</italic> &#x201C;<article-title>Mastering the game of Go with deep neural networks and tree search</article-title>,&#x201D; <source>Nature</source>, vol. <volume>529</volume>, no. <issue>7587</issue>, pp. <fpage>484</fpage>&#x2013;<lpage>489</lpage>, <year>2016</year>.</mixed-citation></ref>
<ref id="ref-17"><label>[17]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>J.</given-names> <surname>Kleesiek</surname></string-name>, <string-name><given-names>G.</given-names> <surname>Urban</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Hubert</surname></string-name>, <string-name><given-names>D.</given-names> <surname>Schwarz</surname></string-name>, <string-name><given-names>K.</given-names> <surname>Maier-Hein</surname></string-name> <etal>et al.</etal></person-group><italic>,</italic> &#x201C;<article-title>Deep MRI brain extraction: A 3D convolutional neural network for skull stripping</article-title>,&#x201D; <source>NeuroImage</source>, vol. <volume>129</volume>, no. <issue>869&#x2013;877</issue>, pp. <fpage>460</fpage>&#x2013;<lpage>469</lpage>, <year>2016</year>.</mixed-citation></ref>
<ref id="ref-18"><label>[18]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>N.</given-names> <surname>Tajbakhsh</surname></string-name>, <string-name><given-names>J. Y.</given-names> <surname>Shin</surname></string-name>, <string-name><given-names>S. R.</given-names> <surname>Gurudu</surname></string-name>, <string-name><given-names>R. T.</given-names> <surname>Hurst</surname></string-name>, <string-name><given-names>C. B.</given-names> <surname>Kendall</surname></string-name> <etal>et al.</etal></person-group><italic>,</italic> &#x201C;<article-title>Convolutional neural networks for medical image analysis: Full training or fine tuning?</article-title>,&#x201D; <source>IEEE Transactions on Medical Imaging</source>, vol. <volume>35</volume>, no. <issue>5</issue>, pp. <fpage>1299</fpage>&#x2013;<lpage>1312</lpage>, <year>2016</year>.</mixed-citation></ref>
<ref id="ref-19"><label>[19]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>A.</given-names> <surname>Krizhevsky</surname></string-name>, <string-name><given-names>I.</given-names> <surname>Sutskever</surname></string-name> and <string-name><given-names>G. E.</given-names> <surname>Hinton</surname></string-name></person-group>, &#x201C;<article-title>Imagenet classification with deep convolutional neural networks</article-title>,&#x201D; <source>Communications of the ACM</source>, vol. <volume>60</volume>, no. <issue>6</issue>, pp. <fpage>84</fpage>&#x2013;<lpage>90</lpage>, <year>2017</year>.</mixed-citation></ref>
<ref id="ref-20"><label>[20]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>O.</given-names> <surname>Russakovsky</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Deng</surname></string-name>, <string-name><given-names>H.</given-names> <surname>Su</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Krause</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Satheesh</surname></string-name> <etal>et al.</etal></person-group><italic>,</italic> &#x201C;<article-title>ImageNet large scale visual recognition challenge</article-title>,&#x201D; <source>International Journal of Computer Vision</source>, vol. <volume>115</volume>, no. <issue>3</issue>, pp. <fpage>211</fpage>&#x2013;<lpage>252</lpage>, <year>2015</year>.</mixed-citation></ref>
<ref id="ref-21"><label>[21]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>K.</given-names> <surname>Simonyan</surname></string-name> and <string-name><given-names>A.</given-names> <surname>Zisserman</surname></string-name></person-group>, &#x201C;<article-title>Very deep convolutional networks for large-scale image recognition</article-title>,&#x201D; in <conf-name>Int. Conf. on Learning Representations (ICLR)</conf-name>, <year>2015</year><comment>. https://arxiv.org/abs/1409.1556</comment></mixed-citation></ref>
<ref id="ref-22"><label>[22]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>M.</given-names> <surname>Everingham</surname></string-name> and <string-name><given-names>J.</given-names> <surname>Winn</surname></string-name></person-group>, &#x201C;<article-title>The pascal visual object classes challenge 2012 (voc2012) development kit</article-title>,&#x201D; <source>Pattern Analysis, Statistical Modelling and Computational Learning</source>, vol. <volume>8</volume>, pp. <fpage>5</fpage>, <year>2011</year>.</mixed-citation></ref>
<ref id="ref-23"><label>[23]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><given-names>L.</given-names> <surname>Roux</surname></string-name></person-group>, &#x201C;<article-title>Mitosis Atypia 14 Grand Challenge</article-title>,&#x201D; <comment>2014 [Online]</comment>. <italic>Available:</italic> <uri>https://mitos-atypia-14.grandcha</uri>.</mixed-citation></ref>
<ref id="ref-24"><label>[24]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>G. E.</given-names> <surname>Hinton</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Osindero</surname></string-name> and <string-name><given-names>Y.</given-names> <surname>Teh</surname></string-name></person-group>, &#x201C;<article-title>A fast learning algorithm for deep belief nets</article-title>,&#x201D; <source>Neural Computation</source>, vol. <volume>18</volume>, no. <issue>7</issue>, pp. <fpage>1527</fpage>&#x2013;<lpage>1554</lpage>, <year>2006</year>.</mixed-citation></ref>
<ref id="ref-25"><label>[25]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>G. E.</given-names> <surname>Hinton</surname></string-name> and <string-name><given-names>R. R.</given-names> <surname>Salakhutdinov</surname></string-name></person-group>, &#x201C;<article-title>Reducing the dimensionality of data with neural networks</article-title>,&#x201D; <source>Science</source>, vol. <volume>313</volume>, no. <issue>5786</issue>, pp. <fpage>504</fpage>&#x2013;<lpage>507</lpage>, <year>2006</year>.</mixed-citation></ref>
<ref id="ref-26"><label>[26]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>S.</given-names> <surname>Ioffe</surname></string-name> and <string-name><given-names>C.</given-names> <surname>Szegedy</surname></string-name></person-group>, &#x201C;<article-title>Batch normalization: Accelerating deep network training by reducing internal covariate shift</article-title>,&#x201D; in <conf-name>Proc. Int. Conf. on Machine Learning</conf-name>, <publisher-loc>Lille, France</publisher-loc>, <year>2015</year>. </mixed-citation></ref>
<ref id="ref-27"><label>[27]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>V.</given-names> <surname>Nair</surname></string-name> and <string-name><given-names>G. E.</given-names> <surname>Hinton</surname></string-name></person-group>, &#x201C;<article-title>Rectified linear units improve restricted Boltzmann machines</article-title>,&#x201D; in <conf-name>Proc. ICML</conf-name>, <publisher-loc>Haifa, Israel</publisher-loc>, <year>2010</year>. </mixed-citation></ref>
<ref id="ref-28"><label>[28]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>N.</given-names> <surname>Srivastava</surname></string-name>, <string-name><given-names>G.</given-names> <surname>Hinton</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Krizhevsky</surname></string-name>, <string-name><given-names>I.</given-names> <surname>Sutskever</surname></string-name> and <string-name><given-names>R.</given-names> <surname>Salakhutdinov</surname></string-name></person-group>, &#x201C;<article-title>Dropout: A simple way to prevent neural networks from overfitting</article-title>,&#x201D; <source>The Journal of Machine Learning Research</source>, vol. <volume>15</volume>, no. <issue>1</issue>, pp. <fpage>1929</fpage>&#x2013;<lpage>1958</lpage>, <year>2014</year>.</mixed-citation></ref>
<ref id="ref-29"><label>[29]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>T.</given-names> <surname>Ateeq</surname></string-name>, <string-name><given-names>M. N.</given-names> <surname>Majeed</surname></string-name>, <string-name><given-names>S. M.</given-names> <surname>Anwar</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Maqsood</surname></string-name>, <string-name><given-names>Z.</given-names> <surname>Rehman</surname></string-name> <etal>et al.</etal></person-group><italic>,</italic> &#x201C;<article-title>Ensemble-classifiers-assisted detection of cerebral microbleeds in brain MRI</article-title>,&#x201D; <source>Computers &#x0026; Electrical Engineering</source>, vol. <volume>69</volume>, pp. <fpage>768</fpage>&#x2013;<lpage>781</lpage>, <year>2018</year>.</mixed-citation></ref>
<ref id="ref-30"><label>[30]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>M.</given-names> <surname>Havaei</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Davy</surname></string-name>, <string-name><given-names>D.</given-names> <surname>Farley</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Biard</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Courville</surname></string-name> <etal>et al.</etal></person-group><italic>,</italic> &#x201C;<article-title>Brain tumor segmentation with deep neural networks</article-title>,&#x201D; <source>Medical Image Analysis</source>, vol. <volume>35</volume>, no. <issue>4</issue>, pp. <fpage>18</fpage>&#x2013;<lpage>31</lpage>, <year>2017</year>.</mixed-citation></ref>
<ref id="ref-31"><label>[31]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>B. H.</given-names> <surname>Menze</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Jakab</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Bauer</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Cramer</surname></string-name>, <string-name><given-names>K.</given-names> <surname>Farahni</surname></string-name> <etal>et al.</etal></person-group><italic>,</italic> &#x201C;<article-title>The multimodal brain tumor image segmentation benchmark (BraTS)</article-title>,&#x201D; <source>IEEE Transactions on Medical Imaging</source>, vol. <volume>34</volume>, no. <issue>10</issue>, pp. <fpage>1993</fpage>&#x2013;<lpage>2024</lpage>, <year>2014</year>.</mixed-citation></ref>
<ref id="ref-32"><label>[32]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>M.</given-names> <surname>Prastawa</surname></string-name>, <string-name><given-names>E.</given-names> <surname>Bullitt</surname></string-name>, <string-name><given-names>N.</given-names> <surname>Moon</surname></string-name>, <string-name><given-names>K. V.</given-names> <surname>Leemput</surname></string-name> and <string-name><given-names>G.</given-names> <surname>Gerig</surname></string-name></person-group>, &#x201C;<article-title>Automatic brain tumor segmentation by subject specific modification of atlas priors1</article-title>,&#x201D; <source>Academic Radiology</source>, vol. <volume>10</volume>, no. <issue>12</issue>, pp. <fpage>1341</fpage>&#x2013;<lpage>1348</lpage>, <year>2003</year>.</mixed-citation></ref>
<ref id="ref-33"><label>[33]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>P.</given-names> <surname>Abdolmaleki</surname></string-name>, <string-name><given-names>F.</given-names> <surname>Mihara</surname></string-name>, <string-name><given-names>K.</given-names> <surname>Masuda</surname></string-name> and <string-name><given-names>L. D.</given-names> <surname>Buadu</surname></string-name></person-group>, &#x201C;<article-title>Neural networks analysis of astrocytic gliomas from MRI appearances</article-title>,&#x201D; <source>Cancer Letters</source>, vol. <volume>118</volume>, no. <issue>1</issue>, pp. <fpage>69</fpage>&#x2013;<lpage>78</lpage>, <year>1997</year>.</mixed-citation></ref>
<ref id="ref-34"><label>[34]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>A.</given-names> <surname>Kharrat</surname></string-name>, <string-name><given-names>K.</given-names> <surname>Gasmi</surname></string-name>, <string-name><given-names>M. B.</given-names> <surname>Messaoud</surname></string-name>, <string-name><given-names>N.</given-names> <surname>Benamrane</surname></string-name> and <string-name><given-names>M.</given-names> <surname>Abid</surname></string-name></person-group>, &#x201C;<article-title>A hybrid approach for automatic classification of brain MRI using genetic algorithm and support vector machine</article-title>,&#x201D; <source>Leonardo Journal of Sciences</source>, vol. <volume>17</volume>, no. <issue>1</issue>, pp. <fpage>71</fpage>&#x2013;<lpage>82</lpage>, <year>2010</year>.</mixed-citation></ref>
<ref id="ref-35"><label>[35]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>E. I.</given-names> <surname>Papageorgiou</surname></string-name>, <string-name><given-names>P. P.</given-names> <surname>Spyridonos</surname></string-name>, <string-name><given-names>D.</given-names> <surname>Glotsos</surname></string-name>, <string-name><given-names>C. D.</given-names> <surname>Stylios</surname></string-name>, <string-name><given-names>P.</given-names> <surname>Ravazoula</surname></string-name> <etal>et al.</etal></person-group><italic>,</italic> &#x201C;<article-title>Brain tumor characterization using the soft computing technique of fuzzy cognitive maps</article-title>,&#x201D; <source>Applied Soft Computing</source>, vol. <volume>8</volume>, no. <issue>1</issue>, pp. <fpage>820</fpage>&#x2013;<lpage>828</lpage>, <year>2008</year>.</mixed-citation></ref>
<ref id="ref-36"><label>[36]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>E. I.</given-names> <surname>Zacharaki</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Wang</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Chawla</surname></string-name>, <string-name><given-names>D. S.</given-names> <surname>Yoo</surname></string-name>, <string-name><given-names>R.</given-names> <surname>Wolf</surname></string-name> <etal>et al.</etal></person-group><italic>,</italic> &#x201C;<article-title>Classification of brain tumor type and grade using MRI texture and shape in a machine learning scheme</article-title>,&#x201D; <source>Magnetic Resonance in Medicine: An Official Journal of the International Society for Magnetic Resonance in Medicine</source>, vol. <volume>62</volume>, no. <issue>6</issue>, pp. <fpage>1609</fpage>&#x2013;<lpage>1618</lpage>, <year>2009</year>.</mixed-citation></ref>
<ref id="ref-37"><label>[37]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>K. L.</given-names> <surname>Hsieh</surname></string-name>, <string-name><given-names>C.</given-names> <surname>Lo</surname></string-name> and <string-name><given-names>C.</given-names> <surname>Hsiao</surname></string-name></person-group>, &#x201C;<article-title>Computer-aided grading of gliomas based on local and global MRI features</article-title>,&#x201D; <source>Computer Methods and Programs in Biomedicine</source>, vol. <volume>139</volume>, pp. <fpage>31</fpage>&#x2013;<lpage>38</lpage>, <year>2017</year>.</mixed-citation></ref>
<ref id="ref-38"><label>[38]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>J.</given-names> <surname>Sachdeva</surname></string-name>, <string-name><given-names>V.</given-names> <surname>Kumar</surname></string-name>, <string-name><given-names>I.</given-names> <surname>Gupta</surname></string-name>, <string-name><given-names>N.</given-names> <surname>Khandelwal</surname></string-name> and <string-name><given-names>C. K.</given-names> <surname>Ahuja</surname></string-name></person-group>, &#x201C;<article-title>A package-SFERCB-Segmentation, feature extraction, reduction and classification analysis by both SVM and ANN for brain tumors</article-title>,&#x201D; <source>Applied Soft Computing</source>, vol. <volume>47</volume>, no. <issue>12B</issue>, pp. <fpage>151</fpage>&#x2013;<lpage>167</lpage>, <year>2016</year>.</mixed-citation></ref>
<ref id="ref-39"><label>[39]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><given-names>J.</given-names> <surname>Cheng</surname></string-name></person-group>, &#x201C;<chapter-title>Brain magnetic resonance imaging tumor dataset</chapter-title>,&#x201D; in <source>Figshare MRI Dataset Version 5,</source> <comment>2017. [Online]. Available: https://figshare.com/articles/dataset/brain_tumor_dataset/1512427/5</comment></mixed-citation></ref>
<ref id="ref-40"><label>[40]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>M. R.</given-names> <surname>Ismael</surname></string-name> and <string-name><given-names>I.</given-names> <surname>Abdel-Qader</surname></string-name></person-group>, &#x201C;<article-title>Brain tumor classification via statistical features and back-propagation neural network</article-title>,&#x201D; in <conf-name>Proc. IEEE Int. Conf. on Electro/Information Technology (EIT)</conf-name>, <conf-loc>Rochester, Michigan, USA</conf-loc>, pp. <fpage>252</fpage>&#x2013;<lpage>257</lpage>, <year>2018</year>. </mixed-citation></ref>
<ref id="ref-41"><label>[41]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>S.</given-names> <surname>Khalid</surname></string-name>, <string-name><given-names>T.</given-names> <surname>Khalil</surname></string-name> and <string-name><given-names>S.</given-names> <surname>Nasreen</surname></string-name></person-group>, &#x201C;<article-title>A survey of feature selection and feature extraction techniques in machine learning</article-title>,&#x201D; in <conf-name>Proc. Science and Information Conf.</conf-name>, <conf-loc>London, UK</conf-loc>, pp. <fpage>372</fpage>&#x2013;<lpage>378</lpage>, <year>2014</year>. </mixed-citation></ref>
<ref id="ref-42"><label>[42]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>Y.</given-names> <surname>LeCun</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>Bengio</surname></string-name> and <string-name><given-names>G.</given-names> <surname>Hinton</surname></string-name></person-group>, &#x201C;<article-title>Deep learning</article-title>,&#x201D; <source>Nature</source>, vol. <volume>521</volume>, no. <issue>7553</issue>, pp. <fpage>436</fpage>&#x2013;<lpage>444</lpage>, <year>2015</year>.</mixed-citation></ref>
<ref id="ref-43"><label>[43]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>K.</given-names> <surname>He</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Zhang</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Ren</surname></string-name> and <string-name><given-names>J.</given-names> <surname>Sun</surname></string-name></person-group>, &#x201C;<article-title>Deep residual learning for image recognition</article-title>,&#x201D; in <conf-name>Proc. IEEE Conf. on Computer Vision and Pattern Recognition</conf-name>, <publisher-loc>Las Vegas, NV, USA</publisher-loc>, pp. <fpage>770</fpage>&#x2013;<lpage>778</lpage>, <year>2016</year>. </mixed-citation></ref>
<ref id="ref-44"><label>[44]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>P.</given-names> <surname>Afshar</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Mohammadi</surname></string-name> and <string-name><given-names>K. N.</given-names> <surname>Plataniotis</surname></string-name></person-group>, &#x201C;<article-title>Brain tumor type classification via capsule networks</article-title>,&#x201D; in <conf-name>Proc. IEEE Int. Conf. on Image Processing (ICIP)</conf-name>, <publisher-loc>Athens, Greece</publisher-loc>, pp. <fpage>3129</fpage>&#x2013;<lpage>3133</lpage>, <year>2018</year>. </mixed-citation></ref>
<ref id="ref-45"><label>[45]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>R.</given-names> <surname>Zia</surname></string-name>, <string-name><given-names>P.</given-names> <surname>Akhtar</surname></string-name> and <string-name><given-names>A.</given-names> <surname>Aziz</surname></string-name></person-group>, &#x201C;<article-title>A new rectangular window based image cropping method for generalization of brain neoplasm classification systems</article-title>,&#x201D; <source>International Journal of Imaging Systems and Technology</source>, vol. <volume>28</volume>, no. <issue>3</issue>, pp. <fpage>153</fpage>&#x2013;<lpage>162</lpage>, <year>2018</year>.</mixed-citation></ref>
<ref id="ref-46"><label>[46]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>H. H.</given-names> <surname>Sultan</surname></string-name>, <string-name><given-names>N. M.</given-names> <surname>Salem</surname></string-name> and <string-name><given-names>W.</given-names> <surname>Al-Atabany</surname></string-name></person-group>, &#x201C;<article-title>Multi-classification of brain tumor images using deep neural network</article-title>,&#x201D; <source>IEEE Access</source>, vol. <volume>7</volume>, pp. <fpage>69215</fpage>&#x2013;<lpage>69225</lpage>, <year>2019</year>.</mixed-citation></ref>
<ref id="ref-47"><label>[47]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>Z.</given-names> <surname>Jia</surname></string-name> and <string-name><given-names>D.</given-names> <surname>Chen</surname></string-name></person-group>, &#x201C;<article-title>Brain tumor identification and classification of MRI images using deep learning techniques</article-title>,&#x201D; <source>IEEE Access</source>, pp. <fpage>1</fpage>, <year>2020</year>. <uri>https//org.10.1109/ACCESS.2020.3016319</uri>.</mixed-citation></ref>
<ref id="ref-48"><label>[48]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>S.</given-names> <surname>Banerjee</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Mitra</surname></string-name>, <string-name><given-names>F.</given-names> <surname>Masulli</surname></string-name> and <string-name><given-names>S.</given-names> <surname>Rovetta</surname></string-name></person-group>, &#x201C;<article-title>Brain tumor detection and classification from multi-sequence MRI: Study using convnets</article-title>,&#x201D; in <conf-name>Proc. Int. MICCAI Brainlesion Workshop</conf-name>, <publisher-loc>Granada, Spain</publisher-loc>, pp. <fpage>170</fpage>&#x2013;<lpage>179</lpage>, <year>2018</year>. </mixed-citation></ref>
<ref id="ref-49"><label>[49]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>C.</given-names> <surname>Szegedy</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Ioffe</surname></string-name>, <string-name><given-names>V.</given-names> <surname>Vanhoucke</surname></string-name> and <string-name><given-names>A.</given-names> <surname>Alemi</surname></string-name></person-group>, &#x201C;<article-title>Inception-V4, inception-ResNet and the impact of residual connections on learning</article-title>,&#x201D; in <conf-name>Proc. AAAI Conf. on Artificial Intelligence</conf-name>, <conf-loc>San Francisco, California, USA</conf-loc>, <year>2017</year>. </mixed-citation></ref>
<ref id="ref-50"><label>[50]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>M.</given-names> <surname>Talo</surname></string-name>, <string-name><given-names>U. B.</given-names> <surname>Baloglu</surname></string-name>, <string-name><given-names>&#x00D6;.</given-names> <surname>Y&#x0131;ld&#x0131;r&#x0131;m</surname></string-name> and <string-name><given-names>U. R.</given-names> <surname>Acharya</surname></string-name></person-group>, &#x201C;<article-title>Application of deep transfer learning for automated brain abnormality classification using MR images</article-title>,&#x201D; <source>Cognitive Systems Research</source>, vol. <volume>54</volume>, pp. <fpage>176</fpage>&#x2013;<lpage>188</lpage>, <year>2019</year>.</mixed-citation></ref>
<ref id="ref-51"><label>[51]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><given-names>K. A.</given-names> <surname>Johnson</surname></string-name> and <string-name><given-names>J. A.</given-names> <surname>Becker</surname></string-name></person-group>, &#x201C;<chapter-title>Braintumor datasets</chapter-title>,&#x201D; in <source>Harvard Medical School Data</source> <comment>[Online]. Available: http://www.med.harvard.edu/AANLIB/</comment></mixed-citation></ref>
<ref id="ref-52"><label>[52]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>M.</given-names> <surname>Sajjad</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Khan</surname></string-name>, <string-name><given-names>K.</given-names> <surname>Muhammad</surname></string-name>, <string-name><given-names>W.</given-names> <surname>Wu</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Ullah</surname></string-name> <etal>et al.</etal></person-group><italic>,</italic> &#x201C;<article-title>Multi-grade brain tumor classification using deep CNN with extensive data augmentation</article-title>,&#x201D; <source>Journal of Computational Science</source>, vol. <volume>30</volume>, pp. <fpage>174</fpage>&#x2013;<lpage>182</lpage>, <year>2019</year>.</mixed-citation></ref>
<ref id="ref-53"><label>[53]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>Z. N. K.</given-names> <surname>Swati</surname></string-name>, <string-name><given-names>Q.</given-names> <surname>Zhao</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Kabir</surname></string-name>, <string-name><given-names>F.</given-names> <surname>Ali</surname></string-name>, <string-name><given-names>Z.</given-names> <surname>Ali</surname></string-name> <etal>et al.</etal></person-group><italic>,</italic> &#x201C;<article-title>Brain tumor classification for MR images using transfer learning and fine-tuning</article-title>,&#x201D; <source>Computerized Medical Imaging and Graphics</source>, vol. <volume>75</volume>, pp. <fpage>34</fpage>&#x2013;<lpage>46</lpage>, <year>2019</year>.</mixed-citation></ref>
<ref id="ref-54"><label>[54]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>Z. N. K.</given-names> <surname>Swati</surname></string-name>, <string-name><given-names>Q.</given-names> <surname>Zhao</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Kabir</surname></string-name>, <string-name><given-names>F.</given-names> <surname>Ali</surname></string-name>, <string-name><given-names>Z.</given-names> <surname>Ali</surname></string-name> <etal>et al.</etal></person-group><italic>,</italic> &#x201C;<article-title>Content-based brain tumor retrieval for MR images using transfer learning</article-title>,&#x201D; <source>IEEE Access</source>, vol. <volume>7</volume>, pp. <fpage>17809</fpage>&#x2013;<lpage>17822</lpage>, <year>2019</year>.</mixed-citation></ref>
<ref id="ref-55"><label>[55]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>S.</given-names> <surname>Deepak</surname></string-name> and <string-name><given-names>P.</given-names> <surname>Ameer</surname></string-name></person-group>, &#x201C;<article-title>Brain tumor classification using deep CNN features via transfer learning</article-title>,&#x201D; <source>Computers in Biology and Medicine</source>, vol. <volume>111</volume>, no. <issue>3</issue>, pp. <fpage>103345</fpage>, <year>2019</year>.</mixed-citation></ref>
<ref id="ref-56"><label>[56]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>C.</given-names> <surname>Szegedy</surname></string-name>, <string-name><given-names>W.</given-names> <surname>Liu</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>Jia</surname></string-name>, <string-name><given-names>P.</given-names> <surname>Sermanet</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Reed</surname></string-name> <etal>et al.</etal></person-group><italic>,</italic> &#x201C;<article-title>Going deeper with convolutions</article-title>,&#x201D; in <conf-name>Proc. IEEE Conf. on Computer Vision and Pattern Recognition</conf-name>, <publisher-loc>Boston, MA, USA</publisher-loc>, pp. <fpage>1</fpage>&#x2013;<lpage>9</lpage>, <year>2015</year>. </mixed-citation></ref>
<ref id="ref-57"><label>[57]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>M.</given-names> <surname>Ghafoorian</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Mehertash</surname></string-name>, <string-name><given-names>T.</given-names> <surname>Kapur</surname></string-name>, <string-name><given-names>N.</given-names> <surname>Karssemeijer</surname></string-name>, <string-name><given-names>E.</given-names> <surname>Marchiori</surname></string-name> <etal>et al.</etal></person-group><italic>,</italic> &#x201C;<article-title>Transfer learning for domain adaptation in MRI: Application in brain lesion segmentation</article-title>,&#x201D; in <conf-name>Proc. Int. Conf. on Medical Image Computing and Computer-Assisted Intervention</conf-name>, <publisher-loc>Quebec, Canada</publisher-loc>, pp. <fpage>516</fpage>&#x2013;<lpage>524</lpage>, <year>2017</year>. </mixed-citation></ref>
<ref id="ref-58"><label>[58]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><given-names>C.</given-names> <surname>Haarburger</surname></string-name>, <string-name><given-names>P.</given-names> <surname>Langenberg</surname></string-name>, <string-name><given-names>D.</given-names> <surname>Truhn</surname></string-name>, <string-name><given-names>H.</given-names> <surname>Schneider</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Thuring</surname></string-name> <etal>et al.</etal></person-group><italic>,</italic> &#x201C;<chapter-title>Transfer learning for breast cancer malignancy classification based on dynamic contrast-enhanced MR images</chapter-title>,&#x201D; in <source>Bildverarbeitung f&#x00FC;r die Medizin 2018</source>. <publisher-loc>Berlin, Heidelberg</publisher-loc>, <fpage>216</fpage>&#x2013;<lpage>221</lpage>, <year>2018</year>.</mixed-citation></ref>
<ref id="ref-59"><label>[59]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>G. S.</given-names> <surname>Tandel</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Balestrieri</surname></string-name>, <string-name><given-names>T.</given-names> <surname>Jujaray</surname></string-name>, <string-name><given-names>N. N.</given-names> <surname>Khanna</surname></string-name>, <string-name><given-names>L.</given-names> <surname>Saba</surname></string-name> <etal>et al.</etal></person-group><italic>,</italic> &#x201C;<article-title>Multiclass magnetic resonance imaging brain tumor classification using artificial intelligence paradigm</article-title>,&#x201D; <source>Computers in Biology and Medicine</source>, vol. <volume>122</volume>, no. <issue>1</issue>, pp. <fpage>103804</fpage>, <year>2020</year>.</mixed-citation></ref>
<ref id="ref-60"><label>[60]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>K.</given-names> <surname>Muhammad</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Khan</surname></string-name>, <string-name><given-names>J. D.</given-names> <surname>Ser</surname></string-name> and <string-name><given-names>V. H. C.</given-names> <surname>de Albuquerque</surname></string-name></person-group>, &#x201C;<article-title>Deep learning for multigrade brain tumor classification in smart healthcare systems: A prospective survey</article-title>,&#x201D; <source>IEEE Transactions on Neural Networks and Learning Systems</source>, vol. <volume>32</volume>, no. <issue>2</issue>, pp. <fpage>507</fpage>&#x2013;<lpage>522</lpage>, <year>2020</year>.</mixed-citation></ref>
<ref id="ref-61"><label>[61]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>S. U. H.</given-names> <surname>Dar</surname></string-name>, <string-name><given-names>M.</given-names> <surname>&#x00D6;zbey</surname></string-name>, <string-name><given-names>A. B.</given-names> <surname>&#x00C7;atl&#x0131;</surname></string-name> and <string-name><given-names>T.</given-names> <surname>&#x00C7;ukur</surname></string-name></person-group>, &#x201C;<article-title>A transfer-learning approach for accelerated MRI using deep neural networks</article-title>,&#x201D; <source>Magnetic Resonance in Medicine</source>, vol. <volume>84</volume>, no. <issue>2</issue>, pp. <fpage>663</fpage>&#x2013;<lpage>685</lpage>, <year>2020</year>.</mixed-citation></ref>
<ref id="ref-62"><label>[62]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>N.</given-names> <surname>Noreen</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Palaniappan</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Qayyum</surname></string-name>, <string-name><given-names>I.</given-names> <surname>Ahmad</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Imran</surname></string-name> <etal>et al.</etal></person-group><italic>,</italic> &#x201C;<article-title>A deep learning model based on concatenation approach for the diagnosis of brain tumor</article-title>,&#x201D; <source>IEEE Access</source>, vol. <volume>8</volume>, pp. <fpage>55135</fpage>&#x2013;<lpage>55144</lpage>, <year>2020</year>.</mixed-citation></ref>
<ref id="ref-63"><label>[63]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>N.</given-names> <surname>Noreen</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Palaniappan</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Qayyum</surname></string-name>, <string-name><given-names>I.</given-names> <surname>Ahmad</surname></string-name> and <string-name><given-names>M. O.</given-names> <surname>Alassafi</surname></string-name></person-group>, &#x201C;<article-title>Brain tumor classification based on fine-tuned models and the ensemble method</article-title>,&#x201D; <source>Computers Materials &#x0026; Continua</source>, vol. <volume>67</volume>, no. <issue>3</issue>, pp. <fpage>3967</fpage>&#x2013;<lpage>3982</lpage>, <year>2021</year>.</mixed-citation></ref>
<ref id="ref-64"><label>[64]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>A.</given-names> <surname>Sekhar</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Biswas</surname></string-name>, <string-name><given-names>R.</given-names> <surname>Hazra</surname></string-name>, <string-name><given-names>A. K.</given-names> <surname>Sunaniya</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Mukherjee</surname></string-name> <etal>et al.</etal></person-group><italic>,</italic> &#x201C;<article-title>Brain tumor classification using fine-tuned GoogLeNet features and machine learning algorithms: IoMT enabled CAD system</article-title>,&#x201D; <source>IEEE Journal of Biomedical and Health Informatics</source>, vol. <volume>26</volume>, no. <issue>3</issue>, pp. <fpage>983</fpage>, <year>2022</year>.</mixed-citation></ref>
<ref id="ref-65"><label>[65]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>A. B. T.</given-names> <surname>Tahir</surname></string-name>, <string-name><given-names>M. A.</given-names> <surname>Khan</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Alhaisoni</surname></string-name>, <string-name><given-names>J. A.</given-names> <surname>Khan</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>Nam</surname></string-name> <etal>et al.</etal></person-group><italic>,</italic> &#x201C;<article-title>Deep learning and improved particle swarm optimization based multimodal brain tumor classification</article-title>,&#x201D; <source>Computers, Materials &#x0026; Continua</source>, vol. <volume>68</volume>, no. <issue>1</issue>, pp. <fpage>1099</fpage>&#x2013;<lpage>1116</lpage>, <year>2021</year>.</mixed-citation></ref>
<ref id="ref-66"><label>[66]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>J.</given-names> <surname>Kang</surname></string-name>, <string-name><given-names>Z.</given-names> <surname>Ullah</surname></string-name> and <string-name><given-names>J.</given-names> <surname>Gwak</surname></string-name></person-group>, &#x201C;<article-title>MRI-based brain tumor classification using ensemble of deep features and machine learning classifiers</article-title>,&#x201D; <source>Sensors</source>, vol. <volume>21</volume>, no. <issue>6</issue>, pp. <fpage>2222</fpage>, <year>2021</year>.</mixed-citation></ref>
<ref id="ref-67"><label>[67]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>N. S.</given-names> <surname>Shaik</surname></string-name> and <string-name><given-names>T. K.</given-names> <surname>Cherukuri</surname></string-name></person-group>, &#x201C;<article-title>Multi-level attention network: Application to brain tumor classification</article-title>,&#x201D; <source>Signal, Image and Video Processing</source>, vol. <volume>16</volume>, no. <issue>3</issue>, pp. <fpage>817</fpage>&#x2013;<lpage>824</lpage>, <year>2022</year>.</mixed-citation></ref>
<ref id="ref-68"><label>[68]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>M.</given-names> <surname>Yaqub</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Feng</surname></string-name>, <string-name><given-names>M. S.</given-names> <surname>Zia</surname></string-name>, <string-name><given-names>K.</given-names> <surname>Arshid</surname></string-name>, <string-name><given-names>K.</given-names> <surname>Jia</surname></string-name> <etal>et al.</etal></person-group><italic>,</italic> &#x201C;<article-title>State-of-the-art CNN optimizer for brain tumor segmentation in magnetic resonance images</article-title>,&#x201D; <source>Brain Sciences</source>, vol. <volume>10</volume>, no. <issue>7</issue>, pp. <fpage>427</fpage>, <year>2020</year>.</mixed-citation></ref>
<ref id="ref-69"><label>[69]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>A.</given-names> <surname>Rehman</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Naz</surname></string-name>, <string-name><given-names>M. I.</given-names> <surname>Razzak</surname></string-name>, <string-name><given-names>F.</given-names> <surname>Akram</surname></string-name> and <string-name><given-names>M.</given-names> <surname>Imran</surname></string-name></person-group>, &#x201C;<article-title>A deep learning-based framework for automatic brain tumors classification using transfer learning</article-title>,&#x201D; <source>Circuits, Systems, and Signal Processing</source>, vol. <volume>39</volume>, no. <issue>2</issue>, pp. <fpage>757</fpage>&#x2013;<lpage>775</lpage>, <year>2020</year>.</mixed-citation></ref>
<ref id="ref-70"><label>[70]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>A.</given-names> <surname>Pashaei</surname></string-name>, <string-name><given-names>H.</given-names> <surname>Sajedi</surname></string-name> and <string-name><given-names>N.</given-names> <surname>Jazayeri</surname></string-name></person-group>, &#x201C;<article-title>Brain tumor classification via convolutional neural network and extreme learning machines</article-title>,&#x201D; in <conf-name>Proc. Int. Conf. on Computer and Knowledge Engineering (ICCKE)</conf-name>, <conf-loc>Ferdowsi University of Mashhad, Iran</conf-loc>, pp. <fpage>314</fpage>&#x2013;<lpage>319</lpage>, <year>2018</year>. </mixed-citation></ref>
<ref id="ref-71"><label>[71]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>B.</given-names> <surname>Zhou</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Khosla</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Lapedriza</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Torralba</surname></string-name> and <string-name><given-names>A.</given-names> <surname>Oliva</surname></string-name></person-group>, &#x201C;<article-title>Places: An image database for deep scene understanding</article-title>,&#x201D; <source>Journal of Vision</source>, vol. <volume>17</volume>, no. <issue>10</issue>, pp. <fpage>296</fpage>, <year>2017</year>.</mixed-citation></ref>
<ref id="ref-72"><label>[72]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>F. N.</given-names> <surname>Iandola</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Han</surname></string-name>, <string-name><given-names>M. W.</given-names> <surname>Moskewicz</surname></string-name>, <string-name><given-names>K.</given-names> <surname>Ashraf</surname></string-name>, <string-name><given-names>W. J.</given-names> <surname>Dally</surname></string-name> <etal>et al.</etal></person-group><italic>,</italic> &#x201C;<article-title>SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and &#x003C; 0.5 MB model size</article-title>,&#x201D; in <conf-name>Proc. Int. Conf. on Learning Representations</conf-name>, <publisher-loc>Toulon, France</publisher-loc>, <year>2017</year>. </mixed-citation></ref>
<ref id="ref-73"><label>[73]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>M.</given-names> <surname>Sandler</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Howard</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Zhu</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Zhmoginov</surname></string-name> and <string-name><given-names>L. C.</given-names> <surname>Chen</surname></string-name></person-group>, &#x201C;<article-title>MobilenetV2: Inverted residuals and linear bottlenecks</article-title>,&#x201D; in <conf-name>Proc. IEEE Conf. on Computer Vision and Pattern Recognition</conf-name>, <publisher-loc>Salt Lake City, UT, USA</publisher-loc>, pp. <fpage>4510</fpage>&#x2013;<lpage>4520</lpage>, <year>2018</year>. </mixed-citation></ref>
<ref id="ref-74"><label>[74]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>C.</given-names> <surname>Szegedy</surname></string-name>, <string-name><given-names>V.</given-names> <surname>Vanhoucke</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Ioffe</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Shlens</surname></string-name> and <string-name><given-names>Z.</given-names> <surname>Wojna</surname></string-name></person-group>, &#x201C;<article-title>Rethinking the inception architecture for computer vision</article-title>,&#x201D; in <conf-name>Proc. IEEE Conf. on Computer Vision and Pattern Recognition</conf-name>, <publisher-loc>Las Vegas, NV, USA</publisher-loc>, pp. <fpage>2818</fpage>&#x2013;<lpage>2826</lpage>, <year>2016</year>. </mixed-citation></ref>
</ref-list>
</back>
</article>
