<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.1 20151215//EN" "http://jats.nlm.nih.gov/publishing/1.1/JATS-journalpublishing1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xml:lang="en" article-type="research-article" dtd-version="1.1">
<front>
<journal-meta>
<journal-id journal-id-type="pmc">CMC</journal-id>
<journal-id journal-id-type="nlm-ta">CMC</journal-id>
<journal-id journal-id-type="publisher-id">CMC</journal-id>
<journal-title-group>
<journal-title>Computers, Materials &#x0026; Continua</journal-title>
</journal-title-group>
<issn pub-type="epub">1546-2226</issn>
<issn pub-type="ppub">1546-2218</issn>
<publisher>
<publisher-name>Tech Science Press</publisher-name>
<publisher-loc>USA</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">40874</article-id>
<article-id pub-id-type="doi">10.32604/cmc.2023.040874</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Diagnosis of Autism Spectrum Disorder by Imperialistic Competitive Algorithm and Logistic Regression Classifier</article-title>
<alt-title alt-title-type="left-running-head">Diagnosis of Autism Spectrum Disorder by Imperialistic Competitive Algorithm and Logistic Regression Classifier</alt-title>
<alt-title alt-title-type="right-running-head">Diagnosis of Autism Spectrum Disorder by Imperialistic Competitive Algorithm and Logistic Regression Classifier</alt-title>
</title-group>
<contrib-group>
<contrib id="author-1" contrib-type="author" corresp="yes">
<name name-style="western"><surname>Ziyad</surname><given-names>Shabana R.</given-names></name><xref ref-type="aff" rid="aff-1">1</xref><email>ziyadshabana@gmail.com</email></contrib>
<contrib id="author-2" contrib-type="author">
<name name-style="western"><surname>Liyakathunisa</surname></name><xref ref-type="aff" rid="aff-2">2</xref></contrib>
<contrib id="author-3" contrib-type="author">
<name name-style="western"><surname>Aljohani</surname><given-names>Eman</given-names></name><xref ref-type="aff" rid="aff-2">2</xref></contrib>
<contrib id="author-4" contrib-type="author">
<name name-style="western"><surname>Saeed</surname><given-names>I. A.</given-names></name><xref ref-type="aff" rid="aff-3">3</xref></contrib>
<aff id="aff-1"><label>1</label><institution>Department of Computer Science, College of Computer Engineering and Sciences, Prince Sattam bin Abdulaziz University</institution>, <addr-line>Al Kharj, 16274</addr-line>, <country>Saudi Arabia</country></aff>
<aff id="aff-2"><label>2</label><institution>Department of Computer Science, College of Computer Science and Engineering, Taibah University</institution>, <addr-line>Madinah, 41411</addr-line>, <country>Saudi Arabia</country></aff>
<aff id="aff-3"><label>3</label><institution>Department of Information Systems, College of Computer Engineering and Sciences, Prince Sattam bin Abdulaziz University</institution>, <addr-line>Al Kharj, 16274</addr-line>, <country>Saudi Arabia</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>&#x002A;</label>Corresponding Author: Shabana R. Ziyad. Email: <email>ziyadshabana@gmail.com</email></corresp>
</author-notes>
<pub-date date-type="collection" publication-format="electronic"><year>2023</year></pub-date>
<pub-date date-type="pub" publication-format="electronic"><day>29</day><month>11</month><year>2023</year></pub-date>
<volume>77</volume>
<issue>2</issue>
<fpage>1515</fpage>
<lpage>1534</lpage>
<history>
<date date-type="received"><day>02</day><month>4</month><year>2023</year></date>
<date date-type="accepted"><day>25</day><month>7</month><year>2023</year></date>
</history>
<permissions>
<copyright-statement>&#x00A9; 2023 Ziyad et al.</copyright-statement>
<copyright-year>2023</copyright-year>
<copyright-holder>Ziyad et al.</copyright-holder>
<license xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>This work is licensed under a <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</ext-link>, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<self-uri content-type="pdf" xlink:href="TSP_CMC_40874.pdf"></self-uri>
<abstract>
<p>Autism spectrum disorder (ASD), classified as a developmental disability, is now more common in children than ever. The drastic worldwide increase in the rate of autism spectrum disorder demands early detection of autism in children. When ASD is diagnosed before the age of five, parents can seek professional help for a better prognosis of the child&#x2019;s therapy. This research study aims to develop an automated tool for diagnosing autism in children. The computer-aided diagnosis tool for ASD detection is designed and developed by a novel methodology that includes data acquisition, feature selection, and classification phases. The most deterministic features are selected from the self-acquired dataset by novel feature selection methods before classification. The imperialist competitive algorithm (ICA), inspired by empires conquering colonies, performs feature selection in this study. The performance of Logistic Regression (LR), Decision Tree, K-Nearest Neighbor (KNN), and Random Forest (RF) classifiers is experimentally studied in this research work. The experimental results prove that the Logistic Regression classifier exhibits the highest accuracy for the self-acquired dataset. ASD detection is also evaluated experimentally with the Least Absolute Shrinkage and Selection Operator (LASSO) feature selection method and different classifiers. The Exploratory Data Analysis (EDA) phase uncovered crucial facts about the data, such as the correlation of the features in the dataset with the class variable.</p>
</abstract>
<kwd-group kwd-group-type="author">
<kwd>Autism spectrum disorder</kwd>
<kwd>feature selection</kwd>
<kwd>imperialist competitive algorithm</kwd>
<kwd>LASSO</kwd>
<kwd>logistic regression</kwd>
<kwd>random forest</kwd>
</kwd-group>
<funding-group>
<award-group id="awg1">
<funding-source>Deputyship for Research &#x0026; Innovation, Ministry of Education in Saudi Arabia</funding-source>
<award-id>IF2-PSAU-2022/01/22043</award-id>
</award-group>
</funding-group>
</article-meta>
</front>
<body>
<sec id="s1"><label>1</label><title>Introduction</title>
<p>Autism spectrum disorder is a neurological disorder that results in developmental setbacks affecting children&#x2019;s social, communication, and behavioral activities. Children affected by ASD have difficulty interacting with parents, teachers, and friends. They show restricted interest in communicating with others and struggle to maintain eye contact with the people they interact with. They have poor attention-holding ability, making it difficult to listen to others. According to statistics, 16 to 18 percent of children diagnosed with Down syndrome have autism [<xref ref-type="bibr" rid="ref-1">1</xref>]. Autism-affected children are oversensitive to noise and insensitive to pain. They seem lost in their thoughts, show difficulty recognizing their emotions, and sometimes have unusual memory. ASD starts in childhood but persists into adulthood. In 2021, the Centers for Disease Control and Prevention, United States of America, reported that approximately 1 in 44 children is diagnosed with ASD. Children are usually diagnosed with ASD at the age of 3 years [<xref ref-type="bibr" rid="ref-2">2</xref>]. Autism can be detected in children under three years, and language delays can be seen in children as early as 18 months [<xref ref-type="bibr" rid="ref-3">3</xref>]. Children diagnosed with ASD below five years, when trained with occupational and speech therapies, show remarkable improvement in their communication skills. Repetitive and stereotyped behavior, lack of social and language skills, poor eye contact, and delayed speech are warning signs of autism for parents. In very young children, a social skills assessment is challenging. Therefore, limited eye contact with the parents, inability to bring an object, and inability to imitate the parent are the critical factors in identifying a child affected with ASD. Children with ASD may show a regression in language development between 15 and 24 months of age. A child should maintain joint attention with the caregiver in sharing a social interaction between 8 and 16 months. Children with ASD generally lack this joint attention skill, and it is a critical feature in identifying ASD at an early age. All these signs and symptoms aid parents or guardians in detecting ASD at an early stage [<xref ref-type="bibr" rid="ref-4">4</xref>]. Early detection of ASD enables early intervention, improving the child&#x2019;s learning ability, communication, and social skills.</p>
<p>The study collected data from children with ASD under the age of 15 studying in special schools; the resulting dataset is named the Questionnaire-Based ASD (QBASD) dataset. The QBASD dataset has been carefully designed to include questions about the vital features for detecting autism in children. The study identifies the most discriminating features from the dataset by the LASSO and ICA feature selection methods. The selected feature set improves efficiency in classifying the test data, diagnosing the child as ASD or non-ASD. LR, Decision Tree, KNN, and RF classifiers distinguish ASD from non-ASD subjects. The novelty of this research work lies in the design of the CADx and the questionnaire-based dataset collected from parents of ASD and typically developing children. The proposed methodology, with the ICA algorithm for feature selection on the QBASD dataset and LR as a classifier, is a novel approach for diagnosing autism with machine learning. This CADx aims to detect autism early by training the model with data samples of children under five.</p>
</sec>
<sec id="s2"><label>2</label><title>Materials and Methods</title>
<p>This section discusses related studies on early diagnosis of ASD in children. Researchers have identified several biomarkers to detect ASD at an early age. Any neuroimaging modality that discovers abnormal patterns in brain activity could detect ASD. Children with autism have characteristic morphological patterns in their Electroencephalogram (EEG) signals. A one-dimensional local binary pattern extracts the features from EEG signals, and spectrogram images are generated from the extracted features using a short-time Fourier transform. The Relief algorithm carries out feature ranking and selection. With the Support Vector Machine (SVM) classifier, the model achieved an accuracy of 96.44&#x0025; [<xref ref-type="bibr" rid="ref-5">5</xref>]. Magnetic Resonance Imaging (MRI) and resting-state functional Magnetic Resonance Imaging (fMRI) images are studied to represent anatomical and functional connectivity abnormalities. The classification accuracies are 75&#x0025; and 79&#x0025; for fMRI and MRI data, respectively; the fusion of both datasets gives a higher accuracy of 81&#x0025; [<xref ref-type="bibr" rid="ref-6">6</xref>]. The shortcoming of this line of work is that children must undergo neuroimaging tests, which is an unpleasant experience for them. Sixty acoustic features were extracted from the audio recordings of 712 ASD children, and the feature selection method selected twenty-one deterministic features; a Convolutional Neural Network (CNN) showed improved results compared to SVM and Linear Regression in this study [<xref ref-type="bibr" rid="ref-7">7</xref>]. Facial expressions and vocal characteristics serve as biomarkers and detect ASD with 73&#x0025; accuracy. Signs such as reduced social smiling, higher voice frequency, and harmonic-to-noise ratio are significant biomarkers for ASD detection [<xref ref-type="bibr" rid="ref-8">8</xref>]. The eye movement data of ASD children is analyzed to distinguish between autism and non-autism subjects. 
The eye gaze feature used in a supervised machine learning model aids ASD detection; the model achieved an accuracy of 86&#x0025; and a sensitivity of 91&#x0025; [<xref ref-type="bibr" rid="ref-9">9</xref>]. A related study investigated gaze behavior with classifiers such as SVM, Decision Tree, Linear Discriminant Analysis, and RF; classification accuracy for the visual fixation feature is 82.64&#x0025; [<xref ref-type="bibr" rid="ref-10">10</xref>]. Features extracted from acoustic, video, and handwriting time series classified ASD children and children with neurotypical development; eigenvalues of the correlation presented the coordination of speech with handwriting as a potential biomarker for classifying subjects [<xref ref-type="bibr" rid="ref-11">11</xref>]. ASD subjects show unusual facial expressions and gaze patterns when they look at complex scenes, and the authors leveraged this fact to distinguish healthy subjects from ASD subjects; classification accuracy is 85.8&#x0025; for studying facial expressions from photographs [<xref ref-type="bibr" rid="ref-12">12</xref>]. The research studies discussed so far share the shortcomings of being expensive to develop, time-consuming, and exhibiting unsatisfactory accuracy rates. This research study aims to collect data from parents regarding their children&#x2019;s behavior to identify the most deterministic biomarkers for ASD detection. Selecting a highly deterministic feature set can improve classification model performance to a great extent.</p>
<sec id="s2_1"><label>2.1</label><title>Proposed Methodology for Diagnosis of ASD</title>
<p>The proposed methodology combines a novel ICA feature selection algorithm with an LR classifier. The proposed study is a promising methodology based on data collected through a parent-reported questionnaire on the child&#x2019;s behavior. A diligent study of the biomarkers for ASD resulted in the questionnaire. The signs identified in the natural evolutionary history of ASD in infants are categorized. The first category is prenatal, covering preconception through the gestation period, and identifies biomarkers that trigger the development of ASD in offspring; the questionnaire includes questions on the mother&#x2019;s health condition, socio-economic status, and medication details to study this biomarker. The second category is pre-symptomatic, where the child shows early indications of developing ASD [<xref ref-type="bibr" rid="ref-13">13</xref>]. Questions based on social interactions and emotional responses predict the child&#x2019;s potential risk of developing ASD, while questions on the communication and cognitive components confirm the diagnosis of ASD in children. The questionnaire in this study therefore includes questions in line with the biomarkers that identify the signs of ASD in infants from the prenatal stage to the toddler stage. This study aims at developing a CADx that classifies ASD children from non-ASD children. The proposed methodology is compared with the LASSO feature selection method followed by LR for the classification of the QBASD dataset. <xref ref-type="fig" rid="fig-1">Fig. 1</xref> is the block diagram of the proposed CADx for autism diagnosis.</p>
<fig id="fig-1"><label>Figure 1</label><caption><title>Proposed methodology for AI-based ASD detection tool</title></caption><graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_40874-fig-1.tif"/></fig>
</sec>
<sec id="s2_2"><label>2.2</label><title>Datasets</title>
<p>The QBASD dataset consists of questionnaires filled out by parents of children under 15. The questionnaire was designed after discussions with medical practitioners and experts who provide therapy and support to ASD children. It includes questions related to the vital signs of ASD in children: the child&#x2019;s social skill assessment, emotional response to parents or caregivers, communication skills, behavior issues, sensory impulses, the cognitive component, and the history of the mother&#x2019;s medications. The dataset has many samples of responses from parents of children diagnosed with ASD under five years. Parents from special education schools and individual volunteers filled out the QBASD questionnaire. A short description of the research study informed the parents of its objective and motivated them to fill out the questionnaire accurately. The QBASD dataset comprises the responses downloaded from the questionnaire. The dataset has a sample size of 321 and is balanced. Excluding personal information such as the child&#x2019;s age, gender, and nationality, the dataset contains questions Q1 to Q29.</p>
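As a minimal sketch of the data acquisition step, the downloaded responses could be loaded and numerically encoded as below. The paper does not specify the export format, so the file name and the personal-information column names (`qbasd_responses.csv`, `age`, `gender`, `nationality`) are assumptions for illustration.

```python
import pandas as pd

def load_qbasd(path="qbasd_responses.csv"):
    """Load questionnaire responses and encode text answers as integers."""
    df = pd.read_csv(path)
    # Drop personal information, keeping questions Q1..Q29 and the label.
    df = df.drop(columns=["age", "gender", "nationality"], errors="ignore")
    # Convert textual answers to a numerical dataset.
    for col in df.columns:
        if df[col].dtype == object:
            df[col] = pd.factorize(df[col])[0]  # codes by order of appearance
    return df
```

`pd.factorize` is one simple encoding choice; an ordinal mapping agreed with domain experts would be a reasonable alternative.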
</sec>
<sec id="s2_3"><label>2.3</label><title>Feature Selection</title>
<p>Machine Learning (ML) algorithms detect patterns and make accurate classifications and predictions about data. The performance of an ML algorithm depends on the dataset&#x2019;s quality: noisy, inadequate, and redundant data negatively impact classification or prediction accuracy. The response variable specifies the class of a particular data sample in any labeled dataset. Not all features necessarily correlate strongly with the response variable; certain features are redundant, insignificant, and correlate poorly with it. Eliminating insignificant features results in dimensionality reduction. Filter, wrapper, and hybrid methods are conventional feature selection methods [<xref ref-type="bibr" rid="ref-14">14</xref>]. In a high-dimensional dataset, certain features have a low correlation with the response variable. Feature selection aims to construct a new dataset from the original dataset with features highly correlated with the response variable [<xref ref-type="bibr" rid="ref-15">15</xref>]. For high-dimensional datasets, penalized regression is a promising approach for selecting the most deterministic variables, and LASSO penalization is an excellent method for feature selection in this setting. This study compares the imperialist competitive algorithm with the LASSO feature selection method. Both methods are studied experimentally with common classifiers, and the results are recorded in the Results and Discussion section.</p>
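As a concrete illustration of the filter approach mentioned above, one can rank features by their absolute correlation with the response variable and keep those above a cutoff; the threshold value here is a hypothetical choice, not taken from this study.

```python
import numpy as np

def filter_select(X, y, threshold=0.1):
    """Return indices of features whose absolute Pearson correlation
    with the response variable y exceeds the (hypothetical) threshold."""
    corrs = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
    return np.flatnonzero(np.abs(corrs) > threshold)
```

Filter methods like this are fast but consider each feature in isolation; wrapper and embedded methods such as LASSO account for interactions between features.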
<sec id="s2_3_1"><label>2.3.1</label><title>Least Absolute Selection and Shrinkage Operator</title>
<p>In the logistic LASSO, let <italic>n</italic> be the number of samples collected for the dataset <italic>D</italic>. Let <inline-formula id="ieqn-1"><mml:math id="mml-ieqn-1"><mml:msub><mml:mi>f</mml:mi><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> be the feature set, and let <inline-formula id="ieqn-2"><mml:math id="mml-ieqn-2"><mml:mrow><mml:mo>{</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mtext>&#x00A0;</mml:mtext><mml:msub><mml:mi>X</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mtext>&#x00A0;</mml:mtext><mml:msub><mml:mi>X</mml:mi><mml:mrow><mml:mn>3</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mrow><mml:mi>m</mml:mi></mml:mrow></mml:msub><mml:mo>}</mml:mo></mml:mrow></mml:math></inline-formula> be the feature variables of the feature set <inline-formula id="ieqn-3"><mml:math id="mml-ieqn-3"><mml:msub><mml:mi>f</mml:mi><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>. Let <inline-formula id="ieqn-4"><mml:math id="mml-ieqn-4"><mml:mi>m</mml:mi></mml:math></inline-formula> be the number of features in <inline-formula id="ieqn-5"><mml:math id="mml-ieqn-5"><mml:msub><mml:mi>f</mml:mi><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>. Each sample in the dataset <inline-formula id="ieqn-6"><mml:math id="mml-ieqn-6"><mml:mi>D</mml:mi></mml:math></inline-formula> is denoted as <inline-formula id="ieqn-7"><mml:math id="mml-ieqn-7"><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>. 
<inline-formula id="ieqn-8"><mml:math id="mml-ieqn-8"><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> is a <inline-formula id="ieqn-9"><mml:math id="mml-ieqn-9"><mml:mn>1</mml:mn><mml:mo>&#x00D7;</mml:mo><mml:mi>f</mml:mi></mml:math></inline-formula> vector representing a single subject&#x2019;s data. Let Y be the response variable for the two-dimensional table <inline-formula id="ieqn-10"><mml:math id="mml-ieqn-10"><mml:mi>n</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:msub><mml:mi>f</mml:mi><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:mo>.</mml:mo></mml:math></inline-formula> Each <inline-formula id="ieqn-11"><mml:math id="mml-ieqn-11"><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> is the element of the vector Y representing the disorder&#x2019;s presence or absence for the related sample. If <inline-formula id="ieqn-12"><mml:math id="mml-ieqn-12"><mml:mi>n</mml:mi></mml:math></inline-formula> is the sample size, then let <inline-formula id="ieqn-13"><mml:math id="mml-ieqn-13"><mml:mi>i</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mn>2</mml:mn><mml:mo>,</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>,</mml:mo><mml:mrow><mml:mtext>n</mml:mtext></mml:mrow><mml:mo>}</mml:mo></mml:mrow></mml:math></inline-formula>. In linear regression, the relationship between X and Y is linear [<xref ref-type="bibr" rid="ref-16">16</xref>].
<disp-formula id="eqn-1"><label>(1)</label><mml:math id="mml-eqn-1" display="block"><mml:mi>Y</mml:mi><mml:mo>=</mml:mo><mml:mi>m</mml:mi><mml:mi>X</mml:mi><mml:mo>+</mml:mo><mml:mi>&#x03B5;</mml:mi></mml:math></disp-formula></p>
<p><inline-formula id="ieqn-14"><mml:math id="mml-ieqn-14"><mml:mtext>&#x00A0;</mml:mtext><mml:mi>m</mml:mi></mml:math></inline-formula> denotes the coefficient vector representing the relationship between response variable Y and the variables in the feature set <inline-formula id="ieqn-15"><mml:math id="mml-ieqn-15"><mml:msub><mml:mi>f</mml:mi><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:mo>.</mml:mo></mml:math></inline-formula> In some datasets, the number of features is greater than the number of samples, resulting in poor regression performance. Therefore, the LASSO feature selection method is a promising solution to this problem [<xref ref-type="bibr" rid="ref-17">17</xref>]. The LASSO analyzes the importance of each feature <italic>f</italic> in <inline-formula id="ieqn-16"><mml:math id="mml-ieqn-16"><mml:msub><mml:mi>f</mml:mi><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>. The logistic model L is represented as <xref ref-type="disp-formula" rid="eqn-2">Eq. (2)</xref>.
<disp-formula id="eqn-2"><label>(2)</label><mml:math id="mml-eqn-2" display="block"><mml:mi>L</mml:mi><mml:mo>=</mml:mo><mml:msub><mml:mi>&#x03B2;</mml:mi><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:msubsup><mml:mrow><mml:mo>&#x2211;</mml:mo></mml:mrow><mml:mrow><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>m</mml:mi></mml:mrow></mml:msubsup><mml:msub><mml:mi>&#x03B2;</mml:mi><mml:mrow><mml:mi>j</mml:mi><mml:mtext>&#x00A0;</mml:mtext></mml:mrow></mml:msub><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:math></disp-formula></p>
<p><inline-formula id="ieqn-17"><mml:math id="mml-ieqn-17"><mml:msub><mml:mi>&#x03B2;</mml:mi><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> is the intercept, and <inline-formula id="ieqn-18"><mml:math id="mml-ieqn-18"><mml:mi>&#x03B2;</mml:mi></mml:math></inline-formula> is the Regression coefficient associated with dataset features. In this study, the number of samples is greater than the number of features; therefore, <inline-formula id="ieqn-19"><mml:math id="mml-ieqn-19"><mml:mi>n</mml:mi><mml:mo>&#x003E;</mml:mo><mml:mi>m</mml:mi></mml:math></inline-formula>. The penalized logistic lasso is given in <xref ref-type="disp-formula" rid="eqn-3">Eq. (3)</xref>.
<disp-formula id="eqn-3"><label>(3)</label><mml:math id="mml-eqn-3" display="block"><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mrow><mml:mover><mml:mi>&#x03B2;</mml:mi><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>0</mml:mn><mml:mi>&#x03BB;</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mtext>&#x00A0;</mml:mtext><mml:msub><mml:mrow><mml:mover><mml:mi>&#x03B2;</mml:mi><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>&#x03BB;</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mtext mathvariant="italic">argma</mml:mtext></mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>&#x03B2;</mml:mi><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mtext>&#x00A0;</mml:mtext><mml:mi>&#x03B2;</mml:mi><mml:mtext>&#x00A0;</mml:mtext><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msub><mml:mrow><mml:mo>{</mml:mo><mml:mi>l</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>&#x03B2;</mml:mi><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mtext>&#x00A0;</mml:mtext><mml:mi>&#x03B2;</mml:mi><mml:mo stretchy="false">)</mml:mo><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo>,</mml:mo><mml:mi>X</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mi>p</mml:mi><mml:mi>e</mml:mi><mml:mi>n</mml:mi><mml:mi>l</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03BB;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mtext>&#x00A0;</mml:mtext><mml:mo>}</mml:mo></mml:mrow></mml:math></disp-formula>where <inline-formula id="ieqn-20"><mml:math id="mml-ieqn-20"><mml:mi>&#x03BB;</mml:mi></mml:math></inline-formula> is the regularization parameter, and <inline-formula id="ieqn-21"><mml:math 
id="mml-ieqn-21"><mml:mi>p</mml:mi><mml:mi>e</mml:mi><mml:mi>n</mml:mi><mml:mi>l</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03BB;</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is defined in <xref ref-type="disp-formula" rid="eqn-4">Eq. (4)</xref>.
<disp-formula id="eqn-4"><label>(4)</label><mml:math id="mml-eqn-4" display="block"><mml:mi>p</mml:mi><mml:mi>e</mml:mi><mml:mi>n</mml:mi><mml:mi>l</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03BB;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mi>&#x03BB;</mml:mi><mml:mtext>&#x00A0;</mml:mtext><mml:msubsup><mml:mrow><mml:mo>&#x2211;</mml:mo></mml:mrow><mml:mrow><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>K</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo>|</mml:mo><mml:mtext>&#x00A0;</mml:mtext><mml:msub><mml:mi>&#x03B2;</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo></mml:mrow></mml:math></disp-formula></p>
<p>Some of the coefficients of <inline-formula id="ieqn-22"><mml:math id="mml-ieqn-22"><mml:msub><mml:mrow><mml:mover><mml:mi>&#x03B2;</mml:mi><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>&#x03BB;</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> are reduced to zero [<xref ref-type="bibr" rid="ref-18">18</xref>]. Reducing coefficients to zero eliminates the features with low correlation to the response variable, so the features with nonzero coefficients are those retained in the derived dataset. The feature variable set <inline-formula id="ieqn-23"><mml:math id="mml-ieqn-23"><mml:msubsup><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mtext>&#x2032;</mml:mtext></mml:mrow></mml:mrow></mml:msubsup></mml:math></inline-formula> identified by LASSO is highly correlated with the response variable. The coefficients of the remaining features <inline-formula id="ieqn-24"><mml:math id="mml-ieqn-24"><mml:msubsup><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mtext>&#x2032;&#x2032;</mml:mtext></mml:mrow></mml:mrow></mml:msubsup></mml:math></inline-formula> reduce to zero, and these features are eliminated from the feature set <inline-formula id="ieqn-25"><mml:math id="mml-ieqn-25"><mml:msub><mml:mi>f</mml:mi><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>.
<disp-formula id="eqn-5"><label>(5)</label><mml:math id="mml-eqn-5" display="block"><mml:msubsup><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mtext>&#x2032;</mml:mtext></mml:mrow></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:msub><mml:mi>f</mml:mi><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:mo>&#x2212;</mml:mo><mml:msubsup><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mtext>&#x2032;&#x2032;</mml:mtext></mml:mrow></mml:mrow></mml:msubsup><mml:mo>}</mml:mo></mml:mrow></mml:math></disp-formula></p>
<p>The critical factor is the selection of penalization parameter &#x03BB;. The penalization parameter directly impacts the number of selected feature variables and the degree to which they are penalized to zero [<xref ref-type="bibr" rid="ref-16">16</xref>]. A higher value of &#x03BB; reduces all the coefficients of feature variable <inline-formula id="ieqn-26"><mml:math id="mml-ieqn-26"><mml:msub><mml:mi>f</mml:mi><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> to zero, and in turn, the model loses the most deterministic feature variable. A lower value of &#x03BB; that is almost zero includes redundant and noisy variables in the feature set <inline-formula id="ieqn-27"><mml:math id="mml-ieqn-27"><mml:msubsup><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mtext>&#x2032;</mml:mtext></mml:mrow></mml:mrow></mml:msubsup><mml:mo>.</mml:mo></mml:math></inline-formula> Although many different methods are available for <inline-formula id="ieqn-28"><mml:math id="mml-ieqn-28"><mml:mi>&#x03BB;</mml:mi></mml:math></inline-formula> selection, cross-validation is the most widely used method for optimum <inline-formula id="ieqn-29"><mml:math id="mml-ieqn-29"><mml:mi>&#x03BB;</mml:mi></mml:math></inline-formula> value selection.</p>
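The cross-validated choice of &#x03BB; described above can be sketched with scikit-learn, whose <monospace>LogisticRegressionCV</monospace> searches over C&#x00A0;=&#x00A0;1/&#x03BB;. The synthetic data below merely stand in for the QBASD matrix, and the informative columns are an assumption for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegressionCV

rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(321, 29)).astype(float)  # 29 questionnaire items
# Synthetic label driven by a few informative "questions" (an assumption).
y = (X[:, 0] + X[:, 3] + X[:, 7] > 3).astype(int)

# L1-penalized logistic regression; a grid of C = 1/lambda values is
# evaluated by 5-fold cross-validation, as described in the text.
lasso_lr = LogisticRegressionCV(
    Cs=10, cv=5, penalty="l1", solver="liblinear"
).fit(X, y)

# Features with nonzero coefficients form the derived feature set f_s'.
selected = np.flatnonzero(lasso_lr.coef_[0])
```

A larger penalty (smaller C) drives more coefficients to zero, which is exactly the trade-off the paragraph above describes.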
<p>A feature selection step before classification shows significant improvement in classification performance. LASSO feature selection method selects the feature variable set <inline-formula id="ieqn-30"><mml:math id="mml-ieqn-30"><mml:msubsup><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mtext>&#x2032;</mml:mtext></mml:mrow></mml:mrow></mml:msubsup></mml:math></inline-formula> that is highly correlated with the response variable Y. The feature set <inline-formula id="ieqn-31"><mml:math id="mml-ieqn-31"><mml:msubsup><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mtext>&#x2032;</mml:mtext></mml:mrow></mml:mrow></mml:msubsup></mml:math></inline-formula> selected by LASSO matches with the vital signs medical practitioners diligently analyze to detect ASD in children at an early age.</p>
</sec>
<sec id="s2_3_2"><label>2.3.2</label><title>ASD Detection Algorithm with LASSO</title>
<p>The following is the algorithm for the proposed methodology:</p>
<p>Input: QBASD dataset.</p>
<p>Output: Classification of the test sample data.</p>
<p><bold>Step 1:</bold> Convert the text dataset into a numerical dataset and represent it as QBASD.</p>
<p><bold>Step 2:</bold> The feature selection is carried out on the dataset QBASD using LASSO.</p>
<p><bold>Step 3:</bold> The reduced feature set QBASD&#x2019; is the input to the LR classifier.</p>
<p><bold>Step 4:</bold> Evaluate the proposed methodology using standard metrics.</p>
<p><bold>Step 5:</bold> Compare the proposed methodology with the other ML algorithms.</p>
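<p>Steps 3 and 4 above can be sketched in Python as follows. This is an illustrative stand-in only, assuming a plain gradient-descent LR classifier and synthetic data in place of the QBASD&#x2019; feature set; all helper names are hypothetical.</p>

```python
import numpy as np

def train_logreg(X, y, lr=0.1, epochs=500):
    """Gradient-descent logistic regression (stand-in for the LR classifier)."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted P(class = 1)
        g = p - y                                # gradient of the log-loss
        w -= lr * X.T @ g / len(y)
        b -= lr * g.mean()
    return w, b

def evaluate(y_true, y_pred):
    """Step 4: accuracy, precision, and F1-score from the confusion counts."""
    tp = int(((y_pred == 1) & (y_true == 1)).sum())
    fp = int(((y_pred == 1) & (y_true == 0)).sum())
    fn = int(((y_pred == 0) & (y_true == 1)).sum())
    acc = float((y_pred == y_true).mean())
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return acc, prec, f1

rng = np.random.default_rng(1)
X = rng.standard_normal((300, 7))            # 7 columns: a reduced question set
y = (X[:, 0] + X[:, 1] > 0).astype(float)    # synthetic binary labels
w, b = train_logreg(X[:200], y[:200])
pred = (1.0 / (1.0 + np.exp(-(X[200:] @ w + b))) >= 0.5).astype(float)
acc, prec, f1 = evaluate(y[200:], pred)
```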
</sec>
<sec id="s2_3_3"><label>2.3.3</label><title>Imperialist Competitive Algorithm</title>
<p>Atashpaz-Gargari et al. developed the ICA algorithm, a metaheuristic algorithm with improved convergence ability, in 2007 [<xref ref-type="bibr" rid="ref-19">19</xref>]. ICA leverages the idea of colonization, simulating the political process of imperialism, in which powerful countries overpowered weaker ones with their military resources and made them part of their colonies. ICA is thus an optimization algorithm based on the concept of political conquest. It finds applications in networking and industrial engineering; in industrial engineering, ICA has been used to optimize problems in U-type stochastic assembly line balancing [<xref ref-type="bibr" rid="ref-20">20</xref>], model sequencing [<xref ref-type="bibr" rid="ref-21">21</xref>], assembly sequence planning [<xref ref-type="bibr" rid="ref-22">22</xref>], engineering design, and production planning [<xref ref-type="bibr" rid="ref-23">23</xref>]. One recent work uses ICA and the Bat algorithm for feature selection before applying an ML algorithm for breast cancer prediction [<xref ref-type="bibr" rid="ref-24">24</xref>]. The algorithm treats all the entities in the population as countries. The strongest countries in the population become imperialist empires, and the others become colonies of the selected imperialists. Initially, colonies are distributed among the imperialist states according to the imperialists&#x2019; power. In each round of competition, the colonies move towards their imperialist empire. The competition is assessed by each empire&#x2019;s total cost, which includes a fraction of the mean cost of its colonies. Empires grow in power by attracting the colonies of competitor empires; the power of an empire is calculated from the cost function, and the empire whose power falls below that of its competitors is eliminated from the competition. 
As the rounds of competition progress, some empires grow stronger and others weaker. This gradual process of some imperialistic empires strengthening, others weakening, and the population finally converging to a single large empire is the characteristic of the optimization algorithm [<xref ref-type="bibr" rid="ref-25">25</xref>]. The algorithm is effective in selecting the most discriminating features of the dataset. In the proposed study, the dataset has 28 features, and three significant features from the feature set are assigned as the initial imperialistic states. The feature set selected by the LASSO method is listed in <xref ref-type="table" rid="table-1">Table 1</xref>. <xref ref-type="table" rid="table-2">Table 2</xref> shows the list of features selected by ICA as the best-discriminating feature set for the QBASD dataset.</p>
<table-wrap id="table-1"><label>Table 1</label><caption><title>LASSO reduced feature set for QBASD dataset</title></caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
</colgroup>
<tbody>
<tr>
<td align="left">Q1</td>
<td align="left">Q4</td>
<td align="left">Q11</td>
<td align="left">Q13</td>
<td align="left">Q18</td>
<td align="left">Q17</td>
<td align="left">Q22</td>
</tr>
</tbody>
</table>
</table-wrap><table-wrap id="table-2"><label>Table 2</label><caption><title>ICA reduced feature set for QBASD dataset</title></caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
</colgroup>
<tbody>
<tr>
<td align="left">Q1</td>
<td align="left">Q4</td>
<td align="left">Q6</td>
<td align="left">Q8</td>
<td align="left">Q9</td>
<td align="left">Q13</td>
</tr>
<tr>
<td align="left">Q16</td>
<td align="left">Q18</td>
<td align="left">Q19</td>
<td align="left">Q21</td>
<td align="left">Q22</td>
<td align="left">Q23</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>According to the feature importance ranking, Q11, Q13, and Q22 are set as the imperialistic states. The remaining 25 features are set as colonies. Features (countries) are retained by the imperialistic states based on their ability to increase the classification performance. The cost function for a metaheuristic algorithm should be a multimodal function with many local minima and a single global minimum. A metaheuristic algorithm tries to find the ideal solution in a landscape; hence, multimodal cost functions are suitable for testing the search ability of any metaheuristic algorithm. The Michalewicz function is a multimodal cost function suitable for problems with few global optimum solutions in the search space [<xref ref-type="bibr" rid="ref-26">26</xref>]. <inline-formula id="ieqn-32"><mml:math id="mml-ieqn-32"><mml:msub><mml:mi>f</mml:mi><mml:mrow><mml:mi>c</mml:mi><mml:mi>o</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is defined according to <xref ref-type="disp-formula" rid="eqn-6">Eq. (6)</xref>.
<disp-formula id="eqn-6"><label>(6)</label><mml:math id="mml-eqn-6" display="block"><mml:msub><mml:mi>f</mml:mi><mml:mrow><mml:mi>c</mml:mi><mml:mi>o</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mo>&#x2212;</mml:mo><mml:msubsup><mml:mrow><mml:mo>&#x2211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msubsup><mml:mi>sin</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:msup><mml:mrow><mml:mo>[</mml:mo><mml:mi>sin</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mfrac><mml:mrow><mml:mi>i</mml:mi><mml:msubsup><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:mrow><mml:mi>&#x03C0;</mml:mi></mml:mfrac><mml:mo>]</mml:mo></mml:mrow><mml:mrow><mml:mn>2</mml:mn><mml:mi>m</mml:mi></mml:mrow></mml:msup><mml:mo>,</mml:mo><mml:mspace width="1em" /><mml:mi>m</mml:mi><mml:mo>=</mml:mo><mml:mn>10</mml:mn></mml:math></disp-formula>where <inline-formula id="ieqn-33"><mml:math id="mml-ieqn-33"><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> varies between 0 and <inline-formula id="ieqn-34"><mml:math id="mml-ieqn-34"><mml:mi>&#x03C0;</mml:mi></mml:math></inline-formula>, and <italic>n</italic> is the number of features in the search space. The algorithm finds the cost of all countries. 
The value is normalized by taking the difference between the <inline-formula id="ieqn-35"><mml:math id="mml-ieqn-35"><mml:msub><mml:mi>f</mml:mi><mml:mrow><mml:mi>c</mml:mi><mml:mi>o</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> of each country and the maximum of the <inline-formula id="ieqn-36"><mml:math id="mml-ieqn-36"><mml:msub><mml:mi>f</mml:mi><mml:mrow><mml:mi>c</mml:mi><mml:mi>o</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> values. The total cost of an imperialist empire is computed as the sum of the imperialist&#x2019;s cost and <inline-formula id="ieqn-37"><mml:math id="mml-ieqn-37"><mml:mi>&#x03BB;</mml:mi></mml:math></inline-formula> times the mean cost of its colonies. The <inline-formula id="ieqn-38"><mml:math id="mml-ieqn-38"><mml:mi>&#x03BB;</mml:mi></mml:math></inline-formula> value is set to 0.03, so only a small fraction of the mean colony cost contributes to the total cost. The normalized total costs of the empires are then computed, the power of each empire is derived from its normalized cost, and elimination is based on this power ranking. 
The proposed algorithm sets the parameters as <inline-formula id="ieqn-39"><mml:math id="mml-ieqn-39"><mml:msub><mml:mi>N</mml:mi><mml:mrow><mml:mi>p</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>&#x2009;&#x003D;&#x2009;29, <inline-formula id="ieqn-40"><mml:math id="mml-ieqn-40"><mml:msub><mml:mi>N</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>&#x2009;&#x003D;&#x2009;3, <inline-formula id="ieqn-41"><mml:math id="mml-ieqn-41"><mml:msub><mml:mi>N</mml:mi><mml:mrow><mml:mi>c</mml:mi><mml:mi>l</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>&#x2009;&#x003D;&#x2009;26, <inline-formula id="ieqn-42"><mml:math id="mml-ieqn-42"><mml:msub><mml:mi>N</mml:mi><mml:mrow><mml:mi>d</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>&#x2009;&#x003D;&#x2009;200, revolution rate&#x2009;&#x003D;&#x2009;0.3, and assimilation coefficient <inline-formula id="ieqn-43"><mml:math id="mml-ieqn-43"><mml:mi>&#x03B2;</mml:mi></mml:math></inline-formula>&#x2009;&#x003D;&#x2009;2.</p>
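<p>The Michalewicz cost of Eq. (6) and the empire total cost described above can be sketched as follows (an illustrative sketch; the function names are assumptions, not the authors' code):</p>

```python
import numpy as np

def michalewicz(x, m=10):
    """Michalewicz multimodal cost, Eq. (6): many local minima, steep valleys."""
    i = np.arange(1, len(x) + 1)
    return -np.sum(np.sin(x) * np.sin(i * x ** 2 / np.pi) ** (2 * m))

def total_cost(imp_cost, colony_costs, lam=0.03):
    """Empire total cost: imperialist cost plus lam times the mean colony cost."""
    return imp_cost + lam * float(np.mean(colony_costs))

# near the known 2-D minimizer x ~ (2.20, 1.57) the cost approaches -1.8013
c = michalewicz(np.array([2.20, 1.57]))
```

Because each term is a product of non-negative factors on [0, &#x03C0;], the cost is never positive, and it only dips sharply in narrow valleys, which is what makes the function a demanding test of search ability.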
</sec>
<sec id="s2_3_4"><label>2.3.4</label><title>Algorithm for Feature Selection Using ICA</title>
<p>Step 1: Set <inline-formula id="ieqn-44"><mml:math id="mml-ieqn-44"><mml:msub><mml:mi>N</mml:mi><mml:mrow><mml:mi>p</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> as the initial population of countries. Set <inline-formula id="ieqn-45"><mml:math id="mml-ieqn-45"><mml:msub><mml:mi>N</mml:mi><mml:mrow><mml:mi>d</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> as the number of decades.</p>
<p>Step 2: Set <inline-formula id="ieqn-46"><mml:math id="mml-ieqn-46"><mml:msub><mml:mi>N</mml:mi><mml:mrow><mml:mi>e</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> as the best countries of the population; these are the empires.</p>
<p>&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;Set <inline-formula id="ieqn-47"><mml:math id="mml-ieqn-47"><mml:msub><mml:mi>N</mml:mi><mml:mrow><mml:mi>c</mml:mi><mml:mi>l</mml:mi><mml:mtext>&#x00A0;</mml:mtext></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mtext>&#x00A0;</mml:mtext><mml:msub><mml:mi>N</mml:mi><mml:mrow><mml:mi>p</mml:mi></mml:mrow></mml:msub><mml:mo>&#x2212;</mml:mo><mml:mtext>&#x00A0;</mml:mtext><mml:msub><mml:mi>N</mml:mi><mml:mrow><mml:mi>e</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula></p>
<p>Step 3: Initialize the countries as binary strings of length len&#x007B;<inline-formula id="ieqn-48"><mml:math id="mml-ieqn-48"><mml:msub><mml:mi>f</mml:mi><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>&#x007D;</p>
<p>Step 4: Repeat while k&#x2009;&#x003C;&#x2009;<inline-formula id="ieqn-49"><mml:math id="mml-ieqn-49"><mml:msub><mml:mi>N</mml:mi><mml:mrow><mml:mi>d</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula></p>
<p>&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;Step 4.1: For each of the empires</p>
<p>&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;Step 4.1.1: Assimilate colonies.</p>
<p>&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;<inline-formula id="ieqn-50"><mml:math id="mml-ieqn-50"><mml:mrow><mml:mtext mathvariant="italic">EmpCo</mml:mtext></mml:mrow><mml:msub><mml:mi>l</mml:mi><mml:mrow><mml:mi>p</mml:mi><mml:mi>o</mml:mi><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mrow><mml:mtext mathvariant="italic">EmpCo</mml:mtext></mml:mrow><mml:msub><mml:mi>l</mml:mi><mml:mrow><mml:mi>o</mml:mi><mml:mi>l</mml:mi><mml:mi>d</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:mi>U</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mi>&#x03B2;</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>d</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x00D7;</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:msub><mml:mi>v</mml:mi><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>}</mml:mo></mml:mrow></mml:math></inline-formula></p>
<p>&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;where <inline-formula id="ieqn-51"><mml:math id="mml-ieqn-51"><mml:msub><mml:mi>v</mml:mi><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mtext>&#x00A0;</mml:mtext><mml:mi>i</mml:mi><mml:mi>s</mml:mi><mml:mtext>&#x00A0;</mml:mtext><mml:mi>t</mml:mi><mml:mi>h</mml:mi><mml:mi>e</mml:mi><mml:mtext>&#x00A0;</mml:mtext><mml:mrow><mml:mi>s</mml:mi><mml:mi>t</mml:mi><mml:mi>a</mml:mi><mml:mi>r</mml:mi><mml:mi>t</mml:mi></mml:mrow><mml:mtext>&#x00A0;</mml:mtext><mml:mrow><mml:mi>p</mml:mi><mml:mi>o</mml:mi><mml:mi>i</mml:mi><mml:mi>n</mml:mi><mml:mi>t</mml:mi></mml:mrow><mml:mtext>&#x00A0;</mml:mtext><mml:mi>o</mml:mi><mml:mi>f</mml:mi><mml:mrow><mml:mtext>&#xA0;the&#xA0;</mml:mtext></mml:mrow><mml:mi>l</mml:mi><mml:mi>a</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi><mml:mtext>&#x00A0;</mml:mtext><mml:mrow><mml:mi>c</mml:mi><mml:mi>o</mml:mi><mml:mi>l</mml:mi><mml:mi>o</mml:mi><mml:mi>n</mml:mi><mml:mi>y</mml:mi></mml:mrow><mml:mtext>&#x00A0;</mml:mtext><mml:mrow><mml:mi>l</mml:mi><mml:mi>o</mml:mi><mml:mi>c</mml:mi><mml:mi>a</mml:mi><mml:mi>t</mml:mi><mml:mi>i</mml:mi><mml:mi>o</mml:mi><mml:mi>n</mml:mi></mml:mrow></mml:math></inline-formula>. <inline-formula id="ieqn-52"><mml:math id="mml-ieqn-52"><mml:mi>&#x03B2;</mml:mi></mml:math></inline-formula>&#x2009;&#x003E;&#x2009;1</p>
<p>&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;Step 4.1.2: Compute the cost of colonies in the empire.</p>
<p>&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;<inline-formula id="ieqn-53"><mml:math id="mml-ieqn-53"><mml:msub><mml:mi>C</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msubsup><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>c</mml:mi><mml:mi>o</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mi>m</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>&#x2212;</mml:mo><mml:munder><mml:mrow><mml:mo form="prefix">max</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:munder><mml:mo>&#x2061;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:msubsup><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>c</mml:mi><mml:mi>o</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mi>m</mml:mi><mml:mo>,</mml:mo><mml:mi>i</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula></p>
<p>&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;where f is the cost of the nth imperialist</p>
<p>&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;Step 4.1.3: Compute the cost of new colony <inline-formula id="ieqn-54"><mml:math id="mml-ieqn-54"><mml:msubsup><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>c</mml:mi><mml:mi>o</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>x</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula></p>
<p>&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;Step 4.1.4: If <inline-formula id="ieqn-55"><mml:math id="mml-ieqn-55"><mml:msubsup><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>c</mml:mi><mml:mi>o</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>x</mml:mi></mml:mrow></mml:msubsup><mml:mo>&#x003E;</mml:mo><mml:msubsup><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>c</mml:mi><mml:mi>o</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mi>m</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup></mml:math></inline-formula> then</p>
<p>&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;Step 4.1.5: Compute the total cost <inline-formula id="ieqn-56"><mml:math id="mml-ieqn-56"><mml:mi>T</mml:mi><mml:mi>o</mml:mi><mml:mi>t</mml:mi><mml:mi>C</mml:mi><mml:msub><mml:mi>t</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula></p>
<p>&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;<inline-formula id="ieqn-57"><mml:math id="mml-ieqn-57"><mml:mi>T</mml:mi><mml:mi>o</mml:mi><mml:mi>t</mml:mi><mml:mi>C</mml:mi><mml:msub><mml:mi>t</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msubsup><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>c</mml:mi><mml:mi>o</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mi>m</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>+</mml:mo><mml:mi>&#x03BB;</mml:mi><mml:mstyle displaystyle="true" scriptlevel="0"><mml:mfrac><mml:mrow><mml:msubsup><mml:mrow><mml:mo>&#x2211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>N</mml:mi><mml:msub><mml:mi>C</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msubsup><mml:mspace width="thinmathspace" /><mml:msubsup><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>c</mml:mi><mml:mi>o</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>c</mml:mi><mml:mi>o</mml:mi><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>i</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup></mml:mrow><mml:mrow><mml:mi>N</mml:mi><mml:msub><mml:mi>C</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac></mml:mstyle></mml:math></inline-formula></p>
<p>&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;Step 4.1.6: End of loop</p>
<p>&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;Step 4.2: Compute the distance between the empires.</p>
<p>&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;Step 4.3: If the distance&#x2009;&#x003C;&#x2009;threshold value</p>
<p>&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;Step 4.3.1: Unite the empires.</p>
<p>&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;Step 4.3.2: Update the position and cost of the colonies. The new position is <inline-formula id="ieqn-58"><mml:math id="mml-ieqn-58"><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mi>e</mml:mi><mml:mi>w</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula></p>
<p>&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;<inline-formula id="ieqn-59"><mml:math id="mml-ieqn-59"><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mi>e</mml:mi><mml:mi>w</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>o</mml:mi><mml:mi>l</mml:mi><mml:mi>d</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:mi>U</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mi>&#x03B2;</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>d</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x00D7;</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:msub><mml:mi>v</mml:mi><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>}</mml:mo></mml:mrow></mml:math></inline-formula></p>
<p>&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;Step 4.4: Update the cost of new list of the empires</p>
<p>&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;<inline-formula id="ieqn-60"><mml:math id="mml-ieqn-60"><mml:mi>T</mml:mi><mml:mi>o</mml:mi><mml:mi>t</mml:mi><mml:mi>C</mml:mi><mml:msub><mml:mi>t</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msubsup><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>c</mml:mi><mml:mi>o</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mi>m</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>+</mml:mo><mml:mi>&#x03BB;</mml:mi><mml:mstyle displaystyle="true" scriptlevel="0"><mml:mfrac><mml:mrow><mml:msubsup><mml:mrow><mml:mo>&#x2211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>N</mml:mi><mml:msub><mml:mi>C</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msubsup><mml:mspace width="thinmathspace" /><mml:msubsup><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>c</mml:mi><mml:mi>o</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>c</mml:mi><mml:mi>o</mml:mi><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>i</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup></mml:mrow><mml:mrow><mml:mi>N</mml:mi><mml:msub><mml:mi>C</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac></mml:mstyle></mml:math></inline-formula></p>
<p>&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;Step 4.5: Compute the normalized cost</p>
<p>&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;<inline-formula id="ieqn-61"><mml:math id="mml-ieqn-61"><mml:mrow><mml:mtext mathvariant="italic">NormT</mml:mtext></mml:mrow><mml:msub><mml:mi>C</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mi>T</mml:mi><mml:mi>o</mml:mi><mml:mi>t</mml:mi><mml:mi>C</mml:mi><mml:msub><mml:mi>t</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mo>&#x2212;</mml:mo><mml:munder><mml:mrow><mml:mo form="prefix">max</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:munder><mml:mo>&#x2061;</mml:mo><mml:mi>T</mml:mi><mml:mi>o</mml:mi><mml:mi>t</mml:mi><mml:mi>C</mml:mi><mml:msub><mml:mi>t</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula></p>
<p>&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;Step 4.6: Compute the Power</p>
<p>&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;<inline-formula id="ieqn-62"><mml:math id="mml-ieqn-62"><mml:mi>P</mml:mi><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:mstyle displaystyle="true" scriptlevel="0"><mml:mfrac><mml:mrow><mml:mrow><mml:mtext mathvariant="italic">NormT</mml:mtext></mml:mrow><mml:msub><mml:mi>C</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:msubsup><mml:mrow><mml:mo>&#x2211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msub><mml:mi>N</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>m</mml:mi><mml:mi>p</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msubsup><mml:mrow><mml:mtext mathvariant="italic">NormT</mml:mtext></mml:mrow><mml:msub><mml:mi>C</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac></mml:mstyle></mml:math></inline-formula></p>
<p>&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;Step 4.7: Eliminate the empire with the weakest power or no colonies.</p>
<p>&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;&#x2002;Step 4.8: If (<inline-formula id="ieqn-63"><mml:math id="mml-ieqn-63"><mml:msub><mml:mi>N</mml:mi><mml:mrow><mml:mi>e</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>&#x2009;&#x003D;&#x003D;&#x2009;1) then break from loop</p>
<p>&#x2002;&#x2002;&#x2002;end loop</p>
<p>Step 5: Select the best feature set <inline-formula id="ieqn-64"><mml:math id="mml-ieqn-64"><mml:msubsup><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mtext>&#x2032;</mml:mtext></mml:mrow></mml:mrow></mml:msubsup></mml:math></inline-formula></p>
<p>Step 6: Classification is performed with the most common classifiers.</p>
<p>Step 7: Evaluate the performance of the classifiers with metrics of accuracy, precision, and F1-score.</p>
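<p>The competition loop of Steps 4.1&#x2013;4.8 can be sketched as a simplified continuous minimizer of the Michalewicz cost. Assimilation, revolution, and the colony&#x2013;imperialist swap are included; uniting and eliminating empires (Steps 4.3 and 4.7) are omitted for brevity. The parameter values follow the settings above (29 countries, 3 empires, 200 decades, &#x03B2;&#x2009;=&#x2009;2, revolution rate 0.3), but all function names are illustrative assumptions.</p>

```python
import numpy as np

rng = np.random.default_rng(0)

def michalewicz(x, m=10):
    """Michalewicz cost function of Eq. (6), minimized over [0, pi]^dim."""
    i = np.arange(1, len(x) + 1)
    return -np.sum(np.sin(x) * np.sin(i * x ** 2 / np.pi) ** (2 * m))

def ica_minimize(cost, dim, n_pop=29, n_imp=3, n_decades=200, beta=2.0, rev=0.3):
    # Steps 1-3: initial countries; the best n_imp become imperialists
    pop = rng.uniform(0, np.pi, size=(n_pop, dim))
    order = np.argsort([cost(c) for c in pop])
    imps, cols = pop[order[:n_imp]].copy(), pop[order[n_imp:]].copy()
    owner = rng.integers(0, n_imp, size=len(cols))   # colony -> empire assignment
    for _ in range(n_decades):                       # Step 4: the decades loop
        for k in range(len(cols)):
            # Step 4.1.1: assimilation - move toward the owning imperialist
            d = imps[owner[k]] - cols[k]
            cols[k] = np.clip(cols[k] + rng.uniform(0.0, beta) * d, 0.0, np.pi)
            if rng.random() < rev:                   # revolution: random restart
                cols[k] = rng.uniform(0, np.pi, dim)
            # Steps 4.1.3-4.1.4: a colony that beats its imperialist replaces it
            if cost(cols[k]) < cost(imps[owner[k]]):
                imps[owner[k]], cols[k] = cols[k].copy(), imps[owner[k]].copy()
    best = min(range(n_imp), key=lambda i: cost(imps[i]))
    return imps[best], cost(imps[best])

best_x, best_cost = ica_minimize(michalewicz, dim=2)
```

Because an imperialist is replaced whenever one of its colonies finds a lower cost, each empire's cost decreases monotonically over the decades, mirroring the convergence behavior described in Section 2.3.3.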
</sec>
<sec id="s2_3_5"><label>2.3.5</label><title>ASD Detection Algorithm with ICA</title>
<p><italic>Input</italic>: QBASD dataset.</p>
<p><italic>Output</italic>: Classification of the test data.</p>
<p>Step 1: Convert the text ASD dataset into a numerical dataset and represent it as <inline-formula id="ieqn-65"><mml:math id="mml-ieqn-65"><mml:mrow><mml:mtext mathvariant="italic">QBASD</mml:mtext></mml:mrow></mml:math></inline-formula>.</p>
<p>Step 2: The feature selection is made on dataset <inline-formula id="ieqn-66"><mml:math id="mml-ieqn-66"><mml:mrow><mml:mtext mathvariant="italic">QBASD</mml:mtext></mml:mrow></mml:math></inline-formula> using the ICA feature selection method.</p>
<p>Step 3: The reduced feature set is given as the input to the LR classifier.</p>
<p>Step 4: Evaluate the proposed methodology using standard metrics.</p>
<p>Step 5: Compare the proposed methodology with the other ML algorithms.</p>
</sec>
</sec>
<sec id="s2_4"><label>2.4</label><title>Classification</title>
<p>The classification task aims to classify a new data sample into one of the labeled classes based on the patterns of the training dataset. Consider a dataset &#x0110; with a unique feature set <inline-formula id="ieqn-67"><mml:math id="mml-ieqn-67"><mml:msub><mml:mi>f</mml:mi><mml:mrow><mml:mi>s</mml:mi><mml:mtext>&#x00A0;</mml:mtext></mml:mrow></mml:msub><mml:mtext>&#x00A0;</mml:mtext><mml:mrow><mml:mtext mathvariant="italic">given</mml:mtext></mml:mrow><mml:mtext>&#x00A0;</mml:mtext><mml:mi>a</mml:mi><mml:mi>s</mml:mi><mml:mrow><mml:mo>{</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mn>3</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mo>}</mml:mo></mml:mrow></mml:math></inline-formula>. The output response variable <inline-formula id="ieqn-68"><mml:math id="mml-ieqn-68"><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mtext>&#x00A0;</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula> for every <inline-formula id="ieqn-69"><mml:math id="mml-ieqn-69"><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> is zero or one. The <inline-formula id="ieqn-70"><mml:math id="mml-ieqn-70"><mml:mrow><mml:mtext>Y</mml:mtext></mml:mrow></mml:math></inline-formula> response variable represents the labeled class of the specific data sample. The LR method computes the probability of the data sample belonging to a binary class [<xref ref-type="bibr" rid="ref-27">27</xref>].
<disp-formula id="eqn-7"><label>(7)</label><mml:math id="mml-eqn-7" display="block"><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mn>1</mml:mn><mml:mo fence="false" stretchy="false">|</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>&#x03B1;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mrow><mml:mn>1</mml:mn><mml:mo>+</mml:mo><mml:msup><mml:mi>e</mml:mi><mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03B1;</mml:mi><mml:mo>,</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup></mml:mrow></mml:mfrac></mml:math></disp-formula>
<disp-formula id="eqn-8"><label>(8)</label><mml:math id="mml-eqn-8" display="block"><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mn>0</mml:mn><mml:mo fence="false" stretchy="false">|</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>&#x03B1;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x2212;</mml:mo><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mn>1</mml:mn><mml:mo fence="false" stretchy="false">|</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>&#x03B1;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>.</mml:mo></mml:math></disp-formula></p>
<p>The linear model of the problem is <inline-formula id="ieqn-71"><mml:math id="mml-ieqn-71"><mml:mi>y</mml:mi><mml:mo>=</mml:mo><mml:mi>x</mml:mi><mml:mi>&#x03B2;</mml:mi><mml:mo>+</mml:mo><mml:mi>&#x03F5;</mml:mi></mml:math></inline-formula>, where y is the response variable column vector, <italic>x</italic> is the dataset matrix, <inline-formula id="ieqn-72"><mml:math id="mml-ieqn-72"><mml:mi>&#x03B2;</mml:mi></mml:math></inline-formula> is the parameter, and <inline-formula id="ieqn-73"><mml:math id="mml-ieqn-73"><mml:mi>&#x03F5;</mml:mi></mml:math></inline-formula> is the error. In the equation, y is a random variable with a probability distribution <inline-formula id="ieqn-74"><mml:math id="mml-ieqn-74"><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>.
<disp-formula id="eqn-9"><label>(9)</label><mml:math id="mml-eqn-9" display="block"><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mtable columnalign="left left" rowspacing=".2em" columnspacing="1em" displaystyle="false"><mml:mtr><mml:mtd><mml:msub><mml:mi>p</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mtd><mml:mtd><mml:mi>i</mml:mi><mml:mi>f</mml:mi><mml:mtext>&#x00A0;</mml:mtext><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mn>1</mml:mn><mml:mo>&#x2212;</mml:mo><mml:msub><mml:mi>p</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mtd><mml:mtd><mml:mi>i</mml:mi><mml:mi>f</mml:mi><mml:mtext>&#x00A0;</mml:mtext><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:mtd></mml:mtr></mml:mtable><mml:mo fence="true" stretchy="true" symmetric="true"></mml:mo></mml:mrow></mml:math></disp-formula>
<disp-formula id="eqn-10"><label>(10)</label><mml:math id="mml-eqn-10" display="block"><mml:msub><mml:mi>&#x03F5;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mtable columnalign="left left" rowspacing=".2em" columnspacing="1em" displaystyle="false"><mml:mtr><mml:mtd><mml:mn>1</mml:mn><mml:mo>&#x2212;</mml:mo><mml:msub><mml:mi>p</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mtd><mml:mtd><mml:mi>i</mml:mi><mml:mi>f</mml:mi><mml:mtext>&#x00A0;</mml:mtext><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mtext>&#x00A0;</mml:mtext><mml:mi>w</mml:mi><mml:mi>i</mml:mi><mml:mi>t</mml:mi><mml:mi>h</mml:mi><mml:mtext>&#x00A0;</mml:mtext><mml:mrow><mml:mtext mathvariant="italic">probability</mml:mtext></mml:mrow><mml:mtext>&#x00A0;</mml:mtext><mml:msub><mml:mi>p</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mo>&#x2212;</mml:mo><mml:msub><mml:mi>p</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mtd><mml:mtd><mml:mi>i</mml:mi><mml:mi>f</mml:mi><mml:mtext>&#x00A0;</mml:mtext><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mtext>&#x00A0;</mml:mtext><mml:mi>w</mml:mi><mml:mi>i</mml:mi><mml:mi>t</mml:mi><mml:mi>h</mml:mi><mml:mtext>&#x00A0;</mml:mtext><mml:mrow><mml:mtext mathvariant="italic">probability</mml:mtext></mml:mrow><mml:mtext>&#x00A0;</mml:mtext><mml:mn>1</mml:mn><mml:mo>&#x2212;</mml:mo><mml:msub><mml:mi>p</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mtd></mml:mtr></mml:mtable><mml:mo fence="true" stretchy="true" symmetric="true"></mml:mo></mml:mrow></mml:math></disp-formula></p>
<p>The logistic function is
<disp-formula id="eqn-11"><label>(11)</label><mml:math id="mml-eqn-11" display="block"><mml:mi>E</mml:mi><mml:mrow><mml:mo>[</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mtext>&#x00A0;</mml:mtext><mml:mo fence="false" stretchy="false">|</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mtext>&#x00A0;</mml:mtext><mml:mi>&#x03B2;</mml:mi><mml:mo>]</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:msup><mml:mi>e</mml:mi><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mi>&#x03B2;</mml:mi></mml:mrow></mml:msup><mml:mrow><mml:mn>1</mml:mn><mml:mo>+</mml:mo><mml:msup><mml:mi>e</mml:mi><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mi>&#x03B2;</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:mfrac><mml:mspace width="thinmathspace" /><mml:mi>f</mml:mi><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mtext>&#x00A0;</mml:mtext><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mn>2</mml:mn><mml:mo>,</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:math></disp-formula></p>
<p>The logit transformation is
<disp-formula id="eqn-12"><label>(12)</label><mml:math id="mml-eqn-12" display="block"><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mi>g</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>p</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mi>l</mml:mi><mml:mi>n</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mfrac><mml:msub><mml:mi>p</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mn>1</mml:mn><mml:mo>&#x2212;</mml:mo><mml:msub><mml:mi>p</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula></p>
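<p>Eq. (12) is the inverse of the logistic function in Eq. (11): the logit maps a probability back to the linear score. A minimal Python sketch (illustrative only, not the study&#x2019;s published implementation) verifies this round trip:</p>

```python
import math

def logistic(eta):
    """Logistic (sigmoid) function of Eq. (11): maps a linear score to (0, 1)."""
    return math.exp(eta) / (1.0 + math.exp(eta))

def logit(p):
    """Logit transformation of Eq. (12): ln(p / (1 - p))."""
    return math.log(p / (1.0 - p))

# The logit of a logistic-transformed score recovers the original score.
score = 1.7
p = logistic(score)
assert abs(logit(p) - score) < 1e-9
```

<p>Because the two functions are inverses, fitting coefficients on the logit scale is equivalent to modeling class probabilities directly.</p>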
<p>The logit function implicitly places a separating hyperplane in the input space between the two classes [<xref ref-type="bibr" rid="ref-28">28</xref>]. The decision tree algorithm is a supervised learning method that classifies data effectively based on multiple covariates. The decision tree classifier is a tree-structured classifier well suited to medical data. It selects the most discriminating feature in the dataset as the tree&#x2019;s root and builds the tree by choosing the best remaining attributes in the feature set as decision nodes. As the tree grows, the decision nodes split the data samples into groups, and the leaf nodes divide the data into classes. Medical data can produce heavily skewed trees; the decision tree remains an effective classification method here because it splits skewed attributes into ranges [<xref ref-type="bibr" rid="ref-29">29</xref>]. The root node partitions the dataset into disjoint sets; selecting relevant features for each disjoint set and applying the same procedure recursively constructs the complete tree. Each decision node generates nonoverlapping sub-datasets that are finally grouped into labeled classes by the leaf nodes [<xref ref-type="bibr" rid="ref-30">30</xref>]. Based on the features, the decision tree can classify a data sample as ASD or non-ASD. KNN is a classification algorithm that labels test data by similarity to its k nearest labeled samples, where k is an odd number; the test data is assigned the class that occurs most frequently among those k neighbors. RF is a popular ensemble method that combines multiple base learners, each a decision tree, to classify a new data sample. Given a dataset D with feature set f and sample size s, row sampling with replacement draws multiple samples from D as input to the base learners, denoted DTi. Feature sampling selects random features as input to each DTi. The base learners are trained on these samples, and each trained learner classifies the new test data sample independently. Because the base learners may predict different outputs, the final decision is made by a majority voting scheme. Row sampling and feature sampling improve the classification accuracy, and aggregating multiple decision trees converts high-variance predictions on new sample data into low-variance predictions [<xref ref-type="bibr" rid="ref-31">31</xref>]. <xref ref-type="fig" rid="fig-2">Fig. 2</xref> represents the flow chart for the proposed ASD detection system with the ICA algorithm.</p>
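<p>The row sampling, feature sampling, and majority voting steps above can be made concrete with a short Python sketch. The dataset and the one-level base learners below are toy stand-ins (the actual base learners are full decision trees); the sketch shows only the ensemble mechanics:</p>

```python
import random
from collections import Counter

random.seed(0)

# Toy dataset D (hypothetical): s = 8 samples, f = 3 binary features;
# in this toy every feature happens to agree with the label.
D = [([y, y, y], y) for y in (0, 1) for _ in range(4)]

def train_stump(rows, feat_idx):
    """A one-level 'tree': majority label on each side of feature feat_idx."""
    votes = {0: Counter(), 1: Counter()}
    for x, y in rows:
        votes[x[feat_idx]][y] += 1
    return {v: (c.most_common(1)[0][0] if c else 0) for v, c in votes.items()}

# Build base learners DT_i: row sampling with replacement plus feature sampling.
ensemble = []
for _ in range(25):
    bootstrap = [random.choice(D) for _ in range(len(D))]  # row sampling
    feat = random.randrange(3)                             # feature sampling
    ensemble.append((feat, train_stump(bootstrap, feat)))

def predict(x):
    """Each base learner votes on the new sample; the majority decides."""
    ballots = Counter(stump[x[feat]] for feat, stump in ensemble)
    return ballots.most_common(1)[0][0]
```

<p>Because each base learner sees a different bootstrap sample and feature, their individual errors differ, and the majority vote averages them out, which is how the ensemble lowers variance.</p>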
<fig id="fig-2"><label>Figure 2</label><caption><title>Flow chart for the proposed ASD detection system with ICA algorithm</title></caption><graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_40874-fig-2.tif"/></fig>
</sec>
</sec>
<sec id="s3"><label>3</label><title>Results and Discussions</title>
<p>Feature selection, a preprocessing phase prior to classification, improves the model&#x2019;s performance by avoiding overfitting of the data [<xref ref-type="bibr" rid="ref-32">32</xref>]. In this research, an experimental study analyzes the model performance with the proposed feature selection algorithm. The proposed algorithms are implemented in Python using the QBASD dataset. The classification accuracy of common classifiers on QBASD without feature reduction is tabulated in <xref ref-type="table" rid="table-3">Table 3</xref>.</p>
<table-wrap id="table-3"><label>Table 3</label><caption><title>Classification metrics of ASD and non-ASD for QBASD dataset</title></caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th align="left" rowspan="2">Classifiers</th>
<th align="center" colspan="2">Precision score</th>
<th align="center" colspan="2">F1-score</th>
<th align="center" colspan="2">Recall rate</th>
<th align="left" rowspan="2">Accuracy of the model</th>
<th align="left" rowspan="2">AUC of the model</th>
</tr>
<tr>
<th align="left">ASD</th>
<th align="left">Non-ASD</th>
<th align="left">ASD</th>
<th align="left">Non-ASD</th>
<th align="left">ASD</th>
<th align="left">Non-ASD</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">Decision tree</td>
<td align="left">92</td>
<td align="left">90</td>
<td align="left">92</td>
<td align="left">90</td>
<td align="left">92</td>
<td align="left">90</td>
<td align="left">90.9</td>
<td align="left">90.83</td>
</tr>
<tr>
<td align="left">LR</td>
<td align="left">83</td>
<td align="left">80</td>
<td align="left">83</td>
<td align="left">80</td>
<td align="left">83</td>
<td align="left">80</td>
<td align="left">81.8</td>
<td align="left">81.66</td>
</tr>
<tr>
<td align="left">KNN (n&#x2009;&#x003D;&#x2009;5)</td>
<td align="left">73</td>
<td align="left">64</td>
<td align="left">70</td>
<td align="left">67</td>
<td align="left">67</td>
<td align="left">70</td>
<td align="left">68.1</td>
<td align="left">68.33</td>
</tr>
<tr>
<td align="left">RF</td>
<td align="left">79</td>
<td align="left">88</td>
<td align="left">85</td>
<td align="left">78</td>
<td align="left">92</td>
<td align="left">70</td>
<td align="left">81.8</td>
<td align="left">80.83</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>The receiver operating characteristic (ROC) curve is a graph showing the performance of a classification model: it plots the false positive rate along the x-axis and the true positive rate along the y-axis. The area under the curve (AUC) measures the area under the ROC curve. <xref ref-type="fig" rid="fig-3">Fig. 3</xref> shows the ROC curve results for the QBASD dataset. <xref ref-type="fig" rid="fig-3">Fig. 3a</xref> shows that the decision tree classifier has the highest performance metrics among the classifiers; it performs best because the features at each node split the dataset into two sets. LR performs worse than the decision tree, as it is better suited to prediction than to classification. KNN suffers from the curse of dimensionality; hence its performance is inferior on the complete QBASD dataset. High k values result in underfitting, while low k values result in overfitting. In RF, the recall rate is high because the true positives are high and the false negatives are negligible for ASD class detection, while the precision of the ASD detection model is low because the false positives are high.</p>
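<p>The AUC values reported in Tables 3&#x2013;5 have a rank interpretation: AUC equals the probability that a randomly chosen positive sample is scored above a randomly chosen negative one. A short Python sketch with hypothetical scores (not the study&#x2019;s actual predictions) illustrates the computation:</p>

```python
def roc_auc(labels, scores):
    """AUC as the rank statistic: fraction of positive/negative pairs in
    which the positive sample outscores the negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical predicted probabilities for six test samples.
y_true = [1, 1, 1, 0, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
auc = roc_auc(y_true, y_score)  # 8/9: one of nine pairs is mis-ranked
```

<p>An AUC of 1.0 corresponds to perfect separation of the two classes, as LR achieves in Table 4, and 0.5 corresponds to random guessing.</p>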
<fig id="fig-3"><label>Figure 3</label><caption><title>ROC curve for the classifiers based on QBASD (a) decision tree classifier (b) KNN classifier (c) LR classifier (d) RF classifier</title></caption><graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_40874-fig-3a.tif"/><graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_40874-fig-3b.tif"/></fig>
<p><xref ref-type="table" rid="table-4">Table 4</xref> shows the classification metrics for the proposed ASD detection model with the ICA feature reduction algorithm. The LR and RF classifiers outperform the other classifiers. RF has the advantage of being robust to outliers; hence this algorithm performs well on the QBASD data, which may contain outliers because it is based on the child&#x2019;s behavior and social interactions.</p>
<table-wrap id="table-4"><label>Table 4</label><caption><title>Classification metrics for the proposed ASD detection algorithm with ICA on QBASD dataset</title></caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th align="left" rowspan="2">Classifiers</th>
<th align="center" colspan="2">Precision</th>
<th align="center" colspan="2">F1-score</th>
<th align="center" colspan="2">Recall</th>
<th align="left" rowspan="2">Accuracy</th>
<th align="left" rowspan="2">AUC</th>
</tr>
<tr>
<th align="left">ASD</th>
<th align="left">Non-ASD</th>
<th align="left">ASD</th>
<th align="left">Non-ASD</th>
<th align="left">ASD</th>
<th align="left">Non-ASD</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">Decision tree</td>
<td align="left">73</td>
<td align="left">100</td>
<td align="left">85</td>
<td align="left">78</td>
<td align="left">100</td>
<td align="left">64</td>
<td align="left">82</td>
<td align="left">0.81</td>
</tr>
<tr>
<td align="left">KNN (n&#x2009;&#x003D;&#x2009;5)</td>
<td align="left">90</td>
<td align="left">83</td>
<td align="left">86</td>
<td align="left">87</td>
<td align="left">82</td>
<td align="left">91</td>
<td align="left">86</td>
<td align="left">0.86</td>
</tr>
<tr>
<td align="left">LR</td>
<td align="left">100</td>
<td align="left">100</td>
<td align="left">100</td>
<td align="left">100</td>
<td align="left">100</td>
<td align="left">100</td>
<td align="left">100</td>
<td align="left">1</td>
</tr>
<tr>
<td align="left">RF</td>
<td align="left">92</td>
<td align="left">100</td>
<td align="left">96</td>
<td align="left">95</td>
<td align="left">100</td>
<td align="left">91</td>
<td align="left">95</td>
<td align="left">0.95</td>
</tr>
</tbody>
</table>
</table-wrap>
<p><xref ref-type="fig" rid="fig-4">Fig. 4</xref> shows the ROC curves after implementing the LASSO feature reduction algorithm and building the classifiers in Python. The decision tree classifier follows a greedy approach: the decision at each level affects the next level, so the selected features shape the tree, and its performance suffers on small datasets. KNN is sensitive to outliers, since a single outlier can shift the classification boundary; it performs poorly here because the reduced dataset may contain outliers. LR performs exceptionally well on linearly separable datasets, and QBASD is a simple, linearly separable dataset. Random forest can handle outliers by binning the variables, and it performs its own feature selection while building its decision trees; therefore, its accuracy is high.</p>
<fig id="fig-4"><label>Figure 4</label><caption><title>ROC curve for LASSO-based algorithm on QBASD dataset (a) decision tree classifier (b) KNN classifier (c) LR classifier (d) RF classifier</title></caption><graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_40874-fig-4.tif"/></fig>
<p><xref ref-type="table" rid="table-5">Table 5</xref> shows the classification metrics for the proposed ASD detection model with the LASSO feature reduction algorithm. LR shows improved performance compared to the other classifiers. Although LR risks overfitting on high-dimensional datasets, it shows increased accuracy on the QBASD dataset reduced by the LASSO method because the few selected features are highly correlated with the target class.</p>
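<p>LASSO performs feature reduction by adding an L1 penalty that drives the coefficients of uninformative features exactly to zero. The coordinate-descent sketch below is a generic textbook formulation on synthetic data, not the study&#x2019;s code; it shows how only the informative features survive the penalty:</p>

```python
import numpy as np

def soft_threshold(rho, lam):
    """Soft-thresholding: shrinks rho toward zero, clipping small values to exactly zero."""
    return np.sign(rho) * max(abs(rho) - lam, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate-descent LASSO for (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n, d = X.shape
    beta = np.zeros(d)
    for _ in range(n_iter):
        for j in range(d):
            # Partial residual excluding feature j's current contribution.
            r_j = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ r_j / n
            beta[j] = soft_threshold(rho, lam) / (X[:, j] @ X[:, j] / n)
    return beta

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
# Only features 0 and 1 actually influence the target.
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.standard_normal(200)
beta = lasso_cd(X, y, lam=0.5)
selected = [int(j) for j in np.flatnonzero(np.abs(beta) > 1e-6)]
```

<p>Features whose coefficients are zeroed are discarded; the survivors form the reduced feature set passed on to the classifiers.</p>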
<table-wrap id="table-5"><label>Table 5</label><caption><title>Classification metrics for the proposed ASD detection algorithm with LASSO on QBASD dataset</title></caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th align="left" rowspan="2">Classifiers</th>
<th align="center" colspan="2">Precision</th>
<th align="center" colspan="2">F1-score</th>
<th align="center" colspan="2">Recall</th>
<th align="left" rowspan="2">Accuracy</th>
<th align="left" rowspan="2">AUC</th>
</tr>
<tr>
<th align="left">ASD</th>
<th align="left">Non-ASD</th>
<th align="left">ASD</th>
<th align="left">Non-ASD</th>
<th align="left">ASD</th>
<th align="left">Non-ASD</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">Decision tree</td>
<td align="left">83</td>
<td align="left">70</td>
<td align="left">80</td>
<td align="left">74</td>
<td align="left">77</td>
<td align="left">78</td>
<td align="left">77</td>
<td align="left">0.77</td>
</tr>
<tr>
<td align="left">KNN (n&#x2009;&#x003D;&#x2009;5)</td>
<td align="left">75</td>
<td align="left">86</td>
<td align="left">75</td>
<td align="left">86</td>
<td align="left">75</td>
<td align="left">86</td>
<td align="left">82</td>
<td align="left">0.81</td>
</tr>
<tr>
<td align="left">LR</td>
<td align="left">91</td>
<td align="left">100</td>
<td align="left">95</td>
<td align="left">96</td>
<td align="left">100</td>
<td align="left">92</td>
<td align="left">95</td>
<td align="left">0.95</td>
</tr>
<tr>
<td align="left">RF</td>
<td align="left">71</td>
<td align="left">100</td>
<td align="left">83</td>
<td align="left">80</td>
<td align="left">100</td>
<td align="left">67</td>
<td align="left">82</td>
<td align="left">0.83</td>
</tr>
</tbody>
</table>
</table-wrap>
<p><xref ref-type="fig" rid="fig-5">Fig. 5</xref> shows the ROC curves after implementing the ICA feature reduction algorithm and building the classifiers in Python. The ROC curve for the decision tree shows a poor recall rate because the false negatives are high, and its precision rate is affected by the false positive rate. LR shows high precision and recall rates because its false positive and false negative counts are low.</p>
<fig id="fig-5"><label>Figure 5</label><caption><title>ROC curve for the ICA-based algorithm on QBASD dataset (a) decision tree classifier (b) KNN classifier (c) LR classifier (d) RF classifier</title></caption><graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_40874-fig-5.tif"/></fig>
<p>The RF classifier gives a poor precision rate because the false positives are high. A simple KNN classifier is robust to noisy data and performs well compared to more sophisticated classifiers [<xref ref-type="bibr" rid="ref-33">33</xref>]. KNN gives average accuracy for the ICA-based ASD detection algorithm. The optimum value of k also has an impact on the accuracy of the model; this study chooses the value k&#x2009;&#x003D;&#x2009;5 by trial and error. The precision is lower for non-ASD class detection than for ASD detection, and in the medical field precision in diagnosis is a significant factor. LR gives improved accuracy, F1-score, and precision compared to the other classifiers. Feature selection by LASSO and ICA reduces the dataset to a crisp dataset with a few independent, uncorrelated features; hence the LR algorithm shows high performance for data whose feature set contains independent variables [<xref ref-type="bibr" rid="ref-28">28</xref>]. The reduced feature set makes the dataset linearly separable, and LR gives improved results with QBASD [<xref ref-type="bibr" rid="ref-34">34</xref>].</p>
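<p>The trial-and-error choice of k described above can be made systematic with a leave-one-out search over candidate values. The sketch below uses a hypothetical one-dimensional toy set with one mislabeled point; it illustrates only the procedure, not the QBASD experiment:</p>

```python
from collections import Counter

# Hypothetical toy data: class 0 clusters near 0, class 1 near 10;
# the point (1.0, 1) is a deliberately mislabeled (noisy) sample.
data = [(0.0, 0), (0.5, 0), (1.0, 1), (1.5, 0), (2.0, 0),
        (9.0, 1), (9.5, 1), (10.0, 1), (10.5, 1), (11.0, 1)]

def knn_predict(train, x, k):
    """Label x by the majority class among its k nearest training samples."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def loo_accuracy(k):
    """Leave-one-out accuracy: a systematic stand-in for trial-and-error k search."""
    hits = sum(knn_predict(data[:i] + data[i + 1:], x, k) == y
               for i, (x, y) in enumerate(data))
    return hits / len(data)

scores = {k: loo_accuracy(k) for k in (1, 3, 5)}
```

<p>On this toy data the single noisy label misleads k&#x2009;&#x003D;&#x2009;1 but is outvoted at k&#x2009;&#x003D;&#x2009;3 and k&#x2009;&#x003D;&#x2009;5, which is why odd values of k larger than one are preferred for noisy data.</p>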
</sec>
<sec id="s4"><label>4</label><title>Exploratory Data Analysis Phase</title>
<p>The data visualization of the QBASD dataset reveals some interesting facts about the questions in the QBASD questionnaire and ASD detection. <xref ref-type="table" rid="table-6">Table 6</xref> tabulates the essential questions related to the significant ASD signs in children under five. The CADx proposed in this study can detect ASD in children under five years, as the symptoms analyzed are specific to the age group of 3 months to five years. <xref ref-type="fig" rid="fig-6">Fig. 6</xref> shows the data visualizations of the correlation of features in the QBASD dataset; the figures show how strongly or weakly each feature is correlated with the response variable. All the significant features of the QBASD dataset are listed in <xref ref-type="table" rid="table-6">Table 6</xref>.</p>
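<p>The feature-target correlations visualized in Fig. 6 can also be quantified numerically: for binary questionnaire responses, the Pearson coefficient reduces to the phi coefficient. The sketch below uses hypothetical responses (the QBASD records themselves are confidential) to show the computation; the question names are illustrative labels mirroring Table 6:</p>

```python
# Hypothetical binary responses (1 = yes) for six children; Q29 is the target.
rows = {
    "Q18_never_follows_gaze": [1, 1, 1, 0, 0, 0],
    "Q22_inconsistent_attention": [1, 1, 0, 1, 0, 0],
    "Q26_mother_on_medication": [0, 1, 0, 0, 1, 0],
    "Q29_asd": [1, 1, 1, 0, 0, 0],
}

def pearson(a, b):
    """Pearson correlation; for two binary variables this is the phi coefficient."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b)) / n
    va = sum((x - ma) ** 2 for x in a) / n
    vb = sum((y - mb) ** 2 for y in b) / n
    return cov / (va * vb) ** 0.5

target = rows["Q29_asd"]
corr = {q: pearson(v, target) for q, v in rows.items() if q != "Q29_asd"}
```

<p>A coefficient near 1 (the gaze-following item here) marks a strong biomarker, while a coefficient near 0 (the maternal-medication item) matches the inconclusive pattern seen in Fig. 6f.</p>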
<table-wrap id="table-6"><label>Table 6</label><caption><title>Important features in QBASD dataset</title></caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th align="left">Question No.</th>
<th align="left">Question description</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">Q4</td>
<td align="left">Does your child show interest in playing or interacting with another child?</td>
</tr>
<tr>
<td align="left">Q6</td>
<td align="left">Does the child smile when you smile at it?</td>
</tr>
<tr>
<td align="left">Q8</td>
<td align="left">Does your child use his/her index finger to point to indicate interest in something?</td>
</tr>
<tr>
<td align="left">Q9</td>
<td align="left">Does your child look at you in your eye for a second or two?</td>
</tr>
<tr>
<td align="left">Q13</td>
<td align="left">Does your child make unusual noises?</td>
</tr>
<tr>
<td align="left">Q14</td>
<td align="left">Is your child free of stereotyped repetitive movements?</td>
</tr>
<tr>
<td align="left">Q15</td>
<td align="left">Does your child have Hyper/Hypo behavior?</td>
</tr>
<tr>
<td align="left">Q16</td>
<td align="left">Does your child engage in self-injurious behavior?</td>
</tr>
<tr>
<td align="left">Q18</td>
<td align="left">Does the child never follow your gaze?</td>
</tr>
<tr>
<td align="left">Q21</td>
<td align="left">Does your child delay to respond to your call?</td>
</tr>
<tr>
<td align="left">Q22</td>
<td align="left">Does your child have inconsistent attention?</td>
</tr>
<tr>
<td align="left">Q23</td>
<td align="left">Does the child have unusual memory?</td>
</tr>
<tr>
<td align="left">Q25</td>
<td align="left">Did the mother have a deficiency in essential nutrients and fatty acids during pregnancy?</td>
</tr>
<tr>
<td align="left">Q26</td>
<td align="left">Is the mother under medications such as antidepressant drugs?</td>
</tr>
<tr>
<td align="left">Q29</td>
<td align="left">Is the child diagnosed with ASD?</td>
</tr>
</tbody>
</table>
</table-wrap><fig id="fig-6"><label>Figure 6</label><caption><title>Data visualization on QBASD dataset (a) data visualization of questions Q4 and Q6 (b) data visualization of questions Q8 and Q6 (c) data visualization of questions Q13 and Q14 (d) data visualization of questions Q15 and Q16 (e) data visualization of questions Q21 and Q22 (f) data visualization of questions Q25 and Q26 (g) data visualization of Q5 and Q29 (h) data visualization of Q18 and Q29 (i) data visualization of question Q22 and Q29 (j) data visualization of questions Q23 to Q29</title></caption><graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_40874-fig-6a.tif"/>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_40874-fig-6b.tif"/></fig>
<p><xref ref-type="fig" rid="fig-6">Fig. 6a</xref> represents the visualization for questions Q4 and Q6 in the QBASD dataset. If the child shows restricted interest in playing or interacting with another child, irrespective of the smile factor, the child is classified as belonging to the ASD class. In contrast, if the child shows interest in playing with other children, the child belongs to the non-ASD class. <xref ref-type="fig" rid="fig-6">Fig. 6b</xref> is the visualization of questions Q8 and Q9.</p>

<p>A child with ASD often has difficulty pointing to objects with the index finger. <xref ref-type="fig" rid="fig-6">Fig. 6b</xref> shows that a child who cannot point a finger at something of interest, irrespective of being able to maintain eye contact, is diagnosed with ASD. A child capable of pointing a finger to indicate interest in objects and of maintaining eye contact for a second or two falls under the non-ASD class. <xref ref-type="fig" rid="fig-6">Fig. 6c</xref> is a visual analysis of Q13 and Q14. Children who make unusual noises and have stereotyped movements are classified as ASD. However, not all ASD children make unusual noises, and some children diagnosed with ASD are free of stereotyped repetitive movements.</p>

<p><xref ref-type="fig" rid="fig-6">Fig. 6d</xref> is the visualization of Q15 and Q16. The graph shows that hyperactivity and self-injurious behavior classify a child as ASD, whereas a child with no symptoms of hyperactivity or self-injurious behavior belongs to the non-ASD class. Children with hyperactivity could have attention deficit hyperactivity disorder (ADHD), not necessarily ASD; in the visualization, some non-ASD children show symptoms of hyperactivity. <xref ref-type="fig" rid="fig-6">Fig. 6e</xref> is the visualization of Q21 and Q22. Most children with inconsistent attention are classified as having ASD, and a child&#x2019;s delay in responding to a call is an essential biomarker for ASD detection. The visualization of questions Q25 and Q26 is shown in <xref ref-type="fig" rid="fig-6">Fig. 6f</xref>.</p>

<p>The visualization does not clearly indicate whether nutrient deficiency during pregnancy can cause ASD. Very few samples involve mothers on medication; hence, it is difficult to conclude whether the mother&#x2019;s medications cause ASD in the child. <xref ref-type="fig" rid="fig-6">Fig. 6g</xref> shows responding to social cues as the most significant biomarker for ASD detection: classification of a child as ASD or non-ASD depends on the ability to respond to the social cue. <xref ref-type="fig" rid="fig-6">Fig. 6h</xref> shows the strong correlation between the biomarker Q18 and ASD detection; the swarm plot shows that children unable to follow a gaze are diagnosed with ASD.</p>

<p><xref ref-type="fig" rid="fig-6">Fig. 6i</xref> shows that inconsistent attention is found more often in ASD children than in non-ASD children. <xref ref-type="fig" rid="fig-6">Fig. 6j</xref> is the swarm plot of Q23 against Q29. Children with ASD have unusual memory, but some children without ASD also have good memory; hence, memory alone cannot be a robust independent biomarker for ASD detection. The common signs of autism include not responding to social cues, not following the parent&#x2019;s gaze, not pointing to objects, not following simple instructions, repetitive movements, unusual memory, inconsistent attention, and not maintaining eye contact [<xref ref-type="bibr" rid="ref-35">35</xref>]. The visualization shows the strong correlation between the signs of autism specified by experts and those automatically detected by the proposed ASD detection system.</p>

</sec>
<sec id="s5"><label>5</label><title>Conclusions</title>
<p>In this research, we experimentally studied the performance of the proposed automated detection tool for ASD. The proposed CADx selects the best features from the QBASD dataset using the ICA feature selection algorithm. The model built with LASSO as the feature selection method and LR as the classifier gives 95&#x0025; accuracy for ASD detection, while evaluation with standard metrics shows that the ICA-based ASD detection algorithm provides 100&#x0025; accuracy with LR as the classifier. The proposed CADx can detect ASD in children under five years, as the features included in the dataset are signs of ASD for children under five. Logistic regression as a classifier gives high accuracy as it can handle outliers, and LR is suitable for linearly separable datasets. The model shows improved accuracy compared to the state-of-the-art methodologies. The exploratory data analysis phase shows the relations between the vital symptoms of ASD identified in the study and collected as a dataset, and the visualization of the dataset reveals that the features selected by the ICA algorithm are significant features for ASD detection at an early age. This research is novel as the dataset was self-collected from special schools for autism. A future direction of research is to study neuroimages to detect autism.</p>
</sec>
</body>
<back>
<ack>
<p>We sincerely thank Kaumaram Prashanthi Academy, a special education school in Tamil Nadu, India for their valuable support in data collection. We thank Parvathi Ravichandran, Coordinator, WVS Special School, Koundampalayam, Coimbatore, Tamil Nadu, India, for their suggestions in preparing the questionnaire and valuable support in data collection. We thank Dr. Sofia Devagnanam, founder and director of Liztoz Preschool and Litz Child development center, Coimbatore, Tamil Nadu, India for her valuable inputs in preparing the questionnaire for the data collection of QBASD dataset.</p>
</ack>
<sec><title>Funding Statement</title>
<p>The authors extend their appreciation to the Deputyship for Research &#x0026; Innovation, Ministry of Education in Saudi Arabia for funding this research work through the Project Number (IF2-PSAU-2022/01/22043).</p></sec>
<sec><title>Author Contributions</title>
<p>Study conception and design: Shabana R. Ziyad; Data collection: Shabana R. Ziyad; Analysis and interpretation of results: Shabana R. Ziyad, I. A. Saeed, Liyakathunisa; Manuscript preparation: Shabana R. Ziyad, Liyakathunisa, I. A. Saeed; Review &#x0026; editing: Shabana R. Ziyad, Eman Aljohani.</p></sec>
<sec sec-type="data-availability"><title>Availability of Data and Materials</title>
<p>The data collected is a confidential dataset that was self-collected from special education schools.</p></sec>
<sec><title>Ethics Approval</title>
<p>The study involves the responses from parents of ASD and non-ASD children studying in schools. The dataset used in this study related to Project Number IF2-PSAU-2022/01/22043, has received IRB approval from the Ethical Review and Approval Committee, Prince Sattam bin Abdulaziz, Al Kharj. The reference ID for approval is SCBR-085-2022 dated 16/11/2022.</p></sec>
<sec sec-type="COI-statement"><title>Conflicts of Interest</title>
<p>The authors declare that they have no conflicts of interest to report regarding the present study.</p></sec>
<ref-list content-type="authoryear">
<title>References</title>
<ref id="ref-1"><label>[1]</label><mixed-citation publication-type="web">&#x201C;<article-title>Down Syndrome and Autism Spectrum Disorder (DS-ASD</article-title>),&#x201D; Autism Speaks, <year>2022</year>. [Online]. Available: <ext-link ext-link-type="uri" xlink:href="https://www.autismspeaks.org/down-syndrome-and-autism-spectrum-disorder-ds-asd">https://www.autismspeaks.org/down-syndrome-and-autism-spectrum-disorder-ds-asd</ext-link> <comment>
(accessed on 02/12/2022)</comment>.</mixed-citation></ref>
<ref id="ref-2"><label>[2]</label><mixed-citation publication-type="web">&#x201C;<article-title>Autism Statistics and Facts</article-title>,&#x201D; Autism Speaks, <year>2022</year>. [Online]. Available: <ext-link ext-link-type="uri" xlink:href="https://www.autismspeaks.org/autism-statistics-asd">https://www.autismspeaks.org/autism-statistics-asd</ext-link> <comment>(accessed on 02/12/2022)</comment>.</mixed-citation></ref>
<ref id="ref-3"><label>[3]</label><mixed-citation publication-type="web">&#x201C;<article-title>The Kids First</article-title>,&#x201D; Kids first, <year>2022</year>. [Online]. Available: <ext-link ext-link-type="uri" xlink:href="https://kids-first.com.au/5-ways-speech-therapy-can-help-children-with-autism/">https://kids-first.com.au/5-ways-speech-therapy-can-help-children-with-autism/</ext-link> <comment>(accessed on 02/12/2022)</comment>.</mixed-citation></ref>
<ref id="ref-4"><label>[4]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><given-names>J.</given-names> <surname>Chuthapisith</surname></string-name> and <string-name><given-names>N.</given-names> <surname>Ruangdaragano</surname></string-name></person-group>, &#x201C;<chapter-title>Early detection of autism spectrum disorders</chapter-title>,&#x201D; in <source>Autism Spectrum Disorders: The Role of Genetics in Diagnosis and Treatment</source>, <edition>1</edition><sup>st</sup> ed., <publisher-loc>USA</publisher-loc>: <publisher-name>IntechOpen</publisher-name>, <year>2011</year>. [Online]. Available: <ext-link ext-link-type="uri" xlink:href="https://www.intechopen.com/chapters/17274">https://www.intechopen.com/chapters/17274</ext-link> <comment>(accessed on 02/12/2022)</comment>.</mixed-citation></ref>
<ref id="ref-5"><label>[5]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>M.</given-names> <surname>Baygin</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Dogan</surname></string-name>, <string-name><given-names>T.</given-names> <surname>Tuncer</surname></string-name>, <string-name><given-names>P. D.</given-names> <surname>Barua</surname></string-name>, <string-name><given-names>O.</given-names> <surname>Faust</surname></string-name> <etal>et al.,</etal></person-group> &#x201C;<article-title>Automated ASD detection using hybrid deep lightweight features extracted from EEG signals</article-title>,&#x201D; <source>Computers in Biology and Medicine</source>, vol. <volume>134</volume>, pp. <fpage>104548</fpage>, <year>2021</year>; <pub-id pub-id-type="pmid">34119923</pub-id></mixed-citation></ref>
<ref id="ref-6"><label>[6]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>O.</given-names> <surname>Dekhil</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Ali</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>Nakieb</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Shalaby</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Soliman</surname></string-name> <etal>et al.,</etal></person-group> &#x201C;<article-title>A personalized autism diagnosis CAD system using a fusion of structural MRI and resting-state functional MRI data</article-title>,&#x201D; <source>Frontiers in Psychiatry</source>, vol. <volume>10</volume>, pp. <fpage>392</fpage>, <year>2021</year>.</mixed-citation></ref>
<ref id="ref-7"><label>[7]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>M.</given-names> <surname>Eni</surname></string-name>, <string-name><given-names>I.</given-names> <surname>Dinstein</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Ilan</surname></string-name>, <string-name><given-names>I.</given-names> <surname>Menashe</surname></string-name>, <string-name><given-names>G.</given-names> <surname>Meiri</surname></string-name> <etal>et al.,</etal></person-group> &#x201C;<article-title>Estimating autism severity in young children from speech signals using a deep neural network</article-title>,&#x201D; <source>IEEE Access</source>, vol. <volume>8</volume>, pp. <fpage>139489</fpage>&#x2013;<lpage>139500</lpage>, <year>2020</year>.</mixed-citation></ref>
<ref id="ref-8"><label>[8]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>H.</given-names> <surname>Drimalla</surname></string-name>, <string-name><given-names>T.</given-names> <surname>Scheffer</surname></string-name>, <string-name><given-names>N.</given-names> <surname>Landwehr</surname></string-name>, <string-name><given-names>I.</given-names> <surname>Baskow</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Roepke</surname></string-name> <etal>et al.,</etal></person-group> &#x201C;<article-title>Towards the automatic detection of social biomarkers in autism spectrum disorder: Introducing the simulated interaction task (SIT)</article-title>,&#x201D; <source>Digital Medicine</source>, vol. <volume>3</volume>, pp. <fpage>25</fpage>, <year>2020</year>; <pub-id pub-id-type="pmid">32140568</pub-id></mixed-citation></ref>
<ref id="ref-9"><label>[9]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>M.</given-names> <surname>Alca&#x00F1;iz</surname></string-name>, <string-name><given-names>I. A.</given-names> <surname>Chicchi-Giglioli</surname></string-name>, <string-name><given-names>L. A.</given-names> <surname>Carrasco-Ribelles</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Mar&#x00ED;n-Morales</surname></string-name>, <string-name><given-names>M. E.</given-names> <surname>Minissi</surname></string-name> <etal>et al.,</etal></person-group> &#x201C;<article-title>Eye gaze as a biomarker in the recognition of autism spectrum disorder using virtual reality and machine learning: A proof of concept for diagnosis</article-title>,&#x201D; <source>Autism Research</source>, vol. <volume>15</volume>, pp. <fpage>131</fpage>&#x2013;<lpage>145</lpage>, <year>2022</year>.</mixed-citation></ref>
<ref id="ref-10"><label>[10]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>Z.</given-names> <surname>Zhao</surname></string-name>, <string-name><given-names>H.</given-names> <surname>Tang</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Zhang</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Qu</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Hu</surname></string-name> <etal>et al.,</etal></person-group> &#x201C;<article-title>Classification of children with autism and typical development using eye-tracking data from face-to-face conversations: Machine learning model development and performance evaluation</article-title>,&#x201D; <source>Journal of Medical Internet Research</source>, vol. <volume>23</volume>, no. <issue>8</issue>, pp. <fpage>e29328</fpage>, <year>2021</year>; <pub-id pub-id-type="pmid">34435957</pub-id></mixed-citation></ref>
<ref id="ref-11"><label>[11]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>T.</given-names> <surname>Talkar</surname></string-name>, <string-name><given-names>J. R.</given-names> <surname>Williamson</surname></string-name>, <string-name><given-names>D.</given-names> <surname>Hannon</surname></string-name>, <string-name><given-names>H. M.</given-names> <surname>Rao</surname></string-name> and <string-name><given-names>S.</given-names> <surname>Yuditskaya</surname></string-name></person-group>, &#x201C;<article-title>Assessment of speech and fine motor coordination in children with autism spectrum disorder</article-title>,&#x201D; <source>IEEE Access</source>, vol. <volume>8</volume>, pp. <fpage>127535</fpage>&#x2013;<lpage>127545</lpage>, <year>2020</year>; <pub-id pub-id-type="pmid">33747676</pub-id></mixed-citation></ref>
<ref id="ref-12"><label>[12]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>A. Z.</given-names> <surname>Guo</surname></string-name></person-group>, &#x201C;<article-title>Automated autism detection based on characterizing observable patterns from photos</article-title>,&#x201D; <source>IEEE Transaction on Affective Computing</source>, vol. <volume>14</volume>, pp. <fpage>836</fpage>&#x2013;<lpage>841</lpage>, <year>2020</year>.</mixed-citation></ref>
<ref id="ref-13"><label>[13]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>R.</given-names> <surname>Grzadzinski</surname></string-name>, <string-name><given-names>D.</given-names> <surname>Amso</surname></string-name>, <string-name><given-names>R.</given-names> <surname>Landa</surname></string-name>, <string-name><given-names>L.</given-names> <surname>Watson</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Guralnick</surname></string-name> <etal>et al.,</etal></person-group> &#x201C;<article-title>Pre-symptomatic intervention for autism spectrum disorder (ASD): Defining a research agenda</article-title>,&#x201D; <source>Journal of Neurodevelopment Disorder</source>, vol. <volume>13</volume>, pp. <fpage>49</fpage>, <year>2021</year>; <pub-id pub-id-type="pmid">34654371</pub-id></mixed-citation></ref>
<ref id="ref-14"><label>[14]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>G. S.</given-names> <surname>Thejas</surname></string-name>, <string-name><given-names>R.</given-names> <surname>Garg</surname></string-name>, <string-name><given-names>S. S.</given-names> <surname>Iyengar</surname></string-name>, <string-name><given-names>N. R.</given-names> <surname>Sunitha</surname></string-name>, <string-name><given-names>P.</given-names> <surname>Badrinath</surname></string-name> <etal>et al.,</etal></person-group> &#x201C;<article-title>Metric and accuracy ranked feature inclusion: Hybrids of filter and wrapper feature selection approaches</article-title>,&#x201D; <source>IEEE Access</source>, vol. <volume>9</volume>, pp. <fpage>128687</fpage>&#x2013;<lpage>128701</lpage>, <year>2021</year>.</mixed-citation></ref>
<ref id="ref-15"><label>[15]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>X.</given-names> <surname>Wang</surname></string-name>, <string-name><given-names>B.</given-names> <surname>Guo</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>Shen</surname></string-name>, <string-name><given-names>C.</given-names> <surname>Zhou</surname></string-name> and <string-name><given-names>X.</given-names> <surname>Duan</surname></string-name></person-group>, &#x201C;<article-title>Input feature selection method based on feature set equivalence and mutual information gain maximization</article-title>,&#x201D; <source>IEEE Access</source>, vol. <volume>7</volume>, pp. <fpage>151525</fpage>&#x2013;<lpage>151538</lpage>, <year>2019</year>.</mixed-citation></ref>
<ref id="ref-16"><label>[16]</label><mixed-citation publication-type="web"><person-group person-group-type="author"><string-name><given-names>L.</given-names> <surname>Freijeiro-Gonz&#x00E1;lez</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Febrero-Bande</surname></string-name> and <string-name><given-names>W.</given-names> <surname>Gonz&#x00E1;lez-Manteiga</surname></string-name></person-group>, &#x201C;<article-title>A critical review of LASSO and its derivatives for variable selection under dependence among covariates</article-title>,&#x201D; <source>International Statistical Review</source>, <year>2021</year>. [Online]. Available: <ext-link ext-link-type="uri" xlink:href="https://onlinelibrary.wiley.com/doi/full/10.1111/insr.12469">https://onlinelibrary.wiley.com/doi/full/10.1111/insr.12469</ext-link> <comment>(accessed on 21/12/2022)</comment>.</mixed-citation></ref>
<ref id="ref-17"><label>[17]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>R.</given-names> <surname>Tibshirani</surname></string-name></person-group>, &#x201C;<article-title>Regression shrinkage and selection via the lasso: A retrospective</article-title>,&#x201D; <source>Journal of Statistical Methodology</source>, vol. <volume>73</volume>, pp. <fpage>273</fpage>&#x2013;<lpage>282</lpage>, <year>2011</year>.</mixed-citation></ref>
<ref id="ref-18"><label>[18]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>E.</given-names> <surname>Courtois</surname></string-name>, <string-name><given-names>P.</given-names> <surname>Tubert-Bitter</surname></string-name> and <string-name><given-names>I.</given-names> <surname>Ahmed</surname></string-name></person-group>, &#x201C;<article-title>New adaptive lasso approaches for variable selection in automated pharmacovigilance signal detection</article-title>,&#x201D; <source>BMC Medical Research Methodology</source>, vol. <volume>21</volume>, pp. <fpage>271</fpage>, <year>2021</year>; <pub-id pub-id-type="pmid">34852782</pub-id></mixed-citation></ref>
<ref id="ref-19"><label>[19]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>E.</given-names> <surname>Atashpaz-Gargari</surname></string-name> and <string-name><given-names>C.</given-names> <surname>Lucas</surname></string-name></person-group>, &#x201C;<article-title>Imperialist competitive algorithm: An algorithm for optimization inspired by imperialistic competition</article-title>,&#x201D; in <conf-name>Proc. of IEEE Congress of Evolutionary Computation</conf-name>, <conf-loc>Singapore</conf-loc>, pp. <fpage>4661</fpage>&#x2013;<lpage>4667</lpage>, <year>2007</year>.</mixed-citation></ref>
<ref id="ref-20"><label>[20]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>S.</given-names> <surname>Hosseini</surname></string-name> and <string-name><given-names>A.</given-names> <surname>Al Khaled</surname></string-name></person-group>, &#x201C;<article-title>A survey on the imperialist competitive algorithm metaheuristic: Implementation in engineering domain and directions for future research</article-title>,&#x201D; <source>Applied Soft Computing</source>, vol. <volume>24</volume>, pp. <fpage>1078</fpage>&#x2013;<lpage>1094</lpage>, <year>2014</year>.</mixed-citation></ref>
<ref id="ref-21"><label>[21]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>K.</given-names> <surname>Lian</surname></string-name>, <string-name><given-names>C.</given-names> <surname>Zhang</surname></string-name>, <string-name><given-names>L.</given-names> <surname>Gao</surname></string-name> and <string-name><given-names>X.</given-names> <surname>Shao</surname></string-name></person-group>, &#x201C;<article-title>A modified colonial competitive algorithm for the mixed-model U-line balancing and sequencing problem</article-title>,&#x201D; <source>International Journal of Production Research</source>, vol. <volume>50</volume>, pp. <fpage>1</fpage>&#x2013;<lpage>15</lpage>, <year>2012</year>.</mixed-citation></ref>
<ref id="ref-22"><label>[22]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>B.</given-names> <surname>Deepak</surname></string-name>, <string-name><given-names>B.</given-names> <surname>Gunji</surname></string-name>, <string-name><given-names>M. V. A. R.</given-names> <surname>Bahubalendruni</surname></string-name> and <string-name><given-names>B.</given-names> <surname>Biswal</surname></string-name></person-group>, &#x201C;<article-title>Assembly sequence planning using soft computing methods: A review</article-title>,&#x201D; in <source>Proc. of Institution of Mechanical Engineers, Part E: Journal of Process Mechanical Engineering</source>, vol. <volume>233</volume>, no. <issue>3</issue>, pp. <fpage>653</fpage>&#x2013;<lpage>683</lpage>, <year>2019</year>.</mixed-citation></ref>
<ref id="ref-23"><label>[23]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>F.</given-names> <surname>Sarayloo</surname></string-name> and <string-name><given-names>R.</given-names> <surname>Tavakkoli-Moghaddam</surname></string-name></person-group>, &#x201C;<article-title>Imperialistic competitive algorithm for solving a dynamic cell formation problem with production planning</article-title>,&#x201D; in <conf-name>Proc. of Int. Conf. of Intelligent Computing</conf-name>, <conf-loc>Heidelberg, Berlin</conf-loc>, vol. <volume>6215</volume>, pp. <fpage>266</fpage>&#x2013;<lpage>276</lpage>, <year>2010</year>.</mixed-citation></ref>
<ref id="ref-24"><label>[24]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>K.</given-names> <surname>Karimi</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Ghodratnama</surname></string-name> and <string-name><given-names>R.</given-names> <surname>Tavakkoli-Moghaddam</surname></string-name></person-group>, &#x201C;<article-title>Two new feature selection methods based on learn-heuristic techniques for breast cancer prediction: A comprehensive analysis</article-title>,&#x201D; <source>Annals of Operation Research</source>, <year>2022</year>. [Online]. Available: <pub-id pub-id-type="doi">10.1007/s10479-022-04933-8</pub-id></mixed-citation></ref>
<ref id="ref-25"><label>[25]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>A.</given-names> <surname>Kaveh</surname></string-name> and <string-name><given-names>S.</given-names> <surname>Talatahari</surname></string-name></person-group>, &#x201C;<article-title>Imperialist competitive algorithm for engineering design problems</article-title>,&#x201D; <source>Asian Journal of Civil Engineering</source>, vol. <volume>11</volume>, pp. <fpage>675</fpage>&#x2013;<lpage>697</lpage>, <year>2010</year>.</mixed-citation></ref>
<ref id="ref-26"><label>[26]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>K.</given-names> <surname>Hussain</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Salleh</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Cheng</surname></string-name> and <string-name><given-names>R.</given-names> <surname>Naseem</surname></string-name></person-group>, &#x201C;<article-title>Common benchmark functions for metaheuristic evaluation: A review</article-title>,&#x201D; <source>International Journal of Informatics Visualization</source>, vol. <volume>1</volume>, pp. <fpage>218</fpage>&#x2013;<lpage>223</lpage>, <year>2017</year>.</mixed-citation></ref>
<ref id="ref-27"><label>[27]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>S.</given-names> <surname>Dreiseitl</surname></string-name> and <string-name><given-names>L.</given-names> <surname>Ohno-Machado</surname></string-name></person-group>, &#x201C;<article-title>Logistic regression and artificial neural network classification models: A methodology review</article-title>,&#x201D; <source>Journal of Biomedical Informatics</source>, vol. <volume>35</volume>, pp. <fpage>352</fpage>&#x2013;<lpage>359</lpage>, <year>2002</year>; <pub-id pub-id-type="pmid">12968784</pub-id></mixed-citation></ref>
<ref id="ref-28"><label>[28]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>M.</given-names> <surname>Maalouf</surname></string-name></person-group>, &#x201C;<article-title>Logistic regression in data analysis: An overview</article-title>,&#x201D; <source>International Journal of Data Analysis Techniques and Strategies</source>, vol. <volume>3</volume>, pp. <fpage>281</fpage>&#x2013;<lpage>299</lpage>, <year>2011</year>.</mixed-citation></ref>
<ref id="ref-29"><label>[29]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>Y.</given-names> <surname>Song</surname></string-name> and <string-name><given-names>Y.</given-names> <surname>Lu</surname></string-name></person-group>, &#x201C;<article-title>Decision tree methods: Applications for classification and prediction</article-title>,&#x201D; <source>Shanghai Archives of Psychiatry</source>, vol. <volume>27</volume>, pp. <fpage>130</fpage>&#x2013;<lpage>135</lpage>, <year>2015</year>; <pub-id pub-id-type="pmid">26120265</pub-id></mixed-citation></ref>
<ref id="ref-30"><label>[30]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><given-names>J.</given-names> <surname>F&#x00FC;rnkranz</surname></string-name></person-group>, &#x201C;<chapter-title>Decision tree</chapter-title>,&#x201D; in <source>Encyclopedia of Machine Learning and Data Mining</source>, <publisher-loc>Boston, USA</publisher-loc>: <publisher-name>MA, Springer</publisher-name>, pp. <fpage>330</fpage>&#x2013;<lpage>335</lpage>, <year>2017</year>.</mixed-citation></ref>
<ref id="ref-31"><label>[31]</label><mixed-citation publication-type="thesis"><person-group person-group-type="author"><string-name><given-names>S. R.</given-names> <surname>Ziyad</surname></string-name></person-group>, &#x201C;<article-title>Early lung cancer detection a new automated approach with improved diagnostic performance</article-title>,&#x201D; <comment>Ph.D. dissertation</comment>, <publisher-name>Avinashilingam University for Women</publisher-name>, <publisher-loc>India</publisher-loc>, <year>2020</year>.</mixed-citation></ref>
<ref id="ref-32"><label>[32]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>S.</given-names> <surname>Latha</surname></string-name>, <string-name><given-names>P.</given-names> <surname>Muthu</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Dhanalakshmi</surname></string-name>, <string-name><given-names>R.</given-names> <surname>Kumar</surname></string-name>, <string-name><given-names>K. W.</given-names> <surname>Lai</surname></string-name> <etal>et al.,</etal></person-group> &#x201C;<article-title>Emerging feature extraction techniques for machine learning-based classification of carotid artery ultrasound images</article-title>,&#x201D; <source>Computational Intelligence and Neuroscience</source>, vol. <volume>2022</volume>, pp. <fpage>1847981</fpage>, <year>2022</year>; <pub-id pub-id-type="pmid">35602622</pub-id></mixed-citation></ref>
<ref id="ref-33"><label>[33]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>H.</given-names> <surname>Liu</surname></string-name> and <string-name><given-names>S.</given-names> <surname>Zhang</surname></string-name></person-group>, &#x201C;<article-title>Noisy data elimination using mutual k-nearest neighbor for classification mining</article-title>,&#x201D; <source>Journals of Systems and Software</source>, vol. <volume>85</volume>, pp. <fpage>1067</fpage>&#x2013;<lpage>1074</lpage>, <year>2012</year>.</mixed-citation></ref>
<ref id="ref-34"><label>[34]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>J.</given-names> <surname>Shen</surname></string-name> and <string-name><given-names>S.</given-names> <surname>Gao</surname></string-name></person-group>, &#x201C;<article-title>A solution to separation and multicollinearity in multiple logistic regression</article-title>,&#x201D; <source>Journal of Data Science</source>, vol. <volume>6</volume>, pp. <fpage>515</fpage>&#x2013;<lpage>531</lpage>, <year>2008</year>; <pub-id pub-id-type="pmid">20376286</pub-id></mixed-citation></ref>
<ref id="ref-35"><label>[35]</label><mixed-citation publication-type="web">&#x201C;<article-title>Signs of Autism</article-title>,&#x201D; Autism Association of Western Australia, <year>2022</year>. [Online]. Available: <ext-link ext-link-type="uri" xlink:href="https://www.autism.org.au/what-is-autism/signs-of-autism/">https://www.autism.org.au/what-is-autism/signs-of-autism/</ext-link> <comment>(accessed on 11/01/2023)</comment>.</mixed-citation></ref>
</ref-list>
</back></article>