<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.1 20151215//EN" "http://jats.nlm.nih.gov/publishing/1.1/JATS-journalpublishing1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" article-type="research-article" dtd-version="1.1">
<front>
<journal-meta>
<journal-id journal-id-type="pmc">CSSE</journal-id>
<journal-id journal-id-type="nlm-ta">CSSE</journal-id>
<journal-id journal-id-type="publisher-id">CSSE</journal-id>
<journal-title-group>
<journal-title>Computer Systems Science &#x0026; Engineering</journal-title>
</journal-title-group>
<issn pub-type="ppub">0267-6192</issn>
<publisher>
<publisher-name>Tech Science Press</publisher-name>
<publisher-loc>USA</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">17536</article-id>
<article-id pub-id-type="doi">10.32604/csse.2021.017536</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Hybrid Sooty Tern Optimization and Differential Evolution for Feature Selection</article-title><alt-title alt-title-type="left-running-head">Hybrid Sooty Tern Optimization and Differential Evolution for Feature Selection</alt-title><alt-title alt-title-type="right-running-head">Hybrid Sooty Tern Optimization and Differential Evolution for Feature Selection</alt-title>
</title-group>
<contrib-group content-type="authors">
<contrib id="author-1" contrib-type="author" corresp="yes">
<name name-style="western"><surname>Jia</surname><given-names>Heming</given-names></name>
<xref ref-type="aff" rid="aff-1">1</xref>
<xref ref-type="aff" rid="aff-2">2</xref><email>jiaheminglucky99@126.com</email>
</contrib>
<contrib id="author-2" contrib-type="author">
<name name-style="western"><surname>Li</surname><given-names>Yao</given-names></name>
<xref ref-type="aff" rid="aff-2">2</xref>
</contrib>
<contrib id="author-3" contrib-type="author">
<name name-style="western"><surname>Sun</surname><given-names>Kangjian</given-names></name>
<xref ref-type="aff" rid="aff-2">2</xref>
</contrib>
<contrib id="author-4" contrib-type="author">
<name name-style="western"><surname>Cao</surname><given-names>Ning</given-names></name>
<xref ref-type="aff" rid="aff-1">1</xref>
</contrib>
<contrib id="author-5" contrib-type="author">
<name name-style="western"><surname>Zhou</surname><given-names>Helen Min</given-names></name>
<xref ref-type="aff" rid="aff-3">3</xref>
</contrib>
<aff id="aff-1"><label>1</label><institution>College of Information Engineering, Sanming University</institution>, <addr-line>Sanming, 365004</addr-line>, <country>China</country></aff>
<aff id="aff-2"><label>2</label><institution>College of Mechanical and Electrical Engineering, Northeast Forestry University</institution>, <addr-line>Harbin, 150040</addr-line>, <country>China</country></aff>
<aff id="aff-3"><label>3</label><institution>School of Engineering, Manukau Institute of Technology</institution>, <addr-line>Auckland, 2241</addr-line>, <country>New Zealand</country></aff>
</contrib-group><author-notes><corresp id="cor1">&#x002A;Corresponding Author: Heming Jia. Email: <email>jiaheminglucky99@126.com</email></corresp></author-notes>
<pub-date pub-type="epub" date-type="pub" iso-8601-date="2021-07-29"><day>29</day>
<month>07</month>
<year>2021</year></pub-date>
<volume>39</volume>
<issue>3</issue>
<fpage>321</fpage>
<lpage>335</lpage>
<history>
<date date-type="received"><day>02</day><month>2</month><year>2021</year></date>
<date date-type="accepted"><day>20</day><month>3</month><year>2021</year></date>
</history>
<permissions>
<copyright-statement>&#x00A9; 2021 Jia et al.</copyright-statement>
<copyright-year>2021</copyright-year>
<copyright-holder>Jia et al.</copyright-holder>
<license xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>This work is licensed under a <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</ext-link>, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<self-uri content-type="pdf" xlink:href="TSP_CSSE_17536.pdf"></self-uri>
<abstract>
<p>In this paper, a hybrid model based on the sooty tern optimization algorithm (STOA) is proposed to optimize the parameters of the support vector machine (SVM) and identify the best feature subsets simultaneously. Feature selection is an essential step of data preprocessing that aims to find the most relevant subset of features, and in recent years it has been applied in many practical domains of intelligent systems. The application of SVM in many fields has proved its effectiveness in classification tasks of various types, and its performance is mainly determined by the kernel type and its parameters. Feature selection remains one of the most challenging processes in machine learning, since it must select effective and representative features. The main disadvantages of feature selection based on classical optimization algorithms are stagnation in local optima and slow convergence. Therefore, the hybrid model proposed in this paper merges STOA with differential evolution (DE) to improve search efficiency and convergence rate. A series of experiments is conducted on 12 datasets from the UCI repository to evaluate the performance of the proposed method comprehensively and objectively. The superiority of the proposed method is illustrated from different aspects, such as classification accuracy, convergence performance, reduced feature dimensionality, standard deviation (STD), and computation time.</p>
</abstract>
<kwd-group kwd-group-type="author">
<kwd>sooty tern optimization algorithm</kwd>
<kwd>hybrid optimization</kwd>
<kwd>feature selection</kwd>
<kwd>support vector machine</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec id="s1">
<label>1</label>
<title>Introduction</title>
<p>Many data-driven solutions from the fields of data mining and machine learning have been proposed to tackle vast and complex data. In the field of data science, classification is an important yet challenging task, since the data to be processed are becoming increasingly complex [<xref ref-type="bibr" rid="ref-1">1</xref>]. In 2019, Kaur et al. proposed a decision tree-based method to predict the failure, lead time, and health degree of hard disk drives [<xref ref-type="bibr" rid="ref-2">2</xref>]. In 2020, Zhu et al. applied an improved naive Bayes algorithm to software defect prediction in within-/cross-project scenarios; experimental results showed that the proposed method had better predictive ability [<xref ref-type="bibr" rid="ref-3">3</xref>]. In the same year, Zhang et al. applied feature-weighted gradient descent k-nearest neighbor to selecting promising projects accurately [<xref ref-type="bibr" rid="ref-4">4</xref>]. Furthermore, Zhen used a support vector machine to detect the number of wide-band signals, leading to improved performance [<xref ref-type="bibr" rid="ref-5">5</xref>]. However, owing to the variety and complexity of the data, classification tasks in various fields still face challenges. With the aim of efficient and accurate data processing, researchers have proposed various methods that incorporate feature selection to facilitate classification tasks.</p>
<p>To better combine feature selection with classification methods, researchers have introduced optimization algorithms to tune the kernel parameters of SVM. Chapelle et al. proposed a gradient descent method for parameter selection [<xref ref-type="bibr" rid="ref-6">6</xref>]. Yu et al. introduced a classification method based on two-side cross-domain collaborative filtering, which builds a better classification model in the target domain by efficiently inferring intrinsic users and features [<xref ref-type="bibr" rid="ref-7">7</xref>]. Meanwhile, in recent years, scholars have also begun to combine feature selection with optimization algorithms to improve classification accuracy and efficiency. Zhang et al. were the first to propose a feature selection method using multi-objective particle swarm optimization, which is highly competitive compared with traditional single-objective feature selection [<xref ref-type="bibr" rid="ref-8">8</xref>]. Jia et al. combined feature selection with spotted hyena optimization in 2019, which improves accuracy and reduces redundant data [<xref ref-type="bibr" rid="ref-9">9</xref>]. In the same year, Baliarsingh et al. applied emperor penguin optimization (EPO) to the classification of medical data, which considerably eased complicated and challenging data problems [<xref ref-type="bibr" rid="ref-10">10</xref>]. These studies inspire us to apply practical optimization algorithms to feature selection to further improve classification accuracy.</p>
<p>On the other hand, the sooty tern optimization algorithm (STOA) mimics the behavior of sooty terns in nature. Since it was presented by Gaurav Dhiman et al. in 2019 [<xref ref-type="bibr" rid="ref-11">11</xref>], the method has been widely used in many fields, such as financial stress prediction, feature selection, and signal processing. Nevertheless, the STOA algorithm still needs further improvement to handle practical problems better; in particular, its local search ability deserves attention. Another excellent optimization method is differential evolution (DE) [<xref ref-type="bibr" rid="ref-12">12</xref>]. When introduced into other algorithms, DE can improve search efficiency and maintain population diversity. Xiong et al. proposed a hybrid method named DE/WOA to extract the proper parameters of photovoltaic models [<xref ref-type="bibr" rid="ref-13">13</xref>]. Moreover, Jia et al. presented a model that combines GOA and DE for multilevel satellite image segmentation, which improved the speed and accuracy of image segmentation [<xref ref-type="bibr" rid="ref-14">14</xref>]. Therefore, the DE algorithm is well suited to remedying the insufficient local search and local optimum entrapment of the traditional STOA algorithm.</p>
<p>The main contributions of this paper are as follows. Firstly, based on the concept of the average fitness value, this paper proposes the STOA-DE algorithm, which provides stronger convergence ability and faster convergence speed than the traditional STOA algorithm. Secondly, the STOA-DE algorithm is applied to SVM with a feature selection process to optimize the SVM parameters and the binary feature mask simultaneously. Finally, the proposed model is verified on classic UCI datasets, and the empirical results confirm that the proposed method can effectively identify useful features, thus contributing to better classification accuracy. In other words, the STOA-DE algorithm has an advantage in data classification tasks and a wide range of engineering applications.</p>
</sec>
<sec id="s2">
<label>2</label>
<title>Basic Algorithm</title>
<p>A hybrid STOA algorithm with DE is described in detail in this section. Firstly, the STOA and DE algorithms are introduced. Then, the hybrid model is explained in depth. Finally, the proposed algorithm, called STOA-DE, is applied in the relevant domain.</p>
<sec id="s2_1">
<label>2.1</label>
<title>Sooty Tern Optimization Algorithm</title>
<p>The STOA algorithm is inspired by the behavior of sooty terns in nature. It was first proposed by Gaurav Dhiman for industrial engineering problems [<xref ref-type="bibr" rid="ref-11">11</xref>]. Sooty terns are omnivorous and eat earthworms, insects, fish, and so on. Compared with other bionic optimization algorithms, the highlight of STOA is its balance between exploration and exploitation.</p>
<sec id="s2_1_1">
<label>2.1.1</label>
<title>Migration Behavior (Exploration)</title>
<p>Migration behavior, the exploration phase, is defined by the following three aspects:</p>
<p>&#x2022; Collision avoidance:</p>
<p><inline-formula id="ieqn-1"><mml:math id="mml-ieqn-1"><mml:mrow><mml:msub><mml:mi>S</mml:mi><mml:mi>A</mml:mi></mml:msub></mml:mrow></mml:math>
</inline-formula>is used to calculate the new position to avoid collision between adjacent search agents.</p>
<p><disp-formula id="eqn-1"><label>(1)</label><mml:math id="mml-eqn-1" display="block"><mml:mover><mml:mrow><mml:mrow><mml:msub><mml:mi>C</mml:mi><mml:mrow><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mrow><mml:mo stretchy="false">&#x2192;</mml:mo></mml:mover><mml:mo>&#x003D;</mml:mo><mml:mrow><mml:msub><mml:mi>S</mml:mi><mml:mi>A</mml:mi></mml:msub></mml:mrow><mml:mo>&#x00D7;</mml:mo><mml:mover><mml:mrow><mml:mrow><mml:msub><mml:mi>P</mml:mi><mml:mrow><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mrow><mml:mo stretchy="false">&#x2192;</mml:mo></mml:mover><mml:mo>&#x2061;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>Z</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math>
</disp-formula></p>
<p>Where <inline-formula id="ieqn-2"><mml:math id="mml-ieqn-2"><mml:mover><mml:mrow><mml:mrow><mml:msub><mml:mi>C</mml:mi><mml:mrow><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mrow><mml:mo stretchy="false">&#x2192;</mml:mo></mml:mover></mml:math>
</inline-formula> is the position of the search agent without colliding with other search agents, <inline-formula id="ieqn-3"><mml:math id="mml-ieqn-3"><mml:mover><mml:mrow><mml:mrow><mml:msub><mml:mi>P</mml:mi><mml:mrow><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mrow><mml:mo stretchy="false">&#x2192;</mml:mo></mml:mover></mml:math>
</inline-formula> indicates the current location of the search agent, <inline-formula id="ieqn-4"><mml:math id="mml-ieqn-4"><mml:mi>Z</mml:mi></mml:math>
</inline-formula> represents the current iteration and <inline-formula id="ieqn-5"><mml:math id="mml-ieqn-5"><mml:mrow><mml:msub><mml:mi>S</mml:mi><mml:mi>A</mml:mi></mml:msub></mml:mrow></mml:math>
</inline-formula> shows the search agent movement in a given search space.</p>
<p><disp-formula id="eqn-2"><label>(2)</label><mml:math id="mml-eqn-2" display="block"><mml:mrow><mml:msub><mml:mi>S</mml:mi><mml:mi>A</mml:mi></mml:msub></mml:mrow><mml:mo>&#x003D;</mml:mo><mml:mrow><mml:msub><mml:mi>C</mml:mi><mml:mi>f</mml:mi></mml:msub></mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>Z</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mrow><mml:msub><mml:mi>C</mml:mi><mml:mi>f</mml:mi></mml:msub></mml:mrow><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:mi>M</mml:mi><mml:mi>a</mml:mi><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>t</mml:mi><mml:mi>e</mml:mi><mml:mi>r</mml:mi><mml:mi>a</mml:mi><mml:mi>t</mml:mi><mml:mi>i</mml:mi><mml:mi>o</mml:mi><mml:mi>n</mml:mi><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:math>
</disp-formula></p>
<p>Where,</p>
<p><disp-formula id="eqn-3"><label>(3)</label><mml:math id="mml-eqn-3" display="block"><mml:mi>Z</mml:mi><mml:mo>&#x003D;</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mn>2</mml:mn><mml:mo>,</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>,</mml:mo><mml:mi>M</mml:mi><mml:mi>a</mml:mi><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>t</mml:mi><mml:mi>e</mml:mi><mml:mi>r</mml:mi><mml:mi>a</mml:mi><mml:mi>t</mml:mi><mml:mi>i</mml:mi><mml:mi>o</mml:mi><mml:mi>n</mml:mi><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math>
</disp-formula></p>
<p>Where <inline-formula id="ieqn-6"><mml:math id="mml-ieqn-6"><mml:mrow><mml:msub><mml:mi>C</mml:mi><mml:mi>f</mml:mi></mml:msub></mml:mrow></mml:math>
</inline-formula> is the control variable to adjust <inline-formula id="ieqn-7"><mml:math id="mml-ieqn-7"><mml:mrow><mml:msub><mml:mi>S</mml:mi><mml:mi>A</mml:mi></mml:msub></mml:mrow></mml:math>
</inline-formula>, and it is linearly reduced from <inline-formula id="ieqn-8"><mml:math id="mml-ieqn-8"><mml:mrow><mml:msub><mml:mi>C</mml:mi><mml:mi>f</mml:mi></mml:msub></mml:mrow></mml:math>
</inline-formula> to 0. Meanwhile, the value of <inline-formula id="ieqn-9"><mml:math id="mml-ieqn-9"><mml:mrow><mml:msub><mml:mi>C</mml:mi><mml:mi>f</mml:mi></mml:msub></mml:mrow></mml:math>
</inline-formula> is set to 2 in this paper.</p>
<p>&#x2022; Converge towards the direction of the best neighbor:</p>
<p>The search agents move towards the best neighbor after collision avoidance.</p>
<p><disp-formula id="eqn-4"><label>(4)</label><mml:math id="mml-eqn-4" display="block"><mml:mover><mml:mrow><mml:mrow><mml:msub><mml:mi>M</mml:mi><mml:mrow><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mrow><mml:mo stretchy="false">&#x2192;</mml:mo></mml:mover><mml:mo>&#x003D;</mml:mo><mml:mrow><mml:msub><mml:mi>C</mml:mi><mml:mi>B</mml:mi></mml:msub></mml:mrow><mml:mo>&#x00D7;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mover><mml:mrow><mml:mrow><mml:msub><mml:mi>P</mml:mi><mml:mrow><mml:mi>b</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mrow><mml:mo stretchy="false">&#x2192;</mml:mo></mml:mover><mml:mo>&#x2061;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>Z</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mover><mml:mrow><mml:mrow><mml:msub><mml:mi>P</mml:mi><mml:mrow><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mrow><mml:mo stretchy="false">&#x2192;</mml:mo></mml:mover><mml:mo>&#x2061;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>Z</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:math>
</disp-formula></p>
<p>where <inline-formula id="ieqn-10"><mml:math id="mml-ieqn-10"><mml:mover><mml:mrow><mml:mrow><mml:msub><mml:mi>M</mml:mi><mml:mrow><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mrow><mml:mo stretchy="false">&#x2192;</mml:mo></mml:mover></mml:math>
</inline-formula> represents that the search agents in different positions <inline-formula id="ieqn-11"><mml:math id="mml-ieqn-11"><mml:mover><mml:mrow><mml:mrow><mml:msub><mml:mi>P</mml:mi><mml:mrow><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mrow><mml:mo stretchy="false">&#x2192;</mml:mo></mml:mover></mml:math>
</inline-formula> move towards the fittest search agent <inline-formula id="ieqn-12"><mml:math id="mml-ieqn-12"><mml:mover><mml:mrow><mml:mrow><mml:msub><mml:mi>P</mml:mi><mml:mrow><mml:mi>b</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mrow><mml:mo stretchy="false">&#x2192;</mml:mo></mml:mover></mml:math>
</inline-formula>. <inline-formula id="ieqn-13"><mml:math id="mml-ieqn-13"><mml:mrow><mml:msub><mml:mi>C</mml:mi><mml:mi>B</mml:mi></mml:msub></mml:mrow></mml:math>
</inline-formula> is responsible for better exploration, which is defined as follows:</p>
<p><disp-formula id="eqn-5"><label>(5)</label><mml:math id="mml-eqn-5" display="block"><mml:mrow><mml:msub><mml:mi>C</mml:mi><mml:mi>B</mml:mi></mml:msub></mml:mrow><mml:mo>&#x003D;</mml:mo><mml:mn>0.5</mml:mn><mml:mo>&#x00D7;</mml:mo><mml:mrow><mml:msub><mml:mi>R</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>n</mml:mi><mml:mi>d</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math>
</disp-formula></p>
<p>Where <inline-formula id="ieqn-14"><mml:math id="mml-ieqn-14"><mml:mrow><mml:msub><mml:mi>R</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>n</mml:mi><mml:mi>d</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math>
</inline-formula> is a random number between <inline-formula id="ieqn-15"><mml:math id="mml-ieqn-15"><mml:mo stretchy="false">[</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">]</mml:mo></mml:math>
</inline-formula>.</p>
<p>&#x2022; Update towards the best search agent:</p>
<p>Finally, the search agents update their position toward the direction of the best sooty terns.</p>
<p><disp-formula id="eqn-6"><label>(6)</label><mml:math id="mml-eqn-6" display="block"><mml:mover><mml:mrow><mml:mrow><mml:msub><mml:mi>D</mml:mi><mml:mrow><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mrow><mml:mo stretchy="false">&#x2192;</mml:mo></mml:mover><mml:mo>&#x003D;</mml:mo><mml:mover><mml:mrow><mml:mrow><mml:msub><mml:mi>C</mml:mi><mml:mrow><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mrow><mml:mo stretchy="false">&#x2192;</mml:mo></mml:mover><mml:mo>&#x002B;</mml:mo><mml:mover><mml:mrow><mml:mrow><mml:msub><mml:mi>M</mml:mi><mml:mrow><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mrow><mml:mo stretchy="false">&#x2192;</mml:mo></mml:mover></mml:math>
</disp-formula></p>
<p>Where <inline-formula id="ieqn-16"><mml:math id="mml-ieqn-16"><mml:mover><mml:mrow><mml:mrow><mml:msub><mml:mi>D</mml:mi><mml:mrow><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mrow><mml:mo stretchy="false">&#x2192;</mml:mo></mml:mover></mml:math>
</inline-formula> represents the distance between the search agent and the fittest search agent.</p>
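The migration steps of Eqs. (1)–(6) can be sketched as one vectorized update. This is a minimal illustration, not the authors' implementation: the function name, the NumPy form, and the seeded random generator are assumptions; variable names follow the paper.

```python
import numpy as np

def stoa_migration(P_st, P_bst, z, max_iter, Cf=2.0, rng=None):
    """One migration (exploration) step of STOA, Eqs. (1)-(6)."""
    rng = np.random.default_rng(0) if rng is None else rng
    # Eq. (2): S_A decreases linearly from Cf to 0 over the iterations.
    S_A = Cf - z * (Cf / max_iter)
    # Eq. (1): collision-avoided position.
    C_st = S_A * P_st
    # Eq. (5): C_B is a random exploration weight in [0, 0.5).
    C_B = 0.5 * rng.random()
    # Eq. (4): move towards the fittest search agent.
    M_st = C_B * (P_bst - P_st)
    # Eq. (6): distance between the agent and the fittest agent.
    D_st = C_st + M_st
    return D_st
```

Note how Eq. (2) drives the transition from exploration to exploitation: at the final iteration S_A reaches 0, so the collision-avoidance term vanishes and only the pull towards the best agent remains.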
</sec>
<sec id="s2_1_2">
<label>2.1.2</label>
<title>Attacking Behavior (Exploitation)</title>
<p>During migration, sooty terns can adjust their speed and angle of attack. They increase their altitude by flapping their wings. When attacking prey, their spiral behavior is defined as follows [<xref ref-type="bibr" rid="ref-15">15</xref>]:</p>
<p><disp-formula id="eqn-7"><label>(7)</label><mml:math id="mml-eqn-7" display="block"><mml:msup><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="normal">&#x2032;</mml:mi></mml:mrow></mml:msup><mml:mo>&#x003D;</mml:mo><mml:mrow><mml:msub><mml:mi>R</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>d</mml:mi><mml:mi>i</mml:mi><mml:mi>u</mml:mi><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>&#x00D7;</mml:mo><mml:mi>sin</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math>
</disp-formula></p>
<p><disp-formula id="eqn-8"><label>(8)</label><mml:math id="mml-eqn-8" display="block"><mml:msup><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="normal">&#x2032;</mml:mi></mml:mrow></mml:msup><mml:mo>&#x003D;</mml:mo><mml:mrow><mml:msub><mml:mi>R</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>d</mml:mi><mml:mi>i</mml:mi><mml:mi>u</mml:mi><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>&#x00D7;</mml:mo><mml:mi>cos</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math>
</disp-formula></p>
<p><disp-formula id="eqn-9"><label>(9)</label><mml:math id="mml-eqn-9" display="block"><mml:msup><mml:mrow><mml:mi>z</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="normal">&#x2032;</mml:mi></mml:mrow></mml:msup><mml:mo>&#x003D;</mml:mo><mml:mrow><mml:msub><mml:mi>R</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>d</mml:mi><mml:mi>i</mml:mi><mml:mi>u</mml:mi><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>&#x00D7;</mml:mo><mml:mi>i</mml:mi></mml:math>
</disp-formula></p>
<p><disp-formula id="eqn-10"><label>(10)</label><mml:math id="mml-eqn-10" display="block"><mml:mrow><mml:msub><mml:mi>R</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>d</mml:mi><mml:mi>i</mml:mi><mml:mi>u</mml:mi><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>&#x003D;</mml:mo><mml:mi>u</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mrow><mml:msup><mml:mi>e</mml:mi><mml:mrow><mml:mi>k</mml:mi><mml:mi>v</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:math>
</disp-formula></p>
<p>Where <inline-formula id="ieqn-17"><mml:math id="mml-ieqn-17"><mml:mrow><mml:msub><mml:mi>R</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>d</mml:mi><mml:mi>i</mml:mi><mml:mi>u</mml:mi><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math>
</inline-formula> indicates the radius of each spiral and <inline-formula id="ieqn-18"><mml:math id="mml-ieqn-18"><mml:mi>i</mml:mi></mml:math>
</inline-formula> denotes a variable in <inline-formula id="ieqn-19"><mml:math id="mml-ieqn-19"><mml:mo stretchy="false">[</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>2</mml:mn><mml:mi>&#x03C0;</mml:mi><mml:mo stretchy="false">]</mml:mo></mml:math>
</inline-formula>. <inline-formula id="ieqn-20"><mml:math id="mml-ieqn-20"><mml:mi>u</mml:mi></mml:math>
</inline-formula> and <inline-formula id="ieqn-21"><mml:math id="mml-ieqn-21"><mml:mi>v</mml:mi></mml:math>
</inline-formula> are constants that define the shape of spiral, while <inline-formula id="ieqn-22"><mml:math id="mml-ieqn-22"><mml:mi>e</mml:mi></mml:math>
</inline-formula> is the base of the natural logarithm. Furthermore, it is assumed that the value of the constants <inline-formula id="ieqn-23"><mml:math id="mml-ieqn-23"><mml:mi>u</mml:mi></mml:math>
</inline-formula> and <inline-formula id="ieqn-24"><mml:math id="mml-ieqn-24"><mml:mi>v</mml:mi></mml:math>
</inline-formula> is 1 in this paper. Therefore, the position of the search agent will update as follows:</p>
<p><disp-formula id="eqn-11"><label>(11)</label><mml:math id="mml-eqn-11" display="block"><mml:mover><mml:mrow><mml:mrow><mml:msub><mml:mi>P</mml:mi><mml:mrow><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mrow><mml:mo stretchy="false">&#x2192;</mml:mo></mml:mover><mml:mo>&#x2061;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>Z</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x003D;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mover><mml:mrow><mml:mrow><mml:msub><mml:mi>D</mml:mi><mml:mrow><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mrow><mml:mo stretchy="false">&#x2192;</mml:mo></mml:mover><mml:mo>&#x00D7;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msup><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="normal">&#x2032;</mml:mi></mml:mrow></mml:msup><mml:mo>&#x002B;</mml:mo><mml:msup><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="normal">&#x2032;</mml:mi></mml:mrow></mml:msup><mml:mo>&#x002B;</mml:mo><mml:msup><mml:mrow><mml:mi>z</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="normal">&#x2032;</mml:mi></mml:mrow></mml:msup></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x00D7;</mml:mo><mml:mover><mml:mrow><mml:mrow><mml:msub><mml:mi>P</mml:mi><mml:mrow><mml:mi>b</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mrow><mml:mo stretchy="false">&#x2192;</mml:mo></mml:mover><mml:mo>&#x2061;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>Z</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math>
</disp-formula></p>
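The spiral attack of Eqs. (7)–(11) can be sketched as follows, under the paper's setting u = v = 1. Drawing the spiral angle i uniformly from [0, 2π] and using that same angle as the exponent variable k in Eq. (10) are assumptions of this sketch, as is the function name.

```python
import numpy as np

def stoa_attack(D_st, P_bst, u=1.0, v=1.0, rng=None):
    """Spiral attacking (exploitation) update of STOA, Eqs. (7)-(11)."""
    rng = np.random.default_rng(0) if rng is None else rng
    i = rng.uniform(0.0, 2.0 * np.pi)  # spiral angle in [0, 2*pi]
    k = i                              # assumption: exponent variable equals the angle
    radius = u * np.exp(k * v)         # Eq. (10): radius of each spiral turn
    x_p = radius * np.sin(i)           # Eq. (7)
    y_p = radius * np.cos(i)           # Eq. (8)
    z_p = radius * i                   # Eq. (9)
    # Eq. (11): new position, scaled by the distance to and the position of
    # the fittest search agent.
    return D_st * (x_p + y_p + z_p) * P_bst
```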
</sec>
</sec>
<sec id="s2_2">
<label>2.2</label>
<title>Differential Evolution</title>
<p>As a simple but effective method, DE has attracted wide attention over the past few decades since it was proposed by Storn and Price in 1997 [<xref ref-type="bibr" rid="ref-12">12</xref>]. There are three steps in DE: mutation, crossover and selection. Scaling factor <inline-formula id="ieqn-25"><mml:math id="mml-ieqn-25"><mml:mi>S</mml:mi><mml:mi>F</mml:mi></mml:math>
</inline-formula> and crossover probability <inline-formula id="ieqn-26"><mml:math id="mml-ieqn-26"><mml:mi>C</mml:mi><mml:mi>R</mml:mi></mml:math>
</inline-formula> are two significant parameters that can influence the exploration and exploitation in optimization.</p>
<sec id="s2_2_1">
<label>2.2.1</label>
<title>Mutation</title>
<p>The mutation operation is mathematically defined as follows:</p>
<p><disp-formula id="eqn-12"><label>(12)</label><mml:math id="mml-eqn-12" display="block"><mml:msubsup><mml:mi>m</mml:mi><mml:mi>i</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>&#x002B;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup><mml:mo>&#x003D;</mml:mo><mml:msubsup><mml:mi>x</mml:mi><mml:mrow><mml:mi>r</mml:mi><mml:mn>1</mml:mn></mml:mrow><mml:mi>t</mml:mi></mml:msubsup><mml:mo>&#x002B;</mml:mo><mml:mi>S</mml:mi><mml:mi>F</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msubsup><mml:mi>x</mml:mi><mml:mrow><mml:mi>r</mml:mi><mml:mn>2</mml:mn></mml:mrow><mml:mi>t</mml:mi></mml:msubsup><mml:mo>&#x2212;</mml:mo><mml:msubsup><mml:mi>x</mml:mi><mml:mrow><mml:mi>r</mml:mi><mml:mn>3</mml:mn></mml:mrow><mml:mi>t</mml:mi></mml:msubsup></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:math>
</disp-formula></p>
<p>where <inline-formula id="ieqn-27"><mml:math id="mml-ieqn-27"><mml:msubsup><mml:mi>m</mml:mi><mml:mi>i</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>&#x002B;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup></mml:math>
</inline-formula> is the mutant individual in the <inline-formula id="ieqn-28"><mml:math id="mml-ieqn-28"><mml:mo stretchy="false">(</mml:mo><mml:mi>t</mml:mi><mml:mo>&#x002B;</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">)</mml:mo><mml:mi>t</mml:mi><mml:mi>h</mml:mi></mml:math>
</inline-formula> iteration. <inline-formula id="ieqn-29"><mml:math id="mml-ieqn-29"><mml:msubsup><mml:mi>x</mml:mi><mml:mrow><mml:mi>r</mml:mi><mml:mn>1</mml:mn></mml:mrow><mml:mi>t</mml:mi></mml:msubsup></mml:math>
</inline-formula>, <inline-formula id="ieqn-30"><mml:math id="mml-ieqn-30"><mml:msubsup><mml:mi>x</mml:mi><mml:mrow><mml:mi>r</mml:mi><mml:mn>2</mml:mn></mml:mrow><mml:mi>t</mml:mi></mml:msubsup></mml:math>
</inline-formula> and <inline-formula id="ieqn-31"><mml:math id="mml-ieqn-31"><mml:msubsup><mml:mi>x</mml:mi><mml:mrow><mml:mi>r</mml:mi><mml:mn>3</mml:mn></mml:mrow><mml:mi>t</mml:mi></mml:msubsup></mml:math>
</inline-formula> represent three different individuals in the population. In detail, <inline-formula id="ieqn-32"><mml:math id="mml-ieqn-32"><mml:mrow><mml:msub><mml:mi>r</mml:mi><mml:mn>1</mml:mn></mml:msub></mml:mrow></mml:math>
</inline-formula>, <inline-formula id="ieqn-33"><mml:math id="mml-ieqn-33"><mml:mrow><mml:msub><mml:mi>r</mml:mi><mml:mn>2</mml:mn></mml:msub></mml:mrow></mml:math>
</inline-formula> and <inline-formula id="ieqn-34"><mml:math id="mml-ieqn-34"><mml:mrow><mml:msub><mml:mi>r</mml:mi><mml:mn>3</mml:mn></mml:msub></mml:mrow></mml:math>
</inline-formula> are also different. Furthermore, <inline-formula id="ieqn-35"><mml:math id="mml-ieqn-35"><mml:mi>S</mml:mi><mml:mi>F</mml:mi></mml:math>
</inline-formula> is a constant here.</p>
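The mutation of Eq. (12) is the classic DE/rand/1 scheme and can be sketched as below. The function name and the value SF = 0.5 are illustrative choices; the three indices are drawn mutually distinct, and (a common convention, assumed here) also distinct from the current index i.

```python
import numpy as np

def de_mutation(pop, i, SF=0.5, rng=None):
    """DE/rand/1 mutation, Eq. (12): m_i = x_r1 + SF * (x_r2 - x_r3)."""
    rng = np.random.default_rng(0) if rng is None else rng
    # choose r1, r2, r3 mutually distinct and different from i
    candidates = [j for j in range(len(pop)) if j != i]
    r1, r2, r3 = rng.choice(candidates, size=3, replace=False)
    return pop[r1] + SF * (pop[r2] - pop[r3])
```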
</sec>
<sec id="s2_2_2">
<label>2.2.2</label>
<title>Crossover</title>
<p>After the mutation operation, the trial individual <inline-formula id="ieqn-36"><mml:math id="mml-ieqn-36"><mml:msubsup><mml:mi>c</mml:mi><mml:mi>i</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>&#x002B;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup></mml:math>
</inline-formula> is selected from the current individual <inline-formula id="ieqn-37"><mml:math id="mml-ieqn-37"><mml:msubsup><mml:mi>x</mml:mi><mml:mi>i</mml:mi><mml:mi>t</mml:mi></mml:msubsup></mml:math>
</inline-formula> or mutant individual <inline-formula id="ieqn-38"><mml:math id="mml-ieqn-38"><mml:msubsup><mml:mi>m</mml:mi><mml:mi>i</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>&#x002B;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup></mml:math>
</inline-formula> to improve the population diversity. The crossover operation is calculated as:</p>
<p><disp-formula id="eqn-13"><label>(13)</label><mml:math id="mml-eqn-13" display="block"><mml:msubsup><mml:mi>c</mml:mi><mml:mi>i</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>&#x002B;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup><mml:mo>&#x003D;</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mtable columnspacing="1em" rowspacing="4pt"><mml:mtr><mml:mtd columnalign="left"><mml:msubsup><mml:mi>m</mml:mi><mml:mi>i</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>&#x002B;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup><mml:mspace width="thickmathspace"></mml:mspace><mml:mspace width="thickmathspace"></mml:mspace><mml:mspace width="thickmathspace"></mml:mspace><mml:mspace width="thickmathspace"></mml:mspace><mml:mspace width="thickmathspace"></mml:mspace><mml:mi>i</mml:mi><mml:mi>f</mml:mi><mml:mspace width="thickmathspace"></mml:mspace><mml:mi>r</mml:mi><mml:mi>a</mml:mi><mml:mi>n</mml:mi><mml:mi>d</mml:mi><mml:mo>&#x2264;</mml:mo><mml:mi>C</mml:mi><mml:mi>R</mml:mi></mml:mtd></mml:mtr><mml:mtr><mml:mtd columnalign="left"><mml:msubsup><mml:mi>x</mml:mi><mml:mi>i</mml:mi><mml:mi>t</mml:mi></mml:msubsup><mml:mspace width="thickmathspace"></mml:mspace><mml:mspace width="thickmathspace"></mml:mspace><mml:mspace width="thickmathspace"></mml:mspace><mml:mspace width="thickmathspace"></mml:mspace><mml:mspace width="thickmathspace"></mml:mspace><mml:mspace width="thickmathspace"></mml:mspace><mml:mspace width="thickmathspace"></mml:mspace><mml:mspace width="thickmathspace"></mml:mspace><mml:mi>i</mml:mi><mml:mi>f</mml:mi><mml:mspace width="thickmathspace"></mml:mspace><mml:mi>r</mml:mi><mml:mi>a</mml:mi><mml:mi>n</mml:mi><mml:mi>d</mml:mi><mml:mo>&#x003E;</mml:mo><mml:mi>C</mml:mi><mml:mi>R</mml:mi></mml:mtd></mml:mtr></mml:mtable><mml:mo stretchy="true" symmetric="true" fence="true"></mml:mo></mml:mrow></mml:math>
</disp-formula></p>
<p>Where <inline-formula id="ieqn-39"><mml:math id="mml-ieqn-39"><mml:mi>r</mml:mi><mml:mi>a</mml:mi><mml:mi>n</mml:mi><mml:mi>d</mml:mi></mml:math>
</inline-formula> is a random number from 0 to 1. Moreover, <inline-formula id="ieqn-40"><mml:math id="mml-ieqn-40"><mml:mi>C</mml:mi><mml:mi>R</mml:mi></mml:math>
</inline-formula> is a constant which represents the crossover probability.</p>
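As a sketch, the crossover of Eq. (13) can be implemented component-wise (the common binomial form of DE crossover); the function name and arguments below are illustrative, not from the paper:

```python
import numpy as np

def de_crossover(x, m, CR=0.9, rng=None):
    """Crossover of Eq. (13), applied component-wise: each component of
    the trial individual comes from the mutant m if rand <= CR,
    otherwise from the current individual x."""
    rng = np.random.default_rng() if rng is None else rng
    mask = rng.random(len(x)) <= CR
    return np.where(mask, np.asarray(m), np.asarray(x))
```

With CR close to 1 most components are inherited from the mutant, which increases population diversity.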
</sec>
<sec id="s2_2_3">
<label>2.2.3</label>
<title>Selection</title>
<p>In the selection operation, the trial individual <inline-formula id="ieqn-41"><mml:math id="mml-ieqn-41"><mml:msubsup><mml:mi>c</mml:mi><mml:mi>i</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>&#x002B;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup></mml:math>
</inline-formula> is compared to the current individual <inline-formula id="ieqn-42"><mml:math id="mml-ieqn-42"><mml:msubsup><mml:mi>x</mml:mi><mml:mi>i</mml:mi><mml:mi>t</mml:mi></mml:msubsup></mml:math>
</inline-formula> to obtain the <inline-formula id="ieqn-43"><mml:math id="mml-ieqn-43"><mml:mo stretchy="false">(</mml:mo><mml:mi>t</mml:mi><mml:mo>&#x002B;</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">)</mml:mo><mml:mi>t</mml:mi><mml:mi>h</mml:mi></mml:math>
</inline-formula> generation individuals, and the selection operation is mathematically expressed as follows:</p>
<p><disp-formula id="eqn-14"><label>(14)</label><mml:math id="mml-eqn-14" display="block"><mml:msubsup><mml:mi>x</mml:mi><mml:mi>i</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>&#x002B;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup><mml:mo>&#x003D;</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mtable columnspacing="1em" rowspacing="4pt"><mml:mtr><mml:mtd columnalign="left"><mml:msubsup><mml:mi>c</mml:mi><mml:mi>i</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>&#x002B;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup><mml:mspace width="thickmathspace"></mml:mspace><mml:mspace width="thickmathspace"></mml:mspace><mml:mspace width="thickmathspace"></mml:mspace><mml:mspace width="thickmathspace"></mml:mspace><mml:mspace width="thickmathspace"></mml:mspace><mml:mi>i</mml:mi><mml:mi>f</mml:mi><mml:mspace width="thickmathspace"></mml:mspace><mml:mi>f</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msubsup><mml:mi>c</mml:mi><mml:mi>i</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>&#x002B;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x003C;</mml:mo><mml:mi>f</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msubsup><mml:mi>x</mml:mi><mml:mi>i</mml:mi><mml:mi>t</mml:mi></mml:msubsup></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd columnalign="left"><mml:msubsup><mml:mi>x</mml:mi><mml:mi>i</mml:mi><mml:mi>t</mml:mi></mml:msubsup><mml:mspace width="thickmathspace"></mml:mspace><mml:mspace width="thickmathspace"></mml:mspace><mml:mspace width="thickmathspace"></mml:mspace><mml:mspace width="thickmathspace"></mml:mspace><mml:mspace width="thickmathspace"></mml:mspace><mml:mspace width="thickmathspace"></mml:mspace><mml:mspace width="thickmathspace"></mml:mspace><mml:mspace width="thickmathspace"></mml:mspace><mml:mspace width="thickmathspace"></mml:mspace><mml:mspace width="thickmathspace"></mml:mspace><mml:mspace 
width="thickmathspace"></mml:mspace><mml:mi>o</mml:mi><mml:mi>t</mml:mi><mml:mi>h</mml:mi><mml:mi>e</mml:mi><mml:mi>r</mml:mi><mml:mi>w</mml:mi><mml:mi>i</mml:mi><mml:mi>s</mml:mi><mml:mi>e</mml:mi></mml:mtd></mml:mtr></mml:mtable><mml:mo stretchy="true" symmetric="true" fence="true"></mml:mo></mml:mrow></mml:math>
</disp-formula></p>
<p>Where <inline-formula id="ieqn-44"><mml:math id="mml-ieqn-44"><mml:mi>f</mml:mi></mml:math>
</inline-formula> represents the objective function of the optimization problem to be solved.</p>
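A minimal sketch of this greedy selection rule for a minimization objective (`de_select` is an illustrative name):

```python
def de_select(x, c, f):
    """Selection of Eq. (14) for minimization: the trial individual c
    replaces the current individual x only if it has a lower objective
    value f; otherwise x survives into generation t+1."""
    return c if f(c) < f(x) else x
```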
</sec>
</sec>
<sec id="s2_3">
<label>2.3</label>
<title>Hybrid Algorithm of STOA and DE (STOA-DE)</title>
<p>STOA is a meta-heuristic algorithm proposed in 2019 that has been applied in many industrial fields. However, it still has some disadvantages, such as an imbalance between exploration and exploitation, a slow convergence rate, and low population diversity. DE is a simple but powerful algorithm, and introducing it into STOA can enhance the local search ability. Because the balance between exploration and exploitation is essential for any meta-heuristic algorithm, this paper combines STOA and DE to improve the local search capability and search efficiency, and to maintain population diversity in the later iterations. The hybrid model first evaluates the average fitness value, which represents the overall quality of the current population. For a minimization problem, if an individual&#x2019;s fitness value is less than the average fitness value, the search area adjacent to that individual is promising; in other words, the hybrid model should strengthen the local search strategy there. Conversely, if an individual&#x2019;s fitness value is worse than the average fitness value, the local search strategy is not adopted [<xref ref-type="bibr" rid="ref-16">16</xref>].</p>
<p>The STOA-DE algorithm uses the global optimization of STOA, which markedly improves the search ability over a wide range. At the same time, it combines the advantages of the DE algorithm in local convergence, which reduces the possibility of being trapped in a local optimum and deepens the local search. After the combination, the balance between exploration and exploitation is improved: the accuracy and convergence speed of the algorithm are increased, the convergence ability is enhanced, and population diversity can be maintained in the later iterations.</p>
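A minimal sketch of the average-fitness switching rule described above, assuming placeholder operators `stoa_update` and `de_update` for the two algorithms' position updates (the actual update equations are given earlier in the paper):

```python
import numpy as np

def hybrid_step(pop, fitness, stoa_update, de_update):
    """One iteration of the hybrid scheme: individuals whose fitness is
    below the population average (minimization) lie in promising regions
    and are refined with the DE local search; the rest follow the
    global STOA update."""
    f = np.array([fitness(ind) for ind in pop])
    mean_f = f.mean()
    return [de_update(ind) if fi < mean_f else stoa_update(ind)
            for ind, fi in zip(pop, f)]
```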
</sec>
</sec>
<sec id="s3">
<label>3</label>
<title>The Proposed Model</title>
<sec id="s3_1">
<label>3.1</label>
<title>Support Vector Machine</title>
<p>Support vector machine (SVM) is a non-linear binary classifier developed by Vapnik [<xref ref-type="bibr" rid="ref-17">17</xref>]. It constructs a linear separating hyper-plane in a high-dimensional vector space. The model is defined as the largest-margin classifier in the feature space, and training is then transformed into the solution of a convex quadratic programming problem. Compared with other machine learning methods, SVM is widely used in supervised learning and classification because of its high computational efficiency and broad applicability. For linearly separable data sets, SVM constructs the optimal separating hyper-plane to classify the samples. Suppose the data set <inline-formula id="ieqn-45"><mml:math id="mml-ieqn-45"><mml:mi>T</mml:mi><mml:mo>&#x003D;</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mn>1</mml:mn></mml:msub></mml:mrow><mml:mo>,</mml:mo><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mn>1</mml:mn></mml:msub></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="true" symmetric="true" fence="true"></mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:mo>&#x22EF;</mml:mo><mml:mo>,</mml:mo><mml:mrow><mml:mo stretchy="true" symmetric="true"
fence="true"></mml:mo><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>n</mml:mi></mml:msub></mml:mrow><mml:mo>,</mml:mo><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mi>n</mml:mi></mml:msub></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>}</mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>&#x2208;</mml:mo><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mi>n</mml:mi></mml:msup></mml:mrow><mml:mo>,</mml:mo><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>&#x2208;</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mrow><mml:mo stretchy="true" symmetric="true" fence="true"></mml:mo><mml:mn>1</mml:mn><mml:mo>}</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="true" symmetric="true" fence="true"></mml:mo></mml:mrow></mml:math>
</inline-formula> is linearly separable. As shown in <xref ref-type="fig" rid="fig-1">Fig. 1</xref>, hollow circles and solid circles represent two types of data sets. <inline-formula id="ieqn-46"><mml:math id="mml-ieqn-46"><mml:mi>H</mml:mi></mml:math>
</inline-formula> is the optimal hyper-plane, <inline-formula id="ieqn-47"><mml:math id="mml-ieqn-47"><mml:mrow><mml:msub><mml:mi>H</mml:mi><mml:mn>1</mml:mn></mml:msub></mml:mrow></mml:math>
</inline-formula> and <inline-formula id="ieqn-48"><mml:math id="mml-ieqn-48"><mml:mrow><mml:msub><mml:mi>H</mml:mi><mml:mn>2</mml:mn></mml:msub></mml:mrow></mml:math>
</inline-formula> are the boundaries of the two classes of samples, the interval between <inline-formula id="ieqn-49"><mml:math id="mml-ieqn-49"><mml:mrow><mml:msub><mml:mi>H</mml:mi><mml:mn>1</mml:mn></mml:msub></mml:mrow></mml:math>
</inline-formula> and <inline-formula id="ieqn-50"><mml:math id="mml-ieqn-50"><mml:mrow><mml:msub><mml:mi>H</mml:mi><mml:mn>2</mml:mn></mml:msub></mml:mrow></mml:math>
</inline-formula> is called the classification interval, and the points falling on <inline-formula id="ieqn-51"><mml:math id="mml-ieqn-51"><mml:mrow><mml:msub><mml:mi>H</mml:mi><mml:mn>1</mml:mn></mml:msub></mml:mrow></mml:math>
</inline-formula> and <inline-formula id="ieqn-52"><mml:math id="mml-ieqn-52"><mml:mrow><mml:msub><mml:mi>H</mml:mi><mml:mn>2</mml:mn></mml:msub></mml:mrow></mml:math>
</inline-formula> are called support vectors.</p>
<fig id="fig-1">
<label>Figure 1</label>
<caption>
<title>SVM optimal hyper-plane diagram</title></caption>
<graphic mimetype="image" mime-subtype="png" xlink:href="CSSE_17536-fig-1.png"/>
</fig>
<p>Although a linear separating hyper-plane can achieve optimal classification, in most cases the data points belonging to different categories cannot be separated cleanly, and a linear model will lead to a large number of misclassifications. Therefore, it is necessary to map the original feature space into a higher-dimensional space to find a hyper-plane that can correctly separate the data points. The kernel functions take several forms, as follows [<xref ref-type="bibr" rid="ref-18">18</xref>]:</p>
<p><disp-formula id="eqn-15"><label>(15)</label><mml:math id="mml-eqn-15" display="block"><mml:mi>K</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>,</mml:mo><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x003D;</mml:mo><mml:msubsup><mml:mi>x</mml:mi><mml:mi>i</mml:mi><mml:mi>T</mml:mi></mml:msubsup><mml:mo>&#x22C5;</mml:mo><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:math>
</disp-formula></p>
<p><disp-formula id="eqn-16"><label>(16)</label><mml:math id="mml-eqn-16" display="block"><mml:mi>K</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>,</mml:mo><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x003D;</mml:mo><mml:mrow><mml:msup><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>a</mml:mi><mml:mo>&#x002B;</mml:mo><mml:mi>r</mml:mi><mml:mo>&#x22C5;</mml:mo><mml:msubsup><mml:mi>x</mml:mi><mml:mi>i</mml:mi><mml:mi>T</mml:mi></mml:msubsup><mml:mo>&#x22C5;</mml:mo><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mi>Q</mml:mi></mml:msup></mml:mrow><mml:mo>,</mml:mo><mml:mi>a</mml:mi><mml:mo>&#x2265;</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mi>r</mml:mi><mml:mo>&#x003E;</mml:mo><mml:mn>0</mml:mn></mml:math>
</disp-formula></p>
<p><disp-formula id="eqn-17"><label>(17)</label><mml:math id="mml-eqn-17" display="block"><mml:mi>K</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>,</mml:mo><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x003D;</mml:mo><mml:mi>exp</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mi>&#x03B3;</mml:mi><mml:mrow><mml:mo symmetric="true">&#x2016;</mml:mo><mml:mrow><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mrow><mml:msup><mml:mrow><mml:mrow><mml:mo stretchy="true" symmetric="true" fence="true"></mml:mo><mml:mrow><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:mrow><mml:mo symmetric="true">&#x2016;</mml:mo></mml:mrow></mml:mrow><mml:mn>2</mml:mn></mml:msup></mml:mrow></mml:mrow><mml:mo stretchy="true" symmetric="true" fence="true"></mml:mo></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:math>
</disp-formula></p>
<p>Where <inline-formula id="ieqn-53"><mml:math id="mml-ieqn-53"><mml:mi>K</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>,</mml:mo><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:math>
</inline-formula> is the kernel function whose value is the inner product of the two vectors <inline-formula id="ieqn-54"><mml:math id="mml-ieqn-54"><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math>
</inline-formula> and <inline-formula id="ieqn-55"><mml:math id="mml-ieqn-55"><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:math>
</inline-formula>. Because the linear kernel function mainly solves linearly separable problems and the polynomial kernel function has many parameters to adjust, we choose the RBF kernel function, which maps the data to a higher dimension.</p>
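The three kernels of Eqs. (15)&#x2013;(17) can be written directly in NumPy; the sketch below uses illustrative default parameter values, which are not from the paper:

```python
import numpy as np

def linear_kernel(xi, xj):
    """Eq. (15): inner product of the two vectors."""
    return float(np.dot(xi, xj))

def poly_kernel(xi, xj, a=1.0, r=1.0, Q=3):
    """Eq. (16): polynomial kernel with a >= 0, r > 0 and degree Q."""
    return float((a + r * np.dot(xi, xj)) ** Q)

def rbf_kernel(xi, xj, gamma=0.5):
    """Eq. (17): RBF kernel; gamma controls the kernel width."""
    diff = np.asarray(xi) - np.asarray(xj)
    return float(np.exp(-gamma * np.dot(diff, diff)))
```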
</sec>
<sec id="s3_2">
<label>3.2</label>
<title>Feature Selection and Binary Processing</title>
<p>Feature selection is a method that transforms high-dimensional data into low-dimensional data by finding the optimal feature subset in the initial feature space according to a specific criterion [<xref ref-type="bibr" rid="ref-19">19</xref>]. The evaluation criteria are mainly determined by the classification accuracy and the number of selected features. Furthermore, the feature space generally includes three kinds of elements: relevant, irrelevant, and redundant features. According to the framework proposed by Dash in 1997 [<xref ref-type="bibr" rid="ref-20">20</xref>], feature selection mainly consists of subset generation, subset evaluation, a stopping criterion, and result verification. When the stopping criterion is reached, the generation of new feature subsets stops and the optimal feature subset found so far is output; otherwise, new feature subsets are generated until the criterion is met. In this paper, a random search strategy is used and the number of iterations is chosen as the stopping criterion; in other words, the algorithm stops when the number of iterations set in the experiment is reached.</p>
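The four-stage framework above can be sketched as a generic loop; `generate` and `evaluate` are placeholders for the subset-generation strategy and the classifier-based evaluation, and the fixed iteration budget plays the role of the stopping criterion:

```python
def feature_selection_loop(generate, evaluate, max_iter=100):
    """Generic wrapper loop: generate a candidate feature subset,
    evaluate it, keep the best one seen so far, and stop once the
    iteration budget (the stopping criterion) is exhausted."""
    best_subset, best_score = None, float('-inf')
    for _ in range(max_iter):
        subset = generate()
        score = evaluate(subset)
        if score > best_score:
            best_subset, best_score = subset, score
    return best_subset, best_score
```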
<p>The essence of feature selection is binary optimization, so a binary encoding scheme must be defined when an optimization algorithm is used to deal with the feature selection problem. Because the solution of feature selection is limited to <inline-formula id="ieqn-56"><mml:math id="mml-ieqn-56"><mml:mrow><mml:mo>{</mml:mo><mml:mrow><mml:mn>0</mml:mn><mml:mo>,</mml:mo></mml:mrow><mml:mo stretchy="true" symmetric="true" fence="true"></mml:mo></mml:mrow><mml:mrow><mml:mo stretchy="true" symmetric="true" fence="true"></mml:mo><mml:mn>1</mml:mn><mml:mo>}</mml:mo></mml:mrow></mml:math>
</inline-formula>, &#x201C;0&#x201D; indicates that this feature is not selected, and &#x201C;1&#x201D; indicates that this feature is selected. However, the range of data values is uneven in the original data set, ranging from <inline-formula id="ieqn-57"><mml:math id="mml-ieqn-57"><mml:mn>0</mml:mn><mml:mo>&#x223C;</mml:mo><mml:mn>1</mml:mn></mml:math>
</inline-formula> to more than 10 million, which will seriously affect the classification result of SVM. Therefore, it is necessary to preprocess the data set. In order to normalize the data to the range of <inline-formula id="ieqn-58"><mml:math id="mml-ieqn-58"><mml:mo stretchy="false">[</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">]</mml:mo></mml:math>
</inline-formula>, the following formula is used for processing:</p>
<p><disp-formula id="eqn-18"><label>(18)</label><mml:math id="mml-eqn-18" display="block"><mml:mrow><mml:msub><mml:mi>X</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mi>m</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>&#x003D;</mml:mo><mml:mstyle scriptlevel="0" displaystyle="true"><mml:mrow><mml:mfrac><mml:mrow><mml:mi>X</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mrow><mml:msub><mml:mi>X</mml:mi><mml:mrow><mml:mo form="prefix" movablelimits="true">min</mml:mo></mml:mrow></mml:msub></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:msub><mml:mi>X</mml:mi><mml:mrow><mml:mo form="prefix" movablelimits="true">max</mml:mo></mml:mrow></mml:msub></mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mrow><mml:msub><mml:mi>X</mml:mi><mml:mrow><mml:mo form="prefix" movablelimits="true">min</mml:mo></mml:mrow></mml:msub></mml:mrow></mml:mrow></mml:mfrac></mml:mrow></mml:mstyle></mml:math>
</disp-formula></p>
<p>Where, <inline-formula id="ieqn-59"><mml:math id="mml-ieqn-59"><mml:mi>X</mml:mi></mml:math>
</inline-formula> represents the original data, <inline-formula id="ieqn-60"><mml:math id="mml-ieqn-60"><mml:mrow><mml:msub><mml:mi>X</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mi>m</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math>
</inline-formula> means the normalized data, <inline-formula id="ieqn-61"><mml:math id="mml-ieqn-61"><mml:mrow><mml:msub><mml:mi>X</mml:mi><mml:mrow><mml:mi>m</mml:mi><mml:mrow><mml:mi mathvariant="normal">i</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:mrow></mml:mrow></mml:msub></mml:mrow></mml:math>
</inline-formula> and <inline-formula id="ieqn-62"><mml:math id="mml-ieqn-62"><mml:mrow><mml:msub><mml:mi>X</mml:mi><mml:mrow><mml:mo form="prefix" movablelimits="true">max</mml:mo></mml:mrow></mml:msub></mml:mrow></mml:math>
</inline-formula> represent the minimum and maximum values of this feature value range respectively.</p>
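Eq. (18) is applied per feature; the sketch below implements it column-wise and, as an added assumption not discussed in the paper, guards against constant features (zero range):

```python
import numpy as np

def min_max_normalize(X):
    """Column-wise min-max normalization of Eq. (18), mapping each
    feature into [0, 1]; constant columns are mapped to 0."""
    X = np.asarray(X, dtype=float)
    Xmin, Xmax = X.min(axis=0), X.max(axis=0)
    span = np.where(Xmax > Xmin, Xmax - Xmin, 1.0)  # avoid division by zero
    return (X - Xmin) / span
```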
</sec>
<sec id="s3_3">
<label>3.3</label>
<title>STOA-DE for Optimizing SVM and Feature Selection</title>
<p>The two parameters <inline-formula id="ieqn-63"><mml:math id="mml-ieqn-63"><mml:mi>C</mml:mi></mml:math>
</inline-formula> and <inline-formula id="ieqn-64"><mml:math id="mml-ieqn-64"><mml:mi>&#x03B3;</mml:mi></mml:math>
</inline-formula> need to be determined when using the RBF kernel function to construct the classification model of the support vector machine. The cost <inline-formula id="ieqn-65"><mml:math id="mml-ieqn-65"><mml:mi>C</mml:mi></mml:math>
</inline-formula> represents the tolerance of error in the classification process. The larger <inline-formula id="ieqn-66"><mml:math id="mml-ieqn-66"><mml:mi>C</mml:mi></mml:math>
</inline-formula> is, the less classification error is tolerated, while the smaller the <inline-formula id="ieqn-67"><mml:math id="mml-ieqn-67"><mml:mi>C</mml:mi></mml:math>
</inline-formula> is, the more error is tolerated and the problem of under-fitting may occur. The kernel parameter <inline-formula id="ieqn-68"><mml:math id="mml-ieqn-68"><mml:mi>&#x03B3;</mml:mi></mml:math>
</inline-formula> controls the width of the kernel function, and an improper choice will likewise lead to incorrect classification. Therefore, the classification results of support vector machines are closely related to the selection of the <inline-formula id="ieqn-69"><mml:math id="mml-ieqn-69"><mml:mi>C</mml:mi></mml:math>
</inline-formula> and <inline-formula id="ieqn-70"><mml:math id="mml-ieqn-70"><mml:mi>&#x03B3;</mml:mi></mml:math>
</inline-formula> parameters.</p>
<p>In the traditional model, SVM first optimizes its two parameters on the full feature set and then selects the features, which can cause key features to be missed in the actual feature selection process; thus, the data classification is not ideal. Conversely, if feature selection is carried out first and the parameters are optimized afterwards, a second optimization is needed in each training process, which consumes too much time and is difficult to apply to practical problems. Therefore, this paper proposes a method that combines the parameter optimization and feature selection of SVM. The search dimensions are as follows: the cost <inline-formula id="ieqn-71"><mml:math id="mml-ieqn-71"><mml:mi>C</mml:mi></mml:math>
</inline-formula>, the kernel parameter <inline-formula id="ieqn-72"><mml:math id="mml-ieqn-72"><mml:mi>&#x03B3;</mml:mi></mml:math>
</inline-formula> and the binary feature strings.</p>
<fig id="fig-2">
<label>Figure 2</label>
<caption>
<title>Schematic of search dimensions for each individual</title></caption>
<graphic mimetype="image" mime-subtype="png" xlink:href="CSSE_17536-fig-2.png"/>
</fig>
<p>As shown in <xref ref-type="fig" rid="fig-2">Fig. 2</xref>, the two dimensions are used to search the cost <inline-formula id="ieqn-73"><mml:math id="mml-ieqn-73"><mml:mi>C</mml:mi></mml:math>
</inline-formula> and the kernel parameter <inline-formula id="ieqn-74"><mml:math id="mml-ieqn-74"><mml:mi>&#x03B3;</mml:mi></mml:math>
</inline-formula>. The remaining dimensions are used to search for each binary feature in the data set. <inline-formula id="ieqn-75"><mml:math id="mml-ieqn-75"><mml:mi>n</mml:mi></mml:math>
</inline-formula> is the number of features in the data set. The method proposed in this paper uses an optimization algorithm to optimize all dimensions simultaneously. For the two parameters of SVM, each particle searches for their optimal values according to the optimization algorithm. At the same time, for the <inline-formula id="ieqn-76"><mml:math id="mml-ieqn-76"><mml:mi>n</mml:mi></mml:math>
</inline-formula> features of the data set, it is necessary to normalize the data set so that the whole data are normalized between <inline-formula id="ieqn-77"><mml:math id="mml-ieqn-77"><mml:mo stretchy="false">[</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">]</mml:mo></mml:math>
</inline-formula>. That is to say, if the solution of <inline-formula id="ieqn-78"><mml:math id="mml-ieqn-78"><mml:mrow><mml:msub><mml:mi>l</mml:mi><mml:mn>1</mml:mn></mml:msub></mml:mrow></mml:math>
</inline-formula>, <inline-formula id="ieqn-79"><mml:math id="mml-ieqn-79"><mml:mrow><mml:msub><mml:mi>l</mml:mi><mml:mn>2</mml:mn></mml:msub></mml:mrow></mml:math>
</inline-formula>, <inline-formula id="ieqn-80"><mml:math id="mml-ieqn-80"><mml:mo>&#x22EF;</mml:mo></mml:math>
</inline-formula>, <inline-formula id="ieqn-81"><mml:math id="mml-ieqn-81"><mml:mrow><mml:msub><mml:mi>l</mml:mi><mml:mi>n</mml:mi></mml:msub></mml:mrow></mml:math>
</inline-formula> is greater than 0.5, the corresponding feature is selected and its value is set to 1 [<xref ref-type="bibr" rid="ref-18">18</xref>]. Finally, the two parameters and the selected features are input into SVM together, and the fitness value is calculated by cross-validation.</p>
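A sketch of how one search individual can be decoded, following Fig. 2; the parameter bounds below are illustrative assumptions, since the paper does not list its exact search ranges:

```python
import numpy as np

def decode_individual(position, C_bounds=(0.01, 100.0), g_bounds=(0.01, 10.0)):
    """The first two dimensions (values in [0, 1]) map linearly to the
    SVM cost C and kernel parameter gamma; each remaining dimension l_k
    selects its feature when l_k > 0.5."""
    p = np.asarray(position, dtype=float)
    C = C_bounds[0] + p[0] * (C_bounds[1] - C_bounds[0])
    gamma = g_bounds[0] + p[1] * (g_bounds[1] - g_bounds[0])
    feature_mask = p[2:] > 0.5  # True = feature selected (value 1)
    return C, gamma, feature_mask
```

The decoded C, gamma, and feature mask would then be passed to the SVM, whose cross-validated accuracy serves as the fitness value.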
<p>The model proposed in this paper optimizes the two parameters of SVM and carries out the feature selection process simultaneously. This ensures the accuracy of the selected features, avoids missing key features, and reduces redundant ones, thus improving the classification accuracy. Compared with the method of selecting features first, the approach in this paper reduces the running time of the algorithm to a certain extent. Therefore, simultaneous feature selection and parameter optimization is more desirable. The flow chart of the simultaneous optimization and feature selection based on the STOA-DE algorithm is as follows:</p>
<fig id="fig-3">
<label>Figure 3</label>
<caption>
<title>The flow chart of the proposed model</title></caption>
<graphic mimetype="image" mime-subtype="png" xlink:href="CSSE_17536-fig-3.png"/>
</fig>
</sec>
</sec>
<sec id="s4">
<label>4</label>
<title>Experiments and Results</title>
<sec id="s4_1">
<label>4.1</label>
<title>The Experimental Setup</title>
<p>Twelve classic UCI data sets (including four high-dimensional data sets) are used to demonstrate the effectiveness of the STOA-DE algorithm in terms of average classification accuracy, average number of selected features, average fitness, standard deviation, and average running time [<xref ref-type="bibr" rid="ref-21">21</xref>]. Meanwhile, in order to ensure the objectivity and comprehensiveness of the experiments, other algorithms that have been applied in the feature selection field are selected for comparison. Detailed information about each data set is shown in <xref ref-type="table" rid="table-1">Tab. 1</xref>.</p>
<table-wrap id="table-1"><label>Table 1</label>
<caption>
<title>The data sets used in the experiments</title></caption>
<table><colgroup>
<col/>
<col/>
<col/>
<col/>
<col/>
</colgroup>
<thead>
<tr>
<th>No.</th>
<th>Data set</th>
<th>Features</th>
<th>Samples</th>
<th>Categories</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>Iris</td>
<td>4</td>
<td>150</td>
<td>3</td>
</tr>
<tr>
<td>2</td>
<td>Immunotherapy</td>
<td>8</td>
<td>90</td>
<td>2</td>
</tr>
<tr>
<td>3</td>
<td>Tic-Tac-Toe</td>
<td>9</td>
<td>958</td>
<td>2</td>
</tr>
<tr>
<td>4</td>
<td>Wine</td>
<td>13</td>
<td>178</td>
<td>3</td>
</tr>
<tr>
<td>5</td>
<td>Zoo</td>
<td>17</td>
<td>101</td>
<td>7</td>
</tr>
<tr>
<td>6</td>
<td>Dermatology</td>
<td>33</td>
<td>366</td>
<td>6</td>
</tr>
<tr>
<td>7</td>
<td>Ionosphere</td>
<td>34</td>
<td>351</td>
<td>2</td>
</tr>
<tr>
<td>8</td>
<td>Divorce predictors</td>
<td>54</td>
<td>170</td>
<td>2</td>
</tr>
<tr>
<td>9</td>
<td>Urban Land Cover</td>
<td>148</td>
<td>168</td>
<td>9</td>
</tr>
<tr>
<td>10</td>
<td>Arrhythmia</td>
<td>279</td>
<td>452</td>
<td>16</td>
</tr>
<tr>
<td>11</td>
<td>LSVT Voice</td>
<td>309</td>
<td>126</td>
<td>2</td>
</tr>
<tr>
<td>12</td>
<td>Detect Malacious</td>
<td>513</td>
<td>373</td>
<td>2</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>In the experiments, six other algorithms are selected for comparison. The population size is set to 30, the number of runs is 30, and the maximum number of iterations is 100. All experiments are carried out in MATLAB R2014b on a computer with an Intel(R) Core(TM) i5-5200U CPU @ 2.20 GHz running Microsoft Windows 8. The parameter settings of the algorithms are shown in <xref ref-type="table" rid="table-2">Tab. 2</xref>.</p>
<table-wrap id="table-2"><label>Table 2</label>
<caption>
<title>Parameters of the compared algorithms</title></caption>
<table><colgroup>
<col/>
<col/>
<col/>
</colgroup>
<thead>
<tr>
<th>Algorithms</th>
<th>Parameters</th>
<th>Reference</th>
</tr>
</thead>
<tbody>
<tr>
<td>STOA-DE</td>
<td><inline-formula id="ieqn-82"><mml:math id="mml-ieqn-82"><mml:mrow><mml:msub><mml:mi>C</mml:mi><mml:mi>f</mml:mi></mml:msub></mml:mrow><mml:mo>&#x003D;</mml:mo><mml:mn>2</mml:mn></mml:math>
</inline-formula>, <inline-formula id="ieqn-83"><mml:math id="mml-ieqn-83"><mml:mrow><mml:msub><mml:mi>C</mml:mi><mml:mi>B</mml:mi></mml:msub></mml:mrow><mml:mo>&#x2208;</mml:mo><mml:mo stretchy="false">[</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>0.5</mml:mn><mml:mo stretchy="false">]</mml:mo></mml:math>
</inline-formula>, <inline-formula id="ieqn-84"><mml:math id="mml-ieqn-84"><mml:mi>u</mml:mi><mml:mo>,</mml:mo><mml:mi>v</mml:mi><mml:mo>&#x003D;</mml:mo><mml:mn>1</mml:mn></mml:math>
</inline-formula>, <inline-formula id="ieqn-85"><mml:math id="mml-ieqn-85"><mml:mi>C</mml:mi><mml:mi>R</mml:mi><mml:mo>&#x003D;</mml:mo><mml:mn>0.9</mml:mn></mml:math>
</inline-formula>, <inline-formula id="ieqn-86"><mml:math id="mml-ieqn-86"><mml:mi>S</mml:mi><mml:mi>F</mml:mi><mml:mo>&#x003D;</mml:mo><mml:mn>0.5</mml:mn></mml:math>
</inline-formula></td>
<td>&#x2013;</td>
</tr>
<tr>
<td>STOA</td>
<td><inline-formula id="ieqn-87"><mml:math id="mml-ieqn-87"><mml:mrow><mml:msub><mml:mi>C</mml:mi><mml:mi>f</mml:mi></mml:msub></mml:mrow><mml:mo>&#x003D;</mml:mo><mml:mn>2</mml:mn></mml:math>
</inline-formula>, <inline-formula id="ieqn-88"><mml:math id="mml-ieqn-88"><mml:mrow><mml:msub><mml:mi>C</mml:mi><mml:mi>B</mml:mi></mml:msub></mml:mrow><mml:mo>&#x2208;</mml:mo><mml:mo stretchy="false">[</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>0.5</mml:mn><mml:mo stretchy="false">]</mml:mo></mml:math>
</inline-formula>, <inline-formula id="ieqn-89"><mml:math id="mml-ieqn-89"><mml:mi>u</mml:mi><mml:mo>,</mml:mo><mml:mi>v</mml:mi><mml:mo>&#x003D;</mml:mo><mml:mn>1</mml:mn></mml:math>
</inline-formula></td>
<td>[<xref ref-type="bibr" rid="ref-11">11</xref>]</td>
</tr>
<tr>
<td>DE</td>
<td><inline-formula id="ieqn-90"><mml:math id="mml-ieqn-90"><mml:mi>C</mml:mi><mml:mi>R</mml:mi><mml:mo>&#x003D;</mml:mo><mml:mn>0.9</mml:mn></mml:math>
</inline-formula>, <inline-formula id="ieqn-91"><mml:math id="mml-ieqn-91"><mml:mi>S</mml:mi><mml:mi>F</mml:mi><mml:mo>&#x003D;</mml:mo><mml:mn>0.5</mml:mn></mml:math>
</inline-formula></td>
<td>[<xref ref-type="bibr" rid="ref-12">12</xref>]</td>
</tr>
<tr>
<td>PSO</td>
<td><inline-formula id="ieqn-92"><mml:math id="mml-ieqn-92"><mml:mrow><mml:msub><mml:mi>c</mml:mi><mml:mn>1</mml:mn></mml:msub></mml:mrow><mml:mo>,</mml:mo><mml:mrow><mml:msub><mml:mi>c</mml:mi><mml:mn>2</mml:mn></mml:msub></mml:mrow><mml:mo>&#x003D;</mml:mo><mml:mn>1.5</mml:mn></mml:math>
</inline-formula>, <inline-formula id="ieqn-93"><mml:math id="mml-ieqn-93"><mml:mi>&#x03C9;</mml:mi><mml:mo>&#x003D;</mml:mo><mml:mn>0.75</mml:mn></mml:math>
</inline-formula>, <inline-formula id="ieqn-94"><mml:math id="mml-ieqn-94"><mml:mi>v</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mo stretchy="false">[</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">]</mml:mo></mml:math>
</inline-formula>, <inline-formula id="ieqn-95"><mml:math id="mml-ieqn-95"><mml:mi>a</mml:mi><mml:mo>&#x003D;</mml:mo><mml:mn>2</mml:mn></mml:math>
</inline-formula></td>
<td>[<xref ref-type="bibr" rid="ref-22">22</xref>]</td>
</tr>
<tr>
<td>MFO</td>
<td><inline-formula id="ieqn-96"><mml:math id="mml-ieqn-96"><mml:mi>b</mml:mi><mml:mo>&#x003D;</mml:mo><mml:mn>1.0</mml:mn></mml:math>
</inline-formula>, <inline-formula id="ieqn-97"><mml:math id="mml-ieqn-97"><mml:mi>t</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mo stretchy="false">[</mml:mo><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">]</mml:mo></mml:math>
</inline-formula>, <inline-formula id="ieqn-98"><mml:math id="mml-ieqn-98"><mml:mi>r</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mo stretchy="false">[</mml:mo><mml:mo>&#x2212;</mml:mo><mml:mn>2</mml:mn><mml:mo>,</mml:mo><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">]</mml:mo></mml:math>
</inline-formula>, <inline-formula id="ieqn-99"><mml:math id="mml-ieqn-99"><mml:mi>c</mml:mi><mml:mo>&#x003D;</mml:mo><mml:mn>2.0</mml:mn></mml:math>
</inline-formula></td>
<td>[<xref ref-type="bibr" rid="ref-23">23</xref>]</td>
</tr>
<tr>
<td>SHO</td>
<td><inline-formula id="ieqn-100"><mml:math id="mml-ieqn-100"><mml:mi>h</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mo stretchy="false">[</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>5</mml:mn><mml:mo stretchy="false">]</mml:mo></mml:math>
</inline-formula>, <inline-formula id="ieqn-101"><mml:math id="mml-ieqn-101"><mml:mi>M</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mo stretchy="false">[</mml:mo><mml:mn>0.5</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">]</mml:mo></mml:math>
</inline-formula></td>
<td>[<xref ref-type="bibr" rid="ref-9">9</xref>]</td>
</tr>
<tr>
<td>EPO</td>
<td><inline-formula id="ieqn-102"><mml:math id="mml-ieqn-102"><mml:mi>M</mml:mi><mml:mo>&#x003D;</mml:mo><mml:mn>2</mml:mn></mml:math>
</inline-formula>, <inline-formula id="ieqn-103"><mml:math id="mml-ieqn-103"><mml:mi>f</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mo stretchy="false">[</mml:mo><mml:mn>2</mml:mn><mml:mo>,</mml:mo><mml:mn>3</mml:mn><mml:mo stretchy="false">]</mml:mo></mml:math>
</inline-formula>, <inline-formula id="ieqn-104"><mml:math id="mml-ieqn-104"><mml:mi>l</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mo stretchy="false">[</mml:mo><mml:mn>1.5</mml:mn><mml:mo>,</mml:mo><mml:mn>2</mml:mn><mml:mo stretchy="false">]</mml:mo></mml:math>
</inline-formula></td>
<td>[<xref ref-type="bibr" rid="ref-10">10</xref>]</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="s4_2">
<label>4.2</label>
<title>Evaluation Criteria</title>
<p>The following criteria are used to evaluate the performance of each optimization algorithm.</p>
<p>Average classification accuracy: the mean classification accuracy over the runs of the experiment on a data set. A higher average classification accuracy indicates a better classification effect. It is expressed mathematically as follows:</p>
<p><disp-formula id="eqn-19"><label>(19)</label><mml:math id="mml-eqn-19" display="block"><mml:mi>M</mml:mi><mml:mi>e</mml:mi><mml:mi>a</mml:mi><mml:mi>n</mml:mi><mml:mo>&#x003D;</mml:mo><mml:mstyle scriptlevel="0" displaystyle="true"><mml:mrow><mml:mfrac><mml:mn>1</mml:mn><mml:mi>M</mml:mi></mml:mfrac></mml:mrow><mml:munderover><mml:mo movablelimits="false">&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>&#x003D;</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>M</mml:mi></mml:munderover><mml:mrow><mml:mi>A</mml:mi><mml:mi>c</mml:mi><mml:mi>c</mml:mi><mml:mi>u</mml:mi><mml:mi>r</mml:mi><mml:mi>a</mml:mi><mml:mi>c</mml:mi><mml:mi>y</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mstyle></mml:math>
</disp-formula></p>
<p>Where <inline-formula id="ieqn-105"><mml:math id="mml-ieqn-105"><mml:mi>A</mml:mi><mml:mi>c</mml:mi><mml:mi>c</mml:mi><mml:mi>u</mml:mi><mml:mi>r</mml:mi><mml:mi>a</mml:mi><mml:mi>c</mml:mi><mml:mi>y</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math>
</inline-formula> represents the classification accuracy in the <inline-formula id="ieqn-106"><mml:math id="mml-ieqn-106"><mml:mrow><mml:msup><mml:mi>i</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mi>h</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:math>
</inline-formula> experiment, and <inline-formula id="ieqn-107"><mml:math id="mml-ieqn-107"><mml:mi>M</mml:mi></mml:math>
</inline-formula> indicates the number of runs.</p>
<p>Average selection size: the average number of features selected during the experiment. The fewer features selected, the more effectively the irrelevant and redundant features have been removed. It is expressed mathematically as follows:</p>
<p><disp-formula id="eqn-20"><label>(20)</label><mml:math id="mml-eqn-20" display="block"><mml:mi>M</mml:mi><mml:mi>e</mml:mi><mml:mi>a</mml:mi><mml:mi>n</mml:mi><mml:mo>&#x003D;</mml:mo><mml:mstyle scriptlevel="0" displaystyle="true"><mml:mrow><mml:mfrac><mml:mn>1</mml:mn><mml:mi>M</mml:mi></mml:mfrac></mml:mrow><mml:munderover><mml:mo movablelimits="false">&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>&#x003D;</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>M</mml:mi></mml:munderover><mml:mrow><mml:mi>S</mml:mi><mml:mi>i</mml:mi><mml:mi>z</mml:mi><mml:mi>e</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mstyle></mml:math>
</disp-formula></p>
<p>Where <inline-formula id="ieqn-108"><mml:math id="mml-ieqn-108"><mml:mi>S</mml:mi><mml:mi>i</mml:mi><mml:mi>z</mml:mi><mml:mi>e</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math>
</inline-formula> represents the number of features selected by each algorithm in the <inline-formula id="ieqn-109"><mml:math id="mml-ieqn-109"><mml:mrow><mml:msup><mml:mi>i</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mi>h</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:math>
</inline-formula> experiment.</p>
<p>Fitness function: feature selection has two main objectives, classification accuracy and the number of selected features; the ideal result selects fewer features while achieving higher classification accuracy. This paper therefore evaluates the performance of the proposed algorithm on feature selection according to both criteria, combined in the following fitness function:</p>
<p><disp-formula id="eqn-21"><label>(21)</label><mml:math id="mml-eqn-21" display="block"><mml:mi>F</mml:mi><mml:mi>i</mml:mi><mml:mi>t</mml:mi><mml:mi>n</mml:mi><mml:mi>e</mml:mi><mml:mi>s</mml:mi><mml:mi>s</mml:mi><mml:mo>&#x003D;</mml:mo><mml:mi>&#x03B1;</mml:mi><mml:mo>&#x22C5;</mml:mo><mml:mrow><mml:msub><mml:mi>&#x03B3;</mml:mi><mml:mi>R</mml:mi></mml:msub></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>D</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x002B;</mml:mo><mml:mi>&#x03B2;</mml:mi><mml:mstyle scriptlevel="0" displaystyle="true"><mml:mrow><mml:mfrac><mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:mi>R</mml:mi><mml:mo>|</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:mi>N</mml:mi><mml:mo>|</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:mrow></mml:mstyle></mml:math>
</disp-formula></p>
<p>Where <inline-formula id="ieqn-110"><mml:math id="mml-ieqn-110"><mml:mi>&#x03B1;</mml:mi></mml:math>
</inline-formula> is the weight of the classification error-rate term, and <inline-formula id="ieqn-111"><mml:math id="mml-ieqn-111"><mml:mi>&#x03B1;</mml:mi></mml:math>
</inline-formula> is set to 0.99 in this paper [<xref ref-type="bibr" rid="ref-24">24</xref>]. Meanwhile, <inline-formula id="ieqn-112"><mml:math id="mml-ieqn-112"><mml:mrow><mml:msub><mml:mi>&#x03B3;</mml:mi><mml:mi>R</mml:mi></mml:msub></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>D</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math>
</inline-formula> denotes the classification error rate, expressed in <xref ref-type="disp-formula" rid="eqn-22">Eq. (22)</xref>, where <inline-formula id="ieqn-113"><mml:math id="mml-ieqn-113"><mml:mi>A</mml:mi><mml:mi>c</mml:mi><mml:mi>c</mml:mi><mml:mi>u</mml:mi><mml:mi>r</mml:mi><mml:mi>a</mml:mi><mml:mi>c</mml:mi><mml:mi>y</mml:mi></mml:math>
</inline-formula> is the classification accuracy. The parameter <inline-formula id="ieqn-114"><mml:math id="mml-ieqn-114"><mml:mi>&#x03B2;</mml:mi></mml:math>
</inline-formula> is the weight of the number of selected features in the fitness function, with <inline-formula id="ieqn-115"><mml:math id="mml-ieqn-115"><mml:mi>&#x03B2;</mml:mi><mml:mo>&#x003D;</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x2212;</mml:mo><mml:mi>&#x03B1;</mml:mi></mml:math>
</inline-formula>; <inline-formula id="ieqn-116"><mml:math id="mml-ieqn-116"><mml:mi>R</mml:mi></mml:math>
</inline-formula> represents the size of the selected feature subset, the same as the <inline-formula id="ieqn-117"><mml:math id="mml-ieqn-117"><mml:mi>S</mml:mi><mml:mi>i</mml:mi><mml:mi>z</mml:mi><mml:mi>e</mml:mi></mml:math>
</inline-formula> defined above, and <inline-formula id="ieqn-118"><mml:math id="mml-ieqn-118"><mml:mi>N</mml:mi></mml:math>
</inline-formula> represents the total number of features in the data set.</p>
<p><disp-formula id="eqn-22"><label>(22)</label><mml:math id="mml-eqn-22" display="block"><mml:mrow><mml:msub><mml:mi>&#x03B3;</mml:mi><mml:mi>R</mml:mi></mml:msub></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>D</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x003D;</mml:mo><mml:mn>1</mml:mn><mml:mrow><mml:mo>&#x2212;</mml:mo></mml:mrow><mml:mi>A</mml:mi><mml:mi>c</mml:mi><mml:mi>c</mml:mi><mml:mi>u</mml:mi><mml:mi>r</mml:mi><mml:mi>a</mml:mi><mml:mi>c</mml:mi><mml:mi>y</mml:mi></mml:math>
</disp-formula></p>
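<p>As a concrete illustration, the fitness of Eqs. (21) and (22) can be sketched in Python; the function and argument names below are illustrative, not taken from the paper's implementation.</p>

```python
# Hedged sketch of the fitness function in Eqs. (21)-(22).
# accuracy is the classification accuracy in [0, 1]; n_selected is |R|,
# the size of the selected feature subset; n_total is |N|, the total
# feature count. alpha = 0.99 follows the paper, and beta = 1 - alpha.
def fitness(accuracy, n_selected, n_total, alpha=0.99):
    error_rate = 1.0 - accuracy                  # Eq. (22): gamma_R(D)
    beta = 1.0 - alpha
    return alpha * error_rate + beta * n_selected / n_total  # Eq. (21)
```

<p>A perfect classifier that selects no features gives a fitness of 0, the ideal value; a higher error rate or a larger feature subset both increase the fitness, which the compared algorithms minimize.</p>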
<p>Average fitness: the average fitness value obtained over repeated runs of the algorithm in the experiment. The smaller the average fitness, the better the algorithm balances enhancing classification accuracy against reducing the number of selected features. It can be expressed as:</p>
<p><disp-formula id="eqn-23"><label>(23)</label><mml:math id="mml-eqn-23" display="block"><mml:mi>M</mml:mi><mml:mi>e</mml:mi><mml:mi>a</mml:mi><mml:mi>n</mml:mi><mml:mo>&#x003D;</mml:mo><mml:mstyle scriptlevel="0" displaystyle="true"><mml:mrow><mml:mfrac><mml:mn>1</mml:mn><mml:mi>M</mml:mi></mml:mfrac></mml:mrow><mml:munderover><mml:mo movablelimits="false">&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>&#x003D;</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>M</mml:mi></mml:munderover><mml:mrow><mml:mi>F</mml:mi><mml:mi>i</mml:mi><mml:mi>t</mml:mi><mml:mi>n</mml:mi><mml:mi>e</mml:mi><mml:mi>s</mml:mi><mml:mi>s</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mstyle></mml:math>
</disp-formula></p>
<p>Where <inline-formula id="ieqn-119"><mml:math id="mml-ieqn-119"><mml:mi>F</mml:mi><mml:mi>i</mml:mi><mml:mi>t</mml:mi><mml:mi>n</mml:mi><mml:mi>e</mml:mi><mml:mi>s</mml:mi><mml:mi>s</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math>
</inline-formula> shows the fitness of these optimization algorithms in the <inline-formula id="ieqn-120"><mml:math id="mml-ieqn-120"><mml:mrow><mml:msup><mml:mi>i</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mi>h</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:math>
</inline-formula> experiment.</p>
<p>Statistical standard deviation (std): measures the stability of the optimization algorithm across the experiments; the smaller the standard deviation, the better the stability. It is computed as follows:</p>
<p><disp-formula id="eqn-24"><label>(24)</label><mml:math id="mml-eqn-24" display="block"><mml:mi>s</mml:mi><mml:mi>t</mml:mi><mml:mi>d</mml:mi><mml:mo>&#x003D;</mml:mo><mml:msqrt><mml:mstyle scriptlevel="0" displaystyle="true"><mml:mrow><mml:mfrac><mml:mn>1</mml:mn><mml:mi>M</mml:mi></mml:mfrac></mml:mrow><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mrow><mml:msup><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>F</mml:mi><mml:mi>i</mml:mi><mml:mi>t</mml:mi><mml:mi>n</mml:mi><mml:mi>e</mml:mi><mml:mi>s</mml:mi><mml:mi>s</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mi>M</mml:mi><mml:mi>e</mml:mi><mml:mi>a</mml:mi><mml:mi>n</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mn>2</mml:mn></mml:msup></mml:mrow></mml:mrow></mml:mstyle></mml:msqrt></mml:math>
</disp-formula></p>
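<p>Note that Eqs. (23) and (24) use the population form of the statistics (a 1/M factor rather than the sample 1/(M-1)). A minimal sketch, with illustrative names:</p>

```python
import math

# Population mean and standard deviation over M runs, per Eqs. (23)-(24).
def mean_and_std(fitness_values):
    m = len(fitness_values)
    mean = sum(fitness_values) / m                       # Eq. (23)
    variance = sum((f - mean) ** 2 for f in fitness_values) / m
    return mean, math.sqrt(variance)                     # Eq. (24)
```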
<p>Average running time: the average time consumed per run over the whole experiment. Since time cost is also significant in engineering practice, the average running time is included in the evaluation criteria to assess the proposed method more comprehensively. It is calculated as follows:</p>
<p><disp-formula id="eqn-25"><label>(25)</label><mml:math id="mml-eqn-25" display="block"><mml:mi>M</mml:mi><mml:mi>e</mml:mi><mml:mi>a</mml:mi><mml:mi>n</mml:mi><mml:mo>&#x003D;</mml:mo><mml:mstyle scriptlevel="0" displaystyle="true"><mml:mrow><mml:mfrac><mml:mn>1</mml:mn><mml:mi>M</mml:mi></mml:mfrac></mml:mrow><mml:munderover><mml:mo movablelimits="false">&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>&#x003D;</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>M</mml:mi></mml:munderover><mml:mrow><mml:mi>R</mml:mi><mml:mi>u</mml:mi><mml:mi>n</mml:mi><mml:mi>t</mml:mi><mml:mi>i</mml:mi><mml:mi>m</mml:mi><mml:mi>e</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mstyle></mml:math>
</disp-formula></p>
<p>Where <inline-formula id="ieqn-121"><mml:math id="mml-ieqn-121"><mml:mi>R</mml:mi><mml:mi>u</mml:mi><mml:mi>n</mml:mi><mml:mi>t</mml:mi><mml:mi>i</mml:mi><mml:mi>m</mml:mi><mml:mi>e</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math>
</inline-formula> shows the running time of the algorithm in the <inline-formula id="ieqn-122"><mml:math id="mml-ieqn-122"><mml:mrow><mml:msup><mml:mi>i</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mi>h</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:math>
</inline-formula> experiment.</p>
</sec>
<sec id="s4_3">
<label>4.3</label>
<title>Classical UCI Data Set Experiments</title>
<p>The classification-accuracy results in <xref ref-type="table" rid="table-3">Tab. 3</xref> show that the STOA-DE algorithm achieves the best classification accuracy, dividing the data sets accurately, on every data set except LSVT Voice, where all algorithms obtain the same accuracy (66.67%); since the overall accuracy on LSVT Voice is poor for every method, that case is not representative. Furthermore, the proposed method achieves 100% classification accuracy on both Tic-Tac-Toe and Divorce predictors, and 99.73% on Detect Malacious. These results demonstrate that the proposed method is competitive for simultaneous feature selection and support vector machine parameter optimization.</p>
<p><xref ref-type="table" rid="table-4">Tab. 4</xref> shows the average feature selection size during the experiment. The table verifies that the proposed STOA-DE algorithm selects a relatively small number of features in most cases. Although it did not obtain the optimal result on the Wine and Dermatology data sets, STOA-DE was the best performer on all data sets with more than 100 dimensions. Compared with the other algorithms, the proposed model is therefore superior at data dimension reduction.</p>
<table-wrap id="table-3"><label>Table 3</label>
<caption>
<title>The average classification accuracy of each algorithm</title></caption>
<table><colgroup>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
</colgroup>
<thead>
<tr>
<th rowspan="2">Data set</th>
<th colspan="7">The average classification accuracy(%)</th>
</tr>
<tr>
<th>STOA-DE</th>
<th>STOA</th>
<th>DE</th>
<th>PSO</th>
<th>MFO</th>
<th>SHO</th>
<th>EPO</th>
</tr>
</thead>
<tbody>
<tr>
<td>Iris</td>
<td><bold>98.67</bold></td>
<td>98.43</td>
<td>98.67</td>
<td>98.00</td>
<td>94.00</td>
<td>98.12</td>
<td><bold>98.67</bold></td>
</tr>
<tr>
<td>Immunotherapy</td>
<td><bold>90.22</bold></td>
<td>85.56</td>
<td>78.89</td>
<td>81.44</td>
<td>81.11</td>
<td>79.28</td>
<td>78.67</td>
</tr>
<tr>
<td>Tic-Tac-Toe</td>
<td><bold>100.00</bold></td>
<td>83.40</td>
<td>68.68</td>
<td>77.24</td>
<td>78.08</td>
<td>78.18</td>
<td><bold>100.00</bold></td>
</tr>
<tr>
<td>Wine</td>
<td><bold>96.83</bold></td>
<td>90.30</td>
<td>71.35</td>
<td>73.00</td>
<td>56.18</td>
<td>71.20</td>
<td>72.06</td>
</tr>
<tr>
<td>Zoo</td>
<td><bold>99.04</bold></td>
<td>97.03</td>
<td>97.03</td>
<td>97.03</td>
<td>87.13</td>
<td>95.05</td>
<td>98.80</td>
</tr>
<tr>
<td>Dermatology</td>
<td><bold>97.81</bold></td>
<td>94.90</td>
<td>96.99</td>
<td>96.72</td>
<td>53.55</td>
<td>35.18</td>
<td>94.67</td>
</tr>
<tr>
<td>Ionosphere</td>
<td><bold>96.69</bold></td>
<td>93.73</td>
<td>95.73</td>
<td>65.24</td>
<td>66.10</td>
<td>64.67</td>
<td>96.01</td>
</tr>
<tr>
<td>Divorce predictors</td>
<td><bold>100.00</bold></td>
<td>98.41</td>
<td>67.65</td>
<td>98.24</td>
<td>79.41</td>
<td>67.65</td>
<td>98.12</td>
</tr>
<tr>
<td>Urban Land Cover</td>
<td><bold>72.64</bold></td>
<td>60.15</td>
<td>50.98</td>
<td>17.26</td>
<td>61.83</td>
<td>38.57</td>
<td>40.63</td>
</tr>
<tr>
<td>Arrhythmia</td>
<td><bold>74.84</bold></td>
<td>69.03</td>
<td>54.84</td>
<td>58.71</td>
<td>67.89</td>
<td>74.19</td>
<td>58.71</td>
</tr>
<tr>
<td>LSVT Voice</td>
<td><bold>66.67</bold></td>
<td><bold>66.67</bold></td>
<td><bold>66.67</bold></td>
<td><bold>66.67</bold></td>
<td><bold>66.67</bold></td>
<td><bold>66.67</bold></td>
<td><bold>66.67</bold></td>
</tr>
<tr>
<td>Detect Malacious</td>
<td><bold>99.73</bold></td>
<td>99.46</td>
<td>99.26</td>
<td>99.20</td>
<td>89.37</td>
<td>81.77</td>
<td>99.73</td>
</tr>
</tbody>
</table>
</table-wrap>
<table-wrap id="table-4"><label>Table 4</label>
<caption>
<title>The average selection size of each algorithm</title></caption>
<table><colgroup>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
</colgroup>
<thead>
<tr>
<th rowspan="2">Data set</th>
<th colspan="7">The average selection size</th>
</tr>
<tr>
<th>STOA-DE</th>
<th>STOA</th>
<th>DE</th>
<th>PSO</th>
<th>MFO</th>
<th>SHO</th>
<th>EPO</th>
</tr>
</thead>
<tbody>
<tr>
<td>Iris</td>
<td><bold>1.10</bold></td>
<td>1.90</td>
<td>1.95</td>
<td>1.47</td>
<td>1.23</td>
<td>1.67</td>
<td>1.86</td>
</tr>
<tr>
<td>Immunotherapy</td>
<td><bold>1.46</bold></td>
<td>1.73</td>
<td>4.03</td>
<td>2.83</td>
<td>2.02</td>
<td>3.33</td>
<td>3.63</td>
</tr>
<tr>
<td>Tic-Tac-Toe</td>
<td><bold>3.67</bold></td>
<td>4.23</td>
<td>4.83</td>
<td>5.29</td>
<td>6.25</td>
<td>8.22</td>
<td>7.03</td>
</tr>
<tr>
<td>Wine</td>
<td>4.06</td>
<td>4.23</td>
<td>13.36</td>
<td>6.67</td>
<td>3.78</td>
<td><bold>3.66</bold></td>
<td>5.40</td>
</tr>
<tr>
<td>Zoo</td>
<td><bold>2.72</bold></td>
<td>7.66</td>
<td>3.27</td>
<td>7.31</td>
<td>3.49</td>
<td>6.52</td>
<td>7.11</td>
</tr>
<tr>
<td>Dermatology</td>
<td>10.81</td>
<td>13.33</td>
<td>14.18</td>
<td>14.40</td>
<td>14.14</td>
<td>11.57</td>
<td><bold>9.57</bold></td>
</tr>
<tr>
<td>Ionosphere</td>
<td><bold>10.53</bold></td>
<td>16.23</td>
<td>11.75</td>
<td>14.07</td>
<td>13.45</td>
<td>19.43</td>
<td>15.87</td>
</tr>
<tr>
<td>Divorce predictors</td>
<td><bold>20.47</bold></td>
<td>23.50</td>
<td>26.39</td>
<td>26.70</td>
<td>21.68</td>
<td>21.29</td>
<td>24.70</td>
</tr>
<tr>
<td>Urban Land Cover</td>
<td><bold>45.74</bold></td>
<td>73.56</td>
<td>76.41</td>
<td>77.86</td>
<td>58.93</td>
<td>67.84</td>
<td>69.75</td>
</tr>
<tr>
<td>Arrhythmia</td>
<td><bold>131.74</bold></td>
<td>139.26</td>
<td>152.84</td>
<td>149.37</td>
<td>137.42</td>
<td>189.63</td>
<td>172.63</td>
</tr>
<tr>
<td>LSVT Voice</td>
<td><bold>151.47</bold></td>
<td>153.27</td>
<td>161.29</td>
<td>152.74</td>
<td>152.46</td>
<td>161.85</td>
<td>162.36</td>
</tr>
<tr>
<td>Detect Malacious</td>
<td><bold>115.84</bold></td>
<td>271.28</td>
<td>289.32</td>
<td>264.53</td>
<td>117.62</td>
<td>267.49</td>
<td>317.27</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>It can be seen from <xref ref-type="table" rid="table-5">Tab. 5</xref> that the time advantage of the proposed algorithm is not very pronounced: because the method hybridizes STOA with DE, its running time is slightly greater than that of the traditional STOA algorithm. Nevertheless, the proposed algorithm remains promising even though it is not the fastest. Since STOA converges quickly (though it is prone to premature convergence), the hybrid model runs faster than the traditional DE, and owing to this property the proposed STOA-DE algorithm still holds a time advantage over the other compared algorithms in most cases. The proposed model therefore has practical potential.</p>
<table-wrap id="table-5"><label>Table 5</label>
<caption>
<title>The average time of each algorithm</title></caption>
<table><colgroup>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
</colgroup>
<thead>
<tr>
<th rowspan="2">Data set</th>
<th colspan="7">The average time (s)</th>
</tr>
<tr>
<th>STOA-DE</th>
<th>STOA</th>
<th>DE</th>
<th>PSO</th>
<th>MFO</th>
<th>SHO</th>
<th>EPO</th>
</tr>
</thead>
<tbody>
<tr>
<td>Iris</td>
<td>12.71</td>
<td><bold>12.09</bold></td>
<td>13.54</td>
<td>14.36</td>
<td>19.49</td>
<td>14.41</td>
<td>15.28</td>
</tr>
<tr>
<td>Immunotherapy</td>
<td>7.09</td>
<td><bold>6.95</bold></td>
<td>8.44</td>
<td>7.52</td>
<td>6.50</td>
<td>9.22</td>
<td>10.23</td>
</tr>
<tr>
<td>Tic-Tac-Toe</td>
<td>169.67</td>
<td><bold>169.29</bold></td>
<td>229.35</td>
<td>171.52</td>
<td>347.95</td>
<td>189.53</td>
<td>181.08</td>
</tr>
<tr>
<td>Wine</td>
<td>40.23</td>
<td><bold>39.67</bold></td>
<td>44.17</td>
<td><bold>39.67</bold></td>
<td>32.05</td>
<td>51.26</td>
<td>48.81</td>
</tr>
<tr>
<td>Zoo</td>
<td>16.95</td>
<td><bold>16.25</bold></td>
<td>28.71</td>
<td>16.49</td>
<td>24.42</td>
<td>19.45</td>
<td>20.38</td>
</tr>
<tr>
<td>Dermatology</td>
<td>97.69</td>
<td><bold>93.76</bold></td>
<td>117.11</td>
<td>97.74</td>
<td>202.98</td>
<td>115.87</td>
<td>117.21</td>
</tr>
<tr>
<td>Ionosphere</td>
<td>74.74</td>
<td><bold>71.91</bold></td>
<td>84.83</td>
<td>85.09</td>
<td>112.62</td>
<td>81.92</td>
<td>86.96</td>
</tr>
<tr>
<td>Divorce predictors</td>
<td>29.17</td>
<td><bold>27.26</bold></td>
<td>38.82</td>
<td>32.63</td>
<td>29.68</td>
<td>32.87</td>
<td>42.41</td>
</tr>
<tr>
<td>Urban Land Cover</td>
<td>186.15</td>
<td><bold>185.79</bold></td>
<td>197.37</td>
<td>188.76</td>
<td>186.57</td>
<td>207.50</td>
<td>192.15</td>
</tr>
<tr>
<td>Arrhythmia</td>
<td><bold>4132.40</bold></td>
<td>4286.94</td>
<td>4690.48</td>
<td>4802.38</td>
<td>5037.96</td>
<td>5129.23</td>
<td>4582.08</td>
</tr>
<tr>
<td>LSVT Voice</td>
<td>110.32</td>
<td>104.39</td>
<td>115.47</td>
<td>105.10</td>
<td>108.29</td>
<td><bold>102.43</bold></td>
<td>103.71</td>
</tr>
<tr>
<td>Detect Malacious</td>
<td>151.86</td>
<td><bold>109.94</bold></td>
<td>502.79</td>
<td>837.45</td>
<td>683.28</td>
<td>829.48</td>
<td>664.20</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>Combining the average classification accuracy and average selection size discussed above, the fitness function values are evaluated in <xref ref-type="fig" rid="fig-4">Fig. 4</xref> and <xref ref-type="fig" rid="fig-5">5</xref>. The figures show that the proposed algorithm has advantages in both mean fitness and standard deviation across different dimensions, demonstrating that simultaneous feature selection and support vector machine parameter optimization based on the STOA-DE algorithm achieves excellent accuracy and stability. <xref ref-type="fig" rid="fig-6">Fig. 6</xref> shows the convergence curves drawn from the fitness values of the last of the 30 runs, depicting the complete convergence process on each data set. These results show that, regardless of the feature dimensionality, the STOA-DE algorithm converges faster, reaches higher precision, and exhibits stronger convergence ability, which further confirms the feasibility of the proposed method.</p>
<fig id="fig-4">
<label>Figure 4</label>
<caption>
<title>The average fitness of each algorithm</title></caption>
<graphic mimetype="image" mime-subtype="png" xlink:href="CSSE_17536-fig-4.png"/>
</fig>
<fig id="fig-5">
<label>Figure 5</label>
<caption>
<title>The standard deviation of fitness of each algorithm</title></caption>
<graphic mimetype="image" mime-subtype="png" xlink:href="CSSE_17536-fig-5.png"/>
</fig>
<fig id="fig-6">
<label>Figure 6</label>
<caption>
<title>The convergence curve of fitness of each algorithm</title></caption>
<graphic mimetype="image" mime-subtype="png" xlink:href="CSSE_17536-fig-6.png"/>
</fig>
</sec>
</sec>
<sec id="s5">
<label>5</label>
<title>Conclusion</title>
<p>To address the imbalance between exploration and exploitation and the low population diversity of the traditional STOA algorithm, this paper proposes the STOA-DE algorithm, which improves local search ability and obtains better solutions. Moreover, combining the optimization algorithm with the SVM and feature selection makes it possible to optimize the two SVM parameters and select features simultaneously, enhancing the ability of data analysis and learning. Experiments on classic UCI data sets show that the proposed method has a superior search ability and can effectively complete the data classification task. Future research may further study hybrid optimization models for better application to data preprocessing.</p>
</sec>
</body>
<back>
<ack>
<p>This research is based upon works supported by Grant 19YG02, Sanming University.</p>
</ack><fn-group>
<fn fn-type="other">
<p><bold>Funding Statement: </bold>Sanming University introduces high-level talents to start scientific research funding support project (20YG14, 20YG01), Guiding science and technology projects in Sanming City (2020-G-61, 2020-S-39), Educational research projects of young and middle-aged teachers in Fujian Province (JAT200618, JAT200638), Scientific research and development fund of Sanming University(B202009, B202029).</p>
</fn>
<fn fn-type="conflict">
<p><bold>Conflicts of Interest: </bold>The authors declare that they have no conflicts of interest to report regarding the present study.</p>
</fn>
</fn-group>
<ref-list content-type="authoryear">
<title>References</title>
<ref id="ref-1"><label>[1]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>W. D.</given-names> <surname>Jiang</surname></string-name>, <string-name><given-names>T. A.</given-names> <surname>Yang</surname></string-name>, <string-name><given-names>G.</given-names> <surname>Sun</surname></string-name>, <string-name><given-names>Y. C.</given-names> <surname>Li</surname></string-name>, <string-name><given-names>Y. X.</given-names> <surname>Tang</surname></string-name> <etal>et al.</etal></person-group><italic>,</italic> &#x201C;<article-title>The analysis of china&#x2019;s integrity situation based on big data</article-title>,&#x201D; <source>Journal on Big Data</source>, vol. <volume>1</volume>, no. <issue>3</issue>, pp. <fpage>117</fpage>&#x2013;<lpage>134</lpage>, <year>2019</year>.</mixed-citation></ref>
<ref id="ref-2"><label>[2]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>K.</given-names> <surname>Kaur</surname></string-name> and <string-name><given-names>K.</given-names> <surname>Kaur</surname></string-name></person-group>, &#x201C;<article-title>Failure prediction, lead time estimation and health degree assessment for hard disk drives using voting based decision trees</article-title>,&#x201D; <source>Computers, Materials &#x0026; Continua</source>, vol. <volume>60</volume>, no. <issue>3</issue>, pp. <fpage>913</fpage>&#x2013;<lpage>946</lpage>, <year>2019</year>.</mixed-citation></ref>
<ref id="ref-3"><label>[3]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>K.</given-names> <surname>Zhu</surname></string-name>, <string-name><given-names>N.</given-names> <surname>Zhang</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Ying</surname></string-name> and <string-name><given-names>X.</given-names> <surname>Wang</surname></string-name></person-group>, &#x201C;<article-title>Within-project and cross-project software defect prediction based on improved transfer naive Bayes algorithm</article-title>,&#x201D; <source>Computers, Materials &#x0026; Continua</source>, vol. <volume>63</volume>, no. <issue>2</issue>, pp. <fpage>891</fpage>&#x2013;<lpage>910</lpage>, <year>2020</year>.</mixed-citation></ref>
<ref id="ref-4"><label>[4]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>C.</given-names> <surname>Zhang</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Yao</surname></string-name>, <string-name><given-names>G.</given-names> <surname>Hu</surname></string-name> and <string-name><given-names>T.</given-names> <surname>Sch&#x00F8;tt</surname></string-name></person-group>, &#x201C;<article-title>Applying feature-weighted gradient decent k-nearest neighbor to select promising projects for scientific funding</article-title>,&#x201D; <source>Computers, Materials &#x0026; Continua</source>, vol. <volume>64</volume>, no. <issue>3</issue>, pp. <fpage>1741</fpage>&#x2013;<lpage>1753</lpage>, <year>2020</year>.</mixed-citation></ref>
<ref id="ref-5"><label>[5]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>J.</given-names> <surname>Zhen</surname></string-name></person-group>, &#x201C;<article-title>Detection of number of wideband signals based on support vector machine</article-title>,&#x201D; <source>Computers, Materials &#x0026; Continua</source>, vol. <volume>62</volume>, no. <issue>3</issue>, pp. <fpage>445</fpage>&#x2013;<lpage>455</lpage>, <year>2020</year>.</mixed-citation></ref>
<ref id="ref-6"><label>[6]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>O.</given-names> <surname>Chapelle</surname></string-name>, <string-name><given-names>V.</given-names> <surname>Vapnik</surname></string-name>, <string-name><given-names>O.</given-names> <surname>Bousquet</surname></string-name> and <string-name><given-names>S.</given-names> <surname>Mukherjee</surname></string-name></person-group>, &#x201C;<article-title>Choosing multiple parameters for support vector machines</article-title>,&#x201D; <source>Machine Learning</source>, vol. <volume>46</volume>, no. <issue>1&#x2013;3</issue>, pp. <fpage>131</fpage>&#x2013;<lpage>159</lpage>, <year>2002</year>.</mixed-citation></ref>
<ref id="ref-7"><label>[7]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>X.</given-names> <surname>Yu</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>Chu</surname></string-name>, <string-name><given-names>F.</given-names> <surname>Jiang</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>Guo</surname></string-name> and <string-name><given-names>D. W.</given-names> <surname>Gong</surname></string-name></person-group>, &#x201C;<article-title>SVMs classification based two-side cross domain collaborative filtering by inferring intrinsic user and item features</article-title>,&#x201D; <source>Knowledge-Based Systems</source>, vol. <volume>141</volume>, no. <issue>1</issue>, pp. <fpage>80</fpage>&#x2013;<lpage>91</lpage>, <year>2018</year>.</mixed-citation></ref>
<ref id="ref-8"><label>[8]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>Y.</given-names> <surname>Zhang</surname></string-name>, <string-name><given-names>D. W.</given-names> <surname>Gong</surname></string-name> and <string-name><given-names>J.</given-names> <surname>Cheng</surname></string-name></person-group>, &#x201C;<article-title>Multi-objective particle swarm optimization approach for cost-based feature selection in classification</article-title>,&#x201D; <source>IEEE/ACM Trans. on Computational Biology and Bioinformatics</source>, vol. <volume>14</volume>, no. <issue>1</issue>, pp. <fpage>64</fpage>&#x2013;<lpage>75</lpage>, <year>2017</year>.</mixed-citation></ref>
<ref id="ref-9"><label>[9]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>H. M.</given-names> <surname>Jia</surname></string-name>, <string-name><given-names>J. D.</given-names> <surname>Li</surname></string-name>, <string-name><given-names>W. L.</given-names> <surname>Song</surname></string-name>, <string-name><given-names>X. X.</given-names> <surname>Peng</surname></string-name>, <string-name><given-names>C. B.</given-names> <surname>Lang</surname></string-name> <etal>et al.</etal></person-group>, &#x201C;<article-title>Spotted hyena optimization algorithm with simulated annealing for feature selection</article-title>,&#x201D; <source>IEEE Access</source>, vol. <volume>7</volume>, pp. <fpage>71943</fpage>&#x2013;<lpage>71962</lpage>, <year>2019</year>.</mixed-citation></ref>
<ref id="ref-10"><label>[10]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>S. K.</given-names> <surname>Baliarsingh</surname></string-name>, <string-name><given-names>W.</given-names> <surname>Ding</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Vipsita</surname></string-name> and <string-name><given-names>S.</given-names> <surname>Bakshi</surname></string-name></person-group>, &#x201C;<article-title>A memetic algorithm using emperor penguin and social engineering optimization for medical data classification</article-title>,&#x201D; <source>Applied Soft Computing</source>, vol. <volume>85</volume>, no. <issue>1</issue>, pp. <fpage>105773</fpage>, <year>2019</year>.</mixed-citation></ref>
<ref id="ref-11"><label>[11]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>G.</given-names> <surname>Dhiman</surname></string-name> and <string-name><given-names>A.</given-names> <surname>Kaur</surname></string-name></person-group>, &#x201C;<article-title>STOA: A bio-inspired based optimization algorithm for industrial engineering problems</article-title>,&#x201D; <source>Engineering Applications of Artificial Intelligence</source>, vol. <volume>82</volume>, pp. <fpage>148</fpage>&#x2013;<lpage>174</lpage>, <year>2019</year>.</mixed-citation></ref>
<ref id="ref-12"><label>[12]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>R.</given-names> <surname>Storn</surname></string-name> and <string-name><given-names>K.</given-names> <surname>Price</surname></string-name></person-group>, &#x201C;<article-title>Differential evolution&#x2014;A simple and efficient heuristic for global optimization over continuous spaces</article-title>,&#x201D; <source>Journal of Global Optimization</source>, vol. <volume>11</volume>, no. <issue>4</issue>, pp. <fpage>341</fpage>&#x2013;<lpage>359</lpage>, <year>1997</year>.</mixed-citation></ref>
<ref id="ref-13"><label>[13]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>G.</given-names> <surname>Xiong</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Zhang</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Yuan</surname></string-name>, <string-name><given-names>D.</given-names> <surname>Shi</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>He</surname></string-name> <etal>et al.</etal></person-group>, &#x201C;<article-title>Parameter extraction of solar photovoltaic models by means of a hybrid differential evolution with whale optimization algorithm</article-title>,&#x201D; <source>Solar Energy</source>, vol. <volume>176</volume>, no. <issue>4</issue>, pp. <fpage>742</fpage>&#x2013;<lpage>761</lpage>, <year>2018</year>.</mixed-citation></ref>
<ref id="ref-14"><label>[14]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>H. M.</given-names> <surname>Jia</surname></string-name>, <string-name><given-names>C. B.</given-names> <surname>Lang</surname></string-name>, <string-name><given-names>D.</given-names> <surname>Oliva</surname></string-name>, <string-name><given-names>W. L.</given-names> <surname>Song</surname></string-name> and <string-name><given-names>X. X.</given-names> <surname>Peng</surname></string-name></person-group>, &#x201C;<article-title>Hybrid grasshopper optimization algorithm and differential evolution for multilevel satellite image segmentation</article-title>,&#x201D; <source>Remote Sensing</source>, vol. <volume>11</volume>, no. <issue>9</issue>, pp. <fpage>1134</fpage>, <year>2019</year>.</mixed-citation></ref>
<ref id="ref-15"><label>[15]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>K.</given-names> <surname>Tamura</surname></string-name> and <string-name><given-names>K.</given-names> <surname>Yasuda</surname></string-name></person-group>, &#x201C;<article-title>The spiral optimization algorithm: Convergence conditions and settings</article-title>,&#x201D; <source>IEEE Transactions on Systems, Man, and Cybernetics: Systems</source>, vol. <volume>50</volume>, no. <issue>1</issue>, pp. <fpage>360</fpage>&#x2013;<lpage>375</lpage>, <year>2020</year>.</mixed-citation></ref>
<ref id="ref-16"><label>[16]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>H. M.</given-names> <surname>Jia</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>Li</surname></string-name>, <string-name><given-names>C. B.</given-names> <surname>Lang</surname></string-name>, <string-name><given-names>X. X.</given-names> <surname>Peng</surname></string-name>, <string-name><given-names>K. J.</given-names> <surname>Sun</surname></string-name> <etal>et al.</etal></person-group>, &#x201C;<article-title>Hybrid grasshopper optimization algorithm and differential evolution for global optimization</article-title>,&#x201D; <source>Journal of Intelligent &#x0026; Fuzzy Systems</source>, vol. <volume>37</volume>, no. <issue>5</issue>, pp. <fpage>1</fpage>&#x2013;<lpage>12</lpage>, <year>2019</year>.</mixed-citation></ref>
<ref id="ref-17"><label>[17]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>C.</given-names> <surname>Cortes</surname></string-name> and <string-name><given-names>V.</given-names> <surname>Vapnik</surname></string-name></person-group>, &#x201C;<article-title>Support-vector networks</article-title>,&#x201D; <source>Machine Learning</source>, vol. <volume>20</volume>, no. <issue>3</issue>, pp. <fpage>273</fpage>&#x2013;<lpage>297</lpage>, <year>1995</year>.</mixed-citation></ref>
<ref id="ref-18"><label>[18]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>I.</given-names> <surname>Aljarah</surname></string-name>, <string-name><given-names>A. M.</given-names> <surname>Al-Zoubi</surname></string-name>, <string-name><given-names>H.</given-names> <surname>Faris</surname></string-name>, <string-name><given-names>M. A.</given-names> <surname>Hassonah</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Mirjalili</surname></string-name> <etal>et al.</etal></person-group>, &#x201C;<article-title>Simultaneous feature selection and support vector machine optimization using the grasshopper optimization algorithm</article-title>,&#x201D; <source>Cognitive Computation</source>, vol. <volume>10</volume>, no. <issue>3</issue>, pp. <fpage>478</fpage>&#x2013;<lpage>495</lpage>, <year>2018</year>.</mixed-citation></ref>
<ref id="ref-19"><label>[19]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>Z. L.</given-names> <surname>Cai</surname></string-name> and <string-name><given-names>W.</given-names> <surname>Zhu</surname></string-name></person-group>, &#x201C;<article-title>Feature selection for multi-label classification using neighborhood preservation</article-title>,&#x201D; <source>IEEE/CAA Journal of Automatica Sinica</source>, vol. <volume>5</volume>, no. <issue>1</issue>, pp. <fpage>320</fpage>&#x2013;<lpage>330</lpage>, <year>2018</year>.</mixed-citation></ref>
<ref id="ref-20"><label>[20]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>M.</given-names> <surname>Dash</surname></string-name> and <string-name><given-names>H.</given-names> <surname>Liu</surname></string-name></person-group>, &#x201C;<article-title>Feature selection for classification</article-title>,&#x201D; <source>Intelligent Data Analysis</source>, vol. <volume>1</volume>, no. <issue>1&#x2013;4</issue>, pp. <fpage>131</fpage>&#x2013;<lpage>156</lpage>, <year>1997</year>.</mixed-citation></ref>
<ref id="ref-21"><label>[21]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><given-names>C.</given-names> <surname>Blake</surname></string-name></person-group>, &#x201C;<article-title>UCI repository of machine learning databases</article-title>,&#x201D; 2021. [Online]. Available: <uri>http://www.ics.uci.edu/~mlearn/MLRepository.html</uri>.</mixed-citation></ref>
<ref id="ref-22"><label>[22]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>J.</given-names> <surname>Liu</surname></string-name>, <string-name><given-names>Z.</given-names> <surname>Liu</surname></string-name> and <string-name><given-names>Y.</given-names> <surname>Xiong</surname></string-name></person-group>, &#x201C;<article-title>Method of parameters optimization in SVM based on PSO</article-title>,&#x201D; <source>Transactions on Computer Science &#x0026; Technology</source>, vol. <volume>2</volume>, no. <issue>1</issue>, pp. <fpage>9</fpage>&#x2013;<lpage>16</lpage>, <year>2013</year>.</mixed-citation></ref>
<ref id="ref-23"><label>[23]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>S.</given-names> <surname>Mirjalili</surname></string-name></person-group>, &#x201C;<article-title>Moth-flame optimization algorithm: A novel nature-inspired heuristic paradigm</article-title>,&#x201D; <source>Knowledge-Based Systems</source>, vol. <volume>89</volume>, pp. <fpage>228</fpage>&#x2013;<lpage>249</lpage>, <year>2015</year>.</mixed-citation></ref>
<ref id="ref-24"><label>[24]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>E.</given-names> <surname>Emary</surname></string-name>, <string-name><given-names>H. M.</given-names> <surname>Zawbaa</surname></string-name> and <string-name><given-names>A. E.</given-names> <surname>Hassanien</surname></string-name></person-group>, &#x201C;<article-title>Binary ant lion approaches for feature selection</article-title>,&#x201D; <source>Neurocomputing</source>, vol. <volume>213</volume>, no. <issue>6</issue>, pp. <fpage>54</fpage>&#x2013;<lpage>65</lpage>, <year>2016</year>.</mixed-citation></ref>
</ref-list>
</back>
</article>