<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.1 20151215//EN" "http://jats.nlm.nih.gov/publishing/1.1/JATS-journalpublishing1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xml:lang="en" article-type="research-article" dtd-version="1.1">
<front>
<journal-meta>
<journal-id journal-id-type="pmc">CMC</journal-id>
<journal-id journal-id-type="nlm-ta">CMC</journal-id>
<journal-id journal-id-type="publisher-id">CMC</journal-id>
<journal-title-group>
<journal-title>Computers, Materials &#x0026; Continua</journal-title>
</journal-title-group>
<issn pub-type="epub">1546-2226</issn>
<issn pub-type="ppub">1546-2218</issn>
<publisher>
<publisher-name>Tech Science Press</publisher-name>
<publisher-loc>USA</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">55080</article-id>
<article-id pub-id-type="doi">10.32604/cmc.2024.055080</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Multi-Label Feature Selection Based on Improved Ant Colony Optimization Algorithm with Dynamic Redundancy and Label Dependence</article-title>
<alt-title alt-title-type="left-running-head">Multi-Label Feature Selection Based on Improved Ant Colony Optimization Algorithm with Dynamic Redundancy and Label Dependence</alt-title>
<alt-title alt-title-type="right-running-head">Multi-Label Feature Selection Based on Improved Ant Colony Optimization Algorithm with Dynamic Redundancy and Label Dependence</alt-title>
</title-group>
<contrib-group>
<contrib id="author-1" contrib-type="author">
<name name-style="western"><surname>Cai</surname><given-names>Ting</given-names></name><xref ref-type="aff" rid="aff-1">1</xref></contrib>
<contrib id="author-2" contrib-type="author">
<name name-style="western"><surname>Ye</surname><given-names>Chun</given-names></name><xref ref-type="aff" rid="aff-1">1</xref></contrib>
<contrib id="author-3" contrib-type="author" corresp="yes">
<name name-style="western"><surname>Ye</surname><given-names>Zhiwei</given-names></name><xref ref-type="aff" rid="aff-1">1</xref><email>hgcsyzw@hbut.edu.cn</email></contrib>
<contrib id="author-4" contrib-type="author">
<name name-style="western"><surname>Chen</surname><given-names>Ziyuan</given-names></name><xref ref-type="aff" rid="aff-1">1</xref></contrib>
<contrib id="author-5" contrib-type="author">
<name name-style="western"><surname>Mei</surname><given-names>Mengqing</given-names></name><xref ref-type="aff" rid="aff-1">1</xref></contrib>
<contrib id="author-6" contrib-type="author">
<name name-style="western"><surname>Zhang</surname><given-names>Haichao</given-names></name><xref ref-type="aff" rid="aff-1">1</xref></contrib>
<contrib id="author-7" contrib-type="author">
<name name-style="western"><surname>Bai</surname><given-names>Wanfang</given-names></name><xref ref-type="aff" rid="aff-2">2</xref></contrib>
<contrib id="author-8" contrib-type="author">
<name name-style="western"><surname>Zhang</surname><given-names>Peng</given-names></name><xref ref-type="aff" rid="aff-3">3</xref></contrib>
<aff id="aff-1"><label>1</label><institution>School of Computer, Hubei University of Technology</institution>, <addr-line>Wuhan, 430068</addr-line>, <country>China</country></aff>
<aff id="aff-2"><label>2</label><institution>Xining Big Data Service Administration</institution>, <addr-line>Xining, 810000</addr-line>, <country>China</country></aff>
<aff id="aff-3"><label>3</label><institution>Wuhan Fiberhome Technical Services Co., Ltd.</institution>, <addr-line>Wuhan, 430205</addr-line>, <country>China</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>&#x002A;</label>Corresponding Author: Zhiwei Ye. Email: <email>hgcsyzw@hbut.edu.cn</email></corresp>
</author-notes>
<pub-date date-type="collection" publication-format="electronic">
<year>2024</year></pub-date>
<pub-date date-type="pub" publication-format="electronic"><day>15</day><month>10</month><year>2024</year></pub-date>
<volume>81</volume>
<issue>1</issue>
<fpage>1157</fpage>
<lpage>1175</lpage>
<history>
<date date-type="received">
<day>16</day>
<month>6</month>
<year>2024</year>
</date>
<date date-type="accepted">
<day>01</day>
<month>9</month>
<year>2024</year>
</date>
</history>
<permissions>
<copyright-statement>&#x00A9; 2024 The Authors.</copyright-statement>
<copyright-year>2024</copyright-year>
<copyright-holder>Published by Tech Science Press.</copyright-holder>
<license xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>This work is licensed under a <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</ext-link>, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<self-uri content-type="pdf" xlink:href="TSP_CMC_55080.pdf"></self-uri>
<abstract>
<p>The world produces vast quantities of high-dimensional, multi-semantic data, and extracting valuable information from such high-dimensional, multi-label data is undoubtedly arduous and challenging. Feature selection aims to mitigate the adverse impacts of high dimensionality in multi-label data by eliminating redundant and irrelevant features. The ant colony optimization algorithm has demonstrated encouraging outcomes in multi-label feature selection because of its simplicity, efficiency, and similarity to reinforcement learning. Nevertheless, existing methods do not consider crucial correlation information, such as dynamic redundancy and label correlation. To tackle these concerns, the paper proposes a multi-label feature selection technique based on the ant colony optimization algorithm (MFACO) that focuses on dynamic redundancy and label correlation. First, the dynamic redundancy between the selected feature subset and candidate features is assessed. Meanwhile, the ant colony optimization algorithm extracts label correlations from the label set, which are then incorporated into the heuristic factor as label weights. Experimental results demonstrate that the proposed strategies effectively enhance the search ability of the ant colony, outperforming the other algorithms compared in the paper.</p>
</abstract>
<kwd-group kwd-group-type="author">
<kwd>Multi-label feature selection</kwd>
<kwd>ant colony optimization algorithm</kwd>
<kwd>dynamic redundancy</kwd>
<kwd>high-dimensional data</kwd>
<kwd>label correlation</kwd>
</kwd-group>
<funding-group>
<award-group id="awg1">
<funding-source>National Natural Science Foundation of China</funding-source>
<award-id>62376089</award-id>
<award-id>62302153</award-id>
<award-id>62302154</award-id>
<award-id>62202147</award-id>
</award-group>
<award-group id="awg2">
<funding-source>Key Research and Development Program of Hubei Province</funding-source>
<award-id>2023BEB024</award-id>
</award-group>
</funding-group>
</article-meta>
</front>
<body>
<sec id="s1">
<label>1</label>
<title>Introduction</title>
<p>Multi-label datasets often involve intricate scenarios and learning tasks that fall outside the realm of single-label learning, thus presenting a new obstacle in machine learning. To address this, multi-label learning has been developed, expanding the association pattern from one instance corresponding to one label in single-label learning to one instance corresponding to a collection of labels [<xref ref-type="bibr" rid="ref-1">1</xref>]. This broadens the application scenarios of single-label learning and enhances the accuracy and efficiency of predicting complex multi-semantic information. Another characteristic of multi-label data is its high feature dimensionality. The discriminative features in high-dimensional data [<xref ref-type="bibr" rid="ref-2">2</xref>] are strongly correlated with numerous labels. However, the features that affect multi-label classification occupy only a portion, often a small portion, of the feature set, while the rest of the feature set is inundated with redundant and irrelevant features. These redundant or irrelevant features not only increase the computational cost of machine learning algorithms and waste valuable computing resources, but also consume significant storage space.</p>
<p>Feature selection is the direct removal of redundant and irrelevant features from the original feature set, thereby achieving a reduction of the feature set. The technique preserves the original features, leading to an improved understanding of the data and better performance of learning algorithms [<xref ref-type="bibr" rid="ref-3">3</xref>]. Feature selection is widely used in supervised machine-learning paradigms [<xref ref-type="bibr" rid="ref-4">4</xref>&#x2013;<xref ref-type="bibr" rid="ref-8">8</xref>]. In feature selection, it is crucial to obtain the optimal feature subset by finding features that are highly correlated with the labels and non-redundant with one another. Both factors must therefore be considered simultaneously to ensure that the selected features have high predictive power and some interpretability. For example, Yang et al. [<xref ref-type="bibr" rid="ref-9">9</xref>] proposed a multi-strategy assisted multi-objective WOA (MSMOWOA) to address feature selection. Ye et al. [<xref ref-type="bibr" rid="ref-10">10</xref>] proposed a novel ensemble framework for intrusion detection, which used an improved Hybrid Breeding Optimization (HBO) to address the curse of dimensionality and improve Intrusion Detection System (IDS) performance. An increasing number of evolutionary computation and swarm intelligence-based optimization algorithms are being applied to feature selection.</p>
<p>Common swarm intelligence optimization algorithms include Particle Swarm Optimization (PSO) [<xref ref-type="bibr" rid="ref-11">11</xref>], Grey Wolf Optimizer (GWO) [<xref ref-type="bibr" rid="ref-12">12</xref>], Ant Colony Optimization (ACO) [<xref ref-type="bibr" rid="ref-13">13</xref>], Whale Optimization Algorithms (WOA) [<xref ref-type="bibr" rid="ref-14">14</xref>], and Chimp Optimization Algorithm [<xref ref-type="bibr" rid="ref-15">15</xref>]. Compared to other swarm intelligence optimization algorithms that require a combination of learning algorithms to find the optimal feature subset, ACO can transform the feature selection process into a graph search where ants navigate through nodes representing features, thus offering greater flexibility. When dealing with high-dimensional data, filtered methods using only the internal structural information of the data are more efficient and do not require a combination of learning algorithms. Furthermore, these methods can be integrated into the heuristic factors of the ACO algorithm [<xref ref-type="bibr" rid="ref-16">16</xref>,<xref ref-type="bibr" rid="ref-17">17</xref>] to enhance search efficiency. Additionally, ACO can directly optimize filtered methods for multi-label feature selection [<xref ref-type="bibr" rid="ref-18">18</xref>&#x2013;<xref ref-type="bibr" rid="ref-20">20</xref>]. The ACO algorithm demonstrates commendable efficiency when optimizing filtered multi-label feature selection techniques. Nevertheless, it does not fully capitalize on the correlational information between features and labels, along with that between labels. For filtered methods, fully utilizing the information inherent in the data is crucial for identifying the optimal subset of features [<xref ref-type="bibr" rid="ref-21">21</xref>]. At the same time, the ACO algorithm struggles to pinpoint the ideal solution when dealing with high-dimensional problems. 
Consequently, it is worth further investigating how to augment the search optimization ability of ant colonies, which could, in turn, enhance the performance of multi-label feature selection.</p>
<p>Considerable practice and research have already been carried out along these lines. For example, Paniri et al. proposed a filtered technique, termed the multi-label feature selection algorithm based on ant colony optimization (MLACO) [<xref ref-type="bibr" rid="ref-18">18</xref>]. This method marked the pioneering application of a filter approach to enhance multi-label feature selection using the ACO algorithm. It conducts a search within the feature space, iteratively identifying the features most relevant to the class labels with minimal redundancy. After the introduction of MLACO, Paniri et al. advanced an approach termed Ant Colony Optimization plus Temporal Difference Reinforcement Learning (ANT-TD) [<xref ref-type="bibr" rid="ref-19">19</xref>], which cast the inherent heuristic search process of an ant colony as a Markov chain. This formulation utilized temporal difference reinforcement learning, thereby enabling the ants to learn continuously throughout the search process. Concurrent with these developments, Hatami et al. constructed a weighted feature map predicated on mutual information [<xref ref-type="bibr" rid="ref-20">20</xref>]. This work used mutual information as the cornerstone for designing an adaptive function tailored to the pheromone update process, thereby enhancing the efficacy of the feature selection process. Kakarash et al. developed a KNN graph to portray all features along with the interconnections between them [<xref ref-type="bibr" rid="ref-22">22</xref>]. Subsequently, density peak clustering is used to group similar features into one cluster. To identify the least redundant features, the probability that an ant stays within one cluster during the search is set lower than the probability of jumping to another cluster.</p>
<p>Building on this foundation, this paper uses the ACO algorithm to construct a search space based on relevance information. The feature selection process is converted into the search procedure of ACO to obtain an optimal feature subset containing discriminative features. This enhancement is designed to augment the dimensionality reduction capabilities for high-dimensional data within multi-label scenarios and to increase the efficiency of multi-label learning. The main contributions of the proposed method are as follows:</p>
<p>1) The proposed method improves ACO-based multi-label feature selection by incorporating dynamic redundancy and label correlation. Although existing ACO-based multi-label feature selection methods have achieved promising results, they ignore the dynamic redundancy between the selected feature subset and candidate features during the optimization process, as well as a key characteristic of multi-label data: label correlation. Unlike existing methods, which focus primarily on static redundancy among features, our approach addresses the redundancy that changes as the ant colony navigates, and it accounts for label correlation. The paper uses ACO to search for correlations among labels; these correlations are then transformed into label weights and incorporated into the heuristic factor. By embracing dynamic redundancy and label correlation, our enhancement strategy significantly improves the performance of ACO-based multi-label feature selection.</p>
<p>2) Experiments are performed on seven benchmark datasets from multiple fields, including text categorization and image annotation. The performance of our method is evaluated using four criteria: Hamming loss (HL), Average precision (AP), One-error (0&#x2013;1), and Micro-F1. These assessments demonstrate that our method outperforms the other feature selection methods.</p>
<p>The rest of the paper is organized as follows: <xref ref-type="sec" rid="s2">Section 2</xref> introduces multi-label classification and multi-label feature selection, elaborating on how they are combined to address the multi-label classification problem. <xref ref-type="sec" rid="s3">Section 3</xref> describes the basic ACO algorithm and highlights the principles of dynamic redundancy and label correlation. <xref ref-type="sec" rid="s4">Section 4</xref> offers a comprehensive explanation of the entire process of the proposed method. <xref ref-type="sec" rid="s5">Section 5</xref> summarizes the experimental outcomes achieved with the proposed method across seven datasets, along with a comparative analysis against existing techniques. Finally, <xref ref-type="sec" rid="s6">Section 6</xref> outlines potential directions for future research.</p>
</sec>
<sec id="s2">
<label>2</label>
<title>Related Work</title>
<p>This section provides an overview of multi-label classification and multi-label feature selection, along with a discussion of two widely used methods for tackling multi-label classification challenges.</p>
<sec id="s2_1">
<label>2.1</label>
<title>Overview of Multi-Label Classification</title>
<p>There are generally two approaches for addressing multi-label classification issues: problem transformation and algorithm adaptation. Problem transformation involves converting the multi-label classification task into multiple single-label classification tasks, allowing the use of established, robust single-label classification algorithms. For instance, the Binary Relevance (BR) method transforms the multi-label problem into a series of binary classification tasks [<xref ref-type="bibr" rid="ref-23">23</xref>]. The BR algorithm uses the entire dataset and trains a binary classifier for each label. When training the classifier for a specific label, only the information pertinent to that label is considered, and the other labels are ignored. This approach, however, has a notable disadvantage: it overlooks label correlation. Consequently, label combinations that depend on one another may not be predicted in certain datasets. In contrast, the Classifier Chains (CC) algorithm mitigates these issues, delivering improved classification performance while retaining the low time complexity of the BR algorithm [<xref ref-type="bibr" rid="ref-24">24</xref>]. The CC algorithm enhances the BR algorithm by transitioning from individually trained binary classifiers to a sequential training process. However, a notable limitation of this method is that the label correlations it relies on are difficult and complex to detect, rendering them non-trivial to ascertain. The algorithm adaptation method diverges from the problem transformation approach in its fundamental premise: rather than altering the multi-label classification problem or data, it extends existing single-label classification algorithms to accommodate and resolve multi-label classification issues directly. The advantage of this strategy is that it circumvents the potential loss of information that can occur during the transformation process. Multi-Label K-Nearest Neighbor (ML-KNN), a multi-label adaptation of the K-Nearest Neighbor (KNN) algorithm [<xref ref-type="bibr" rid="ref-25">25</xref>], is an example of this approach.</p>
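The problem-transformation idea above can be made concrete with a short sketch. The following is an illustrative Binary Relevance implementation, not the authors' code; it assumes scikit-learn's LogisticRegression as the base binary learner and a small synthetic dataset:

```python
import numpy as np
from sklearn.base import clone
from sklearn.linear_model import LogisticRegression

class BinaryRelevance:
    """BR transformation: one independent binary classifier per label."""
    def __init__(self, base_estimator):
        self.base_estimator = base_estimator

    def fit(self, X, Y):
        # Train a fresh copy of the base learner on each label column,
        # ignoring all other labels (the source of BR's main weakness).
        self.models_ = []
        for j in range(Y.shape[1]):
            m = clone(self.base_estimator)
            m.fit(X, Y[:, j])
            self.models_.append(m)
        return self

    def predict(self, X):
        # Stack the per-label binary predictions into a label matrix.
        return np.column_stack([m.predict(X) for m in self.models_])

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
Y = (X[:, :3] > 0).astype(int)   # three synthetic labels, one per feature
br = BinaryRelevance(LogisticRegression()).fit(X, Y)
print(br.predict(X[:2]).shape)   # one row per instance, one column per label
```

Classifier Chains differ only in that each successive classifier also receives the previous labels' predictions as additional input features, which is how the chain captures label correlation.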
</sec>
<sec id="s2_2">
<label>2.2</label>
<title>Multi-Label Feature Selection</title>
<p>Feature selection [<xref ref-type="bibr" rid="ref-3">3</xref>] is one of the most important data preprocessing techniques in machine learning, aimed at curating a subset of the most pertinent features from a vast pool to facilitate accurate classification. Essentially, feature selection can be regarded as a combinatorial optimization problem. It has been applied in different fields, including traveling thief problems [<xref ref-type="bibr" rid="ref-26">26</xref>], text feature selection [<xref ref-type="bibr" rid="ref-27">27</xref>], cancer classification [<xref ref-type="bibr" rid="ref-28">28</xref>], and network traffic identification [<xref ref-type="bibr" rid="ref-29">29</xref>]. Meanwhile, feature selection can be combined with different intelligent optimization algorithms to solve practical problems, such as the Greylag Goose Optimization Algorithm [<xref ref-type="bibr" rid="ref-30">30</xref>], the Hybrid Sine-Cosine Chimp Optimization Algorithm [<xref ref-type="bibr" rid="ref-31">31</xref>], and the WOA [<xref ref-type="bibr" rid="ref-32">32</xref>]. For multi-label data, the filtered subset of features enables the classifier to assign multiple labels to individual instances effectively. In feature selection, it is crucial to identify the optimal subset of features by finding those that are both highly correlated with labels and non-redundant. For example, Asilian Bidgoli et al. [<xref ref-type="bibr" rid="ref-33">33</xref>] proposed an enhanced many-objective binary version of the Non-Dominated Sorting Genetic Algorithm-III (NSGA-III), which concurrently optimized the Hamming loss, the feature subset size, the feature-label relevance, and the computational complexity of the features. Hashemi et al. [<xref ref-type="bibr" rid="ref-34">34</xref>] introduced Multi-label Graph-based Feature Selection (MGFS). They first calculated the correlation distance between features and each class label, creating the Correlation Distance Matrix (CDM). They then constructed a complete weighted feature graph by applying the Euclidean distance to the CDM. Finally, the importance of the graph nodes was determined using the weighted PageRank algorithm. Gonzalez-Lopez et al. [<xref ref-type="bibr" rid="ref-35">35</xref>] utilized two techniques, Euclidean Norm Maximization and Geometric Mean Maximization, to select features. The experiments affirmed that the proposed distributed approach effectively selected features with maximum relevance and minimum redundancy, while fully respecting predefined time constraints.</p>
</sec>
</sec>
<sec id="s3">
<label>3</label>
<title>Preliminaries</title>
<p>In this section, the application of the ACO algorithm to feature selection is briefly described.</p>
<sec id="s3_1">
<label>3.1</label>
<title>Ant Colony Optimization Algorithm for Feature Selection</title>
<p>The ACO algorithm is a widely used optimization technique that imitates the foraging behavior of real ant colonies, and in recent years it has seen increasing application in multi-label feature selection. In nature, ants aim to take the shortest path between their nest and a food source. As they travel, they deposit a special chemical substance called pheromones on the path; the pheromones gradually evaporate, leading to a decrease in their concentration along all pathways. This volatilization is a crucial process that helps the colony avoid being trapped in local minima and enhances exploration of the search space. The concentration of pheromones strongly influences the paths that ants follow, making them more likely to choose paths with higher pheromone concentrations; as a result, the colony gradually converges on the shortest path. ACO algorithms frequently convert the feature selection process into an ant colony wandering through a graph. The graph is generally built from the correlation between features and labels, together with the redundancy between features. The wandering process of the ant colony is governed primarily by the pheromone update rule and the state transition rule.</p>
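This graph-wandering view can be sketched in a few lines. The toy below is an illustration only, not the method proposed in the paper: the heuristic desirability values and the path fitness are random stand-ins, whereas the actual algorithm would use relevance and redundancy measures for both.

```python
import numpy as np

rng = np.random.default_rng(1)
n_feat, n_ants, n_iter = 8, 5, 30
rho, beta = 0.2, 1.0                    # evaporation rate, heuristic exponent
eta = rng.random(n_feat) + 0.1          # stand-in heuristic desirability per feature node
tau = np.ones(n_feat)                   # initial pheromone on every feature node

best_subset, best_score = None, -np.inf
for _ in range(n_iter):
    for _ in range(n_ants):
        subset = []
        while len(subset) < 4:                          # each ant builds a 4-feature path
            p = tau * eta ** beta                       # state transition weights
            p[subset] = 0.0                             # exclude already-visited nodes
            subset.append(int(rng.choice(n_feat, p=p / p.sum())))
        score = eta[subset].sum()                       # stand-in fitness of the path
        if score > best_score:
            best_subset, best_score = subset, score
    tau *= 1.0 - rho                                    # pheromone evaporation everywhere
    tau[best_subset] += best_score                      # reinforce the best path so far
print(sorted(best_subset))
```

Evaporation (`tau *= 1 - rho`) is what prevents early paths from dominating forever, which is the "trapped in local minima" point made above.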
</sec>
<sec id="s3_2">
<label>3.2</label>
<title>Dynamic Redundancy and Label Correlation</title>
<p>In the feature selection process of ACO, the choice of the next feature usually hinges on the redundancy between it and the previously selected one. However, this static redundancy strategy can result in poor prediction accuracy on certain datasets.</p>
<p>To evaluate the redundancy between previously chosen features and potential ones, a dynamic computational strategy is proposed, as in <xref ref-type="disp-formula" rid="eqn-1">Eq. (1)</xref>. Suppose that the <inline-formula id="ieqn-1"><mml:math id="mml-ieqn-1"><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mspace width="negativethinmathspace" /><mml:mo>&#x2212;</mml:mo><mml:mspace width="negativethinmathspace" /><mml:mrow><mml:mi>t</mml:mi><mml:mi>h</mml:mi></mml:mrow></mml:math></inline-formula> ant has currently selected a feature subset; then:
<disp-formula id="eqn-1"><label>(1)</label><mml:math id="mml-eqn-1" display="block"><mml:mi>a</mml:mi><mml:mi>v</mml:mi><mml:mi>g</mml:mi><mml:mi>s</mml:mi><mml:mi>i</mml:mi><mml:mi>m</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>m</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:munder><mml:mo movablelimits="false">&#x2211;</mml:mo><mml:mrow><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mi>z</mml:mi></mml:mrow></mml:msub><mml:mo>&#x2208;</mml:mo><mml:msubsup><mml:mi>F</mml:mi><mml:mrow><mml:mi>m</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msubsup></mml:mrow></mml:munder><mml:mi>s</mml:mi><mml:mi>i</mml:mi><mml:mi>m</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mi>z</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mfrac><mml:mo>,</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>&#x2208;</mml:mo><mml:mi>X</mml:mi><mml:mo>&#x2212;</mml:mo><mml:msubsup><mml:mi>F</mml:mi><mml:mrow><mml:mi>m</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msubsup></mml:math></disp-formula>where <inline-formula id="ieqn-2"><mml:math id="mml-ieqn-2"><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> denotes a candidate feature that has not yet been selected, <inline-formula id="ieqn-3"><mml:math id="mml-ieqn-3"><mml:mi>X</mml:mi></mml:math></inline-formula> is the original set of features, <inline-formula id="ieqn-4"><mml:math id="mml-ieqn-4"><mml:mrow><mml:mo>(</mml:mo><mml:mi>m</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> denotes the number of elements in the set <inline-formula id="ieqn-5"><mml:math id="mml-ieqn-5"><mml:msubsup><mml:mi>F</mml:mi><mml:mrow><mml:mi>m</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula>, and <inline-formula id="ieqn-6"><mml:math id="mml-ieqn-6"><mml:mrow><mml:mtext>sim</mml:mtext></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mi>z</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> represents the similarity between two features, <inline-formula id="ieqn-7"><mml:math id="mml-ieqn-7"><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula id="ieqn-8"><mml:math id="mml-ieqn-8"><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mi>z</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>. <xref ref-type="disp-formula" rid="eqn-1">Eq. (1)</xref> is inversely proportional to the average redundancy between a candidate feature and the selected features; because the selected subset grows as the ants wander, this value changes dynamically throughout the search.</p>
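A direct reading of Eq. (1) can be sketched as follows. For illustration only, the similarity measure sim is taken to be the absolute Pearson correlation between feature columns; this is an assumption, not necessarily the paper's exact choice:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))                     # 50 samples, 6 features
X[:, 1] = X[:, 0] + 0.01 * rng.normal(size=50)   # feature 1 nearly duplicates feature 0

# sim(F_j, F_z) approximated by |Pearson correlation| between feature columns.
S = np.abs(np.corrcoef(X, rowvar=False))

def avgsim(j, selected):
    """Eq. (1): (m-1) divided by the summed similarity of candidate j to the
    (m-1) already-selected features; a high value means low redundancy."""
    return len(selected) / S[j, list(selected)].sum()

selected = [0]                                    # the ant has picked feature 0 so far
candidates = [f for f in range(6) if f not in selected]
scores = {j: avgsim(j, selected) for j in candidates}
print(min(scores, key=scores.get))  # feature 1, redundant with 0, scores lowest
```

As the ant adds features to `selected`, every candidate's score is recomputed against the grown subset, which is exactly the dynamic behavior contrasted with the static pairwise strategy above.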
<p>Label correlation is paramount in multi-label learning. To improve the performance of multi-label learning through feature selection, the impact of label correlation needs to be considered. For filtered methods, information-theoretic techniques are often utilized to measure label correlation. Here, the pairwise correlations between labels are computed using symmetric uncertainty from information theory, given as <xref ref-type="disp-formula" rid="eqn-2">Eq. (2)</xref>:
<disp-formula id="eqn-2"><label>(2)</label><mml:math id="mml-eqn-2" display="block"><mml:mi>S</mml:mi><mml:mi>U</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>M</mml:mi><mml:mi>I</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>;</mml:mo><mml:mi>Y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>H</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:mi>H</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>Y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:math></disp-formula>
<disp-formula id="eqn-3"><label>(3)</label><mml:math id="mml-eqn-3" display="block"><mml:mi>H</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mo>&#x2212;</mml:mo><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:munderover><mml:mi>p</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mi>lg</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mi>p</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula>
<disp-formula id="eqn-4"><label>(4)</label><mml:math id="mml-eqn-4" display="block"><mml:mi>H</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mi>Y</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mo>&#x2212;</mml:mo><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:munderover><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>m</mml:mi></mml:mrow></mml:munderover><mml:mi>p</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mi>lg</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mi>p</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula>
<disp-formula id="eqn-5"><label>(5)</label><mml:math id="mml-eqn-5" display="block"><mml:mi>M</mml:mi><mml:mi>I</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>;</mml:mo><mml:mi>Y</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mi>H</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mi>H</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mi>Y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula></p>
<p><xref ref-type="disp-formula" rid="eqn-2">Eq. (2)</xref> calculates the pairwise correlation between labels, serving as a heuristic factor for ACO. Here, <inline-formula id="ieqn-9"><mml:math id="mml-ieqn-9"><mml:mi>H</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> denotes the entropy of the random variable <inline-formula id="ieqn-10"><mml:math id="mml-ieqn-10"><mml:mi>X</mml:mi></mml:math></inline-formula>. <inline-formula id="ieqn-11"><mml:math id="mml-ieqn-11"><mml:mi>H</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mi>Y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> denotes the conditional entropy, which measures the residual uncertainty of the random variable <inline-formula id="ieqn-12"><mml:math id="mml-ieqn-12"><mml:mrow><mml:mi>X</mml:mi></mml:mrow></mml:math></inline-formula> given <inline-formula id="ieqn-13"><mml:math id="mml-ieqn-13"><mml:mrow><mml:mi>Y</mml:mi></mml:mrow></mml:math></inline-formula>. <inline-formula id="ieqn-14"><mml:math id="mml-ieqn-14"><mml:mi>M</mml:mi><mml:mi>I</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>;</mml:mo><mml:mi>Y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> quantifies the information that one variable conveys about the other, i.e., the amount of information shared by the variables <inline-formula id="ieqn-15"><mml:math id="mml-ieqn-15"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula id="ieqn-16"><mml:math id="mml-ieqn-16"><mml:mi>Y</mml:mi></mml:math></inline-formula>. 
The term <inline-formula id="ieqn-17"><mml:math id="mml-ieqn-17"><mml:mi>p</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> represents the probability of <inline-formula id="ieqn-18"><mml:math id="mml-ieqn-18"><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>. <inline-formula id="ieqn-19"><mml:math id="mml-ieqn-19"><mml:mi>p</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is the joint probability of <inline-formula id="ieqn-20"><mml:math id="mml-ieqn-20"><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula id="ieqn-21"><mml:math id="mml-ieqn-21"><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>.</p>
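As a concrete illustration, the entropy, mutual information, and the symmetric uncertainty SU (used below as the heuristic factor) can be estimated for discrete label vectors from empirical frequencies. This is an illustrative sketch with assumed function names, not the authors' implementation:

```python
import numpy as np

def entropy(x):
    # H(X) = -sum_i p(x_i) * log2 p(x_i), from empirical frequencies
    _, counts = np.unique(x, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def mutual_information(x, y):
    # MI(X;Y) = H(X) - H(X|Y), computed here as H(X) + H(Y) - H(X,Y)
    pairs = np.stack([x, y], axis=1)
    _, counts = np.unique(pairs, axis=0, return_counts=True)
    p = counts / counts.sum()
    joint_entropy = float(-np.sum(p * np.log2(p)))
    return entropy(x) + entropy(y) - joint_entropy

def symmetric_uncertainty(x, y):
    # SU(X,Y) = 2 * MI(X;Y) / (H(X) + H(Y)), normalized to [0, 1]
    return 2 * mutual_information(x, y) / (entropy(x) + entropy(y))

x = np.array([0, 0, 1, 1])
print(symmetric_uncertainty(x, x))                    # identical labels -> 1.0
print(mutual_information(x, np.array([0, 1, 0, 1])))  # independent -> 0.0
```

Two identical label vectors share all their information (SU = 1), while independent ones share none (MI = 0).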
<p>During the search, the state transition rule is employed to construct the ant paths:
<disp-formula id="eqn-6"><label>(6)</label><mml:math id="mml-eqn-6" display="block"><mml:msubsup><mml:mi>P</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mtable columnalign="center left" rowspacing="4pt" columnspacing="1em"><mml:mtr><mml:mtd><mml:mstyle displaystyle="true" scriptlevel="0"><mml:mfrac><mml:mrow><mml:mrow><mml:mo>[</mml:mo><mml:msub><mml:mi>&#x03C4;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:msup><mml:mrow><mml:mo>[</mml:mo><mml:mi>&#x03B7;</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mrow><mml:mi>&#x03B2;</mml:mi></mml:mrow></mml:msup></mml:mrow><mml:mrow><mml:munder><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>u</mml:mi><mml:mo>&#x2208;</mml:mo><mml:msubsup><mml:mi>N</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msubsup></mml:mrow></mml:munder><mml:mrow><mml:mo>[</mml:mo><mml:msub><mml:mi>&#x03C4;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:msup><mml:mrow><mml:mo>[</mml:mo><mml:mi>&#x03B7;</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mrow><mml:mi>&#x03B2;</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>,</mml:mo></mm
l:mtd><mml:mtd><mml:mi mathvariant="normal">&#x2200;</mml:mi><mml:mi>i</mml:mi><mml:mo>&#x2208;</mml:mo><mml:msubsup><mml:mi>N</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:mi>i</mml:mi><mml:mi>f</mml:mi><mml:mspace width="thinmathspace" /><mml:mi>g</mml:mi><mml:mo>&#x003E;</mml:mo><mml:msub><mml:mi>g</mml:mi><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mn>0</mml:mn><mml:mo>,</mml:mo></mml:mtd><mml:mtd><mml:mi>o</mml:mi><mml:mi>t</mml:mi><mml:mi>h</mml:mi><mml:mi>e</mml:mi><mml:mi>r</mml:mi><mml:mi>w</mml:mi><mml:mi>i</mml:mi><mml:mi>s</mml:mi><mml:mi>e</mml:mi></mml:mtd></mml:mtr></mml:mtable><mml:mo fence="true" stretchy="true" symmetric="true"></mml:mo></mml:mrow></mml:math></disp-formula>
<disp-formula id="eqn-7"><label>(7)</label><mml:math id="mml-eqn-7" display="block"><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:munder><mml:mrow><mml:mtext>argmax</mml:mtext></mml:mrow><mml:mrow><mml:mi>u</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:munder><mml:mrow><mml:mo>{</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:msub><mml:mi>&#x03C4;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:msup><mml:mrow><mml:mo>[</mml:mo><mml:msub><mml:mi>&#x03B7;</mml:mi><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mrow><mml:mi>&#x03B2;</mml:mi></mml:mrow></mml:msup><mml:mo>}</mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:mi>i</mml:mi><mml:mi>f</mml:mi><mml:mtext>&#x00A0;</mml:mtext><mml:mi>g</mml:mi><mml:mo>&#x003C;</mml:mo><mml:msub><mml:mi>g</mml:mi><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math></disp-formula>where <inline-formula id="ieqn-22"><mml:math id="mml-ieqn-22"><mml:msub><mml:mrow><mml:mi mathvariant="normal">&#x03C4;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is the pheromone at time <inline-formula id="ieqn-23"><mml:math id="mml-ieqn-23"><mml:mi>t</mml:mi></mml:math></inline-formula>, <inline-formula id="ieqn-24"><mml:math id="mml-ieqn-24"><mml:mrow><mml:mi 
mathvariant="normal">&#x03B7;</mml:mi></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is a heuristic factor which generally consists of the correlation of features, <inline-formula id="ieqn-25"><mml:math id="mml-ieqn-25"><mml:mrow><mml:mi mathvariant="normal">&#x03B7;</mml:mi></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mtext>SU</mml:mtext></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, <inline-formula id="ieqn-26"><mml:math id="mml-ieqn-26"><mml:msubsup><mml:mi>N</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> is the set of unvisited labels of ant <inline-formula id="ieqn-27"><mml:math id="mml-ieqn-27"><mml:mi>k</mml:mi></mml:math></inline-formula>, <inline-formula id="ieqn-28"><mml:math id="mml-ieqn-28"><mml:msub><mml:mi>g</mml:mi><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> is a constant used to balance exploration and exploitation, <inline-formula id="ieqn-29"><mml:math id="mml-ieqn-29"><mml:mi>g</mml:mi></mml:math></inline-formula> is a random value. 
The parameter <inline-formula id="ieqn-30"><mml:math id="mml-ieqn-30"><mml:mi>&#x03B2;</mml:mi></mml:math></inline-formula>, distributed in <inline-formula id="ieqn-31"><mml:math id="mml-ieqn-31"><mml:mrow><mml:mo>[</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mspace width="thinmathspace" /><mml:mn>1</mml:mn><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula>, controls the trade-off between the pheromone value and the heuristic information. If <inline-formula id="ieqn-32"><mml:math id="mml-ieqn-32"><mml:mi>&#x03B2;</mml:mi><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:math></inline-formula>, the heuristic information is ignored and the ant&#x0027;s path selection depends only on the pheromone, i.e., on previously accumulated search experience. The pheromone is updated according to <xref ref-type="disp-formula" rid="eqn-8">Eq. (8)</xref>:
<disp-formula id="eqn-8"><label>(8)</label><mml:math id="mml-eqn-8" display="block"><mml:msub><mml:mi>&#x03C4;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x2212;</mml:mo><mml:mi>&#x03C1;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:msubsup><mml:mi>&#x03C4;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msubsup><mml:mo>+</mml:mo><mml:mfrac><mml:mrow><mml:mi>L</mml:mi><mml:mi>C</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>q</mml:mi></mml:mrow></mml:munderover><mml:mi>L</mml:mi><mml:mi>C</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>j</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:math></disp-formula>where <inline-formula id="ieqn-33"><mml:math id="mml-ieqn-33"><mml:mi>q</mml:mi></mml:math></inline-formula> is the number of original labels, <inline-formula id="ieqn-34"><mml:math id="mml-ieqn-34"><mml:msubsup><mml:mrow><mml:mi mathvariant="normal">&#x03C4;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> and <inline-formula id="ieqn-35"><mml:math id="mml-ieqn-35"><mml:msub><mml:mrow><mml:mi mathvariant="normal">&#x03C4;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo>+</mml:mo><mml:mn>1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> are the pheromone values of label <inline-formula id="ieqn-36"><mml:math id="mml-ieqn-36"><mml:mi>i</mml:mi></mml:math></inline-formula> at <inline-formula id="ieqn-37"><mml:math 
id="mml-ieqn-37"><mml:mi>t</mml:mi></mml:math></inline-formula> and <inline-formula id="ieqn-38"><mml:math id="mml-ieqn-38"><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo>+</mml:mo><mml:mn>1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> moments, respectively. <inline-formula id="ieqn-39"><mml:math id="mml-ieqn-39"><mml:mi>&#x03C1;</mml:mi></mml:math></inline-formula> is the pheromone evaporation parameter, and <inline-formula id="ieqn-40"><mml:math id="mml-ieqn-40"><mml:mi>L</mml:mi><mml:mi>C</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is the counter corresponding to label <inline-formula id="ieqn-41"><mml:math id="mml-ieqn-41"><mml:mi>i</mml:mi></mml:math></inline-formula>. The weight vector for label <inline-formula id="ieqn-42"><mml:math id="mml-ieqn-42"><mml:mrow><mml:mi>l</mml:mi><mml:mi>b</mml:mi></mml:mrow></mml:math></inline-formula> is obtained at the end of the ACO iterations, as shown in <xref ref-type="disp-formula" rid="eqn-9">Eq. (9)</xref>:
<disp-formula id="eqn-9"><label>(9)</label><mml:math id="mml-eqn-9" display="block"><mml:mi>l</mml:mi><mml:mi>b</mml:mi><mml:mo>=</mml:mo><mml:mi>&#x03C4;</mml:mi></mml:math></disp-formula></p>
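The update in Eq. (8) and the weight extraction in Eq. (9) can be sketched as follows. Variable names (tau, LC, rho) follow the text; the toy numbers are illustrative, not from the paper:

```python
import numpy as np

def update_label_pheromone(tau, LC, rho=0.1):
    # Eq. (8): evaporate the old pheromone, then deposit an amount
    # proportional to each label's normalized visit counter LC(i)
    return (1 - rho) * tau + LC / LC.sum()

q = 4                                  # number of labels (toy size)
tau = np.ones(q)                       # uniform initial pheromone
LC = np.array([4.0, 2.0, 1.0, 1.0])    # label counters from one iteration
tau = update_label_pheromone(tau, LC, rho=0.1)
lb = tau                               # Eq. (9): label weights = final pheromone
print(lb)
```

Labels visited more often accumulate more pheromone and therefore receive larger weights in lb.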
<fig id="fig-7">
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_55080-fig-7.tif"/>
</fig>
</sec>
</sec>
<sec id="s4">
<label>4</label>
<title>The Proposed Method</title>
<p>This paper proposes a multi-label feature selection method based on an improved ACO with dynamic redundancy and label correlation. The essence of this approach is to explore the correlation between labels using the improved ACO, convert the correlation into label weights, and integrate these weights into the heuristic factors.</p>
<sec id="s4_1">
<label>4.1</label>
<title>Relevancy Calculation</title>
<p>In this subsection, feature relevancy and feature-label relevancy are computed separately, and the label relevancy obtained from the ACO algorithm is then integrated into the feature-label relevancy. Cosine similarity and the Pearson correlation coefficient are used to calculate these correlations. Cosine similarity measures the cosine of the angle between two vectors, as shown in <xref ref-type="disp-formula" rid="eqn-10">Eq. (10)</xref>:
<disp-formula id="eqn-10"><label>(10)</label><mml:math id="mml-eqn-10" display="block"><mml:mi>cos</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mi>i</mml:mi><mml:mi>n</mml:mi><mml:mi>e</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>A</mml:mi><mml:mo>,</mml:mo><mml:mi>B</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mi>cos</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03B8;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>A</mml:mi><mml:mo>&#x22C5;</mml:mo><mml:mi>B</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mi>A</mml:mi><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mo>&#x2217;</mml:mo><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mi>B</mml:mi><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:math></disp-formula>where <inline-formula id="ieqn-56"><mml:math id="mml-ieqn-56"><mml:mi>&#x03B8;</mml:mi></mml:math></inline-formula> is the angle between the nonzero vectors <inline-formula id="ieqn-57"><mml:math id="mml-ieqn-57"><mml:mi>A</mml:mi></mml:math></inline-formula> and <inline-formula id="ieqn-58"><mml:math id="mml-ieqn-58"><mml:mi>B</mml:mi></mml:math></inline-formula>, <inline-formula id="ieqn-59"><mml:math id="mml-ieqn-59"><mml:mo stretchy="false">&#x2225;</mml:mo><mml:mspace width="negativethinmathspace" /><mml:mi>A</mml:mi><mml:mspace width="negativethinmathspace" /><mml:mo stretchy="false">&#x2225;</mml:mo></mml:math></inline-formula> and <inline-formula id="ieqn-60"><mml:math id="mml-ieqn-60"><mml:mo stretchy="false">&#x2225;</mml:mo><mml:mspace width="negativethinmathspace" /><mml:mi>B</mml:mi><mml:mspace width="negativethinmathspace" /><mml:mo 
stretchy="false">&#x2225;</mml:mo></mml:math></inline-formula> are their respective magnitude.</p>
<p>The Pearson correlation coefficient is one of the most popular similarity measures, calculated as shown in <xref ref-type="disp-formula" rid="eqn-11">Eq. (11)</xref>:
<disp-formula id="eqn-11"><label>(11)</label><mml:math id="mml-eqn-11" display="block"><mml:mi>c</mml:mi><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:mi>l</mml:mi><mml:mi>a</mml:mi><mml:mi>t</mml:mi><mml:mi>i</mml:mi><mml:mi>o</mml:mi><mml:mi>n</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>A</mml:mi><mml:mo>,</mml:mo><mml:mi>B</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo>|</mml:mo><mml:mfrac><mml:mrow><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:munderover><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>A</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x2212;</mml:mo><mml:mover><mml:mi>A</mml:mi><mml:mo accent="false">&#x00AF;</mml:mo></mml:mover><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>B</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x2212;</mml:mo><mml:mover><mml:mi>B</mml:mi><mml:mo accent="false">&#x00AF;</mml:mo></mml:mover><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:msqrt><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:munderover><mml:msup><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>A</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x2212;</mml:mo><mml:mover><mml:mi>A</mml:mi><mml:mo 
accent="false">&#x00AF;</mml:mo></mml:mover><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:msqrt><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:msqrt><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:munderover><mml:msup><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>B</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x2212;</mml:mo><mml:mover><mml:mi>B</mml:mi><mml:mo accent="false">&#x00AF;</mml:mo></mml:mover><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:msqrt><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mfrac><mml:mo>|</mml:mo></mml:mrow></mml:math></disp-formula>where <inline-formula id="ieqn-61"><mml:math id="mml-ieqn-61"><mml:mi>A</mml:mi></mml:math></inline-formula> and <inline-formula id="ieqn-62"><mml:math id="mml-ieqn-62"><mml:mi>B</mml:mi></mml:math></inline-formula> are two n-dimensional (number of instances) features vectors, <inline-formula id="ieqn-63"><mml:math id="mml-ieqn-63"><mml:mover><mml:mi>A</mml:mi><mml:mo accent="false">&#x00AF;</mml:mo></mml:mover></mml:math></inline-formula> and <inline-formula id="ieqn-64"><mml:math id="mml-ieqn-64"><mml:mover><mml:mi>B</mml:mi><mml:mo accent="false">&#x00AF;</mml:mo></mml:mover></mml:math></inline-formula> are the average of the feature in the entire dataset. If they are linearly correlated, the value of the correlation coefficient will be closer to one, otherwise if they are independent then the value is zero.</p>
<p>To separately evaluate the redundancy between features and the correlation between features and class labels, the redundancy matrix <inline-formula id="ieqn-65"><mml:math id="mml-ieqn-65"><mml:mi>f</mml:mi><mml:mi>C</mml:mi><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mi>r</mml:mi></mml:math></inline-formula> and the incidence matrix <inline-formula id="ieqn-66"><mml:math id="mml-ieqn-66"><mml:mi>f</mml:mi><mml:mi>l</mml:mi><mml:mi>C</mml:mi><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mi>r</mml:mi></mml:math></inline-formula> of dimensions <inline-formula id="ieqn-67"><mml:math id="mml-ieqn-67"><mml:mi>m</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>m</mml:mi></mml:math></inline-formula> and <inline-formula id="ieqn-68"><mml:math id="mml-ieqn-68"><mml:mi>m</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>q</mml:mi></mml:math></inline-formula>, respectively, are computed. Additionally, the label correlations obtained from ACO in the previous section are integrated into the correlation matrix. The integration process is illustrated in <xref ref-type="disp-formula" rid="eqn-12">Eq. (12)</xref>:
<disp-formula id="eqn-12"><label>(12)</label><mml:math id="mml-eqn-12" display="block"><mml:mi>f</mml:mi><mml:mi>l</mml:mi><mml:mi>C</mml:mi><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mi>r</mml:mi><mml:mo>=</mml:mo><mml:mi>f</mml:mi><mml:mi>l</mml:mi><mml:mi>C</mml:mi><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mi>r</mml:mi><mml:mo>&#x2217;</mml:mo><mml:mi>l</mml:mi><mml:mi>b</mml:mi></mml:math></disp-formula></p>
<p>The proposed approach weights the relevance between each feature and every class label by the label weights, thereby preserving and incorporating all feature-label relevance information.</p>
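Eq. (12) amounts to scaling each column of the m-by-q feature-label correlation matrix by the corresponding label weight, which NumPy broadcasting expresses in one line. The numbers below are toy values, not from the paper:

```python
import numpy as np

m, q = 3, 2                              # features x labels (toy sizes)
flCorr = np.array([[0.2, 0.4],
                   [0.6, 0.1],
                   [0.3, 0.3]])          # feature-label relevance matrix (m x q)
lb = np.array([1.4, 1.15])               # label weights from the label-level ACO
flCorr = flCorr * lb                     # Eq. (12): column j scaled by lb[j]
print(flCorr[0])                         # first feature's weighted relevances
```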
</sec>
<sec id="s4_2">
<label>4.2</label>
<title>Subset Construction</title>
<p>The construction of the subset is likewise a process of information accumulation, feedback, and iteration. The encoding scheme uses pheromones as feature weights; a feature ordering, and hence the feature subset, is obtained by sorting the pheromone values. As the ant colony traverses the features, the next feature to visit is determined by state transition rules, which encompass two strategies: a probabilistic transfer strategy and a greedy transfer strategy:
<disp-formula id="eqn-13"><label>(13)</label><mml:math id="mml-eqn-13" display="block"><mml:msubsup><mml:mi>P</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mtable columnalign="center right" rowspacing="4pt" columnspacing="1em"><mml:mtr><mml:mtd><mml:mstyle displaystyle="true" scriptlevel="0"><mml:mfrac><mml:mrow><mml:msup><mml:mrow><mml:mo>[</mml:mo><mml:msub><mml:mi>&#x03C4;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mrow><mml:mi>&#x03B1;</mml:mi></mml:mrow></mml:msup><mml:msup><mml:mrow><mml:mo>[</mml:mo><mml:msub><mml:mi>&#x03B7;</mml:mi><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:msub><mml:mi>&#x03B7;</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mrow><mml:mi>&#x03B2;</mml:mi></mml:mrow></mml:msup></mml:mrow><mml:mrow><mml:munder><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>u</mml:mi><mml:mo>&#x2208;</mml:mo><mml:msubsup><mml:mi>N</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msubsup></mml:mrow></mml:munder><mml:msup><mml:mrow><mml:mo>[</mml:mo><mml:msub><mml:mi>&#x03C4;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mrow><mml:mi>&#x03B1;</mml:mi></mml:mrow></mml:msup><mml:msup><mml:mrow>
<mml:mo>[</mml:mo><mml:msub><mml:mi>&#x03B7;</mml:mi><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:msub><mml:mi>&#x03B7;</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mrow><mml:mi>&#x03B2;</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>,</mml:mo></mml:mtd><mml:mtd><mml:mi mathvariant="normal">&#x2200;</mml:mi><mml:mi>i</mml:mi><mml:mo>&#x2208;</mml:mo><mml:msubsup><mml:mi>N</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:mi>i</mml:mi><mml:mi>f</mml:mi><mml:mspace width="thinmathspace" /><mml:mi>g</mml:mi><mml:mo>&#x003E;</mml:mo><mml:msub><mml:mi>g</mml:mi><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mn>0</mml:mn><mml:mo>,</mml:mo></mml:mtd><mml:mtd><mml:mi>o</mml:mi><mml:mi>t</mml:mi><mml:mi>h</mml:mi><mml:mi>e</mml:mi><mml:mi>r</mml:mi><mml:mi>w</mml:mi><mml:mi>i</mml:mi><mml:mi>s</mml:mi><mml:mi>e</mml:mi></mml:mtd></mml:mtr></mml:mtable><mml:mo fence="true" stretchy="true" symmetric="true"></mml:mo></mml:mrow></mml:math></disp-formula>
<disp-formula id="eqn-14"><label>(14)</label><mml:math id="mml-eqn-14" display="block"><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:munder><mml:mrow><mml:mtext>argmax</mml:mtext></mml:mrow><mml:mrow><mml:mi>u</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:munder><mml:mrow><mml:mo>{</mml:mo><mml:msup><mml:mrow><mml:mo>[</mml:mo><mml:msub><mml:mi>&#x03C4;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mrow><mml:mi>&#x03B1;</mml:mi></mml:mrow></mml:msup><mml:msup><mml:mrow><mml:mo>[</mml:mo><mml:msub><mml:mi>&#x03B7;</mml:mi><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:msub><mml:mi>&#x03B7;</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mrow><mml:mi>&#x03B2;</mml:mi></mml:mrow></mml:msup><mml:mo>}</mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:mi>i</mml:mi><mml:mi>f</mml:mi><mml:mspace width="thinmathspace" /><mml:mi>g</mml:mi><mml:mo>&#x003C;</mml:mo><mml:msub><mml:mi>g</mml:mi><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math></disp-formula>
<disp-formula id="eqn-15"><label>(15)</label><mml:math id="mml-eqn-15" display="block"><mml:msub><mml:mi>&#x03B7;</mml:mi><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>f</mml:mi><mml:mi>l</mml:mi><mml:mi>C</mml:mi><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mi>r</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2217;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mn>1</mml:mn><mml:mo>+</mml:mo><mml:mi>a</mml:mi><mml:mi>v</mml:mi><mml:mi>g</mml:mi><mml:mi>s</mml:mi><mml:mi>i</mml:mi><mml:mi>m</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:math></disp-formula>
<disp-formula id="eqn-16"><label>(16)</label><mml:math id="mml-eqn-16" display="block"><mml:msub><mml:mi>&#x03B7;</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mi>f</mml:mi><mml:mi>l</mml:mi><mml:mi>C</mml:mi><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mi>r</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula></p>
<p><disp-formula id="eqn-17"><label>(17)</label><mml:math id="mml-eqn-17" display="block"><mml:msub><mml:mi>g</mml:mi><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>g</mml:mi><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:mo>&#x2212;</mml:mo><mml:msub><mml:mi>g</mml:mi><mml:mrow><mml:mi>e</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x00D7;</mml:mo><mml:msup><mml:mrow><mml:mo>(</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x2212;</mml:mo><mml:mfrac><mml:mi>k</mml:mi><mml:mrow><mml:mi>N</mml:mi><mml:mi>F</mml:mi></mml:mrow></mml:mfrac><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:mi>r</mml:mi></mml:mrow></mml:msup><mml:mo>+</mml:mo><mml:msub><mml:mi>g</mml:mi><mml:mrow><mml:mi>e</mml:mi></mml:mrow></mml:msub></mml:math></disp-formula>where <inline-formula id="ieqn-69"><mml:math id="mml-ieqn-69"><mml:msubsup><mml:mi>N</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> is the set of features that have not been visited by ant <inline-formula id="ieqn-70"><mml:math id="mml-ieqn-70"><mml:mi>k</mml:mi></mml:math></inline-formula>; <inline-formula id="ieqn-71"><mml:math id="mml-ieqn-71"><mml:msub><mml:mi>&#x03C4;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is the pheromone value of feature <inline-formula id="ieqn-72"><mml:math id="mml-ieqn-72"><mml:mi>i</mml:mi></mml:math></inline-formula> in the <inline-formula id="ieqn-73"><mml:math id="mml-ieqn-73"><mml:mi>t</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mi>t</mml:mi><mml:mi>h</mml:mi></mml:math></inline-formula> generation; the parameter <inline-formula id="ieqn-74"><mml:math id="mml-ieqn-74"><mml:mrow><mml:mtext>g</mml:mtext></mml:mrow></mml:math></inline-formula> is a random variable 
uniformly distributed in <inline-formula id="ieqn-75"><mml:math id="mml-ieqn-75"><mml:mrow><mml:mo>[</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula>, and <inline-formula id="ieqn-76"><mml:math id="mml-ieqn-76"><mml:msub><mml:mi>g</mml:mi><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>&#x2208;</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula> is a threshold that adapts over the iterations; <inline-formula id="ieqn-77"><mml:math id="mml-ieqn-77"><mml:mi>N</mml:mi><mml:mi>F</mml:mi></mml:math></inline-formula> is the number of features that need to be selected by the ant in each iteration; <inline-formula id="ieqn-78"><mml:math id="mml-ieqn-78"><mml:mi>r</mml:mi></mml:math></inline-formula> is the parameter controlling the decay rate. Through the probabilistic transfer rule defined in <xref ref-type="disp-formula" rid="eqn-13">Eq. (13)</xref>, the ant may select any not-yet-visited feature with a certain probability; through the greedy rule in <xref ref-type="disp-formula" rid="eqn-14">Eq. (14)</xref>, the ant selects, among the unvisited features, the one with the lowest static and dynamic redundancy and the highest feature-label relevance. The number of times each feature is visited is stored in a vector called the feature counter (FC); each time an ant visits a feature, the corresponding FC entry is incremented by one. At the end of each iteration, <xref ref-type="disp-formula" rid="eqn-18">Eq. (18)</xref> updates the amount of pheromone for each feature.</p>
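The adaptive threshold of Eq. (17), and the resulting choice between the greedy rule (Eq. (14)) and the probabilistic rule (Eq. (13)), can be sketched as follows. Interpreting k as the number of features the ant has already selected, and taking g_s = 0.9 and g_e = 0.1 as assumed start and end values (the paper does not fix them here), along with the hypothetical helper name choose_strategy:

```python
import numpy as np

def adaptive_g0(k, NF, g_s=0.9, g_e=0.1, r=2.0):
    # Eq. (17): g0 moves from g_s toward g_e as the ant picks its
    # k-th of NF features; r controls the decay rate
    return (g_s - g_e) * (1 - k / NF) ** r + g_e

def choose_strategy(g0, rng):
    # g < g0 -> greedy rule (Eq. (14)); otherwise probabilistic rule (Eq. (13))
    g = rng.random()
    return "greedy" if g < g0 else "probabilistic"

rng = np.random.default_rng(0)
print(adaptive_g0(0, 10))    # start of the walk: g0 equals g_s
print(adaptive_g0(10, 10))   # last pick: g0 equals g_e
print(choose_strategy(adaptive_g0(0, 10), rng))
```

With these assumed endpoints, exploitation dominates early in each walk and exploration becomes more likely as the subset fills up.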
<p><disp-formula id="eqn-18"><label>(18)</label><mml:math id="mml-eqn-18" display="block"><mml:msub><mml:mi>&#x03C4;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x2212;</mml:mo><mml:mi>&#x03C1;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:msubsup><mml:mi>&#x03C4;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msubsup><mml:mo>+</mml:mo><mml:mfrac><mml:mrow><mml:mi>F</mml:mi><mml:mi>C</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>m</mml:mi></mml:mrow></mml:munderover><mml:mi>F</mml:mi><mml:mi>C</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>j</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:math></disp-formula>where <inline-formula id="ieqn-79"><mml:math id="mml-ieqn-79"><mml:mi>m</mml:mi></mml:math></inline-formula> is the number of features, <inline-formula id="ieqn-80"><mml:math id="mml-ieqn-80"><mml:msubsup><mml:mrow><mml:mi mathvariant="normal">&#x03C4;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> and <inline-formula id="ieqn-81"><mml:math id="mml-ieqn-81"><mml:msub><mml:mrow><mml:mi mathvariant="normal">&#x03C4;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo>+</mml:mo><mml:mn>1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> are the pheromone values of feature <inline-formula id="ieqn-82"><mml:math id="mml-ieqn-82"><mml:mi>i</mml:mi></mml:math></inline-formula> at <inline-formula id="ieqn-83"><mml:math 
id="mml-ieqn-83"><mml:mi>t</mml:mi></mml:math></inline-formula> and <inline-formula id="ieqn-84"><mml:math id="mml-ieqn-84"><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> moments. Respectively, <inline-formula id="ieqn-85"><mml:math id="mml-ieqn-85"><mml:mi>&#x03C1;</mml:mi></mml:math></inline-formula> is the pheromone evaporation parameter, and <inline-formula id="ieqn-86"><mml:math id="mml-ieqn-86"><mml:mi>F</mml:mi><mml:mi>C</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is the counter corresponding to the tag <inline-formula id="ieqn-87"><mml:math id="mml-ieqn-87"><mml:mi>i</mml:mi></mml:math></inline-formula>.</p>
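The update of Eq. (18) translates directly into code; the sketch below uses illustrative numbers of our own choosing for the pheromone vector and visit counts.

```python
import numpy as np

def update_pheromone(tau, FC, rho):
    """Eq. (18): evaporate the old pheromone, then deposit each
    feature's share of the ants' visits this iteration."""
    return (1.0 - rho) * tau + FC / FC.sum()

tau = np.array([0.5, 0.5, 0.5, 0.5])     # pheromone at time t
FC = np.array([4.0, 2.0, 2.0, 0.0])      # visit counts accumulated this iteration
tau_next = update_pheromone(tau, FC, rho=0.1)
# evaporation leaves 0.45 everywhere; the deposits are 0.5, 0.25, 0.25, 0.0
```

Frequently visited features thus accumulate pheromone faster than evaporation removes it, which is how the pheromone vector comes to encode feature weights.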
<p>The ant colony selects features and accumulates pheromone values using the aforementioned strategy. The pheromone values correspond to the feature weights. Algorithm 2 provides a detailed description of the proposed method. The specific flow chart is shown in <xref ref-type="fig" rid="fig-1">Fig. 1</xref>. The computational complexity of the proposed method is analyzed as follows. Firstly, the label weight <inline-formula id="ieqn-88"><mml:math id="mml-ieqn-88"><mml:mrow><mml:mtext>lb</mml:mtext></mml:mrow></mml:math></inline-formula> is obtained by Algorithm 1. The computational complexity of the uncertainty matrix <inline-formula id="ieqn-89"><mml:math id="mml-ieqn-89"><mml:mi>S</mml:mi><mml:mi>U</mml:mi></mml:math></inline-formula> is <inline-formula id="ieqn-90"><mml:math id="mml-ieqn-90"><mml:mi>O</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msup><mml:mi>q</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, and that of the main cycle is <inline-formula id="ieqn-91"><mml:math id="mml-ieqn-91"><mml:mi>O</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>n</mml:mi><mml:mi>C</mml:mi><mml:mi>y</mml:mi><mml:mi>c</mml:mi><mml:mi>l</mml:mi><mml:mi>e</mml:mi><mml:mo>&#x2217;</mml:mo><mml:mi>n</mml:mi><mml:mi>A</mml:mi><mml:mi>n</mml:mi><mml:mi>t</mml:mi><mml:mo>&#x2217;</mml:mo><mml:mi>N</mml:mi><mml:mi>F</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. 
Then, the feature-label correlation matrix <inline-formula id="ieqn-92"><mml:math id="mml-ieqn-92"><mml:mi>f</mml:mi><mml:mi>l</mml:mi><mml:mi>C</mml:mi><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mi>r</mml:mi></mml:math></inline-formula> is obtained with computational complexity <inline-formula id="ieqn-93"><mml:math id="mml-ieqn-93"><mml:mi>O</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msup><mml:mi>m</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mo>+</mml:mo><mml:mi>m</mml:mi><mml:mi>q</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. Therefore, the total computational complexity of Algorithm 2 is <inline-formula id="ieqn-94"><mml:math id="mml-ieqn-94"><mml:mi>O</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msup><mml:mi>q</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mo>+</mml:mo><mml:msup><mml:mi>m</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mo>+</mml:mo><mml:mi>m</mml:mi><mml:mi>q</mml:mi><mml:mo>+</mml:mo><mml:mi>n</mml:mi><mml:mi>C</mml:mi><mml:mi>y</mml:mi><mml:mi>c</mml:mi><mml:mi>l</mml:mi><mml:mi>e</mml:mi><mml:mo>&#x2217;</mml:mo><mml:mi>n</mml:mi><mml:mi>A</mml:mi><mml:mi>n</mml:mi><mml:mi>t</mml:mi><mml:mo>&#x2217;</mml:mo><mml:mi>N</mml:mi><mml:mi>F</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>.</p>
<fig id="fig-1">
<label>Figure 1</label>
<caption>
<title>Algorithm 2 flow chart</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_55080-fig-1.tif"/>
</fig>
<fig id="fig-8">
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_55080-fig-8.tif"/>
</fig>
</sec>
</sec>
<sec id="s5">
<label>5</label>
<title>Experiment Results and Discussions</title>
<p>In this section, the efficacy of the proposed method is rigorously evaluated on seven multi-label datasets against five filter methods. The results are then presented in detail, together with an extensive analysis and interpretation of the findings.</p>
<sec id="s5_1">
<label>5.1</label>
<title>Datasets</title>
<p>In this work, comprehensive experiments are carried out on authentic public datasets from various domains, including images, text, and biology, as shown in <xref ref-type="table" rid="table-1">Table 1</xref>. The table lists the number of instances, the number of features, the number of labels, the cardinality (average number of labels per instance), the density, and the domain of each dataset.</p>
<table-wrap id="table-1">
<label>Table 1</label>
<caption>
<title>Multiple domain datasets</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th>Dataset</th>
<th>Instances</th>
<th>Features</th>
<th>Labels</th>
<th>Cardinality</th>
<th>Density</th>
<th>Domain</th>
</tr>
</thead>
<tbody>
<tr>
<td>Scene</td>
<td>2407</td>
<td>294</td>
<td>6</td>
<td>1.074</td>
<td>0.234</td>
<td>Image</td>
</tr>
<tr>
<td>Enron</td>
<td>1702</td>
<td>1001</td>
<td>53</td>
<td>3.378</td>
<td>0.442</td>
<td>Text</td>
</tr>
<tr>
<td>Flags</td>
<td>194</td>
<td>19</td>
<td>7</td>
<td>3.392</td>
<td>0.442</td>
<td>Image</td>
</tr>
<tr>
<td>Emotions</td>
<td>593</td>
<td>72</td>
<td>6</td>
<td>1.868</td>
<td>0.422</td>
<td>Music</td>
</tr>
<tr>
<td>Chess</td>
<td>1657</td>
<td>585</td>
<td>227</td>
<td>2.411</td>
<td>0.644</td>
<td>Text</td>
</tr>
<tr>
<td>Image</td>
<td>2000</td>
<td>284</td>
<td>5</td>
<td>1.236</td>
<td>0.625</td>
<td>Image</td>
</tr>
<tr>
<td>Yeast</td>
<td>2417</td>
<td>103</td>
<td>14</td>
<td>4.237</td>
<td>0.082</td>
<td>Biology</td>
</tr>
</tbody>
</table>
</table-wrap>
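The cardinality and density columns of Table 1 follow the standard multi-label definitions and can be computed from a binary label matrix; the toy matrix below is our own illustrative example, not data from the benchmark sets.

```python
import numpy as np

def label_stats(Y):
    """Cardinality: average number of labels per instance.
    Density: cardinality divided by the number of labels."""
    Y = np.asarray(Y, dtype=float)
    n, q = Y.shape
    cardinality = Y.sum() / n
    return cardinality, cardinality / q

Y = np.array([[1, 0, 0],     # toy label matrix: 4 instances, 3 labels
              [1, 1, 0],
              [0, 1, 1],
              [1, 0, 0]])
card, dens = label_stats(Y)  # card = 6/4 = 1.5, dens = 1.5/3 = 0.5
```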
</sec>
<sec id="s5_2">
<label>5.2</label>
<title>Experimental Settings</title>
<p>The study presents an analysis and discussion of the seven publicly accessible multi-label datasets that were utilized, as well as the five filter methods that were compared. The performance of all methods is evaluated using four multi-label classification evaluation metrics. The paper compares the proposed method with MLACO [<xref ref-type="bibr" rid="ref-18">18</xref>], MCLS [<xref ref-type="bibr" rid="ref-36">36</xref>], ELA-Chi [<xref ref-type="bibr" rid="ref-37">37</xref>], PPT-Chi [<xref ref-type="bibr" rid="ref-37">37</xref>], and PPT-MI [<xref ref-type="bibr" rid="ref-37">37</xref>]. MLACO is based on a graph framework and adopts ACO. MCLS uses a manifold learning transformation and constraints on the label space. PPT-Chi, PPT-MI, and ELA-Chi are classic problem-transformation methods based on the chi-square statistic and mutual information.</p>
<p>The parameters for the comparison methods are specified as follows. In the MLACO [<xref ref-type="bibr" rid="ref-18">18</xref>] algorithm, the maximum number of iterations is set to 40 <inline-formula id="ieqn-112"><mml:math id="mml-ieqn-112"><mml:mo stretchy="false">(</mml:mo><mml:mi>n</mml:mi><mml:mi>C</mml:mi><mml:mi>y</mml:mi><mml:mi>c</mml:mi><mml:mi>l</mml:mi><mml:mi>e</mml:mi><mml:mo>=</mml:mo><mml:mn>40</mml:mn></mml:math></inline-formula>), the pheromone decay rate to 0.1 <inline-formula id="ieqn-113"><mml:math id="mml-ieqn-113"><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03C1;</mml:mi><mml:mo>=</mml:mo><mml:mn>0.1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, and the number of ants to 25 <inline-formula id="ieqn-114"><mml:math id="mml-ieqn-114"><mml:mrow><mml:mo>(</mml:mo><mml:mi>n</mml:mi><mml:mi>A</mml:mi><mml:mi>n</mml:mi><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn>25</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. The number of features selected by each ant in every iteration is set to half the original number of features, <inline-formula id="ieqn-115"><mml:math id="mml-ieqn-115"><mml:mi>N</mml:mi><mml:mi>F</mml:mi><mml:mo>=</mml:mo><mml:mstyle displaystyle="true" scriptlevel="0"><mml:mfrac><mml:mi>m</mml:mi><mml:mn>2</mml:mn></mml:mfrac></mml:mstyle></mml:math></inline-formula>. For <inline-formula id="ieqn-116"><mml:math id="mml-ieqn-116"><mml:msub><mml:mi>g</mml:mi><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula>, the start value, termination value, and polynomial decay power are set to 1, 0, and 0.7, respectively, and the parameter <inline-formula id="ieqn-117"><mml:math id="mml-ieqn-117"><mml:mi>&#x03B2;</mml:mi></mml:math></inline-formula> is set to 1. The settings of the proposed method are consistent with MLACO. 
The number of neighbors, the unique parameter of the Manifold-based Constraint Laplacian Score (MCLS) [<xref ref-type="bibr" rid="ref-36">36</xref>], is set to 5. The pruning threshold for Pruned Problem Transformation-Chi (PPT-Chi) [<xref ref-type="bibr" rid="ref-37">37</xref>] and Pruned Problem Transformation-Mutual Information (PPT-MI) [<xref ref-type="bibr" rid="ref-37">37</xref>] is set to 6. Entropy-based Label Assignment-Chi (ELA-Chi) [<xref ref-type="bibr" rid="ref-37">37</xref>] requires no parameter settings. The experiments use the lazy-learning classifier ML-KNN [<xref ref-type="bibr" rid="ref-25">25</xref>], which is often used to assess the performance of multi-label feature selection methods, with the number of neighbors set to 10. Four multi-label evaluation metrics are used: Hamming loss, One-error, Average-precision, and Micro-F1. Smaller values for Hamming loss and One-error are better, and larger values for Average-precision and Micro-F1 are better. For each dataset and algorithm, the average of 20 independent runs of these evaluation metrics is reported in all graphs and tables. Additionally, the dataset is partitioned using 5-fold cross-validation. The feature subsets are obtained by ranking the features produced by each method. 
In order to evaluate the efficiency of the feature selection methods, we first select the most representative <inline-formula id="ieqn-118"><mml:math id="mml-ieqn-118"><mml:mi>d</mml:mi></mml:math></inline-formula> features with each method as the final feature subset, where <inline-formula id="ieqn-119"><mml:math id="mml-ieqn-119"><mml:mi>d</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mn>10</mml:mn><mml:mo>,</mml:mo><mml:mn>20</mml:mn><mml:mo>,</mml:mo><mml:mn>30</mml:mn><mml:mo>,</mml:mo><mml:mn>40</mml:mn><mml:mo>,</mml:mo><mml:mn>50</mml:mn><mml:mo>,</mml:mo><mml:mn>60</mml:mn><mml:mo>,</mml:mo><mml:mn>70</mml:mn><mml:mo>,</mml:mo><mml:mn>80</mml:mn><mml:mo>,</mml:mo><mml:mn>90</mml:mn><mml:mo>,</mml:mo><mml:mn>100</mml:mn><mml:mo>}</mml:mo></mml:mrow></mml:math></inline-formula>, and then classify using these selected feature subsets. Since the feature dimensions of the Flags and Emotions datasets are less than 100, we instead set <inline-formula id="ieqn-120"><mml:math id="mml-ieqn-120"><mml:mi>d</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mn>2</mml:mn><mml:mo>,</mml:mo><mml:mn>4</mml:mn><mml:mo>,</mml:mo><mml:mn>6</mml:mn><mml:mo>,</mml:mo><mml:mn>8</mml:mn><mml:mo>,</mml:mo><mml:mn>10</mml:mn><mml:mo>,</mml:mo><mml:mn>12</mml:mn><mml:mo>,</mml:mo><mml:mn>16</mml:mn><mml:mo>,</mml:mo><mml:mn>18</mml:mn><mml:mo>}</mml:mo></mml:mrow></mml:math></inline-formula> and <inline-formula id="ieqn-121"><mml:math id="mml-ieqn-121"><mml:mi>d</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mn>10</mml:mn><mml:mo>,</mml:mo><mml:mn>20</mml:mn><mml:mo>,</mml:mo><mml:mn>30</mml:mn><mml:mo>,</mml:mo><mml:mn>40</mml:mn><mml:mo>,</mml:mo><mml:mn>50</mml:mn><mml:mo>,</mml:mo><mml:mn>60</mml:mn><mml:mo>,</mml:mo><mml:mn>70</mml:mn><mml:mo>}</mml:mo></mml:mrow></mml:math></inline-formula>, respectively, on these two datasets. 
To further analyze the performance of each method, the top 30 features ranked by each method are selected as the feature subset for multi-label learning.</p>
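The evaluation protocol above (keep the top-d ranked features, then cross-validate) can be outlined as follows. This is a simplified sketch under our own assumptions: a per-label majority-vote stub stands in for the ML-KNN classifier, and the toy data and function names are illustrative.

```python
import numpy as np

def hamming_loss(Y_true, Y_pred):
    """Fraction of instance-label pairs that are misclassified."""
    return float(np.mean(Y_true != Y_pred))

def cv_top_d(X, Y, ranking, d, n_folds=5, seed=0):
    """Keep the top-d ranked features, then estimate Hamming loss
    with n_folds-fold cross-validation. A majority-vote stub stands
    in for the classifier (the paper uses ML-KNN with 10 neighbors)."""
    X_sel = X[:, ranking[:d]]   # selected feature block (ML-KNN would be fit on this)
    idx = np.random.default_rng(seed).permutation(len(Y))
    losses = []
    for fold in np.array_split(idx, n_folds):
        train = np.setdiff1d(idx, fold)
        majority = (Y[train].mean(axis=0) >= 0.5).astype(int)   # per-label majority
        Y_pred = np.tile(majority, (len(fold), 1))
        losses.append(hamming_loss(Y[fold], Y_pred))
    return float(np.mean(losses))

rng = np.random.default_rng(1)
X = rng.random((50, 20))                      # toy data: 50 instances, 20 features
Y = (rng.random((50, 4)) > 0.7).astype(int)   # 4 sparse labels
loss = cv_top_d(X, Y, ranking=np.arange(20), d=10)
```

In the reported experiments this loop would be repeated 20 times with independent permutations and the metric values averaged, as described above.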
</sec>
<sec id="s5_3">
<label>5.3</label>
<title>Results and Discussion</title>
<p><xref ref-type="fig" rid="fig-2">Figs. 2</xref>&#x2013;<xref ref-type="fig" rid="fig-5">5</xref> illustrate the performance variance of all methods on the seven datasets with different sizes of feature subsets. Based on the results, the following observations can be made. From the results shown in <xref ref-type="fig" rid="fig-2">Fig. 2a</xref>, it can be seen that the proposed method outperforms all other methods, achieving the best results. Furthermore, with an increasing number of features, the proposed method exhibits remarkable stability and consistently surpasses the compared algorithms. <xref ref-type="fig" rid="fig-2">Fig. 2b</xref> highlights that the performance metrics, particularly One-error and Average-precision, remain stable with the incremental increase in features. Despite the observed fluctuations and a decline in performance on the Hamming loss and Micro-F1 measures, the proposed method consistently maintains a considerable advantage over the compared approaches. The observations from <xref ref-type="fig" rid="fig-3">Fig. 3a</xref> reveal that all methods exhibit their highest variability on the Flags dataset, which can be attributed to its inherently low feature dimensionality. Additionally, it is discernible that an increase in the number of features does not consistently enhance the classification performance of the algorithms. Analysis across the four evaluation metrics indicates that the performance of the proposed method falls short of the compared methods only when dealing with feature subsets of limited size. <xref ref-type="fig" rid="fig-3">Fig. 3b</xref> presents the results on the Emotions dataset. The overall trend is the same as on the Scene dataset, with the difference that the gap between the other methods and the proposed method is more pronounced, especially on the Hamming loss measure.</p>
<fig id="fig-2">
<label>Figure 2</label>
<caption>
<title>Results of four evaluation indicators on Scene and Enron datasets</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_55080-fig-2.tif"/>
</fig><fig id="fig-3">
<label>Figure 3</label>
<caption>
<title>Results of four evaluation indicators on Flags and Emotions datasets</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_55080-fig-3.tif"/>
</fig><fig id="fig-4">
<label>Figure 4</label>
<caption>
<title>Results of four evaluation indicators on Chess and Yeast datasets</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_55080-fig-4.tif"/>
</fig><fig id="fig-5">
<label>Figure 5</label>
<caption>
<title>Results of four evaluation indicators on Image dataset</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_55080-fig-5.tif"/>
</fig>
<p>As depicted in <xref ref-type="fig" rid="fig-4">Fig. 4a</xref>, the performance of the proposed method is inferior to that of PPT-MI for feature counts below 60. Beyond the threshold of 60 features, the proposed method asserts its superiority across the evaluation metrics. This observation underscores the nuanced efficacy of the proposed method in relation to feature quantity and its comparative advantage across a spectrum of evaluation criteria. According to the results in <xref ref-type="fig" rid="fig-4">Fig. 4b</xref>, the proposed method does not match MCLS in terms of One-error and Average-precision. Nevertheless, it remains competitive relative to the other evaluated methods, such as MLACO. Moreover, for most other evaluation metrics, the proposed method outperforms all compared counterparts. The noticeable decrease in the effectiveness of the proposed method, especially on the Yeast dataset, can likely be attributed to the minimal correlation between labels within that dataset. From <xref ref-type="fig" rid="fig-5">Fig. 5</xref>, it can be seen that the performance of the proposed method improves monotonically for relatively small feature subset sizes. Although it is not as stable as the other methods as the feature subset size increases, it remains highly competitive in terms of classification error rate and accuracy. From the overall trend, it can be seen that with the increasing size of the feature subset, the classification error of the proposed method consistently decreases while the classification accuracy continues to rise. This improvement sets the proposed method apart from the compared methods, demonstrating its superior performance.</p>
<p>To further demonstrate the effectiveness of the proposed method, a quantitative analysis is employed using a feature subset composed of the top 30 features derived from the feature ranking outcomes. The rankings of each method on the various datasets are presented in <xref ref-type="table" rid="table-2">Tables 2</xref>&#x2013;<xref ref-type="table" rid="table-5">5</xref>, with AvgRank signifying the mean ranking of each method over all datasets. Boldface indicates the best performance, and the numbers in parentheses indicate the relative ranking of the six algorithms on each evaluation metric for each dataset. Across the four tables, the proposed method consistently occupies the top position, the only exception being the Chess dataset. On all other datasets, the proposed method performs significantly better than the other methods. Additionally, the proposed method attains the best AvgRank, an aggregate performance measure across all datasets. On the four evaluation metrics across the seven datasets, the method achieves average improvements of 6.81%, 18.01%, 6.33%, and 25.14%, respectively, compared with MLACO. These improvements indicate that dynamic redundancy and label correlation significantly enhance the search capability of the ACO algorithm. Therefore, the proposed method achieves highly competitive performance compared with all the competing methods.</p>
<table-wrap id="table-2">
<label>Table 2</label>
<caption>
<title>Hamming loss for 6 filtering methods on datasets</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th align="center" colspan="7">Dataset</th>
</tr>
<tr>
<th>Method</th>
<th>MFACO</th>
<th>MLACO</th>
<th>MCLS</th>
<th>ELA-Chi</th>
<th>PPT-Chi</th>
<th>PPT-MI</th>
</tr>
</thead>
<tbody>
<tr>
<td>Scene</td>
<td><bold>0.1182(1)</bold></td>
<td>0.1220(2)</td>
<td>0.1681(4)</td>
<td>0.1698(6)</td>
<td>0.1691(5)</td>
<td>0.1577(3)</td>
</tr>
<tr>
<td>Enron</td>
<td><bold>0.0535(1)</bold></td>
<td>0.0567(4)</td>
<td>0.0595(5)</td>
<td>0.0602(6)</td>
<td>0.0563(3)</td>
<td>0.0558(2)</td>
</tr>
<tr>
<td>Flags</td>
<td><bold>0.2820(1)</bold></td>
<td>0.3130(3)</td>
<td>0.3135(4)</td>
<td>0.3080(2)</td>
<td>0.3432(6)</td>
<td>0.3395(5)</td>
</tr>
<tr>
<td>Emotions</td>
<td><bold>0.2103(1)</bold></td>
<td>0.2368(3)</td>
<td>0.2424(6)</td>
<td>0.2374(4)</td>
<td>0.2399(5)</td>
<td>0.2254(2)</td>
</tr>
<tr>
<td>Chess</td>
<td>0.0090(2)</td>
<td>0.0099(3)</td>
<td>0.0101(4)</td>
<td>0.0103(5)</td>
<td>0.0105(6)</td>
<td><bold>0.0071(1)</bold></td>
</tr>
<tr>
<td>Yeast</td>
<td><bold>0.2021(1)</bold></td>
<td>0.2071(3)</td>
<td>0.2093(4)</td>
<td>0.2112(6)</td>
<td>0.2110(5)</td>
<td>0.2047(2)</td>
</tr>
<tr>
<td>Image</td>
<td><bold>0.1875(1)</bold></td>
<td>0.2015(2)</td>
<td>0.2328(6)</td>
<td>0.2271(5)</td>
<td>0.2250(4)</td>
<td>0.2250(3)</td>
</tr>
<tr>
<td>Avg Rank</td>
<td>1.14</td>
<td>2.86</td>
<td>4.71</td>
<td>4.86</td>
<td>4.86</td>
<td>2.57</td>
</tr>
</tbody>
</table>
</table-wrap><table-wrap id="table-3">
<label>Table 3</label>
<caption>
<title>One-error for 6 filtering methods on datasets</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th colspan="7" align="center">Dataset</th>
</tr>
<tr>
<th>Method</th>
<th>MFACO</th>
<th>MLACO</th>
<th>MCLS</th>
<th>ELA-Chi</th>
<th>PPT-Chi</th>
<th>PPT-MI</th>
</tr>
</thead>
<tbody>
<tr>
<td>Scene</td>
<td><bold>0.3339(1)</bold></td>
<td>0.3481(2)</td>
<td>0.5675(6)</td>
<td>0.5491(5)</td>
<td>0.5392(4)</td>
<td>0.5052(3)</td>
</tr>
<tr>
<td>Enron</td>
<td><bold>0.2851(1)</bold></td>
<td>0.3578(3)</td>
<td>0.3985(5)</td>
<td>0.4029(6)</td>
<td>0.3647(4)</td>
<td>0.3103(2)</td>
</tr>
<tr>
<td>Flags</td>
<td><bold>0.1711(1)</bold></td>
<td>0.2982(5)</td>
<td>0.2632(4)</td>
<td>0.1842(2)</td>
<td>0.2368(3)</td>
<td>0.3158(6)</td>
</tr>
<tr>
<td>Emotions</td>
<td><bold>0.2532(1)</bold></td>
<td>0.3264(3)</td>
<td>0.3407(6)</td>
<td>0.3293(4)</td>
<td>0.3327(5)</td>
<td>0.3063(2)</td>
</tr>
<tr>
<td>Chess</td>
<td>0.4779(2)</td>
<td>0.6209(3)</td>
<td>0.6552(4)</td>
<td>0.7256(5)</td>
<td>0.7391(6)</td>
<td><bold>0.3238(1)</bold></td>
</tr>
<tr>
<td>Yeast</td>
<td><bold>0.2277(1)</bold></td>
<td>0.2426(5)</td>
<td>0.2340(3)</td>
<td>0.2443(6)</td>
<td>0.2422(4)</td>
<td>0.2298(2)</td>
</tr>
<tr>
<td>Image</td>
<td><bold>0.3750(1)</bold></td>
<td>0.4054(2)</td>
<td>0.5189(6)</td>
<td>0.4933(5)</td>
<td>0.4814(3)</td>
<td>0.4816(4)</td>
</tr>
<tr>
<td>Avg Rank</td>
<td>1.14</td>
<td>3.29</td>
<td>4.86</td>
<td>4.71</td>
<td>4.14</td>
<td>2.86</td>
</tr>
</tbody>
</table>
</table-wrap><table-wrap id="table-4">
<label>Table 4</label>
<caption>
<title>Average-precision for 6 filtering methods on datasets</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th colspan="7" align="center">Dataset</th>
</tr>
<tr>
<th>Method</th>
<th>MFACO</th>
<th>MLACO</th>
<th>MCLS</th>
<th>ELA-Chi</th>
<th>PPT-Chi</th>
<th>PPT-MI</th>
</tr>
</thead>
<tbody>
<tr>
<td>Scene</td>
<td><bold>0.8000(1)</bold></td>
<td>0.7913(2)</td>
<td>0.6326(6)</td>
<td>0.6418(5)</td>
<td>0.6540(4)</td>
<td>0.6844(3)</td>
</tr>
<tr>
<td>Enron</td>
<td><bold>0.6383(1)</bold></td>
<td>0.6029(3)</td>
<td>0.5806(4)</td>
<td>0.5485(6)</td>
<td>0.5769(5)</td>
<td>0.6105(2)</td>
</tr>
<tr>
<td>Flags</td>
<td><bold>0.8158(1)</bold></td>
<td>0.7797(5)</td>
<td>0.7887(3)</td>
<td>0.8110(2)</td>
<td>0.7883(4)</td>
<td>0.7595(6)</td>
</tr>
<tr>
<td>Emotions</td>
<td><bold>0.8087(1)</bold></td>
<td>0.7620(3)</td>
<td>0.7534(5)</td>
<td>0.7579(4)</td>
<td>0.7533(6)</td>
<td>0.7742(2)</td>
</tr>
<tr>
<td>Chess</td>
<td>0.3863(2)</td>
<td>0.3149(3)</td>
<td>0.3076(4)</td>
<td>0.2562(5)</td>
<td>0.2323(6)</td>
<td><bold>0.4989(1)</bold></td>
</tr>
<tr>
<td>Yeast</td>
<td><bold>0.7580(1)</bold></td>
<td>0.7455(4)</td>
<td>0.7518(2)</td>
<td>0.7402(5)</td>
<td>0.7354(6)</td>
<td>0.7488(3)</td>
</tr>
<tr>
<td>Image</td>
<td><bold>0.7554(1)</bold></td>
<td>0.7385(2)</td>
<td>0.6601(6)</td>
<td>0.6818(5)</td>
<td>0.6870(4)</td>
<td>0.6940(3)</td>
</tr>
<tr>
<td>Avg Rank</td>
<td>1.14</td>
<td>3.14</td>
<td>4.29</td>
<td>4.57</td>
<td>5.00</td>
<td>2.86</td>
</tr>
</tbody>
</table>
</table-wrap><table-wrap id="table-5">
<label>Table 5</label>
<caption>
<title>Micro-F1 for 6 filtering methods on datasets</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th colspan="7" align="center">Dataset</th>
</tr>
<tr>
<th>Method</th>
<th>MFACO</th>
<th>MLACO</th>
<th>MCLS</th>
<th>ELA-Chi</th>
<th>PPT-Chi</th>
<th>PPT-MI</th>
</tr>
</thead>
<tbody>
<tr>
<td>Scene</td>
<td><bold>0.6084(1)</bold></td>
<td>0.5873(2)</td>
<td>0.2281(6)</td>
<td>0.2376(5)</td>
<td>0.2730(4)</td>
<td>0.3422(3)</td>
</tr>
<tr>
<td>Enron</td>
<td><bold>0.4886(1)</bold></td>
<td>0.4022(4)</td>
<td>0.3331(6)</td>
<td>0.3384(5)</td>
<td>0.4226(3)</td>
<td>0.4235(2)</td>
</tr>
<tr>
<td>Flags</td>
<td><bold>0.7206(1)</bold></td>
<td>0.6786(2)</td>
<td>0.6653(3)</td>
<td>0.6570(5)</td>
<td>0.6263(6)</td>
<td>0.6617(4)</td>
</tr>
<tr>
<td>Emotions</td>
<td><bold>0.5918(1)</bold></td>
<td>0.5271(4)</td>
<td>0.5096(6)</td>
<td>0.5315(3)</td>
<td>0.5256(5)</td>
<td>0.5617(2)</td>
</tr>
<tr>
<td>Chess</td>
<td>0.2645(2)</td>
<td>0.1210(3)</td>
<td>0.0845(4)</td>
<td>0.0709(5)</td>
<td>0.0050(6)</td>
<td><bold>0.4989(1)</bold></td>
</tr>
<tr>
<td>Yeast</td>
<td><bold>0.6262(1)</bold></td>
<td>0.6084(4)</td>
<td>0.6068(5)</td>
<td>0.5873(6)</td>
<td>0.6102(3)</td>
<td>0.6131(2)</td>
</tr>
<tr>
<td>Image</td>
<td><bold>0.5033(1)</bold></td>
<td>0.4536(2)</td>
<td>0.2772(6)</td>
<td>0.3135(5)</td>
<td>0.3343(3)</td>
<td>0.3190(4)</td>
</tr>
<tr>
<td>Avg Rank</td>
<td>1.14</td>
<td>3.00</td>
<td>5.14</td>
<td>4.86</td>
<td>4.29</td>
<td>2.57</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="s5_4">
<label>5.4</label>
<title>Statistical Verification</title>
<p>Based on the previous experimental results, the aim is to compare and analyze the performance of the implemented multi-label feature selection methods and to further evaluate the superiority of the proposed method. In this subsection, two statistical tests commonly used for multi-label feature selection, namely the Friedman test and the post-hoc Nemenyi test, are used to verify the significance of the differences among the methods.</p>
<p>First, the Friedman test is applied to the results of the multi-label feature selection methods. If the calculated Friedman statistic <inline-formula id="ieqn-122"><mml:math id="mml-ieqn-122"><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mi>F</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> is greater than the critical value, the assumption that all methods perform equally is rejected at the significance level <inline-formula id="ieqn-123"><mml:math id="mml-ieqn-123"><mml:mi>&#x03B1;</mml:mi></mml:math></inline-formula>. In this test, the corresponding critical value is 2.534. The Friedman statistics and the critical value for the four evaluation metrics are presented in <xref ref-type="table" rid="table-6">Table 6</xref>.</p>
<table-wrap id="table-6">
<label>Table 6</label>
<caption>
<title>Friedman statistics and critical values for four evaluation metrics</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th>Evaluation metrics</th>
<th><inline-formula id="ieqn-124"><mml:math id="mml-ieqn-124"><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mi>F</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula></th>
<th>Critical value <inline-formula id="ieqn-125"><mml:math id="mml-ieqn-125"><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03B1;</mml:mi><mml:mo>=</mml:mo><mml:mn>0.05</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula></th>
</tr>
</thead>
<tbody>
<tr>
<td>Hamming loss</td>
<td>13.0556</td>
<td rowspan="4">2.534</td>
</tr>
<tr>
<td>One-error</td>
<td>7.5395</td>
</tr>
<tr>
<td>Average-precision</td>
<td>8.2127</td>
</tr>
<tr>
<td>Micro-F1</td>
<td>12.5072</td>
</tr>
</tbody>
</table>
</table-wrap>
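The statistics in Table 6 are consistent with the Iman-Davenport form of the Friedman test computed from the per-dataset rank table (k = 6 methods, N = 7 datasets); feeding the sketch below the Hamming-loss ranks from Table 2 reproduces the tabulated 13.0556. The function name is ours.

```python
import numpy as np

def friedman_statistic(ranks):
    """ranks: N x k matrix of method ranks per dataset. Returns the
    Iman-Davenport F_F statistic, which follows an F-distribution
    with (k - 1, (k - 1)(N - 1)) degrees of freedom."""
    N, k = ranks.shape
    R = ranks.mean(axis=0)                                   # average rank per method
    chi2 = 12 * N / (k * (k + 1)) * (np.sum(R ** 2) - k * (k + 1) ** 2 / 4)
    return (N - 1) * chi2 / (N * (k - 1) - chi2)

# Rank rows from Table 2 (Hamming loss), methods in the table's column order
ranks = np.array([
    [1, 2, 4, 6, 5, 3],   # Scene
    [1, 4, 5, 6, 3, 2],   # Enron
    [1, 3, 4, 2, 6, 5],   # Flags
    [1, 3, 6, 4, 5, 2],   # Emotions
    [2, 3, 4, 5, 6, 1],   # Chess
    [1, 3, 4, 6, 5, 2],   # Yeast
    [1, 2, 6, 5, 4, 3],   # Image
], dtype=float)
FF = friedman_statistic(ranks)   # ~13.0556, matching Table 6
```

The critical value 2.534 quoted above is the F-distribution quantile for (5, 30) degrees of freedom at the 0.05 level.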
<p>The results in <xref ref-type="table" rid="table-6">Table 6</xref> indicate that the Friedman statistics for all four evaluation metrics are significantly higher than the critical value, thus rejecting the null hypothesis. The statistical differences between the methods are also compared visually using the post-hoc Nemenyi test. The smaller the mean ranking, the better the performance. The mean ranking difference between two methods is compared with the critical distance; if the difference is greater than the critical distance, there is a significant difference between the two methods. The critical distance is indicated by a solid red line. The critical difference (CD) is calculated as shown in <xref ref-type="disp-formula" rid="eqn-19">Eq. (19)</xref>:</p>

<p><disp-formula id="eqn-19"><label>(19)</label><mml:math id="mml-eqn-19" display="block"><mml:mi>C</mml:mi><mml:mi>D</mml:mi><mml:mo>=</mml:mo><mml:msub><mml:mi>q</mml:mi><mml:mrow><mml:mi>a</mml:mi></mml:mrow></mml:msub><mml:msqrt><mml:mfrac><mml:mrow><mml:mi>k</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>k</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mn>6</mml:mn><mml:mi>N</mml:mi></mml:mrow></mml:mfrac></mml:msqrt></mml:math></disp-formula>where <inline-formula id="ieqn-126"><mml:math id="mml-ieqn-126"><mml:msub><mml:mi>q</mml:mi><mml:mrow><mml:mi>a</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>2.850</mml:mn></mml:math></inline-formula> is the tabulated parameter at the significance level <inline-formula id="ieqn-127"><mml:math id="mml-ieqn-127"><mml:mi>&#x03B1;</mml:mi></mml:math></inline-formula> of 0.05 for six methods, k is the number of methods (6), and N is the number of datasets (7); the critical distance <inline-formula id="ieqn-128"><mml:math id="mml-ieqn-128"><mml:mi>C</mml:mi><mml:mi>D</mml:mi><mml:mo>=</mml:mo><mml:mn>2.850</mml:mn></mml:math></inline-formula> is then calculated.</p>
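Eq. (19) is a one-liner in code. Note that with k = 6 methods and N = 7 datasets, k(k + 1)/(6N) = 42/42 = 1, so the critical distance equals the tabulated q value directly; taking the standard Nemenyi table value q = 2.850 for six methods at the 0.05 level reproduces the CD = 2.850 quoted in the text. The function name is ours.

```python
import math

def nemenyi_cd(q_alpha, k, n_datasets):
    """Critical distance of the post-hoc Nemenyi test, Eq. (19)."""
    return q_alpha * math.sqrt(k * (k + 1) / (6 * n_datasets))

# q_alpha = 2.850 is the tabulated value for k = 6 methods at alpha = 0.05
cd = nemenyi_cd(2.850, k=6, n_datasets=7)   # = 2.850, since the radicand is 1
```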
<p>From <xref ref-type="fig" rid="fig-6">Fig. 6</xref>, it can be seen that the performance difference between the implemented method and MCLS, ELA-Chi, and PPT-Chi is the most significant. Although the differences from PPT-MI and MLACO fall within the critical distance and are therefore not significant, the proposed method ranks first, while MLACO ranks lower on average. Taken together, the proposed method is strongly competitive and gains a clear performance improvement over MLACO. This verifies the effectiveness of incorporating dynamic redundancy and label correlation as correlation information.</p>
<fig id="fig-6">
<label>Figure 6</label>
<caption>
<title>CD map of the four evaluation indicators</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_55080-fig-6.tif"/>
</fig>
</sec>
</sec>
<sec id="s6">
<label>6</label>
<title>Conclusion</title>
<p>In this paper, a multi-label feature selection method is implemented that enhances ACO through the incorporation of dynamic redundancy and label correlation. Firstly, the concept of dynamic redundancy is proposed to ensure that the ant colony attends to changes in the redundancy of the selected feature subset as it traverses the feature space, aiming to yield the lowest possible redundancy in the selected feature set. Secondly, the label relevance matrix is optimized separately by the ant colony to obtain the label weights, and the label weights are then integrated into the feature-label relevance, which both incorporates the label relevance and retains all of the feature-label relevance. These two strategies enhance the optimization ability of the ant colony, enabling it to capture discriminative features and obtain the optimal feature subset. The proposed method is compared with filter methods, other state-of-the-art methods, and classical methods that also use ACO. It demonstrates superior performance across four evaluation metrics on seven datasets, testifying that the search efficacy of ACO has been enhanced and that the method successfully pinpoints discriminative features.</p>
</sec>
</body>
<back>
<ack><p>The authors would like to thank the anonymous reviewers and the editor for their valuable suggestions, which greatly contributed to the improved quality of this article.</p>
</ack>
<sec><title>Funding Statement</title>
<p>This research was supported by the National Natural Science Foundation of China (Grant Nos. 62376089, 62302153, 62302154, 62202147) and the Key Research and Development Program of Hubei Province, China (Grant No. 2023BEB024).</p>
</sec>
<sec><title>Author Contributions</title>
<p>The authors confirm contribution to the paper as follows: study conception and design: Ting Cai, Chun Ye, Zhiwei Ye, Ziyuan Chen; data collection: Chun Ye, Zhiwei Ye, Ziyuan Chen, Mengqing Mei; analysis and interpretation of results: Mengqing Mei, Haichao Zhang, Wanfang Bai, Peng Zhang; draft manuscript preparation: Ting Cai, Chun Ye, Zhiwei Ye, Haichao Zhang. All authors reviewed the results and approved the final version of the manuscript.</p>
</sec>
<sec sec-type="data-availability"><title>Availability of Data and Materials</title>
<p>The data that support the findings of this study are openly available in the Mulan database at the link <ext-link ext-link-type="uri" xlink:href="https://mulan.sourceforge.net/datasets-mlc.html">https://mulan.sourceforge.net/datasets-mlc.html</ext-link> (accessed on 31 August 2024) and the multi-label classification database at <ext-link ext-link-type="uri" xlink:href="http://www.uco.es/kdis/mllresources/">http://www.uco.es/kdis/mllresources/</ext-link> (accessed on 31 August 2024).</p>
</sec>
<sec><title>Ethics Approval</title>
<p>Not applicable.</p>
</sec>
<sec sec-type="COI-statement"><title>Conflicts of Interest</title>
<p>The authors declare that they have no conflicts of interest to report regarding the present study.</p>
</sec>
<ref-list content-type="authoryear">
<title>References</title>
<ref id="ref-1"><label>[1]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>W.</given-names> <surname>Qian</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Huang</surname></string-name>, <string-name><given-names>F.</given-names> <surname>Xu</surname></string-name>, <string-name><given-names>W.</given-names> <surname>Shu</surname></string-name>, and <string-name><given-names>W.</given-names> <surname>Ding</surname></string-name></person-group>, &#x201C;<article-title>A survey on multi-label feature selection from perspectives of label fusion</article-title>,&#x201D; <source>Inf. Fusion</source>, vol. <volume>100</volume>, <year>2023, Art. no. 101948</year>. doi: <pub-id pub-id-type="doi">10.1016/j.inffus.2023.101948</pub-id>.</mixed-citation></ref>
<ref id="ref-2"><label>[2]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>Y.</given-names> <surname>Zhang</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Wu</surname></string-name>, <string-name><given-names>Z.</given-names> <surname>Cai</surname></string-name>, and <string-name><given-names>P. S.</given-names> <surname>Yu</surname></string-name></person-group>, &#x201C;<article-title>Multi-view multi-label learning with sparse feature selection for image annotation</article-title>,&#x201D; <source>IEEE Trans. Multimedia</source>, vol. <volume>22</volume>, no. <issue>11</issue>, pp. <fpage>2844</fpage>&#x2013;<lpage>2857</lpage>, <year>Nov. 2020</year>. doi: <pub-id pub-id-type="doi">10.1109/TMM.2020.2966887</pub-id>.</mixed-citation></ref>
<ref id="ref-3"><label>[3]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>M.</given-names> <surname>Nssibi</surname></string-name>, <string-name><given-names>G.</given-names> <surname>Manita</surname></string-name>, and <string-name><given-names>O.</given-names> <surname>Korbaa</surname></string-name></person-group>, &#x201C;<article-title>Advances in nature-inspired metaheuristic optimization for feature selection problem: A comprehensive survey</article-title>,&#x201D; <source>Comput. Sci. Rev.</source>, vol. <volume>49</volume>, <year>2023, Art. no. 100559</year>. doi: <pub-id pub-id-type="doi">10.1016/j.cosrev.2023.100559</pub-id>.</mixed-citation></ref>
<ref id="ref-4"><label>[4]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>B.</given-names> <surname>Abdollahzadeh</surname></string-name> and <string-name><given-names>F. S.</given-names> <surname>Gharehchopogh</surname></string-name></person-group>, &#x201C;<article-title>A multi-objective optimization algorithm for feature selection problems</article-title>,&#x201D; <source>Eng. Comput.</source>, vol. <volume>38</volume>, pp. <fpage>1845</fpage>&#x2013;<lpage>1863</lpage>, <year>2022</year>. doi: <pub-id pub-id-type="doi">10.1007/s00366-021-01369-9</pub-id>.</mixed-citation></ref>
<ref id="ref-5"><label>[5]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>F.</given-names> <surname>Saberi-Movahed</surname></string-name> <etal>et al.</etal></person-group>, &#x201C;<article-title>Dual regularized unsupervised feature selection based on matrix factorization and minimum redundancy with application in gene selection</article-title>,&#x201D; <source>Knowl.-Based Syst.</source>, vol. <volume>256</volume>, <year>2022, Art. no. 109884</year>. doi: <pub-id pub-id-type="doi">10.1016/j.knosys.2022.109884</pub-id>.</mixed-citation></ref>
<ref id="ref-6"><label>[6]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>W.</given-names> <surname>Shu</surname></string-name>, <string-name><given-names>Z.</given-names> <surname>Yan</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Yu</surname></string-name>, and <string-name><given-names>W.</given-names> <surname>Qian</surname></string-name></person-group>, &#x201C;<article-title>Information gain-based semi-supervised feature selection for hybrid data</article-title>,&#x201D; <source>Appl. Intell.</source>, vol. <volume>53</volume>, no. <issue>6</issue>, pp. <fpage>7310</fpage>&#x2013;<lpage>7325</lpage>, <year>2023</year>. doi: <pub-id pub-id-type="doi">10.1007/s10489-022-03770-3</pub-id>.</mixed-citation></ref>
<ref id="ref-7"><label>[7]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>Y.</given-names> <surname>Wang</surname></string-name>, <string-name><given-names>H.</given-names> <surname>Tian</surname></string-name>, <string-name><given-names>T.</given-names> <surname>Li</surname></string-name>, and <string-name><given-names>X.</given-names> <surname>Liu</surname></string-name></person-group>, &#x201C;<article-title>A two-stage clonal selection algorithm for local feature selection on high-dimensional data</article-title>,&#x201D; <source>Inf. Sci.</source>, vol. <volume>677</volume>, <year>2024, Art. no. 120867</year>. doi: <pub-id pub-id-type="doi">10.1016/j.ins.2024.120867</pub-id>.</mixed-citation></ref>
<ref id="ref-8"><label>[8]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>J.</given-names> <surname>Wang</surname></string-name>, <string-name><given-names>P.</given-names> <surname>Xu</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Ji</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Li</surname></string-name>, and <string-name><given-names>W.</given-names> <surname>Lu</surname></string-name></person-group>, &#x201C;<article-title>MIC-SHAP: An ensemble feature selection method for materials machine learning</article-title>,&#x201D; <source>Mater. Today Commun.</source>, vol. <volume>37</volume>, <year>2023, Art. no. 106910</year>. doi: <pub-id pub-id-type="doi">10.1016/j.mtcomm.2023.106910</pub-id>.</mixed-citation></ref>
<ref id="ref-9"><label>[9]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>D.</given-names> <surname>Yang</surname></string-name>, <string-name><given-names>C.</given-names> <surname>Zhou</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Wei</surname></string-name>, <string-name><given-names>Z.</given-names> <surname>Chen</surname></string-name>, and <string-name><given-names>Z.</given-names> <surname>Zhang</surname></string-name></person-group>, &#x201C;<article-title>Multi-strategy assisted multi-objective whale optimization algorithm for feature selection</article-title>,&#x201D; <source>Comput. Model. Eng. Sci.</source>, vol. <volume>140</volume>, no. <issue>2</issue>, pp. <fpage>1563</fpage>&#x2013;<lpage>1593</lpage>, <year>2024</year>. doi: <pub-id pub-id-type="doi">10.32604/cmes.2024.048049</pub-id>.</mixed-citation></ref>
<ref id="ref-10"><label>[10]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>Z.</given-names> <surname>Ye</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Luo</surname></string-name>, <string-name><given-names>W.</given-names> <surname>Zhou</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Wang</surname></string-name>, and <string-name><given-names>Q.</given-names> <surname>He</surname></string-name></person-group>, &#x201C;<article-title>An ensemble framework with improved hybrid breeding optimization-based feature selection for intrusion detection</article-title>,&#x201D; <source>Future Gener. Comput. Syst.</source>, vol. <volume>151</volume>, no. <issue>3</issue>, pp. <fpage>124</fpage>&#x2013;<lpage>136</lpage>, <year>2023</year>. doi: <pub-id pub-id-type="doi">10.1016/j.future.2023.09.035</pub-id>.</mixed-citation></ref>
<ref id="ref-11"><label>[11]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>Y.</given-names> <surname>Tang</surname></string-name>, <string-name><given-names>K.</given-names> <surname>Huang</surname></string-name>, <string-name><given-names>Z.</given-names> <surname>Tan</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Fang</surname></string-name>, and <string-name><given-names>H.</given-names> <surname>Huang</surname></string-name></person-group>, &#x201C;<article-title>Multi-subswarm cooperative particle swarm optimization algorithm and its application</article-title>,&#x201D; <source>Inf. Sci.</source>, vol. <volume>677</volume>, <year>2024, Art. no. 120887</year>. doi: <pub-id pub-id-type="doi">10.1016/j.ins.2024.120887</pub-id>.</mixed-citation></ref>
<ref id="ref-12"><label>[12]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>Q.</given-names> <surname>Yang</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Liu</surname></string-name>, <string-name><given-names>Z.</given-names> <surname>Wu</surname></string-name>, and <string-name><given-names>S.</given-names> <surname>He</surname></string-name></person-group>, &#x201C;<article-title>A fusion algorithm based on whale and grey wolf optimization algorithm for solving real-world optimization problems</article-title>,&#x201D; <source>Appl. Soft Comput.</source>, vol. <volume>146</volume>, <year>2023, Art. no. 110701</year>. doi: <pub-id pub-id-type="doi">10.1016/j.asoc.2023.110701</pub-id>.</mixed-citation></ref>
<ref id="ref-13"><label>[13]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>A. Z.</given-names> <surname>Ye</surname></string-name> <etal>et al.</etal></person-group>, &#x201C;<article-title>High-dimensional feature selection based on improved binary ant colony optimization combined with hybrid rice optimization algorithm</article-title>,&#x201D; <source>Int. J. Intell. Syst.</source>, vol. <volume>1</volume>, pp. <fpage>1</fpage>&#x2013;<lpage>27</lpage>, <year>2023</year>. doi: <pub-id pub-id-type="doi">10.1155/2023/1444938</pub-id>.</mixed-citation></ref>
<ref id="ref-14"><label>[14]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>M.</given-names> <surname>Alzaqebah</surname></string-name> <etal>et al.</etal></person-group>, &#x201C;<article-title>Improved whale optimization with local-search method for feature selection</article-title>,&#x201D; <source>Comput. Mater. Contin.</source>, vol. <volume>75</volume>, no. <issue>1</issue>, pp. <fpage>1371</fpage>&#x2013;<lpage>1389</lpage>, <year>2023</year>. doi: <pub-id pub-id-type="doi">10.32604/cmc.2023.033509</pub-id>.</mixed-citation></ref>
<ref id="ref-15"><label>[15]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>C.</given-names> <surname>Chandrashekar</surname></string-name>, <string-name><given-names>P.</given-names> <surname>Krishnadoss</surname></string-name>, <string-name><given-names>V. K.</given-names> <surname>Poornachary</surname></string-name>, and <string-name><given-names>B.</given-names> <surname>Ananthakrishnan</surname></string-name></person-group>, &#x201C;<article-title>MCWOA Scheduler: Modified chimp-whale optimization algorithm for task scheduling in cloud computing</article-title>,&#x201D; <source>Comput. Mater. Contin.</source>, vol. <volume>78</volume>, no. <issue>2</issue>, pp. <fpage>2593</fpage>&#x2013;<lpage>2616</lpage>, <year>2024</year>. doi: <pub-id pub-id-type="doi">10.32604/cmc.2024.046304</pub-id>.</mixed-citation></ref>
<ref id="ref-16"><label>[16]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>M.</given-names> <surname>Ghosh</surname></string-name>, <string-name><given-names>R.</given-names> <surname>Guha</surname></string-name>, <string-name><given-names>R.</given-names> <surname>Sarkar</surname></string-name>, and <string-name><given-names>A.</given-names> <surname>Abraham</surname></string-name></person-group>, &#x201C;<article-title>A wrapper-filter feature selection technique based on ant colony optimization</article-title>,&#x201D; <source>Neural Comput. Appl.</source>, vol. <volume>32</volume>, no. <issue>12</issue>, pp. <fpage>7839</fpage>&#x2013;<lpage>7857</lpage>, <year>2020</year>. doi: <pub-id pub-id-type="doi">10.1007/s00521-019-04171-3</pub-id>.</mixed-citation></ref>
<ref id="ref-17"><label>[17]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>W.</given-names> <surname>Ma</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Zhou</surname></string-name>, <string-name><given-names>H.</given-names> <surname>Zhu</surname></string-name>, <string-name><given-names>L.</given-names> <surname>Li</surname></string-name>, and <string-name><given-names>L.</given-names> <surname>Jiao</surname></string-name></person-group>, &#x201C;<article-title>A two-stage hybrid ant colony optimization for high-dimensional feature selection</article-title>,&#x201D; <source>Pattern Recognit.</source>, vol. <volume>116</volume>, <year>2021, Art. no. 107933</year>. doi: <pub-id pub-id-type="doi">10.1016/j.patcog.2021.107933</pub-id>.</mixed-citation></ref>
<ref id="ref-18"><label>[18]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>M.</given-names> <surname>Paniri</surname></string-name>, <string-name><given-names>M. B.</given-names> <surname>Dowlatshahi</surname></string-name>, and <string-name><given-names>H.</given-names> <surname>Nezamabadi-Pour</surname></string-name></person-group>, &#x201C;<article-title>MLACO: A multi-label feature selection algorithm based on ant colony optimization</article-title>,&#x201D; <source>Knowl.-Based Syst.</source>, vol. <volume>192</volume>, <year>2020, Art. no. 105285</year>. doi: <pub-id pub-id-type="doi">10.1016/j.knosys.2019.105285</pub-id>.</mixed-citation></ref>
<ref id="ref-19"><label>[19]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>M.</given-names> <surname>Paniri</surname></string-name>, <string-name><given-names>M. B.</given-names> <surname>Dowlatshahi</surname></string-name>, and <string-name><given-names>H.</given-names> <surname>Nezamabadi-pour</surname></string-name></person-group>, &#x201C;<article-title>Ant-TD: Ant colony optimization plus temporal difference reinforcement learning for multi-label feature selection</article-title>,&#x201D; <source>Swarm Evol. Comput.</source>, vol. <volume>64</volume>, <year>2021, Art. no. 100892</year>. doi: <pub-id pub-id-type="doi">10.1016/j.swevo.2021.100892</pub-id>.</mixed-citation></ref>
<ref id="ref-20"><label>[20]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>M.</given-names> <surname>Hatami</surname></string-name>, <string-name><given-names>S. R.</given-names> <surname>Mahmood</surname></string-name>, and <string-name><given-names>P.</given-names> <surname>Moradi</surname></string-name></person-group>, &#x201C;<article-title>A graph-based multi-label feature selection using ant colony optimization</article-title>,&#x201D; in <conf-name>10th Int. Symp. Telecommun. (IST)</conf-name>, <publisher-loc>Tehran, Iran</publisher-loc>, <year>2020</year>, pp. <fpage>175</fpage>&#x2013;<lpage>180</lpage>.</mixed-citation></ref>
<ref id="ref-21"><label>[21]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>A. K.</given-names> <surname>Shukla</surname></string-name>, <string-name><given-names>P.</given-names> <surname>Singh</surname></string-name>, and <string-name><given-names>M.</given-names> <surname>Vardhan</surname></string-name></person-group>, &#x201C;<article-title>A hybrid framework for optimal feature subset selection</article-title>,&#x201D; <source>J. Intell. Fuzzy Syst.</source>, vol. <volume>36</volume>, no. <issue>3</issue>, pp. <fpage>2247</fpage>&#x2013;<lpage>2259</lpage>, <year>2019</year>. doi: <pub-id pub-id-type="doi">10.3233/JIFS-169936</pub-id>.</mixed-citation></ref>
<ref id="ref-22"><label>[22]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>Z. A.</given-names> <surname>Kakarash</surname></string-name>, <string-name><given-names>F.</given-names> <surname>Mardukhia</surname></string-name>, and <string-name><given-names>P.</given-names> <surname>Moradi</surname></string-name></person-group>, &#x201C;<article-title>Multi-label feature selection using density-based graph clustering and ant colony optimization</article-title>,&#x201D; <source>J. Comput. Des. Eng.</source>, vol. <volume>10</volume>, no. <issue>1</issue>, pp. <fpage>122</fpage>&#x2013;<lpage>138</lpage>, <year>2023</year>. doi: <pub-id pub-id-type="doi">10.1093/jcde/qwac120</pub-id>.</mixed-citation></ref>
<ref id="ref-23"><label>[23]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>O.</given-names> <surname>Luaces</surname></string-name>, <string-name><given-names>J.</given-names> <surname>D&#x00ED;ez</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Barranquero</surname></string-name>, <string-name><given-names>J. J.</given-names> <surname>del Coz</surname></string-name>, and <string-name><given-names>A.</given-names> <surname>Bahamonde</surname></string-name></person-group>, &#x201C;<article-title>Binary relevance efficacy for multilabel classification</article-title>,&#x201D; <source>Prog. Artif. Intell.</source>, vol. <volume>1</volume>, no. <issue>4</issue>, pp. <fpage>303</fpage>&#x2013;<lpage>313</lpage>, <year>2012</year>. doi: <pub-id pub-id-type="doi">10.1007/s13748-012-0030-x</pub-id>.</mixed-citation></ref>
<ref id="ref-24"><label>[24]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>J.</given-names> <surname>Read</surname></string-name>, <string-name><given-names>B.</given-names> <surname>Pfahringer</surname></string-name>, <string-name><given-names>G.</given-names> <surname>Holmes</surname></string-name>, and <string-name><given-names>E.</given-names> <surname>Frank</surname></string-name></person-group>, &#x201C;<article-title>Classifier chains for multi-label classification</article-title>,&#x201D; <source>Mach. Learn.</source>, vol. <volume>85</volume>, no. <issue>3</issue>, pp. <fpage>333</fpage>&#x2013;<lpage>359</lpage>, <year>2011</year>. doi: <pub-id pub-id-type="doi">10.1007/s10994-011-5256-5</pub-id>.</mixed-citation></ref>
<ref id="ref-25"><label>[25]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>M. L.</given-names> <surname>Zhang</surname></string-name> and <string-name><given-names>Z. H.</given-names> <surname>Zhou</surname></string-name></person-group>, &#x201C;<article-title>ML-KNN: A lazy learning approach to multi-label learning</article-title>,&#x201D; <source>Pattern Recognit.</source>, vol. <volume>40</volume>, no. <issue>7</issue>, pp. <fpage>2038</fpage>&#x2013;<lpage>2048</lpage>, <year>2007</year>. doi: <pub-id pub-id-type="doi">10.1016/j.patcog.2006.12.019</pub-id>.</mixed-citation></ref>
<ref id="ref-26"><label>[26]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>A.</given-names> <surname>Nikfarjam</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Neumann</surname></string-name>, and <string-name><given-names>F.</given-names> <surname>Neumann</surname></string-name></person-group>, &#x201C;<article-title>Evolutionary diversity optimisation for the traveling thief problem</article-title>,&#x201D; in <conf-name>Proc. Genet. Evol. Comput. Conf.</conf-name>, <publisher-loc>New York, NY, USA</publisher-loc>, <year>Jul. 2022</year>, pp. <fpage>749</fpage>&#x2013;<lpage>756</lpage>.</mixed-citation></ref>
<ref id="ref-27"><label>[27]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>H.</given-names> <surname>Chen</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Zhou</surname></string-name>, and <string-name><given-names>D.</given-names> <surname>Shi</surname></string-name></person-group>, &#x201C;<article-title>A chaotic antlion optimization algorithm for text feature selection</article-title>,&#x201D; <source>Int. J. Comput. Intell. Syst.</source>, vol. <volume>15</volume>, no. <issue>1</issue>, <year>2022</year>, <comment>Art. no. 41</comment>. doi: <pub-id pub-id-type="doi">10.1007/s44196-022-00094-5</pub-id>.</mixed-citation></ref>
<ref id="ref-28"><label>[28]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>H.</given-names> <surname>AlShamlan</surname></string-name> and <string-name><given-names>H.</given-names> <surname>AlMazrua</surname></string-name></person-group>, &#x201C;<article-title>Enhancing cancer classification through a hybrid bio-inspired evolutionary algorithm for biomarker gene selection</article-title>,&#x201D; <source>Comput. Mater. Contin.</source>, vol. <volume>79</volume>, no. <issue>1</issue>, pp. <fpage>675</fpage>&#x2013;<lpage>694</lpage>, <year>2024</year>. doi: <pub-id pub-id-type="doi">10.32604/cmc.2024.048146</pub-id>.</mixed-citation></ref>
<ref id="ref-29"><label>[29]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>Q.</given-names> <surname>Wu</surname></string-name>, <string-name><given-names>H.</given-names> <surname>Xu</surname></string-name>, and <string-name><given-names>M.</given-names> <surname>Liu</surname></string-name></person-group>, &#x201C;<article-title>Applying an improved dung beetle optimizer algorithm to network traffic identification</article-title>,&#x201D; <source>Comput. Mater. Contin.</source>, vol. <volume>78</volume>, no. <issue>3</issue>, pp. <fpage>4091</fpage>&#x2013;<lpage>4107</lpage>, <year>2024</year>. doi: <pub-id pub-id-type="doi">10.32604/cmc.2024.048461</pub-id>.</mixed-citation></ref>
<ref id="ref-30"><label>[30]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>E. S. M.</given-names> <surname>El-Kenawy</surname></string-name>, <string-name><given-names>N.</given-names> <surname>Khodadadi</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Mirjalili</surname></string-name>, <string-name><given-names>A. A.</given-names> <surname>Abdelhamid</surname></string-name>, <string-name><given-names>M. M.</given-names> <surname>Eid</surname></string-name>, and <string-name><given-names>A.</given-names> <surname>Ibrahim</surname></string-name></person-group>, &#x201C;<article-title>Greylag goose optimization: Nature-inspired optimization algorithm</article-title>,&#x201D; <source>Expert. Syst. Appl.</source>, vol. <volume>238</volume>, <year>2024, Art. no. 122147</year>. doi: <pub-id pub-id-type="doi">10.1016/j.eswa.2023.122147</pub-id>.</mixed-citation></ref>
<ref id="ref-31"><label>[31]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>S.</given-names> <surname>Wang</surname></string-name>, <string-name><given-names>Q.</given-names> <surname>Yuan</surname></string-name>, <string-name><given-names>W.</given-names> <surname>Tan</surname></string-name>, <string-name><given-names>T.</given-names> <surname>Yang</surname></string-name>, and <string-name><given-names>L.</given-names> <surname>Zeng</surname></string-name></person-group>, &#x201C;<article-title>SCChOA: Hybrid sine-cosine chimp optimization algorithm for feature selection</article-title>,&#x201D; <source>Comput. Mater. Contin.</source>, vol. <volume>77</volume>, no. <issue>3</issue>, pp. <fpage>3057</fpage>&#x2013;<lpage>3075</lpage>, <year>2023</year>. doi: <pub-id pub-id-type="doi">10.32604/cmc.2023.044807</pub-id>.</mixed-citation></ref>
<ref id="ref-32"><label>[32]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>B.</given-names> <surname>Kaur</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Rathi</surname></string-name>, and <string-name><given-names>R. K.</given-names> <surname>Agrawal</surname></string-name></person-group>, &#x201C;<article-title>Enhanced depression detection from speech using quantum whale optimization algorithm for feature selection</article-title>,&#x201D; <source>Comput. Biol. Med.</source>, vol. <volume>150</volume>, <year>2022, Art. no. 106122</year>. doi: <pub-id pub-id-type="doi">10.1016/j.compbiomed.2022.106122</pub-id>; <pub-id pub-id-type="pmid">36182759</pub-id></mixed-citation></ref>
<ref id="ref-33"><label>[33]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>A.</given-names> <surname>Asilian Bidgoli</surname></string-name>, <string-name><given-names>H.</given-names> <surname>Ebrahimpour-Komleh</surname></string-name>, and <string-name><given-names>S.</given-names> <surname>Rahnamayan</surname></string-name></person-group>, &#x201C;<article-title>A novel binary many-objective feature selection algorithm for multi-label data classification</article-title>,&#x201D; <source>Int. J. Mach. Learn. Cyber</source>, vol. <volume>12</volume>, no. <issue>7</issue>, pp. <fpage>2041</fpage>&#x2013;<lpage>2057</lpage>, <year>2021</year>. doi: <pub-id pub-id-type="doi">10.1007/s13042-021-01291-y</pub-id>.</mixed-citation></ref>
<ref id="ref-34"><label>[34]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>A.</given-names> <surname>Hashemi</surname></string-name>, <string-name><given-names>M. B.</given-names> <surname>Dowlatshahi</surname></string-name>, and <string-name><given-names>H.</given-names> <surname>Nezamabadi-Pour</surname></string-name></person-group>, &#x201C;<article-title>MGFS: A multi-label graph-based feature selection algorithm via PageRank centrality</article-title>,&#x201D; <source>Expert. Syst. Appl.</source>, vol. <volume>142</volume>, <year>2020, Art. no. 113024</year>. doi: <pub-id pub-id-type="doi">10.1016/j.eswa.2019.113024</pub-id>.</mixed-citation></ref>
<ref id="ref-35"><label>[35]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>J.</given-names> <surname>Gonzalez-Lopez</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Ventura</surname></string-name>, and <string-name><given-names>A.</given-names> <surname>Cano</surname></string-name></person-group>, &#x201C;<article-title>Distributed multi-label feature selection using individual mutual information measures</article-title>,&#x201D; <source>Knowl.-Based Syst.</source>, vol. <volume>188</volume>, <year>2020, Art. no. 105052</year>. doi: <pub-id pub-id-type="doi">10.1016/j.knosys.2019.105052</pub-id>.</mixed-citation></ref>
<ref id="ref-36"><label>[36]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>R.</given-names> <surname>Huang</surname></string-name>, <string-name><given-names>W.</given-names> <surname>Jiang</surname></string-name>, and <string-name><given-names>G.</given-names> <surname>Sun</surname></string-name></person-group>, &#x201C;<article-title>Manifold-based constraint Laplacian score for multi-label feature selection</article-title>,&#x201D; <source>Pattern Recognit. Lett.</source>, vol. <volume>112</volume>, no. <issue>3</issue>, pp. <fpage>346</fpage>&#x2013;<lpage>352</lpage>, <year>2018</year>. doi: <pub-id pub-id-type="doi">10.1016/j.patrec.2018.08.021</pub-id>.</mixed-citation></ref>
<ref id="ref-37"><label>[37]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>K. J.</given-names> <surname>Kim</surname></string-name> and <string-name><given-names>C. H.</given-names> <surname>Jun</surname></string-name></person-group>, &#x201C;<article-title>Dynamic mutual information-based feature selection for multi-label learning</article-title>,&#x201D; <source>Intell. Data Anal.</source>, vol. <volume>27</volume>, no. <issue>4</issue>, pp. <fpage>891</fpage>&#x2013;<lpage>909</lpage>, <year>2023</year>. doi: <pub-id pub-id-type="doi">10.3233/IDA-226666</pub-id>.</mixed-citation></ref>
</ref-list>
</back></article>