<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.1 20151215//EN" "http://jats.nlm.nih.gov/publishing/1.1/JATS-journalpublishing1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xml:lang="en" article-type="research-article" dtd-version="1.1">
<front>
<journal-meta>
<journal-id journal-id-type="pmc">CMC</journal-id>
<journal-id journal-id-type="nlm-ta">CMC</journal-id>
<journal-id journal-id-type="publisher-id">CMC</journal-id>
<journal-title-group>
<journal-title>Computers, Materials &#x0026; Continua</journal-title>
</journal-title-group>
<issn pub-type="epub">1546-2226</issn>
<issn pub-type="ppub">1546-2218</issn>
<publisher>
<publisher-name>Tech Science Press</publisher-name>
<publisher-loc>USA</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">51816</article-id>
<article-id pub-id-type="doi">10.32604/cmc.2024.051816</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Efficient Clustering Network Based on Matrix Factorization</article-title>
<alt-title alt-title-type="left-running-head">Efficient Clustering Network based on Matrix Factorization</alt-title>
<alt-title alt-title-type="right-running-head">Efficient Clustering Network based on Matrix Factorization</alt-title>
</title-group>
<contrib-group>
<contrib id="author-1" contrib-type="author">
<name name-style="western"><surname>Cheng</surname><given-names>Jieren</given-names></name><xref ref-type="aff" rid="aff-1">1</xref><xref ref-type="aff" rid="aff-3">3</xref></contrib>
<contrib id="author-2" contrib-type="author" corresp="yes">
<name name-style="western"><surname>Li</surname><given-names>Jimei</given-names></name><xref ref-type="aff" rid="aff-1">1</xref><xref ref-type="aff" rid="aff-3">3</xref><email>22220854000328@hainanu.edu.cn</email></contrib>
<contrib id="author-3" contrib-type="author">
<name name-style="western"><surname>Zeng</surname><given-names>Faqiang</given-names></name><xref ref-type="aff" rid="aff-1">1</xref><xref ref-type="aff" rid="aff-3">3</xref></contrib>
<contrib id="author-4" contrib-type="author">
<name name-style="western"><surname>Tao</surname><given-names>Zhicong</given-names></name><xref ref-type="aff" rid="aff-1">1</xref><xref ref-type="aff" rid="aff-3">3</xref></contrib>
<contrib id="author-5" contrib-type="author">
<name name-style="western"><surname>Yang</surname><given-names>Yue</given-names></name><xref ref-type="aff" rid="aff-2">2</xref><xref ref-type="aff" rid="aff-3">3</xref></contrib>
<aff id="aff-1"><label>1</label><institution>School of Computer Science and Technology, Hainan University</institution>, <addr-line>Haikou, 570228</addr-line>, <country>China</country></aff>
<aff id="aff-2"><label>2</label><institution>School of Cyberspace Security, Hainan University</institution>, <addr-line>Haikou, 570228</addr-line>, <country>China</country></aff>
<aff id="aff-3"><label>3</label><institution>Hainan Blockchain Technology Engineering Research Center</institution>, <addr-line>Haikou, 570228</addr-line>, <country>China</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>&#x002A;</label>Corresponding Author: Jimei Li. Email: <email>22220854000328@hainanu.edu.cn</email></corresp>
</author-notes>
<pub-date date-type="collection" publication-format="electronic">
<year>2024</year></pub-date>
<pub-date date-type="pub" publication-format="electronic"><day>18</day><month>7</month><year>2024</year></pub-date>
<volume>80</volume>
<issue>1</issue>
<fpage>281</fpage>
<lpage>298</lpage>
<history>
<date date-type="received">
<day>15</day>
<month>3</month>
<year>2024</year>
</date>
<date date-type="accepted">
<day>21</day>
<month>5</month>
<year>2024</year>
</date>
</history>
<permissions>
<copyright-statement>&#x00A9; 2024 Cheng et al.</copyright-statement>
<copyright-year>2024</copyright-year>
<copyright-holder>Cheng et al.</copyright-holder>
<license xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>This work is licensed under a <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</ext-link>, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<self-uri content-type="pdf" xlink:href="TSP_CMC_51816.pdf"></self-uri>
<abstract>
<p>Contrastive learning is a significant research direction in the field of deep learning. However, existing data augmentation methods often lead to issues such as semantic drift in generated views while the complexity of model pre-training limits further improvement in the performance of existing methods. To address these challenges, we propose the Efficient Clustering Network based on Matrix Factorization (ECN-MF). Specifically, we design a batched low-rank Singular Value Decomposition (SVD) algorithm for data augmentation to eliminate redundant information and uncover major patterns of variation and key information in the data. Additionally, we design a Mutual Information-Enhanced Clustering Module (MI-ECM) to accelerate the training process by leveraging a simple architecture to bring samples from the same cluster closer while pushing samples from other clusters apart. Extensive experiments on six datasets demonstrate that ECN-MF exhibits more effective performance compared to state-of-the-art algorithms.</p>
</abstract>
<kwd-group kwd-group-type="author">
<kwd>Contrastive learning</kwd>
<kwd>clustering</kwd>
<kwd>matrix factorization</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec id="s1">
<label>1</label>
<title>Introduction</title>
<p>Owing to their strong capacity for learning representations of graph data, Graph Neural Networks (GNNs) have been applied successfully to a wide range of tasks, including node classification [<xref ref-type="bibr" rid="ref-1">1</xref>], graph classification [<xref ref-type="bibr" rid="ref-2">2</xref>], time series analysis, knowledge graphs, and clustering [<xref ref-type="bibr" rid="ref-3">3</xref>]. Among graph learning tasks, deep graph clustering [<xref ref-type="bibr" rid="ref-4">4</xref>] is a fundamental yet challenging unsupervised task that has recently become a focus of research. Existing deep graph clustering methods rely on several learning mechanisms. Generative methods use generative models to characterize the distribution of graph data and cluster the graph on that basis [<xref ref-type="bibr" rid="ref-5">5</xref>&#x2013;<xref ref-type="bibr" rid="ref-9">9</xref>]. Adversarial methods [<xref ref-type="bibr" rid="ref-10">10</xref>,<xref ref-type="bibr" rid="ref-11">11</xref>] apply adversarial training, improving clustering performance through the interplay between a generator and a discriminator. Contrastive methods [<xref ref-type="bibr" rid="ref-12">12</xref>&#x2013;<xref ref-type="bibr" rid="ref-16">16</xref>] learn the similarity and dissimilarity between samples. Our method belongs to this last category and follows the multi-view [<xref ref-type="bibr" rid="ref-17">17</xref>] contrastive learning paradigm.</p>
<p>Existing methods generate augmented views of the same nodes through graph augmentation. However, existing studies [<xref ref-type="bibr" rid="ref-18">18</xref>,<xref ref-type="bibr" rid="ref-19">19</xref>] indicate: 1) Due to the inherent characteristics of contrastive learning, sensitivity to noise and incorrect labels can lead to semantic drift and indistinguishable positive samples when inappropriate data augmentation techniques such as edge removal, noise addition, diffusion [<xref ref-type="bibr" rid="ref-13">13</xref>], or masking are used. 2) Due to the computational cost of matrix factorization itself, existing methods still face limitations in handling large sparse datasets. 3) Model pre-training typically requires extensive data for optimal performance. Fine-tuning pre-trained models necessitates large-scale labeled data, which may not be suitable for tasks with limited annotated data. Moreover, due to differences between tasks, transferring pre-trained models to specific tasks can be more complex and challenging.</p>
<p>To address the aforementioned issues, we propose an Efficient Clustering Network based on Matrix Factorization (ECN-MF). The main idea behind this approach is to design a batched low-rank Singular Value Decomposition algorithm and a Mutual Information-Enhanced Clustering Guidance Module. This aims to extract crucial information from the data while better preserving the original information in the embedded data, reducing information loss, and improving the model&#x2019;s generative capabilities. Specifically, in terms of data augmentation, we introduce a batched low-rank Singular Value Decomposition algorithm that decomposes the attribute matrix of large datasets into smaller modules. This allows better exploration of major variation patterns and important information in sparse and large datasets. In the network architecture, we employ a pseudo-siamese neural network with the same structure but without parameter sharing. This enables the model to better capture unique information from each view, thereby enhancing clustering performance. Additionally, we design a Mutual Information-Enhanced Clustering Guidance Module to ensure better preservation of original data information, reducing information loss, and enhancing the model&#x2019;s generative capabilities. It brings samples from the same cluster closer while pushing samples from other clusters apart, further improving clustering performance. The main contributions of this work are summarized as follows:
<list list-type="order">
<list-item>
<p>We propose an Efficient Clustering Network based on Matrix Factorization (ECN-MF), which does not require pre-training, thus alleviating the challenges of model pre-training and complex model transfer.</p></list-item>
<list-item>
<p>We propose a batched low-rank Singular Value Decomposition (SVD) algorithm to address the resource-intensive nature of the SVD algorithm in large sparse datasets. This method not only avoids losing important information during data augmentation but also effectively extracts latent information from the data.</p></list-item>
<list-item>
<p>We designed a Mutual Information-Enhanced Clustering Module (MI-ECM). It enhances the discriminative capability of the network while ensuring better retention of the original data information in the embedded data.</p></list-item>
<list-item>
<p>Experimental results demonstrate that our proposed method outperforms existing methods in handling challenges such as large sparse datasets and model transfer complexity.</p></list-item>
</list></p>
<p>The remaining sections of this paper are organized as follows. <xref ref-type="sec" rid="s2">Section 2</xref> reviews relevant literature on matrix factorization and contrastive deep graph clustering. In <xref ref-type="sec" rid="s3">Section 3</xref>, we provide detailed explanations of the symbols used, the batched low-rank Singular Value Decomposition algorithm, our network structure, and the Mutual Information-Enhanced Clustering Guidance Module. <xref ref-type="sec" rid="s4">Section 4</xref> presents the results of our method tested on six datasets. Finally, <xref ref-type="sec" rid="s5">Section 5</xref> concludes the paper.</p>
</sec>
<sec id="s2">
<label>2</label>
<title>Related Work</title>
<sec id="s2_1">
<label>2.1</label>
<title>Matrix Factorization</title>
<p>Matrix factorization (decomposition) is the process of breaking down a matrix into the product of several matrices. This can involve techniques such as triangular factorization, full rank factorization, orthogonal triangle decomposition, Jordan decomposition, and Singular Value Decomposition. In our case, we primarily utilize Singular Value Decomposition to process data. Singular Value Decomposition allows for the representation of a relatively complex matrix as the product of smaller and simpler matrices. These smaller matrices describe the essential characteristics of the original matrix. Singular Value Decomposition is applicable to any matrix, making it adaptable to the features of current attribute information. It finds applications in various fields such as signal processing, statistics, natural language processing, and more. In recommendation systems, Singular Value Decomposition is widely applied in collaborative filtering and matrix completion algorithms. Through Singular Value Decomposition, a user-item rating matrix can be reduced to a low-dimensional latent factor matrix, extracting latent features of users and items for recommendation purposes.</p>
<p>Singular Value Decomposition has significant potential value in processing raw data. The singular values decrease exponentially with rank, with early singular values much larger than later ones, as shown in <xref ref-type="fig" rid="fig-1">Fig. 1</xref>. By applying low-rank Singular Value Decomposition to the data, we can capture the essential features of the data, enhance data representation, and improve the performance of multi-view clustering. Based on this, we have optimized and improved the Singular Value Decomposition method to be suitable for large datasets, providing higher computational efficiency and scalability.</p>
<fig id="fig-1">
<label>Figure 1</label>
<caption>
<title>(a): Singular values exhibit exponential decay with rank, where the initial singular values are significantly larger than the subsequent ones. (b): All information of <inline-formula id="ieqn-1"><mml:math id="mml-ieqn-1"><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow></mml:math></inline-formula> is encoded in all singular values until <inline-formula id="ieqn-2"><mml:math id="mml-ieqn-2"><mml:mi mathvariant="bold-italic">k</mml:mi></mml:math></inline-formula>. The majority of information is encoded in the first singular vector returned by Singular Value Decomposition</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_51816-fig-1.tif"/>
</fig>
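<p>The decay described above can be checked numerically. The following is a minimal sketch, assuming a synthetic matrix built from a rank-3 signal plus weak Gaussian noise; the matrix sizes, the noise level, and the helper name <monospace>cumulative_energy</monospace> are illustrative assumptions and are not part of ECN-MF. It reports the fraction of squared singular-value mass captured by the first few singular values.</p>
<code language="python"><![CDATA[
```python
import numpy as np


def cumulative_energy(X):
    """Fraction of squared singular-value mass captured by the first k values."""
    s = np.linalg.svd(X, compute_uv=False)  # singular values in descending order
    energy = np.cumsum(s ** 2)
    return energy / energy[-1]


# Synthetic data (assumption): a rank-3 signal plus weak Gaussian noise.
rng = np.random.default_rng(0)
signal = rng.standard_normal((50, 3)) @ rng.standard_normal((3, 20))
X = signal + 0.01 * rng.standard_normal((50, 20))

energy = cumulative_energy(X)
print(np.round(energy[:5], 4))  # cumulative share for k = 1, ..., 5
```
]]></code>
<p>For a matrix of this kind, nearly all of the mass lies in the first three singular values, which is the situation that low-rank truncation exploits.</p>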
</sec>
<sec id="s2_2">
<label>2.2</label>
<title>Contrastive Deep Graph Clustering</title>
<p>In recent years, contrastive learning has achieved remarkable success on images [<xref ref-type="bibr" rid="ref-20">20</xref>&#x2013;<xref ref-type="bibr" rid="ref-23">23</xref>] and graphs [<xref ref-type="bibr" rid="ref-24">24</xref>&#x2013;<xref ref-type="bibr" rid="ref-26">26</xref>], inspiring extensive research on contrastive deep graph clustering methods [<xref ref-type="bibr" rid="ref-12">12</xref>&#x2013;<xref ref-type="bibr" rid="ref-16">16</xref>]. The clustering performance of these methods is primarily influenced by three key factors: data augmentation, network architecture, and the handling of positive and negative sample pairs. Taking these factors into account, we summarize the distinctions between our proposed ECN-MF and other contrastive deep graph clustering methods.</p>
<sec id="s2_2_1">
<label>2.2.1</label>
<title>Data Augmentation</title>
<p>Data augmentation is pivotal in contrastive deep graph clustering. Current methods, such as edge removal, diffusion, masking, and noise addition, introduce varying degrees of perturbation to the original data. While these approaches help mitigate over-smoothing during iterative training of graph neural networks, they risk losing crucial information. Inappropriate data augmentation may lead to semantic drift and indistinguishable positive samples, resulting in suboptimal clustering performance. For instance, reference [<xref ref-type="bibr" rid="ref-13">13</xref>] uses diffusion matrices as augmented graphs, while Self-supervised Contrastive Attributed Graph Clustering (SCAGC) perturbs the graph structure by randomly adding or removing edges. Reference [<xref ref-type="bibr" rid="ref-15">15</xref>] and SCAGC augment node attributes through attribute perturbation. However, reference [<xref ref-type="bibr" rid="ref-27">27</xref>] has highlighted the risk of semantic drift with improper data augmentation. To address this challenge, we propose a novel augmentation approach. Unlike existing methods, ECN-MF uses batched low-rank Singular Value Decomposition to extract crucial attribute information, filter out noisy data, and construct two augmented views of the same node without compromising the original structure.</p>
</sec>
<sec id="s2_2_2">
<label>2.2.2</label>
<title>Network Architecture</title>
<p>In terms of network architecture, SCAGC and reference [<xref ref-type="bibr" rid="ref-28">28</xref>] use a shared Graph Convolutional Network (GCN) encoder to encode nodes. However, conventional GCN encoders entangle the transformation and aggregation operations during training, resulting in high time costs. To address this issue, we employ two separate Multilayer Perceptrons (MLPs) to encode the node attributes of the two views. The two MLPs have the same architecture but do not share parameters, ensuring that the node embeddings of the two views contain different semantic information.</p>
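<p>The idea of two encoders with the same architecture but separate parameters can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the ECN-MF implementation: the two-layer ReLU MLP, the layer sizes, the initialization, and the helper names <monospace>init_mlp</monospace> and <monospace>mlp_forward</monospace> are all hypothetical, and the second view is a plain rank-2 truncation of the first.</p>
<code language="python"><![CDATA[
```python
import numpy as np


def init_mlp(rng, sizes):
    """Create one (weight, bias) pair per layer; sizes = [in, hidden, ..., out]."""
    return [(rng.standard_normal((i, o)) * np.sqrt(2.0 / i), np.zeros(o))
            for i, o in zip(sizes[:-1], sizes[1:])]


def mlp_forward(params, X):
    """Apply the MLP; ReLU after every layer except the last."""
    H = X
    for idx, (W, b) in enumerate(params):
        H = H @ W + b
        if idx < len(params) - 1:
            H = np.maximum(H, 0.0)
    return H


rng = np.random.default_rng(0)
N, D, hidden, d = 5, 8, 16, 4          # hypothetical sizes
X = rng.standard_normal((N, D))        # first view: attribute matrix

# Second view (assumption): rank-2 truncation of X.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
X_r = (U[:, :2] * s[:2]) @ Vt[:2, :]

# Same architecture, independently initialized parameters (no sharing).
encoder_1 = init_mlp(rng, [D, hidden, d])
encoder_2 = init_mlp(rng, [D, hidden, d])

Z = mlp_forward(encoder_1, X)
Z_r = mlp_forward(encoder_2, X_r)
print(Z.shape, Z_r.shape)
```
]]></code>
<p>Because the two parameter sets are created and would be updated independently, each encoder is free to specialize to its own view, which is the property a shared encoder lacks.</p>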
</sec>
<sec id="s2_2_3">
<label>2.2.3</label>
<title>Handling of Positive and Negative Sample Pairs</title>
<p>In contrastive methods, the handling of positive and negative sample pairs is crucial. Contrastive methods bring positive samples together while pushing negative samples apart. Therefore, the quality of positive and negative sample pairs significantly influences the performance of contrastive methods. Specifically, reference [<xref ref-type="bibr" rid="ref-13">13</xref>] generates negative samples by randomly shuffling features and designs the InfoMax loss to maximize cross-view mutual information. Reference [<xref ref-type="bibr" rid="ref-12">12</xref>] distinguishes between similar and dissimilar nodes using cross-entropy loss. Subsequently, SCAGC randomly selects samples from different clusters to improve the quality of negative samples. It also designs a contrastive clustering loss to maximize the consistency between representations from the same cluster. Both references [<xref ref-type="bibr" rid="ref-14">14</xref>] and [<xref ref-type="bibr" rid="ref-16">16</xref>] use the InfoNCE loss to attract positive sample pairs and separate negative sample pairs. While these approaches have shown effectiveness, they still depend on a well-pretrained model to choose high-quality positive and negative samples. To address this issue, we propose an MI-ECM to bring samples from the same cluster closer while pushing samples from different clusters apart, thereby enhancing the discriminative ability of sample pairs.</p>
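<p>The InfoNCE objective mentioned above has a compact form. The sketch below is a generic cross-view version written in NumPy, not the implementation of references [<xref ref-type="bibr" rid="ref-14">14</xref>] or [<xref ref-type="bibr" rid="ref-16">16</xref>]; the function name <monospace>info_nce</monospace>, the temperature value, and the toy embeddings are assumptions made for illustration. Row <italic>i</italic> of the two embedding matrices is treated as the positive pair, and all other rows of the second view act as negatives.</p>
<code language="python"><![CDATA[
```python
import numpy as np


def info_nce(Z1, Z2, temperature=0.5):
    """Cross-view InfoNCE: row i of Z1 and row i of Z2 form the positive pair."""
    Z1 = Z1 / np.linalg.norm(Z1, axis=1, keepdims=True)
    Z2 = Z2 / np.linalg.norm(Z2, axis=1, keepdims=True)
    logits = Z1 @ Z2.T / temperature                      # cosine similarity / tau
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))


# Toy embeddings (assumption): four mutually orthogonal unit vectors.
Z = np.eye(4)
aligned = info_nce(Z, Z)                       # positives coincide
shuffled = info_nce(Z, np.roll(Z, 1, axis=0))  # positives are mismatched
print(aligned, shuffled)
```
]]></code>
<p>The loss is lower when matching rows of the two views agree, which is why the quality of the selected positive and negative pairs matters.</p>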
</sec>
</sec>
</sec>
<sec id="s3">
<label>3</label>
<title>Method</title>
<p>In this section, we introduce a novel Efficient Clustering Network based on Matrix Factorization (ECN-MF). The aim is to enhance clustering performance by leveraging matrix factorization in a way that captures unique information from each view without disrupting the original structure. The overall framework of ECN-MF is illustrated in <xref ref-type="fig" rid="fig-2">Fig. 2</xref>. In the following sections, we provide a detailed explanation of the proposed ECN-MF.</p>
<fig id="fig-2">
<label>Figure 2</label>
<caption>
<title>Schematic diagram of the Efficient Clustering Network based on Matrix Factorization. The network consists of three main parts: the batched low-rank Singular Value Decomposition (SVD) on the left, the pseudo-siamese neural network in the middle, and the Mutual Information-Enhanced Clustering Guidance Module on the right. Here, <inline-formula id="ieqn-3"><mml:math id="mml-ieqn-3"><mml:mover><mml:mrow><mml:mtext mathvariant="bold">A</mml:mtext></mml:mrow><mml:mo accent="false">&#x00AF;</mml:mo></mml:mover></mml:math></inline-formula> is the normalized adjacency matrix, <inline-formula id="ieqn-4"><mml:math id="mml-ieqn-4"><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow></mml:math></inline-formula> is the attribute matrix, and <inline-formula id="ieqn-5"><mml:math id="mml-ieqn-5"><mml:msub><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">r</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> is its corresponding low-rank attribute matrix. 
<inline-formula id="ieqn-6"><mml:math id="mml-ieqn-6"><mml:msub><mml:mi mathvariant="bold-italic">E</mml:mi><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">1</mml:mtext></mml:mrow></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula id="ieqn-7"><mml:math id="mml-ieqn-7"><mml:msub><mml:mi mathvariant="bold-italic">E</mml:mi><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">2</mml:mtext></mml:mrow></mml:mrow></mml:msub></mml:math></inline-formula> represent the first and second encoders, <inline-formula id="ieqn-8"><mml:math id="mml-ieqn-8"><mml:msup><mml:mrow><mml:mtext mathvariant="bold">Z</mml:mtext></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">L</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup></mml:math></inline-formula> and <inline-formula id="ieqn-9"><mml:math id="mml-ieqn-9"><mml:msubsup><mml:mrow><mml:mtext mathvariant="bold">Z</mml:mtext></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">r</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">L</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup></mml:math></inline-formula> represent the embedding information of the first and second encoders. <inline-formula id="ieqn-10"><mml:math id="mml-ieqn-10"><mml:mrow><mml:mtext mathvariant="bold">Q</mml:mtext></mml:mrow></mml:math></inline-formula> is the probability matrix, and <inline-formula id="ieqn-11"><mml:math id="mml-ieqn-11"><mml:mrow><mml:mtext mathvariant="bold">P</mml:mtext></mml:mrow></mml:math></inline-formula> is the target distribution</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_51816-fig-2.tif"/>
</fig>
<sec id="s3_1">
<label>3.1</label>
<title>Notations and Problem Definition</title>
<p>In an undirected graph <inline-formula id="ieqn-12"><mml:math id="mml-ieqn-12"><mml:mi>G</mml:mi><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow><mml:mo>,</mml:mo><mml:mrow><mml:mtext mathvariant="bold">A</mml:mtext></mml:mrow><mml:mo>}</mml:mo></mml:mrow></mml:math></inline-formula>, let <inline-formula id="ieqn-13"><mml:math id="mml-ieqn-13"><mml:mi>V</mml:mi><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:msub><mml:mi>v</mml:mi><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>v</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mi>v</mml:mi><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>}</mml:mo></mml:mrow></mml:math></inline-formula> be a set containing <inline-formula id="ieqn-14"><mml:math id="mml-ieqn-14"><mml:mi>N</mml:mi></mml:math></inline-formula> nodes with <inline-formula id="ieqn-15"><mml:math id="mml-ieqn-15"><mml:mi>K</mml:mi></mml:math></inline-formula> classes, and <inline-formula id="ieqn-16"><mml:math id="mml-ieqn-16"><mml:mi>E</mml:mi></mml:math></inline-formula> be the set of edges. <inline-formula id="ieqn-17"><mml:math id="mml-ieqn-17"><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow><mml:mo>&#x2208;</mml:mo><mml:msup><mml:mrow><mml:mtext>R</mml:mtext></mml:mrow><mml:mrow><mml:mi>N</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>D</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> represents the attribute matrix, and <inline-formula id="ieqn-18"><mml:math id="mml-ieqn-18"><mml:mrow><mml:mtext mathvariant="bold">A</mml:mtext></mml:mrow><mml:mo>&#x2208;</mml:mo><mml:msup><mml:mrow><mml:mtext>R</mml:mtext></mml:mrow><mml:mrow><mml:mi>N</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> represents the original adjacency matrix. 
The degree matrix is denoted as <inline-formula id="ieqn-19"><mml:math id="mml-ieqn-19"><mml:mrow><mml:mtext mathvariant="bold">D</mml:mtext></mml:mrow><mml:mo>=</mml:mo><mml:mi>d</mml:mi><mml:mi>i</mml:mi><mml:mi>a</mml:mi><mml:mi>g</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>d</mml:mi><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>d</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mi>d</mml:mi><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2208;</mml:mo><mml:msup><mml:mrow><mml:mtext>R</mml:mtext></mml:mrow><mml:mrow><mml:mi>N</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula>, where <inline-formula id="ieqn-20"><mml:math id="mml-ieqn-20"><mml:msub><mml:mi>d</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mi mathvariant="normal">&#x03A3;</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>v</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>v</mml:mi><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2208;</mml:mo><mml:mi>&#x03B5;</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mi>a</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>. 
Normalizing the original adjacency matrix <inline-formula id="ieqn-21"><mml:math id="mml-ieqn-21"><mml:mrow><mml:mtext mathvariant="bold">A</mml:mtext></mml:mrow></mml:math></inline-formula> to <inline-formula id="ieqn-22"><mml:math id="mml-ieqn-22"><mml:mover><mml:mrow><mml:mtext mathvariant="bold">A</mml:mtext></mml:mrow><mml:mo accent="false">&#x00AF;</mml:mo></mml:mover><mml:mo>&#x2208;</mml:mo><mml:msup><mml:mrow><mml:mtext>R</mml:mtext></mml:mrow><mml:mrow><mml:mi>N</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> is achieved by computing <inline-formula id="ieqn-23"><mml:math id="mml-ieqn-23"><mml:msup><mml:mrow><mml:mtext mathvariant="bold">D</mml:mtext></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mtext mathvariant="bold">A</mml:mtext></mml:mrow><mml:mo>+</mml:mo><mml:mrow><mml:mtext mathvariant="bold">I</mml:mtext></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, where <inline-formula id="ieqn-24"><mml:math id="mml-ieqn-24"><mml:mrow><mml:mtext mathvariant="bold">I</mml:mtext></mml:mrow><mml:mo>&#x2208;</mml:mo><mml:msup><mml:mrow><mml:mtext>R</mml:mtext></mml:mrow><mml:mrow><mml:mi>N</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> is the identity matrix. <xref ref-type="table" rid="table-1">Table 1</xref> summarizes these symbols.</p>
<table-wrap id="table-1">
<label>Table 1</label>
<caption>
<title>Description of the used notations</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th>Notation</th>
<th>Meaning</th>
</tr>
</thead>
<tbody>
<tr>
<td><inline-formula id="ieqn-25"><mml:math id="mml-ieqn-25"><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow><mml:mo>&#x2208;</mml:mo><mml:msup><mml:mi>R</mml:mi><mml:mrow><mml:mi>N</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>D</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula></td>
<td>Attribute matrix</td>
</tr>
<tr>
<td><inline-formula id="ieqn-26"><mml:math id="mml-ieqn-26"><mml:msub><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow><mml:mrow><mml:mi>r</mml:mi></mml:mrow></mml:msub><mml:mo>&#x2208;</mml:mo><mml:msup><mml:mi>R</mml:mi><mml:mrow><mml:mi>N</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>D</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula></td>
<td>Low-rank attribute matrix</td>
</tr>
<tr>
<td><inline-formula id="ieqn-27"><mml:math id="mml-ieqn-27"><mml:mrow><mml:mtext mathvariant="bold">A</mml:mtext></mml:mrow><mml:mo>&#x2208;</mml:mo><mml:msup><mml:mi>R</mml:mi><mml:mrow><mml:mi>N</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula></td>
<td>Original adjacency matrix</td>
</tr>
<tr>
<td><inline-formula id="ieqn-28"><mml:math id="mml-ieqn-28"><mml:mover><mml:mrow><mml:mtext mathvariant="bold">A</mml:mtext></mml:mrow><mml:mo accent="false">&#x00AF;</mml:mo></mml:mover><mml:mo>&#x2208;</mml:mo><mml:msup><mml:mi>R</mml:mi><mml:mrow><mml:mi>N</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula></td>
<td>Normalized adjacency matrix</td>
</tr>
<tr>
<td><inline-formula id="ieqn-29"><mml:math id="mml-ieqn-29"><mml:mrow><mml:mtext mathvariant="bold">I</mml:mtext></mml:mrow><mml:mo>&#x2208;</mml:mo><mml:msup><mml:mi>R</mml:mi><mml:mrow><mml:mi>N</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula></td>
<td>Identity matrix</td>
</tr>
<tr>
<td><inline-formula id="ieqn-30"><mml:math id="mml-ieqn-30"><mml:mrow><mml:mtext mathvariant="bold">D</mml:mtext></mml:mrow><mml:mo>&#x2208;</mml:mo><mml:msup><mml:mi>R</mml:mi><mml:mrow><mml:mi>N</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula></td>
<td>Degree matrix</td>
</tr>
<tr>
<td><inline-formula id="ieqn-31"><mml:math id="mml-ieqn-31"><mml:mrow><mml:mtext mathvariant="bold">Z</mml:mtext></mml:mrow><mml:mo>&#x2208;</mml:mo><mml:msup><mml:mi>R</mml:mi><mml:mrow><mml:mi>N</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>d</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula></td>
<td>Graph embedding</td>
</tr>
<tr>
<td><inline-formula id="ieqn-32"><mml:math id="mml-ieqn-32"><mml:mrow><mml:mtext mathvariant="bold">Q</mml:mtext></mml:mrow><mml:mo>&#x2208;</mml:mo><mml:msup><mml:mi>R</mml:mi><mml:mrow><mml:mi>N</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>K</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula></td>
<td>Probability distribution</td>
</tr>
</tbody>
</table>
</table-wrap>
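<p>The normalization D<sup>&#x2212;1</sup>(A + I) defined above can be sketched in a few lines of NumPy. This is a minimal illustration that follows the definitions in the text, with degrees taken from the original adjacency matrix; the function name <monospace>normalize_adjacency</monospace>, the toy path graph, and the handling of isolated nodes (their inverse degree is set to zero) are assumptions that the text does not specify.</p>
<code language="python"><![CDATA[
```python
import numpy as np


def normalize_adjacency(A):
    """Return D^{-1}(A + I), where D = diag(d_1, ..., d_N) and d_i = sum_j a_ij."""
    A = np.asarray(A, dtype=float)
    degrees = A.sum(axis=1)
    # Assumption: a node with degree 0 gets inverse degree 0 (avoids division by zero).
    inv_deg = np.divide(1.0, degrees, out=np.zeros_like(degrees), where=degrees > 0)
    return inv_deg[:, None] * (A + np.eye(A.shape[0]))


# Toy undirected path graph on three nodes: v1 - v2 - v3.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
A_bar = normalize_adjacency(A)
print(A_bar)
```
]]></code>
<p>Multiplying by the inverse degree matrix on the left rescales each row of A + I by the inverse degree of the corresponding node.</p>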
</sec>
<sec id="s3_2">
<label>3.2</label>
<title>Batched Low-Rank Singular Value Decomposition Algorithm</title>
<p>Recent studies have demonstrated the effectiveness of Singular Value Decomposition for handling sparse matrices and for dimensionality reduction. Inspired by this success, we apply Singular Value Decomposition to the attribute information as an independent preprocessing step before training. This step extracts latent and essential information from the data while filtering out noise present in the attributes.</p>
<p>Specifically, given a matrix <inline-formula id="ieqn-33"><mml:math id="mml-ieqn-33"><mml:mrow><mml:mtext mathvariant="bold">A</mml:mtext></mml:mrow><mml:mo>&#x2208;</mml:mo><mml:msup><mml:mrow><mml:mtext>R</mml:mtext></mml:mrow><mml:mrow><mml:mi>m</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula>, low-rank Singular Value Decomposition decomposes it into the product of three matrices, where <inline-formula id="ieqn-34"><mml:math id="mml-ieqn-34"><mml:mi>m</mml:mi></mml:math></inline-formula> and <inline-formula id="ieqn-35"><mml:math id="mml-ieqn-35"><mml:mi>n</mml:mi></mml:math></inline-formula> can be any positive integers:
<disp-formula id="eqn-1"><label>(1)</label><mml:math id="mml-eqn-1" display="block"><mml:mrow><mml:mtext mathvariant="bold">A</mml:mtext></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mtext mathvariant="bold">US</mml:mtext></mml:mrow><mml:msup><mml:mrow><mml:mtext mathvariant="bold">V</mml:mtext></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">T</mml:mi></mml:mrow></mml:msup></mml:math></disp-formula></p>
<p>Here, <inline-formula id="ieqn-36"><mml:math id="mml-ieqn-36"><mml:mrow><mml:mtext mathvariant="bold">U</mml:mtext></mml:mrow><mml:mo>&#x2208;</mml:mo><mml:msup><mml:mrow><mml:mtext>R</mml:mtext></mml:mrow><mml:mrow><mml:mi>m</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>r</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> is the left singular vector matrix, containing the structural information of the original data, where <inline-formula id="ieqn-37"><mml:math id="mml-ieqn-37"><mml:mi>r</mml:mi></mml:math></inline-formula> is the rank of the low-rank approximation. <inline-formula id="ieqn-38"><mml:math id="mml-ieqn-38"><mml:mrow><mml:mtext mathvariant="bold">S</mml:mtext></mml:mrow><mml:mo>&#x2208;</mml:mo><mml:msup><mml:mrow><mml:mtext>R</mml:mtext></mml:mrow><mml:mrow><mml:mi>r</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>r</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> is the diagonal matrix of singular values, typically arranged in descending order, which indicate the importance of each component. <inline-formula id="ieqn-39"><mml:math id="mml-ieqn-39"><mml:msup><mml:mrow><mml:mtext mathvariant="bold">V</mml:mtext></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msup><mml:mo>&#x2208;</mml:mo><mml:msup><mml:mrow><mml:mtext>R</mml:mtext></mml:mrow><mml:mrow><mml:mi>r</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> is the right singular vector matrix, containing the feature information of the original data.</p>
<p>Singular Value Decomposition can be applied to any matrix, so it adapts to the characteristics of the attribute information at hand. For the attribute matrix, we rewrite <xref ref-type="disp-formula" rid="eqn-1">Eq. (1)</xref> as follows:
<disp-formula id="eqn-2"><label>(2)</label><mml:math id="mml-eqn-2" display="block"><mml:msub><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">r</mml:mtext></mml:mrow></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mtext mathvariant="bold">U</mml:mtext></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">r</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mtext mathvariant="bold">S</mml:mtext></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">r</mml:mi></mml:mrow></mml:msub><mml:msubsup><mml:mrow><mml:mtext mathvariant="bold">V</mml:mtext></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">r</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">T</mml:mi></mml:mrow></mml:msubsup></mml:math></disp-formula></p>
<p>Here, <inline-formula id="ieqn-40"><mml:math id="mml-ieqn-40"><mml:msub><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">r</mml:mtext></mml:mrow></mml:mrow></mml:msub><mml:mo>&#x2208;</mml:mo><mml:msup><mml:mrow><mml:mtext>R</mml:mtext></mml:mrow><mml:mrow><mml:mi>N</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>D</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> represents the low-rank attribute matrix, <inline-formula id="ieqn-41"><mml:math id="mml-ieqn-41"><mml:msub><mml:mrow><mml:mtext mathvariant="bold">U</mml:mtext></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">r</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> is the low-rank left singular vector matrix, <inline-formula id="ieqn-42"><mml:math id="mml-ieqn-42"><mml:msub><mml:mrow><mml:mtext mathvariant="bold">S</mml:mtext></mml:mrow><mml:mrow><mml:mi>r</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> is the low-rank diagonal matrix, and <inline-formula id="ieqn-43"><mml:math id="mml-ieqn-43"><mml:msub><mml:mrow><mml:mtext mathvariant="bold">V</mml:mtext></mml:mrow><mml:mrow><mml:mi>r</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> is the low-rank right singular vector matrix, where <inline-formula id="ieqn-44"><mml:math id="mml-ieqn-44"><mml:msubsup><mml:mrow><mml:mtext mathvariant="bold">V</mml:mtext></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">r</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">T</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> is the transpose of <inline-formula id="ieqn-45"><mml:math id="mml-ieqn-45"><mml:msub><mml:mrow><mml:mtext mathvariant="bold">V</mml:mtext></mml:mrow><mml:mrow><mml:mi>r</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>.</p>
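<p>As a minimal sketch of <xref ref-type="disp-formula" rid="eqn-1">Eq. (1)</xref> and <xref ref-type="disp-formula" rid="eqn-2">Eq. (2)</xref>, the rank-r factors can be obtained by truncating a full decomposition. The snippet is illustrative only and is not the authors' implementation; it assumes NumPy, and the helper name <monospace>low_rank_svd</monospace> and the random test matrix are hypothetical.</p>
<code language="python"><![CDATA[
```python
import numpy as np


def low_rank_svd(X, r):
    """Return the rank-r factors of X and the reconstruction X_r = U_r S_r V_r^T."""
    # np.linalg.svd returns the singular values already sorted in descending order.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    U_r, s_r, Vt_r = U[:, :r], s[:r], Vt[:r, :]
    X_r = U_r @ np.diag(s_r) @ Vt_r
    return U_r, s_r, Vt_r, X_r


# Hypothetical attribute matrix with N = 6 nodes and D = 4 attributes.
rng = np.random.default_rng(0)
X = rng.standard_normal((6, 4))
U_r, s_r, Vt_r, X_r = low_rank_svd(X, r=2)
```
]]></code>
<p>In this sketch, keeping all singular values (r equal to the smaller matrix dimension) recovers the input up to floating-point error, which gives a quick sanity check.</p>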
<p>To make Singular Value Decomposition tractable on large datasets, we partition the dataset <inline-formula id="ieqn-46"><mml:math id="mml-ieqn-46"><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>}</mml:mo></mml:mrow></mml:math></inline-formula> into <inline-formula id="ieqn-47"><mml:math id="mml-ieqn-47"><mml:mi>M</mml:mi></mml:math></inline-formula> batches, where each batch <inline-formula id="ieqn-48"><mml:math id="mml-ieqn-48"><mml:mi>b</mml:mi><mml:mi>a</mml:mi><mml:mi>t</mml:mi><mml:mi>c</mml:mi><mml:msub><mml:mi>h</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> consists of <inline-formula id="ieqn-49"><mml:math id="mml-ieqn-49"><mml:mi>B</mml:mi></mml:math></inline-formula> samples, and <inline-formula id="ieqn-50"><mml:math id="mml-ieqn-50"><mml:mi>B</mml:mi></mml:math></inline-formula> is a positive integer, i.e.:
<disp-formula id="eqn-3"><label>(3)</label><mml:math id="mml-eqn-3" display="block"><mml:msub><mml:mrow><mml:mi mathvariant="bold-italic">b</mml:mi><mml:mi mathvariant="bold-italic">a</mml:mi><mml:mi mathvariant="bold-italic">t</mml:mi><mml:mi mathvariant="bold-italic">c</mml:mi><mml:mi mathvariant="bold-italic">h</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mrow><mml:mi mathvariant="bold-italic">i</mml:mi><mml:mo>&#x22C5;</mml:mo><mml:mi mathvariant="bold-italic">B</mml:mi><mml:mo>+</mml:mo><mml:mrow><mml:mtext mathvariant="bold">1</mml:mtext></mml:mrow></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mrow><mml:mi mathvariant="bold-italic">i</mml:mi><mml:mo>&#x22C5;</mml:mo><mml:mi mathvariant="bold-italic">B</mml:mi><mml:mo>+</mml:mo><mml:mrow><mml:mtext mathvariant="bold">2</mml:mtext></mml:mrow></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi mathvariant="bold-italic">i</mml:mi><mml:mo>+</mml:mo><mml:mrow><mml:mtext mathvariant="bold">1</mml:mtext></mml:mrow><mml:mo stretchy="false">)</mml:mo><mml:mi mathvariant="bold-italic">B</mml:mi></mml:mrow></mml:msub><mml:mo>}</mml:mo></mml:mrow></mml:math></disp-formula></p>
<p>Here, <inline-formula id="ieqn-51"><mml:math id="mml-ieqn-51"><mml:mi>B</mml:mi></mml:math></inline-formula> represents the number of samples included in <inline-formula id="ieqn-52"><mml:math id="mml-ieqn-52"><mml:mi>b</mml:mi><mml:mi>a</mml:mi><mml:mi>t</mml:mi><mml:mi>c</mml:mi><mml:msub><mml:mi>h</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>.</p>
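<p>The partition in <xref ref-type="disp-formula" rid="eqn-3">Eq. (3)</xref> can be sketched with row slicing. The snippet is a hedged illustration, not the authors' code: it assumes NumPy, a batch index that starts at 0 (so the 1-based samples in the equation map onto Python's 0-based slices), and a final batch that is simply shorter when N is not a multiple of B. The helper name <monospace>split_into_batches</monospace> is hypothetical.</p>
<code language="python"><![CDATA[
```python
import numpy as np


def split_into_batches(X, B):
    """Split the rows of X into consecutive batches of B samples each."""
    N = X.shape[0]
    M = -(-N // B)  # number of batches, i.e. ceil(N / B)
    # Batch i holds rows i*B ... (i+1)*B - 1 (0-based), matching Eq. (3).
    return [X[i * B:(i + 1) * B] for i in range(M)]


# Hypothetical data: N = 10 samples with D = 2 attributes, batch size B = 4.
X = np.arange(20).reshape(10, 2)
batches = split_into_batches(X, B=4)
```
]]></code>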
<p>The objective of centering is to set the mean of the data in each batch to zero. Assuming <inline-formula id="ieqn-53"><mml:math id="mml-ieqn-53"><mml:mi>b</mml:mi><mml:mi>a</mml:mi><mml:mi>t</mml:mi><mml:mi>c</mml:mi><mml:msub><mml:mi>h</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> contains <inline-formula id="ieqn-54"><mml:math id="mml-ieqn-54"><mml:mi>B</mml:mi></mml:math></inline-formula> samples with mean <inline-formula id="ieqn-55"><mml:math id="mml-ieqn-55"><mml:msub><mml:mi>&#x03BC;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>, centering is applied to each batch <inline-formula id="ieqn-56"><mml:math id="mml-ieqn-56"><mml:mi>b</mml:mi><mml:mi>a</mml:mi><mml:mi>t</mml:mi><mml:mi>c</mml:mi><mml:msub><mml:mi>h</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>, resulting in the centered batch <inline-formula id="ieqn-57"><mml:math id="mml-ieqn-57"><mml:msubsup><mml:mrow><mml:mtext mathvariant="italic">batch</mml:mtext></mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="italic">i</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="italic">centered</mml:mtext></mml:mrow></mml:mrow></mml:msubsup></mml:math></inline-formula>. This process can be written as follows:
<disp-formula id="eqn-4"><label>(4)</label><mml:math id="mml-eqn-4" display="block"><mml:msubsup><mml:mrow><mml:mi mathvariant="bold-italic">b</mml:mi><mml:mi mathvariant="bold-italic">a</mml:mi><mml:mi mathvariant="bold-italic">t</mml:mi><mml:mi mathvariant="bold-italic">c</mml:mi><mml:mi mathvariant="bold-italic">h</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">i</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">c</mml:mi><mml:mi mathvariant="bold-italic">e</mml:mi><mml:mi mathvariant="bold-italic">n</mml:mi><mml:mi mathvariant="bold-italic">t</mml:mi><mml:mi mathvariant="bold-italic">e</mml:mi><mml:mi mathvariant="bold-italic">r</mml:mi><mml:mi mathvariant="bold-italic">e</mml:mi><mml:mi mathvariant="bold-italic">d</mml:mi></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="bold-italic">b</mml:mi><mml:mi mathvariant="bold-italic">a</mml:mi><mml:mi mathvariant="bold-italic">t</mml:mi><mml:mi mathvariant="bold-italic">c</mml:mi><mml:mi mathvariant="bold-italic">h</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x2212;</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">&#x03BC;</mml:mi><mml:mrow><mml:mi mathvariant="bold-italic">i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mrow><mml:mi mathvariant="bold-italic">i</mml:mi><mml:mo>&#x22C5;</mml:mo><mml:mi mathvariant="bold-italic">B</mml:mi><mml:mo>+</mml:mo><mml:mrow><mml:mtext mathvariant="bold">1</mml:mtext></mml:mrow></mml:mrow></mml:msub><mml:mo>&#x2212;</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">&#x03BC;</mml:mi><mml:mrow><mml:mi mathvariant="bold-italic">i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mrow><mml:mi mathvariant="bold-italic">i</mml:mi><mml:mo>&#x22C5;</mml:mo><mml:mi 
mathvariant="bold-italic">B</mml:mi><mml:mo>+</mml:mo><mml:mrow><mml:mtext mathvariant="bold">2</mml:mtext></mml:mrow></mml:mrow></mml:msub><mml:mo>&#x2212;</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">&#x03BC;</mml:mi><mml:mrow><mml:mi mathvariant="bold-italic">i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi mathvariant="bold-italic">i</mml:mi><mml:mo>+</mml:mo><mml:mrow><mml:mtext mathvariant="bold">1</mml:mtext></mml:mrow><mml:mo stretchy="false">)</mml:mo><mml:mi mathvariant="bold-italic">B</mml:mi></mml:mrow></mml:msub><mml:mo>&#x2212;</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">&#x03BC;</mml:mi><mml:mrow><mml:mi mathvariant="bold-italic">i</mml:mi></mml:mrow></mml:msub><mml:mo>}</mml:mo></mml:mrow></mml:math></disp-formula></p>
<p>Here, <inline-formula id="ieqn-58"><mml:math id="mml-ieqn-58"><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>&#x22C5;</mml:mo><mml:mi>B</mml:mi><mml:mo>+</mml:mo><mml:mn>2</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> represents the second sample in <inline-formula id="ieqn-59"><mml:math id="mml-ieqn-59"><mml:mi>b</mml:mi><mml:mi>a</mml:mi><mml:mi>t</mml:mi><mml:mi>c</mml:mi><mml:msub><mml:mi>h</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>, and <inline-formula id="ieqn-60"><mml:math id="mml-ieqn-60"><mml:msub><mml:mi>&#x03BC;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> is the mean of <inline-formula id="ieqn-61"><mml:math id="mml-ieqn-61"><mml:mi>b</mml:mi><mml:mi>a</mml:mi><mml:mi>t</mml:mi><mml:mi>c</mml:mi><mml:msub><mml:mi>h</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>. Performing Singular Value Decomposition on each <inline-formula id="ieqn-62"><mml:math id="mml-ieqn-62"><mml:mi>b</mml:mi><mml:mi>a</mml:mi><mml:mi>t</mml:mi><mml:mi>c</mml:mi><mml:msubsup><mml:mi>h</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>c</mml:mi><mml:mi>e</mml:mi><mml:mi>n</mml:mi><mml:mi>t</mml:mi><mml:mi>e</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:mi>d</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> yields a separate set of low-rank matrices <inline-formula id="ieqn-63"><mml:math id="mml-ieqn-63"><mml:msub><mml:mrow><mml:mtext mathvariant="bold">u</mml:mtext></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mtext mathvariant="bold">s</mml:mtext></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mtext mathvariant="bold">v</mml:mtext></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> for each batch:
<disp-formula id="eqn-5"><label>(5)</label><mml:math id="mml-eqn-5" display="block"><mml:msub><mml:mi mathvariant="bold-italic">u</mml:mi><mml:mrow><mml:mi mathvariant="bold-italic">i</mml:mi><mml:mo>&#x223C;</mml:mo><mml:mi mathvariant="bold-italic">i</mml:mi><mml:mo>+</mml:mo><mml:mi mathvariant="bold-italic">r</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">s</mml:mi><mml:mrow><mml:mi mathvariant="bold-italic">i</mml:mi><mml:mo>&#x223C;</mml:mo><mml:mi mathvariant="bold-italic">i</mml:mi><mml:mo>+</mml:mo><mml:mi mathvariant="bold-italic">r</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">v</mml:mi><mml:mrow><mml:mi mathvariant="bold-italic">i</mml:mi><mml:mo>&#x223C;</mml:mo><mml:mi mathvariant="bold-italic">i</mml:mi><mml:mo>+</mml:mo><mml:mi mathvariant="bold-italic">r</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mrow><mml:mi mathvariant="bold-italic">S</mml:mi><mml:mi mathvariant="bold-italic">V</mml:mi><mml:mi mathvariant="bold-italic">D</mml:mi></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:msubsup><mml:mrow><mml:mi mathvariant="bold-italic">b</mml:mi><mml:mi mathvariant="bold-italic">a</mml:mi><mml:mi mathvariant="bold-italic">t</mml:mi><mml:mi mathvariant="bold-italic">c</mml:mi><mml:mi mathvariant="bold-italic">h</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">i</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">c</mml:mi><mml:mi mathvariant="bold-italic">e</mml:mi><mml:mi mathvariant="bold-italic">n</mml:mi><mml:mi mathvariant="bold-italic">t</mml:mi><mml:mi mathvariant="bold-italic">e</mml:mi><mml:mi mathvariant="bold-italic">r</mml:mi><mml:mi mathvariant="bold-italic">e</mml:mi><mml:mi mathvariant="bold-italic">d</mml:mi></mml:mrow></mml:msubsup><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula></p>
<p>Here, SVD represents the Singular Value Decomposition algorithm.</p>
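<p>The centering step in <xref ref-type="disp-formula" rid="eqn-4">Eq. (4)</xref> and the per-batch decomposition in <xref ref-type="disp-formula" rid="eqn-5">Eq. (5)</xref> can be sketched as follows. This is an illustration under assumptions, not the authors' code: it assumes NumPy and takes the batch mean to be the per-attribute mean vector over the samples in the batch. The helper name <monospace>center_and_decompose</monospace> and the random batch are hypothetical.</p>
<code language="python"><![CDATA[
```python
import numpy as np


def center_and_decompose(batch):
    """Center one batch (Eq. 4) and return its SVD factors (Eq. 5)."""
    mu = batch.mean(axis=0, keepdims=True)  # assumed: per-attribute mean of the batch
    centered = batch - mu
    # Thin SVD: u is (B, k), s is (k,), vt is (k, D) with k = min(B, D).
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    return centered, u, s, vt


# Hypothetical batch with B = 5 samples and D = 3 attributes.
rng = np.random.default_rng(0)
batch = rng.standard_normal((5, 3))
centered, u, s, vt = center_and_decompose(batch)
```
]]></code>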
<p>We concatenate all low-rank matrices <inline-formula id="ieqn-64"><mml:math id="mml-ieqn-64"><mml:msub><mml:mrow><mml:mtext mathvariant="bold">u</mml:mtext></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mtext mathvariant="bold">s</mml:mtext></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mtext mathvariant="bold">v</mml:mtext></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> from each <inline-formula id="ieqn-65"><mml:math id="mml-ieqn-65"><mml:mi>b</mml:mi><mml:mi>a</mml:mi><mml:mi>t</mml:mi><mml:mi>c</mml:mi><mml:msubsup><mml:mi>h</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>c</mml:mi><mml:mi>e</mml:mi><mml:mi>n</mml:mi><mml:mi>t</mml:mi><mml:mi>e</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:mi>d</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> to form matrices <inline-formula id="ieqn-66"><mml:math id="mml-ieqn-66"><mml:msub><mml:mrow><mml:mtext mathvariant="bold">U</mml:mtext></mml:mrow><mml:mrow><mml:mi>r</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mtext mathvariant="bold">S</mml:mtext></mml:mrow><mml:mrow><mml:mi>r</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mtext mathvariant="bold">V</mml:mtext></mml:mrow><mml:mrow><mml:mi>r</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>:
<disp-formula id="ueqn-6"><mml:math id="mml-ueqn-6" display="block"><mml:msub><mml:mrow><mml:mtext mathvariant="bold">U</mml:mtext></mml:mrow><mml:mrow><mml:mi>r</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>&#x2217;</mml:mo><mml:mi>r</mml:mi></mml:mrow></mml:msub><mml:mo>}</mml:mo></mml:mrow></mml:math></disp-formula>
<disp-formula id="eqn-6"><label>(6)</label><mml:math id="mml-eqn-6" display="block"><mml:msub><mml:mrow><mml:mtext mathvariant="bold">S</mml:mtext></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">r</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">s</mml:mi><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">1</mml:mtext></mml:mrow></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">s</mml:mi><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">2</mml:mtext></mml:mrow></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">s</mml:mi><mml:mrow><mml:mi mathvariant="bold-italic">i</mml:mi><mml:mo mathvariant="bold">&#x2217;</mml:mo><mml:mi mathvariant="bold-italic">r</mml:mi></mml:mrow></mml:msub><mml:mo>}</mml:mo></mml:mrow></mml:math></disp-formula>
<disp-formula id="ueqn-8"><mml:math id="mml-ueqn-8" display="block"><mml:msub><mml:mrow><mml:mtext mathvariant="bold">V</mml:mtext></mml:mrow><mml:mrow><mml:mi>r</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:msub><mml:mi>v</mml:mi><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>v</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mi>v</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>&#x2217;</mml:mo><mml:mi>r</mml:mi></mml:mrow></mml:msub><mml:mo>}</mml:mo></mml:mrow></mml:math></disp-formula></p>
<p>We sort the singular values in <inline-formula id="ieqn-67"><mml:math id="mml-ieqn-67"><mml:msub><mml:mrow><mml:mtext mathvariant="bold">S</mml:mtext></mml:mrow><mml:mrow><mml:mi>r</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> in descending order and select the top <inline-formula id="ieqn-68"><mml:math id="mml-ieqn-68"><mml:mi>r</mml:mi></mml:math></inline-formula> values. That is, the singular values collected from all batches are ranked jointly, and only the <inline-formula id="ieqn-69"><mml:math id="mml-ieqn-69"><mml:mi>r</mml:mi></mml:math></inline-formula> largest are kept. The purpose is to retain the most significant singular values, so that the primary information of the attribute matrix <inline-formula id="ieqn-70"><mml:math id="mml-ieqn-70"><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow></mml:math></inline-formula> is represented by a reduced number of singular values and vectors. This selection can be written as follows:
<disp-formula id="eqn-7"><label>(7)</label><mml:math id="mml-eqn-7" display="block"><mml:msub><mml:mrow><mml:mtext mathvariant="bold">S</mml:mtext></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">l</mml:mi><mml:mi mathvariant="bold-italic">o</mml:mi><mml:mi mathvariant="bold-italic">w</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="bold-italic">d</mml:mi><mml:mi mathvariant="bold-italic">e</mml:mi><mml:mi mathvariant="bold-italic">s</mml:mi><mml:mi mathvariant="bold-italic">c</mml:mi><mml:mi mathvariant="bold-italic">e</mml:mi><mml:mi mathvariant="bold-italic">n</mml:mi><mml:mi mathvariant="bold-italic">d</mml:mi><mml:mi mathvariant="bold-italic">i</mml:mi><mml:mi mathvariant="bold-italic">n</mml:mi><mml:mi mathvariant="bold-italic">g</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">s</mml:mi><mml:mi mathvariant="bold-italic">o</mml:mi><mml:mi mathvariant="bold-italic">r</mml:mi><mml:mi mathvariant="bold-italic">t</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mrow><mml:mtext mathvariant="bold">S</mml:mtext></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">r</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula>
<disp-formula id="ueqn-10"><mml:math id="mml-ueqn-10" display="block"><mml:msub><mml:mrow><mml:mtext mathvariant="bold">S</mml:mtext></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mi>o</mml:mi><mml:mi>w</mml:mi><mml:mi mathvariant="normal">&#x005F;</mml:mi><mml:mi>r</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mi>T</mml:mi><mml:mi>O</mml:mi><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mrow><mml:mtext mathvariant="bold">S</mml:mtext></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mi>o</mml:mi><mml:mi>w</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mi>r</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula></p>
<p>Next, we obtain the vectors of <inline-formula id="ieqn-71"><mml:math id="mml-ieqn-71"><mml:msub><mml:mrow><mml:mtext mathvariant="bold">V</mml:mtext></mml:mrow><mml:mrow><mml:mi>r</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> that correspond to the indices of the selected singular values <inline-formula id="ieqn-72"><mml:math id="mml-ieqn-72"><mml:msub><mml:mrow><mml:mtext mathvariant="bold">S</mml:mtext></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mi>o</mml:mi><mml:mi>w</mml:mi><mml:mi mathvariant="normal">&#x005F;</mml:mi><mml:mi>r</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and form <inline-formula id="ieqn-73"><mml:math id="mml-ieqn-73"><mml:msub><mml:mrow><mml:mtext mathvariant="bold">V</mml:mtext></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mi>o</mml:mi><mml:mi>w</mml:mi><mml:mi mathvariant="normal">&#x005F;</mml:mi><mml:mi>r</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> according to <xref ref-type="disp-formula" rid="eqn-8">Eq. (8)</xref>:
<disp-formula id="eqn-8"><label>(8)</label><mml:math id="mml-eqn-8" display="block"><mml:msub><mml:mrow><mml:mtext mathvariant="bold">V</mml:mtext></mml:mrow><mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">l</mml:mi><mml:mi mathvariant="bold-italic">o</mml:mi><mml:mi mathvariant="bold-italic">w</mml:mi></mml:mrow><mml:mi mathvariant="normal">&#x005F;</mml:mi><mml:mi mathvariant="bold-italic">r</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mi mathvariant="bold-italic">f</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mtable columnalign="left left" rowspacing=".2em" columnspacing="1em" displaystyle="false"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mtext mathvariant="bold">v</mml:mtext></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo></mml:mtd><mml:mtd><mml:msub><mml:mrow><mml:mtext mathvariant="bold">S</mml:mtext></mml:mrow><mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">l</mml:mi><mml:mi mathvariant="bold-italic">o</mml:mi><mml:mi mathvariant="bold-italic">w</mml:mi></mml:mrow><mml:mi mathvariant="normal">&#x005F;</mml:mi><mml:mi mathvariant="bold-italic">r</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="bold-italic">i</mml:mi></mml:mrow></mml:msub><mml:mtext>&#x00A0;</mml:mtext><mml:mrow><mml:mi mathvariant="bold-italic">m</mml:mi><mml:mi mathvariant="bold-italic">a</mml:mi><mml:mi mathvariant="bold-italic">t</mml:mi><mml:mi mathvariant="bold-italic">c</mml:mi><mml:mi mathvariant="bold-italic">h</mml:mi></mml:mrow><mml:mtext>&#x00A0;</mml:mtext><mml:msub><mml:mrow><mml:mtext mathvariant="bold">v</mml:mtext></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">i</mml:mi></mml:mrow></mml:msub></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mn>0</mml:mn><mml:mo>,</mml:mo></mml:mtd><mml:mtd><mml:mi mathvariant="bold-italic">o</mml:mi><mml:mi mathvariant="bold-italic">t</mml:mi><mml:mi 
mathvariant="bold-italic">h</mml:mi><mml:mi mathvariant="bold-italic">e</mml:mi><mml:mi mathvariant="bold-italic">r</mml:mi><mml:mi mathvariant="bold-italic">s</mml:mi></mml:mtd></mml:mtr></mml:mtable><mml:mo fence="true" stretchy="true" symmetric="true"></mml:mo></mml:mrow></mml:math></disp-formula></p>
<p>Finally, we reconstruct the low-rank attribute matrix <inline-formula id="ieqn-74"><mml:math id="mml-ieqn-74"><mml:msub><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow><mml:mrow><mml:mi>r</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> using <inline-formula id="ieqn-75"><mml:math id="mml-ieqn-75"><mml:msub><mml:mrow><mml:mtext mathvariant="bold">U</mml:mtext></mml:mrow><mml:mrow><mml:mi>r</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>, <inline-formula id="ieqn-76"><mml:math id="mml-ieqn-76"><mml:msub><mml:mrow><mml:mtext mathvariant="bold">S</mml:mtext></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mi>o</mml:mi><mml:mi>w</mml:mi><mml:mi mathvariant="normal">&#x005F;</mml:mi><mml:mi>r</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>, and <inline-formula id="ieqn-77"><mml:math id="mml-ieqn-77"><mml:msub><mml:mrow><mml:mtext mathvariant="bold">V</mml:mtext></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mi>o</mml:mi><mml:mi>w</mml:mi><mml:mi mathvariant="normal">&#x005F;</mml:mi><mml:mi>r</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> according to <xref ref-type="disp-formula" rid="eqn-9">Eq. (9)</xref>:
<disp-formula id="eqn-9"><label>(9)</label><mml:math id="mml-eqn-9" display="block"><mml:msub><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">r</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mtext mathvariant="bold">U</mml:mtext></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">r</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mtext mathvariant="bold">S</mml:mtext></mml:mrow><mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">l</mml:mi><mml:mi mathvariant="bold-italic">o</mml:mi><mml:mi mathvariant="bold-italic">w</mml:mi></mml:mrow><mml:mi mathvariant="normal">&#x005F;</mml:mi><mml:mi mathvariant="bold-italic">r</mml:mi></mml:mrow></mml:msub><mml:msubsup><mml:mrow><mml:mtext mathvariant="bold">V</mml:mtext></mml:mrow><mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">l</mml:mi><mml:mi mathvariant="bold-italic">o</mml:mi><mml:mi mathvariant="bold-italic">w</mml:mi></mml:mrow><mml:mi mathvariant="normal">&#x005F;</mml:mi><mml:mi mathvariant="bold-italic">r</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">T</mml:mi></mml:mrow></mml:msubsup></mml:math></disp-formula></p>
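<p>The selection and reconstruction steps in Eqs. (6) to (9) leave some room for interpretation. The sketch below shows one possible reading under stated assumptions and should not be taken as the authors' implementation. It assumes NumPy, pools the singular values of all centered batches, keeps the r largest, zeroes the components that were not selected, and rebuilds each batch's rows from its own surviving components. The batch means are not added back, because the text does not specify that step. The function name <monospace>batch_low_rank</monospace> and the test data are hypothetical.</p>
<code language="python"><![CDATA[
```python
import numpy as np


def batch_low_rank(X, B, r):
    """Batch-wise SVD with a global top-r selection of singular values."""
    N, D = X.shape
    parts = []  # one (row offset, u_i, s_i, vt_i) tuple per batch
    for start in range(0, N, B):
        batch = X[start:start + B]
        centered = batch - batch.mean(axis=0, keepdims=True)
        u, s, vt = np.linalg.svd(centered, full_matrices=False)
        parts.append((start, u, s, vt))

    # Pool the singular values of all batches and mark the r largest.
    pooled = np.concatenate([s for _, _, s, _ in parts])
    order = np.argsort(pooled)[::-1]
    keep = np.zeros(pooled.size, dtype=bool)
    keep[order[:r]] = True

    # Zero the unselected components and rebuild each batch's rows.
    X_r = np.zeros((N, D))
    offset = 0
    for start, u, s, vt in parts:
        mask = keep[offset:offset + s.size]
        offset += s.size
        X_r[start:start + u.shape[0]] = (u * (s * mask)) @ vt
    return X_r, int(keep.sum())


# Hypothetical data: N = 10 samples, D = 5 attributes, batches of B = 4.
rng = np.random.default_rng(0)
X = rng.standard_normal((10, 5))
X_r, kept = batch_low_rank(X, B=4, r=3)
```
]]></code>
<p>Under this reading, the rank of the reconstructed matrix is bounded by r, and choosing r at least as large as the total number of pooled singular values returns the batch-centered data unchanged.</p>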
</sec>
<sec id="s3_3">
<label>3.3</label>
<title>Pseudo-Siamese Neural Network</title>
<p>In this section, we embed the nodes of both the original and enhanced data into a latent space using a pseudo-Siamese neural network whose two encoders share the same architecture but do not share learnable parameters.</p>
<p>Residual connections allow information to propagate between network layers, aiding in mitigating the vanishing gradient problem, accelerating the training process, and enhancing model performance. In this work, the representations learned by the <inline-formula id="ieqn-78"><mml:math id="mml-ieqn-78"><mml:mi>l</mml:mi></mml:math></inline-formula>-th layer of GCNs can be obtained through the following convolution operation:
<disp-formula id="eqn-10"><label>(10)</label><mml:math id="mml-eqn-10" display="block"><mml:msup><mml:mrow><mml:mtext mathvariant="bold">Z</mml:mtext></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">l</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:mi mathvariant="bold-italic">&#x2205;</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mover><mml:mrow><mml:mtext mathvariant="bold">A</mml:mtext></mml:mrow><mml:mo accent="false">&#x00AF;</mml:mo></mml:mover><mml:msup><mml:mrow><mml:mtext mathvariant="bold">Z</mml:mtext></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">l</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mrow><mml:mtext mathvariant="bold">1</mml:mtext></mml:mrow></mml:mrow></mml:msup><mml:msubsup><mml:mrow><mml:mtext mathvariant="bold">W</mml:mtext></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">v</mml:mi><mml:mrow><mml:mtext mathvariant="bold">1</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">l</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mrow><mml:mtext mathvariant="bold">1</mml:mtext></mml:mrow></mml:mrow></mml:msubsup><mml:mo>+</mml:mo><mml:msup><mml:mrow><mml:mtext mathvariant="bold">Z</mml:mtext></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">l</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mrow><mml:mtext mathvariant="bold">1</mml:mtext></mml:mrow></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula>
<disp-formula id="ueqn-14"><mml:math id="mml-ueqn-14" display="block"><mml:msubsup><mml:mrow><mml:mtext mathvariant="bold">Z</mml:mtext></mml:mrow><mml:mrow><mml:mi>r</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mi>&#x2205;</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mover><mml:mrow><mml:mtext mathvariant="bold">A</mml:mtext></mml:mrow><mml:mo accent="false">&#x00AF;</mml:mo></mml:mover><mml:msubsup><mml:mrow><mml:mtext mathvariant="bold">Z</mml:mtext></mml:mrow><mml:mrow><mml:mi>r</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup><mml:msubsup><mml:mrow><mml:mtext mathvariant="bold">W</mml:mtext></mml:mrow><mml:mrow><mml:mi>v</mml:mi><mml:mn>2</mml:mn></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup><mml:mo>+</mml:mo><mml:msubsup><mml:mrow><mml:mtext mathvariant="bold">Z</mml:mtext></mml:mrow><mml:mrow><mml:mi>r</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula></p>
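<p>A residual graph-convolution layer of the form in <xref ref-type="disp-formula" rid="eqn-10">Eq. (10)</xref> can be sketched as follows. The snippet is illustrative and rests on assumptions: it uses NumPy, takes ReLU as the activation, and builds the normalized adjacency matrix with the common symmetric normalization with self-loops, which may differ from the normalization used in this work. The weight matrices must be square so that the skip term can be added. All function names, the toy graph, and the perturbed second view are hypothetical.</p>
<code language="python"><![CDATA[
```python
import numpy as np


def normalize_adjacency(A):
    """Symmetric normalization with self-loops (an assumed choice)."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def residual_gcn_layer(A_bar, Z, W):
    """One layer: ReLU(A_bar @ Z @ W + Z). W is square so shapes match."""
    return np.maximum(A_bar @ Z @ W + Z, 0.0)


# Toy path graph on 3 nodes with 4 attributes per node.
rng = np.random.default_rng(0)
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
A_bar = normalize_adjacency(A)
X = rng.standard_normal((3, 4))                  # original view
X_r = X + 0.01 * rng.standard_normal((3, 4))     # stand-in for the enhanced view

# Pseudo-Siamese setup: same layer function, separate weights per view.
W_v1 = 0.1 * rng.standard_normal((4, 4))
W_v2 = 0.1 * rng.standard_normal((4, 4))
Z = residual_gcn_layer(A_bar, X, W_v1)
Z_r = residual_gcn_layer(A_bar, X_r, W_v2)
```
]]></code>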
<p>Here, <inline-formula id="ieqn-79"><mml:math id="mml-ieqn-79"><mml:mi>&#x2205;</mml:mi></mml:math></inline-formula> is the activation function of the fully connected layer, such as ReLU [<xref ref-type="bibr" rid="ref-29">29</xref>] or the Sigmoid function. <inline-formula id="ieqn-80"><mml:math id="mml-ieqn-80"><mml:msup><mml:mrow><mml:mtext mathvariant="bold">Z</mml:mtext></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula> and <inline-formula id="ieqn-81"><mml:math id="mml-ieqn-81"><mml:msubsup><mml:mrow><mml:mtext mathvariant="bold">Z</mml:mtext></mml:mrow><mml:mrow><mml:mi>r</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup></mml:math></inline-formula> represent the embeddings of two views, and <inline-formula id="ieqn-82"><mml:math id="mml-ieqn-82"><mml:msubsup><mml:mrow><mml:mtext mathvariant="bold">W</mml:mtext></mml:mrow><mml:mrow><mml:mi>v</mml:mi><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup></mml:math></inline-formula> and <inline-formula id="ieqn-83"><mml:math id="mml-ieqn-83"><mml:msubsup><mml:mrow><mml:mtext mathvariant="bold">W</mml:mtext></mml:mrow><mml:mrow><mml:mi>v</mml:mi><mml:mn>2</mml:mn></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup></mml:math></inline-formula> are the weight matrices corresponding to the encoder&#x2019;s <inline-formula id="ieqn-84"><mml:math id="mml-ieqn-84"><mml:mi>l</mml:mi></mml:math></inline-formula>-th layer. 
Additionally, we denote <inline-formula id="ieqn-85"><mml:math id="mml-ieqn-85"><mml:msup><mml:mrow><mml:mtext mathvariant="bold">Z</mml:mtext></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mn>0</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup></mml:math></inline-formula> as the original data <inline-formula id="ieqn-86"><mml:math id="mml-ieqn-86"><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow></mml:math></inline-formula> and <inline-formula id="ieqn-87"><mml:math id="mml-ieqn-87"><mml:msubsup><mml:mrow><mml:mtext mathvariant="bold">Z</mml:mtext></mml:mrow><mml:mrow><mml:mi>r</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mn>0</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup></mml:math></inline-formula> as the enhanced data <inline-formula id="ieqn-88"><mml:math id="mml-ieqn-88"><mml:msub><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow><mml:mrow><mml:mi>r</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>. As shown in <xref ref-type="disp-formula" rid="eqn-10">Eq. 
(10)</xref>, the representations <inline-formula id="ieqn-89"><mml:math id="mml-ieqn-89"><mml:msup><mml:mrow><mml:mtext mathvariant="bold">Z</mml:mtext></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula> and <inline-formula id="ieqn-90"><mml:math id="mml-ieqn-90"><mml:msubsup><mml:mrow><mml:mtext mathvariant="bold">Z</mml:mtext></mml:mrow><mml:mrow><mml:mi>r</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup></mml:math></inline-formula> will traverse the normalized adjacency matrix <inline-formula id="ieqn-91"><mml:math id="mml-ieqn-91"><mml:mover><mml:mrow><mml:mtext mathvariant="bold">A</mml:mtext></mml:mrow><mml:mo accent="false">&#x00AF;</mml:mo></mml:mover></mml:math></inline-formula> to obtain new representations <inline-formula id="ieqn-92"><mml:math id="mml-ieqn-92"><mml:msup><mml:mrow><mml:mtext mathvariant="bold">Z</mml:mtext></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup></mml:math></inline-formula> and <inline-formula id="ieqn-93"><mml:math id="mml-ieqn-93"><mml:msubsup><mml:mrow><mml:mtext mathvariant="bold">Z</mml:mtext></mml:mrow><mml:mrow><mml:mi>r</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup></mml:math></inline-formula>. It is important to note that the input to the first layer of GCNs consists of the original data <inline-formula id="ieqn-94"><mml:math id="mml-ieqn-94"><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow></mml:math></inline-formula> and the enhanced data <inline-formula id="ieqn-95"><mml:math id="mml-ieqn-95"><mml:msub><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow><mml:mrow><mml:mi>r</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>:
<disp-formula id="eqn-11"><label>(11)</label><mml:math id="mml-eqn-11" display="block"><mml:msup><mml:mrow><mml:mtext mathvariant="bold">Z</mml:mtext></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mtext mathvariant="bold">1</mml:mtext></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:mi mathvariant="bold-italic">&#x2205;</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mover><mml:mrow><mml:mtext mathvariant="bold">A</mml:mtext></mml:mrow><mml:mo accent="false">&#x00AF;</mml:mo></mml:mover><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow><mml:msubsup><mml:mrow><mml:mtext mathvariant="bold">W</mml:mtext></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">v</mml:mi><mml:mrow><mml:mtext mathvariant="bold">1</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">1</mml:mtext></mml:mrow></mml:mrow></mml:msubsup><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula>
<disp-formula id="ueqn-16"><mml:math id="mml-ueqn-16" display="block"><mml:msubsup><mml:mrow><mml:mtext mathvariant="bold">Z</mml:mtext></mml:mrow><mml:mrow><mml:mi>r</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mn>1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mi>&#x2205;</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mover><mml:mrow><mml:mtext mathvariant="bold">A</mml:mtext></mml:mrow><mml:mo accent="false">&#x00AF;</mml:mo></mml:mover><mml:msub><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow><mml:mrow><mml:mi>r</mml:mi></mml:mrow></mml:msub><mml:msubsup><mml:mrow><mml:mtext mathvariant="bold">W</mml:mtext></mml:mrow><mml:mrow><mml:mi>v</mml:mi><mml:mn>2</mml:mn></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msubsup><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula></p>
<p>We denote the embeddings output by the last layer of GCNs as <inline-formula id="ieqn-96"><mml:math id="mml-ieqn-96"><mml:msup><mml:mrow><mml:mtext mathvariant="bold">Z</mml:mtext></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>L</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup></mml:math></inline-formula> and <inline-formula id="ieqn-97"><mml:math id="mml-ieqn-97"><mml:msubsup><mml:mrow><mml:mtext mathvariant="bold">Z</mml:mtext></mml:mrow><mml:mrow><mml:mi>r</mml:mi></mml:mrow><mml:mrow><mml:mi>L</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula>.</p>
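<p>The propagation rule above can be illustrated with a minimal sketch. It assumes the residual form visible in the update for the augmented view, drops the residual term for the first layer, and uses ReLU as the activation. The helper names (<monospace>matmul</monospace>, <monospace>gcn_layer</monospace>) and the toy matrices are hypothetical and are not taken from the authors&#x2019; implementation.</p>
<preformat>
```python
# Illustrative sketch only; plain Python lists stand in for tensors.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def relu(m):
    """Element-wise ReLU."""
    return [[max(0.0, v) for v in row] for row in m]


def gcn_layer(a_norm, z, w, residual=True):
    """One propagation step: relu(A_norm Z W + Z), or relu(A_norm Z W) without the skip."""
    h = matmul(matmul(a_norm, z), w)
    if residual:
        # The skip connection needs W to be square so that h and z have the same shape.
        h = [[hv + zv for hv, zv in zip(hr, zr)] for hr, zr in zip(h, z)]
    return relu(h)


# Toy graph with two nodes; a_norm is an assumed, already normalized adjacency matrix.
a_norm = [[0.5, 0.5], [0.5, 0.5]]
x = [[1.0, 0.0], [0.0, 1.0]]
w1 = [[1.0, -1.0], [0.0, 2.0]]
w2 = [[1.0, 0.0], [0.0, 1.0]]

z1 = gcn_layer(a_norm, x, w1, residual=False)  # first layer takes X directly
z2 = gcn_layer(a_norm, z1, w2, residual=True)  # deeper layer adds the skip term
```
</preformat>
<p>The skip connection is only defined when the weight matrix is square, which is why the sketch disables it for the first layer, where the input and hidden dimensions may differ.</p>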
</sec>
<sec id="s3_4">
<label>3.4</label>
<title>Mutual Information-Enhanced Clustering Module</title>
<p>To minimize redundancy in embeddings and effectively preserve more discriminative features, we initially optimize the embeddings between cross-view samples using Mean Squared Error (MSE) loss. This ensures that the learned representations are not influenced by irrelevant information, thereby guaranteeing the quality of the latent space for subsequent clustering tasks. The formula is as follows:
<disp-formula id="eqn-12"><label>(12)</label><mml:math id="mml-eqn-12" display="block"><mml:msub><mml:mi mathvariant="bold-italic">L</mml:mi><mml:mrow><mml:mi mathvariant="bold-italic">M</mml:mi><mml:mi mathvariant="bold-italic">S</mml:mi><mml:mi mathvariant="bold-italic">E</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mtext mathvariant="bold">1</mml:mtext></mml:mrow><mml:mi mathvariant="bold-italic">n</mml:mi></mml:mfrac><mml:msubsup><mml:mo movablelimits="false">&#x2211;</mml:mo><mml:mrow><mml:mi mathvariant="bold-italic">i</mml:mi><mml:mo>=</mml:mo><mml:mrow><mml:mtext mathvariant="bold">1</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">n</mml:mi></mml:mrow></mml:msubsup><mml:msup><mml:mrow><mml:mo>(</mml:mo><mml:msup><mml:mrow><mml:mtext mathvariant="bold">Z</mml:mtext></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">L</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup><mml:mo>&#x2212;</mml:mo><mml:msubsup><mml:mrow><mml:mtext mathvariant="bold">Z</mml:mtext></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">r</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">L</mml:mi></mml:mrow></mml:msubsup><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">2</mml:mtext></mml:mrow></mml:mrow></mml:msup></mml:math></disp-formula></p>
<p>Here, <inline-formula id="ieqn-98"><mml:math id="mml-ieqn-98"><mml:msup><mml:mrow><mml:mtext mathvariant="bold">Z</mml:mtext></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>L</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup></mml:math></inline-formula> and <inline-formula id="ieqn-99"><mml:math id="mml-ieqn-99"><mml:msubsup><mml:mrow><mml:mtext mathvariant="bold">Z</mml:mtext></mml:mrow><mml:mrow><mml:mi>r</mml:mi></mml:mrow><mml:mrow><mml:mi>L</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> represent the embeddings of the last layer of the graph convolutional network.</p>
<p>Next, we merge the embeddings from the two views for each node as follows:
<disp-formula id="eqn-13"><label>(13)</label><mml:math id="mml-eqn-13" display="block"><mml:mrow><mml:mtext mathvariant="bold">Z</mml:mtext></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mtext mathvariant="bold">1</mml:mtext></mml:mrow><mml:mrow><mml:mtext mathvariant="bold">2</mml:mtext></mml:mrow></mml:mfrac><mml:mrow><mml:mo>(</mml:mo><mml:msup><mml:mrow><mml:mtext mathvariant="bold">Z</mml:mtext></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">L</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup><mml:mo>+</mml:mo><mml:msubsup><mml:mrow><mml:mtext mathvariant="bold">Z</mml:mtext></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">r</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">L</mml:mi></mml:mrow></mml:msubsup><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula></p>
<p>Here, <inline-formula id="ieqn-100"><mml:math id="mml-ieqn-100"><mml:mrow><mml:mtext mathvariant="bold">Z</mml:mtext></mml:mrow><mml:mo>&#x2208;</mml:mo><mml:msup><mml:mrow><mml:mtext>R</mml:mtext></mml:mrow><mml:mrow><mml:mi>N</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>d</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> represents the node embeddings for clustering.</p>
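<p>As a rough illustration of the cross-view consistency loss and the view fusion described above, the following sketch reads the square in the MSE loss as the squared Euclidean distance between the two embeddings of each node, averaged over the nodes. This reading, the function names, and the toy values are assumptions made for the example.</p>
<preformat>
```python
def mse_loss(z_a, z_b):
    """Mean over nodes of the squared Euclidean distance between paired rows."""
    n = len(z_a)
    total = 0.0
    for row_a, row_b in zip(z_a, z_b):
        total += sum((u - v) ** 2 for u, v in zip(row_a, row_b))
    return total / n


def fuse(z_a, z_b):
    """Element-wise average of the two views."""
    return [[0.5 * (u + v) for u, v in zip(ra, rb)] for ra, rb in zip(z_a, z_b)]


# Toy last-layer embeddings of the original and augmented views (made-up values).
z_l = [[1.0, 2.0], [3.0, 4.0]]
z_r = [[1.0, 0.0], [1.0, 4.0]]

loss_mse = mse_loss(z_l, z_r)
z = fuse(z_l, z_r)
```
</preformat>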
<p>Finally, by inputting <inline-formula id="ieqn-101"><mml:math id="mml-ieqn-101"><mml:mrow><mml:mtext mathvariant="bold">Z</mml:mtext></mml:mrow></mml:math></inline-formula> into a one-layer Multi-Layer Perceptron (MLP) with a softmax activation function, we convert it into a K-dimensional clustering space, where <inline-formula id="ieqn-102"><mml:math id="mml-ieqn-102"><mml:mi>K</mml:mi></mml:math></inline-formula> represents the number of clusters. The learning process can be expressed as:
<disp-formula id="eqn-14"><label>(14)</label><mml:math id="mml-eqn-14" display="block"><mml:mrow><mml:mtext mathvariant="bold">Q</mml:mtext></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mi mathvariant="bold-italic">s</mml:mi><mml:mi mathvariant="bold-italic">o</mml:mi><mml:mi mathvariant="bold-italic">f</mml:mi><mml:mi mathvariant="bold-italic">t</mml:mi><mml:mi mathvariant="bold-italic">m</mml:mi><mml:mi mathvariant="bold-italic">a</mml:mi><mml:mi mathvariant="bold-italic">x</mml:mi></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mtext mathvariant="bold">Z</mml:mtext></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula>where <inline-formula id="ieqn-103"><mml:math id="mml-ieqn-103"><mml:mrow><mml:mtext mathvariant="bold">Q</mml:mtext></mml:mrow><mml:mo>&#x2208;</mml:mo><mml:msup><mml:mrow><mml:mtext>R</mml:mtext></mml:mrow><mml:mrow><mml:mi>N</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>K</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> represents the probability matrix indicating the probability of all <inline-formula id="ieqn-104"><mml:math id="mml-ieqn-104"><mml:mrow><mml:mtext>N</mml:mtext></mml:mrow></mml:math></inline-formula> nodes belonging to <inline-formula id="ieqn-105"><mml:math id="mml-ieqn-105"><mml:mi>K</mml:mi></mml:math></inline-formula> clusters. We can view <inline-formula id="ieqn-106"><mml:math id="mml-ieqn-106"><mml:mrow><mml:mtext mathvariant="bold">Q</mml:mtext></mml:mrow></mml:math></inline-formula> as a probability distribution.</p>
<p>Following the acquisition of the clustering probability distribution <inline-formula id="ieqn-107"><mml:math id="mml-ieqn-107"><mml:mrow><mml:mtext mathvariant="bold">Q</mml:mtext></mml:mrow></mml:math></inline-formula>, we refine the data representation by focusing on learning high-confidence data, aiming to strengthen the cohesion within the clustering. Specifically, we aim to reinforce the intra-cluster cohesion by emphasizing representations that are closer to the cluster centers. Therefore, we compute the target distribution <inline-formula id="ieqn-108"><mml:math id="mml-ieqn-108"><mml:mrow><mml:mtext>P</mml:mtext></mml:mrow></mml:math></inline-formula> as follows:
<disp-formula id="eqn-15"><label>(15)</label><mml:math id="mml-eqn-15" display="block"><mml:msub><mml:mrow><mml:mtext mathvariant="bold">P</mml:mtext></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">i</mml:mi><mml:mi mathvariant="bold-italic">j</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:msubsup><mml:mrow><mml:mtext mathvariant="bold">Q</mml:mtext></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">i</mml:mi><mml:mi mathvariant="bold-italic">j</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">2</mml:mtext></mml:mrow></mml:mrow></mml:msubsup><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">f</mml:mi><mml:mrow><mml:mi mathvariant="bold-italic">j</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:munder><mml:mo>&#x2211;</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">j</mml:mi><mml:mrow><mml:mi mathvariant="bold">&#x2032;</mml:mi></mml:mrow></mml:msup></mml:munder><mml:msubsup><mml:mrow><mml:mtext mathvariant="bold">Q</mml:mtext></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">i</mml:mi><mml:msup><mml:mi mathvariant="bold-italic">j</mml:mi><mml:mrow><mml:mi mathvariant="bold">&#x2032;</mml:mi></mml:mrow></mml:msup></mml:mrow><mml:mrow><mml:mtext mathvariant="bold">2</mml:mtext></mml:mrow></mml:msubsup><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">f</mml:mi><mml:msup><mml:mi mathvariant="bold-italic">j</mml:mi><mml:mrow><mml:mi mathvariant="bold">&#x2032;</mml:mi></mml:mrow></mml:msup></mml:msub></mml:mrow></mml:mfrac></mml:math></disp-formula></p>
<p>Here, <inline-formula id="ieqn-109"><mml:math id="mml-ieqn-109"><mml:msub><mml:mi>f</mml:mi><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:munder><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:munder><mml:msub><mml:mrow><mml:mtext mathvariant="bold">Q</mml:mtext></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> is the soft clustering frequency. In the target distribution <inline-formula id="ieqn-110"><mml:math id="mml-ieqn-110"><mml:mrow><mml:mtext mathvariant="bold">P</mml:mtext></mml:mrow></mml:math></inline-formula>, each assignment in <inline-formula id="ieqn-111"><mml:math id="mml-ieqn-111"><mml:mrow><mml:mtext mathvariant="bold">Q</mml:mtext></mml:mrow></mml:math></inline-formula> is squared and normalized to give higher confidence to the assignments. This leads to the following objective function:
<disp-formula id="eqn-16"><label>(16)</label><mml:math id="mml-eqn-16" display="block"><mml:msub><mml:mi mathvariant="bold-italic">L</mml:mi><mml:mrow><mml:mi mathvariant="bold-italic">K</mml:mi><mml:mi mathvariant="bold-italic">L</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mtext mathvariant="bold">P</mml:mtext></mml:mrow><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mrow><mml:mtext mathvariant="bold">Q</mml:mtext></mml:mrow><mml:mo stretchy="false">)</mml:mo><mml:mo>=</mml:mo><mml:msub><mml:mo movablelimits="false">&#x2211;</mml:mo><mml:mrow><mml:mi mathvariant="bold-italic">x</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mtext mathvariant="bold">P</mml:mtext></mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo stretchy="false">)</mml:mo><mml:mi mathvariant="bold">log</mml:mi><mml:mfrac><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">P</mml:mtext></mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">Q</mml:mtext></mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mfrac></mml:math></disp-formula></p>
<p>Minimizing the KL divergence loss between the <inline-formula id="ieqn-112"><mml:math id="mml-ieqn-112"><mml:mrow><mml:mtext mathvariant="bold">Q</mml:mtext></mml:mrow></mml:math></inline-formula> and <inline-formula id="ieqn-113"><mml:math id="mml-ieqn-113"><mml:mrow><mml:mtext mathvariant="bold">P</mml:mtext></mml:mrow></mml:math></inline-formula> distributions enables the GCNs module to enhance its ability to learn representations for clustering tasks. This ensures that the data representations around cluster centers become more compact.</p>
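<p>The self-training step can be sketched as follows, under stated assumptions: the soft assignments are obtained by a row-wise softmax over hypothetical logits (standing in for the output of the one-layer MLP), the target distribution squares each assignment and divides it by the soft cluster frequency before renormalizing each row, and the loss uses the standard definition of the KL divergence, in which each log-ratio is weighted by the target probability. All names and values are illustrative.</p>
<preformat>
```python
import math


def softmax_rows(m):
    """Row-wise softmax with max subtraction for numerical stability."""
    out = []
    for row in m:
        top = max(row)
        exps = [math.exp(v - top) for v in row]
        s = sum(exps)
        out.append([e / s for e in exps])
    return out


def target_distribution(q):
    """P_ij = (Q_ij^2 / f_j) / sum over j' of (Q_ij'^2 / f_j'), with f_j = sum_i Q_ij."""
    f = [sum(col) for col in zip(*q)]
    p = []
    for row in q:
        weights = [qij ** 2 / fj for qij, fj in zip(row, f)]
        s = sum(weights)
        p.append([w / s for w in weights])
    return p


def kl_divergence(p, q):
    """Sum over all entries of P * log(P / Q); zero entries of P contribute nothing."""
    total = 0.0
    for p_row, q_row in zip(p, q):
        for pij, qij in zip(p_row, q_row):
            if pij != 0.0:
                total += pij * math.log(pij / qij)
    return total


# Hypothetical logits for three nodes and two clusters.
logits = [[2.0, 0.0], [0.0, 1.0], [1.5, 0.5]]
q = softmax_rows(logits)
p = target_distribution(q)
loss_kl = kl_divergence(p, q)
```
</preformat>
<p>In this toy case the first node&#x2019;s dominant assignment becomes sharper in the target than in the soft assignment, which is the intended effect of squaring before normalizing.</p>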
<p>In order to aggregate samples within the same cluster while simultaneously separating them from samples in other clusters, we choose to periodically update the top <inline-formula id="ieqn-114"><mml:math id="mml-ieqn-114"><mml:mi>&#x03C4;</mml:mi></mml:math></inline-formula> probability samples from different clusters. This process enhances the cohesion of positive samples within clusters and improves the discriminative ability of sample pairs, i.e.,
<disp-formula id="eqn-17"><label>(17)</label><mml:math id="mml-eqn-17" display="block"><mml:mi mathvariant="bold-italic">I</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mrow><mml:mi mathvariant="bold-italic">i</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="bold-italic">j</mml:mi></mml:mrow></mml:msub><mml:mtext>&#x00A0;</mml:mtext><mml:mrow><mml:mtext mathvariant="bold">is</mml:mtext></mml:mrow><mml:mspace width="thinmathspace" /><mml:mspace width="thinmathspace" /><mml:mrow><mml:mtext mathvariant="bold">among</mml:mtext></mml:mrow><mml:mspace width="thinmathspace" /><mml:mspace width="thinmathspace" /><mml:mrow><mml:mtext mathvariant="bold">the</mml:mtext></mml:mrow><mml:mspace width="thinmathspace" /><mml:mrow><mml:mtext mathvariant="bold">top</mml:mtext></mml:mrow><mml:mspace width="thinmathspace" /><mml:mspace width="thinmathspace" /><mml:mi mathvariant="bold-italic">&#x03C4;</mml:mi><mml:mspace width="thinmathspace" /><mml:mspace width="thinmathspace" /><mml:mrow><mml:mtext mathvariant="bold">elements</mml:mtext></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mtable columnalign="left left" rowspacing=".2em" columnspacing="1em" displaystyle="false"><mml:mtr><mml:mtd><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mrow><mml:mtext mathvariant="bold">if</mml:mtext></mml:mrow><mml:mtext>&#x00A0;</mml:mtext><mml:msub><mml:mi mathvariant="bold-italic">q</mml:mi><mml:mrow><mml:mi mathvariant="bold-italic">i</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="bold-italic">j</mml:mi></mml:mrow></mml:msub><mml:mspace width="thinmathspace" /><mml:mrow><mml:mtext mathvariant="bold">is</mml:mtext></mml:mrow><mml:mspace width="thinmathspace" /><mml:mspace width="thinmathspace" /><mml:mrow><mml:mtext mathvariant="bold">among</mml:mtext></mml:mrow><mml:mspace width="thinmathspace" /><mml:mspace width="thinmathspace" /><mml:mrow /><mml:mrow><mml:mtext 
mathvariant="bold">the</mml:mtext></mml:mrow><mml:mspace width="thinmathspace" /><mml:mspace width="thinmathspace" /><mml:mrow><mml:mtext mathvariant="bold">top</mml:mtext></mml:mrow><mml:mspace width="thinmathspace" /><mml:mspace width="thinmathspace" /><mml:mi mathvariant="bold-italic">&#x03C4;</mml:mi><mml:mspace width="thinmathspace" /><mml:mspace width="thinmathspace" /><mml:mrow><mml:mtext mathvariant="bold">elements</mml:mtext></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mspace width="thinmathspace" /><mml:mspace width="thinmathspace" /><mml:mrow><mml:mtext mathvariant="bold">otherwise</mml:mtext></mml:mrow></mml:mtd></mml:mtr></mml:mtable><mml:mo fence="true" stretchy="true" symmetric="true"></mml:mo></mml:mrow></mml:math></disp-formula></p>
<p>Here, <inline-formula id="ieqn-115"><mml:math id="mml-ieqn-115"><mml:mi>I</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mtext>&#x00A0;is among the top&#x00A0;</mml:mtext><mml:mi>&#x03C4;</mml:mi><mml:mtext>&#x00A0;elements</mml:mtext><mml:mo stretchy="false">)</mml:mo><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:math></inline-formula> indicates that sample <inline-formula id="ieqn-116"><mml:math id="mml-ieqn-116"><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> is most likely to belong to cluster <inline-formula id="ieqn-117"><mml:math id="mml-ieqn-117"><mml:msub><mml:mi>c</mml:mi><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>, and the label for this sample is recorded as <inline-formula id="ieqn-118"><mml:math id="mml-ieqn-118"><mml:msup><mml:mi>y</mml:mi><mml:mrow><mml:mi>k</mml:mi><mml:mi>e</mml:mi><mml:mi>y</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula>. 
The number of key samples in each cluster is <inline-formula id="ieqn-119"><mml:math id="mml-ieqn-119"><mml:msub><mml:mi>N</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>, and the total number of key samples is <inline-formula id="ieqn-120"><mml:math id="mml-ieqn-120"><mml:mi>C</mml:mi><mml:mo>&#x22C5;</mml:mo><mml:msub><mml:mi>N</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>, enhancing the cohesion of positive samples within clusters.</p>
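<p>One possible reading of the key-sample selection is sketched below. It assumes that the threshold is a per-cluster count, that each node is first assigned to its most probable cluster, and that the most confident nodes of each cluster then receive that cluster index as pseudo-label. The function name <monospace>select_key_samples</monospace> and the toy probabilities are hypothetical; the paper does not spell out these details.</p>
<preformat>
```python
def select_key_samples(q, tau):
    """Return {node index: pseudo-label} for the tau most confident nodes of each cluster."""
    n_clusters = len(q[0])
    # Hard assignment: index of the largest probability in each row.
    hard = [max(range(n_clusters), key=lambda j: row[j]) for row in q]
    key_labels = {}
    for j in range(n_clusters):
        members = [i for i, label in enumerate(hard) if label == j]
        # Rank the members of cluster j by their confidence for that cluster.
        members.sort(key=lambda i: q[i][j], reverse=True)
        for i in members[:tau]:
            key_labels[i] = j
    return key_labels


# Made-up soft assignments for five nodes and two clusters.
q = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8], [0.45, 0.55], [0.7, 0.3]]
key_labels = select_key_samples(q, tau=2)
```
</preformat>
<p>Restricting the ranking to nodes whose hard assignment is the cluster in question keeps a node from being selected as a key sample for two clusters at once.</p>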
<p>We use the obtained pseudo-labels <inline-formula id="ieqn-121"><mml:math id="mml-ieqn-121"><mml:msup><mml:mi>y</mml:mi><mml:mrow><mml:mi>k</mml:mi><mml:mi>e</mml:mi><mml:mi>y</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> to constrain the model&#x2019;s results, i.e.,
<disp-formula id="eqn-18"><label>(18)</label><mml:math id="mml-eqn-18" display="block"><mml:msub><mml:mi mathvariant="bold-italic">L</mml:mi><mml:mrow><mml:mi mathvariant="bold-italic">B</mml:mi><mml:mi mathvariant="bold-italic">C</mml:mi><mml:mi mathvariant="bold-italic">E</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">(</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">y</mml:mi><mml:mrow><mml:mi mathvariant="bold-italic">k</mml:mi><mml:mi mathvariant="bold-italic">e</mml:mi><mml:mi mathvariant="bold-italic">y</mml:mi></mml:mrow></mml:msup><mml:mo>,</mml:mo><mml:mrow><mml:mover><mml:mi mathvariant="bold-italic">y</mml:mi><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow><mml:mo stretchy="false">)</mml:mo><mml:mo>=</mml:mo><mml:mo>&#x2212;</mml:mo><mml:mfrac><mml:mrow><mml:mtext mathvariant="bold">1</mml:mtext></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">C</mml:mi><mml:mo>&#x22C5;</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">N</mml:mi><mml:mrow><mml:mi mathvariant="bold-italic">k</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac><mml:msubsup><mml:mo movablelimits="false">&#x2211;</mml:mo><mml:mrow><mml:mi mathvariant="bold-italic">i</mml:mi><mml:mo>=</mml:mo><mml:mrow><mml:mtext mathvariant="bold">1</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">C</mml:mi><mml:mo>&#x22C5;</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">N</mml:mi><mml:mrow><mml:mi mathvariant="bold-italic">k</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msubsup><mml:mrow><mml:mo>(</mml:mo><mml:msubsup><mml:mi mathvariant="bold-italic">y</mml:mi><mml:mrow><mml:mi mathvariant="bold-italic">i</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">k</mml:mi><mml:mi mathvariant="bold-italic">e</mml:mi><mml:mi mathvariant="bold-italic">y</mml:mi></mml:mrow></mml:msubsup><mml:mi mathvariant="bold">log</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mrow><mml:mover><mml:mi mathvariant="bold-italic">y</mml:mi><mml:mo 
stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">i</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mtext mathvariant="bold">1</mml:mtext></mml:mrow><mml:mo>&#x2212;</mml:mo><mml:msubsup><mml:mi mathvariant="bold-italic">y</mml:mi><mml:mrow><mml:mi mathvariant="bold-italic">i</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">k</mml:mi><mml:mi mathvariant="bold-italic">e</mml:mi><mml:mi mathvariant="bold-italic">y</mml:mi></mml:mrow></mml:msubsup><mml:mo>)</mml:mo></mml:mrow><mml:mi mathvariant="bold">log</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mtext mathvariant="bold">1</mml:mtext></mml:mrow><mml:mo>&#x2212;</mml:mo><mml:msub><mml:mrow><mml:mover><mml:mi mathvariant="bold-italic">y</mml:mi><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">i</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">)</mml:mo><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula></p>
<p>Here, <inline-formula id="ieqn-122"><mml:math id="mml-ieqn-122"><mml:msup><mml:mi>y</mml:mi><mml:mrow><mml:mi>k</mml:mi><mml:mi>e</mml:mi><mml:mi>y</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> is the label of the key sample, and <inline-formula id="ieqn-123"><mml:math id="mml-ieqn-123"><mml:mrow><mml:mover><mml:mi>y</mml:mi><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow></mml:math></inline-formula> is the model&#x2019;s clustering result.</p>
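<p>The pseudo-label loss can be sketched as a mean binary cross-entropy over the key samples. The sketch assumes that targets and predictions are given as flat lists with one 0/1 target and one predicted probability per term; the clamping constant <monospace>eps</monospace> is an addition for numerical safety and is not part of the formula above.</p>
<preformat>
```python
import math


def bce_loss(y_key, y_hat, eps=1e-12):
    """Mean binary cross-entropy between 0/1 targets and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_key, y_hat):
        p = min(max(p, eps), 1.0 - eps)  # clamp so that both logarithms stay finite
        total += y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
    return -total / len(y_key)


# Made-up pseudo-labels and predicted probabilities for four key samples.
y_key = [1.0, 0.0, 1.0, 0.0]
y_hat = [0.9, 0.2, 0.6, 0.4]
loss_bce = bce_loss(y_key, y_hat)
```
</preformat>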
<p>In summary, the overall loss calculation for ECN-MF is as follows:
<disp-formula id="eqn-19"><label>(19)</label><mml:math id="mml-eqn-19" display="block"><mml:mi mathvariant="bold-italic">L</mml:mi><mml:mo>=</mml:mo><mml:mi mathvariant="bold-italic">&#x03B1;</mml:mi><mml:msub><mml:mi mathvariant="bold-italic">L</mml:mi><mml:mrow><mml:mi mathvariant="bold-italic">M</mml:mi><mml:mi mathvariant="bold-italic">S</mml:mi><mml:mi mathvariant="bold-italic">E</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:mi mathvariant="bold-italic">&#x03B2;</mml:mi><mml:msub><mml:mi mathvariant="bold-italic">L</mml:mi><mml:mrow><mml:mi mathvariant="bold-italic">K</mml:mi><mml:mi mathvariant="bold-italic">L</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">L</mml:mi><mml:mrow><mml:mi mathvariant="bold-italic">B</mml:mi><mml:mi mathvariant="bold-italic">C</mml:mi><mml:mi mathvariant="bold-italic">E</mml:mi></mml:mrow></mml:msub></mml:math></disp-formula></p>
<p>Here, <inline-formula id="ieqn-124"><mml:math id="mml-ieqn-124"><mml:mi>&#x03B1;</mml:mi></mml:math></inline-formula> and <inline-formula id="ieqn-125"><mml:math id="mml-ieqn-125"><mml:mi>&#x03B2;</mml:mi></mml:math></inline-formula> are hyperparameters.</p>
<p>The detailed learning procedure of ECN-MF is shown in <bold>Algorithm 1</bold>.</p>
<fig id="fig-7">
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_51816-fig-7.tif"/>
</fig>
</sec>
</sec>
<sec id="s4">
<label>4</label>
<title>Experiments</title>
<sec id="s4_1">
<label>4.1</label>
<title>Dataset</title>
<p>We evaluated the proposed ECN-MF on six datasets: CORA [<xref ref-type="bibr" rid="ref-12">12</xref>], CITESEER [<xref ref-type="bibr" rid="ref-12">12</xref>], European Air Traffic (EAT) [<xref ref-type="bibr" rid="ref-30">30</xref>], United States Air Traffic (UAT) [<xref ref-type="bibr" rid="ref-30">30</xref>], Amazon Photo (AMAP), and Amazon Computer (AMAC). <xref ref-type="table" rid="table-2">Table 2</xref> provides a brief description of these datasets.</p>
<table-wrap id="table-2">
<label>Table 2</label>
<caption>
<title>Summary of datasets</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th>Dataset</th>
<th>Type</th>
<th>Sample</th>
<th>Edges</th>
<th>Dimension</th>
<th>Class</th>
</tr>
</thead>
<tbody>
<tr>
<td>CORA</td>
<td>Graph</td>
<td>2708</td>
<td>10556</td>
<td>1433</td>
<td>7</td>
</tr>
<tr>
<td>CITESEER</td>
<td>Graph</td>
<td>3327</td>
<td>9928</td>
<td>3703</td>
<td>6</td>
</tr>
<tr>
<td>EAT</td>
<td>Graph</td>
<td>399</td>
<td>11988</td>
<td>203</td>
<td>4</td>
</tr>
<tr>
<td>UAT</td>
<td>Graph</td>
<td>1190</td>
<td>27198</td>
<td>239</td>
<td>4</td>
</tr>
<tr>
<td>AMAP</td>
<td>Graph</td>
<td>7650</td>
<td>238162</td>
<td>745</td>
<td>8</td>
</tr>
<tr>
<td>AMAC</td>
<td>Graph</td>
<td>13752</td>
<td>491722</td>
<td>767</td>
<td>10</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="s4_2">
<label>4.2</label>
<title>Experiment Setup</title>
<p>All experimental results were obtained on a high-performance server equipped with an NVIDIA 3090 GPU, 64 GB RAM, and the PyTorch deep learning platform.</p>
<sec id="s4_2_1">
<label>4.2.1</label>
<title>Training Procedure</title>
<p>Our network is trained by minimizing the loss in <xref ref-type="disp-formula" rid="eqn-19">Eq. (19)</xref> using the Adam optimizer for 1000 iterations until convergence. After optimization, we directly apply <xref ref-type="disp-formula" rid="eqn-13">Eqs. (13)</xref> and <xref ref-type="disp-formula" rid="eqn-14">(14)</xref> to cluster the node embeddings of the two views and report the final results on four metrics. Following the protocol of the compared methods, and to mitigate the effects of randomness, we repeat each experiment 10 times and report the means along with the corresponding standard deviations.</p>
</sec>
<sec id="s4_2_2">
<label>4.2.2</label>
<title>Parameter Settings</title>
<p>For the sake of fairness, regarding MCGC [<xref ref-type="bibr" rid="ref-14">14</xref>], we only executed their source code on the graph datasets listed in <xref ref-type="table" rid="table-2">Table 2</xref>. For other baselines, we reproduced the results by adopting the source code with the original settings. In our proposed method, the learning rate of the optimizer is set to 1e-4 for CORA/CITESEER/EAT/AMAP/AMAC and 1e-3 for UAT. The rank <inline-formula id="ieqn-159"><mml:math id="mml-ieqn-159"><mml:mi>r</mml:mi></mml:math></inline-formula> of the attribute matrix is set to 40 for CORA, 50 for CITESEER/EAT, 95 for UAT, 20 for AMAP, and 70 for AMAC. Our encoder consists of two layers of linear MLPs, with dimensions set to 900 for CITESEER, 600 for CORA/EAT/UAT/AMAC, and 800 for AMAP.</p>

</sec>
<sec id="s4_2_3">
<label>4.2.3</label>
<title>Metrics</title>
<p>To validate the superiority of our ECN-MF compared to the baselines, we employed four widely used metrics to evaluate clustering performance, namely Accuracy (ACC), Normalized Mutual Information (NMI), Adjusted Rand Index (ARI), and macro F1-score (F1) [<xref ref-type="bibr" rid="ref-31">31</xref>&#x2013;<xref ref-type="bibr" rid="ref-33">33</xref>].</p>
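<p>Because cluster indices are arbitrary, clustering accuracy is usually computed after matching predicted clusters to ground-truth classes. The sketch below does this by brute force over all one-to-one relabelings, which is only practical for a small number of clusters; evaluation code commonly uses the Hungarian algorithm instead. The function name and the toy labels are hypothetical, and the sketch is not the authors&#x2019; evaluation script.</p>
<preformat>
```python
from itertools import permutations


def clustering_accuracy(y_true, y_pred, n_clusters):
    """Best accuracy over all one-to-one relabelings of the predicted clusters."""
    best = 0
    for perm in permutations(range(n_clusters)):
        # perm[p] is the class that predicted cluster p is mapped to.
        hits = sum(1 for t, p in zip(y_true, y_pred) if perm[p] == t)
        best = max(best, hits)
    return best / len(y_true)


# Made-up labels: the prediction swaps clusters 0 and 1 and misplaces one node.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [1, 1, 0, 0, 2, 0]
acc = clustering_accuracy(y_true, y_pred, 3)
```
</preformat>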
</sec>
</sec>
<sec id="s4_3">
<label>4.3</label>
<title>Performance Comparison</title>
<p>To demonstrate the superiority of our proposed ECN-MF, we compared it with 13 baselines. Specifically, we consider one classification method, graphMAE2 [<xref ref-type="bibr" rid="ref-34">34</xref>]. Five deep clustering methods, AE [<xref ref-type="bibr" rid="ref-35">35</xref>], DEC [<xref ref-type="bibr" rid="ref-36">36</xref>], SSGC [<xref ref-type="bibr" rid="ref-37">37</xref>], SDCN [<xref ref-type="bibr" rid="ref-5">5</xref>], and SAGSC [<xref ref-type="bibr" rid="ref-38">38</xref>], utilize autoencoders for node encoding, followed by clustering on the learned embeddings. Two hard sample mining methods, GDCL [<xref ref-type="bibr" rid="ref-39">39</xref>] and ProGCL [<xref ref-type="bibr" rid="ref-40">40</xref>], are also included. Additionally, we compare against five deep graph clustering methods, MCGC [<xref ref-type="bibr" rid="ref-14">14</xref>], MVGRL [<xref ref-type="bibr" rid="ref-13">13</xref>], AFGRL [<xref ref-type="bibr" rid="ref-27">27</xref>], AutoSSL [<xref ref-type="bibr" rid="ref-41">41</xref>], and SCDGN [<xref ref-type="bibr" rid="ref-42">42</xref>], which are designed with contrastive strategies to enhance the discriminative capability of samples.</p>
<p><xref ref-type="table" rid="table-3">Table 3</xref> reports the clustering performance of all compared methods on six benchmarks. From these results, we make four key observations: 1) Our ECN-MF outperforms the other deep clustering methods in most cases, which we attribute to the ability of contrastive learning to implicitly capture supervisory information. 2) Compared with the contrastive methods, our approach generally performs better, because the information-enhanced clustering guidance module better preserves the original data information in the embedded data and improves the discriminative capability of sample pairs by bringing samples from the same cluster closer and pushing away those from different clusters. 3) Our method achieves the best performance on CITESEER on all four metrics, showing the effectiveness of Singular Value Decomposition in capturing latent important information of samples, which is particularly advantageous for sparse matrices. 4) Favorable results on AMAP and AMAC highlight the effectiveness of our batched low-rank Singular Value Decomposition algorithm on large datasets, where several baselines run out of memory. In summary, our method outperforms most others on the six datasets across the four metrics, validating its effectiveness in addressing unreasonable data preprocessing and in handling positive and negative sample pairs.</p>
<table-wrap id="table-3">
<label>Table 3</label>
<caption>
<title>Clustering performance on the six datasets (average &#x00B1; standard deviation). Red and blue values represent the best and second-best results, respectively. OOM indicates Out-Of-Memory during training</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
<col/>
</colgroup>
<thead>
<tr>
<th>Dataset</th>
<th>Metric</th>
<th>Classification</th>
<th colspan="5" align="center">Classical deep graph clustering</th>
<th colspan="2" align="center">Hard sample</th>
<th colspan="5" align="center">Contrastive deep graph clustering</th>
<th>Ours</th>
</tr>
<tr>
<th/>
<th/>
<th>graphMAE2</th>
<th>AE</th>
<th>DEC</th>
<th>SSGC</th>
<th>SDCN</th>
<th>SAGSC</th>
<th>GDCL</th>
<th>ProGCL</th>
<th>MCGC</th>
<th>MVGRL</th>
<th>AFGRL</th>
<th>AutoSSL</th>
<th>SCDGN</th>
<th>ECN-MF</th>
</tr>
</thead>
<tbody>
<tr>
<td rowspan="4">CORA</td>
<td>ACC</td>
<td>33.88&#x00B1;3.26</td>
<td>49.38&#x00B1;0.91</td>
<td>46.50&#x00B1;0.26</td>
<td>69.28&#x00B1;3.70</td>
<td>35.60&#x00B1;2.83</td>
<td>66.58&#x00B1;0.13</td>
<td>70.83&#x00B1;0.47</td>
<td>57.13&#x00B1;1.23</td>
<td>42.85&#x00B1;1.13</td>
<td>70.47&#x00B1;3.70</td>
<td>26.25&#x00B1;1.24</td>
<td>63.81&#x00B1;0.57</td>
<td><styled-content style="color:#0000FF;">71.18&#x00B1;0.64</styled-content></td>
<td><styled-content style="color:#FF0000;">71.57&#x00B1;0.64</styled-content></td>
</tr>
<tr>
<td>NMI</td>
<td>12.68&#x00B1;2.65</td>
<td>25.65&#x00B1;0.65</td>
<td>23.54&#x00B1;0.34</td>
<td>54.32&#x00B1;1.92</td>
<td>14.28&#x00B1;1.91</td>
<td>50.80&#x00B1;0.17</td>
<td><styled-content style="color:#FF0000;">56.60&#x00B1;0.36</styled-content></td>
<td>41.02&#x00B1;1.34</td>
<td>24.11&#x00B1;1.00</td>
<td>55.57&#x00B1;1.54</td>
<td>12.36&#x00B1;1.54</td>
<td>47.62&#x00B1;0.45</td>
<td><styled-content style="color:#0000FF;">55.27&#x00B1;0.59</styled-content></td>
<td>54.75&#x00B1;1.21</td>
</tr>
<tr>
<td>ARI</td>
<td>08.90&#x00B1;1.67</td>
<td>21.63&#x00B1;0.58</td>
<td>15.13&#x00B1;0.42</td>
<td>46.27&#x00B1;4.01</td>
<td>07.78&#x00B1;3.24</td>
<td>40.64&#x00B1;0.29</td>
<td>48.05&#x00B1;0.72</td>
<td>30.71&#x00B1;2.70</td>
<td>14.33&#x00B1;1.26</td>
<td>48.70&#x00B1;3.94</td>
<td>14.32&#x00B1;1.87</td>
<td>38.92&#x00B1;0.77</td>
<td><styled-content style="color:#FF0000;">49.18&#x00B1;1.38</styled-content></td>
<td><styled-content style="color:#0000FF;">48.62&#x00B1;1.25</styled-content></td>
</tr>
<tr>
<td>F1</td>
<td>32.05&#x00B1;4.01</td>
<td>43.71&#x00B1;1.05</td>
<td>39.23&#x00B1;0.17</td>
<td>64.70&#x00B1;5.53</td>
<td>24.37&#x00B1;1.04</td>
<td>63.64&#x00B1;0.12</td>
<td>52.88&#x00B1;0.97</td>
<td>45.68&#x00B1;1.29</td>
<td>35.16&#x00B1;0.91</td>
<td><styled-content style="color:#0000FF;">67.15&#x00B1;1.86</styled-content></td>
<td>30.20&#x00B1;1.15</td>
<td>56.42&#x00B1;0.21</td>
<td><styled-content style="color:#FF0000;">69.59&#x00B1;0.54</styled-content></td>
<td>64.45&#x00B1;1.03</td>
</tr>
<tr>
<td rowspan="4">CITESEER</td>
<td>ACC</td>
<td>31.48&#x00B1;4.02</td>
<td>57.08&#x00B1;0.13</td>
<td>55.89&#x00B1;0.20</td>
<td><styled-content style="color:#0000FF;">68.97&#x00B1;0.34</styled-content></td>
<td>65.96&#x00B1;0.31</td>
<td>66.58&#x00B1;0.13</td>
<td>66.39&#x00B1;0.65</td>
<td>65.92&#x00B1;0.80</td>
<td>64.76&#x00B1;0.07</td>
<td>62.83&#x00B1;1.59</td>
<td>31.45&#x00B1;0.54</td>
<td>66.76&#x00B1;0.67</td>
<td>63.43&#x00B1;0.18</td>
<td><styled-content style="color:#FF0000;">71.80&#x00B1;1.23</styled-content></td>
</tr>
<tr>
<td>NMI</td>
<td>07.80&#x00B1;2.77</td>
<td>27.64&#x00B1;0.08</td>
<td>28.34&#x00B1;0.30</td>
<td><styled-content style="color:#0000FF;">42.81&#x00B1;0.20</styled-content></td>
<td>38.71&#x00B1;0.32</td>
<td>40.42&#x00B1;0.09</td>
<td>39.52&#x00B1;0.38</td>
<td>39.59&#x00B1;0.39</td>
<td>39.11&#x00B1;0.06</td>
<td>40.69&#x00B1;0.93</td>
<td>15.17&#x00B1;0.47</td>
<td>40.67&#x00B1;0.84</td>
<td>41.50&#x00B1;0.32</td>
<td><styled-content style="color:#FF0000;">45.07&#x00B1;1.15</styled-content></td>
</tr>
<tr>
<td>ARI</td>
<td>05.97&#x00B1;2.39</td>
<td>29.31&#x00B1;0.14</td>
<td>28.12&#x00B1;0.36</td>
<td><styled-content style="color:#0000FF;">44.42&#x00B1;0.32</styled-content></td>
<td>40.17&#x00B1;0.43</td>
<td>41.26&#x00B1;0.09</td>
<td>41.07&#x00B1;0.96</td>
<td>36.16&#x00B1;1.11</td>
<td>37.54&#x00B1;0.12</td>
<td>34.18&#x00B1;1.73</td>
<td>14.32&#x00B1;0.78</td>
<td>38.73&#x00B1;0.55</td>
<td>41.47&#x00B1;0.50</td>
<td><styled-content style="color:#FF0000;">46.63&#x00B1;2.15</styled-content></td>
</tr>
<tr>
<td>F1</td>
<td>30.20&#x00B1;4.62</td>
<td>53.80&#x00B1;0.11</td>
<td>52.62&#x00B1;0.17</td>
<td><styled-content style="color:#0000FF;">64.49&#x00B1;0.27</styled-content></td>
<td>63.62&#x00B1;0.24</td>
<td>62.47&#x00B1;0.05</td>
<td>61.12&#x00B1;0.70</td>
<td>57.89&#x00B1;1.98</td>
<td>59.64&#x00B1;0.05</td>
<td>59.54&#x00B1;2.17</td>
<td>30.20&#x00B1;0.71</td>
<td>58.22&#x00B1;0.68</td>
<td>59.34&#x00B1;0.85</td>
<td><styled-content style="color:#FF0000;">64.53&#x00B1;1.89</styled-content></td>
</tr>
<tr>
<td rowspan="4">EAT</td>
<td>ACC</td>
<td>34.49&#x00B1;1.16</td>
<td>38.85&#x00B1;2.32</td>
<td>36.47&#x00B1;1.60</td>
<td>32.41&#x00B1;0.45</td>
<td>39.07&#x00B1;1.51</td>
<td><styled-content style="color:#0000FF;">46.32&#x00B1;0.25</styled-content></td>
<td>33.46&#x00B1;0.18</td>
<td>43.36&#x00B1;0.87</td>
<td>32.58&#x00B1;0.29</td>
<td>32.88&#x00B1;0.71</td>
<td>37.42&#x00B1;1.24</td>
<td>31.33&#x00B1;0.52</td>
<td>32.33&#x00B1;0.00</td>
<td><styled-content style="color:#FF0000;">51.87&#x00B1;0.95</styled-content></td>
</tr>
<tr>
<td>NMI</td>
<td>06.42&#x00B1;0.38</td>
<td>06.92&#x00B1;2.80</td>
<td>04.96&#x00B1;1.74</td>
<td>04.65&#x00B1;0.21</td>
<td>08.83&#x00B1;2.54</td>
<td><styled-content style="color:#FF0000;">26.60&#x00B1;0.48</styled-content></td>
<td>13.22&#x00B1;0.33</td>
<td>23.93&#x00B1;0.45</td>
<td>07.04&#x00B1;0.56</td>
<td>11.72&#x00B1;1.08</td>
<td>11.44&#x00B1;1.41</td>
<td>07.63&#x00B1;0.85</td>
<td>05.80&#x00B1;0.00</td>
<td><styled-content style="color:#0000FF;">24.05&#x00B1;1.67</styled-content></td>
</tr>
<tr>
<td>ARI</td>
<td>04.24&#x00B1;0.49</td>
<td>05.11&#x00B1;2.65</td>
<td>03.60&#x00B1;1.87</td>
<td>01.53&#x00B1;0.04</td>
<td>06.31&#x00B1;1.95</td>
<td><styled-content style="color:#FF0000;">24.00&#x00B1;0.53</styled-content></td>
<td>04.31&#x00B1;0.29</td>
<td>15.03&#x00B1;0.98</td>
<td>01.33&#x00B1;0.14</td>
<td>04.68&#x00B1;1.30</td>
<td>06.57&#x00B1;1.73</td>
<td>02.13&#x00B1;0.67</td>
<td>02.55&#x00B1;0.00</td>
<td><styled-content style="color:#0000FF;">22.75&#x00B1;1.03</styled-content></td>
</tr>
<tr>
<td>F1</td>
<td>32.87&#x00B1;1.59</td>
<td>38.75&#x00B1;2.25</td>
<td>34.84&#x00B1;1.28</td>
<td>26.49&#x00B1;0.66</td>
<td>33.42&#x00B1;3.10</td>
<td>38.93&#x00B1;0.12</td>
<td>25.02&#x00B1;0.21</td>
<td><styled-content style="color:#0000FF;">42.54&#x00B1;0.45</styled-content></td>
<td>27.03&#x00B1;0.16</td>
<td>25.35&#x00B1;0.75</td>
<td>30.53&#x00B1;1.47</td>
<td>21.82&#x00B1;0.98</td>
<td>25.11&#x00B1;0.00</td>
<td><styled-content style="color:#FF0000;">50.64&#x00B1;2.63</styled-content></td>
</tr>
<tr>
<td rowspan="4">UAT</td>
<td>ACC</td>
<td>34.51&#x00B1;1.18</td>
<td>46.82&#x00B1;1.14</td>
<td>45.61&#x00B1;1.84</td>
<td>36.74&#x00B1;0.81</td>
<td><styled-content style="color:#FF0000;">52.25&#x00B1;1.91</styled-content></td>
<td>42.94&#x00B1;0.57</td>
<td>48.70&#x00B1;0.06</td>
<td>45.38&#x00B1;0.58</td>
<td>41.93&#x00B1;0.56</td>
<td>44.16&#x00B1;1.38</td>
<td>41.50&#x00B1;0.25</td>
<td>42.52&#x00B1;0.64</td>
<td>44.86&#x00B1;1.62</td>
<td><styled-content style="color:#0000FF;">52.01&#x00B1;0.96</styled-content></td>
</tr>
<tr>
<td>NMI</td>
<td>06.43&#x00B1;0.37</td>
<td>17.18&#x00B1;1.60</td>
<td>16.63&#x00B1;2.39</td>
<td>08.04&#x00B1;0.18</td>
<td>21.61&#x00B1;1.26</td>
<td>18.30&#x00B1;0.16</td>
<td><styled-content style="color:#FF0000;">25.10&#x00B1;0.01</styled-content></td>
<td>22.04&#x00B1;2.23</td>
<td>16.64&#x00B1;0.41</td>
<td>21.53&#x00B1;0.94</td>
<td>17.33&#x00B1;0.54</td>
<td>17.86&#x00B1;0.22</td>
<td>12.90&#x00B1;0.45</td>
<td><styled-content style="color:#0000FF;">24.73&#x00B1;1.24</styled-content></td>
</tr>
<tr>
<td>ARI</td>
<td>04.25&#x00B1;0.49</td>
<td>13.59&#x00B1;2.02</td>
<td>13.14&#x00B1;1.97</td>
<td>05.12&#x00B1;0.27</td>
<td>21.63&#x00B1;1.49</td>
<td>13.40&#x00B1;0.36</td>
<td><styled-content style="color:#0000FF;">21.76&#x00B1;0.01</styled-content></td>
<td>14.74&#x00B1;1.99</td>
<td>12.21&#x00B1;0.13</td>
<td>17.12&#x00B1;1.46</td>
<td>13.62&#x00B1;0.57</td>
<td>13.13&#x00B1;0.71</td>
<td>11.80&#x00B1;0.77</td>
<td><styled-content style="color:#FF0000;">23.78&#x00B1;1.59</styled-content></td>
</tr>
<tr>
<td>F1</td>
<td>32.90&#x00B1;1.61</td>
<td><styled-content style="color:#0000FF;">45.66&#x00B1;1.49</styled-content></td>
<td>44.22&#x00B1;1.51</td>
<td>29.50&#x00B1;1.57</td>
<td>45.59&#x00B1;3.54</td>
<td>38.06&#x00B1;1.26</td>
<td><styled-content style="color:#FF0000;">45.69&#x00B1;0.08</styled-content></td>
<td>39.30&#x00B1;1.82</td>
<td>35.78&#x00B1;0.38</td>
<td>39.44&#x00B1;2.19</td>
<td>36.52&#x00B1;0.89</td>
<td>34.94&#x00B1;0.87</td>
<td>41.33&#x00B1;1.82</td>
<td>43.93&#x00B1;1.79</td>
</tr>
<tr>
<td rowspan="4">AMAP</td>
<td>ACC</td>
<td>34.49&#x00B1;1.16</td>
<td>48.25&#x00B1;0.08</td>
<td>47.22&#x00B1;0.08</td>
<td>60.23&#x00B1;0.19</td>
<td>53.44&#x00B1;0.81</td>
<td>57.80&#x00B1;0.27</td>
<td>43.75&#x00B1;0.78</td>
<td>51.53&#x00B1;0.38</td>
<td rowspan="4">OOM</td>
<td>41.07&#x00B1;3.12</td>
<td><styled-content style="color:#0000FF;">75.51&#x00B1;0.77</styled-content></td>
<td>54.55&#x00B1;0.97</td>
<td>70.55&#x00B1;0.03</td>
<td><styled-content style="color:#FF0000;">76.88&#x00B1;0.80</styled-content></td>
</tr>
<tr>
<td>NMI</td>
<td>06.43&#x00B1;0.37</td>
<td>38.76&#x00B1;0.30</td>
<td>37.35&#x00B1;0.05</td>
<td>60.37&#x00B1;0.15</td>
<td>44.85&#x00B1;0.83</td>
<td>53.60&#x00B1;0.41</td>
<td>37.32&#x00B1;0.28</td>
<td>39.56&#x00B1;0.39</td>
<td>30.28&#x00B1;3.94</td>
<td><styled-content style="color:#0000FF;">64.05&#x00B1;0.15</styled-content></td>
<td>48.56&#x00B1;0.71</td>
<td>64.02&#x00B1;0.04</td>
<td><styled-content style="color:#FF0000;">68.21&#x00B1;1.10</styled-content></td>
</tr>
<tr>
<td>ARI</td>
<td>04.25&#x00B1;0.49</td>
<td>20.80&#x00B1;0.47</td>
<td>18.59&#x00B1;0.04</td>
<td>35.99&#x00B1;0.47</td>
<td>31.21&#x00B1;1.23</td>
<td>30.90&#x00B1;0.03</td>
<td>21.57&#x00B1;0.51</td>
<td>34.18&#x00B1;0.89</td>
<td>18.77&#x00B1;2.34</td>
<td><styled-content style="color:#0000FF;">54.45&#x00B1;0.48</styled-content></td>
<td>26.87&#x00B1;0.34</td>
<td>53.02&#x00B1;0.03</td>
<td><styled-content style="color:#FF0000;">58.98&#x00B1;1.38</styled-content></td>
</tr>
<tr>
<td>F1</td>
<td>32.87&#x00B1;1.59</td>
<td>47.87&#x00B1;0.20</td>
<td>46.71&#x00B1;0.12</td>
<td>52.79&#x00B1;0.01</td>
<td>50.66&#x00B1;1.49</td>
<td>57.63&#x00B1;0.06</td>
<td>38.37&#x00B1;0.29</td>
<td>31.97&#x00B1;0.44</td>
<td>32.88&#x00B1;5.50</td>
<td><styled-content style="color:#0000FF;">69.99&#x00B1;0.34</styled-content></td>
<td>54.47&#x00B1;0.83</td>
<td>63.93&#x00B1;0.04</td>
<td><styled-content style="color:#FF0000;">71.58&#x00B1;0.31</styled-content></td>
</tr>
<tr>
<td rowspan="4">AMAC</td>
<td>ACC</td>
<td>35.76&#x00B1;1.30</td>
<td>21.67&#x00B1;0.51</td>
<td>19.03&#x00B1;0</td>
<td><styled-content style="color:#0000FF;">52.22&#x00B1;0.01</styled-content></td>
<td>43.04&#x00B1;4.09</td>
<td>43.89&#x00B1;0.01</td>
<td rowspan="4">OOM</td>
<td rowspan="4">OOM</td>
<td rowspan="4">OOM</td>
<td rowspan="4">OOM</td>
<td rowspan="4">OOM</td>
<td rowspan="4">OOM</td>
<td>51.64&#x00B1;0.04</td>
<td><styled-content style="color:#FF0000;">54.12&#x00B1;0.72</styled-content></td>
</tr>
<tr>
<td>NMI</td>
<td>05.74&#x00B1;0.65</td>
<td>03.21&#x00B1;0.79</td>
<td>02.82&#x00B1;0</td>
<td><styled-content style="color:#FF0000;">52.55&#x00B1;0.02</styled-content></td>
<td>23.30&#x00B1;6.60</td>
<td>38.92&#x00B1;0.01</td>
<td><styled-content style="color:#0000FF;">40.90&#x00B1;0.01</styled-content></td>
<td>35.59&#x00B1;0.98</td>
</tr>
<tr>
<td>ARI</td>
<td>04.18&#x00B1;0.75</td>
<td>&#x2212;2.06&#x00B1;0.96</td>
<td>&#x2212;1.26&#x00B1;0</td>
<td><styled-content style="color:#0000FF;">32.30&#x00B1;0.03</styled-content></td>
<td>17.40&#x00B1;7.49</td>
<td>16.83&#x00B1;0.02</td>
<td>29.69&#x00B1;0.04</td>
<td><styled-content style="color:#FF0000;">33.21&#x00B1;1.25</styled-content></td>
</tr>
<tr>
<td>F1</td>
<td>32.69&#x00B1;0.83</td>
<td>12.16&#x00B1;0.78</td>
<td>10.84&#x00B1;0</td>
<td>39.78&#x00B1;0.01</td>
<td>20.95&#x00B1;3.70</td>
<td><styled-content style="color:#0000FF;">45.29&#x00B1;0.04</styled-content></td>
<td><styled-content style="color:#FF0000;">51.12&#x00B1;0.02</styled-content></td>
<td>32.99&#x00B1;0.49</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="s4_4">
<label>4.4</label>
<title>Time Complexity Analysis</title>
<p>First, we perform Singular Value Decomposition on the attributes. The time complexity of the low-rank computation that yields the reconstructed attribute <inline-formula id="ieqn-160"><mml:math id="mml-ieqn-160"><mml:msub><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow><mml:mrow><mml:mi>r</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> in <xref ref-type="fig" rid="fig-2">Fig. 2</xref> is <inline-formula id="ieqn-161"><mml:math id="mml-ieqn-161"><mml:mi>O</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>b</mml:mi><mml:mi>a</mml:mi><mml:mi>t</mml:mi><mml:mi>c</mml:mi><mml:mi>h</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>B</mml:mi><mml:mi>D</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>r</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, where <italic>batch</italic> is the number of batches, <inline-formula id="ieqn-162"><mml:math id="mml-ieqn-162"><mml:mi>r</mml:mi></mml:math></inline-formula> is the rank, <inline-formula id="ieqn-163"><mml:math id="mml-ieqn-163"><mml:mi>B</mml:mi></mml:math></inline-formula> is the batch size, and <inline-formula id="ieqn-164"><mml:math id="mml-ieqn-164"><mml:mi>D</mml:mi></mml:math></inline-formula> is the dimension of the original samples. Subsequently, during the training phase, we employ a two-layer residual graph convolutional neural network (GCN) model. The time complexity for computing the two views is <inline-formula id="ieqn-165"><mml:math id="mml-ieqn-165"><mml:mi>O</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mn>2</mml:mn><mml:mi>N</mml:mi><mml:mi>D</mml:mi><mml:mi>d</mml:mi><mml:mo>+</mml:mo><mml:mn>4</mml:mn><mml:mi>N</mml:mi><mml:mi>d</mml:mi><mml:mi>d</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, where <inline-formula id="ieqn-166"><mml:math id="mml-ieqn-166"><mml:mi>d</mml:mi></mml:math></inline-formula> is the dimension of the graph convolutional neural network and <inline-formula id="ieqn-167"><mml:math id="mml-ieqn-167"><mml:mi>N</mml:mi></mml:math></inline-formula> is the number of samples. The time complexity for calculating the mean square error loss function is <inline-formula id="ieqn-169"><mml:math id="mml-ieqn-169"><mml:mi>O</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>N</mml:mi><mml:mi>d</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. The time complexity for high confidence selection is <inline-formula id="ieqn-171"><mml:math id="mml-ieqn-171"><mml:mi>O</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>N</mml:mi><mml:mi>K</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, where <inline-formula id="ieqn-172"><mml:math id="mml-ieqn-172"><mml:mi>K</mml:mi></mml:math></inline-formula> is the number of clusters. The time complexity for computing the target distribution is <inline-formula id="ieqn-174"><mml:math id="mml-ieqn-174"><mml:mi>O</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>N</mml:mi><mml:mi>K</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, that for the KL divergence loss function is <inline-formula id="ieqn-176"><mml:math id="mml-ieqn-176"><mml:mi>O</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>N</mml:mi><mml:mi>K</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, and that for the cross-entropy loss function is <inline-formula id="ieqn-177"><mml:math id="mml-ieqn-177"><mml:mi>O</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>N</mml:mi><mml:mi>K</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. Therefore, after dropping constant factors, the overall time complexity of our algorithm is <inline-formula id="ieqn-179"><mml:math id="mml-ieqn-179"><mml:mi>O</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>b</mml:mi><mml:mi>a</mml:mi><mml:mi>t</mml:mi><mml:mi>c</mml:mi><mml:mi>h</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>B</mml:mi><mml:mi>D</mml:mi><mml:mi>r</mml:mi><mml:mo>+</mml:mo><mml:mi>N</mml:mi><mml:mi>D</mml:mi><mml:mi>d</mml:mi><mml:mo>+</mml:mo><mml:mi>N</mml:mi><mml:mi>d</mml:mi><mml:mi>d</mml:mi><mml:mo>+</mml:mo><mml:mi>N</mml:mi><mml:mi>K</mml:mi><mml:mo>+</mml:mo><mml:mi>N</mml:mi><mml:mi>d</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>.</p>
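<p>The batched low-rank step whose cost is analysed in this section can be sketched as follows. This is an illustrative NumPy sketch under stated assumptions, not the implementation used in this paper: the function name batched_low_rank and the row-wise batching are hypothetical, and a full thin SVD followed by truncation is used for clarity, whereas a truncated or randomized solver would be needed to actually approach the per-batch cost stated above.</p>
<preformat>
```python
import numpy as np


def batched_low_rank(x, rank, batch_size):
    """Return a rank-limited reconstruction of x, computed batch by batch.

    Hypothetical helper: rows of x are split into consecutive batches and
    each batch is replaced by its best rank-`rank` approximation.
    """
    out = np.empty_like(x, dtype=float)
    for start in range(0, x.shape[0], batch_size):
        block = x[start:start + batch_size]
        # Thin SVD of the block: block = u @ diag(s) @ vt
        u, s, vt = np.linalg.svd(block, full_matrices=False)
        r = min(rank, s.shape[0])
        # Keep only the r largest singular triplets
        out[start:start + batch_size] = (u[:, :r] * s[:r]) @ vt[:r]
    return out


rng = np.random.default_rng(0)
x = rng.random((10, 6))  # toy attribute matrix: 10 samples, 6 attributes
x_r = batched_low_rank(x, rank=2, batch_size=5)
# Relative Frobenius error introduced by the rank-2 truncation
rel_err = np.linalg.norm(x - x_r) / np.linalg.norm(x)
```
</preformat>
<p>In this sketch, rel_err measures how much of the toy matrix is discarded at rank 2. Raising the rank to the smaller of the batch size and the attribute dimension reproduces each batch exactly.</p>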
</sec>
<sec id="s4_5">
<label>4.5</label>
<title>Ablation Studies</title>
<p>In this section, we first experimentally validate the effectiveness of our proposed data augmentation method and periodic update strategy, as shown in <xref ref-type="table" rid="table-4">Table 4</xref>. For simplicity, we denote the batched low-rank Singular Value Decomposition as B and the periodic update as P. When the B operation is removed, we instead use a mask with a mask rate of 0.5 to generate different views of the same node. &#x201C;(w/o)B &#x0026; P&#x201D; denotes the variant that uses neither the batched low-rank Singular Value Decomposition operation nor the periodic update, while &#x201C;B &#x002B; P&#x201D; denotes the variant that uses both. <xref ref-type="table" rid="table-4">Table 4</xref> reports the converged results after 1000 epochs. Performance degrades without B and P on all five datasets and all four metrics, indicating that these two strategies contribute significantly to the performance improvement.</p>
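<p>For reference, the mask-based replacement used in this ablation could look like the following sketch. The text only states that a mask with a rate of 0.5 generates the views. The independent masking of individual attribute entries, the function name masked_view, and the seeds are assumptions made for illustration.</p>
<preformat>
```python
import numpy as np


def masked_view(x, mask_rate, seed):
    """Zero out each attribute entry independently with prob. mask_rate.

    Hypothetical helper; the masking granularity is an assumption.
    """
    rng = np.random.default_rng(seed)
    keep = rng.random(x.shape) >= mask_rate  # True where the entry survives
    return x * keep


x = np.ones((4, 8))  # toy attribute matrix: 4 nodes, 8 attributes
view_a = masked_view(x, mask_rate=0.5, seed=0)
view_b = masked_view(x, mask_rate=0.5, seed=1)
```
</preformat>
<p>Two different seeds give two views of the same nodes, which then play the role that the low-rank reconstruction plays in the full model.</p>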
<table-wrap id="table-4">
<label>Table 4</label>
<caption>
<title>The ablation study of the proposed Batch Low-Rank Singular Value Decomposition Operation (B) and periodic update (P) on the five datasets</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th>Metric</th>
<th>Setting</th>
<th>CORA</th>
<th>CITESEER</th>
<th>EAT</th>
<th>UAT</th>
<th>AMAP</th>
</tr>
</thead>
<tbody>
<tr>
<td rowspan="2"><bold>ACC</bold></td>
<td>(w/o)B &#x0026; P</td>
<td>50.36</td>
<td>58.97</td>
<td>32.83</td>
<td>44.28</td>
<td>30.95</td>
</tr>
<tr>
<td>B &#x002B; P</td>
<td><bold>72.08</bold></td>
<td><bold>71.89</bold></td>
<td><bold>52.13</bold></td>
<td><bold>52.35</bold></td>
<td><bold>76.03</bold></td>
</tr>
<tr>
<td rowspan="2"><bold>NMI</bold></td>
<td>(w/o)B &#x0026; P</td>
<td>37.54</td>
<td>32.60</td>
<td>08.05</td>
<td>17.82</td>
<td>15.99</td>
</tr>
<tr>
<td>B &#x002B; P</td>
<td><bold>54.67</bold></td>
<td><bold>45.86</bold></td>
<td><bold>25.66</bold></td>
<td><bold>25.06</bold></td>
<td><bold>64.63</bold></td>
</tr>
<tr>
<td rowspan="2"><bold>ARI</bold></td>
<td>(w/o)B &#x0026; P</td>
<td>24.12</td>
<td>28.84</td>
<td>02.93</td>
<td>13.65</td>
<td>09.08</td>
</tr>
<tr>
<td>B &#x002B; P</td>
<td><bold>49.17</bold></td>
<td><bold>47.83</bold></td>
<td><bold>22.17</bold></td>
<td><bold>21.10</bold></td>
<td><bold>55.95</bold></td>
</tr>
<tr>
<td rowspan="2"><bold>F1</bold></td>
<td>(w/o)B &#x0026; P</td>
<td>52.40</td>
<td>55.31</td>
<td>26.13</td>
<td>40.87</td>
<td>18.28</td>
</tr>
<tr>
<td>B &#x002B; P</td>
<td><bold>64.33</bold></td>
<td><bold>64.48</bold></td>
<td><bold>52.05</bold></td>
<td><bold>44.17</bold></td>
<td><bold>67.67</bold></td>
</tr>
</tbody>
</table>
</table-wrap>
<p>In <xref ref-type="fig" rid="fig-3">Fig. 3</xref>, we plot the accuracy over the entire training process until convergence as a line chart. The curves show that our model exhibits robust performance. Overall, the experimental results validate the effectiveness of B and P.</p>
<fig id="fig-3">
<label>Figure 3</label>
<caption>
<title>Accuracy with and without the proposed batched low-rank Singular Value Decomposition (B) and periodic update (P) on the five datasets. The red line represents training with the batched low-rank Singular Value Decomposition operation and periodic updates, while the blue line represents training without them. This comparative experiment demonstrates the effectiveness of our proposed method</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_51816-fig-3.tif"/>
</fig>
</sec>
<sec id="s4_6">
<label>4.6</label>
<title>Hyperparameter Analysis</title>
<p>In this section, we analyze the hyperparameters <inline-formula id="ieqn-180"><mml:math id="mml-ieqn-180"><mml:mi>r</mml:mi></mml:math></inline-formula>, <inline-formula id="ieqn-181"><mml:math id="mml-ieqn-181"><mml:mi>&#x03B1;</mml:mi></mml:math></inline-formula>, and <inline-formula id="ieqn-182"><mml:math id="mml-ieqn-182"><mml:mi>&#x03B2;</mml:mi></mml:math></inline-formula> to show their impact on clustering performance. For the rank <inline-formula id="ieqn-183"><mml:math id="mml-ieqn-183"><mml:mi>r</mml:mi></mml:math></inline-formula>, we vary the value from 10 to 100, and for <inline-formula id="ieqn-184"><mml:math id="mml-ieqn-184"><mml:mi>&#x03B1;</mml:mi></mml:math></inline-formula> and <inline-formula id="ieqn-185"><mml:math id="mml-ieqn-185"><mml:mi>&#x03B2;</mml:mi></mml:math></inline-formula>, we vary the values from 0.1 to 1.</p>
<sec id="s4_6_1">
<label>4.6.1</label>
<title><italic>Analysis of Hyperparameter</italic> <inline-formula id="ieqn-186"><mml:math id="mml-ieqn-186"><mml:mi>r</mml:mi></mml:math></inline-formula></title>
<p><xref ref-type="fig" rid="fig-4">Fig. 4</xref> depicts the performance variation of ECN-MF as <inline-formula id="ieqn-187"><mml:math id="mml-ieqn-187"><mml:mi>r</mml:mi></mml:math></inline-formula> ranges from 10 to 100. Key observations include: 1) Appropriately setting the hyperparameter <inline-formula id="ieqn-188"><mml:math id="mml-ieqn-188"><mml:mi>r</mml:mi></mml:math></inline-formula> can effectively enhance clustering performance. With all other settings fixed and only the rank modified, the model achieves its highest accuracy of 73.13% on CITESEER, 73.61% on CORA, 54.39% on EAT, and 52.94% on UAT. 2) Performance remains relatively stable over a wide range of <inline-formula id="ieqn-189"><mml:math id="mml-ieqn-189"><mml:mi>r</mml:mi></mml:math></inline-formula>, particularly on sparse datasets such as CITESEER and UAT. 3) Examining the trend in average accuracy, we observe fluctuations and declines in clustering accuracy as the rank increases, particularly on CORA and EAT. We attribute this fluctuation to the diverse characteristics of the datasets: as the rank increases, the low-rank operation might extract more low-correlation information. 4) ECN-MF requires setting an appropriate rank based on the specific features of the dataset. Nonetheless, the experimental results indicate that a relatively small rank is sufficient, which also reduces the model&#x2019;s time complexity.</p>
<fig id="fig-4">
<label>Figure 4</label>
<caption>
<title>The trend chart of clustering accuracy with varying <inline-formula id="ieqn-190"><mml:math id="mml-ieqn-190"><mml:mi mathvariant="bold-italic">r</mml:mi></mml:math></inline-formula> on four datasets demonstrates that appropriately setting a low rank reduces the model&#x2019;s time complexity while effectively enhancing clustering performance</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_51816-fig-4.tif"/>
</fig>
</sec>
<sec id="s4_6_2">
<label>4.6.2</label>
<title><italic>Analysis of Hyperparameters</italic> <inline-formula id="ieqn-191"><mml:math id="mml-ieqn-191"><mml:mi>&#x03B1;</mml:mi></mml:math></inline-formula> and <inline-formula id="ieqn-192"><mml:math id="mml-ieqn-192"><mml:mi>&#x03B2;</mml:mi></mml:math></inline-formula></title>
<p><xref ref-type="fig" rid="fig-5">Fig. 5</xref> illustrates the variation in clustering performance of ECN-MF on the CITESEER dataset across the range of <inline-formula id="ieqn-193"><mml:math id="mml-ieqn-193"><mml:mi>&#x03B1;</mml:mi></mml:math></inline-formula> and <inline-formula id="ieqn-194"><mml:math id="mml-ieqn-194"><mml:mi>&#x03B2;</mml:mi></mml:math></inline-formula> from 0.1 to 1. When modifying only <inline-formula id="ieqn-195"><mml:math id="mml-ieqn-195"><mml:mi>&#x03B1;</mml:mi></mml:math></inline-formula> without adjusting the other parameters, the model achieves a maximum accuracy of 72.25% on CITESEER. Similarly, when modifying only <inline-formula id="ieqn-196"><mml:math id="mml-ieqn-196"><mml:mi>&#x03B2;</mml:mi></mml:math></inline-formula> without adjusting the other parameters, the model achieves a maximum accuracy of 72.22% on CITESEER. It can be observed from the graph that our model is not sensitive to the values of <inline-formula id="ieqn-197"><mml:math id="mml-ieqn-197"><mml:mi>&#x03B1;</mml:mi></mml:math></inline-formula> and <inline-formula id="ieqn-198"><mml:math id="mml-ieqn-198"><mml:mi>&#x03B2;</mml:mi></mml:math></inline-formula>, and both can yield good results.</p>
<fig id="fig-5">
<label>Figure 5</label>
<caption>
<title>The trend chart of clustering accuracy with varying &#x03B1; and &#x03B2; on the CITESEER dataset, where the horizontal axis represents the values of <inline-formula id="ieqn-199"><mml:math id="mml-ieqn-199"><mml:mrow><mml:mi>&#x03B1;</mml:mi></mml:mrow></mml:math></inline-formula> and <inline-formula id="ieqn-200"><mml:math id="mml-ieqn-200"><mml:mrow><mml:mi>&#x03B2;</mml:mi></mml:mrow></mml:math></inline-formula>, and the vertical axis represents the values of the corresponding four evaluation metrics</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_51816-fig-5.tif"/>
</fig>
</sec>
</sec>
<sec id="s4_7">
<label>4.7</label>
<title>Visualization Analysis</title>
<p>To visually showcase the superiority of ECN-MF, we employ the t-SNE algorithm (van der Maaten and Hinton, 2008) to visualize the distribution of the learned clustering embeddings Z in a two-dimensional space. As shown in <xref ref-type="fig" rid="fig-6">Fig. 6</xref>, the embeddings learned by ECN-MF reveal the intrinsic clustering structure of the data more clearly than the original data.</p>
<fig id="fig-6">
<label>Figure 6</label>
<caption>
<title>2D visualization of the five datasets. The first row corresponds to the original data, while the second row corresponds to the distribution of ECN-MF. The visualization of partial samples reflects the effectiveness of our method</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_51816-fig-6.tif"/>
</fig>
</sec>
</sec>
<sec id="s5">
<label>5</label>
<title>Conclusion</title>
<p>This paper introduces an Efficient Clustering Network based on Matrix Factorization (ECN-MF) to alleviate the negative impact of inappropriate data augmentation and enhance the quality of positive samples. By simplifying the network structure, introducing novel data augmentation methods, and designing a mutual information-enhanced clustering guidance module, ECN-MF improves its capability to handle sparse and large datasets. It brings samples from the same cluster closer while pushing samples from different clusters apart. The results of this study demonstrate the effectiveness and superiority of ECN-MF in addressing the challenges of data preprocessing in deep graph clustering and of handling positive and negative sample pairs. A limitation of this work is that the rank of the Singular Value Decomposition is set as a hyperparameter and hard samples receive no special treatment. In the future, we hope to explore new avenues of research, including: 1) employing adaptive rank selection to accommodate a broader range of datasets; 2) focusing more on challenging samples to enhance data mining and processing capabilities; 3) optimizing loss functions tailored for Singular Value Decomposition and challenging samples to improve clustering performance.</p>
</sec>
</body>
<back>
<ack><p>The authors would like to acknowledge the valuable feedback provided by the reviewers.</p>
</ack>
<sec><title>Funding Statement</title>
<p>This work was supported by the Key Research and Development Program of Hainan Province (Grant Nos. ZDYF2023GXJS163, ZDYF2024GXJS014), the National Natural Science Foundation of China (NSFC) (Grant Nos. 62162022, 62162024), the Major Science and Technology Project of Hainan Province (Grant No. ZDKJ2020012), the Hainan Provincial Natural Science Foundation of China (Grant No. 620MS021), the Youth Foundation Project of Hainan Natural Science Foundation (Grant No. 621QN211), and the Innovative Research Project for Graduate Students in Hainan Province (Grant Nos. Qhys2023-96, Qhys2023-95).</p>
</sec>
<sec><title>Author Contributions</title>
<p>The authors confirm contribution to the paper as follows: study conception and design: J. Li; analysis and interpretation of results: J. Li, F. Zeng; draft manuscript preparation: J. Cheng, J. Li, F. Zeng; data collection: Z. Tao, Y. Yang. All authors reviewed the results and approved the final version of the manuscript.</p>
</sec>
<sec sec-type="data-availability"><title>Availability of Data and Materials</title>
<p>After the publication of the paper, the code will be made public on the author&#x2019;s GitHub.</p>
</sec>
<sec sec-type="COI-statement"><title>Conflicts of Interest</title>
<p>The authors declare that they have no conflicts of interest to report regarding the present study.</p>
</sec>
<ref-list content-type="authoryear">
<title>References</title>
<ref id="ref-1"><label>[1]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>K.</given-names> <surname>Berahmand</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Mohammadi</surname></string-name>, <string-name><given-names>R.</given-names> <surname>Sheikhpour</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>Li</surname></string-name>, and <string-name><given-names>Y.</given-names> <surname>Xu</surname></string-name></person-group>, &#x201C;<article-title>WSNMF: Weighted symmetric nonnegative matrix factorization for attributed graph clustering</article-title>,&#x201D; <source>Neurocomputing</source>, vol. <volume>566</volume>, no. <issue>2</issue>, pp. <fpage>127041</fpage>, <year>2024</year>. doi: <pub-id pub-id-type="doi">10.1016/j.neucom.2023.127041</pub-id>.</mixed-citation></ref>
<ref id="ref-2"><label>[2]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>K.</given-names> <surname>Berahmand</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>Li</surname></string-name>, and <string-name><given-names>Y.</given-names> <surname>Xu</surname></string-name></person-group>, &#x201C;<article-title>A deep semi-supervised community detection based on point-wise mutual information</article-title>,&#x201D; <source>IEEE Trans. Comput. Soc. Syst.</source>, pp. <fpage>1</fpage>&#x2013;<lpage>13</lpage>, <year>2023</year>.</mixed-citation></ref>
<ref id="ref-3"><label>[3]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>E.</given-names> <surname>Pan</surname></string-name> and <string-name><given-names>Z.</given-names> <surname>Kang</surname></string-name></person-group>, &#x201C;<article-title>Multi-view contrastive graph clustering</article-title>,&#x201D; <source>Adv. Neural Inf. Process Syst.</source>, vol. <volume>34</volume>, pp. <fpage>2148</fpage>&#x2013;<lpage>2159</lpage>, <year>2021</year>.</mixed-citation></ref>
<ref id="ref-4"><label>[4]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>C.</given-names> <surname>Wang</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Pan</surname></string-name>, <string-name><given-names>R.</given-names> <surname>Hu</surname></string-name>, <string-name><given-names>G. D.</given-names> <surname>Long</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Jiang</surname></string-name> and <string-name><given-names>C.</given-names> <surname>Zhang</surname></string-name></person-group>, &#x201C;<article-title>Attributed graph clustering: A deep attentional embedding approach</article-title>,&#x201D; in <conf-name>Proc. Twenty-Eighth Int. Joint Conf. Artif. Intell.</conf-name>, <publisher-loc>Macao, China</publisher-loc>, <year>2019</year>, pp. <fpage>3670</fpage>&#x2013;<lpage>3676</lpage>.</mixed-citation></ref>
<ref id="ref-5"><label>[5]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>D.</given-names> <surname>Bo</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Wang</surname></string-name>, <string-name><given-names>C.</given-names> <surname>Shi</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Zhu</surname></string-name>, <string-name><given-names>E.</given-names> <surname>Lu</surname></string-name> and <string-name><given-names>P.</given-names> <surname>Cui</surname></string-name></person-group>, &#x201C;<article-title>Structural deep clustering network</article-title>,&#x201D; in <conf-name>Proc. Web Conf. 2020</conf-name>, <publisher-loc>Taipei, Taiwan</publisher-loc>, <year>2020</year>, pp. <fpage>1400</fpage>&#x2013;<lpage>1410</lpage>.</mixed-citation></ref>
<ref id="ref-6"><label>[6]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>W.</given-names> <surname>Tu</surname></string-name> <etal>et al.</etal></person-group>, &#x201C;<article-title>Deep fusion clustering network</article-title>,&#x201D; in <conf-name>Proc. AAAI Conf. Artif. Intell.</conf-name>, <publisher-loc>Vancouver, Canada</publisher-loc>, <year>2021</year>, vol. <volume>35</volume>, pp. <fpage>9978</fpage>&#x2013;<lpage>9987</lpage>. doi: <pub-id pub-id-type="doi">10.1609/aaai.v35i11.17198</pub-id>.</mixed-citation></ref>
<ref id="ref-7"><label>[7]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>Z.</given-names> <surname>Peng</surname></string-name>, <string-name><given-names>H.</given-names> <surname>Liu</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>Jia</surname></string-name>, and <string-name><given-names>J.</given-names> <surname>Hou</surname></string-name></person-group>, &#x201C;<article-title>Attention-driven graph clustering network</article-title>,&#x201D; in <conf-name>Proc. 29th ACM Int. Conf. Multimed.</conf-name>, <publisher-loc>Sichuan, China</publisher-loc>, <year>2021</year>, pp. <fpage>935</fpage>&#x2013;<lpage>943</lpage>.</mixed-citation></ref>
<ref id="ref-8"><label>[8]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>J.</given-names> <surname>Park</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Lee</surname></string-name>, <string-name><given-names>H. J.</given-names> <surname>Chang</surname></string-name>, <string-name><given-names>K.</given-names> <surname>Lee</surname></string-name>, and <string-name><given-names>J. Y.</given-names> <surname>Choi</surname></string-name></person-group>, &#x201C;<article-title>Symmetric graph convolutional autoencoder for unsupervised graph representation learning</article-title>,&#x201D; in <conf-name>Proc. IEEE/CVF Int. Conf. Comput. Vision</conf-name>, <publisher-loc>Paris, France</publisher-loc>, <year>2019</year>, pp. <fpage>6519</fpage>&#x2013;<lpage>6528</lpage>.</mixed-citation></ref>
<ref id="ref-9"><label>[9]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>J.</given-names> <surname>Cheng</surname></string-name>, <string-name><given-names>Q.</given-names> <surname>Wang</surname></string-name>, <string-name><given-names>Z.</given-names> <surname>Tao</surname></string-name>, <string-name><given-names>D.</given-names> <surname>Xie</surname></string-name>, and <string-name><given-names>Q.</given-names> <surname>Gao</surname></string-name></person-group>, &#x201C;<article-title>Multi-view attribute graph convolution networks for clustering</article-title>,&#x201D; in <conf-name>Proc. Twenty-Ninth Int. Joint Conf. Artif. Intell.</conf-name>, <publisher-loc>Yokohama, Japan</publisher-loc>, <year>2021</year>, pp. <fpage>2973</fpage>&#x2013;<lpage>2979</lpage>.</mixed-citation></ref>
<ref id="ref-10"><label>[10]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>S.</given-names> <surname>Pan</surname></string-name>, <string-name><given-names>R.</given-names> <surname>Hu</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Fung</surname></string-name>, <string-name><given-names>G.</given-names> <surname>Long</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Jiang</surname></string-name> and <string-name><given-names>C.</given-names> <surname>Zhang</surname></string-name></person-group>, &#x201C;<article-title>Learning graph embedding with adversarial training methods</article-title>,&#x201D; <source>IEEE Trans. Cybern.</source>, vol. <volume>50</volume>, no. <issue>6</issue>, pp. <fpage>2475</fpage>&#x2013;<lpage>2487</lpage>, <year>2019</year>. doi: <pub-id pub-id-type="doi">10.1109/TCYB.2019.2932096</pub-id>; <pub-id pub-id-type="pmid">31484146</pub-id></mixed-citation></ref>
<ref id="ref-11"><label>[11]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>Z.</given-names> <surname>Tao</surname></string-name>, <string-name><given-names>H.</given-names> <surname>Liu</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Li</surname></string-name>, <string-name><given-names>Z.</given-names> <surname>Wang</surname></string-name>, and <string-name><given-names>Y.</given-names> <surname>Fu</surname></string-name></person-group>, &#x201C;<article-title>Adversarial graph embedding for ensemble clustering</article-title>,&#x201D; in <conf-name>Proc. 28th Int. Joint Conf. Artif. Intell.</conf-name>, <publisher-loc>Macao, China</publisher-loc>, <year>2019</year>, pp. <fpage>3562</fpage>&#x2013;<lpage>3568</lpage>.</mixed-citation></ref>
<ref id="ref-12"><label>[12]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>G.</given-names> <surname>Cui</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Zhou</surname></string-name>, <string-name><given-names>C.</given-names> <surname>Yang</surname></string-name>, and <string-name><given-names>Z.</given-names> <surname>Liu</surname></string-name></person-group>, &#x201C;<article-title>Adaptive graph encoder for attributed graph embedding</article-title>,&#x201D; in <conf-name>Proc. 26th ACM SIGKDD Int. Conf. Knowl. Discov. Data Min.</conf-name>, <publisher-loc>California, USA</publisher-loc>, <year>2020</year>, pp. <fpage>976</fpage>&#x2013;<lpage>985</lpage>.</mixed-citation></ref>
<ref id="ref-13"><label>[13]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>K.</given-names> <surname>Hassani</surname></string-name> and <string-name><given-names>A. H.</given-names> <surname>Khasahmadi</surname></string-name></person-group>, &#x201C;<article-title>Contrastive multi-view representation learning on graphs</article-title>,&#x201D; in <conf-name>Int. Conf. Mach. Learn.</conf-name>, <publisher-loc>Vienna, Austria</publisher-loc>, <year>2020</year>, pp. <fpage>4116</fpage>&#x2013;<lpage>4126</lpage>.</mixed-citation></ref>
<ref id="ref-14"><label>[14]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>W.</given-names> <surname>Xia</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Wang</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Yang</surname></string-name>, <string-name><given-names>Q.</given-names> <surname>Gao</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Han</surname></string-name> and <string-name><given-names>X.</given-names> <surname>Gao</surname></string-name></person-group>, &#x201C;<article-title>Multi-view graph embedding clustering network: Joint self-supervision and block diagonal representation</article-title>,&#x201D; <source>Neural Netw.</source>, vol. <volume>145</volume>, no. <issue>11</issue>, pp. <fpage>1</fpage>&#x2013;<lpage>9</lpage>, <year>2022</year>. doi: <pub-id pub-id-type="doi">10.1016/j.neunet.2021.10.006</pub-id>; <pub-id pub-id-type="pmid">34710786</pub-id></mixed-citation></ref>
<ref id="ref-15"><label>[15]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>Y.</given-names> <surname>Liu</surname></string-name> <etal>et al.</etal></person-group>, &#x201C;<article-title>Deep graph clustering via dual correlation reduction</article-title>,&#x201D; in <conf-name>Proc. AAAI Conf. Artif. Intell.</conf-name>, <publisher-loc>Vancouver, Canada</publisher-loc>, <year>2022</year>, vol. <volume>36</volume>, pp. <fpage>7603</fpage>&#x2013;<lpage>7611</lpage>.</mixed-citation></ref>
<ref id="ref-16"><label>[16]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>L.</given-names> <surname>Liu</surname></string-name>, <string-name><given-names>Z.</given-names> <surname>Kang</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Ruan</surname></string-name>, and <string-name><given-names>X.</given-names> <surname>He</surname></string-name></person-group>, &#x201C;<article-title>Multilayer graph contrastive clustering network</article-title>,&#x201D; <source>Inf. Sci.</source>, vol. <volume>613</volume>, pp. <fpage>256</fpage>&#x2013;<lpage>267</lpage>, <year>2022</year>. doi: <pub-id pub-id-type="doi">10.1016/j.ins.2022.09.042</pub-id>.</mixed-citation></ref>
<ref id="ref-17"><label>[17]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>Y.</given-names> <surname>Yang</surname></string-name> and <string-name><given-names>H.</given-names> <surname>Wang</surname></string-name></person-group>, &#x201C;<article-title>Multi-view clustering: A survey</article-title>,&#x201D; <source>Big Data Min. Anal.</source>, vol. <volume>1</volume>, no. <issue>2</issue>, pp. <fpage>83</fpage>&#x2013;<lpage>107</lpage>, <year>2018</year>.</mixed-citation></ref>
<ref id="ref-18"><label>[18]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>X.</given-names> <surname>Cai</surname></string-name>, <string-name><given-names>C.</given-names> <surname>Huang</surname></string-name>, <string-name><given-names>L.</given-names> <surname>Xia</surname></string-name>, and <string-name><given-names>X.</given-names> <surname>Ren</surname></string-name></person-group>, &#x201C;<article-title>LightGCL: Simple yet effective graph contrastive learning for recommendation</article-title>,&#x201D; in <conf-name>Eleventh Int. Conf. Learn. Rep.</conf-name>, <year>2022</year>, pp. <fpage>1</fpage>&#x2013;<lpage>15</lpage>.</mixed-citation></ref>
<ref id="ref-19"><label>[19]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>G. A.</given-names> <surname>Khan</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Hu</surname></string-name>, <string-name><given-names>T.</given-names> <surname>Li</surname></string-name>, <string-name><given-names>B.</given-names> <surname>Diallo</surname></string-name>, and <string-name><given-names>H.</given-names> <surname>Wang</surname></string-name></person-group>, &#x201C;<article-title>Multi-view data clustering via non-negative matrix factorization with manifold regularization</article-title>,&#x201D; <source>Int. J. Mach. Learn. Cybern.</source>, vol. <volume>13</volume>, no. <issue>3</issue>, pp. <fpage>1</fpage>&#x2013;<lpage>13</lpage>, <year>2022</year>. doi: <pub-id pub-id-type="doi">10.1007/s13042-021-01307-7</pub-id>.</mixed-citation></ref>
<ref id="ref-20"><label>[20]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>T.</given-names> <surname>Chen</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Kornblith</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Norouzi</surname></string-name>, and <string-name><given-names>G.</given-names> <surname>Hinton</surname></string-name></person-group>, &#x201C;<article-title>A simple framework for contrastive learning of visual representations</article-title>,&#x201D; in <conf-name>Int. Conf. Mach. Learn.</conf-name>, <publisher-loc>Vienna, Austria</publisher-loc>, <year>2020</year>, pp. <fpage>1597</fpage>&#x2013;<lpage>1607</lpage>.</mixed-citation></ref>
<ref id="ref-21"><label>[21]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>J.</given-names> <surname>Zbontar</surname></string-name>, <string-name><given-names>L.</given-names> <surname>Jing</surname></string-name>, <string-name><given-names>I.</given-names> <surname>Misra</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>LeCun</surname></string-name>, and <string-name><given-names>S.</given-names> <surname>Deny</surname></string-name></person-group>, &#x201C;<article-title>Barlow twins: Self-supervised learning via redundancy reduction</article-title>,&#x201D; in <conf-name>Int. Conf. Mach. Learn.</conf-name>, <year>2021</year>, pp. <fpage>12310</fpage>&#x2013;<lpage>12320</lpage>.</mixed-citation></ref>
<ref id="ref-22"><label>[22]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>H.</given-names> <surname>Zhong</surname></string-name> <etal>et al.</etal></person-group>, &#x201C;<article-title>Graph contrastive clustering</article-title>,&#x201D; in <conf-name>Proc. IEEE/CVF Int. Conf. Comput. Vis.</conf-name>, <year>2021</year>, pp. <fpage>9224</fpage>&#x2013;<lpage>9233</lpage>.</mixed-citation></ref>
<ref id="ref-23"><label>[23]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>X.</given-names> <surname>Yang</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Hu</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Zhou</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Liu</surname></string-name>, and <string-name><given-names>E.</given-names> <surname>Zhu</surname></string-name></person-group>, &#x201C;<article-title>Interpolation-based contrastive learning for few-label semi-supervised learning</article-title>,&#x201D; <source>IEEE Trans. Neural Netw. Learn. Syst.</source>, vol. <volume>35</volume>, no. <issue>2</issue>, pp. <fpage>2054</fpage>&#x2013;<lpage>2065</lpage>, <year>2022</year>. doi: <pub-id pub-id-type="doi">10.1109/TNNLS.2022.3186512</pub-id>; <pub-id pub-id-type="pmid">35797319</pub-id></mixed-citation></ref>
<ref id="ref-24"><label>[24]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>Y.</given-names> <surname>You</surname></string-name>, <string-name><given-names>T.</given-names> <surname>Chen</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>Sui</surname></string-name>, <string-name><given-names>T.</given-names> <surname>Chen</surname></string-name>, <string-name><given-names>Z.</given-names> <surname>Wang</surname></string-name> and <string-name><given-names>Y.</given-names> <surname>Shen</surname></string-name></person-group>, &#x201C;<article-title>Graph contrastive learning with augmentations</article-title>,&#x201D; <source>Adv. Neural Inf. Process Syst.</source>, vol. <volume>33</volume>, pp. <fpage>5812</fpage>&#x2013;<lpage>5823</lpage>, <year>2020</year>.</mixed-citation></ref>
<ref id="ref-25"><label>[25]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>J.</given-names> <surname>Xia</surname></string-name>, <string-name><given-names>L.</given-names> <surname>Wu</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Chen</surname></string-name>, <string-name><given-names>B.</given-names> <surname>Hu</surname></string-name>, and <string-name><given-names>S. Z.</given-names> <surname>Li</surname></string-name></person-group>, &#x201C;<article-title>SimGRACE: A simple framework for graph contrastive learning without data augmentation</article-title>,&#x201D; in <conf-name>Proc. ACM Web Conf. 2022</conf-name>, <publisher-loc>Barcelona, Spain</publisher-loc>, <year>2022</year>, pp. <fpage>1070</fpage>&#x2013;<lpage>1079</lpage>.</mixed-citation></ref>
<ref id="ref-26"><label>[26]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>P.</given-names> <surname>Bielak</surname></string-name>, <string-name><given-names>T.</given-names> <surname>Kajdanowicz</surname></string-name>, and <string-name><given-names>N. V.</given-names> <surname>Chawla</surname></string-name></person-group>, &#x201C;<article-title>Graph barlow twins: A self-supervised representation learning framework for graphs</article-title>,&#x201D; <source>Knowl. Based Syst.</source>, vol. <volume>256</volume>, no. <issue>4</issue>, pp. <fpage>109631</fpage>, <year>2022</year>. doi: <pub-id pub-id-type="doi">10.1016/j.knosys.2022.109631</pub-id>.</mixed-citation></ref>
<ref id="ref-27"><label>[27]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>N.</given-names> <surname>Lee</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Lee</surname></string-name>, and <string-name><given-names>C.</given-names> <surname>Park</surname></string-name></person-group>, &#x201C;<article-title>Augmentation-free self-supervised learning on graphs</article-title>,&#x201D; <source>Proc. AAAI Conf. Artif. Intell.</source>, vol. <volume>36</volume>, no. <issue>7</issue>, pp. <fpage>7372</fpage>&#x2013;<lpage>7380</lpage>, <year>2022</year>. doi: <pub-id pub-id-type="doi">10.1609/aaai.v36i7.20700</pub-id>.</mixed-citation></ref>
<ref id="ref-28"><label>[28]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>L.</given-names> <surname>Gong</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Zhou</surname></string-name>, <string-name><given-names>W.</given-names> <surname>Tu</surname></string-name>, and <string-name><given-names>X.</given-names> <surname>Liu</surname></string-name></person-group>, &#x201C;<article-title>Attributed graph clustering with dual redundancy reduction</article-title>,&#x201D; in <conf-name>IJCAI</conf-name>, <year>2022</year>, pp. <fpage>1</fpage>&#x2013;<lpage>7</lpage>.</mixed-citation></ref>
<ref id="ref-29"><label>[29]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>V.</given-names> <surname>Nair</surname></string-name> and <string-name><given-names>G. E.</given-names> <surname>Hinton</surname></string-name></person-group>, &#x201C;<article-title>Rectified linear units improve restricted Boltzmann machines</article-title>,&#x201D; in <conf-name>Proc. 27th Int. Conf. Mach. Learn. (ICML-10)</conf-name>, <publisher-loc>Haifa, Israel</publisher-loc>, <year>2010</year>, pp. <fpage>807</fpage>&#x2013;<lpage>814</lpage>.</mixed-citation></ref>
<ref id="ref-30"><label>[30]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>N.</given-names> <surname>Mrabah</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Bouguessa</surname></string-name>, <string-name><given-names>M. F.</given-names> <surname>Touati</surname></string-name>, and <string-name><given-names>R.</given-names> <surname>Ksantini</surname></string-name></person-group>, &#x201C;<article-title>Rethinking graph auto-encoder models for attributed graph clustering</article-title>,&#x201D; <source>IEEE Trans. Knowl. Data Eng.</source>, pp. <fpage>1</fpage>&#x2013;<lpage>31</lpage>, <year>2022</year>.</mixed-citation></ref>
<ref id="ref-31"><label>[31]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>S.</given-names> <surname>Zhou</surname></string-name> <etal>et al.</etal></person-group>, &#x201C;<article-title>Multiple kernel clustering with neighbor-kernel subspace segmentation</article-title>,&#x201D; <source>IEEE Trans. Neural Netw. Learn. Syst.</source>, vol. <volume>31</volume>, no. <issue>4</issue>, pp. <fpage>1351</fpage>&#x2013;<lpage>1362</lpage>, <year>2019</year>. doi: <pub-id pub-id-type="doi">10.1109/TNNLS.2019.2919900</pub-id>; <pub-id pub-id-type="pmid">31265409</pub-id></mixed-citation></ref>
<ref id="ref-32"><label>[32]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>S.</given-names> <surname>Wang</surname></string-name> <etal>et al.</etal></person-group>, &#x201C;<article-title>Fast parameter-free multi-view subspace clustering with consensus anchor guidance</article-title>,&#x201D; <source>IEEE Trans. Image Process.</source>, vol. <volume>31</volume>, no. <issue>4</issue>, pp. <fpage>556</fpage>&#x2013;<lpage>568</lpage>, <year>2021</year>. doi: <pub-id pub-id-type="doi">10.1109/TIP.2021.3131941</pub-id>; <pub-id pub-id-type="pmid">34890327</pub-id></mixed-citation></ref>
<ref id="ref-33"><label>[33]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>S.</given-names> <surname>Wang</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Liu</surname></string-name>, <string-name><given-names>L.</given-names> <surname>Liu</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Zhou</surname></string-name>, and <string-name><given-names>E.</given-names> <surname>Zhu</surname></string-name></person-group>, &#x201C;<article-title>Late fusion multiple kernel clustering with proxy graph refinement</article-title>,&#x201D; <source>IEEE Trans. Neural Netw. Learn. Syst.</source>, vol. <volume>34</volume>, no. <issue>8</issue>, pp. <fpage>4359</fpage>&#x2013;<lpage>4370</lpage>, <year>2021</year>.</mixed-citation></ref>
<ref id="ref-34"><label>[34]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>Z.</given-names> <surname>Hou</surname></string-name> <etal>et al.</etal></person-group>, &#x201C;<article-title>GraphMAE2: A decoding-enhanced masked self-supervised graph learner</article-title>,&#x201D; in <conf-name>Proc. ACM Web Conf. 2023</conf-name>, <publisher-loc>Austin, Texas, USA</publisher-loc>, <year>2023</year>, pp. <fpage>737</fpage>&#x2013;<lpage>746</lpage>.</mixed-citation></ref>
<ref id="ref-35"><label>[35]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>B.</given-names> <surname>Yang</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Fu</surname></string-name>, <string-name><given-names>N. D.</given-names> <surname>Sidiropoulos</surname></string-name>, and <string-name><given-names>M.</given-names> <surname>Hong</surname></string-name></person-group>, &#x201C;<article-title>Towards K-means-friendly spaces: Simultaneous deep learning and clustering</article-title>,&#x201D; in <conf-name>Int. Conf. Mach. Learn.</conf-name>, <publisher-loc>Sydney, Australia</publisher-loc>, <year>2017</year>, pp. <fpage>3861</fpage>&#x2013;<lpage>3870</lpage>.</mixed-citation></ref>
<ref id="ref-36"><label>[36]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>J.</given-names> <surname>Xie</surname></string-name>, <string-name><given-names>R.</given-names> <surname>Girshick</surname></string-name>, and <string-name><given-names>A.</given-names> <surname>Farhadi</surname></string-name></person-group>, &#x201C;<article-title>Unsupervised deep embedding for clustering analysis</article-title>,&#x201D; in <conf-name>Int. Conf. Mach. Learn.</conf-name>, <publisher-loc>New York, USA</publisher-loc>, <year>2016</year>, pp. <fpage>478</fpage>&#x2013;<lpage>487</lpage>.</mixed-citation></ref>
<ref id="ref-37"><label>[37]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>H.</given-names> <surname>Zhu</surname></string-name> and <string-name><given-names>P.</given-names> <surname>Koniusz</surname></string-name></person-group>, &#x201C;<article-title>Simple spectral graph convolution</article-title>,&#x201D; in <conf-name>Int. Conf. Learn. Rep.</conf-name>, <year>2020</year>, pp. <fpage>1</fpage>&#x2013;<lpage>15</lpage>.</mixed-citation></ref>
<ref id="ref-38"><label>[38]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>C.</given-names> <surname>Fettal</surname></string-name>, <string-name><given-names>L.</given-names> <surname>Labiod</surname></string-name>, and <string-name><given-names>M.</given-names> <surname>Nadif</surname></string-name></person-group>, &#x201C;<article-title>Scalable attributed-graph subspace clustering</article-title>,&#x201D; <source>Proc. AAAI Conf. Artif. Intell.</source>, vol. <volume>37</volume>, no. <issue>6</issue>, pp. <fpage>1</fpage>&#x2013;<lpage>9</lpage>, <year>2023</year>. doi: <pub-id pub-id-type="doi">10.1609/aaai.v37i6.25918</pub-id>.</mixed-citation></ref>
<ref id="ref-39"><label>[39]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>H.</given-names> <surname>Zhao</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Yang</surname></string-name>, <string-name><given-names>Z.</given-names> <surname>Wang</surname></string-name>, <string-name><given-names>E.</given-names> <surname>Yang</surname></string-name>, and <string-name><given-names>C.</given-names> <surname>Deng</surname></string-name></person-group>, &#x201C;<article-title>Graph debiased contrastive learning with joint representation clustering</article-title>,&#x201D; in <conf-name>Proc. Thirtieth Int. Joint Conf. Artif. Intell.</conf-name>, <year>2021</year>, pp. <fpage>3434</fpage>&#x2013;<lpage>3440</lpage>.</mixed-citation></ref>
<ref id="ref-40"><label>[40]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>J.</given-names> <surname>Xia</surname></string-name>, <string-name><given-names>L.</given-names> <surname>Wu</surname></string-name>, <string-name><given-names>G.</given-names> <surname>Wang</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Chen</surname></string-name>, and <string-name><given-names>S.</given-names> <surname>Li</surname></string-name></person-group>, &#x201C;<article-title>ProGCL: Rethinking hard negative mining in graph contrastive learning</article-title>,&#x201D; in <conf-name>Int. Conf. Mach. Learn.</conf-name>, <publisher-loc>Maryland, USA</publisher-loc>, <year>2022</year>, pp. <fpage>24332</fpage>&#x2013;<lpage>24346</lpage>.</mixed-citation></ref>
<ref id="ref-41"><label>[41]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>W.</given-names> <surname>Jin</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Liu</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Zhao</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>Ma</surname></string-name>, <string-name><given-names>N.</given-names> <surname>Shah</surname></string-name> and <string-name><given-names>J.</given-names> <surname>Tang</surname></string-name></person-group>, &#x201C;<article-title>Automated self-supervised learning for graphs</article-title>,&#x201D; in <conf-name>10th Int. Conf. Learn. Rep. (ICLR 2022)</conf-name>, <year>2022</year>.</mixed-citation></ref>
<ref id="ref-42"><label>[42]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>Y.</given-names> <surname>Ma</surname></string-name> and <string-name><given-names>K.</given-names> <surname>Zhan</surname></string-name></person-group>, &#x201C;<article-title>Self-contrastive graph diffusion network</article-title>,&#x201D; in <conf-name>Proc. 31st ACM Int. Conf. Multimed.</conf-name>, <publisher-loc>Ottawa, Canada</publisher-loc>, <year>2023</year>, pp. <fpage>3857</fpage>&#x2013;<lpage>3865</lpage>.</mixed-citation></ref>
</ref-list>
</back></article>