Role-Based Network Embedding via Quantum Walk with Weighted Features Fusion

Introduction
Roles are sets of nodes with similar structural patterns and functions. Role-based network embedding aims to project role-similar nodes into a compact low-dimensional vector space. It is widely used in downstream tasks such as role classification. The concept of roles first appeared in sociology [1] for mining potential social relationships, and it has gradually been applied to complex networks, for example to traffic network congestion [2]. By measuring the structural similarity [3], role-similar nodes can be found in distant locations, even in different subgraphs of disconnected networks.
Most previous studies focus on local context, such as node structure, and rarely measure the relevance of nodes globally, so the similarity of nodes that are far apart is ignored. For example, Role-based graph embedding (Role2vec) [4] uses the classical random walk or its variants to extract structural information and aggregate role-based embedding. It selectively extracts structural information from low-order neighborhoods and therefore ignores global information. However, role-similar nodes may be far away from each other in the graph while still showing high structural similarity. As shown in Fig. 1, in the world air-traffic network, landmarks with the same color indicate the same hub role, and landmarks with the same role may be far apart. For example, the airports of Beijing and Washington D.C. are distant from each other, but both serve the capital of their country and have the same status and role, which is reflected by global information. More recently, Role Embedding via Discrete-time quantum walk (RED) [5] considers global role information and highlights the global role relevance of nodes, but ignores the effect of node features. In addition to the features of the node itself, the features of its neighborhood nodes also affect role-based network embedding. For example, a music teacher in a social network may be interested in both music and photography; the teacher's role is influenced more by music than by photography, and the teacher's neighborhood is likely to contain more music lovers than photography lovers. Most current feature-based methods rely on linear aggregation [6] or on manual [7] and recursive [8] feature extraction, which cannot represent the diverse information of neighborhood features.
Characteristic functions on graphs: birds of a Feather (FEATHER) [9] considers this problem and proposes a characteristic function to make meaningful combinations of multiple node and neighborhood features. Nevertheless, the extraction or aggregation of multiple features is extremely sensitive to noise, which may interfere with the final embedding. The variational auto-encoder (VAE) [10] assumes that the input follows an ideal data distribution by adding constraints to the encoder. Therefore, even if noise disturbs networks to different degrees, a corresponding output can still be generated from codes within the disturbed range. Role-based network Embedding via Structural features reconstruction with Degree-regularized constraint (RESD) [11] uses VAE to reduce the effect of noise on the role-based network. It also shows experimentally that noise has an adverse effect on embedding.
In summary, there are three main challenges in current role-based network embedding. 1) Role-similar nodes may be far apart, which makes global role information hard to capture. 2) The varied distributions of nodes and their features are often ignored or simplified. 3) Noise may have an adverse influence on the embedding. To address these challenges, we propose a method called Role-based network Embedding via Quantum walk with weighted Features fusion (REQF).
Firstly, we superimpose all walk paths based on the superposition of the quantum walk to capture the global role information by multi-step evolution sequences and use the biased quantum walk to emphasize the local structure and learn role closeness. Secondly, we design a quantum walk weighted characteristic function to fuse features of nodes and their neighborhoods by different distributions. Finally, we use VAE to reduce the effects of noise and generate role-based network embedding. Extensive experiments on real-world networks demonstrate the effectiveness and stability of the proposed REQF.
Our main contributions are listed as follows: 1) We propose a novel method, REQF, which simultaneously considers the influence of global and local role information, node features, and the effect of noise for generating role-based network embedding. 2) We utilize quantum walk to capture role information from global and local perspectives to solve the node distance problem, and we construct a quantum walk weighted characteristic function, which uses the quantum walk as the probability weights of the characteristic function, to take node features into consideration for feature fusion. 3) We use VAE to reduce the effect of noise in the network and obtain the optimal role-based network embedding.
The rest of the paper is organized as follows: In Section 2, we briefly review the related work. In Section 3, we introduce the proposed REQF in detail. In Section 4, we report experimental results on real-world networks. Finally, in Section 5, we conclude the paper.

Related Work
In graph networks, roles are sets of nodes with similar structural patterns and functions. They were first defined as classes of structurally equivalent nodes, where structural equivalence is measured by the structural similarity of nodes [12]. Role-based network embedding aims to embed role-similar nodes into a compact embedding space.
Role-based networks are more concerned with the structural similarity of nodes and are independent of distance. To represent role information, Role-based graph embedding (Role2vec) [4] uses an attribute-based random walk to capture the role information of nodes. Learning structural Node Embeddings on Graphs via diffusion Wavelets (GraphWave) [13] learns structural node embedding by diffusion wavelets. Fast structural node embedding via role identification (RiWalk) [14] uses random walk to approximate graph kernels for role identification. Graph neural network with Local Structural Patterns (GraLSP) [15] incorporates local structure into sets of neighborhoods by anonymous walk. These random-walk-based methods are selective in obtaining structural information and do not consider the global role information of nodes, which results in information loss. Thus, role-based network embedding needs a technique to overcome the limitations of distance and global information loss.
Quantum walk [16] is the quantization of the walk based on the principle of random motion of particles. The generation of a walking path is determined by the walk position, which indicates the current location of the node, and the walk state, which indicates the transfer process of walking to the next node. The classical random walk generates only one path for each walk, and no superposition is allowed. In contrast, the quantum walk has the property of coherent superposition, from which its randomness arises [17]. It can be described as the superposition of all walk states, and it walks along all paths simultaneously. As shown in Fig. 2, quantum walk allows all nodes to interact with the whole network simultaneously. Presently, the quantum walk has considerable research prospects in the field of role embedding. Most quantum walks on graphs are implemented by calculating the superposition of probability distributions. Fast Quantum Walk Kernel (FQWK) [18] uses quantum walk as the convolution kernel to measure the similarity of neighborhood substructures between node pairs. However, the method does not consider the structural influence of nodes from the global perspective. RED [5] uses quantum walk to capture the role information of nodes from global and local perspectives, but it ignores the feature information of nodes. It also proposes that role-similar nodes have similar evolution sequences and that discrete-time quantum walk (DTQW) models are highly correlated with the structural information of nodes. Motivated by this, we introduce the DTQW model to learn the global and local structural patterns of nodes.
On the other hand, most current feature-based network embedding methods generate embedding by simple extraction or linear aggregation of features. For example, Role eXtraction (RolX) [6] aggregates node features based on matrix decomposition. Graph Auto-encoder guided by Structural information (GAS) [7] uses manually extracted features as guidance to train the graph auto-encoder. Role Discovery-guided network embedding based on Autoencoder and Attention mechanism (RDAA) [8] uses recursive feature extraction and incorporates attention mechanisms to guide role discovery in auto-encoders. These methods treat all neighborhood features of nodes as uniformly distributed, ignoring the fact that different distributions of neighborhood features may have different effects on the central node. FEATHER [9] instead uses a complex-valued characteristic function to make meaningful combinations of multiple features and proves that characteristic functions can obtain effective neighborhood features at discrete evaluation points. However, its probability weights are simply described by powers of the adjacency matrix, which cannot represent the correlated information well. Therefore, we use the quantum walk to represent the probability weights of the characteristic function, which can implicitly highlight the role information and fuse various features in role-based network embedding.
Furthermore, noise affects the generation of high-quality network embedding. Due to the overreliance on neighborhood information, the feature-based methods are sensitive to noise, causing side effects on the generation of embedding. RESD [11] uses VAE to reduce the effect of noise after extracting node features and preserve structural identity with degree regularization. Adversarial Variational Autoencoder Embedding (AVAE) [19] utilizes VAE to avoid noise from obstructing the convergence of generated data. Dual Adversarial Variational Embedding (DAVE) [20] proposes a model based on VAE for recommendations providing personalized noise reduction for different users and items. Purified Graph Generation (PGG) [21] also uses a variant of VAE to filter noise during graph generation. These methods indicate that the influence of noise can be reduced by using the approach of an encoder. Therefore, we attempt to use VAE to obtain a more accurate role-based embedding.
In this paper, we try to overcome the limitations of role-based embedding in terms of node distance and features: we use the quantum walk and a weighted characteristic function to capture global role information and fuse node features, and we use VAE to reduce the effect of noise and obtain a more compact role-based network embedding.

The Proposed Method
In this section, we introduce the notation and the proposed method REQF. We first start with an overview and then present the detailed designs, including role information representation, feature information fusion, and noise effect reduction. Finally, we analyze the computational complexity of REQF.
For clarity, the symbols and their definitions are listed in Table 1.
Table 1: Symbols and their definitions
$|\psi\rangle$: The state of quantum walk
$|v\rangle$: The superposition of quantum states at the node $v$
$U$: The evolution operator of quantum walk
$\alpha_v$: The complex-valued probability amplitude of node $v$
$p_t^v$: The probability of quantum walk being at a node $v$ and evolution time $t$
$M_t$: The probability distribution of all nodes at the $t$-th evolution
$f(\cdot)$: The summation function by row
$S(\cdot)$: The similarity measure function
$N(v)$: The neighborhood nodes of node $v$
$A$: The adjacency matrix
$d(v_i)$: The degree of a node $v_i$
$p_{biased}$: The biased probability
$t_g$, $t_l$: The evolution times of the global part and local part
sort$(\cdot)$: The proximity ranking function by row
$x_v$: The feature vector of node $v$
$\theta$: The evaluation points
$z_{global}$, $z_{local}$, $z_{feature}$, $z$: The node representation of the global role, local role, feature, and VAE
$Z$: Role-based network embedding

Definition 1 Quantum Walk: Given an undirected and unweighted graph $G = (V, E)$, $V = \{v_1, v_2, \ldots, v_n\}$ is the set of $n$ nodes and $E$ is the set of edges. A quantum walk consists of the state of the quantum walk $|\psi\rangle$ and the state evolution $|\psi\rangle \xrightarrow{U} |\psi'\rangle$. The state of the quantum walk is described by a coordinate wave function $|\psi\rangle = \sum_{v} \alpha_v |v\rangle$, where $|v\rangle$ denotes the superposition of quantum states at the node $v$, which is an $n$-dimensional column vector, and $\alpha_v$ is the corresponding complex-valued probability amplitude. In the state evolution process, the evolution operator $U$ is used to represent the position and direction of the quantum walk.
We denote by $p_t^v$ the probability of the quantum walk being at node $v$ at evolution time $t$. The result of the quantum walk, $M_t$, is an $n \times n$ matrix representing the probability distribution of all nodes at the $t$-th evolution.
Definition 2 Role Similarity: Roles are sets of nodes with similar structural patterns and functions in graph networks. The role similarity of nodes is defined by the evolution sequence that is formed by summing the probability distribution matrix of the quantum walk.
Here $q$ is the evolution sequence of the quantum walk, $f(\cdot)$ is the summation function by row, and $S(\cdot)$ is the similarity measure function. The closer the value of $S(\cdot)$ is to 1, the more similar the roles are.
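Definition 2 can be illustrated with a small sketch. It assumes that the probability distribution matrices $M_1, \ldots, M_T$ are already available, that $f(\cdot)$ collapses each $M_t$ to one value per node (here: the total probability arriving at each node), and that $S(\cdot)$ is cosine similarity. Both choices, and the function names, are assumptions for illustration; the source does not fix them at this point.

```python
import numpy as np

def evolution_sequences(M_seq):
    """Stack T probability matrices of shape (n, n) into per-node sequences (n, T).

    Assumption: f(.) sums each M_t over its first index, so entry v holds the
    total probability that arrives at node v at step t.
    """
    M_seq = np.asarray(M_seq, dtype=float)   # shape (T, n, n)
    return M_seq.sum(axis=1).T               # shape (n, T)

def role_similarity(q):
    """Cosine similarity between evolution sequences (assumed choice of S)."""
    norms = np.linalg.norm(q, axis=1, keepdims=True)
    unit = q / np.clip(norms, 1e-12, None)   # guard against all-zero sequences
    return unit @ unit.T

# Toy example: nodes 0 and 2 play the same structural role.
M1 = np.array([[0.50, 0.50, 0.00],
               [0.25, 0.50, 0.25],
               [0.00, 0.50, 0.50]])
M2 = np.array([[0.2, 0.6, 0.2],
               [0.3, 0.4, 0.3],
               [0.2, 0.6, 0.2]])
S = role_similarity(evolution_sequences([M1, M2]))
```

In this toy example the two end nodes receive identical sequences, so their similarity equals 1, while their similarity to the middle node is lower.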

Definition 3 Role-based Network Embedding:
For a graph $G$, the neighborhood nodes of a node $v$ are defined as $N(v) = \{u \in V \mid (u, v) \in E\}$. Role-based network embedding $Z \in \mathbb{R}^{n \times d}$ aims to generate an embedding that maximally preserves the role information contained in the graph, where $d$ is the embedding dimension. $z_v \in \mathbb{R}^d$ maps each node $v$ into a low-dimensional vector space that contains role information.
Role-similar nodes would be embedded into similar representations.

Overview of REQF
The illustration of our framework REQF is shown in Fig. 3. REQF consists of three parts: quantum walk, feature fusion, and VAE.
First, we utilize quantum walk to capture the global role information $z_{global}$ (red arrow flow) and the local role proximity $z_{local}$ (blue arrow flow) of nodes through the evolution sequence $q$. The block graphs represent the process of the quantum walk evolution as an operator $U$. The dotted arrows around nodes indicate that all nodes are involved in the evolution at the same time, and the thick dotted arrows indicate the biased process of the quantum walk, in which the probability $p_{biased}$ gives the node a higher initial probability. Then, we use the quantum walk weighted characteristic function for feature fusion (yellow arrow flow): $M_t$ weights the characteristic function, which is evaluated at the evaluation points $\theta$ to obtain the node neighborhood feature distribution $z_{feature}$ with structural information. Finally, we leverage VAE to reduce the effect of noise, obtain the node representation $z$, and generate the embedding $Z$.

Role Information Representation
REQF captures role information via quantum walk. It captures global and local role information by controlling the initial state probability. The global role information is the role relevance between nodes across the whole network. The local role information is the role proximity between nodes based on their local structure, and it places more emphasis on the correlation between nodes and their neighborhoods.

Global Role Information
When all nodes have the same initial state probability, they evolve simultaneously over the entire network. REQF simulates this global evolution with the probability distribution matrix $M_t$ of the quantum walk to represent the global role relevance of nodes and generate the global role representation $z_{global}$. The formulated Grover operator [22] is shown in Eq. (5).
where $d(v_i)$ is the degree of a node $v_i$, $A$ is the adjacency matrix, $f(\cdot)$ is the summation function by row, and $t_g$ is the evolution times of the quantum walk.
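The following is a minimal sketch of how such a probability distribution matrix could be simulated classically. It assumes a flip-flop coined walk on the arcs of the graph with the Grover coin $\frac{2}{d(v)} J - I$ at every node, and it starts the walker from each node in turn so that row $u$ of the result is the node distribution after $t$ steps from $u$. The function names, the arc-based state space, and the per-node initial states are assumptions for illustration and may differ from the paper's exact formulation in Eq. (5). The graph is assumed to have no isolated nodes and no self-loops.

```python
import numpy as np

def grover_walk_operator(A):
    """Build U = S @ C for a coined quantum walk on an undirected graph.

    The state lives on arcs (u, v): the walker sits at u and points to v.
    """
    n = A.shape[0]
    arcs = [(u, v) for u in range(n) for v in range(n) if A[u, v]]
    index = {arc: k for k, arc in enumerate(arcs)}
    deg = A.sum(axis=1)
    m = len(arcs)
    C = np.zeros((m, m))
    S = np.zeros((m, m))
    for (u, v), k in index.items():
        # Grover coin on the arcs leaving u: (2 / d(u)) * J - I
        for w in np.flatnonzero(A[u]):
            C[k, index[(u, int(w))]] = 2.0 / deg[u]
        C[k, k] -= 1.0
        # Flip-flop shift: arc (u, v) is mapped to arc (v, u)
        S[index[(v, u)], k] = 1.0
    return S @ C, arcs

def distribution_matrix(A, t):
    """Row u holds the node probabilities after t steps when starting at u."""
    U, arcs = grover_walk_operator(A)
    n = A.shape[0]
    src = np.array([u for u, _ in arcs])
    deg = A.sum(axis=1)
    Ut = np.linalg.matrix_power(U, t)
    M = np.zeros((n, n))
    for u in range(n):
        # Equal amplitude on every arc leaving u (normalized initial state)
        psi0 = np.where(src == u, 1.0 / np.sqrt(deg[u]), 0.0)
        prob = np.abs(Ut @ psi0) ** 2
        # Probability of being at a node = sum over the arcs leaving it
        M[u] = np.bincount(src, weights=prob, minlength=n)
    return M

# Small example graph: a triangle 0-1-2 with a pendant node 3 attached to 2.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]])
M3 = distribution_matrix(A, 3)
```

Because the shift is a permutation and each Grover coin block is orthogonal, $U$ is orthogonal, so every row of the returned matrix sums to 1. Building $U$ densely is only practical for small graphs, which is consistent with the simulation cost discussed in the complexity analysis.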

Local Role Information
If the initial state probability of a particular node is higher, the evolution of the quantum walk is biased to emphasize the similarity of the local structure of the node [23], and the resulting probability distribution can be viewed as the correlation between the node and its neighborhood. Thus, REQF emphasizes the local structure of the node through the biased quantum walk $M_t$ to learn the role similarity of the node and generate a local role representation $z_{local}$.
where $p_{biased}$ is the biased probability, $t_l$ is the evolution times of the biased quantum walk, and sort$(\cdot)$ is the proximity ranking function by row, which finds other nodes with high role similarity to the central node.
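As a rough illustration of the two ingredients named above, the sketch below builds a biased initial distribution and ranks nodes row by row. It assumes that the central node receives probability $p_{biased}$ and that the remaining mass is spread uniformly over the other nodes; that split, and both function names, are assumptions made for this example rather than the definition used in the paper.

```python
import numpy as np

def biased_initial_distribution(n, center, p_biased):
    """Give `center` probability p_biased; spread the rest uniformly (assumed)."""
    p0 = np.full(n, (1.0 - p_biased) / (n - 1))
    p0[center] = p_biased
    return p0

def rank_by_proximity(M):
    """For each row, return node indices ordered from highest to lowest probability."""
    return np.argsort(-M, axis=1, kind="stable")

p0 = biased_initial_distribution(5, center=2, p_biased=0.9)
ranks = rank_by_proximity(np.array([[0.1, 0.7, 0.2],
                                    [0.5, 0.2, 0.3]]))
# ranks is [[1, 2, 0], [0, 2, 1]]
```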

Feature Information Fusion
Different distributions of neighborhood features may have different effects on nodes in real networks. FEATHER has theoretically demonstrated that graphs with the same structure have the same Characteristic Functions (CF). Therefore, considering the varied distributions of node neighborhood features, we use the CF on the graph to describe the neighborhood feature distribution of the nodes, instead of fusing feature information by simple linear aggregation. The probability weights of the CF are defined by the quantum walk, so that the resulting feature information highlights the structural properties of the nodes.
According to the definition of the CF, we define the probability weights as the probability value of the quantum walk distribution matrix $M_t$ between the source node $v_i$ and the target node $v_j$. REQF uses the quantum walk weighted CF to fuse features and obtain a better and richer node representation $z_{feature}$.
where $x_{v_j}$ is the feature vector of the target node $v_j$, and $\theta$ is the value of the evaluation points.
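A weighted characteristic function of this kind can be sketched as follows. The sketch follows the FEATHER-style construction, in which the real and imaginary parts at an evaluation point $\theta$ are the weighted averages of $\cos(\theta x_{v_j})$ and $\sin(\theta x_{v_j})$ over target nodes, and it takes the weights from a given matrix $M_t$. Normalizing each row of $M_t$ to sum to 1, concatenating the parts for all evaluation points, and the function name are assumptions for illustration.

```python
import numpy as np

def weighted_characteristic_function(M, X, thetas):
    """Evaluate a probability-weighted characteristic function per node.

    M:      (n, n) non-negative weights; M[i, j] weights target j for source i.
    X:      (n, m) node feature matrix.
    thetas: iterable of k evaluation points.
    Returns an (n, 2 * m * k) matrix of real and imaginary parts.
    """
    M = np.asarray(M, dtype=float)
    X = np.asarray(X, dtype=float)
    W = M / M.sum(axis=1, keepdims=True)       # assumed row normalization
    blocks = []
    for theta in thetas:
        blocks.append(W @ np.cos(theta * X))   # real part of E[exp(i * theta * x)]
        blocks.append(W @ np.sin(theta * X))   # imaginary part
    return np.concatenate(blocks, axis=1)

# Toy usage: 3 nodes, 2 features, 2 evaluation points.
M = np.array([[0.6, 0.3, 0.1],
              [0.2, 0.5, 0.3],
              [0.1, 0.3, 0.6]])
X = np.array([[0.0, 1.0],
              [1.0, 0.0],
              [2.0, 2.0]])
z_feature = weighted_characteristic_function(M, X, thetas=[0.5, 1.0])
```

Unlike a linear aggregation such as `W @ X`, the cosine and sine terms keep information about the shape of the neighborhood feature distribution, not only its mean.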

Noise Reduction and Embedding Generation
In reality, a lot of noise exists in extracting the role and feature information of nodes, and some node features unrelated to roles may interfere with the final role-based network embedding. To reduce the effect of noise, we use VAE to encode node role and feature information into the node embedding representation.
We use a Multi-Layer Perceptron (MLP) as our encoder, where $W^l$ and $b^l$ represent the weight and bias of the $l$-th layer, $L$ is the number of hidden layers in the encoder, and $\tanh(\cdot)$ is the hyperbolic tangent activation function. The vector $z_{feature}(v_i)$ contains the role and feature information of node $v_i$, and $h^0_{v_i} = z_{feature}(v_i)$ is the input of the encoder.
The node embedding representation $z_{v_i}$ of node $v_i$ follows a Gaussian distribution parameterized by $\mu_{v_i}$ and $\sigma_{v_i}$. It can be obtained through the reparameterization trick [11], which allows the parameters $\mu_{v_i}$ and $\sigma_{v_i}$ to be learned.
Here $W_\mu$, $W_\sigma$, $b_\mu$, and $b_\sigma$ are the weights and biases of the last layer of the encoder, $\odot$ is the element-wise product, and $\iota$ follows a standard Gaussian distribution, $\iota \sim N(0, I)$.
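The encoder and the reparameterization step can be sketched in NumPy as below. The sketch assumes tanh hidden layers, a linear head for $\mu$, and a head that predicts $\log \sigma$ so that $\sigma$ stays positive; the log parameterization, the row-vector convention `h @ W + b`, and all names are assumptions for illustration, and no training loop or loss is shown.

```python
import numpy as np

def encode(h0, layers, W_mu, b_mu, W_sigma, b_sigma, rng):
    """Forward pass of an MLP encoder followed by the reparameterization trick.

    h0:     (n, r) input representations, one row per node.
    layers: list of (W, b) pairs for the hidden layers.
    """
    h = h0
    for W, b in layers:
        h = np.tanh(h @ W + b)                 # h^l = tanh(h^{l-1} W^l + b^l)
    mu = h @ W_mu + b_mu
    sigma = np.exp(h @ W_sigma + b_sigma)      # assumption: head outputs log(sigma)
    iota = rng.standard_normal(mu.shape)       # iota ~ N(0, I)
    z = mu + sigma * iota                      # z = mu + sigma (element-wise) iota
    return z, mu, sigma

rng = np.random.default_rng(0)
h0 = rng.standard_normal((5, 4))               # 5 nodes, 4 input features
layers = [(rng.standard_normal((4, 3)), np.zeros(3))]
W_mu, b_mu = rng.standard_normal((3, 2)), np.zeros(2)
W_sigma, b_sigma = rng.standard_normal((3, 2)), np.zeros(2)
z, mu, sigma = encode(h0, layers, W_mu, b_mu, W_sigma, b_sigma, rng)
```

Sampling $\iota$ separately from the learned parameters keeps $z$ a deterministic function of $\mu$ and $\sigma$ given $\iota$, which is what makes the parameters learnable by gradient descent.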
Equally, we use an MLP as the decoder, where $\hat{W}^l$ and $\hat{b}^l$ are the weight and bias of the $l$-th hidden layer in the decoder, $\hat{h}^0_{v_i} = z_{v_i}$ is the input of the decoder, and $\hat{h}^L_{v_i}$ is the reconstructed vector. Finally, we define the loss function and obtain the final embedding. The whole process of REQF is described in Algorithm 1.

Algorithm 1: Process of REQF
Input: A network $G = (V, E)$, a set of node feature vectors $X = \{x_1, x_2, \ldots, x_n\}$, quantum walk evolution times $t_g$ and biased evolution times $t_l$
Output: Role-based node embedding $Z$
1: //Capture node role information
2: Initialize the probability distribution matrix $M_0$ by Eq. (6);
3: for each evolution time $t_g$ do
4:     Compute the global probability distribution matrix $M_t$ by Eq. (5);
5:     Obtain global role representation $z_{global}$ by Eq. (7);
6: end for
7: Initialize the probability distribution matrix $M_0$ by Eq. (8);
8: for each biased evolution time $t_l$ do
9:     Compute the local probability distribution matrix $M_t$ by Eq. (5);
10:     Obtain local role representation $z_{local}$ by Eq. (9);
11: end for
12: //Fuse node feature information
13: Obtain feature-fused representation $z_{feature}$ by Eq. (10);
14: //Variational auto-encoder
15: for each epoch do
16:     Encode node information $z$ by Eq. (14)

Computational Complexity Analysis
The computational complexity of the proposed REQF mainly depends on the quantum walk and VAE, which are highly related to the number of nodes and feature dimensions.
Given a network $G = (V, E)$, $V$ is the set of nodes. REQF captures the global (lines 2-6) and local (lines 7-11) role information of nodes through the quantum walk, which takes $O(t_g + t_l \cdot |V|)$ time. In the quantum walk module, we simulate the quantum walk on classical computers, and the complexity is $O(|V|^3)$. The process of feature fusion (line 13) takes $O(m t_g \cdot |V|)$, where $m$ denotes the dimension of the node features. In the VAE module, the computational complexity is $O(r^2 d)$, where $r$ denotes the dimension of the input features and $d$ denotes the output dimension. Therefore, the overall computational complexity of REQF is $O(t_l \cdot |V|^4 + r^2 d)$.
The computational complexity of REQF here is relatively high because the quantum walk is simulated on a conventional computer. The complexity of an actual quantum walk is only $O(1)$, so the theoretical complexity of REQF is only $O(m t_g \cdot |V| + r^2 d)$. If future technologies allow the quantum walk to be executed directly rather than simulated, the present high complexity would be greatly reduced.

Experiments
In this section, we evaluate the performance of REQF on real-world networks from the tasks of role classification, role detection and visualization, parameter sensitivity, and ablation study.

Datasets
We conduct experiments on seven real-world networks with unweighted, undirected edges. The detailed statistics of the datasets are shown in Table 2. The air-traffic networks [11] describe air traffic in different regions, including America, Brazil, and Europe. Cora [15] is a science-related citation network. Actor [3] is a social network of American actors. LastFM Asia [9] is a social network of the LastFM music platform composed of Asian users. Film [3] is an English-language film network.

Experiment Setting
We compare REQF with classic and advanced role-based network embedding methods including Role2vec [4], FEATHER [9], RED [5], and RESD [11]. All the experimental parameters are set to the default values provided in their papers.
For REQF, the final parameters are chosen according to the parameter sensitivity experiment. We set the quantum walk evolution times $t_g = 16$, the biased evolution times $t_l = 4$, the biased probability $p_{biased} = 0.9$, and the evaluation points $\theta = 5$. For datasets with missing features, such as the Film dataset, we use ReFex [24] to extract the features of nodes. In VAE, we use a 1-layer MLP model with the ReLU activation function, set the number of hidden layers to 2, and set the number of epochs to 50.

Role Classification
We conduct role-based node classification on seven real-world networks. In this experiment, we feed the embedding into a linear logistic regression classifier; 70% of the data is randomly selected for training, and the rest is used for testing. We run all methods 20 times and compute their average F1 score to measure the accuracy of the classification and the area under the curve (AUC) score to measure its quality. The results are shown in Tables 3-5. They indicate that the proposed REQF achieves the best performance on five networks, with a score 14.6% higher than that of the best baseline method. However, the performance of REQF on the Cora and LastFM Asia networks is lower than that of FEATHER and Role2vec, respectively. The reason may be that the users of these two social networks are mostly mutual strangers, with relatively equal interconnection and less obvious role characteristics, resulting in poor performance in role-based classification. This is supported by the fact that RED and RESD, which are also based on role embedding, likewise perform poorly on both datasets. Overall, the results demonstrate the effectiveness of REQF in the role classification task.

Role Detection
Role detection is one of the important tasks of role-based network embedding. It clusters the embedding into the existing classes, which represent different roles in the real network. In this experiment, we apply the K-means clustering method to the embeddings of the Brazil flights dataset and observe the performance.
We evaluate the clustering using four popular metrics: Adjusted Mutual Information (AMI) [25] to assess the closeness between the clustering results and the true classes of the samples, Adjusted Rand Index (ARI) [26] to measure the degree of fit between the two data distributions, V-measure [27] to indicate the closeness of the group divisions, and Silhouette score (Sil) [28] to indicate the closeness of each group class. All four metrics have a maximum of 1, and a value closer to 1 means a better result. V-measure ranges from 0 to 1, Sil ranges from -1 to 1, and AMI and ARI are adjusted for chance, so values near 0 (or slightly below) correspond to random labelings.
The results are shown in Table 6, where REQF- denotes REQF without the VAE that reduces the effect of noise. REQF significantly outperforms all other models in role detection because it considers node role information, features, and the effect of noise at the same time, which other methods ignore.
Compared with the best baseline, REQF is improved by 16.5% in AMI, 38.9% in ARI, 15.5% in V-measure, and 24.1% in Sil. An intuitive comparison is shown in Fig. 4. It is obvious that our REQF shows the best performance in role detection.

Parameter Sensitivity
For parameters of REQF, we conduct experiments to analyze the sensitivity of all parameters on the Brazil flights network. Fig. 5 plots the micro-F1 score of the role classification task with varying parameters.
Effect of $t_g$. As shown in Fig. 5a, when the evolution times $t_g < 16$, the micro-F1 score fluctuates slightly, and as the evolution times increase, the model gradually becomes stable. However, when $t_g > 16$, the model performance decreases significantly. Therefore, we obtain the best model performance when $t_g = 16$.
Effect of $t_l$. Fig. 5b shows how the biased evolution times $t_l$ affect the performance. We can observe that REQF gets the best results with $t_l = 4$. When $t_l = 2$, the average score is higher but fluctuates more, which makes the model less stable; the fluctuation is largest at $t_l = 8$, which has the longest box. Thus, the performance does not improve as the biased evolution times increase, which indicates that the quantum walk can capture network information and achieve better performance within short evolution times.
Effect of $p_{biased}$. The results of varying the biased probability $p_{biased}$ are shown in Fig. 5c. We get the optimal mean value when $p_{biased} = 0.2$ and the second-best when $p_{biased} = 0.9$, but the model is more stable at $p_{biased} = 0.9$. We therefore consider the performance of our model to be best when the biased probability is 0.9.
Effect of $\theta$. Fig. 5d shows the influence of the evaluation points on REQF performance. When $\theta = 1$, we get the highest score, but the box is long, which means the score fluctuates greatly. When $\theta = 7$, the box is the shortest and the model is the most stable. On the whole, varying $\theta$ has little effect on the mean value, so the appropriate parameter value can be set according to the demands of downstream tasks.

Ablation Study
To explore the influence of quantum walk, CF, and VAE modules on the REQF, we compare the performance between REQF and REQF without the quantum walk (REF), feature fusion (REQ), and VAE (REQF-). The results are shown in Fig. 6.
REQF clearly outperforms the other models, showing the best performance. REF is second-best but still inferior to REQF, indicating the necessity of global role information and the effectiveness of the quantum walk module in capturing it. The results of REQF- and REF are close, which suggests that the noise reduction of VAE plays a role as important as that of the quantum walk. REQ has the worst results, especially on the Sil metric, which is significantly lower than all the others. This indicates that the CF greatly affects the performance of REQF, especially in clustering tasks, and that it is necessary to consider the different distributions of node features and neighborhoods. Hence, all modules play an important role in REQF, and each of them improves its overall performance.

Visualization
In order to visualize the effect of role clustering, we compare REQF with baselines for visualization on the Brazil flights dataset due to its proper size and uniform classes of node roles. We use t-SNE [29] to get the dimensional reduction visualization, which is sensitive to the local structure of nodes and exhibits more stability than other visualization algorithms [30]. The roles are mapped into the color of points, points with the same color should be close to each other, while points with different colors should be far apart.
As shown in Fig. 7, each node is represented as a point, and the color of the point indicates the node's role. The visualization results show that the embedding of REQF (Fig. 7f) is well clustered in general, although a few individual nodes are clustered incorrectly. This is because REQF considers global and local role information, node features, and noise simultaneously. The points in different colors of Role2vec (Fig. 7a) are mixed up, as it only considers the local structure of nodes, and the points cannot be classified well. RED (Fig. 7b) is better than Role2vec as it considers global role information. FEATHER (Fig. 7c) considers node features, so its points cluster more closely, but it still cannot effectively achieve role clustering. RESD (Fig. 7d) is more effective, as the points of the same color are quite close to each other, but there is no obvious division between the different roles of nodes because it ignores global information. In addition, we conduct the experiment with REQF- (Fig. 7e), that is, REQF without VAE, and its visualization illustrates that the VAE indeed reduces the effect of noise and gives the model a better role clustering ability. Therefore, REQF has the best visualization effect, and the model can divide the roles of nodes well in the real network.

Conclusion
In this paper, we propose REQF to generate role-based network embedding via quantum walk and its weighted feature fusion, which simultaneously considers node role information, node features, and noise. REQF utilizes quantum walk to capture the global and local role information of nodes, leverages its weighted characteristic function for feature fusion, and finally uses VAE to reduce the effect of noise.
The experimental results demonstrate the effectiveness and stability of REQF on real-world datasets for downstream tasks. We also demonstrate the importance of each module and explore the optimal values of the parameters. Our work has two limitations. On the one hand, we only use a simple multi-layer perceptron, which may not be able to reduce the effect of noise sufficiently. On the other hand, our proposed method only focuses on homogeneous networks, and it cannot be expanded to heterogeneous graph networks. For future work, we can try a more efficient deep-learning framework to obtain a better representation of the role-based network. We can also explore the field of heterogeneous role-based graph networks.
Funding Statement: This work was supported in part by the National Natural Science Foundation of China (Grant 62172065) and the Natural Science Foundation of Chongqing (Grant cstc2020jcyj-msxmX0137).

Conflicts of Interest:
The authors declare that they have no conflicts of interest to report regarding the present study.