<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.1 20151215//EN" "http://jats.nlm.nih.gov/publishing/1.1/JATS-journalpublishing1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xml:lang="en" article-type="research-article" dtd-version="1.1">
<front>
<journal-meta>
<journal-id journal-id-type="pmc">CMC</journal-id>
<journal-id journal-id-type="nlm-ta">CMC</journal-id>
<journal-id journal-id-type="publisher-id">CMC</journal-id>
<journal-title-group>
<journal-title>Computers, Materials &#x0026; Continua</journal-title>
</journal-title-group>
<issn pub-type="epub">1546-2226</issn>
<issn pub-type="ppub">1546-2218</issn>
<publisher>
<publisher-name>Tech Science Press</publisher-name>
<publisher-loc>USA</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">73500</article-id>
<article-id pub-id-type="doi">10.32604/cmc.2025.073500</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Non-Euclidean Models for Fraud Detection in Irregular Temporal Data Environments</article-title>
<alt-title alt-title-type="left-running-head">Non-Euclidean Models for Fraud Detection in Irregular Temporal Data Environments</alt-title>
<alt-title alt-title-type="right-running-head">Non-Euclidean Models for Fraud Detection in Irregular Temporal Data Environments</alt-title>
</title-group>
<contrib-group>
<contrib id="author-1" contrib-type="author">
<name name-style="western"><surname>Kim</surname><given-names>Boram</given-names></name></contrib>
<contrib id="author-2" contrib-type="author" corresp="yes">
<name name-style="western"><surname>Choi</surname><given-names>Guebin</given-names></name><email>guebin@jbnu.ac.kr</email></contrib>
<aff id="aff-1"><institution>Department of Statistics, Institute of Applied Statistics, Jeonbuk National University</institution>, <addr-line>Jeonju, 54896</addr-line>, <country>Republic of Korea</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>&#x002A;</label>Corresponding Author: Guebin Choi. Email: <email>guebin@jbnu.ac.kr</email></corresp>
</author-notes>
<pub-date date-type="collection" publication-format="electronic">
<year>2026</year>
</pub-date>
<pub-date date-type="pub" publication-format="electronic">
<day>10</day><month>2</month><year>2026</year>
</pub-date>
<volume>87</volume>
<issue>1</issue>
<elocation-id>74</elocation-id>
<history>
<date date-type="received">
<day>19</day>
<month>09</month>
<year>2025</year>
</date>
<date date-type="accepted">
<day>24</day>
<month>12</month>
<year>2025</year>
</date>
</history>
<permissions>
<copyright-statement>&#x00A9; 2026 The Authors.</copyright-statement>
<copyright-year>2026</copyright-year>
<copyright-holder>Published by Tech Science Press.</copyright-holder>
<license xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>This work is licensed under a <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</ext-link>, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<self-uri content-type="pdf" xlink:href="TSP_CMC_73500.pdf"></self-uri>
<abstract>
<p>Traditional anomaly detection methods often assume that data points are independent or exhibit regularly structured relationships, as in Euclidean data such as time series or image grids. However, real-world data frequently involve irregular, interconnected structures, requiring a shift toward non-Euclidean approaches. This study introduces a novel anomaly detection framework designed to handle non-Euclidean data by modeling transactions as graph signals. By leveraging graph convolution filters, we extract meaningful connection strengths that capture relational dependencies often overlooked in traditional methods. Utilizing the Graph Convolutional Networks (GCN) framework, we integrate graph-based embeddings with conventional anomaly detection models, enhancing performance through relational insights. Our method is validated on European credit card transaction data, demonstrating its effectiveness in detecting fraudulent transactions, particularly those with subtle patterns that evade traditional, amount-based detection techniques. The results highlight the advantages of incorporating temporal and structural dependencies into fraud detection, showcasing the robustness and applicability of our approach in complex, real-world scenarios.</p>
</abstract>
<kwd-group kwd-group-type="author">
<kwd>Anomaly detection</kwd>
<kwd>credit card transactions</kwd>
<kwd>fraud detection</kwd>
<kwd>graph convolutional networks</kwd>
<kwd>non-Euclidean data</kwd>
</kwd-group>
<funding-group>
<award-group id="awg1">
<funding-source>National Research Foundation of Korea</funding-source>
<award-id>RS-2023-00249743</award-id>
</award-group>
<award-group id="awg2">
<funding-source>Ministry of Education</funding-source>
<award-id>RS-2024-00443714</award-id>
</award-group>
</funding-group>
</article-meta>
</front>
<body>
<sec id="s1">
<label>1</label>
<title>Introduction</title>
<p>Credit card transactions are on the rise, driven by the convenience of digital payment methods and the rapid growth of e-commerce. However, with the increase in credit card transactions, fraudulent activities have also become more frequent. According to the Nilson Report (January 2025), global card fraud losses reached $33.83 billion in 2023, and cumulative losses are projected to reach $403.88 billion over the next decade [<xref ref-type="bibr" rid="ref-1">1</xref>]. Fraudulent transactions cause significant economic losses to financial institutions and consumers, highlighting the critical need for reliable detection methods.</p>
<p>Traditional methods for detecting credit card fraud typically focus on large transactions at specific merchants during certain times. For instance, instead of looking for fraud on a per-customer basis, these methods concentrate on big purchases at major retailers or transactions involving large sums of money [<xref ref-type="bibr" rid="ref-2">2</xref>]. In other words, this approach treats each transaction separately, assessing each one independently for potential fraudulent activity.</p>
<p>In contrast, this study explores the analysis of fraudulent transactions under the assumption that transactions are not independent. By considering the dependencies among transactions, this study aims to identify fraudulent activities through their connections. This approach recognizes that fraudulent transactions often exhibit patterns and relationships that can be more effectively detected when analyzed together. This concept is illustrated in <xref ref-type="fig" rid="fig-1">Fig. 1</xref>.</p>
<fig id="fig-1">
<label>Figure 1</label>
<caption>
<title>Transaction occurrences over time for customer Steven Johnson. Data source: Kaggle simulated credit card transaction dataset (see <xref ref-type="sec" rid="s3">Section 3</xref> for details). The <italic>x</italic>-axis represents transaction time and the <italic>y</italic>-axis represents transaction amount. Red dots indicate legitimate transactions, and blue dots indicate fraudulent transactions. The lower panel zooms into October 7&#x2013;20 to highlight the temporal clustering of fraudulent transactions</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_73500-fig-1.tif"/>
</fig>
<p><xref ref-type="fig" rid="fig-1">Fig. 1</xref> shows how transactions happen over time and the amounts involved in each transaction. The term &#x2018;amount&#x2019; refers to how much money was spent in a transaction. In this figure, 12 fraudulent transactions are linked to an individual named Steven, showing that these transactions happened close to each other in time. It is clear that the amounts in these fraudulent transactions are generally higher than those in legitimate ones. When we look closely at the periods when these fraudulent transactions happened (as seen in the zoomed-in section of <xref ref-type="fig" rid="fig-1">Fig. 1</xref>), we can see that these fraudulent transactions occurred one after another. The goal of this analysis is to use the timing of transactions to identify fraud by looking at how these fraudulent activities are connected over time.</p>
<p>As discussed earlier, many existing methods for detecting fraud rely heavily on the amounts of the transactions. They often identify transactions as suspicious if they involve unusually large amounts of money. Simply put, if a person who usually spends about $30 suddenly makes a $1000 transaction, it is likely to be considered fraudulent. This reliance on transaction amounts is evident when examining the distribution of transaction values in the data.</p>
<p>This approach may seem efficient, but it is not foolproof. For example, in <xref ref-type="fig" rid="fig-1">Fig. 1</xref>, the sixth fraudulent transaction involves a very small amount, making it difficult to flag as fraudulent. How can we identify such a transaction as fraudulent? One might think we should determine its fraudulent nature using other explanatory variables (excluding the amount), but this is often impractical with real data: the sixth fraudulent transaction is very close in time to the fifth and seventh fraudulent transactions, so other variables (such as store information or customer characteristics like age and gender) cannot show significant differences between them.</p>
<p>In fact, we can intuitively infer that the sixth transaction is fraudulent, even though it has a small amount, because all the surrounding transactions are also fraudulent. For the sixth transaction to be legitimate, it would require an unlikely scenario where a user loses their credit card, quickly finds it to make a legitimate transaction, and then loses it again shortly after. This is not a realistic situation. It makes more sense to assume that transactions occurring close together in time are either all fraudulent or all legitimate. Therefore, analyzing the data as a time series is a more effective approach.</p>
<p>We could interpret and analyze the given data as a time series. However, applying typical time series analysis methods is not easy. Many traditional statistical methods for time series, such as the autoregressive integrated moving average (ARIMA) and autoregressive with exogenous variables (ARX) models, as well as techniques using recurrent networks, such as recurrent neural networks (RNN) and long short-term memory networks (LSTM) [<xref ref-type="bibr" rid="ref-3">3</xref>], all assume that the observations are made at equally spaced intervals. In other words, they assume that the data points are uniformly distributed over time. However, the transaction times in our data are not equally spaced.</p>
<p>For example, in <xref ref-type="fig" rid="fig-1">Fig. 1</xref>, let&#x2019;s look at the time intervals between transactions. The time gap between the first fraudulent transaction and the immediately preceding legitimate transaction is longer than the gap between the first and the second fraudulent transactions. This means the first fraudulent transaction happened closer in time to the second fraudulent transaction than to the previous legitimate transaction. Therefore, when predicting the value of the first fraudulent transaction, it makes more sense to consider the next transaction rather than the previous one. This shows that the data points are not spaced uniformly over time, violating the equally-spaced observation assumption underlying most temporal models.</p>
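<p>As a concrete illustration of this point (the timestamps below are hypothetical, not drawn from the dataset), the gaps between consecutive transactions can differ by orders of magnitude, which is exactly what breaks a fixed-lag model:</p>

```python
# Hypothetical transaction times (in hours) for one card; not from the dataset.
times = [0.0, 5.0, 130.0, 130.2, 130.5, 131.0, 200.0]

# Inter-arrival gaps between consecutive transactions.
gaps = [t2 - t1 for t1, t2 in zip(times, times[1:])]
print(gaps)  # gaps range from about 0.2 h to 125 h: far from equally spaced

# An index-based lag model (e.g., AR(1)) treats every consecutive pair of
# observations as one "time step", so it would weight the 0.2 h gap and the
# 125 h gap identically.
uniform = len(set(gaps)) == 1
print(uniform)  # False: the equally-spaced assumption fails
```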
<p>To represent these irregular connections between observations, we reframe the indices of the given data as a graph <inline-formula id="ieqn-1"><mml:math id="mml-ieqn-1"><mml:mrow><mml:mi>&#x1D4A2;</mml:mi></mml:mrow><mml:mo>=</mml:mo><mml:mo stretchy="false">(</mml:mo><mml:mi>V</mml:mi><mml:mo>,</mml:mo><mml:mi>E</mml:mi><mml:mo>,</mml:mo><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">W</mml:mtext></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:math></inline-formula>. In this graph, <italic>V</italic> is the set of nodes, and <italic>E</italic> is the set of edges connecting these nodes. <inline-formula id="ieqn-2"><mml:math id="mml-ieqn-2"><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">W</mml:mtext></mml:mrow></mml:mrow></mml:math></inline-formula> is an <inline-formula id="ieqn-3"><mml:math id="mml-ieqn-3"><mml:mi>n</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>n</mml:mi></mml:math></inline-formula> matrix, where <inline-formula id="ieqn-4"><mml:math id="mml-ieqn-4"><mml:mi>n</mml:mi></mml:math></inline-formula> is the number of nodes in <italic>V</italic>. Each entry <inline-formula id="ieqn-5"><mml:math id="mml-ieqn-5"><mml:msub><mml:mi>W</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> in <inline-formula id="ieqn-6"><mml:math id="mml-ieqn-6"><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">W</mml:mtext></mml:mrow></mml:mrow></mml:math></inline-formula> represents the weight, indicating the strength or importance of the connection between node <inline-formula id="ieqn-7"><mml:math id="mml-ieqn-7"><mml:mi>i</mml:mi></mml:math></inline-formula> and node <inline-formula id="ieqn-8"><mml:math id="mml-ieqn-8"><mml:mi>j</mml:mi></mml:math></inline-formula>.</p>
<p>In our dataset, <italic>V</italic> represents the indices of the observations. Edges (<italic>E</italic>) exist between transactions made with the same credit card, meaning transactions from different credit cards are not connected by edges. Additionally, the weight (<inline-formula id="ieqn-9"><mml:math id="mml-ieqn-9"><mml:msub><mml:mi>W</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>) of each edge in <inline-formula id="ieqn-10"><mml:math id="mml-ieqn-10"><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">W</mml:mtext></mml:mrow></mml:mrow></mml:math></inline-formula> is higher when the transactions occur closer in time, showing stronger connections for transactions that happen close together.</p>
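<p>A minimal sketch of this weight construction, assuming a Gaussian kernel on time differences (the bandwidth and the timestamps below are illustrative choices, not values from our experiments):</p>

```python
import numpy as np

# Hypothetical transaction timestamps (hours) for a single card; transactions
# from other cards would simply receive zero weight (no edge).
t = np.array([0.0, 50.0, 50.5, 51.0, 120.0])
theta = 1.0  # kernel bandwidth (hours); an assumed value for illustration

# W[i, j] = exp(-(t_i - t_j)^2 / (2 * theta^2)): close in time -> weight near 1.
dt = t[:, None] - t[None, :]
W = np.exp(-dt**2 / (2 * theta**2))
np.fill_diagonal(W, 0.0)  # no self-loops in the weight matrix

# Transactions at 50.0, 50.5, and 51.0 h form a tightly connected cluster,
# while the isolated transactions are effectively disconnected from them.
print(W[1, 2])  # roughly 0.88 (0.5 h apart)
print(W[0, 1])  # roughly 0 (50 h apart)
```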
<p>Building on this structure, this study posits that utilizing temporal dependency, which reveals interrelations based on transaction timing, will be highly effective for analyzing fraudulent transactions, even if the transaction amounts differ from the average.</p>
<p>Inspired by the characteristics of credit card transaction data, this study proposes a novel integrated framework that models data with irregular time intervals as a graph structure and extracts embeddings using graph convolution operations. Specifically, we encode temporal proximity between transactions as edge weights in a graph, aggregate information from neighboring transactions through graph convolution to generate embeddings for each transaction, and use these embeddings as input features for conventional classification models. The main contributions of this study are as follows. First, we propose an efficient non-Euclidean embedding method that can effectively represent transaction data with irregular time intervals (<xref ref-type="sec" rid="s4">Section 4</xref>). Second, we demonstrate that the proposed method achieves stable performance improvements across various experimental settings, thereby establishing its robustness (<xref ref-type="sec" rid="s5">Section 5</xref>). Third, we statistically analyze and validate the effectiveness of the proposed embedding on credit card fraud detection performance (<xref ref-type="sec" rid="s6">Section 6</xref>).</p>
</sec>
<sec id="s2">
<label>2</label>
<title>Related Works</title>
<p>This section reviews existing research on credit card fraud transaction detection. Related work can be categorized from four main perspectives: (1) data imbalance problem with tabular models, (2) research considering non-independence among transactions (customer-merchant relationships), (3) research leveraging temporal dependencies, and (4) other techniques.</p>
<p>Credit card fraud involves highly imbalanced data. Various machine learning techniques have been studied to handle this imbalanced data [<xref ref-type="bibr" rid="ref-4">4</xref>&#x2013;<xref ref-type="bibr" rid="ref-6">6</xref>]. There are numerous studies on addressing data imbalance, ranging from simple oversampling and undersampling methods to advanced online fraud detection systems that have demonstrated efficiency in dealing with large-scale imbalanced data, such as the research by Wei et al. [<xref ref-type="bibr" rid="ref-7">7</xref>]. Recent studies have specifically focused on addressing the extreme class imbalance in fraud detection. Tayebi and El Kafhali [<xref ref-type="bibr" rid="ref-8">8</xref>,<xref ref-type="bibr" rid="ref-9">9</xref>] proposed deep learning approaches including autoencoders and generative models to effectively handle imbalanced fraud datasets.</p>
<p>Meanwhile, research considering non-independence among transactions has been conducted. A common approach to model relationships between customers and merchants in financial networks is to use bipartite or tripartite graphs [<xref ref-type="bibr" rid="ref-10">10</xref>&#x2013;<xref ref-type="bibr" rid="ref-12">12</xref>]. In the bipartite formulation, nodes represent cardholders and merchants, and edges represent transactional relationships between them. A critical characteristic of this approach is transaction aggregation: multiple transactions between the same cardholder-merchant pair are combined into a single edge, with edge attributes (such as total amount) aggregated accordingly. The fraud label is typically assigned as positive if any constituent transaction was fraudulent. Graph embedding techniques such as Node2Vec [<xref ref-type="bibr" rid="ref-13">13</xref>] are then applied to learn node representations, which are subsequently used for edge classification [<xref ref-type="bibr" rid="ref-14">14</xref>]. The tripartite extension introduces transaction nodes as intermediate entities, partially preserving transaction-level information while maintaining the relational structure [<xref ref-type="bibr" rid="ref-15">15</xref>]. More recent work has explored heterogeneous graph representations incorporating multiple node types. Wang et al. [<xref ref-type="bibr" rid="ref-16">16</xref>] proposed a heterogeneous graph auto-encoder that captures relationships between cardholders, merchants, and transactions. While these graph-based approaches consider non-Euclidean data structures similar to our method, they focus on structural connectivity rather than temporal proximity between transactions.</p>
<p>There are also studies that leverage temporal dependencies in transactions. Sequence-based approaches treat each customer&#x2019;s transaction history as a time series and learn sequential patterns for fraud prediction. LSTM-based approaches include Benchaji et al. [<xref ref-type="bibr" rid="ref-17">17</xref>], who proposed an LSTM-based fraud detection model, and Alarfaj et al. [<xref ref-type="bibr" rid="ref-18">18</xref>], who combined attention mechanisms with LSTM for enhanced detection. For Transformer architectures, Yu et al. [<xref ref-type="bibr" rid="ref-19">19</xref>] applied an advanced Transformer model to credit card fraud detection, demonstrating superior performance over traditional machine learning techniques.</p>
<p>Research combining temporal dependencies with graph structures has also been conducted. Studies on fraud detection techniques based on Graph Convolutional Networks (GCN) [<xref ref-type="bibr" rid="ref-20">20</xref>] are actively progressing. Dynamic graph neural networks, such as DySAT [<xref ref-type="bibr" rid="ref-21">21</xref>] and ROLAND [<xref ref-type="bibr" rid="ref-22">22</xref>], extend static Graph Neural Networks (GNNs) by allowing graph structure and node embeddings to evolve over time. Cheng et al. [<xref ref-type="bibr" rid="ref-23">23</xref>] developed CaT-GNN, integrating causal inference with temporal graph modeling. While these approaches effectively capture dynamic patterns, they typically require discrete time snapshots and introduce additional computational complexity.</p>
<p>Since the characteristics of credit card fraud data are not identical across datasets, various techniques have been developed to fit the specific properties of each dataset. Wheeler applied case-based reasoning in the credit approval process [<xref ref-type="bibr" rid="ref-24">24</xref>], Srivastava used Hidden Markov Models to learn normal cardholder behaviors [<xref ref-type="bibr" rid="ref-25">25</xref>], and Sanchez utilized association rules to extract normal behavior patterns [<xref ref-type="bibr" rid="ref-26">26</xref>]. Liu et al. [<xref ref-type="bibr" rid="ref-27">27</xref>] addressed the over-smoothing problem in deep GNNs through high-order graph representation learning.</p>
<p>Our proposed method differs from existing approaches in several key aspects. First, unlike traditional tabular methods, our approach explicitly considers connectivity between observations through temporal proximity and customer information. Second, while bipartite/tripartite graph approaches only model customer-merchant connectivity without considering temporal relationships, our method simultaneously incorporates both temporal proximity and customer information. Third, time series methods such as LSTM and Transformer assume equally-spaced transactions, whereas our method naturally handles irregular time intervals. Fourth, existing methods combining temporal dependencies with graphs use graph-based models as the final classifier, considering connectivity across all transactions. In contrast, our method extracts GCN embeddings as features and feeds them into a tabular classifier. Since the exponential decay function weakens connection strengths for temporally distant transactions, the non-Euclidean structure is effectively utilized only for temporally proximate transactions&#x2014;typically cases where fraud occurs consecutively. This design improves computational efficiency and facilitates extension to additional variables.</p>
</sec>
<sec id="s3">
<label>3</label>
<title>Data Description</title>
<p>For the analysis of fraudulent transactions, we faced challenges in accessing real data from financial institutions like banks due to privacy concerns, as credit card transaction data is pseudonymized to protect customers&#x2019; personal information. Consequently, we used a publicly available dataset from Kaggle<xref ref-type="fn" rid="fn-1"><sup>1</sup></xref><fn id="fn-1">
<label>1</label>
<p><ext-link ext-link-type="uri" xlink:href="https://www.kaggle.com/datasets/dermisfit/fraud-transactions-dataset">https://www.kaggle.com/datasets/dermisfit/fraud-transactions-dataset</ext-link></p>
</fn> for our analysis. To apply our graph-based analysis method described in <xref ref-type="sec" rid="s4">Section 4</xref>, we require three essential components: (i) a temporal variable to identify connectivity between transactions, (ii) customer identifiers to construct individual transaction graphs, and (iii) node features for the GCN model. We selected this dataset because it provides all three components: transaction timestamps (<monospace>trans_date_and_time</monospace>), credit card numbers (<monospace>cc_num</monospace>) as customer identifiers, and transaction amounts (<monospace>amt</monospace>) as node features.</p>
<p>The dataset comprises 1,048,575 transactions with 22 variables, including 6006 fraudulent cases (0.573%). Among 943 cardholders, 596 experienced at least one fraud. For our graph-based analysis, we use transaction timestamps (<monospace>trans_date_and_time</monospace>) to compute temporal connectivity and transaction amounts (<monospace>amt</monospace>) as node features. Detailed data descriptions and exploratory data analysis are available on the Kaggle dataset page.</p>
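<p>The reported imbalance follows directly from the raw counts, and the per-card grouping used to build individual transaction graphs can be sketched as follows (the <monospace>is_fraud</monospace> label column and the tiny synthetic table are assumptions for illustration, not the Kaggle data itself):</p>

```python
import pandas as pd

# Reported counts: 6006 fraudulent cases among 1,048,575 transactions.
fraud_rate = 6006 / 1048575 * 100
print(round(fraud_rate, 3))  # 0.573 (%)

# Synthetic stand-in mimicking the schema (the 'is_fraud' column name is an
# assumed label; this is not the actual Kaggle data).
df = pd.DataFrame({
    "cc_num": [111, 111, 111, 222, 222],
    "trans_date_and_time": pd.to_datetime([
        "2020-01-15 09:00", "2020-01-15 09:20", "2020-03-01 12:00",
        "2019-06-01 10:00", "2019-06-02 11:00",
    ]),
    "amt": [120.0, 95.0, 30.0, 12.5, 900.0],
    "is_fraud": [1, 1, 0, 0, 1],
})

# One graph per cardholder: group by card number, ordered by timestamp.
for cc, g in df.sort_values("trans_date_and_time").groupby("cc_num"):
    print(cc, len(g), int(g["is_fraud"].sum()))
```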
<p><xref ref-type="fig" rid="fig-2">Fig. 2</xref> illustrates transaction graphs for customer Katherine Tucker. Each transaction is represented as a node, where node size corresponds to transaction amount (<inline-formula id="ieqn-11"><mml:math id="mml-ieqn-11"><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="monospace">amt</mml:mtext></mml:mrow></mml:mrow></mml:msub></mml:math></inline-formula>), color indicates fraud status (<inline-formula id="ieqn-12"><mml:math id="mml-ieqn-12"><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">y</mml:mtext></mml:mrow></mml:mrow></mml:math></inline-formula>: blue for fraud, red for legitimate), and edge thickness represents temporal proximity (<inline-formula id="ieqn-13"><mml:math id="mml-ieqn-13"><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">W</mml:mtext></mml:mrow></mml:mrow></mml:math></inline-formula>). The weight <inline-formula id="ieqn-14"><mml:math id="mml-ieqn-14"><mml:msub><mml:mi>W</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> between transactions <inline-formula id="ieqn-15"><mml:math id="mml-ieqn-15"><mml:mi>i</mml:mi></mml:math></inline-formula> and <inline-formula id="ieqn-16"><mml:math id="mml-ieqn-16"><mml:mi>j</mml:mi></mml:math></inline-formula> is computed using a Gaussian kernel based on time difference, where values close to 1 indicate temporally adjacent transactions.</p>
<fig id="fig-2">
<label>Figure 2</label>
<caption>
<title>Transaction graphs for customer Katherine Tucker. For visualization clarity, only 10 representative transactions are shown from her total of 1250 transactions: 2 fraudulent transactions (blue) occurring on 15 January 2020, and 8 legitimate transactions (red) spanning from 16 March 2019 to 30 December 2019. Node size corresponds to transaction amount, and edge thickness represents temporal proximity. The left panel shows the full connectivity graph; the right panel shows the graph with only temporally close edges retained</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_73500-fig-2.tif"/>
</fig>
<p>The left panel shows a fully connected graph where all transactions are linked. Fraudulent transactions tend to cluster together temporally, forming tightly connected subgraphs. The right panel retains only edges between temporally close transactions, providing a sparser structure that highlights the temporal clustering of fraud.</p>
</sec>
<sec id="s4">
<label>4</label>
<title>Proposed Method</title>
<sec id="s4_1">
<label>4.1</label>
<title>General Methodology</title>
<p>Let&#x2019;s say the given data is <inline-formula id="ieqn-17"><mml:math id="mml-ieqn-17"><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow></mml:mrow></mml:math></inline-formula> and <inline-formula id="ieqn-18"><mml:math id="mml-ieqn-18"><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">y</mml:mtext></mml:mrow></mml:mrow></mml:math></inline-formula>, where <inline-formula id="ieqn-19"><mml:math id="mml-ieqn-19"><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow></mml:mrow></mml:math></inline-formula> is a <inline-formula id="ieqn-20"><mml:math id="mml-ieqn-20"><mml:mi>n</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>p</mml:mi></mml:math></inline-formula> matrix and <inline-formula id="ieqn-21"><mml:math id="mml-ieqn-21"><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">y</mml:mtext></mml:mrow></mml:mrow></mml:math></inline-formula> is a vector of length <inline-formula id="ieqn-22"><mml:math id="mml-ieqn-22"><mml:mi>n</mml:mi></mml:math></inline-formula>. Here, <inline-formula id="ieqn-23"><mml:math id="mml-ieqn-23"><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">y</mml:mtext></mml:mrow></mml:mrow></mml:math></inline-formula> contains labels, while <inline-formula id="ieqn-24"><mml:math id="mml-ieqn-24"><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow></mml:mrow></mml:math></inline-formula> is the design matrix necessary for predicting <inline-formula id="ieqn-25"><mml:math id="mml-ieqn-25"><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">y</mml:mtext></mml:mrow></mml:mrow></mml:math></inline-formula>. Some columns of <inline-formula id="ieqn-26"><mml:math id="mml-ieqn-26"><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow></mml:mrow></mml:math></inline-formula> can define relationships between different observations. 
Let&#x2019;s denote one of these variables as <inline-formula id="ieqn-27"><mml:math id="mml-ieqn-27"><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow></mml:mrow><mml:mi>j</mml:mi></mml:msub></mml:math></inline-formula>. Here, <inline-formula id="ieqn-28"><mml:math id="mml-ieqn-28"><mml:mi>j</mml:mi></mml:math></inline-formula> is the index of the variable selected to define relationships between observations. Let <inline-formula id="ieqn-29"><mml:math id="mml-ieqn-29"><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">W</mml:mtext></mml:mrow></mml:mrow><mml:mi>j</mml:mi></mml:msub></mml:math></inline-formula> represent the relationships measured between observations from <inline-formula id="ieqn-30"><mml:math id="mml-ieqn-30"><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow></mml:mrow><mml:mi>j</mml:mi></mml:msub></mml:math></inline-formula> by an appropriate method. In this context, <inline-formula id="ieqn-31"><mml:math id="mml-ieqn-31"><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">W</mml:mtext></mml:mrow></mml:mrow><mml:mi>j</mml:mi></mml:msub></mml:math></inline-formula> is an <inline-formula id="ieqn-32"><mml:math id="mml-ieqn-32"><mml:mi>n</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>n</mml:mi></mml:math></inline-formula> matrix.</p>
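<p>Before the learning step is formalized, a single normalized graph-convolution operation in the style of Kipf and Welling [<xref ref-type="bibr" rid="ref-20">20</xref>] can be sketched in NumPy (the toy adjacency matrix, random features, and ReLU activation are illustrative stand-ins, not learned values):</p>

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy weighted adjacency for 4 transactions of one card (symmetric, with
# self-loops included, as in standard GCN practice); values are illustrative.
A = np.array([
    [1.0, 0.9, 0.1, 0.0],
    [0.9, 1.0, 0.8, 0.0],
    [0.1, 0.8, 1.0, 0.2],
    [0.0, 0.0, 0.2, 1.0],
])
H = rng.normal(size=(4, 1))      # H^(0): one node feature per transaction
Theta = rng.normal(size=(1, 3))  # learnable parameters (random stand-in)

# Symmetric normalization D^(-1/2) A D^(-1/2), then linear map and ReLU.
d = A.sum(axis=1)                # node degrees
D_inv_sqrt = np.diag(d ** -0.5)
H_next = np.maximum(0.0, D_inv_sqrt @ A @ D_inv_sqrt @ H @ Theta)
print(H_next.shape)  # (4, 3): one 3-dimensional embedding per transaction
```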
<p>Our goal is to predict <bold>y</bold> by considering both <bold>X</bold> and <inline-formula id="ieqn-33"><mml:math id="mml-ieqn-33"><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">W</mml:mtext></mml:mrow></mml:mrow><mml:mn>1</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">W</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mi>J</mml:mi><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow></mml:mrow></mml:msub></mml:math></inline-formula>, where <italic>J</italic> is the index set of <inline-formula id="ieqn-34"><mml:math id="mml-ieqn-34"><mml:mi>j</mml:mi></mml:math></inline-formula> and |<italic>J</italic>| is its cardinality. Since the dimension of <inline-formula id="ieqn-35"><mml:math id="mml-ieqn-35"><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">W</mml:mtext></mml:mrow></mml:mrow><mml:mi>j</mml:mi></mml:msub></mml:math></inline-formula> increases with <inline-formula id="ieqn-36"><mml:math id="mml-ieqn-36"><mml:mi>n</mml:mi></mml:math></inline-formula>, it is necessary to reduce the size of <inline-formula id="ieqn-37"><mml:math id="mml-ieqn-37"><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">W</mml:mtext></mml:mrow></mml:mrow><mml:mi>j</mml:mi></mml:msub></mml:math></inline-formula> appropriately. 
For this purpose, we use the hidden layer <inline-formula id="ieqn-38"><mml:math id="mml-ieqn-38"><mml:msubsup><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">H</mml:mtext></mml:mrow></mml:mrow><mml:mi>j</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>L</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msubsup></mml:math></inline-formula> learned from the graph convolution filter: <inline-formula id="ieqn-39"><mml:math id="mml-ieqn-39"><mml:msubsup><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">H</mml:mtext></mml:mrow></mml:mrow><mml:mi>j</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>l</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mi>&#x03C3;</mml:mi><mml:mstyle scriptlevel="0"><mml:mrow><mml:mo maxsize="1.2em" minsize="1.2em">(</mml:mo></mml:mrow></mml:mstyle><mml:msubsup><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">D</mml:mtext></mml:mrow></mml:mrow><mml:mi>j</mml:mi><mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">A</mml:mtext></mml:mrow></mml:mrow><mml:mi>j</mml:mi></mml:msub><mml:msubsup><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">D</mml:mtext></mml:mrow></mml:mrow><mml:mi>j</mml:mi><mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:msubsup><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">H</mml:mtext></mml:mrow></mml:mrow><mml:mi>j</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>l</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msubsup><mml:msubsup><mml:mrow><mml:mi mathvariant="bold">&#x0398;</mml:mi></mml:mrow><mml:mi>j</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>l</mml:mi><mml:mo 
stretchy="false">)</mml:mo></mml:mrow></mml:msubsup><mml:mstyle scriptlevel="0"><mml:mrow><mml:mo maxsize="1.2em" minsize="1.2em">)</mml:mo></mml:mrow></mml:mstyle></mml:math></inline-formula> for <inline-formula id="ieqn-40"><mml:math id="mml-ieqn-40"><mml:mi>l</mml:mi><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mn>2</mml:mn><mml:mo>,</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mi>L</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn></mml:math></inline-formula>. Here, <inline-formula id="ieqn-41"><mml:math id="mml-ieqn-41"><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">A</mml:mtext></mml:mrow></mml:mrow><mml:mi>j</mml:mi></mml:msub></mml:math></inline-formula> is the adjacency matrix corresponding to <inline-formula id="ieqn-42"><mml:math id="mml-ieqn-42"><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">W</mml:mtext></mml:mrow></mml:mrow><mml:mi>j</mml:mi></mml:msub></mml:math></inline-formula> and <inline-formula id="ieqn-43"><mml:math id="mml-ieqn-43"><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">D</mml:mtext></mml:mrow></mml:mrow><mml:mi>j</mml:mi></mml:msub></mml:math></inline-formula> is the degree matrix of <inline-formula id="ieqn-44"><mml:math id="mml-ieqn-44"><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">A</mml:mtext></mml:mrow></mml:mrow><mml:mi>j</mml:mi></mml:msub></mml:math></inline-formula>, <inline-formula id="ieqn-45"><mml:math id="mml-ieqn-45"><mml:msubsup><mml:mrow><mml:mi mathvariant="bold">&#x0398;</mml:mi></mml:mrow><mml:mi>j</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>l</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msubsup></mml:math></inline-formula> represents the learnable parameters, and <inline-formula id="ieqn-46"><mml:math id="mml-ieqn-46"><mml:mi>&#x03C3;</mml:mi></mml:math></inline-formula> denotes the activation function. 
To learn <inline-formula id="ieqn-47"><mml:math id="mml-ieqn-47"><mml:msubsup><mml:mrow><mml:mi mathvariant="bold">&#x0398;</mml:mi></mml:mrow><mml:mi>j</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>l</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msubsup></mml:math></inline-formula>, we use a loss function similar to the one considered in [<xref ref-type="bibr" rid="ref-20">20</xref>]: <inline-formula id="ieqn-48"><mml:math id="mml-ieqn-48"><mml:msub><mml:mrow><mml:mi>&#x02112;</mml:mi></mml:mrow><mml:mi>j</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mi>&#x02112;</mml:mi></mml:mrow><mml:mn>0</mml:mn></mml:msup><mml:mo>+</mml:mo><mml:mi>&#x03BB;</mml:mi><mml:msubsup><mml:mrow><mml:mi>&#x02112;</mml:mi></mml:mrow><mml:mi>j</mml:mi><mml:mrow><mml:mtext>reg</mml:mtext></mml:mrow></mml:msubsup></mml:math></inline-formula>, where <inline-formula id="ieqn-49"><mml:math id="mml-ieqn-49"><mml:msubsup><mml:mrow><mml:mi>&#x02112;</mml:mi></mml:mrow><mml:mi>j</mml:mi><mml:mrow><mml:mtext>reg</mml:mtext></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:msubsup><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">o</mml:mtext></mml:mrow></mml:mrow><mml:mi>j</mml:mi><mml:mi mathvariant="normal">&#x22A4;</mml:mi></mml:msubsup><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">D</mml:mtext></mml:mrow></mml:mrow><mml:mi>j</mml:mi></mml:msub><mml:mo>&#x2212;</mml:mo><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">A</mml:mtext></mml:mrow></mml:mrow><mml:mi>j</mml:mi></mml:msub><mml:mo stretchy="false">)</mml:mo><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">o</mml:mtext></mml:mrow></mml:mrow><mml:mi>j</mml:mi></mml:msub></mml:math></inline-formula>. 
<inline-formula id="ieqn-50"><mml:math id="mml-ieqn-50"><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">o</mml:mtext></mml:mrow></mml:mrow><mml:mi>j</mml:mi></mml:msub></mml:math></inline-formula> represents the output obtained by linearly transforming <inline-formula id="ieqn-51"><mml:math id="mml-ieqn-51"><mml:msubsup><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">H</mml:mtext></mml:mrow></mml:mrow><mml:mi>j</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>L</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msubsup></mml:math></inline-formula>. In a typical GCN, the last layer is a graph convolution layer. However, since our goal is to use the hidden layer as a new feature, the final transformation is a standard linear transform rather than a graph convolution unit, so that the connection information among observations is fully embedded in the penultimate hidden layer. <inline-formula id="ieqn-52"><mml:math id="mml-ieqn-52"><mml:msup><mml:mrow><mml:mi>&#x02112;</mml:mi></mml:mrow><mml:mn>0</mml:mn></mml:msup></mml:math></inline-formula> represents the supervised loss on the labeled part of <inline-formula id="ieqn-53"><mml:math id="mml-ieqn-53"><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">y</mml:mtext></mml:mrow></mml:mrow></mml:math></inline-formula>, and <inline-formula id="ieqn-54"><mml:math id="mml-ieqn-54"><mml:msup><mml:mrow><mml:mi>&#x02112;</mml:mi></mml:mrow><mml:mrow><mml:mtext>reg</mml:mtext></mml:mrow></mml:msup></mml:math></inline-formula> imposes a smoothness constraint that pushes predictions to be more similar the more strongly two observations are related. 
<inline-formula id="ieqn-55"><mml:math id="mml-ieqn-55"><mml:msubsup><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">H</mml:mtext></mml:mrow></mml:mrow><mml:mi>j</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>l</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msubsup></mml:math></inline-formula> is the input for the <inline-formula id="ieqn-56"><mml:math id="mml-ieqn-56"><mml:mi>l</mml:mi></mml:math></inline-formula>th layer, and <inline-formula id="ieqn-57"><mml:math id="mml-ieqn-57"><mml:msubsup><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">H</mml:mtext></mml:mrow></mml:mrow><mml:mi>j</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>l</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msubsup></mml:math></inline-formula> is the output. When <inline-formula id="ieqn-58"><mml:math id="mml-ieqn-58"><mml:mi>l</mml:mi><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:math></inline-formula>, <inline-formula id="ieqn-59"><mml:math id="mml-ieqn-59"><mml:msubsup><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">H</mml:mtext></mml:mrow></mml:mrow><mml:mi>j</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mn>0</mml:mn><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msubsup></mml:math></inline-formula> is defined as <inline-formula id="ieqn-60"><mml:math id="mml-ieqn-60"><mml:msup><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mo>&#x2212;</mml:mo><mml:mi>j</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msup></mml:math></inline-formula>, where <inline-formula id="ieqn-61"><mml:math id="mml-ieqn-61"><mml:msup><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mo>&#x2212;</mml:mo><mml:mi>j</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msup></mml:math></inline-formula> represents the set of 
selected variables from the entire explanatory variable <inline-formula id="ieqn-62"><mml:math id="mml-ieqn-62"><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow></mml:mrow></mml:math></inline-formula>, explicitly excluding <inline-formula id="ieqn-63"><mml:math id="mml-ieqn-63"><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow></mml:mrow><mml:mi>j</mml:mi></mml:msub></mml:math></inline-formula>. The output of the final layer is defined as <inline-formula id="ieqn-64"><mml:math id="mml-ieqn-64"><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">y</mml:mtext></mml:mrow></mml:mrow></mml:math></inline-formula>. Ultimately, an appropriate tabular model <inline-formula id="ieqn-65"><mml:math id="mml-ieqn-65"><mml:msub><mml:mi>M</mml:mi><mml:mrow><mml:mtext>Tabular</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula> is trained by considering <inline-formula id="ieqn-66"><mml:math id="mml-ieqn-66"><mml:mrow><mml:mover><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow><mml:mo stretchy="false">&#x007E;</mml:mo></mml:mover></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow></mml:mrow><mml:mo>&#x2295;</mml:mo><mml:msubsup><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">H</mml:mtext></mml:mrow></mml:mrow><mml:mn>1</mml:mn><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>L</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msubsup><mml:mo>&#x2295;</mml:mo><mml:msubsup><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">H</mml:mtext></mml:mrow></mml:mrow><mml:mn>2</mml:mn><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>L</mml:mi><mml:mn>2</mml:mn></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msubsup><mml:mo>&#x2295;</mml:mo><mml:mo>&#x22EF;</mml:mo><mml:mo>&#x2295;</mml:mo><mml:msubsup><mml:mrow><mml:mrow><mml:mtext 
mathvariant="bold">H</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mi>J</mml:mi><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mi>J</mml:mi><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msubsup></mml:math></inline-formula> as explanatory variables to fit <inline-formula id="ieqn-67"><mml:math id="mml-ieqn-67"><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">y</mml:mtext></mml:mrow></mml:mrow></mml:math></inline-formula>. Here, <inline-formula id="ieqn-68"><mml:math id="mml-ieqn-68"><mml:mo>&#x2295;</mml:mo></mml:math></inline-formula> denotes concatenation. |<italic>J</italic>| represents the cardinality of the set <italic>J</italic>, which is the set of variables defining relationships among observations. The model <inline-formula id="ieqn-69"><mml:math id="mml-ieqn-69"><mml:msub><mml:mi>M</mml:mi><mml:mrow><mml:mtext>Tabular</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula> can be any tabular classifier, such as <monospace>XGBoost</monospace>, <monospace>LightGBM</monospace>, etc. Algorithm 1 summarizes the proposed procedure, and the overall architecture is illustrated in <xref ref-type="fig" rid="fig-3">Fig. 3</xref>.</p>
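As a concrete illustration of the procedure summarized in Algorithm 1, the following NumPy sketch runs a graph convolution filter over a synthetic weight matrix and concatenates the resulting hidden representation with the original features. The hidden widths (16 and 8) and the random, untrained parameter matrices are placeholders for values that would be learned with the loss described above.

```python
import numpy as np

rng = np.random.default_rng(0)

def gcn_forward(W, H, thetas, act=np.tanh):
    """Apply H^(l+1) = act(D^{-1/2} A D^{-1/2} H^(l) Theta^(l)) layer by layer,
    treating the weight matrix W as the (weighted) adjacency A."""
    deg = W.sum(axis=1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
    A_norm = d_inv_sqrt @ W @ d_inv_sqrt
    for theta in thetas:
        H = act(A_norm @ H @ theta)
    return H

n, p = 50, 4
X = rng.normal(size=(n, p))            # tabular features

# One graph-structured variable (|J| = 1): a synthetic symmetric W_1.
W1 = rng.random((n, n))
W1 = (W1 + W1.T) / 2
np.fill_diagonal(W1, 1.0)

# Untrained Theta^(0), Theta^(1); in the paper these are fitted with
# L_j = L^0 + lambda * L_j^reg before the hidden layer is extracted.
thetas = [0.1 * rng.normal(size=(p, 16)), 0.1 * rng.normal(size=(16, 8))]
H1 = gcn_forward(W1, X, thetas)        # H_1^(L_1): an n x 8 graph embedding

X_tilde = np.concatenate([X, H1], axis=1)  # X concatenated with H_1^(L_1)
```

Any tabular classifier, such as XGBoost or LightGBM, is then fit on `X_tilde` to predict y.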
<fig id="fig-3">
<label>Figure 3</label>
<caption>
<title>Overall architecture of the proposed framework. Multiple non-Euclidean feature extractors generate graph-based representations from the input data, which are aggregated and combined with original tabular features. The resulting features are then fed into a tabular model to produce the final output</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_73500-fig-3.tif"/>
</fig>
<fig id="fig-7">
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_73500-fig-7.tif"/>
</fig>
<p>Having established the general framework for graph-augmented feature learning, we now demonstrate its concrete application to credit card fraud detection. The following subsection specifies the graph construction, feature definitions, and model configuration tailored to our fraud detection task.</p>
</sec>
<sec id="s4_2">
<label>4.2</label>
<title>Application to Fraud Data</title>
<p>In this section, we describe a model for analyzing fraud data using the methodology proposed in the previous section. Let the given data be denoted as <inline-formula id="ieqn-86"><mml:math id="mml-ieqn-86"><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow></mml:mrow><mml:mo>,</mml:mo><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">y</mml:mtext></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:math></inline-formula>. We interpret these data as a graph. Here, we assume that the only column capturing the graph structure is <monospace>trans_date_and_time</monospace> (thus, <inline-formula id="ieqn-87"><mml:math id="mml-ieqn-87"><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mi>J</mml:mi><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:math></inline-formula>), and we refer to it as <inline-formula id="ieqn-88"><mml:math id="mml-ieqn-88"><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="monospace">time</mml:mtext></mml:mrow></mml:mrow></mml:msub></mml:math></inline-formula>. 
Therefore, the relationships in the graph can be summarized by <inline-formula id="ieqn-89"><mml:math id="mml-ieqn-89"><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">W</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="monospace">time</mml:mtext></mml:mrow></mml:mrow></mml:msub></mml:math></inline-formula>, and our task can be summarized as predicting <inline-formula id="ieqn-90"><mml:math id="mml-ieqn-90"><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">y</mml:mtext></mml:mrow></mml:mrow></mml:math></inline-formula> using <inline-formula id="ieqn-91"><mml:math id="mml-ieqn-91"><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow></mml:mrow></mml:math></inline-formula> and <inline-formula id="ieqn-92"><mml:math id="mml-ieqn-92"><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">W</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="monospace">time</mml:mtext></mml:mrow></mml:mrow></mml:msub></mml:math></inline-formula>.</p>
<p>The loss function <inline-formula id="ieqn-93"><mml:math id="mml-ieqn-93"><mml:mrow><mml:mi>&#x02112;</mml:mi></mml:mrow><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mi>&#x02112;</mml:mi></mml:mrow><mml:mn>0</mml:mn></mml:msup><mml:mo>+</mml:mo><mml:mi>&#x03BB;</mml:mi><mml:msup><mml:mrow><mml:mi>&#x02112;</mml:mi></mml:mrow><mml:mrow><mml:mtext>reg</mml:mtext></mml:mrow></mml:msup></mml:math></inline-formula> is designed as follows. Here, <inline-formula id="ieqn-94"><mml:math id="mml-ieqn-94"><mml:msup><mml:mrow><mml:mi>&#x02112;</mml:mi></mml:mrow><mml:mn>0</mml:mn></mml:msup></mml:math></inline-formula> is the Binary Cross-Entropy (BCE) loss: <inline-formula id="ieqn-95"><mml:math id="mml-ieqn-95"><mml:msup><mml:mrow><mml:mi>&#x02112;</mml:mi></mml:mrow><mml:mn>0</mml:mn></mml:msup><mml:mo>=</mml:mo><mml:mo>&#x2212;</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mi>n</mml:mi></mml:mfrac><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>n</mml:mi></mml:munderover><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mi>log</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mrow><mml:mover><mml:mi>y</mml:mi><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy="false">)</mml:mo><mml:mo>+</mml:mo><mml:mo stretchy="false">(</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x2212;</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy="false">)</mml:mo><mml:mi>log</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mo stretchy="false">(</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x2212;</mml:mo><mml:msub><mml:mrow><mml:mover><mml:mi>y</mml:mi><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy="false">)</mml:mo><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, where <inline-formula id="ieqn-96"><mml:math 
id="mml-ieqn-96"><mml:msub><mml:mrow><mml:mover><mml:mi>y</mml:mi><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow><mml:mi>i</mml:mi></mml:msub></mml:math></inline-formula> is the output of the GCN model. Additionally, <inline-formula id="ieqn-97"><mml:math id="mml-ieqn-97"><mml:msup><mml:mrow><mml:mi>&#x02112;</mml:mi></mml:mrow><mml:mrow><mml:mtext>reg</mml:mtext></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mover><mml:mrow><mml:mtext mathvariant="bold">y</mml:mtext></mml:mrow><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow><mml:mi mathvariant="normal">&#x22A4;</mml:mi></mml:msup><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">D</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="monospace">time</mml:mtext></mml:mrow></mml:mrow></mml:msub><mml:mo>&#x2212;</mml:mo><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">A</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="monospace">time</mml:mtext></mml:mrow></mml:mrow></mml:msub><mml:mo stretchy="false">)</mml:mo><mml:mrow><mml:mover><mml:mrow><mml:mtext mathvariant="bold">y</mml:mtext></mml:mrow><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow><mml:mo>=</mml:mo><mml:munder><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:munder><mml:msub><mml:mi>W</mml:mi><mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="monospace">time</mml:mtext></mml:mrow></mml:mrow><mml:mo>,</mml:mo><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mstyle scriptlevel="0"><mml:mrow><mml:mo maxsize="1.2em" minsize="1.2em">(</mml:mo></mml:mrow></mml:mstyle><mml:msub><mml:mrow><mml:mover><mml:mi>y</mml:mi><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x2212;</mml:mo><mml:msub><mml:mrow><mml:mover><mml:mi>y</mml:mi><mml:mo 
stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow><mml:mi>j</mml:mi></mml:msub><mml:msup><mml:mstyle scriptlevel="0"><mml:mrow><mml:mo maxsize="1.2em" minsize="1.2em">)</mml:mo></mml:mrow></mml:mstyle><mml:mn>2</mml:mn></mml:msup></mml:math></inline-formula>. This term represents a regularization component that enforces smoothness in the predictions. Specifically, it penalizes the model when the predicted values <inline-formula id="ieqn-98"><mml:math id="mml-ieqn-98"><mml:msub><mml:mrow><mml:mover><mml:mi>y</mml:mi><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow><mml:mi>i</mml:mi></mml:msub></mml:math></inline-formula> and <inline-formula id="ieqn-99"><mml:math id="mml-ieqn-99"><mml:msub><mml:mrow><mml:mover><mml:mi>y</mml:mi><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow><mml:mi>j</mml:mi></mml:msub></mml:math></inline-formula> for two observations that are close in time are significantly different. Here, <inline-formula id="ieqn-100"><mml:math id="mml-ieqn-100"><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">D</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="monospace">time</mml:mtext></mml:mrow></mml:mrow></mml:msub></mml:math></inline-formula> is the degree matrix, <inline-formula id="ieqn-101"><mml:math id="mml-ieqn-101"><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">A</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="monospace">time</mml:mtext></mml:mrow></mml:mrow></mml:msub></mml:math></inline-formula> is the adjacency matrix, and <inline-formula id="ieqn-102"><mml:math id="mml-ieqn-102"><mml:msub><mml:mi>W</mml:mi><mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="monospace">time</mml:mtext></mml:mrow></mml:mrow><mml:mo>,</mml:mo><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> indicates the weight or connection strength between observations <inline-formula id="ieqn-103"><mml:math 
id="mml-ieqn-103"><mml:mi>i</mml:mi></mml:math></inline-formula> and <inline-formula id="ieqn-104"><mml:math id="mml-ieqn-104"><mml:mi>j</mml:mi></mml:math></inline-formula>. The term <inline-formula id="ieqn-105"><mml:math id="mml-ieqn-105"><mml:munder><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:munder><mml:msub><mml:mi>W</mml:mi><mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="monospace">time</mml:mtext></mml:mrow></mml:mrow><mml:mo>,</mml:mo><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mstyle scriptlevel="0"><mml:mrow><mml:mo maxsize="1.2em" minsize="1.2em">(</mml:mo></mml:mrow></mml:mstyle><mml:msub><mml:mrow><mml:mover><mml:mi>y</mml:mi><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x2212;</mml:mo><mml:msub><mml:mrow><mml:mover><mml:mi>y</mml:mi><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow><mml:mi>j</mml:mi></mml:msub><mml:msup><mml:mstyle scriptlevel="0"><mml:mrow><mml:mo maxsize="1.2em" minsize="1.2em">)</mml:mo></mml:mrow></mml:mstyle><mml:mn>2</mml:mn></mml:msup></mml:math></inline-formula> ensures that the predictions for closely related observations (i.e., those with a strong connection in the graph) are similar.</p>
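The equality between the quadratic (Laplacian) form and the weighted sum of squared prediction gaps can be checked numerically. A minimal sketch with synthetic values follows; note that summing over unordered pairs matches the quadratic form exactly, whereas summing over all ordered pairs of a symmetric matrix would introduce a factor of 1/2.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6

# Symmetric weighted adjacency standing in for A_time, zero diagonal.
A = rng.random((n, n))
A = (A + A.T) / 2
np.fill_diagonal(A, 0.0)
D = np.diag(A.sum(axis=1))     # degree matrix D_time

y_hat = rng.random(n)          # GCN output probabilities

# Quadratic form used as the regularizer L^reg.
quad = y_hat @ (D - A) @ y_hat

# Weighted sum of squared prediction gaps over unordered pairs.
pairwise = sum(A[i, j] * (y_hat[i] - y_hat[j]) ** 2
               for i in range(n) for j in range(i + 1, n))

assert np.isclose(quad, pairwise)  # identical up to floating-point error
```

Strongly connected observations with differing predictions contribute large terms, so minimizing this quantity enforces the smoothness described above.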
<p>We now describe in more detail how to compute <inline-formula id="ieqn-106"><mml:math id="mml-ieqn-106"><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">W</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="monospace">time</mml:mtext></mml:mrow></mml:mrow></mml:msub></mml:math></inline-formula>. To create a weight matrix corresponding to <inline-formula id="ieqn-107"><mml:math id="mml-ieqn-107"><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="monospace">time</mml:mtext></mml:mrow></mml:mrow></mml:msub></mml:math></inline-formula>, we also take <monospace>cc_num</monospace> into account: even temporally close transactions should not be treated as connected when they are made under different customer numbers. Therefore, <inline-formula id="ieqn-108"><mml:math id="mml-ieqn-108"><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">W</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="monospace">time</mml:mtext></mml:mrow></mml:mrow></mml:msub></mml:math></inline-formula> will have a block-matrix structure: <inline-formula id="ieqn-109"><mml:math id="mml-ieqn-109"><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">W</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="monospace">time</mml:mtext></mml:mrow></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mtext>diag</mml:mtext><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">W</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="monospace">time</mml:mtext></mml:mrow></mml:mrow><mml:mo>,</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">W</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mrow><mml:mtext 
mathvariant="monospace">time</mml:mtext></mml:mrow></mml:mrow><mml:mo>,</mml:mo><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">W</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="monospace">time</mml:mtext></mml:mrow></mml:mrow><mml:mo>,</mml:mo><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mi>I</mml:mi><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow></mml:mrow></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:math></inline-formula>. Here, <italic>I</italic> is the set of <monospace>cc_num</monospace>. For a fixed <inline-formula id="ieqn-110"><mml:math id="mml-ieqn-110"><mml:mi>i</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mi>I</mml:mi></mml:math></inline-formula>, the <inline-formula id="ieqn-111"><mml:math id="mml-ieqn-111"><mml:mo stretchy="false">(</mml:mo><mml:mi>t</mml:mi><mml:mo>,</mml:mo><mml:mi>s</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:math></inline-formula>-th elements of <inline-formula id="ieqn-112"><mml:math id="mml-ieqn-112"><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">W</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="monospace">time</mml:mtext></mml:mrow></mml:mrow><mml:mo>,</mml:mo><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> are defined as follows, as used in [<xref ref-type="bibr" rid="ref-28">28</xref>]: <inline-formula id="ieqn-113"><mml:math id="mml-ieqn-113"><mml:msub><mml:mi>W</mml:mi><mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="monospace">time</mml:mtext></mml:mrow></mml:mrow><mml:mo>,</mml:mo><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">(</mml:mo><mml:mi>t</mml:mi><mml:mo>,</mml:mo><mml:mi>s</mml:mi><mml:mo stretchy="false">)</mml:mo><mml:mo>=</mml:mo><mml:mi>exp</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mo 
stretchy="false">(</mml:mo><mml:mo>&#x2212;</mml:mo><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mi>t</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mi>s</mml:mi><mml:msup><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mn>2</mml:mn></mml:msup><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:mi>&#x03D5;</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:math></inline-formula> if <inline-formula id="ieqn-114"><mml:math id="mml-ieqn-114"><mml:mi>t</mml:mi><mml:mo>,</mml:mo><mml:mi>s</mml:mi><mml:mo>&#x2208;</mml:mo><mml:msub><mml:mi>T</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:math></inline-formula>, and 0 otherwise. In this context, <inline-formula id="ieqn-115"><mml:math id="mml-ieqn-115"><mml:msub><mml:mi>T</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:math></inline-formula> represents the set of transaction times for the <italic>i</italic>-th customer. The parameter <inline-formula id="ieqn-116"><mml:math id="mml-ieqn-116"><mml:mi>&#x03D5;</mml:mi></mml:math></inline-formula> is a positive constant that scales the time difference. A larger weight indicates that the two transaction times are closer together. To summarize, the weight matrix <inline-formula id="ieqn-117"><mml:math id="mml-ieqn-117"><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">W</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="monospace">time</mml:mtext></mml:mrow></mml:mrow></mml:msub></mml:math></inline-formula> is constructed by considering both temporal proximity and the customer number. 
Each block matrix <inline-formula id="ieqn-118"><mml:math id="mml-ieqn-118"><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">W</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="monospace">time</mml:mtext></mml:mrow></mml:mrow><mml:mo>,</mml:mo><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> within <inline-formula id="ieqn-119"><mml:math id="mml-ieqn-119"><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">W</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="monospace">time</mml:mtext></mml:mrow></mml:mrow></mml:msub></mml:math></inline-formula> represents the temporal relationships between transactions for a specific customer. Transactions from different customers are not connected, which is reflected in the block-diagonal structure of <inline-formula id="ieqn-120"><mml:math id="mml-ieqn-120"><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">W</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="monospace">time</mml:mtext></mml:mrow></mml:mrow></mml:msub></mml:math></inline-formula>.</p>
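A minimal sketch of this construction follows; the helper name and toy values are ours, and `phi` plays the role of the scale parameter. Grouping by <monospace>cc_num</monospace> yields the block-diagonal structure, and within a block the weight decays with the squared time gap.

```python
import numpy as np

def build_w_time(cc_num, times, phi=1.0):
    """W(t, s) = exp(-|t - s|^2 / phi) for transactions sharing a cc_num,
    and 0 across different customers (hypothetical helper)."""
    cc = np.asarray(cc_num)
    t = np.asarray(times, dtype=float)
    same_customer = cc[:, None] == cc[None, :]
    weights = np.exp(-np.abs(t[:, None] - t[None, :]) ** 2 / phi)
    return np.where(same_customer, weights, 0.0)

# Toy data sorted by customer, timestamps in hours.
cc = [111, 111, 111, 222, 222]
ts = [0.0, 0.5, 9.0, 0.2, 0.3]
W_time = build_w_time(cc, ts, phi=1.0)

# Different customers are never connected; closer times give larger weights.
assert W_time[0, 3] == 0.0
assert W_time[0, 1] > W_time[0, 2]
```

With the rows ordered by customer, the nonzero entries form exactly the diagonal blocks described above.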
<p>We have configured the architecture as shown in <xref ref-type="fig" rid="fig-4">Fig. 4</xref> to obtain the hidden layer.</p>
<fig id="fig-4">
<label>Figure 4</label>
<caption>
<title>GCN architecture for extracting non-Euclidean features. Here <inline-formula id="ieqn-121"><mml:math id="mml-ieqn-121"><mml:msub><mml:mrow><mml:mover><mml:mrow><mml:mtext mathvariant="bold">A</mml:mtext></mml:mrow><mml:mo stretchy="false">&#x007E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="monospace">time</mml:mtext></mml:mrow></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msubsup><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">D</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="monospace">time</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">A</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="monospace">time</mml:mtext></mml:mrow></mml:mrow></mml:msub><mml:msubsup><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">D</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="monospace">time</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:math></inline-formula> is the normalized adjacency matrix. 
The input <inline-formula id="ieqn-122"><mml:math id="mml-ieqn-122"><mml:msubsup><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">H</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="monospace">time</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mn>0</mml:mn><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msubsup></mml:math></inline-formula> consists of the transaction amount <inline-formula id="ieqn-123"><mml:math id="mml-ieqn-123"><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="monospace">amt</mml:mtext></mml:mrow></mml:mrow></mml:msub></mml:math></inline-formula>, processed through two GCN layers to produce an 8-dimensional representation <inline-formula id="ieqn-124"><mml:math id="mml-ieqn-124"><mml:msubsup><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">H</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="monospace">time</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mn>2</mml:mn><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msubsup></mml:math></inline-formula></title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_73500-fig-4.tif"/>
</fig>
<p>Here <inline-formula id="ieqn-125"><mml:math id="mml-ieqn-125"><mml:msubsup><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">H</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="monospace">time</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mn>0</mml:mn><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msubsup></mml:math></inline-formula> consists solely of <inline-formula id="ieqn-126"><mml:math id="mml-ieqn-126"><mml:msub><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="monospace">amt</mml:mtext></mml:mrow></mml:mrow></mml:msub></mml:math></inline-formula>, since <monospace>amt</monospace> is the most crucial factor in predicting fraud. A typical GCN architecture uses an <inline-formula id="ieqn-127"><mml:math id="mml-ieqn-127"><mml:mrow><mml:mn>8</mml:mn><mml:mo>&#x00D7;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:math></inline-formula> GCN Layer instead of the final Linear Layer. 
However, our objective is to obtain the <inline-formula id="ieqn-128"><mml:math id="mml-ieqn-128"><mml:mrow><mml:mi>n</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mn>8</mml:mn></mml:mrow></mml:math></inline-formula> matrix <inline-formula id="ieqn-129"><mml:math id="mml-ieqn-129"><mml:msubsup><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">H</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="monospace">time</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mn>2</mml:mn><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msubsup></mml:math></inline-formula> for concatenation with the original feature matrix <inline-formula id="ieqn-130"><mml:math id="mml-ieqn-130"><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow></mml:mrow></mml:math></inline-formula> (which includes variables not used for graph construction), so we chose a Linear Layer solely for dimension reduction, sacrificing prediction performance in order to fully capture non-Euclidean information in <inline-formula id="ieqn-131"><mml:math id="mml-ieqn-131"><mml:msubsup><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">H</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="monospace">time</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mn>2</mml:mn><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msubsup></mml:math></inline-formula>.</p>
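A sketch of this extractor in PyTorch follows. The 16-unit first hidden layer and the toy adjacency are assumptions; the text fixes only the 1-dimensional input, the 8-dimensional penultimate layer, and the final Linear (rather than graph convolution) head.

```python
import torch
import torch.nn as nn

class GCNExtractor(nn.Module):
    """Two graph-convolution layers, then a Linear head (not a GCN layer),
    so the penultimate output can be reused as an n x 8 feature matrix."""
    def __init__(self, in_dim=1, hidden=16, out_dim=8):
        super().__init__()
        self.theta0 = nn.Linear(in_dim, hidden, bias=False)   # Theta^(0)
        self.theta1 = nn.Linear(hidden, out_dim, bias=False)  # Theta^(1)
        self.head = nn.Linear(out_dim, 1)                     # Linear layer

    def forward(self, a_norm, x_amt):
        h1 = torch.relu(self.theta0(a_norm @ x_amt))   # H^(1)
        h2 = torch.relu(self.theta1(a_norm @ h1))      # H^(2): kept as features
        y_hat = torch.sigmoid(self.head(h2)).squeeze(-1)
        return h2, y_hat

# Toy usage with a symmetrically normalized adjacency D^{-1/2} A D^{-1/2}.
n = 10
A = torch.rand(n, n)
A = (A + A.T) / 2
d_inv_sqrt = torch.diag(A.sum(dim=1).rsqrt())
a_norm = d_inv_sqrt @ A @ d_inv_sqrt

model = GCNExtractor()
H2, y_hat = model(a_norm, torch.rand(n, 1))  # x_amt is the n x 1 input
```

After training against the combined loss, `H2` would be concatenated with the remaining tabular columns as described above.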
</sec>
</sec>
<sec id="s5">
<label>5</label>
<title>Experiment &#x0026; Results</title>
<p>We evaluate our proposed graph-augmented approach using six baseline tabular models: NeuralNet (PyTorch-based MLP), RandomForest [<xref ref-type="bibr" rid="ref-29">29</xref>], ExtraTrees [<xref ref-type="bibr" rid="ref-30">30</xref>], XGBoost [<xref ref-type="bibr" rid="ref-31">31</xref>], LightGBM [<xref ref-type="bibr" rid="ref-32">32</xref>], and CatBoost [<xref ref-type="bibr" rid="ref-33">33</xref>]. The baseline models were trained using AutoGluon-Tabular [<xref ref-type="bibr" rid="ref-34">34</xref>], an automated machine learning framework.</p>
<sec id="s5_1">
<label>5.1</label>
<title>Overall Performance Comparison</title>
<p>We conducted comprehensive experiments spanning four categories of approaches: (i) conventional tabular models as baselines; (ii) the same tabular models augmented with GCN embeddings (our proposed approach); (iii) sequence-based architectures, including GRU [<xref ref-type="bibr" rid="ref-35">35</xref>], LSTM with attention [<xref ref-type="bibr" rid="ref-36">36</xref>], Transformer-based models [<xref ref-type="bibr" rid="ref-37">37</xref>], and temporal convolutional networks (TCN) [<xref ref-type="bibr" rid="ref-38">38</xref>]; and (iv) graph-based models such as GCN [<xref ref-type="bibr" rid="ref-20">20</xref>] and DySAT [<xref ref-type="bibr" rid="ref-21">21</xref>].</p>
<p>The results reveal important findings across model categories. Baseline tabular models already achieve strong performance (NeuralNet Area Under the Curve (AUC) 0.997, ensemble models 0.94&#x2013;0.99), maintaining high precision (0.81&#x2013;0.92) but relatively lower recall (0.56&#x2013;0.78). Sequence models achieve competitive AUC (GRU 0.990, LSTM 0.988, Transformer 0.985), but despite high recall (above 0.91), they suffer from very low precision (below 0.15), exhibiting a severe precision-recall trade-off. Graph-based models (GCN AUC 0.985, DySAT AUC 0.977) directly utilize graph structure for classification but underperform the baseline tabular models.</p>
<p>In contrast, our proposed graph-augmented models achieve the highest performance by using GCN embeddings as additional features for tabular models: graph-augmented NeuralNet attains AUC of 0.9995, outperforming all other methods, consistently improving all baseline models (AUC improvement: 0.002&#x2013;0.054), and achieving both high precision (above 0.90) and improved recall for balanced performance. This superior performance can be attributed to the following: (1) compared to tabular models, graph embeddings provide temporal proximity information between transactions that tabular features alone cannot capture, improving recall; (2) compared to sequence models, using GCN embeddings as features rather than as the classifier preserves the stable precision of tabular models while incorporating graph information; and (3) compared to pure graph-based models, extracting graph features and then applying well-established tabular classifiers achieves better generalization than performing both feature extraction and classification in non-Euclidean space.</p>
<p>These results should not be over-generalized to all fraud detection settings: the optimal model may vary depending on the fraud transaction ratio, the characteristics of fraud patterns, and the temporal structure of the data. Nevertheless, our proposed method is distinctive in that it combines the stable precision of tabular models, the sequential pattern-capturing capability of sequence models, and the relational information exploited by graph models.</p>
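<p>For clarity, the reported metrics can be computed from predicted scores as sketched below; the helper functions and the toy labels and scores are illustrative, not the paper&#x2019;s evaluation code:</p>

```python
import numpy as np

def precision_recall_f1(y_true, y_pred):
    """Threshold-dependent metrics from binary predictions."""
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def roc_auc(y_true, scores):
    """AUC as the probability that a random fraud case outscores a random
    legitimate case (the Mann-Whitney U formulation)."""
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

# toy example: 8 transactions, 3 fraudulent
y = np.array([0, 0, 1, 0, 1, 0, 1, 0])
s = np.array([0.10, 0.40, 0.90, 0.20, 0.80, 0.30, 0.35, 0.05])
p, r, f1 = precision_recall_f1(y, (s >= 0.5).astype(int))
auc = roc_auc(y, s)
print(p, round(r, 3), round(f1, 3), round(auc, 3))
```

<p>AUC is threshold-independent, whereas precision, recall, and F1 depend on the chosen cutoff (0.5 here), which is why the two families of metrics can rank models differently.</p>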
</sec>
<sec id="s5_2">
<label>5.2</label>
<title>Prediction Confidence Analysis</title>
<p>AUC is a useful threshold-independent metric for evaluating the overall discriminative ability of classifiers. However, in extreme class imbalance settings, the F1-score also warrants consideration. As shown in <xref ref-type="table" rid="table-1">Table 1</xref>, baseline Euclidean models achieve high precision (0.81&#x2013;0.92) but relatively low recall (0.56&#x2013;0.78)&#x2014;this occurs because they primarily detect &#x201C;certain&#x201D; fraud cases with high transaction amounts. In contrast, our proposed graph-augmented models maintain precision (0.90&#x2013;0.95) while improving recall (0.74&#x2013;0.88), yielding improvements in F1-score (0.82&#x2013;0.90). To pinpoint where these AUC and F1-score improvements originate, we examine prediction confidence.</p>
<table-wrap id="table-1">
<label>Table 1</label>
<caption>
<title>Comprehensive model comparison</title>
</caption>
<table>
<colgroup>
<col align="left"/>
<col align="left"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/> </colgroup>
<thead>
<tr>
<th align="center">Category</th>
<th align="center">Method</th>
<th>Accuracy</th>
<th>Precision</th>
<th>Recall</th>
<th>F1-score</th>
<th>AUC</th>
</tr>
</thead>
<tbody>
<tr>
<td rowspan="6">Tabular<break/>(Baseline)</td>
<td>NeuralNet</td>
<td>0.998351</td>
<td>0.921711</td>
<td>0.777902</td>
<td>0.843722</td>
<td>0.997630</td>
</tr>
<tr>

<td>RandomForest [<xref ref-type="bibr" rid="ref-29">29</xref>]</td>
<td>0.997302</td>
<td>0.823810</td>
<td>0.672405</td>
<td>0.740447</td>
<td>0.980178</td>
</tr>
<tr>

<td>ExtraTrees [<xref ref-type="bibr" rid="ref-30">30</xref>]</td>
<td>0.997718</td>
<td>0.901409</td>
<td>0.675181</td>
<td>0.772064</td>
<td>0.981392</td>
</tr>
<tr>

<td>LightGBM [<xref ref-type="bibr" rid="ref-32">32</xref>]</td>
<td>0.996949</td>
<td>0.842149</td>
<td>0.574681</td>
<td>0.683169</td>
<td>0.991976</td>
</tr>
<tr>

<td>CatBoost [<xref ref-type="bibr" rid="ref-33">33</xref>]</td>
<td>0.997111</td>
<td>0.843077</td>
<td>0.608551</td>
<td>0.706869</td>
<td>0.974223</td>
</tr>
<tr>

<td>XGBoost [<xref ref-type="bibr" rid="ref-31">31</xref>]</td>
<td>0.996748</td>
<td>0.810208</td>
<td>0.564132</td>
<td>0.665140</td>
<td>0.945512</td>
</tr>
<tr>
<td rowspan="6">Proposed</td>
<td>Graph-aug. NeuralNet</td>
<td>0.998834</td>
<td>0.924764</td>
<td>0.866741</td>
<td>0.894813</td>
<td><bold>0.999516</bold></td>
</tr>
<tr>

<td>Graph-aug. RandomForest</td>
<td>0.998669</td>
<td>0.942949</td>
<td>0.816769</td>
<td>0.875335</td>
<td>0.999124</td>
</tr>
<tr>

<td>Graph-aug. ExtraTrees</td>
<td>0.998379</td>
<td><bold>0.951084</bold></td>
<td>0.755692</td>
<td>0.842203</td>
<td>0.997823</td>
</tr>
<tr>

<td>Graph-aug. LightGBM</td>
<td><bold>0.998862</bold></td>
<td>0.914418</td>
<td>0.883954</td>
<td><bold>0.898928</bold></td>
<td>0.999481</td>
</tr>
<tr>

<td>Graph-aug. CatBoost</td>
<td>0.998071</td>
<td>0.902835</td>
<td>0.742921</td>
<td>0.815109</td>
<td>0.994701</td>
</tr>
<tr>

<td>Graph-aug. XGBoost</td>
<td>0.998656</td>
<td>0.914561</td>
<td>0.843976</td>
<td>0.877852</td>
<td>0.999132</td>
</tr>
<tr>
<td rowspan="6">Sequence</td>
<td>LSTM [<xref ref-type="bibr" rid="ref-3">3</xref>]</td>
<td>0.968233</td>
<td>0.147261</td>
<td>0.949473</td>
<td>0.254977</td>
<td>0.987437</td>
</tr>
<tr>

<td>GRU [<xref ref-type="bibr" rid="ref-35">35</xref>]</td>
<td>0.967699</td>
<td>0.145101</td>
<td>0.948917</td>
<td>0.251712</td>
<td>0.990048</td>
</tr>
<tr>

<td>LSTM&#x002B;Attention [<xref ref-type="bibr" rid="ref-36">36</xref>]</td>
<td>0.968182</td>
<td>0.147301</td>
<td><bold>0.951694</bold></td>
<td>0.255116</td>
<td>0.989439</td>
</tr>
<tr>

<td>Transformer [<xref ref-type="bibr" rid="ref-37">37</xref>]</td>
<td>0.959977</td>
<td>0.119489</td>
<td>0.940589</td>
<td>0.212042</td>
<td>0.984973</td>
</tr>
<tr>

<td>TabTransformer [<xref ref-type="bibr" rid="ref-39">39</xref>]</td>
<td>0.817323</td>
<td>0.028575</td>
<td>0.936702</td>
<td>0.055458</td>
<td>0.949902</td>
</tr>
<tr>

<td>TCN [<xref ref-type="bibr" rid="ref-38">38</xref>]</td>
<td>0.841814</td>
<td>0.032134</td>
<td>0.914492</td>
<td>0.062087</td>
<td>0.944878</td>
</tr>
<tr>
<td rowspan="2">Graph-based</td>
<td>GCN [<xref ref-type="bibr" rid="ref-20">20</xref>]</td>
<td>0.994739</td>
<td>0.617742</td>
<td>0.212660</td>
<td>0.316398</td>
<td>0.984965</td>
</tr>
<tr>

<td>DySAT [<xref ref-type="bibr" rid="ref-21">21</xref>]</td>
<td>0.994475</td>
<td>0.773913</td>
<td>0.049417</td>
<td>0.092902</td>
<td>0.976723</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>While the overall AUC values show modest improvements due to already well-trained baseline models, the practical benefit becomes more pronounced when examining predictions for low-amount transactions. <xref ref-type="fig" rid="fig-5">Fig. 5</xref> presents a comprehensive comparison of predicted probabilities for actual fraud cases (<inline-formula id="ieqn-132"><mml:math id="mml-ieqn-132"><mml:mi>y</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:math></inline-formula>) using the LightGBM model (results for other models are available in the supplementary materials). The top row shows histograms: the proposed model (orange) concentrates predictions near 1.0, indicating high confidence, while the classic model (blue) spreads predictions across a wider range. This difference is particularly striking for low-amount transactions (amt <inline-formula id="ieqn-133"><mml:math id="mml-ieqn-133"><mml:mo>&#x003C;</mml:mo><mml:mn>80</mml:mn></mml:math></inline-formula>), where the classic model shows a bimodal distribution with many predictions near 0, whereas the proposed model maintains confident predictions near 1.0.</p>
<fig id="fig-5">
<label>Figure 5</label>
<caption>
<title>Comparison of predicted fraud probabilities for actual fraud cases (<inline-formula id="ieqn-134"><mml:math id="mml-ieqn-134"><mml:mi>y</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:math></inline-formula>). Top row: histograms showing prediction distributions; bottom row: empirical cumulative distribution functions (empirical CDF). Columns show all fraud cases (left), low-amount transactions with amt <inline-formula id="ieqn-135"><mml:math id="mml-ieqn-135"><mml:mo>&#x003C;</mml:mo><mml:mn>80</mml:mn></mml:math></inline-formula> (middle), and high-amount transactions with amt <inline-formula id="ieqn-136"><mml:math id="mml-ieqn-136"><mml:mo>&#x2265;</mml:mo><mml:mn>80</mml:mn></mml:math></inline-formula> (right). The classic model (blue) uses only tabular features <inline-formula id="ieqn-137"><mml:math id="mml-ieqn-137"><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow></mml:mrow></mml:math></inline-formula>, while the proposed model (orange) incorporates GCN embeddings <inline-formula id="ieqn-138"><mml:math id="mml-ieqn-138"><mml:mrow><mml:mover><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow><mml:mo stretchy="false">&#x007E;</mml:mo></mml:mover></mml:mrow></mml:math></inline-formula>. The dashed gray line in empirical CDF plots represents the ideal classifier. The proposed model consistently achieves predictions closer to 1.0, particularly for low-amount transactions where the classic model struggles</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_73500-fig-5.tif"/>
</fig>
<p>The bottom row of <xref ref-type="fig" rid="fig-5">Fig. 5</xref> presents the empirical cumulative distribution function (empirical CDF) of these predicted probabilities, providing a clearer comparison. For an ideal classifier, all fraud cases should receive a predicted probability of 1, resulting in a step function at probability <inline-formula id="ieqn-139"><mml:math id="mml-ieqn-139"><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:math></inline-formula> (dashed gray line). The closer a model&#x2019;s empirical CDF is to this ideal step function, the more confidently it identifies fraud. The classic model (blue) shows a gradual increase across the probability range, indicating uncertainty in fraud detection, while the proposed model (orange) concentrates predictions near 1.0, with a median of 0.99 compared to 0.92 for the classic model. This improvement is particularly pronounced for low-amount transactions (amt <inline-formula id="ieqn-140"><mml:math id="mml-ieqn-140"><mml:mo>&#x003C;</mml:mo><mml:mn>80</mml:mn></mml:math></inline-formula>) in the middle column, where the classic model&#x2019;s empirical CDF rises steeply even at low probability values. The right column shows high-amount transactions (amt <inline-formula id="ieqn-141"><mml:math id="mml-ieqn-141"><mml:mo>&#x2265;</mml:mo><mml:mn>80</mml:mn></mml:math></inline-formula>), where both models perform better, but the proposed model still achieves predictions closer to the ideal, confirming that graph-based embeddings provide additional discriminative power.</p>
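<p>The empirical CDF comparison can be reproduced in a few lines; the probability arrays below are hypothetical stand-ins for the model outputs visualized in Fig. 5:</p>

```python
import numpy as np

def ecdf(values):
    """Empirical CDF evaluated at the sorted sample points."""
    x = np.sort(values)
    F = np.arange(1, len(x) + 1) / len(x)
    return x, F

# hypothetical predicted fraud probabilities for actual fraud cases (y = 1)
classic  = np.array([0.40, 0.65, 0.80, 0.90, 0.92, 0.95, 0.97])   # tabular features only
proposed = np.array([0.90, 0.96, 0.98, 0.99, 0.99, 1.00, 1.00])   # with GCN embeddings

x_c, F_c = ecdf(classic)
x_p, F_p = ecdf(proposed)

# an ideal classifier's CDF is a step function at probability 1, so the lower
# the CDF sits over [0, 1), the more confidently the model identifies fraud
print(np.median(classic), np.median(proposed))            # medians of the two models
print(np.mean(classic <= 0.8), np.mean(proposed <= 0.8))  # CDF value at probability 0.8
```

<p>A smaller CDF value at any fixed probability below 1 means fewer fraud cases received uncertain scores, which is the sense in which the proposed model&#x2019;s curve lies closer to the ideal step function.</p>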

<p>We present a visual example using Michael Rodriguez&#x2019;s transactions, which include 249 test transactions with 4 fraudulent cases. <xref ref-type="table" rid="table-2">Table 2</xref> compares the baseline model predictions with our proposed graph-augmented model.</p>
<table-wrap id="table-2">
<label>Table 2</label>
<caption>
<title>Michael Rodriguez&#x2019;s fraudulent transaction data</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/> </colgroup>
<thead>
<tr>
<th rowspan="2">Timestamp</th>
<th rowspan="2">Amount</th>
<th rowspan="2">Label</th>
<th colspan="2">Prediction</th>
<th colspan="2">Probability</th>
</tr>
<tr>
<th>Baseline</th>
<th>Proposed</th>
<th>Baseline</th>
<th>Proposed</th>
</tr>
</thead>
<tbody>
<tr>
<td>2019-10-12 05:12</td>
<td>291.43</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>0.919</td>
<td>0.987</td>
</tr>
<tr>
<td>2019-10-12 22:12</td>
<td>905.52</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>0.879</td>
<td>0.980</td>
</tr>
<tr>
<td>2019-10-13 05:04</td>
<td>20.02</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>0.402</td>
<td>0.963</td>
</tr>
<tr>
<td>2019-10-13 22:16</td>
<td>736.16</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>0.778</td>
<td>0.982</td>
</tr>
</tbody>
</table>
</table-wrap>
  
<p>The average transaction amount for Michael Rodriguez is 56.25: legitimate transactions average 49.19, while fraudulent transactions average 488.28. The baseline model flagged all high-<monospace>amt</monospace> transactions as fraudulent but predicted the transaction with the small <monospace>amt</monospace> value of 20.02 as legitimate, assigning it a low probability of 0.402. In contrast, the graph-augmented method correctly identified this small-<monospace>amt</monospace> transaction as fraudulent with a high probability of 0.963, and it also assigned higher probabilities to the high-<monospace>amt</monospace> fraudulent transactions than the baseline.</p>
<p>This behavior is easier to understand from the graph visualization in <xref ref-type="fig" rid="fig-6">Fig. 6</xref>, which depicts nine transactions closely related to the fraudulent transactions among Michael&#x2019;s transactions. Edges with low weights have been pruned from the displayed graph. Because temporally adjacent transactions are linked, the method can correctly identify fraud even when the transaction amounts are small.</p>
<fig id="fig-6">
<label>Figure 6</label>
<caption>
<title>Graph representation of transaction patterns for customer Michael Rodriguez</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_73500-fig-6.tif"/>
</fig>
</sec>
<sec id="s5_3">
<label>5.3</label>
<title>Undersampling Experiments</title>
<p>In the previous experiments, we maintained the original fraud transaction ratio of 0.00573 with a 7:3 train-test split. However, dealing with such extreme class imbalance is a common challenge in fraud detection. A widely-used technique is undersampling, where the majority class (legitimate transactions) is reduced to achieve a more balanced training set while keeping the test set unchanged to reflect real-world conditions.</p>
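<p>The undersampling step described above can be sketched as follows, assuming a hypothetical training set; every fraud row is kept and legitimate rows are subsampled to the target ratio:</p>

```python
import numpy as np

def undersample(X, y, target_fraud_ratio, seed=0):
    """Keep every fraud row (y = 1) and subsample legitimate rows (y = 0)
    so that fraud makes up target_fraud_ratio of the returned training set."""
    rng = np.random.default_rng(seed)
    fraud_idx = np.where(y == 1)[0]
    legit_idx = np.where(y == 0)[0]
    # solve n_fraud / (n_fraud + n_legit_kept) = ratio for n_legit_kept
    n_legit_keep = int(len(fraud_idx) * (1 - target_fraud_ratio) / target_fraud_ratio)
    kept_legit = rng.choice(legit_idx, size=n_legit_keep, replace=False)
    keep = np.concatenate([fraud_idx, kept_legit])
    rng.shuffle(keep)
    return X[keep], y[keep]

# hypothetical training set: 10000 transactions with a ~0.6% fraud ratio
rng = np.random.default_rng(42)
y = (rng.random(10000) < 0.006).astype(int)
X = rng.normal(size=(10000, 6))
X_bal, y_bal = undersample(X, y, target_fraud_ratio=0.5)
print(y_bal.mean())   # balanced to the target fraud ratio
```

<p>Only the training set is resampled; the test set is left untouched so that evaluation still reflects the real class imbalance.</p>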
<p>We conducted additional experiments with various undersampling ratios (fraud ratios from 0.05 to 0.5) to evaluate the robustness of our proposed method. The results consistently demonstrate that incorporating GCN embeddings into the feature set yields superior performance compared to using only the original features, regardless of the undersampling ratio. This validates the effectiveness of our approach when combined with standard class imbalance handling techniques. Detailed experimental settings and results are available in our supplementary materials.<xref ref-type="fn" rid="fn-2"><sup>2</sup></xref><fn id="fn-2">
<label>2</label>
<p><ext-link ext-link-type="uri" xlink:href="https://guebin.github.io/non-euclidean-models-for-fraud-detection/">https://guebin.github.io/non-euclidean-models-for-fraud-detection/</ext-link></p>
</fn></p>
</sec>
</sec>
<sec id="s6">
<label>6</label>
<title>Theoretical Interpretation: GCN Embeddings as Temporal Random Effects</title>
<p>This section provides a theoretical framework for understanding why GCN embeddings improve fraud detection performance. We interpret the GCN embeddings as a replacement for traditional random effects in hierarchical models, and conduct statistical tests to validate the significance of their contribution.</p>
<p>To simplify the theoretical discussion and enable closed-form statistical testing, we use logistic regression as the downstream classifier with an undersampled dataset (<inline-formula id="ieqn-142"><mml:math id="mml-ieqn-142"><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:mn>8410</mml:mn></mml:math></inline-formula>, fraud ratio &#x003D; 0.5). This choice is justified by the robustness analysis in the supplementary materials, which demonstrates that GCN embeddings provide consistent performance improvements across all six model types (NeuralNet, RandomForest, ExtraTrees, LightGBM, CatBoost, XGBoost) and all undersampling ratios (5%&#x2013;50%). Therefore, insights derived from this logistic regression setting generalize to the broader experimental spectrum.</p>
<p>Our graph-based approach models both temporal and customer information together. Temporal information can be extracted as features and placed in <inline-formula id="ieqn-144"><mml:math id="mml-ieqn-144"><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow></mml:math></inline-formula>, while customer effects can be considered through mixed effects models. Using the notation from <xref ref-type="sec" rid="s4">Section 4</xref>, let <inline-formula id="ieqn-145"><mml:math id="mml-ieqn-145"><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow><mml:mo>&#x2208;</mml:mo><mml:msup><mml:mrow><mml:mi mathvariant="double-struck">R</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>k</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> denote the feature matrix and <inline-formula id="ieqn-146"><mml:math id="mml-ieqn-146"><mml:mrow><mml:mtext mathvariant="bold">y</mml:mtext></mml:mrow><mml:mo>&#x2208;</mml:mo><mml:mo fence="false" stretchy="false">{</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:msup><mml:mo fence="false" stretchy="false">}</mml:mo><mml:mi>n</mml:mi></mml:msup></mml:math></inline-formula> the fraud indicators. 
For the <inline-formula id="ieqn-147"><mml:math id="mml-ieqn-147"><mml:mi>j</mml:mi></mml:math></inline-formula>-th transaction of customer <inline-formula id="ieqn-148"><mml:math id="mml-ieqn-148"><mml:mi>i</mml:mi></mml:math></inline-formula>, the traditional generalized linear mixed model with customer-level random effects is <inline-formula id="ieqn-149"><mml:math id="mml-ieqn-149"><mml:mtext>logit</mml:mtext><mml:mo stretchy="false">(</mml:mo><mml:mi>P</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">)</mml:mo><mml:mo stretchy="false">)</mml:mo><mml:mo>=</mml:mo><mml:msub><mml:mi>&#x03B2;</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:mo>+</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">&#x03B2;</mml:mi><mml:mi mathvariant="normal">&#x22A4;</mml:mi></mml:msup><mml:msub><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:math></inline-formula>, where <inline-formula id="ieqn-150"><mml:math id="mml-ieqn-150"><mml:msub><mml:mi>u</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x223C;</mml:mo><mml:mi>N</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:msubsup><mml:mi>&#x03C3;</mml:mi><mml:mi>u</mml:mi><mml:mn>2</mml:mn></mml:msubsup><mml:mo stretchy="false">)</mml:mo></mml:math></inline-formula>, <inline-formula id="ieqn-151"><mml:math id="mml-ieqn-151"><mml:msub><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> is the <inline-formula id="ieqn-152"><mml:math id="mml-ieqn-152"><mml:mi>j</mml:mi></mml:math></inline-formula>-th row of <inline-formula id="ieqn-153"><mml:math id="mml-ieqn-153"><mml:mrow><mml:mtext 
mathvariant="bold">X</mml:mtext></mml:mrow></mml:math></inline-formula> corresponding to customer <inline-formula id="ieqn-154"><mml:math id="mml-ieqn-154"><mml:mi>i</mml:mi></mml:math></inline-formula>, and <inline-formula id="ieqn-155"><mml:math id="mml-ieqn-155"><mml:msub><mml:mi>u</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:math></inline-formula> represents the customer-specific random effect. This formulation assumes conditional independence of transactions given <inline-formula id="ieqn-156"><mml:math id="mml-ieqn-156"><mml:msub><mml:mi>u</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:math></inline-formula>.</p>
<p>However, fraudulent transactions exhibit strong temporal clustering&#x2014;multiple unauthorized charges typically occur within minutes before detection. Our proposed method replaces the customer random effect <inline-formula id="ieqn-157"><mml:math id="mml-ieqn-157"><mml:msub><mml:mi>u</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:math></inline-formula> with a learned embedding: <inline-formula id="ieqn-158"><mml:math id="mml-ieqn-158"><mml:mtext>logit</mml:mtext><mml:mo stretchy="false">(</mml:mo><mml:mi>P</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">)</mml:mo><mml:mo stretchy="false">)</mml:mo><mml:mo>=</mml:mo><mml:msub><mml:mi>&#x03B2;</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:mo>+</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">&#x03B2;</mml:mi><mml:mi mathvariant="normal">&#x22A4;</mml:mi></mml:msup><mml:msub><mml:mrow><mml:mtext mathvariant="bold">x</mml:mtext></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">&#x03B3;</mml:mi><mml:mi mathvariant="normal">&#x22A4;</mml:mi></mml:msup><mml:msub><mml:mrow><mml:mtext mathvariant="bold">h</mml:mtext></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>, where <inline-formula id="ieqn-159"><mml:math id="mml-ieqn-159"><mml:msub><mml:mrow><mml:mtext mathvariant="bold">h</mml:mtext></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>&#x2208;</mml:mo><mml:msup><mml:mrow><mml:mi mathvariant="double-struck">R</mml:mi></mml:mrow><mml:mi>d</mml:mi></mml:msup></mml:math></inline-formula> is the GCN embedding for the <inline-formula id="ieqn-160"><mml:math id="mml-ieqn-160"><mml:mi>j</mml:mi></mml:math></inline-formula>-th transaction of customer <inline-formula id="ieqn-161"><mml:math 
id="mml-ieqn-161"><mml:mi>i</mml:mi></mml:math></inline-formula>, computed from customer <inline-formula id="ieqn-162"><mml:math id="mml-ieqn-162"><mml:mi>i</mml:mi></mml:math></inline-formula>&#x2019;s transaction graph. This embedding captures both customer-specific patterns (since it is derived from customer <inline-formula id="ieqn-163"><mml:math id="mml-ieqn-163"><mml:mi>i</mml:mi></mml:math></inline-formula>&#x2019;s own transaction history) and temporal proximity effects (since neighboring transactions with similar timestamps contribute more strongly). The augmented feature matrix is <inline-formula id="ieqn-164"><mml:math id="mml-ieqn-164"><mml:mrow><mml:mover><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow><mml:mo stretchy="false">&#x007E;</mml:mo></mml:mover></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow><mml:mo>&#x2295;</mml:mo><mml:msubsup><mml:mrow><mml:mtext mathvariant="bold">H</mml:mtext></mml:mrow><mml:mn>1</mml:mn><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>L</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msubsup><mml:mo>&#x2295;</mml:mo><mml:mo>&#x22EF;</mml:mo><mml:mo>&#x2295;</mml:mo><mml:msubsup><mml:mrow><mml:mtext mathvariant="bold">H</mml:mtext></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mi>J</mml:mi><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mi>J</mml:mi><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow></mml:mrow></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msubsup></mml:math></inline-formula> as defined in Algorithm 1.</p>
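<p>The augmented-feature formulation can be sketched as follows. The data, the gradient-descent fitting routine, and the embedding dimensions are illustrative assumptions rather than the paper&#x2019;s actual pipeline; in particular, the &#x201C;GCN embedding&#x201D; block H is simulated as a feature block that carries the fraud signal:</p>

```python
import numpy as np

def fit_logistic(X, y, lr=0.1, n_iter=2000):
    """Plain gradient-descent logistic regression:
    logit P(y=1) = beta_0 + X @ beta."""
    Xb = np.hstack([np.ones((len(X), 1)), X])
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def log_likelihood(X, y, w):
    """Bernoulli log-likelihood of the fitted model."""
    Xb = np.hstack([np.ones((len(X), 1)), X])
    z = Xb @ w
    return float(np.sum(y * z - np.log1p(np.exp(z))))

# hypothetical data: tabular features X plus a simulated embedding block H,
# mimicking the augmented matrix X_tilde = X concatenated with H
rng = np.random.default_rng(0)
n, k, d = 400, 3, 2
X = rng.normal(size=(n, k))
H = rng.normal(size=(n, d))
true_logit = 2.0 * H[:, 0] - 1.5 * H[:, 1]
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-true_logit))).astype(float)

X_tilde = np.hstack([X, H])      # augmented features
w0 = fit_logistic(X, y)          # baseline model M0: f(X)
w1 = fit_logistic(X_tilde, y)    # augmented model M1: f(X_tilde)
ll0 = log_likelihood(X, y, w0)
ll1 = log_likelihood(X_tilde, y, w1)
print(ll1 > ll0)                 # the embedding block improves the fit
```

<p>Because the embedding columns enter the model as ordinary fixed-effect features, the contribution of H can be assessed with standard nested-model tools, which is exactly what the likelihood ratio test below exploits.</p>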
<p>Unlike random effects, the GCN embeddings are learned through message-passing and can be treated as fixed effects in the downstream classifier. To empirically validate this interpretation, we conducted an ablation study (<inline-formula id="ieqn-165"><mml:math id="mml-ieqn-165"><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:mn>8410</mml:mn></mml:math></inline-formula>, fraud ratio &#x003D; 0.5, logistic regression). The baseline feature matrix <inline-formula id="ieqn-166"><mml:math id="mml-ieqn-166"><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow></mml:math></inline-formula> consists of <inline-formula id="ieqn-167"><mml:math id="mml-ieqn-167"><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mn>6</mml:mn></mml:math></inline-formula> continuous features: transaction amount (<monospace>amt</monospace>), customer location (<monospace>lat</monospace>, <monospace>long</monospace>), city population (<monospace>city_pop</monospace>), and merchant location (<monospace>merch_lat</monospace>, <monospace>merch_long</monospace>). These features were selected because they represent the core numerical attributes available for each transaction and are commonly used in fraud detection literature. We also tested adding explicit temporal features (<monospace>trans_hour</monospace>, <monospace>trans_day_of_week</monospace>) as direct features, but found they provide negligible improvement. This is because temporal information is more effectively captured through the graph structure: the edge weights of the transaction graph encode temporal proximity, and the GCN embeddings <inline-formula id="ieqn-168"><mml:math id="mml-ieqn-168"><mml:msup><mml:mrow><mml:mtext mathvariant="bold">H</mml:mtext></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>L</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msup></mml:math></inline-formula> learn to exploit this relational structure. The results are summarized in <xref ref-type="table" rid="table-3">Table 3</xref>.</p>
<table-wrap id="table-3">
<label>Table 3</label>
<caption>
<title>Ablation study: Contribution of feature components (<inline-formula id="ieqn-169"><mml:math id="mml-ieqn-169"><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:mn>8410</mml:mn></mml:math></inline-formula>, fraud ratio &#x003D; 0.5, Logistic Regression)</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/> </colgroup>
<thead>
<tr>
<th>Model Configuration</th>
<th>AUC</th>
<th><inline-formula id="ieqn-170"><mml:math id="mml-ieqn-170"><mml:mi mathvariant="normal">&#x0394;</mml:mi></mml:math></inline-formula>AUC</th>
<th>Features</th>
</tr>
</thead>
<tbody>
<tr>
<td>Baseline (X only)</td>
<td>0.831</td>
<td>&#x2014;</td>
<td>6</td>
</tr>
<tr>
<td>X &#x002B; Time</td>
<td>0.831</td>
<td>&#x002B;0.000</td>
<td>8</td>
</tr>
<tr>
<td>X &#x002B; Customer Effects</td>
<td>0.934</td>
<td>&#x002B;0.102</td>
<td>927</td>
</tr>
<tr>
<td>X &#x002B; Time &#x002B; Customer</td>
<td>0.934</td>
<td>&#x002B;0.102</td>
<td>929</td>
</tr>
<tr>
<td>X &#x002B; GCN Embeddings</td>
<td>0.948</td>
<td>&#x002B;0.117</td>
<td>16</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>As shown in <xref ref-type="table" rid="table-3">Table 3</xref>, GCN embeddings achieve superior performance (&#x002B;0.117 AUC improvement) with only 16 features, compared to customer dummies which require 927 features for a &#x002B;0.102 improvement.</p>

<p>Notably, explicit temporal features (<monospace>trans_hour</monospace>, <monospace>trans_day_of_week</monospace>) provide essentially no improvement. This reveals a fundamental distinction between two types of temporal patterns. <italic>Absolute temporal patterns</italic> refer to statements like &#x201C;transactions at 23:00 are more likely to be fraudulent,&#x201D; which would require fraud to concentrate at specific hours or days&#x2014;a pattern largely absent in our data. In contrast, <italic>relative temporal patterns</italic> refer to statements like &#x201C;a transaction temporally close to a fraudulent transaction is also likely to be fraudulent&#x201D;&#x2014;this fraud clustering pattern is precisely what our data exhibits.</p>
<p>Conventional feature engineering cannot easily capture relative temporal patterns because doing so would require knowing the fraud labels of neighboring transactions <italic>a priori</italic>. Our GCN-based approach elegantly solves this: the graph structure encodes temporal proximity between transactions, and the message-passing mechanism allows fraud-related signals to propagate through temporally adjacent nodes during training. This explains why the <italic>relational structure</italic> of temporal proximity&#x2014;not raw timestamp values&#x2014;is the key to improved fraud detection. To formally test significance, let <inline-formula id="ieqn-171"><mml:math id="mml-ieqn-171"><mml:mi>f</mml:mi></mml:math></inline-formula> denote the logistic regression classifier. We compare baseline model <inline-formula id="ieqn-172"><mml:math id="mml-ieqn-172"><mml:msub><mml:mrow><mml:mi>&#x02133;</mml:mi></mml:mrow><mml:mn>0</mml:mn></mml:msub><mml:mo>:</mml:mo><mml:mrow><mml:mover><mml:mrow><mml:mtext mathvariant="bold">y</mml:mtext></mml:mrow><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow><mml:mo>=</mml:mo><mml:mi>f</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:math></inline-formula> against the augmented model <inline-formula id="ieqn-173"><mml:math id="mml-ieqn-173"><mml:msub><mml:mrow><mml:mi>&#x02133;</mml:mi></mml:mrow><mml:mn>1</mml:mn></mml:msub><mml:mo>:</mml:mo><mml:mrow><mml:mover><mml:mrow><mml:mtext mathvariant="bold">y</mml:mtext></mml:mrow><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover></mml:mrow><mml:mo>=</mml:mo><mml:mi>f</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mover><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow><mml:mo stretchy="false">&#x007E;</mml:mo></mml:mover></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:math></inline-formula>, where <inline-formula id="ieqn-174"><mml:math 
id="mml-ieqn-174"><mml:mrow><mml:mover><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow><mml:mo stretchy="false">&#x007E;</mml:mo></mml:mover></mml:mrow></mml:math></inline-formula> is defined as in Algorithm 1. In our experimental setting: <inline-formula id="ieqn-175"><mml:math id="mml-ieqn-175"><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:mn>8410</mml:mn></mml:math></inline-formula> (sample size), <inline-formula id="ieqn-176"><mml:math id="mml-ieqn-176"><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mn>8</mml:mn></mml:math></inline-formula> (number of baseline features in <inline-formula id="ieqn-177"><mml:math id="mml-ieqn-177"><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow></mml:math></inline-formula>, including temporal features), and <inline-formula id="ieqn-178"><mml:math id="mml-ieqn-178"><mml:mi>d</mml:mi><mml:mo>=</mml:mo><mml:mn>8</mml:mn></mml:math></inline-formula> (GCN embedding dimensions).</p>
<p>Under <inline-formula id="ieqn-179"><mml:math id="mml-ieqn-179"><mml:msub><mml:mi>H</mml:mi><mml:mn>0</mml:mn></mml:msub></mml:math></inline-formula>: &#x201C;GCN embeddings do not contribute,&#x201D; the likelihood ratio statistic follows <inline-formula id="ieqn-180"><mml:math id="mml-ieqn-180"><mml:mi mathvariant="normal">&#x039B;</mml:mi><mml:mo>=</mml:mo><mml:mn>2</mml:mn><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>&#x2113;</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>&#x2212;</mml:mo><mml:msub><mml:mi>&#x2113;</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:mo stretchy="false">)</mml:mo><mml:mover><mml:mo>&#x2192;</mml:mo><mml:mi>d</mml:mi></mml:mover><mml:msubsup><mml:mi>&#x03C7;</mml:mi><mml:mi>d</mml:mi><mml:mn>2</mml:mn></mml:msubsup></mml:math></inline-formula> asymptotically. This approximation requires regularity conditions, which we verify in <xref ref-type="table" rid="table-4">Table 4</xref>. The first condition ensures adequate sample size relative to the total number of features (<inline-formula id="ieqn-181"><mml:math id="mml-ieqn-181"><mml:mi>k</mml:mi><mml:mo>+</mml:mo><mml:mi>d</mml:mi><mml:mo>=</mml:mo><mml:mn>16</mml:mn></mml:math></inline-formula>). The second condition is satisfied by construction since <inline-formula id="ieqn-182"><mml:math id="mml-ieqn-182"><mml:mrow><mml:mover><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow><mml:mo stretchy="false">&#x007E;</mml:mo></mml:mover></mml:mrow></mml:math></inline-formula> contains all columns of <inline-formula id="ieqn-183"><mml:math id="mml-ieqn-183"><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow></mml:math></inline-formula>.</p>
<table-wrap id="table-4">
<label>Table 4</label>
<caption>
<title>Verification of <inline-formula id="ieqn-184"><mml:math id="mml-ieqn-184"><mml:msup><mml:mi>&#x03C7;</mml:mi><mml:mn>2</mml:mn></mml:msup></mml:math></inline-formula> approximation conditions for LR test (<inline-formula id="ieqn-185"><mml:math id="mml-ieqn-185"><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:mn>8410</mml:mn></mml:math></inline-formula>, <inline-formula id="ieqn-186"><mml:math id="mml-ieqn-186"><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mn>8</mml:mn></mml:math></inline-formula>, <inline-formula id="ieqn-187"><mml:math id="mml-ieqn-187"><mml:mi>d</mml:mi><mml:mo>=</mml:mo><mml:mn>8</mml:mn></mml:math></inline-formula>)</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/> </colgroup>
<thead>
<tr>
<th>Condition</th>
<th>Criterion</th>
<th>Observed</th>
<th>Status</th>
</tr>
</thead>
<tbody>
<tr>
<td>Large sample</td>
<td><inline-formula id="ieqn-188"><mml:math id="mml-ieqn-188"><mml:mi>n</mml:mi><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>k</mml:mi><mml:mo>+</mml:mo><mml:mi>d</mml:mi><mml:mo stretchy="false">)</mml:mo><mml:mo>&#x003E;</mml:mo><mml:mn>10</mml:mn></mml:math></inline-formula></td>
<td><inline-formula id="ieqn-189"><mml:math id="mml-ieqn-189"><mml:mn>8410</mml:mn><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:mn>16</mml:mn><mml:mo>=</mml:mo><mml:mn>525.6</mml:mn></mml:math></inline-formula></td>
<td>Pass</td>
</tr>
<tr>
<td>Nested models</td>
<td><inline-formula id="ieqn-190"><mml:math id="mml-ieqn-190"><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow><mml:mo>&#x2282;</mml:mo><mml:mrow><mml:mover><mml:mrow><mml:mtext mathvariant="bold">X</mml:mtext></mml:mrow><mml:mo stretchy="false">&#x007E;</mml:mo></mml:mover></mml:mrow></mml:math></inline-formula></td>
<td>By construction</td>
<td>Pass</td>
</tr>
<tr>
<td>Permutation validation</td>
<td>Asymptotic and permutation agree</td>
<td>Both significant</td>
<td>Pass</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>Finally, we validate the asymptotic <inline-formula id="ieqn-191"><mml:math id="mml-ieqn-191"><mml:msup><mml:mi>&#x03C7;</mml:mi><mml:mn>2</mml:mn></mml:msup></mml:math></inline-formula> approximation through a permutation test. By randomly permuting the GCN embeddings (100 iterations), we break any true association with the outcome and obtain the null distribution of <inline-formula id="ieqn-192"><mml:math id="mml-ieqn-192"><mml:mi mathvariant="normal">&#x039B;</mml:mi></mml:math></inline-formula>. Under the null hypothesis where GCN embeddings have no predictive value, this distribution has mean <inline-formula id="ieqn-193"><mml:math id="mml-ieqn-193"><mml:mi>&#x03BC;</mml:mi><mml:mo>=</mml:mo><mml:mn>7.11</mml:mn></mml:math></inline-formula> and standard deviation <inline-formula id="ieqn-194"><mml:math id="mml-ieqn-194"><mml:mi>&#x03C3;</mml:mi><mml:mo>=</mml:mo><mml:mn>3.59</mml:mn></mml:math></inline-formula>. Our observed statistic <inline-formula id="ieqn-195"><mml:math id="mml-ieqn-195"><mml:mi mathvariant="normal">&#x039B;</mml:mi><mml:mo>=</mml:mo><mml:mn>2719.92</mml:mn></mml:math></inline-formula> lies approximately 756 standard deviations above this null mean, providing strong evidence against <inline-formula id="ieqn-196"><mml:math id="mml-ieqn-196"><mml:msub><mml:mi>H</mml:mi><mml:mn>0</mml:mn></mml:msub></mml:math></inline-formula>: <inline-formula id="ieqn-197"><mml:math id="mml-ieqn-197"><mml:mi mathvariant="normal">&#x039B;</mml:mi><mml:mo>=</mml:mo><mml:mn>2719.92</mml:mn><mml:mo>&#x226B;</mml:mo><mml:msubsup><mml:mi>&#x03C7;</mml:mi><mml:mrow><mml:mn>8</mml:mn><mml:mo>,</mml:mo><mml:mn>0.001</mml:mn></mml:mrow><mml:mn>2</mml:mn></mml:msubsup><mml:mo>=</mml:mo><mml:mn>26.12</mml:mn><mml:mspace width="1em" /><mml:mo 
stretchy="false">(</mml:mo><mml:mtext>asymptotic&#xA0;</mml:mtext><mml:mi>p</mml:mi><mml:mo>&#x003C;</mml:mo><mml:mn>0.001</mml:mn><mml:mo>,</mml:mo><mml:mtext>&#xA0;permutation&#xA0;</mml:mtext><mml:mi>p</mml:mi><mml:mo>&#x003C;</mml:mo><mml:mn>0.0001</mml:mn><mml:mo stretchy="false">)</mml:mo><mml:mo>.</mml:mo></mml:math></inline-formula> The partial F-test confirms these findings: <inline-formula id="ieqn-198"><mml:math id="mml-ieqn-198"><mml:mi>F</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mtext>SSE</mml:mtext><mml:mn>0</mml:mn></mml:msub><mml:mo>&#x2212;</mml:mo><mml:msub><mml:mtext>SSE</mml:mtext><mml:mn>1</mml:mn></mml:msub><mml:mo stretchy="false">)</mml:mo><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:mi>d</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mtext>SSE</mml:mtext><mml:mn>1</mml:mn></mml:msub><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>n</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mi>k</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mi>d</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mfrac><mml:mo>=</mml:mo><mml:mn>837.23</mml:mn></mml:math></inline-formula> (<inline-formula id="ieqn-199"><mml:math id="mml-ieqn-199"><mml:mi>p</mml:mi></mml:math></inline-formula>-value <inline-formula id="ieqn-200"><mml:math id="mml-ieqn-200"><mml:mo>&#x003C;</mml:mo><mml:mn>0.001</mml:mn></mml:math></inline-formula>), <inline-formula id="ieqn-201"><mml:math id="mml-ieqn-201"><mml:mi mathvariant="normal">&#x0394;</mml:mi><mml:msup><mml:mi>R</mml:mi><mml:mn>2</mml:mn></mml:msup><mml:mo>=</mml:mo><mml:mn>0.231</mml:mn></mml:math></inline-formula>.</p>
<p>These tests confirm that GCN embeddings provide statistically significant predictive value by capturing temporal dependencies that traditional methods overlook.</p>
</sec>
<sec id="s7">
<label>7</label>
<title>Conclusion</title>
<sec id="s7_1">
<label>7.1</label>
<title>Why Graph-Augmented Features Work</title>
<p>Sequence-based approaches (LSTM, Gated Recurrent Unit (GRU), Transformer) assume equally spaced time steps, whereas credit card transactions occur at irregular intervals, ranging from seconds apart during shopping sessions to days or weeks between purchases. Moreover, these models process transactions as ordered sequences without modeling the connectivity structure between temporally proximate transactions. Our graph-based approach explicitly captures irregular temporal relationships through edge weights that encode temporal proximity, naturally handling non-uniform time spacing without requiring temporal discretization or padding.</p>
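<p>A minimal sketch of such edge weighting, assuming a Gaussian kernel on pairwise time gaps (the timestamps and the bandwidth tau below are illustrative choices, not values from this paper):</p>

```python
import numpy as np

# Hypothetical timestamps (seconds) for one customer's transactions:
# two quick bursts separated by long gaps.
t = np.array([0.0, 30.0, 95.0, 86400.0, 86420.0, 604800.0])

# Gaussian kernel on pairwise gaps: W_ij = exp(-|t_i - t_j|^2 / (2 * tau^2)).
# tau is an assumed bandwidth parameter controlling how quickly edges decay.
tau = 120.0
gaps = np.abs(t[:, None] - t[None, :])
W = np.exp(-(gaps ** 2) / (2.0 * tau ** 2))
np.fill_diagonal(W, 0.0)  # no self-loops

# Transactions seconds apart are strongly connected; transactions days
# apart are effectively disconnected, with no discretization or padding.
print(np.round(W, 3))
```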
<p>While non-Euclidean approaches can capture temporal and customer dependencies, applying them to every transaction introduces unnecessary complexity. In practice, the vast majority of legitimate transactions can be correctly classified using Euclidean features alone&#x2014;temporal clustering patterns become critical <italic>only in the vicinity of fraudulent activity</italic>. A purely graph-based model would be computationally expensive and may overfit to relational patterns irrelevant for normal transactions. Our analysis shows that GCN embeddings contribute an additional 23.1% of explained variance (<inline-formula id="ieqn-202"><mml:math id="mml-ieqn-202"><mml:mi mathvariant="normal">&#x0394;</mml:mi><mml:msup><mml:mi>R</mml:mi><mml:mn>2</mml:mn></mml:msup><mml:mo>=</mml:mo><mml:mn>0.231</mml:mn></mml:math></inline-formula>), but this contribution is localized to fraud-adjacent regions rather than uniformly distributed across all transactions.</p>
<p>Our approach offers a pragmatic middle ground. The base Euclidean classifier efficiently handles the majority of straightforward cases, while GCN embeddings provide supplementary non-Euclidean information precisely where it matters&#x2014;near fraud clusters. This design avoids the overhead of full graph-based inference while retaining the benefits of temporal dependency modeling. The concatenation allows the downstream classifier to adaptively weight Euclidean vs. non-Euclidean features based on context, achieving both high precision (above 0.90) and improved recall without the computational burden of end-to-end graph models. Furthermore, this modular design offers practical advantages: the GCN embedding generation is independent of the downstream classifier, allowing practitioners to leverage existing classification infrastructure while benefiting from graph-based feature augmentation.</p>
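<p>The modular pipeline can be sketched in toy form: one propagation step of a Kipf&#x2013;Welling-style GCN layer (with untrained, randomly initialized weights, shown only to illustrate shapes and message passing) over a temporal-proximity graph, followed by concatenation with the Euclidean features. The sizes and adjacency below are illustrative, not the paper's configuration.</p>

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy sizes (hypothetical): 6 transactions, 4 Euclidean features, 3 embedding dims.
n, k, d = 6, 4, 3
X = rng.normal(size=(n, k))

# Illustrative temporal-proximity adjacency: two clusters of transactions.
A = np.zeros((n, n))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5)]:
    A[i, j] = A[j, i] = 1.0

# Symmetric normalization with self-loops: A_hat = D^{-1/2} (A + I) D^{-1/2}.
A_tilde = A + np.eye(n)
d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
A_hat = A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

# One GCN layer (weights untrained here): propagate over the graph, then ReLU.
W1 = rng.normal(size=(k, d))
H = np.maximum(A_hat @ X @ W1, 0.0)

# Augmented design matrix X~ = [X | H] for any downstream classifier.
X_aug = np.hstack([X, H])
print(X_aug.shape)  # -> (6, 7)
```

<p>Because the embedding block is produced independently of the classifier, the same augmentation can feed logistic regression, gradient boosting, or an existing scoring pipeline unchanged.</p>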
</sec>
<sec id="s7_2">
<label>7.2</label>
<title>Limitations and Future Work</title>
<p>While our proposed method demonstrates strong performance on the analyzed dataset, several limitations should be acknowledged. First, our evaluation is based on a single publicly available dataset that may not fully represent the diversity of fraud patterns encountered in real-world financial systems. Credit card fraud datasets vary significantly in their characteristics: some contain only anonymized features and lack explicit temporal information. The generalizability of our method to these diverse fraud scenarios requires further validation. Second, our approach constructs individual graphs for each customer, which assumes sufficient transaction history per customer. For customers with very few transactions, the graph structure may be too sparse to provide meaningful embeddings. Third, our approach assumes that temporal proximity is the primary factor determining transaction relationships, which may not hold in all fraud scenarios&#x2014;for instance, when fraudsters deliberately introduce time delays between fraudulent activities. Future work could address these limitations by: (i) evaluating on multiple fraud datasets with different characteristics, (ii) developing adaptive methods for customers with limited transaction history, and (iii) exploring alternative kernel functions that capture more complex temporal patterns.</p>
</sec>
</sec>
</body>
<back>



<ack>
<p>Not applicable.</p>
</ack>
<sec>
<title>Funding Statement</title>
<p>This research was supported by the National Research Foundation of Korea (NRF) funded by the Korea government (RS-2023-00249743). Additionally, this research was supported by the Global-Learning &#x0026; Academic Research Institution for Master&#x2019;s, PhD Students, and Postdocs (LAMP) Program of the National Research Foundation of Korea (NRF) grant funded by the Ministry of Education (RS-2024-00443714). This research was also supported by the &#x201C;Research Base Construction Fund Support Program&#x201D; funded by Jeonbuk National University in 2025.</p>
</sec>
<sec>
<title>Author Contributions</title>
<p>Conceptualization and methodology, Guebin Choi; software, validation, formal analysis, investigation, data curation, visualization, and writing&#x2014;original draft, Boram Kim; writing&#x2014;review and editing, Guebin Choi and Boram Kim; supervision and funding acquisition, Guebin Choi. All authors reviewed the results and approved the final version of the manuscript.</p>
</sec>
<sec sec-type="data-availability">
<title>Availability of Data and Materials</title>
<p>The data that support the findings of this study are openly available in Kaggle at <ext-link ext-link-type="uri" xlink:href="https://www.kaggle.com/datasets/dermisfit/fraud-transactions-dataset">https://www.kaggle.com/datasets/dermisfit/fraud-transactions-dataset</ext-link>. Supplementary materials including detailed experimental results are available at <ext-link ext-link-type="uri" xlink:href="https://guebin.github.io/non-euclidean-models-for-fraud-detection/">https://guebin.github.io/non-euclidean-models-for-fraud-detection/</ext-link>.</p>
</sec>
<sec>
<title>Ethics Approval</title>
<p>Not applicable.</p>
</sec>
<sec sec-type="COI-statement">
<title>Conflicts of Interest</title>
<p>The authors declare no conflicts of interest to report regarding the present study.</p>
</sec>
<ref-list content-type="authoryear">
<title>References</title>
<ref id="ref-1"><label>[1]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><collab>Nilson Report</collab></person-group>. <article-title>Global card fraud losses reach $403.88 Billion</article-title>. <year>2025</year> <comment>[cited 2025 Jan 15]</comment>. Available from: <ext-link ext-link-type="uri" xlink:href="https://nilsonreport.com">https://nilsonreport.com</ext-link>.</mixed-citation></ref>
<ref id="ref-2"><label>[2]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Whitrow</surname> <given-names>C</given-names></string-name>, <string-name><surname>Hand</surname> <given-names>DJ</given-names></string-name>, <string-name><surname>Juszczak</surname> <given-names>P</given-names></string-name>, <string-name><surname>Weston</surname> <given-names>D</given-names></string-name>, <string-name><surname>Adams</surname> <given-names>NM</given-names></string-name></person-group>. <article-title>Transaction aggregation as a strategy for credit card fraud detection</article-title>. <source>Data Min Knowl Discov</source>. <year>2009</year>;<volume>18</volume>(<issue>1</issue>):<fpage>30</fpage>&#x2013;<lpage>55</lpage>. doi:<pub-id pub-id-type="doi">10.1007/s10618-008-0116-z</pub-id>.</mixed-citation></ref>
<ref id="ref-3"><label>[3]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Hochreiter</surname> <given-names>S</given-names></string-name>, <string-name><surname>Schmidhuber</surname> <given-names>J</given-names></string-name></person-group>. <article-title>Long short-term memory</article-title>. <source>Neural Comput</source>. <year>1997</year>;<volume>9</volume>(<issue>8</issue>):<fpage>1735</fpage>&#x2013;<lpage>80</lpage>. doi:<pub-id pub-id-type="doi">10.1162/neco.1997.9.8.1735</pub-id>; <pub-id pub-id-type="pmid">9377276</pub-id></mixed-citation></ref>
<ref id="ref-4"><label>[4]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Krawczyk</surname> <given-names>B</given-names></string-name></person-group>. <article-title>Learning from imbalanced data: open challenges and future directions</article-title>. <source>Prog Artif Intell</source>. <year>2016</year>;<volume>5</volume>(<issue>4</issue>):<fpage>221</fpage>&#x2013;<lpage>32</lpage>. doi:<pub-id pub-id-type="doi">10.1007/s13748-016-0094-0</pub-id>.</mixed-citation></ref>
<ref id="ref-5"><label>[5]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Makki</surname> <given-names>S</given-names></string-name>, <string-name><surname>Assaghir</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Taher</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Haque</surname> <given-names>R</given-names></string-name>, <string-name><surname>Hacid</surname> <given-names>MS</given-names></string-name>, <string-name><surname>Zeineddine</surname> <given-names>H</given-names></string-name></person-group>. <article-title>An experimental study with imbalanced classification approaches for credit card fraud detection</article-title>. <source>IEEE Access</source>. <year>2019</year>;<volume>7</volume>:<fpage>93010</fpage>&#x2013;<lpage>22</lpage>. doi:<pub-id pub-id-type="doi">10.1109/access.2019.2927266</pub-id>.</mixed-citation></ref>
<ref id="ref-6"><label>[6]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Alam</surname> <given-names>TM</given-names></string-name>, <string-name><surname>Shaukat</surname> <given-names>K</given-names></string-name>, <string-name><surname>Hameed</surname> <given-names>IA</given-names></string-name>, <string-name><surname>Luo</surname> <given-names>S</given-names></string-name>, <string-name><surname>Sarwar</surname> <given-names>MU</given-names></string-name>, <string-name><surname>Shabbir</surname> <given-names>S</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>An investigation of credit card default prediction in the imbalanced datasets</article-title>. <source>IEEE Access</source>. <year>2020</year>;<volume>8</volume>:<fpage>201173</fpage>&#x2013;<lpage>98</lpage>. doi:<pub-id pub-id-type="doi">10.1109/access.2020.3033784</pub-id>.</mixed-citation></ref>
<ref id="ref-7"><label>[7]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Wei</surname> <given-names>W</given-names></string-name>, <string-name><surname>Li</surname> <given-names>J</given-names></string-name>, <string-name><surname>Cao</surname> <given-names>L</given-names></string-name>, <string-name><surname>Ou</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>J</given-names></string-name></person-group>. <article-title>Effective detection of sophisticated online banking fraud on extremely imbalanced data</article-title>. <source>World Wide Web</source>. <year>2013</year>;<volume>16</volume>(<issue>4</issue>):<fpage>449</fpage>&#x2013;<lpage>75</lpage>. doi:<pub-id pub-id-type="doi">10.1007/s11280-012-0178-0</pub-id>.</mixed-citation></ref>
<ref id="ref-8"><label>[8]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Tayebi</surname> <given-names>M</given-names></string-name>, <string-name><surname>El Kafhali</surname> <given-names>S</given-names></string-name></person-group>. <article-title>Combining autoencoders and deep learning for effective fraud detection in credit card transactions</article-title>. <source>Oper Res Forum</source>. <year>2025</year>;<volume>6</volume>(<issue>1</issue>):<fpage>8</fpage>. doi:<pub-id pub-id-type="doi">10.1007/s43069-024-00409-6</pub-id>.</mixed-citation></ref>
<ref id="ref-9"><label>[9]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Tayebi</surname> <given-names>M</given-names></string-name>, <string-name><surname>El Kafhali</surname> <given-names>S</given-names></string-name></person-group>. <article-title>Generative modeling for imbalanced credit card fraud transaction detection</article-title>. <source>J Cybersecur Priv</source>. <year>2025</year>;<volume>5</volume>(<issue>1</issue>):<fpage>9</fpage>. doi:<pub-id pub-id-type="doi">10.3390/jcp5010009</pub-id>.</mixed-citation></ref>
<ref id="ref-10"><label>[10]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Van Vlasselaer</surname> <given-names>V</given-names></string-name>, <string-name><surname>Bravo</surname> <given-names>C</given-names></string-name>, <string-name><surname>Caelen</surname> <given-names>O</given-names></string-name>, <string-name><surname>Eliassi-Rad</surname> <given-names>T</given-names></string-name>, <string-name><surname>Akoglu</surname> <given-names>L</given-names></string-name>, <string-name><surname>Snoeck</surname> <given-names>M</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>APATE: a novel approach for automated credit card transaction fraud detection using network-based extensions</article-title>. <source>Decis Support Syst</source>. <year>2015</year>;<volume>75</volume>(<issue>3</issue>):<fpage>38</fpage>&#x2013;<lpage>48</lpage>. doi:<pub-id pub-id-type="doi">10.1016/j.dss.2015.04.013</pub-id>.</mixed-citation></ref>
<ref id="ref-11"><label>[11]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><surname>Stamile</surname> <given-names>C</given-names></string-name>, <string-name><surname>Marzullo</surname> <given-names>A</given-names></string-name>, <string-name><surname>Deusebio</surname> <given-names>E</given-names></string-name></person-group>. <source>Graph machine learning: take graph data to the next level by applying machine learning techniques and algorithms</source>. <publisher-loc>Birmingham, UK</publisher-loc>: <publisher-name>Packt Publishing Ltd</publisher-name>.; <year>2021</year>.</mixed-citation></ref>
<ref id="ref-12"><label>[12]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><surname>Weber</surname> <given-names>M</given-names></string-name>, <string-name><surname>Domeniconi</surname> <given-names>G</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>J</given-names></string-name>, <string-name><surname>Weidele</surname> <given-names>DKI</given-names></string-name>, <string-name><surname>Bellei</surname> <given-names>C</given-names></string-name>, <string-name><surname>Robinson</surname> <given-names>T</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Anti-money laundering in bitcoin: experimenting with graph convolutional networks for financial forensics</article-title>. <comment>arXiv:1908.02591</comment>. <year>2019</year>.</mixed-citation></ref>
<ref id="ref-13"><label>[13]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Grover</surname> <given-names>A</given-names></string-name>, <string-name><surname>Leskovec</surname> <given-names>J</given-names></string-name></person-group>. <article-title>node2vec: scalable feature learning for networks</article-title>. In: <conf-name>Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</conf-name>; <publisher-loc>New York, NY, USA</publisher-loc>: <publisher-name>ACM</publisher-name>; <year>2016</year>. p. <fpage>855</fpage>&#x2013;<lpage>64</lpage>.</mixed-citation></ref>
<ref id="ref-14"><label>[14]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zhang</surname> <given-names>Q</given-names></string-name>, <string-name><surname>Yan</surname> <given-names>B</given-names></string-name>, <string-name><surname>Huang</surname> <given-names>M</given-names></string-name></person-group>. <article-title>Internet financial fraud detection based on a distributed big data approach with Node2vec</article-title>. <source>IEEE Access</source>. <year>2021</year>;<volume>9</volume>:<fpage>43378</fpage>&#x2013;<lpage>86</lpage>. doi:<pub-id pub-id-type="doi">10.1109/access.2021.3062467</pub-id>.</mixed-citation></ref>
<ref id="ref-15"><label>[15]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Bruss</surname> <given-names>CB</given-names></string-name>, <string-name><surname>McGee</surname> <given-names>A</given-names></string-name>, <string-name><surname>Muench</surname> <given-names>B</given-names></string-name>, <string-name><surname>Chaluvaraju</surname> <given-names>P</given-names></string-name>, <string-name><surname>Rajput</surname> <given-names>S</given-names></string-name></person-group>. <article-title>DeepTrax: embedding graphs of financial transactions</article-title>. In: <conf-name>2019 18th IEEE International Conference on Machine Learning and Applications (ICMLA)</conf-name>; <publisher-loc>Piscataway, NJ, USA</publisher-loc>: <publisher-name>IEEE</publisher-name>; <year>2019</year>. p. <fpage>126</fpage>&#x2013;<lpage>33</lpage>.</mixed-citation></ref>
<ref id="ref-16"><label>[16]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><surname>Wang</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>X</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>L</given-names></string-name></person-group>. <article-title>Heterogeneous graph auto-encoder for credit card fraud detection</article-title>. <comment>arXiv:2410.08121</comment>. <year>2024</year>.</mixed-citation></ref>
<ref id="ref-17"><label>[17]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Benchaji</surname> <given-names>I</given-names></string-name>, <string-name><surname>Douzi</surname> <given-names>S</given-names></string-name>, <string-name><surname>El Ouahidi</surname> <given-names>B</given-names></string-name></person-group>. <article-title>Credit card fraud detection model based on LSTM recurrent neural networks</article-title>. <source>J Adv Inf Technol</source>. <year>2021</year>;<volume>12</volume>(<issue>2</issue>):<fpage>113</fpage>&#x2013;<lpage>8</lpage>. doi:<pub-id pub-id-type="doi">10.12720/jait.12.2.113-118</pub-id>.</mixed-citation></ref>
<ref id="ref-18"><label>[18]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Alarfaj</surname> <given-names>FK</given-names></string-name>, <string-name><surname>Malik</surname> <given-names>I</given-names></string-name>, <string-name><surname>Khan</surname> <given-names>HU</given-names></string-name>, <string-name><surname>Almusallam</surname> <given-names>N</given-names></string-name>, <string-name><surname>Ramzan</surname> <given-names>M</given-names></string-name>, <string-name><surname>Ahmed</surname> <given-names>M</given-names></string-name></person-group>. <article-title>Credit card fraud detection using state-of-the-art machine learning and deep learning algorithms</article-title>. <source>IEEE Access</source>. <year>2022</year>;<volume>10</volume>:<fpage>39700</fpage>&#x2013;<lpage>15</lpage>. doi:<pub-id pub-id-type="doi">10.1109/access.2022.3166891</pub-id>.</mixed-citation></ref>
<ref id="ref-19"><label>[19]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><surname>Yu</surname> <given-names>C</given-names></string-name>, <string-name><surname>Xu</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Cao</surname> <given-names>J</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Jin</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Zhu</surname> <given-names>M</given-names></string-name></person-group>. <article-title>Credit card fraud detection using advanced transformer model</article-title>. <comment>arXiv:2406.03733</comment>. <year>2024</year>.</mixed-citation></ref>
<ref id="ref-20"><label>[20]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><surname>Kipf</surname> <given-names>TN</given-names></string-name>, <string-name><surname>Welling</surname> <given-names>M</given-names></string-name></person-group>. <article-title>Semi-supervised classification with graph convolutional networks</article-title>. <comment>arXiv:1609.02907</comment>. <year>2016</year>.</mixed-citation></ref>
<ref id="ref-21"><label>[21]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Sankar</surname> <given-names>A</given-names></string-name>, <string-name><surname>Wu</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Gou</surname> <given-names>L</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>W</given-names></string-name>, <string-name><surname>Yang</surname> <given-names>H</given-names></string-name></person-group>. <article-title>DySAT: deep neural representation learning on dynamic graphs via self-attention networks</article-title>. In: <conf-name>Proceedings of the 13th International Conference on Web Search and Data Mining</conf-name>; <publisher-loc>New York, NY, USA</publisher-loc>: <publisher-name>ACM</publisher-name>; <year>2020</year>. p. <fpage>519</fpage>&#x2013;<lpage>27</lpage>.</mixed-citation></ref>
<ref id="ref-22"><label>[22]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>You</surname> <given-names>J</given-names></string-name>, <string-name><surname>Du</surname> <given-names>T</given-names></string-name>, <string-name><surname>Leskovec</surname> <given-names>J</given-names></string-name></person-group>. <article-title>ROLAND: graph learning framework for dynamic graphs</article-title>. In: <conf-name>Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining</conf-name>; <publisher-loc>New York, NY, USA</publisher-loc>: <publisher-name>ACM</publisher-name>; <year>2022</year>. p. <fpage>2358</fpage>&#x2013;<lpage>66</lpage>.</mixed-citation></ref>
<ref id="ref-23"><label>[23]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><surname>Cheng</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Liu</surname> <given-names>C</given-names></string-name>, <string-name><surname>Liu</surname> <given-names>Y</given-names></string-name></person-group>. <article-title>CaT-GNN: enhancing credit card fraud detection via causal temporal graph neural networks</article-title>. <comment>arXiv:2402.14708</comment>. <year>2024</year>.</mixed-citation></ref>
<ref id="ref-24"><label>[24]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Wheeler</surname> <given-names>R</given-names></string-name>, <string-name><surname>Aitken</surname> <given-names>S</given-names></string-name></person-group>. <article-title>Multiple algorithms for fraud detection</article-title>. In: <conf-name>Applications and Innovations in Intelligent Systems VII: Proceedings of ES99, the Nineteenth SGES International Conference on Knowledge Based Systems and Applied Artificial Intelligence</conf-name>; <publisher-loc>Cham, Switzerland</publisher-loc>: <publisher-name>Springer</publisher-name>; <year>2000</year>. p. <fpage>219</fpage>&#x2013;<lpage>31</lpage>.</mixed-citation></ref>
<ref id="ref-25"><label>[25]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Srivastava</surname> <given-names>A</given-names></string-name>, <string-name><surname>Kundu</surname> <given-names>A</given-names></string-name>, <string-name><surname>Sural</surname> <given-names>S</given-names></string-name>, <string-name><surname>Majumdar</surname> <given-names>A</given-names></string-name></person-group>. <article-title>Credit card fraud detection using hidden Markov model</article-title>. <source>IEEE Trans Dependable Secure Comput</source>. <year>2008</year>;<volume>5</volume>(<issue>1</issue>):<fpage>37</fpage>&#x2013;<lpage>48</lpage>. doi:<pub-id pub-id-type="doi">10.1109/tdsc.2007.70228</pub-id>.</mixed-citation></ref>
<ref id="ref-26"><label>[26]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>S&#x00E1;nchez</surname> <given-names>D</given-names></string-name>, <string-name><surname>Vila</surname> <given-names>M</given-names></string-name>, <string-name><surname>Cerda</surname> <given-names>L</given-names></string-name>, <string-name><surname>Serrano</surname> <given-names>JM</given-names></string-name></person-group>. <article-title>Association rules applied to credit card fraud detection</article-title>. <source>Expert Syst Appl</source>. <year>2009</year>;<volume>36</volume>(<issue>2</issue>):<fpage>3630</fpage>&#x2013;<lpage>40</lpage>. doi:<pub-id pub-id-type="doi">10.1016/j.eswa.2008.02.001</pub-id>.</mixed-citation></ref>
<ref id="ref-27"><label>[27]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><surname>Liu</surname> <given-names>T</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>W</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>J</given-names></string-name></person-group>. <article-title>Effective high-order graph representation learning for credit card fraud detection</article-title>. <comment>arXiv:2503.01556</comment>. <year>2025</year>.</mixed-citation></ref>
<ref id="ref-28"><label>[28]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Shuman</surname> <given-names>DI</given-names></string-name>, <string-name><surname>Narang</surname> <given-names>SK</given-names></string-name>, <string-name><surname>Frossard</surname> <given-names>P</given-names></string-name>, <string-name><surname>Ortega</surname> <given-names>A</given-names></string-name>, <string-name><surname>Vandergheynst</surname> <given-names>P</given-names></string-name></person-group>. <article-title>The emerging field of signal processing on graphs: extending high-dimensional data analysis to networks and other irregular domains</article-title>. <source>IEEE Signal Process Mag</source>. <year>2013</year>;<volume>30</volume>(<issue>3</issue>):<fpage>83</fpage>&#x2013;<lpage>98</lpage>. doi:<pub-id pub-id-type="doi">10.1109/msp.2012.2235192</pub-id>.</mixed-citation></ref>
<ref id="ref-29"><label>[29]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Liaw</surname> <given-names>A</given-names></string-name>, <string-name><surname>Wiener</surname> <given-names>M</given-names></string-name></person-group>. <article-title>Classification and regression by randomForest</article-title>. <source>R News</source>. <year>2002</year>;<volume>2</volume>(<issue>3</issue>):<fpage>18</fpage>&#x2013;<lpage>22</lpage>.</mixed-citation></ref>
<ref id="ref-30"><label>[30]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Geurts</surname> <given-names>P</given-names></string-name>, <string-name><surname>Ernst</surname> <given-names>D</given-names></string-name>, <string-name><surname>Wehenkel</surname> <given-names>L</given-names></string-name></person-group>. <article-title>Extremely randomized trees</article-title>. <source>Mach Learn</source>. <year>2006</year>;<volume>63</volume>(<issue>1</issue>):<fpage>3</fpage>&#x2013;<lpage>42</lpage>. doi:<pub-id pub-id-type="doi">10.1007/s10994-006-6226-1</pub-id>.</mixed-citation></ref>
<ref id="ref-31"><label>[31]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Chen</surname> <given-names>T</given-names></string-name>, <string-name><surname>Guestrin</surname> <given-names>C</given-names></string-name></person-group>. <article-title>XGBoost: a scalable tree boosting system</article-title>. In: <conf-name>Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</conf-name>; <publisher-loc>New York, NY, USA</publisher-loc>: <publisher-name>ACM</publisher-name>; <year>2016</year>. p. <fpage>785</fpage>&#x2013;<lpage>94</lpage>.</mixed-citation></ref>
<ref id="ref-32"><label>[32]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><surname>Ke</surname> <given-names>G</given-names></string-name>, <string-name><surname>Meng</surname> <given-names>Q</given-names></string-name>, <string-name><surname>Finley</surname> <given-names>T</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>T</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>W</given-names></string-name>, <string-name><surname>Ma</surname> <given-names>W</given-names></string-name>, <etal>et al</etal></person-group>. <chapter-title>LightGBM: a highly efficient gradient boosting decision tree</chapter-title>. In: <source>Advances in neural information processing systems</source>. Vol. <volume>30</volume>. <publisher-loc>Red Hook, NY, USA</publisher-loc>: <publisher-name>Curran Associates, Inc.</publisher-name>; <year>2017</year>.</mixed-citation></ref>
<ref id="ref-33"><label>[33]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><surname>Prokhorenkova</surname> <given-names>L</given-names></string-name>, <string-name><surname>Gusev</surname> <given-names>G</given-names></string-name>, <string-name><surname>Vorobev</surname> <given-names>A</given-names></string-name>, <string-name><surname>Dorogush</surname> <given-names>AV</given-names></string-name>, <string-name><surname>Gulin</surname> <given-names>A</given-names></string-name></person-group>. <chapter-title>CatBoost: unbiased boosting with categorical features</chapter-title>. In: <source>Advances in neural information processing systems</source>. Vol. <volume>31</volume>. <publisher-loc>Red Hook, NY, USA</publisher-loc>: <publisher-name>Curran Associates, Inc.</publisher-name>; <year>2018</year>.</mixed-citation></ref>
<ref id="ref-34"><label>[34]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><surname>Erickson</surname> <given-names>N</given-names></string-name>, <string-name><surname>Mueller</surname> <given-names>J</given-names></string-name>, <string-name><surname>Shirkov</surname> <given-names>A</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>H</given-names></string-name>, <string-name><surname>Larroy</surname> <given-names>P</given-names></string-name>, <string-name><surname>Li</surname> <given-names>M</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>AutoGluon-Tabular: robust and accurate AutoML for structured data</article-title>. <comment>arXiv:2003.06505</comment>. <year>2020</year>.</mixed-citation></ref>
<ref id="ref-35"><label>[35]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Cho</surname> <given-names>K</given-names></string-name>, <string-name><surname>Van Merri&#x00EB;nboer</surname> <given-names>B</given-names></string-name>, <string-name><surname>Gulcehre</surname> <given-names>C</given-names></string-name>, <string-name><surname>Bahdanau</surname> <given-names>D</given-names></string-name>, <string-name><surname>Bougares</surname> <given-names>F</given-names></string-name>, <string-name><surname>Schwenk</surname> <given-names>H</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Learning phrase representations using RNN encoder-decoder for statistical machine translation</article-title>. In: <conf-name>Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)</conf-name>; <publisher-loc>Stroudsburg, PA, USA</publisher-loc>: <publisher-name>ACL</publisher-name>; <year>2014</year>. p. <fpage>1724</fpage>&#x2013;<lpage>34</lpage>.</mixed-citation></ref>
<ref id="ref-36"><label>[36]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><surname>Bahdanau</surname> <given-names>D</given-names></string-name>, <string-name><surname>Cho</surname> <given-names>K</given-names></string-name>, <string-name><surname>Bengio</surname> <given-names>Y</given-names></string-name></person-group>. <article-title>Neural machine translation by jointly learning to align and translate</article-title>. <comment>arXiv:1409.0473</comment>. <year>2015</year>.</mixed-citation></ref>
<ref id="ref-37"><label>[37]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><surname>Vaswani</surname> <given-names>A</given-names></string-name>, <string-name><surname>Shazeer</surname> <given-names>N</given-names></string-name>, <string-name><surname>Parmar</surname> <given-names>N</given-names></string-name>, <string-name><surname>Uszkoreit</surname> <given-names>J</given-names></string-name>, <string-name><surname>Jones</surname> <given-names>L</given-names></string-name>, <string-name><surname>Gomez</surname> <given-names>AN</given-names></string-name>, <etal>et al</etal></person-group>. <chapter-title>Attention is all you need</chapter-title>. In: <source>Advances in neural information processing systems</source>. Vol. <volume>30</volume>. <publisher-loc>Red Hook, NY, USA</publisher-loc>: <publisher-name>Curran Associates, Inc.</publisher-name>; <year>2017</year>.</mixed-citation></ref>
<ref id="ref-38"><label>[38]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><surname>Bai</surname> <given-names>S</given-names></string-name>, <string-name><surname>Kolter</surname> <given-names>JZ</given-names></string-name>, <string-name><surname>Koltun</surname> <given-names>V</given-names></string-name></person-group>. <article-title>An empirical evaluation of generic convolutional and recurrent networks for sequence modeling</article-title>. <comment>arXiv:1803.01271</comment>. <year>2018</year>.</mixed-citation></ref>
<ref id="ref-39"><label>[39]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><surname>Huang</surname> <given-names>X</given-names></string-name>, <string-name><surname>Khetan</surname> <given-names>A</given-names></string-name>, <string-name><surname>Cvitkovic</surname> <given-names>M</given-names></string-name>, <string-name><surname>Karnin</surname> <given-names>Z</given-names></string-name></person-group>. <article-title>TabTransformer: tabular data modeling using contextual embeddings</article-title>. <comment>arXiv:2012.06678</comment>. <year>2020</year>.</mixed-citation></ref>
</ref-list>
</back></article>