Most fraud detection systems evaluate one transaction at a time: they take a transaction, score it, and decide whether it looks suspicious. This is a reasonable approach, but it has a fundamental limitation. Fraud rarely shows up in a single data point; it shows up in the connections between data points.
The same phone number registered to a dozen accounts. A single device used for transactions on behalf of different individuals. A network of cards linked by common delivery addresses. If you look at each case in isolation, everything seems legitimate. However, once you see the whole picture, the scheme becomes obvious.
This is precisely where Graph Neural Networks, or GNNs, come into play.
A graph isn't a chart with axes and bars. In mathematics and computer science, a graph is a set of objects (called nodes) and the connections between them (edges). Simply put: dots and the lines connecting them.
For example, a bank transaction can be modeled as an edge between two nodes: the sender and the receiver. A user account can be linked to a device, an IP address, an email, and a phone number. All these connections can be represented as a graph, and then, instead of disparate data points, we get a network of relationships.
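As a rough illustration of the idea above, the sketch below turns a handful of account records into a graph and lists the attributes shared by more than one account. The record layout, the account names, and the `build_graph` helper are all made up for the example; a production system would more likely use a graph database or a dedicated graph library.

```python
from collections import defaultdict

# Hypothetical records: (account_id, attribute_type, attribute_value).
records = [
    ("acct_1", "phone", "555-0100"),
    ("acct_2", "phone", "555-0100"),
    ("acct_2", "device", "dev_A"),
    ("acct_3", "device", "dev_A"),
    ("acct_4", "device", "dev_B"),
]


def build_graph(records):
    """Return an undirected adjacency map: node -> set of neighboring nodes."""
    adjacency = defaultdict(set)
    for account, attr_type, attr_value in records:
        # Each attribute value becomes its own node, e.g. "phone:555-0100".
        attr_node = f"{attr_type}:{attr_value}"
        adjacency[account].add(attr_node)
        adjacency[attr_node].add(account)
    return adjacency


graph = build_graph(records)

# Attribute nodes (the ones containing ":") linked to more than one account.
shared = {
    node: sorted(neighbors)
    for node, neighbors in graph.items()
    if ":" in node and len(neighbors) > 1
}
print(shared)
```

Looked at row by row, none of these records is remarkable. Grouped by shared attribute, two of the accounts turn out to be tied together by a phone number and two by a device.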
Graph Neural Networks are specifically designed to work with such structures. Unlike traditional machine learning models, which analyze table rows independently of one another, GNNs consider the context: who is connected to whom, how often, and how unusual that connection is compared to others.
Classical approaches to fraud detection (logistic regression, decision trees, gradient boosting) work well when a transaction can be described by a fixed set of features: amount, time, geolocation, transaction history. But they are virtually blind to structural patterns.
Imagine an account makes a completely normal transaction. The amount is small, the time is typical, and the history is clean. The model lets it pass. But if you look at the graph, it turns out this account is linked to twenty others that were all blocked for fraud a week ago. Without a graph, this information is invisible.
GNNs allow the model to "look around": to consider not only the properties of a specific node but also its surroundings. This is called neighbor aggregation: the model gathers information from nearby nodes and uses it to evaluate the current one.
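To make neighbor aggregation concrete, here is a minimal sketch of a single aggregation round in plain Python. It covers only the "gather from neighbors" step, using a simple mean; real GNN layers also apply learned weights and nonlinearities, which are left out here. The node names, the single "previously blocked" feature, and the `aggregate_neighbors` helper are assumptions made for the example.

```python
def aggregate_neighbors(features, adjacency):
    """One round of mean aggregation.

    Each node's new vector is its own features followed by the
    element-wise mean of its neighbors' features (zeros if it has none).
    """
    updated = {}
    for node, own in features.items():
        neighbors = adjacency.get(node, [])
        if neighbors:
            mean = [
                sum(features[n][i] for n in neighbors) / len(neighbors)
                for i in range(len(own))
            ]
        else:
            mean = [0.0] * len(own)
        updated[node] = own + mean
    return updated


# Hypothetical single feature per node: 1.0 = previously blocked for fraud.
features = {"a": [0.0], "b": [1.0], "c": [1.0], "d": [0.0]}
adjacency = {"a": ["b", "c"], "b": ["a"], "c": ["a"], "d": []}

updated = aggregate_neighbors(features, adjacency)
print(updated["a"])  # own flag is clean, but the neighbor mean is not
print(updated["d"])  # clean, and no suspicious neighbors
```

Accounts `a` and `d` look identical on their own features. After one round of aggregation they no longer do, because `a` inherits a signal from its blocked neighbors.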
Suppose a financial company builds a graph where nodes are users, cards, devices, IP addresses, and transactions, and edges are the connections between them (who used which device, which card was involved in which transaction, and so on).
A GNN passes information along the edges of this graph and, for each node, generates a numerical representation (an embedding): a kind of "portrait" that takes the node's neighborhood into account. These representations are then used for classification: fraud or not.
The key advantage is that the model learns from the structure, not from isolated examples. If, in the training data, fraudulent accounts are often linked through common devices, the model will learn this pattern and look for it in new data, even if the specific values have changed.
This is especially valuable because fraudsters adapt. They change names, addresses, and cards. But hiding the underlying structure of their connections is much more difficult.
Graph Neural Networks are used in anti-fraud for several scenarios.
Anomalous Node Detection. The model flags a specific account or card as suspicious based on its position in the graph and the properties of its neighbors.
Anomalous Link Detection. Sometimes, suspicion arises not from the entity itself, but from a specific transaction between two entities – for example, one that is atypical in amount or direction within the context of the entire network.
Suspicious Subgraph Detection. This involves identifying organized schemes, where multiple nodes and edges together form a structure characteristic of coordinated fraud – such as money mule networks or circular transaction schemes.
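To show what a circular transaction scheme looks like structurally, the sketch below finds a directed cycle in a list of transfers using a classical depth-first search. This is not a GNN and not how a learned model would detect such schemes; it only illustrates the kind of subgraph pattern meant here. The `find_cycle` helper and the transfer data are hypothetical.

```python
def find_cycle(transfers):
    """Return one directed cycle as a list of nodes, or None if there is none."""
    graph = {}
    for sender, receiver in transfers:
        graph.setdefault(sender, []).append(receiver)
        graph.setdefault(receiver, [])

    UNSEEN, ON_PATH, DONE = 0, 1, 2
    state = {node: UNSEEN for node in graph}
    path = []

    def visit(node):
        # Recursive DFS: fine for a small example, but a large graph
        # would need an iterative version to avoid the recursion limit.
        state[node] = ON_PATH
        path.append(node)
        for nxt in graph[node]:
            if state[nxt] == ON_PATH:
                # Reached a node that is still on the current path: a cycle.
                return path[path.index(nxt):]
            if state[nxt] == UNSEEN:
                found = visit(nxt)
                if found:
                    return found
        state[node] = DONE
        path.pop()
        return None

    for node in graph:
        if state[node] == UNSEEN:
            found = visit(node)
            if found:
                return found
    return None


# Hypothetical transfers: money goes A -> B -> C and back to A.
transfers = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]
cycle = find_cycle(transfers)
print(cycle)
```

Each individual transfer in the list is ordinary. Only the closed loop formed by three of them together is the suspicious structure.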
Despite the appeal of this approach, GNNs present real challenges that are important to understand.
Scale. Financial graphs are enormous, with millions of users and billions of transactions. Training and deploying GNNs on such volumes is technically difficult and expensive. Getting a model to score in real time typically requires neighbor sampling, mini-batch training, and other optimizations.
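One common way to keep the cost bounded is to sample a limited number of neighbors per node instead of reading the full neighborhood. The sketch below is a simplified version of that idea, assuming an in-memory adjacency map; the `sample_neighborhood` helper, the `fanouts` parameter, and the toy graph are illustrative and not taken from any particular library.

```python
import random


def sample_neighborhood(adjacency, seed_node, fanouts, rng):
    """Sample a bounded multi-hop neighborhood around seed_node.

    fanouts[i] caps how many neighbors are kept per node at hop i.
    Returns the set of visited nodes and the list of sampled edges.
    """
    frontier = [seed_node]
    visited = {seed_node}
    sampled_edges = []
    for fanout in fanouts:
        next_frontier = []
        for node in frontier:
            neighbors = adjacency.get(node, [])
            if len(neighbors) > fanout:
                chosen = rng.sample(neighbors, fanout)
            else:
                chosen = neighbors
            for neighbor in chosen:
                sampled_edges.append((node, neighbor))
                if neighbor not in visited:
                    visited.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return visited, sampled_edges


# Hypothetical hub (for example a shared IP address) with 1000 linked accounts.
adjacency = {"hub": [f"n{i}" for i in range(1000)]}
for i in range(1000):
    adjacency[f"n{i}"] = ["hub"]

nodes, edges = sample_neighborhood(
    adjacency, "hub", fanouts=[5, 3], rng=random.Random(0)
)
print(len(nodes), len(edges))
```

With fanouts of 5 and 3, the work per scored node is capped at a few dozen edges, regardless of how many neighbors a hub has. The trade-off is that each pass sees only part of the neighborhood.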
Data Labeling. To train a model, you need examples of fraud. But fraudulent cases are a minority in the data, and they are not always easy to label correctly. Working with imbalanced data is a separate challenge.
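One simple and widely used response to class imbalance is to weight each class inversely to its frequency, so that rare fraud examples count for more in the training loss. The sketch below computes such weights as `total / (number_of_classes * class_count)`. The label counts are invented for the example, and this is only one of several options (resampling and specialized losses are others).

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency in labels."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: total / (len(counts) * n) for cls, n in counts.items()}


# Hypothetical labels: 990 legitimate accounts (0) and 10 fraudulent ones (1).
labels = [0] * 990 + [1] * 10
weights = balanced_class_weights(labels)
print(weights)
```

In this example a single fraud case carries roughly a hundred times the weight of a legitimate one, which offsets the fact that there are about a hundred times fewer of them.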
Explainability. In the financial sector, it's not enough to get the answer "fraud"; you also need to explain why the model made that decision. Regulators, legal requirements, and internal processes all demand transparency. In this regard, GNNs are harder to interpret than, say, a decision tree, where you can literally trace the logic step by step.
Graph Dynamics. The graph is constantly changing: new users, new transactions, new connections. The model must either be retrained regularly or use approaches that can handle a changing structure in real time. This adds complexity to the system's architecture.
It's important not to see GNNs as a silver bullet that replaces everything that came before. In practice, Graph Neural Networks are most often integrated into existing anti-fraud systems as an additional signal, not as a standalone solution.
Classical models still perform well with tabular features and deliver fast results. GNNs add structural context to this. Together, they provide a more complete picture than either could alone.
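As one hypothetical way to treat the graph model as an additional signal, the sketch below blends a tabular model's fraud probability with a graph model's probability in log-odds space. The `blend_scores` helper and the `graph_weight` parameter are assumptions for illustration. In practice, teams often feed graph-derived features or embeddings directly into the existing model instead, and any blending weight would need to be tuned on validation data.

```python
import math


def _logit(p, eps=1e-6):
    """Log-odds of p, clamped away from 0 and 1 to avoid infinities."""
    p = min(max(p, eps), 1 - eps)
    return math.log(p / (1 - p))


def _sigmoid(x):
    return 1 / (1 + math.exp(-x))


def blend_scores(tabular_score, graph_score, graph_weight=0.5):
    """Weighted average of two fraud probabilities in log-odds space."""
    z = (1 - graph_weight) * _logit(tabular_score) + graph_weight * _logit(graph_score)
    return _sigmoid(z)


# The tabular model sees a clean transaction; the graph model sees risky neighbors.
tabular_only = blend_scores(0.1, 0.1)
with_graph_signal = blend_scores(0.1, 0.9)
print(tabular_only, with_graph_signal)
```

When the two models agree, the blended score stays close to their common value. When the graph model disagrees strongly, it pulls the final score up even though the tabular features look clean.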
Some companies go even further, building hybrid architectures where graph representations are combined with transaction time series or text data. But these are more complex cases that require significant resources and expertise.
Fraud is becoming more sophisticated. Schemes are more coordinated, attacks are more targeted, and fraudsters are getting better at mimicking normal users. Standard rules and thresholds are becoming less effective – they are too easy to bypass if you know how they work.
At the same time, the volume of data that financial companies are collecting about their users is growing. And this data contains more and more information specifically about connections: who interacts with whom, through which channels, and in what sequence.
Graph Neural Networks are a way to extract meaning from these connections. They're not perfect or universal, but they are better suited to tasks where the core of the problem lies in the structure of relationships, not in individual numbers.
In short: fraudsters have been operating in networks for a long time. Now, the models that catch them are starting to think in networks, too. 🔍