A Web-Based kNN Money Laundering Detection System
Money laundering, by analogy with laundering clothes, is the process of disguising the true source of income or money, usually by making funds from an illegitimate source appear to come from a legitimate one. Existing anti-money laundering systems are explicitly programmed, rule-based, or machine learning based. Each has one or more drawbacks, most notably the explicitly programmed and rule-based systems, which cannot learn from experience or improve their performance as they are used. A k-nearest neighbour (kNN) model was developed using open financial transaction datasets from Kaggle.com, a public website that hosts many datasets. The selected model achieved an accuracy of 98.4%. In this article, we developed a web-based money laundering detection system based on the kNN machine learning model.
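To make the kNN idea in this abstract concrete, here is a minimal sketch of k-nearest-neighbour classification in plain Python. It is not the authors' system: the two features, the toy data, the labels, and the function name `knn_predict` are all assumptions made for illustration, and the abstract does not say which features or which value of k were used.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training example.
    distances = [
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    ]
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical, pre-scaled features: (amount, transfers in the last 24 hours).
train_X = [
    (0.10, 0.20), (0.20, 0.10), (0.15, 0.25),
    (0.90, 0.80), (0.85, 0.95), (0.80, 0.90),
]
train_y = [
    "legitimate", "legitimate", "legitimate",
    "suspicious", "suspicious", "suspicious",
]

prediction = knn_predict(train_X, train_y, (0.88, 0.85), k=3)
print(prediction)
```

Because kNN relies on distances, features on very different scales (for example raw amounts versus counts) would normally be normalised first; the toy values above are assumed to be scaled already.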
LaundroGraph: Self-Supervised Graph Representation Learning for Anti-Money Laundering
Anti-money laundering (AML) regulations mandate financial institutions to
deploy AML systems based on a set of rules that, when triggered, form the basis
of a suspicious alert to be assessed by human analysts. Reviewing these cases
is a cumbersome and complex task that requires analysts to navigate a large
network of financial interactions to validate suspicious movements.
Furthermore, these systems have very high false positive rates (estimated to be
over 95%). The scarcity of labels hinders the use of alternative systems based
on supervised learning, reducing their applicability in real-world
applications.
In this work we present LaundroGraph, a novel self-supervised graph
representation learning approach to encode banking customers and financial
transactions into meaningful representations. These representations are used to
provide insights to assist the AML reviewing process, such as identifying
anomalous movements for a given customer. LaundroGraph represents the
underlying network of financial interactions as a customer-transaction
bipartite graph and trains a graph neural network on a fully self-supervised
link prediction task. We empirically demonstrate that our approach outperforms
other strong baselines on self-supervised link prediction using a real-world
dataset, improving the best non-graph baseline by p.p. of AUC. The goal is
to increase the efficiency of the reviewing process by supplying these
AI-powered insights to the analysts upon review. To the best of our knowledge,
this is the first fully self-supervised system within the context of AML
detection. Comment: Accepted at ACM International Conference on AI in Finance 2022
(ICAIF'22)
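The customer-transaction bipartite graph and the self-supervised link prediction task described in this abstract can be sketched as follows. This is only a data-preparation sketch under stated assumptions, not the LaundroGraph model: the record format, the identifiers, and the helper names `build_bipartite` and `sample_negative_edges` are hypothetical, and no graph neural network is trained here. Observed customer-transaction links serve as positive examples, and randomly sampled non-links serve as negatives, so no human labels are needed.

```python
import random

# Hypothetical records: each pair links a customer to a transaction it takes part in.
records = [("c1", "t1"), ("c1", "t2"), ("c2", "t2"), ("c2", "t3"), ("c3", "t4")]


def build_bipartite(records):
    """Return adjacency sets for the customer side and the transaction side."""
    cust_adj, txn_adj = {}, {}
    for customer, txn in records:
        cust_adj.setdefault(customer, set()).add(txn)
        txn_adj.setdefault(txn, set()).add(customer)
    return cust_adj, txn_adj


def sample_negative_edges(cust_adj, txn_adj, n, seed=0):
    """Sample up to n customer-transaction pairs that are NOT observed edges."""
    rng = random.Random(seed)
    customers, transactions = sorted(cust_adj), sorted(txn_adj)
    observed = sum(len(txns) for txns in cust_adj.values())
    # Never ask for more non-edges than actually exist.
    n = min(n, len(customers) * len(transactions) - observed)
    negatives = set()
    while len(negatives) < n:
        customer, txn = rng.choice(customers), rng.choice(transactions)
        if txn not in cust_adj[customer]:
            negatives.add((customer, txn))
    return sorted(negatives)


cust_adj, txn_adj = build_bipartite(records)
positives = sorted(set(records))
negatives = sample_negative_edges(cust_adj, txn_adj, n=len(positives))

# Training pairs for a link predictor: label 1 = observed link, 0 = sampled non-link.
training_pairs = [(e, 1) for e in positives] + [(e, 0) for e in negatives]
```

A link predictor trained on such pairs assigns a score to each customer-transaction link; a low score for an observed link is one way an anomalous movement could be surfaced to an analyst.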
Graph Mining for Cybersecurity: A Survey
The explosive growth of cyber attacks, such as malware, spam, and
intrusions, has had severe consequences for society. Securing cyberspace has
become an utmost concern for organizations and governments. Traditional Machine
Learning (ML) based methods are extensively used in detecting cyber threats,
but they hardly model the correlations between real-world cyber entities. In
recent years, with the proliferation of graph mining techniques, many
researchers investigated these techniques for capturing correlations between
cyber entities and achieving high performance. It is imperative to summarize
existing graph-based cybersecurity solutions to provide a guide for future
studies. Therefore, as a key contribution of this paper, we provide a
comprehensive review of graph mining for cybersecurity, including an overview
of cybersecurity tasks, the typical graph mining techniques, and the general
process of applying them to cybersecurity, as well as various solutions for
different cybersecurity tasks. For each task, we probe into relevant methods
and highlight the graph types, graph approaches, and task levels in their
modeling. Furthermore, we collect open datasets and toolkits for graph-based
cybersecurity. Finally, we outline potential directions of this field for
future research.
Mining complex trees for hidden fruit : a graph–based computational solution to detect latent criminal networks : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Information Technology at Massey University, Albany, New Zealand.
The detection of crime is a complex and difficult endeavour. Public and private organisations – focusing on law enforcement, intelligence, and compliance – commonly apply the rational isolated actor approach premised on observability and materiality. This is manifested largely as conducting entity-level risk management sourcing ‘leads’ from reactive covert human intelligence sources and/or proactive sources by applying simple rules-based models. Focusing on discrete observable and material actors simply ignores that criminal activity exists within a complex system deriving its fundamental structural fabric from the complex interactions between actors - with those most unobservable likely to be both criminally proficient and influential. The graph-based computational solution developed to detect latent criminal networks is a response to the inadequacy of the rational isolated actor approach that ignores the connectedness and complexity of criminality.
The core computational solution, written in the R language, consists of novel entity resolution, link discovery, and knowledge discovery technology. Entity resolution enables the fusion of multiple datasets with high accuracy (mean F-measure of 0.986 versus competitors' 0.872), generating a graph-based expressive view of the problem. Link discovery comprises link prediction and link inference, enabling the high-performance detection (accuracy of ~0.8 versus ~0.45 for relevant published models) of unobserved relationships such as identity fraud. Knowledge discovery takes the fused graph and applies the “GraphExtract” algorithm to create a set of subgraphs representing latent functional criminal groups, and a mesoscopic graph representing how these criminal groups are interconnected. Latent knowledge is generated from a range of metrics including the “Super-broker” metric and attitude prediction.
The computational solution has been evaluated on a range of datasets that mimic an applied setting, demonstrating a scalable (tested on ~18 million node graphs) and performant (~33 hours runtime on a non-distributed platform) solution that successfully detects relevant latent functional criminal groups in around 90% of cases sampled and enables the contextual understanding of the broader criminal system through the mesoscopic graph and associated metadata. The augmented data assets generated provide a multi-perspective systems view of criminal activity that enables advanced informed decision making across the microscopic-mesoscopic-macroscopic spectrum.
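The entity resolution results in this abstract are reported as an F-measure. The sketch below shows one common way such a score is computed for entity resolution, over pairs of records judged to be the same entity. It assumes the balanced F1 form (the harmonic mean of precision and recall); the abstract does not state which variant the thesis uses, and the record identifiers and the function name `f_measure` are hypothetical.

```python
def f_measure(predicted_pairs, true_pairs):
    """Balanced F-measure (F1) over matched record pairs."""
    predicted, truth = set(predicted_pairs), set(true_pairs)
    true_positives = len(predicted & truth)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)  # share of predicted matches that are right
    recall = true_positives / len(truth)         # share of true matches that were found
    return 2 * precision * recall / (precision + recall)


# Hypothetical record pairs that a resolver linked, versus the ground truth.
predicted = [("r1", "r2"), ("r3", "r4"), ("r5", "r6")]
truth = [("r1", "r2"), ("r3", "r4"), ("r7", "r8"), ("r9", "r10")]

score = f_measure(predicted, truth)
```

Here two of the three predicted pairs are correct (precision 2/3) and two of the four true pairs are found (recall 1/2), which gives an F-measure of 4/7.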
Topological Anomaly Detection in Dynamic Multilayer Blockchain Networks
Motivated by the recent surge of criminal activities with
cross-cryptocurrency trades, we introduce a new topological perspective to
structural anomaly detection in dynamic multilayer networks. We postulate that
anomalies in the underlying blockchain transaction graph that are composed of
multiple layers are likely to also be manifested in anomalous patterns of the
network shape properties. As such, we invoke the machinery of clique persistent
homology on graphs to systematically and efficiently track evolution of the
network shape and, as a result, to detect changes in the underlying network
topology and geometry. We develop a new persistence summary for multilayer
networks, called stacked persistence diagram, and prove its stability under
input data perturbations. We validate our new topological anomaly detection
framework in application to dynamic multilayer networks from the Ethereum
Blockchain and the Ripple Credit Network, and demonstrate that our stacked PD
approach substantially outperforms state-of-art techniques. Comment: 26 pages, 6 figures, 7 tables
Loan maturity aggregation in interbank lending networks obscures mesoscale structure and economic functions
Since the 2007-2009 financial crisis, substantial academic effort has been dedicated to improving our understanding of interbank lending networks (ILNs). Because of data limitations or by choice, the literature largely lacks multiple loan maturities. We employ a complete interbank loan contract dataset to investigate whether maturity details are informative of the network structure. Applying the layered stochastic block model of Peixoto (2015) and other tools from network science to a time series of bilateral loans with multiple maturity layers in the Russian ILN, we find that collapsing all such layers consistently obscures mesoscale structure. The optimal maturity granularity lies between completely collapsing and completely separating the maturity layers and depends on the development phase of the interbank market, with a more developed market requiring more layers for optimal description. Closer inspection of the inferred maturity bins associated with the optimal maturity granularity reveals specific economic functions, from liquidity intermediation to financing. Collapsing a network with multiple underlying maturity layers or extracting one such layer, common in economic research, is therefore not only an incomplete representation of the ILN's mesoscale structure, but also conceals existing economic functions. This holds important insights and opportunities for theoretical and empirical studies on interbank market functioning, contagion, stability, and on the desirable level of regulatory data disclosure.
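What "collapsing" maturity layers means in this abstract can be illustrated with a small sketch. It only shows the aggregation step on toy data and does not fit the layered stochastic block model; the bank names, maturity buckets, amounts, and the helper names `layered_network` and `collapse_layers` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical loans: (lender, borrower, maturity bucket, amount).
loans = [
    ("A", "B", "overnight", 10.0),
    ("A", "B", "1-week", 5.0),
    ("B", "C", "overnight", 7.0),
    ("C", "A", "1-month", 3.0),
]


def layered_network(loans):
    """Group loans into one weighted directed edge list per maturity layer."""
    layers = defaultdict(lambda: defaultdict(float))
    for lender, borrower, maturity, amount in loans:
        layers[maturity][(lender, borrower)] += amount
    return {maturity: dict(edges) for maturity, edges in layers.items()}


def collapse_layers(layers):
    """Sum edge weights across all layers, discarding the maturity information."""
    collapsed = defaultdict(float)
    for edges in layers.values():
        for edge, amount in edges.items():
            collapsed[edge] += amount
    return dict(collapsed)


layers = layered_network(loans)
collapsed = collapse_layers(layers)
```

In the collapsed network the A-to-B edge carries a single weight of 15.0, so the distinction between overnight and one-week lending, which the layered view keeps, is no longer visible.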