A Privacy-Preserving Hybrid Federated Learning Framework for Financial Crime Detection
The recent decade witnessed a surge in financial crimes across
the public and private sectors, with an average cost of scams of $102m to
financial institutions in 2022. Developing a mechanism for battling financial
crimes is a pressing task that requires in-depth collaboration from multiple
institutions, and yet such collaboration imposes significant technical
challenges due to the privacy and security requirements of distributed
financial data. For example, consider the modern payment network systems, which
can generate millions of transactions per day across a large number of global
institutions. Training a detection model of fraudulent transactions requires
not only secured transactions but also the private account activities of those
involved in each transaction from corresponding bank systems. The distributed
nature of both samples and features prevents most existing learning systems
from being directly adopted to handle the data mining task. In this paper, we
collectively address these challenges by proposing a hybrid federated learning
system that offers secure and privacy-aware learning and inference for
financial crime detection. We conduct extensive empirical studies to evaluate
the proposed framework's detection performance and privacy-protection
capability, as well as its robustness against common malicious attacks on
collaborative learning. We release our source code at
https://github.com/illidanlab/HyFL. Comment: PETs prize challenge versio
Applications of Federated Learning in Smart Cities: Recent Advances, Taxonomy, and Open Challenges
Federated learning plays an important role in the development of smart cities.
With the development of big data and artificial intelligence, data privacy
protection has become a problem in this process, and federated learning is
capable of solving it. Starting from the current developments of federated
learning and its applications in various fields, we conduct a comprehensive
investigation. This paper summarizes the latest research on the application of
federated learning in various fields of smart cities, giving an in-depth view
of the current development of federated learning in the Internet of Things,
transportation, communications, finance, medical, and other fields. Before
that, we introduce the background, definition, and key technologies of
federated learning. Furthermore, we review the key technologies and the latest
results. Finally, we discuss the future applications and research directions
of federated learning in smart cities.
Privacy-Preserving Federated Learning over Vertically and Horizontally Partitioned Data for Financial Anomaly Detection
The effective detection of evidence of financial anomalies requires
collaboration among multiple entities who own a diverse set of data, such as a
payment network system (PNS) and its partner banks. Trust among these financial
institutions is limited by regulation and competition. Federated learning (FL)
enables entities to collaboratively train a model when data is either
vertically or horizontally partitioned across the entities. However, in
real-world financial anomaly detection scenarios, the data is partitioned both
vertically and horizontally and hence it is not possible to use existing FL
approaches in a plug-and-play manner.
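To make the two kinds of partitioning concrete, here is a minimal Python sketch of the data layout described above. All identifiers, field names, and values are hypothetical toy data and are not taken from the paper. The plaintext join is shown only for illustration; it is precisely the step the parties cannot perform directly when raw data may not be shared.

```python
# Toy data only: every identifier, field, and value here is hypothetical.

# Held by the payment network system (PNS): transaction-level features.
pns_transactions = [
    {"tx_id": 1, "sender": "A1", "receiver": "B7", "amount": 250.0},
    {"tx_id": 2, "sender": "B7", "receiver": "A2", "amount": 9900.0},
]

# Held by the banks: account-level features. Each bank stores the same
# schema for a disjoint set of accounts (horizontal partitioning).
bank_accounts = {
    "bank_a": {"A1": {"flagged": False}, "A2": {"flagged": True}},
    "bank_b": {"B7": {"flagged": False}},
}


def lookup_account(account_id):
    """Find the bank holding an account and return (bank_name, features)."""
    for bank_name, accounts in bank_accounts.items():
        if account_id in accounts:
            return bank_name, accounts[account_id]
    raise KeyError(account_id)


def full_feature_row(tx):
    """Join PNS features with bank features for one transaction.

    The features of a single sample are split between the PNS and the
    banks (vertical partitioning). This plaintext join only makes the
    layout concrete; a privacy-preserving system must avoid it.
    """
    sender_bank, sender = lookup_account(tx["sender"])
    receiver_bank, receiver = lookup_account(tx["receiver"])
    return {
        "amount": tx["amount"],
        "sender_bank": sender_bank,
        "receiver_bank": receiver_bank,
        "sender_flagged": sender["flagged"],
        "receiver_flagged": receiver["flagged"],
    }


row = full_feature_row(pns_transactions[1])
print(row)
```

A single training row here needs features from the PNS and from two different banks, which is why a purely horizontal or purely vertical FL protocol does not fit as-is.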
Our novel solution, PV4FAD, combines fully homomorphic encryption (HE),
secure multi-party computation (SMPC), differential privacy (DP), and
randomization techniques to balance privacy and accuracy during training and to
prevent inference threats at model deployment time. Our solution provides input
privacy through HE and SMPC, and output privacy against inference time attacks
through DP. Specifically, we show that, in the honest-but-curious threat model,
banks do not learn any sensitive features about PNS transactions, and the PNS
does not learn any information about the banks' dataset but only learns
prediction labels. We also develop and analyze a DP mechanism to protect output
privacy during inference. Our solution generates high-utility models by
significantly reducing the per-bank noise level while satisfying distributed
DP. To ensure high accuracy, our approach produces an ensemble model, in
particular, a random forest. This enables us to take advantage of the
well-known properties of ensembles to reduce variance and increase accuracy.
Our solution won second prize in the first phase of the U.S. Privacy Enhancing
Technologies (PETs) Prize Challenge.
Comment: Prize Winner in the U.S. Privacy Enhancing Technologies (PETs) Prize
Challeng
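As a rough illustration of what output privacy through DP can look like, the sketch below applies the generic textbook Laplace mechanism to a simple counting query. It is not the PV4FAD mechanism, which the abstract does not specify. The function names, the choice of query, and the sensitivity of 1 are assumptions made for this example.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def noisy_count(flags, epsilon, rng):
    """Release the number of True flags with epsilon-DP Laplace noise.

    Assumption: adding or removing one record changes the count by at
    most 1, so the sensitivity of the query is 1 and the noise scale
    is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    sensitivity = 1.0
    return sum(flags) + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)  # fixed seed so the example is repeatable
flags = [True, False, True, True, False]
print(noisy_count(flags, epsilon=0.5, rng=rng))
```

A smaller `epsilon` gives stronger privacy and a noisier answer, which is the privacy-accuracy balance the abstract refers to.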
The Future of Cybercrime: AI and Emerging Technologies Are Creating a Cybercrime Tsunami
This paper reviews the impact of AI and emerging technologies on the future of cybercrime and the necessary strategies to combat it effectively. Society faces a pressing challenge as cybercrime proliferates through AI and emerging technologies. At the same time, law enforcement and regulators struggle to keep up. Our primary challenge is raising awareness as cybercrime operates within a distinct criminal ecosystem. We explore the hijacking of emerging technologies by criminals (CrimeTech) and their use in illicit activities, along with the tools and processes (InfoSec) to protect against future cybercrime. We also explore the role of AI and emerging technologies (DeepTech) in supporting law enforcement, regulation, and legal services (LawTech).
Equipping Federated Graph Neural Networks with Structure-aware Group Fairness
Graph Neural Networks (GNNs) have been widely used for various types of graph
data processing and analytical tasks in different domains. Training GNNs over
centralized graph data can be infeasible due to privacy concerns and regulatory
restrictions. Thus, federated learning (FL) becomes a trending solution to
address this challenge in a distributed learning paradigm. However, as GNNs may
inherit historical bias from training data and lead to discriminatory
predictions, the bias of local models can be easily propagated to the global
model in distributed settings. This poses a new challenge in mitigating bias in
federated GNNs. To address this challenge, we propose F2GNN, a Fair
Federated Graph Neural Network, that enhances group fairness of federated GNNs.
As bias can be sourced from both data and learning algorithms, F2GNN
aims to mitigate both types of bias under federated settings. First, we provide
theoretical insights on the connection between data bias in a training graph
and statistical fairness metrics of the trained GNN models. Based on the
theoretical analysis, we design F2GNN, which contains two key
components: a fairness-aware local model update scheme that enhances group
fairness of the local models on the client side, and a fairness-weighted global
model update scheme that takes both data bias and fairness metrics of local
models into consideration in the aggregation process. We evaluate
F2GNN empirically against a number of baseline methods, and
demonstrate that F2GNN outperforms these baselines in terms of both
fairness and model accuracy.
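The idea of a fairness-weighted global update can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's aggregation rule. It uses statistical parity difference as the fairness metric and a hypothetical softmax-style weighting with a `temperature` parameter, so that clients with a larger fairness gap contribute less to the averaged parameters.

```python
import math


def statistical_parity_difference(preds, groups):
    """Return |P(pred=1 | group=0) - P(pred=1 | group=1)| for binary inputs."""
    rates = []
    for g in (0, 1):
        members = [p for p, s in zip(preds, groups) if s == g]
        if not members:
            raise ValueError("each group needs at least one member")
        rates.append(sum(members) / len(members))
    return abs(rates[0] - rates[1])


def fairness_weighted_average(client_params, client_gaps, temperature=1.0):
    """Average client parameter vectors, down-weighting unfair clients.

    Hypothetical rule: weight_k is proportional to exp(-gap_k / temperature).
    """
    raw = [math.exp(-gap / temperature) for gap in client_gaps]
    total = sum(raw)
    weights = [r / total for r in raw]
    dim = len(client_params[0])
    return [
        sum(w * params[i] for w, params in zip(weights, client_params))
        for i in range(dim)
    ]


# Two toy clients: the second one has a larger fairness gap.
gap_a = statistical_parity_difference([1, 0, 1, 0], [0, 0, 1, 1])
gap_b = statistical_parity_difference([1, 1, 0, 0], [0, 0, 1, 1])
print(fairness_weighted_average([[1.0, 2.0], [3.0, 4.0]], [gap_a, gap_b]))
```

With equal gaps the rule reduces to a plain average. A lower `temperature` penalises unfair clients more sharply.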
Starlit: Privacy-Preserving Federated Learning to Enhance Financial Fraud Detection
Federated Learning (FL) is a data-minimization approach enabling
collaborative model training across diverse clients with local data, avoiding
direct data exchange. However, state-of-the-art FL solutions to identify
fraudulent financial transactions exhibit a subset of the following
limitations. They (1) lack a formal security definition and proof, (2) assume
prior freezing of suspicious customers' accounts by financial institutions
(limiting the solutions' adoption), (3) scale poorly, involving either O(n^2)
computationally expensive modular exponentiations (where n is the total number
of financial institutions) or highly inefficient fully homomorphic encryption,
(4) assume the parties have already completed the identity alignment phase,
hence excluding it from the implementation, performance evaluation, and
security analysis, and (5) struggle to resist clients' dropouts. This work
introduces Starlit, a novel scalable privacy-preserving FL mechanism that
overcomes these limitations. It has various applications, such as enhancing
financial fraud detection, mitigating terrorism, and enhancing digital health.
We implemented Starlit and conducted a thorough performance analysis using
synthetic data from a key player in global financial transactions. The
evaluation indicates Starlit's scalability, efficiency, and accuracy.
A TAXONOMY OF MACHINE LEARNING-BASED FRAUD DETECTION SYSTEMS
As fundamental changes in information systems drive digitalization, the heavy reliance on computers today significantly increases the risk of fraud. Existing literature promotes machine learning as a potential solution to the problem of fraud detection, as it is able to detect patterns in large datasets efficiently. However, there is a lack of clarity and awareness about which components and functionalities of machine learning-based fraud detection systems exist and how these systems can be classified consistently. We draw on 54 identified relevant machine learning-based fraud detection systems to address this research gap and develop a taxonomic scheme. By deriving three archetypes of machine learning-based fraud detection systems, the taxonomy paves the way for research and practice to understand and advance fraud detection knowledge to combat fraud and abuse.