FinDiff: Diffusion Models for Financial Tabular Data Generation
The sharing of microdata, such as fund holdings and derivative instruments,
by regulatory institutions presents a unique challenge due to strict data
confidentiality and privacy regulations. These challenges often hinder the
ability of both academics and practitioners to conduct collaborative research
effectively. The emergence of generative models, particularly diffusion models,
capable of synthesizing data mimicking the underlying distributions of
real-world data presents a compelling solution. This work introduces 'FinDiff',
a diffusion model designed to generate real-world financial tabular data for a
variety of regulatory downstream tasks, for example economic scenario modeling,
stress tests, and fraud detection. The model uses embedding encodings to model
mixed modality financial data, comprising both categorical and numeric
attributes. The performance of FinDiff in generating synthetic tabular
financial data is evaluated against state-of-the-art baseline models using
three real-world financial datasets (including two publicly available datasets
and one proprietary dataset). Empirical results demonstrate that FinDiff excels
in generating synthetic tabular financial data with high fidelity, privacy, and
utility.
Comment: 9 pages, 5 figures, 3 tables, preprint version, currently under review
Unleashing the Power of Edge-Cloud Generative AI in Mobile Networks: A Survey of AIGC Services
Artificial Intelligence-Generated Content (AIGC) is an automated method for
generating, manipulating, and modifying valuable and diverse data using AI
algorithms creatively. This survey paper focuses on the deployment of AIGC
applications, e.g., ChatGPT and Dall-E, at mobile edge networks, namely mobile
AIGC networks, that provide personalized and customized AIGC services in real
time while maintaining user privacy. We begin by introducing the background and
fundamentals of generative models and the lifecycle of AIGC services at mobile
AIGC networks, which includes data collection, training, finetuning, inference,
and product management. We then discuss the collaborative cloud-edge-mobile
infrastructure and technologies required to support AIGC services and enable
users to access AIGC at mobile edge networks. Furthermore, we explore
AIGC-driven creative applications and use cases for mobile AIGC networks.
Additionally, we discuss the implementation, security, and privacy challenges
of deploying mobile AIGC networks. Finally, we highlight some future research
directions and open issues for the full realization of mobile AIGC networks.
Security and Privacy on Generative Data in AIGC: A Survey
The advent of artificial intelligence-generated content (AIGC) represents a
pivotal moment in the evolution of information technology. With AIGC, it can be
effortless to generate high-quality data that is challenging for the public to
distinguish. Nevertheless, the proliferation of generative data across
cyberspace brings security and privacy issues, including privacy leakages of
individuals and media forgery for fraudulent purposes. Consequently, both
academia and industry have begun to emphasize the trustworthiness of generative
data, successively providing a series of countermeasures for security and
privacy. In this survey, we systematically review the security and privacy of
generative data in AIGC, analyzing them for the first time from
the perspective of information security properties. Specifically, we reveal the
successful experiences of state-of-the-art countermeasures in terms of the
foundational properties of privacy, controllability, authenticity, and
compliance, respectively. Finally, we summarize the open challenges and
potential exploration directions from each of these properties.
From Generative AI to Generative Internet of Things: Fundamentals, Framework, and Outlooks
Generative Artificial Intelligence (GAI) possesses the capabilities of
generating realistic data and facilitating advanced decision-making. By
integrating GAI into modern Internet of Things (IoT), Generative Internet of
Things (GIoT) is emerging and holds immense potential to revolutionize various
aspects of society, enabling more efficient and intelligent IoT applications,
such as smart surveillance and voice assistants. In this article, we present
the concept of GIoT and conduct an exploration of its potential prospects.
Specifically, we first overview four GAI techniques and investigate promising
GIoT applications. Then, we elaborate on the main challenges in enabling GIoT
and propose a general GAI-based secure incentive mechanism framework to address
them, in which we adopt Generative Diffusion Models (GDMs) for incentive
mechanism designs and apply blockchain technologies for secure GIoT management.
Moreover, we conduct a case study on modern Internet of Vehicle traffic
monitoring, which utilizes GDMs to generate effective contracts for
incentivizing users to contribute sensing data with high quality. Finally, we
suggest several open directions worth investigating for the future popularity
of GIoT.
Byzantine Attack and Defense in Cognitive Radio Networks: A Survey
The Byzantine attack in cooperative spectrum sensing (CSS), also known as the
spectrum sensing data falsification (SSDF) attack in the literature, is one of
the key adversaries to the success of cognitive radio networks (CRNs). In the
past couple of years, the research on the Byzantine attack and defense
strategies has gained increasing attention worldwide. In this paper, we provide
a comprehensive survey and tutorial on the recent advances in the Byzantine
attack and defense for CSS in CRNs. Specifically, we first briefly present the
preliminaries of CSS for general readers, including signal detection
techniques, hypothesis testing, and data fusion. Second, we analyze the spear
and shield relation between Byzantine attack and defense from three aspects:
the vulnerability of CSS to attack, the obstacles in CSS to defense, and the
games between attack and defense. Then, we propose a taxonomy of the existing
Byzantine attack behaviors and elaborate on the corresponding attack
parameters, which determine where, who, how, and when to launch attacks. Next,
from the perspectives of homogeneous or heterogeneous scenarios, we classify
the existing defense algorithms, and provide an in-depth tutorial on the
state-of-the-art Byzantine defense schemes, commonly known as robust or secure
CSS in the literature. Furthermore, we highlight the unsolved research
challenges and depict the future research directions.
Comment: Accepted by IEEE Communications Surveys and Tutorials
Recommender Systems
The ongoing rapid expansion of the Internet greatly increases the necessity
of effective recommender systems for filtering the abundant information.
Extensive research on recommender systems is conducted by a broad range of
communities including social and computer scientists, physicists, and
interdisciplinary researchers. Despite substantial theoretical and practical
achievements, unification and comparison of different approaches are lacking,
which impedes further advances. In this article, we review recent developments
in recommender systems and discuss the major challenges. We compare and
evaluate available algorithms and examine their roles in future
developments. In addition to algorithms, physical aspects are described to
illustrate macroscopic behavior of recommender systems. Potential impacts and
future directions are discussed. We emphasize that recommendation has a great
scientific depth and combines diverse research fields, which makes it of
interest to physicists as well as interdisciplinary researchers.
Comment: 97 pages, 20 figures (To appear in Physics Reports)
Can Tabular Generative Models Generate Realistic Synthetic Near Infrared Spectroscopic Data?
In this thesis, we evaluated the performance of two generative models, Conditional Tabular Generative
Adversarial Network (CTGAN) and Tabular Variational Autoencoder (TVAE), from the
open-source library Synthetic Data Vault (SDV), for generating synthetic Near Infrared (NIR)
spectral data. The aim was to assess the viability of these models in synthetic data generation
for predicting Dry Matter Content (DMC) in the field of NIR spectroscopy. The fidelity and
utility of the synthetic data were examined through a series of benchmarks, including statistical
comparisons, dimensionality reduction, and machine learning tasks.
The results showed that while both CTGAN and TVAE could generate synthetic data with
statistical properties similar to real data, TVAE outperformed CTGAN in terms of preserving
the correlation structure of the data and the relationship between the features and the target
variable, DMC. However, the synthetic data fell short in fooling machine learning classifiers,
indicating a persisting challenge in synthetic data generation.
With respect to utility, neither the synthetic dataset produced by CTGAN nor the one produced by TVAE could serve as
a satisfactory substitute for real data in training machine learning models for predicting DMC.
Although TVAE-generated synthetic data showed some potential when used with Random Forest
(RF) and K-Nearest Neighbors (KNN) classifiers, the performance was still inadequate for
practical use.
This study offers valuable insights into the use of generative models for synthetic NIR spectral
data generation, highlighting their current limitations and potential areas for future research.