157 research outputs found
Biosignal Generation and Latent Variable Analysis with Recurrent Generative Adversarial Networks
The effectiveness of biosignal generation and data augmentation with
biosignal generative models based on generative adversarial networks (GANs),
which are a type of deep learning technique, was demonstrated in our previous
paper. GAN-based generative models only learn the projection between a random
distribution as input data and the distribution of training data. Therefore, the
relationship between input and generated data is unclear, and the
characteristics of the data generated from this model cannot be controlled.
This study proposes a method for generating time-series data based on GANs and
explores its ability to generate biosignals with certain classes and
characteristics. Moreover, in the proposed method, latent variables are
analyzed using canonical correlation analysis (CCA) to represent the
relationship between input and generated data as canonical loadings. Using
these loadings, we can control the characteristics of the data generated by the
proposed method. The influence of class labels on generated data is analyzed by
feeding the data interpolated between two class labels into the generator of
the proposed GANs. The CCA of the latent variables is shown to be an effective
method of controlling the generated data characteristics. We are able to model
the distribution of the time-series data without requiring domain-dependent
knowledge using the proposed method. Furthermore, it is possible to control the
characteristics of these data by analyzing the model trained using the proposed
method. To the best of our knowledge, this work is the first to generate
biosignals using GANs while controlling the characteristics of the generated
data.
Data Augmentation for Time-Series Classification: An Extensive Empirical Study and Comprehensive Survey
Data Augmentation (DA) has emerged as an indispensable strategy in Time
Series Classification (TSC), primarily due to its capacity to amplify training
samples, thereby bolstering model robustness, diversifying datasets, and
curtailing overfitting. However, the current landscape of DA in TSC is plagued
with fragmented literature reviews, nebulous methodological taxonomies,
inadequate evaluative measures, and a dearth of accessible, user-oriented
tools. In light of these challenges, this study embarks on an exhaustive
dissection of DA methodologies within the TSC realm. Our initial approach
involved an extensive literature review spanning a decade, revealing that
contemporary surveys scarcely capture the breadth of advancements in DA for
TSC, prompting us to meticulously analyze over 100 scholarly articles to
distill more than 60 unique DA techniques. This rigorous analysis precipitated
the formulation of a novel taxonomy, purpose-built for the intricacies of DA in
TSC, categorizing techniques into five principal echelons:
Transformation-Based, Pattern-Based, Generative, Decomposition-Based, and
Automated Data Augmentation. Our taxonomy promises to serve as a robust
navigational aid for scholars, offering clarity and direction in method
selection. Addressing the conspicuous absence of holistic evaluations for
prevalent DA techniques, we executed an all-encompassing empirical assessment,
wherein upwards of 15 DA strategies were subjected to scrutiny across 8 UCR
time-series datasets, employing ResNet and a multi-faceted evaluation paradigm
encompassing Accuracy, Method Ranking, and Residual Analysis, yielding a
benchmark accuracy of 88.94 ± 11.83%. Our investigation underscored the
inconsistent efficacies of DA techniques, with…
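The Transformation-Based echelon of the taxonomy is the easiest to illustrate. The sketch below is a generic, assumed rendering of three common transformations (jittering, per-channel scaling, window slicing), not code from the surveyed papers:

```python
# Minimal sketches of three transformation-based DA techniques for a
# (timesteps, channels) series; pure NumPy, illustrative only.
import numpy as np

rng = np.random.default_rng(42)

def jitter(x, sigma=0.03):
    """Add i.i.d. Gaussian noise to every time step."""
    return x + rng.normal(0.0, sigma, size=x.shape)

def scale(x, sigma=0.1):
    """Multiply the whole series by one random factor per channel."""
    factor = rng.normal(1.0, sigma, size=(1, x.shape[1]))
    return x * factor

def window_slice(x, ratio=0.9):
    """Crop a random contiguous window, then stretch it back to full length."""
    n = x.shape[0]
    win = int(n * ratio)
    start = rng.integers(0, n - win + 1)
    sliced = x[start:start + win]
    idx = np.linspace(0, win - 1, n)
    return np.stack([np.interp(idx, np.arange(win), sliced[:, c])
                     for c in range(x.shape[1])], axis=1)

x = np.sin(np.linspace(0, 4 * np.pi, 128))[:, None]  # one-channel toy series
augmented = [jitter(x), scale(x), window_slice(x)]
print([a.shape for a in augmented])  # all (128, 1)
```

Each transform preserves the series length and label, which is why methods of this echelon are typically the default baseline in TSC augmentation studies.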
Deep Generative Models: The winning key for large and easily accessible ECG datasets?
Large high-quality datasets are essential for building powerful artificial intelligence (AI) algorithms capable of supporting advancement in cardiac clinical research. However, researchers working with electrocardiogram (ECG) signals struggle to access or to build such datasets. The aim of the present work is to shed light on a potential solution to address the lack of large and easily accessible ECG datasets. Firstly, the main causes of such a lack are identified and examined. Afterward, the potential and limitations of cardiac data generation via deep generative models (DGMs) are deeply analyzed. These very promising algorithms have been found capable not only of generating large quantities of ECG signals but also of supporting data anonymization processes, to simplify data sharing while respecting patients' privacy. Their application could help research progress and cooperation in the name of open science. However, several aspects, such as a standardized synthetic data quality evaluation and algorithm stability, need to be explored further.
Wearable Data Generation Using Time-Series Generative Adversarial Networks for Hydration Monitoring
Collection of biosignal data from wearable devices for machine learning tasks can be expensive and time-consuming and may violate privacy policies and regulations. Successful and accurate generation of these signals can help in many wearable-device applications as well as in overcoming the privacy concerns associated with healthcare data. Generative adversarial networks (GANs) have been used successfully to generate images in data-limited situations. Using GANs to generate other types of data has been actively researched in the last few years. In this paper, we investigate the possibility of using a time-series GAN (TimeGAN) to generate wearable-device data for a hydration monitoring task that predicts the last drinking time of a user. Challenges encountered in biosignal generation and state-of-the-art methods for evaluating the generated signals are discussed. Results show the applicability of TimeGAN to this task based on quantitative and visual qualitative metrics. Limitations in the quality of the generated signals are highlighted, along with suggested ways for improvement.
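One widely used visual qualitative check of the kind the abstract mentions is to project real and synthetic windows into two dimensions and see whether the clouds overlap. The sketch below is an assumed, generic version of that check with stand-in data, using a NumPy-only PCA:

```python
# Visual-evaluation sketch: project real and synthetic time-series windows
# onto the top-2 principal components of the real data. Overlapping clouds
# suggest the generator matches the real distribution. Data are stand-ins.
import numpy as np

rng = np.random.default_rng(7)

def windows(signal, length=24, count=200):
    """Sample `count` random contiguous windows from a 1-D signal."""
    starts = rng.integers(0, signal.size - length, size=count)
    return np.stack([signal[s:s + length] for s in starts])

t = np.linspace(0, 100, 5000)
real = windows(np.sin(t) + 0.05 * rng.normal(size=t.size))
fake = windows(np.sin(t + 0.3) + 0.05 * rng.normal(size=t.size))

# PCA via SVD of the centered real windows; project both sets into 2-D.
mu = real.mean(axis=0)
_, _, Vt = np.linalg.svd(real - mu, full_matrices=False)
real_2d = (real - mu) @ Vt[:2].T
fake_2d = (fake - mu) @ Vt[:2].T
print(real_2d.shape, fake_2d.shape)  # (200, 2) (200, 2)
```

In practice the two point clouds would be scatter-plotted together; t-SNE is often used in place of PCA for the same purpose.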
Data Augmentation techniques in time series domain: A survey and taxonomy
With the latest advances in Deep Learning-based generative models, it has not
taken long to exploit their remarkable performance in the area of time series.
Deep neural networks used for time series depend heavily on the size and
consistency of the datasets used in training. Such data are rarely abundant in
the real world, where they are usually limited and subject to constraints that
must be guaranteed. Therefore, an effective way to increase the amount of data
is to use Data Augmentation techniques, either by adding noise or permutations,
or by generating new synthetic data. This work
systematically reviews the current state-of-the-art in the area to provide an
overview of all available algorithms and proposes a taxonomy of the most
relevant research. The efficiency of the different variants is evaluated as a
central part of the process, along with the metrics used to measure performance,
and the main problems concerning each model are analysed. The ultimate aim of
this study is to summarize the evolution and performance of the approaches that
produce the best results, in order to guide future researchers in this field.
Implementation of Synthesize GAN Model to Detect Outlier in National Stock Exchange Time Series Multivariate Data
This research work explores a novel approach for identifying outliers in stock-related multivariate time-series datasets using Generative Adversarial Networks (GANs). The proposed framework harnesses the power of GANs to create synthetic data points that replicate the statistical characteristics of genuine stock-related time series. The use of GANs to generate tabular data has become increasingly important in a number of industries, including banking, healthcare, and data privacy. The process of synthesizing tabular data with GANs is also described in this paper. It involves several critical steps, including data collection, preprocessing, and exploration, as well as the design and training of the generator and discriminator networks. While the discriminator separates genuine samples from synthetic ones, the generator is in charge of producing synthetic data. Generating high-quality tabular data with GANs is a complex task, but it has the potential to facilitate data generation in various domains while preserving data privacy and integrity. The results of the experiments confirm that the GAN framework is useful for detecting outliers, demonstrating its proficiency in identifying outliers within stock-related time-series data. For comparison, our proposed work also examines statistical and machine learning models in related application fields.
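A common way a trained GAN is turned into an outlier detector is to score each point with the discriminator and flag the least "genuine"-looking ones. The sketch below illustrates only that scoring step; the discriminator here is a hypothetical stand-in (a Gaussian likelihood), not a trained network and not the paper's model:

```python
# Outlier-flagging sketch: rank points by a discriminator-style realness
# score and flag the lowest ones. The "discriminator" is a stand-in that
# mimics a network trained on data centered at 0 with unit scale.
import numpy as np

rng = np.random.default_rng(1)

def discriminator(x, mu=0.0, sigma=1.0):
    """Stand-in: higher score for points resembling the training data."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2)

normal = rng.normal(0, 1, size=200)        # inliers
outliers = np.array([6.0, -7.5])           # injected anomalies
data = np.concatenate([normal, outliers])

scores = discriminator(data)
threshold = np.quantile(scores, 0.01)      # flag the lowest 1% of scores
flagged = np.where(scores <= threshold)[0]
print(flagged)  # indices 200 and 201 (the injected outliers) are among these
```

With a real GAN the scoring function would be the trained discriminator's output (often combined with a reconstruction error from the generator's latent space), but the thresholding logic is the same.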
If You Like It, GAN It. Probabilistic Multivariate Times Series Forecast With GAN
The contribution of this paper is two-fold. First, we present ProbCast, a
novel probabilistic model for multivariate time-series forecasting. We employ a
conditional GAN framework to train our model with adversarial training. Second,
we propose a framework that lets us transform a deterministic model into a
probabilistic one with improved performance. The motivation of the framework is
to either transform existing highly accurate point forecast models to their
probabilistic counterparts, or to train GANs stably by selecting the
architecture of the GAN's components carefully and efficiently. We conduct
experiments on two publicly available datasets, namely an electricity
consumption dataset and an exchange-rate dataset. The results of the
experiments demonstrate the remarkable performance of our model as well as the
successful application of our proposed framework.
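The way a conditional GAN like ProbCast yields a *probabilistic* forecast can be sketched generically: feed the same history with many noise draws and read quantiles off the sampled futures. The generator below is a hypothetical stand-in, not the ProbCast architecture:

```python
# Probabilistic-forecast sketch: sample a conditional generator many times
# for one history, then summarize the sample cloud with quantiles.
import numpy as np

rng = np.random.default_rng(3)

def generator(history, z):
    """Stand-in conditional generator: last value plus noise-driven drift."""
    return history[-1] + np.cumsum(0.1 * z)

history = np.sin(np.linspace(0, 3, 48))    # observed past
horizon, n_samples = 12, 500
samples = np.stack([generator(history, rng.normal(size=horizon))
                    for _ in range(n_samples)])

# Point forecast and an 80% predictive interval from the sample cloud.
median = np.median(samples, axis=0)
lo, hi = np.quantile(samples, [0.1, 0.9], axis=0)
print(samples.shape, median.shape)  # (500, 12) (12,)
```

This is the practical payoff of a probabilistic forecaster over a point model: the same forward pass, repeated with fresh noise, gives calibrated uncertainty bands instead of a single trajectory.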