11 research outputs found
Sliced-Wasserstein Flows: Nonparametric Generative Modeling via Optimal Transport and Diffusions
By building upon the recent theory that established the connection between
implicit generative modeling (IGM) and optimal transport, in this study, we
propose a novel parameter-free algorithm for learning the underlying
distributions of complicated datasets and sampling from them. The proposed
algorithm is based on a functional optimization problem, which aims at finding
a measure that is as close as possible to the data distribution and also
expressive enough for generative modeling purposes. We formulate the problem as
a gradient flow in the space of probability measures. The connections between
gradient flows and stochastic differential equations let us develop a
computationally efficient algorithm for solving the optimization problem. We
provide a formal theoretical analysis in which we prove finite-time error guarantees
for the proposed algorithm. To the best of our knowledge, the proposed
algorithm is the first nonparametric IGM algorithm with explicit theoretical
guarantees. Our experimental results support our theory and show that our
algorithm is able to successfully capture the structure of different types of
data distributions. Comment: Published at the International Conference on Machine Learning (ICML)
201
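The title of the entry above refers to the sliced-Wasserstein distance. As a rough illustration only, the sketch below estimates that distance between two point clouds by projecting them onto random directions and comparing sorted one-dimensional projections. It is not the algorithm proposed in the paper (the gradient flow and its stochastic differential equation are not reproduced), and the function name, the `n_projections` parameter and the equal-sample-size restriction are assumptions made for this example.

```python
import numpy as np

def sliced_wasserstein_2(x, y, n_projections=200, seed=0):
    """Monte Carlo estimate of the sliced 2-Wasserstein distance.

    x, y: arrays of shape (n, d) with the same number of samples n
    (an assumption of this sketch, so that sorted samples can be paired).
    """
    rng = np.random.default_rng(seed)
    d = x.shape[1]
    # Draw random directions uniformly on the unit sphere in R^d.
    theta = rng.normal(size=(n_projections, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # Project both clouds onto every direction: shape (n, n_projections).
    x_proj = np.sort(x @ theta.T, axis=0)
    y_proj = np.sort(y @ theta.T, axis=0)
    # In one dimension, optimal transport pairs the sorted samples.
    return float(np.sqrt(np.mean((x_proj - y_proj) ** 2)))

# Usage with hypothetical data: a Gaussian cloud and a shifted copy.
rng = np.random.default_rng(1)
data = rng.normal(size=(500, 2))
shifted = data + np.array([3.0, 0.0])
print(sliced_wasserstein_2(data, data))     # identical clouds give 0.0
print(sliced_wasserstein_2(data, shifted))  # strictly positive
```

Sorting works here because the one-dimensional optimal transport plan between two empirical measures with equally many points matches them in order, which is what makes the sliced distance cheap to evaluate.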
Generative Adversarial Networks (GANs): Challenges, Solutions, and Future Directions
Generative Adversarial Networks (GANs) are a novel class of deep generative
models that has recently gained significant attention. GANs learn complex and
high-dimensional distributions implicitly over images, audio, and data.
However, there exist major challenges in training GANs, namely mode
collapse, non-convergence and instability, due to inappropriate design of
network architecture, choice of objective function and selection of optimization
algorithm. Recently, to address these challenges, several solutions for better
design and optimization of GANs have been investigated based on techniques of
re-engineered network architectures, new objective functions and alternative
optimization algorithms. To the best of our knowledge, there is no existing
survey that has particularly focused on broad and systematic developments of
these solutions. In this study, we perform a comprehensive survey of the
advancements in GAN design and optimization solutions proposed to handle GAN
challenges. We first identify key research issues within each design and
optimization technique and then propose a new taxonomy to structure solutions
by key research issues. In accordance with the taxonomy, we provide a detailed
discussion on different GAN variants proposed within each solution and their
relationships. Finally, based on the insights gained, we present the promising
research directions in this rapidly growing field. Comment: 42 pages, Figure 13, Table
Early diagnosis and personalised treatment focusing on synthetic data modelling: Novel visual learning approach in healthcare
The early diagnosis and personalised treatment of diseases are facilitated by machine learning. The quality of data has an impact on diagnosis because medical data are usually sparse, imbalanced, and contain irrelevant attributes, resulting in suboptimal diagnosis. To address the impacts of data challenges, improve resource allocation, and achieve better health outcomes, a novel visual learning approach is proposed. This study contributes to the visual learning approach by determining whether fewer or more synthetic data are required to improve the quality of a dataset, such as the number of observations and features, according to the intended personalised treatment and early diagnosis. In addition, numerous visualisation experiments are conducted, using statistical characteristics, cumulative sums, histograms, correlation matrices, root mean square error, and principal component analysis to visualise both original and synthetic data and so address the data challenges. Real medical datasets for cancer, heart disease, diabetes, cryotherapy and immunotherapy are selected as case studies. As a benchmark and point of classification comparison in terms of metrics such as accuracy, sensitivity, and specificity, several models are implemented, such as k-Nearest Neighbours and Random Forest. To simulate algorithm implementation and data, a Generative Adversarial Network is used to create and manipulate synthetic data, whilst a Random Forest is implemented to classify the data. An amendable and adaptable system is constructed by combining Generative Adversarial Network and Random Forest models. The system model presents working steps, overview and flowchart. Experiments reveal that the majority of data-enhancement scenarios allow for the application of visual learning in the first stage of data analysis as a novel approach.
To achieve meaningful adaptable synergy between appropriate quality data and optimal classification performance while maintaining statistical characteristics, visual learning provides researchers and practitioners with practical human-in-the-loop machine learning visualisation tools. Prior to implementing algorithms, the visual learning approach can be used to actualise early and personalised diagnosis. For the immunotherapy data, the Random Forest performed best, with precision, recall, f-measure, accuracy, sensitivity, and specificity of 81%, 82%, 81%, 88%, 95%, and 60%, as opposed to 91%, 96%, 93%, 93%, 96%, and 73% for synthetic data, respectively. Future studies might examine the optimal strategies to balance the quantity and quality of medical data.
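For readers unfamiliar with the metrics quoted in the abstract above, the following sketch shows how they are commonly computed from a binary confusion matrix. The counts are hypothetical and are not taken from the study. Under these textbook binary definitions recall and sensitivity coincide; the abstract reports them with different values, so the authors presumably use a different averaging scheme, which is not specified in this listing.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # identical to sensitivity in the binary case
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "sensitivity": recall,
        "specificity": specificity,
        "accuracy": accuracy,
        "f_measure": f_measure,
    }

# Hypothetical counts, for illustration only (not results from the study).
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```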