Heartbeat Anomaly Detection using Adversarial Oversampling
Cardiovascular diseases are among the most common causes of death in the
world. Prevention, knowledge of previous cases in the family, and early
detection are the best strategies for reducing this toll. Various machine
learning approaches to automatic diagnosis have been proposed for this task.
As in most health problems, an imbalance in the number of examples per class
is predominant in this problem and degrades the performance of automated
solutions. In this paper, we address the classification of heartbeat images
into different cardiovascular diseases. We propose a two-dimensional
Convolutional Neural Network for classification, trained after an InfoGAN
architecture generates synthetic images for the underrepresented classes. We
call this proposal Adversarial Oversampling and compare it with classical
oversampling methods such as SMOTE, ADASYN, and RandomOversampling. The
results show that the proposed approach improves classifier performance on
the minority classes without harming performance on the balanced classes.
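As a point of reference for the baselines named above, the following is a minimal sketch of plain random oversampling, not the paper's InfoGAN-based method. The function name `random_oversample` and the toy data are illustrative assumptions. In Adversarial Oversampling, the extra minority examples would presumably come from a trained generator rather than from duplicating existing ones.

```python
import random
from collections import Counter, defaultdict


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until every class
    has as many examples as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    target = max(len(group) for group in by_class.values())
    out_samples, out_labels = [], []
    for label, group in by_class.items():
        # Keep every original example, then draw extras with replacement.
        extras = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extras:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Toy data (made up): "normal" is the majority class, "anomaly" the minority.
X = [[0.1], [0.2], [0.3], [0.4], [0.9]]
y = ["normal", "normal", "normal", "normal", "anomaly"]
X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
```

Because the extras are exact copies, this baseline adds no new variation to the minority class, which is the limitation that synthetic-sample methods aim to address.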
Fréchet ChemNet Distance: A metric for generative models for molecules in drug discovery
The new wave of successful generative models in machine learning has
increased the interest in deep learning driven de novo drug design. However,
assessing the performance of such generative models is notoriously difficult.
Metrics that are typically used to assess the performance of such generative
models are the percentage of chemically valid molecules or the similarity to
real molecules in terms of particular descriptors, such as the partition
coefficient (logP) or druglikeness. However, method comparison is difficult
because of the inconsistent use of evaluation metrics, the necessity for
multiple metrics, and the fact that some of these measures can easily be
tricked by simple rule-based systems. We propose a novel distance measure
between two sets of molecules, called Fréchet ChemNet distance (FCD), that
can be used as an evaluation metric for generative models. The FCD is similar
to a recently established performance metric for comparing image generation
methods, the Fréchet Inception Distance (FID). Whereas the FID uses one of
the hidden layers of InceptionNet, the FCD utilizes the penultimate layer of a
deep neural network called ChemNet, which was trained to predict drug
activities. Thus, the FCD metric takes into account chemically and biologically
relevant information about molecules, and also measures the diversity of the
set via the distribution of generated molecules. The FCD's advantage over
previous metrics is that it can detect whether generated molecules are a) diverse
and have b) chemical and c) biological properties similar to those of real molecules. We
further provide an easy-to-use implementation that only requires the SMILES
representation of the generated molecules as input to calculate the FCD.
Implementations are available at: https://www.github.com/bioinf-jku/FCD
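To make the idea of a Fréchet distance between two sets of activations concrete, here is a simplified sketch. It is not the FCD implementation from the repository above. It assumes each set is summarized by a Gaussian with a diagonal covariance, in which case the squared Fréchet distance reduces to the squared difference of the means plus the squared difference of the per-dimension standard deviations. The general form uses full covariance matrices and a matrix square root. The function names and the toy vectors are illustrative assumptions; in the FCD, the input vectors would be penultimate-layer ChemNet activations.

```python
import math
from statistics import fmean, pvariance


def column_stats(activations):
    """Per-dimension mean and standard deviation of a list of vectors."""
    columns = list(zip(*activations))
    means = [fmean(col) for col in columns]
    stds = [math.sqrt(pvariance(col)) for col in columns]
    return means, stds


def frechet_distance_diagonal(acts_a, acts_b):
    """Squared Frechet distance between two Gaussians fitted to the
    inputs, under the simplifying assumption of diagonal covariances."""
    mu_a, sd_a = column_stats(acts_a)
    mu_b, sd_b = column_stats(acts_b)
    mean_term = sum((a - b) ** 2 for a, b in zip(mu_a, mu_b))
    # For diagonal covariances, Tr(C_a + C_b - 2 (C_a C_b)^(1/2))
    # equals the sum of squared differences of standard deviations.
    cov_term = sum((a - b) ** 2 for a, b in zip(sd_a, sd_b))
    return mean_term + cov_term


# Toy activation vectors (made up): set B is set A shifted by 1 in dimension 0.
acts_real = [[0.0, 0.0], [2.0, 2.0]]
acts_generated = [[1.0, 0.0], [3.0, 2.0]]
d_same = frechet_distance_diagonal(acts_real, acts_real)
d_shift = frechet_distance_diagonal(acts_real, acts_generated)
print(d_same, d_shift)
```

The mean term responds to a shift in the typical properties of the generated set, while the covariance term responds to a change in spread, which is how a single number can reflect both similarity and diversity.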