Methodological Issues in Building, Training, and Testing Artificial Neural Networks
We review the use of artificial neural networks, particularly the feedforward
multilayer perceptron with back-propagation for training (MLP), in ecological
modelling. The major problem is overtraining on the data, or giving only vague
accounts of how it was avoided. Various methods can be used to determine when to
stop training an artificial neural network: 1) early stopping based on
cross-validation, 2) stopping after an analyst-defined error is reached or after
the error levels off, and 3) use of a test data set. We do not recommend the third
method, because the test data set is then no longer independent of model development. Many
studies used the testing data to optimize the model and training. Although this
method may give the best model for that set of data it does not give
generalizability or improve understanding of the study system. The importance
of an independent data set cannot be overemphasized: we found dramatic
differences between model accuracy estimated by bootstrapped prediction
accuracy on the training data set and accuracy assessed on an independent data
set. Comparing the artificial neural network with a general linear model (GLM)
as a standard procedure is recommended, because a GLM may perform as well as or
better than the MLP. MLP models should not be treated as black boxes; instead,
techniques such as sensitivity analyses, input variable relevances, neural
interpretation diagrams, randomization tests, and partial derivatives should be
used to make the model more transparent and to further our ecological
understanding, which is an important goal of the modelling process.
Based on our experience we discuss how to build an MLP model and how to optimize
the parameters and architecture.
Comment: 22 pages, 2 figures. Presented at ISEI3 (2002). Ecological Modelling, in press
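The first recommended stopping criterion, early stopping against a held-out validation set, can be sketched as follows. This is an illustrative toy (a one-parameter linear model fit by per-sample gradient descent); the learning rate, patience value, and function name are assumptions for the example, not the paper's setup:

```python
def early_stopping_train(train, val, lr=0.1, patience=5, max_epochs=500):
    """Fit a one-parameter model y = w*x by per-sample gradient descent,
    stopping once the validation loss has not improved for `patience` epochs.
    `train` and `val` are lists of (x, y) pairs; `val` must be held out."""
    w = 0.0
    best_w, best_val, stale = w, float("inf"), 0
    for epoch in range(max_epochs):
        for x, y in train:
            grad = 2 * (w * x - y) * x   # d/dw of the squared error
            w -= lr * grad
        val_loss = sum((w * x - y) ** 2 for x, y in val) / len(val)
        if val_loss < best_val - 1e-9:   # still improving on held-out data
            best_val, best_w, stale = val_loss, w, 0
        else:
            stale += 1
            if stale >= patience:
                break                    # early stop: validation loss leveled off
    return best_w, best_val
```

The key point the abstract makes is that the set used for this stopping decision is part of model development, so a further, fully independent test set is still needed for the final accuracy estimate.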
Classification of red blood cell shapes in flow using outlier tolerant machine learning
The manual evaluation, classification and counting of biological objects
demands an enormous expenditure of time, and the subjective human input may be
a source of error. Investigating the shape of red blood cells (RBCs) in
microcapillary Poiseuille flow, we overcome this drawback by introducing a
convolutional neural regression network for an automatic, outlier tolerant
shape classification. From our experiments we expect two stable geometries: the
so-called `slipper' and `croissant' shapes depending on the prevailing flow
conditions and the cell-intrinsic parameters. Whereas croissants mostly occur
at low shear rates, slippers evolve at higher flow velocities. With our method,
we are able to find the transition point between both `phases' of stable shapes
which is of high interest to ensuing theoretical studies and numerical
simulations. Using statistically based thresholds, from our data, we obtain
so-called phase diagrams which are compared to manual evaluations.
Prospectively, our concept allows us to perform objective analyses of
measurements for a variety of flow conditions and to receive comparable
results. Moreover, the proposed procedure enables unbiased studies on the
influence of drugs on flow properties of single RBCs and the resulting
macroscopic change of the flow behavior of whole blood.
Comment: 15 pages, published in PLoS Comput Biol, open access
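Locating the croissant-to-slipper transition from per-cell classifications can be sketched as a threshold on the slipper fraction at each flow velocity. The helper name, the 0.5 default, and the data layout are illustrative assumptions; the paper's statistically based thresholds may differ:

```python
def transition_point(samples, threshold=0.5):
    """Return the lowest flow velocity at which the fraction of cells
    classified as 'slipper' reaches `threshold`, or None if it never does.
    `samples` maps flow velocity -> list of predicted shape labels
    ('croissant' or 'slipper') for the cells observed at that velocity."""
    for v in sorted(samples):
        labels = samples[v]
        slipper_frac = sum(1 for s in labels if s == "slipper") / len(labels)
        if slipper_frac >= threshold:
            return v
    return None
```

Sweeping such fractions over all measured velocities yields the kind of phase diagram the abstract describes, which can then be compared against manual evaluations.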
PersLay: A Neural Network Layer for Persistence Diagrams and New Graph Topological Signatures
Persistence diagrams, the most common descriptors of Topological Data
Analysis, encode topological properties of data and have already proved pivotal
in many different applications of data science. However, since the (metric)
space of persistence diagrams is not a Hilbert space, they end up being difficult
inputs for most machine learning techniques. To address this concern, several
vectorization methods have been put forward that embed persistence diagrams
into either a finite-dimensional Euclidean space or an (implicit)
infinite-dimensional Hilbert space with kernels. In this work, we focus on persistence
diagrams built on top of graphs. Relying on extended persistence theory and the
so-called heat kernel signature, we show how graphs can be encoded by
(extended) persistence diagrams in a provably stable way. We then propose a
general and versatile framework for learning vectorizations of persistence
diagrams, which encompasses most of the vectorization techniques used in the
literature. We finally showcase the experimental strength of our setup by
achieving competitive scores on classification tasks on real-life graph
datasets.
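The learned vectorization described above can be illustrated with a fixed (non-learned) instance: each diagram point passes through Gaussian point transformations and the results are aggregated with a permutation-invariant operation. The function name, centers, and bandwidth here are hypothetical; in PersLay itself these quantities are trainable parameters:

```python
import numpy as np

def perslay_vectorize(diagram, centers, sigma=0.2, op="sum"):
    """Minimal PersLay-style vectorization sketch: map each diagram point
    (birth, death) through Gaussians centered at `centers`, then aggregate
    over points with a permutation-invariant op ('sum' or 'max')."""
    D = np.asarray(diagram, dtype=float)   # shape (n_points, 2)
    C = np.asarray(centers, dtype=float)   # shape (k, 2); learnable in PersLay
    diff = D[:, None, :] - C[None, :, :]   # (n_points, k, 2)
    phi = np.exp(-np.sum(diff ** 2, axis=-1) / (2 * sigma ** 2))  # (n_points, k)
    return phi.sum(axis=0) if op == "sum" else phi.max(axis=0)
```

The resulting fixed-length vector can be fed to any standard classifier, which is what makes non-Hilbert persistence diagrams usable as machine learning inputs.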
A Hybrid Neural Network and Virtual Reality System for Spatial Language Processing
This paper describes a neural network model for the study of spatial language. It deals with both geometric and functional variables, which have been shown to play an important role in the comprehension of spatial prepositions. The network is integrated with a virtual reality interface for the direct manipulation of geometric and functional factors. The training uses experimental stimuli and data. Results show that the networks reach low training and generalization errors. Cluster analyses of hidden activation show that stimuli primarily group according to extra-geometrical variables.
PHom-GeM: Persistent Homology for Generative Models
Generative neural network models, including Generative Adversarial Network
(GAN) and Auto-Encoders (AE), are among the most popular neural network models
to generate adversarial data. The GAN model is composed of a generator that
produces synthetic data and of a discriminator that discriminates between the
generator's output and the true data. AEs consist of an encoder, which maps the
model distribution to a latent manifold, and a decoder, which maps the latent
manifold back to a reconstructed distribution. However, generative models are
known to produce chaotically scattered reconstructed distributions during
training and, consequently, incomplete generated adversarial distributions.
Current distance measures fail to address this problem because they are not
able to acknowledge the shape of the data manifold, i.e. its topological
features, and the scale at which the manifold should be analyzed. We propose
Persistent Homology for Generative Models, PHom-GeM, a new methodology to
assess and measure the distribution of a generative model. PHom-GeM minimizes
an objective function between the true and the reconstructed distributions and
uses persistent homology, the study of the topological features of a space at
different spatial resolutions, to compare the nature of the true and the
generated distributions. Our experiments underline the potential of persistent
homology for Wasserstein GAN in comparison to Wasserstein AE and Variational
AE. The experiments are conducted on a real-world data set particularly
challenging for traditional distance measures and generative neural network
models. PHom-GeM is the first methodology to propose a topological distance
measure, the bottleneck distance, for generative models used to compare
adversarial samples in the context of credit card transactions.
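The bottleneck distance used above can be illustrated with a brute-force sketch for tiny diagrams. Real implementations (e.g. in GUDHI) use binary search plus bipartite matching rather than enumerating permutations; the function name and construction below are our own illustration of the standard definition:

```python
from itertools import permutations

def bottleneck(d1, d2):
    """Brute-force bottleneck distance between two small persistence diagrams,
    given as lists of (birth, death) pairs. Each point may be matched to a
    point of the other diagram or to its own projection onto the diagonal;
    the distance is the smallest achievable maximum L-infinity matching cost.
    Exponential in diagram size; for illustration only."""
    proj = lambda p: ((p[0] + p[1]) / 2.0,) * 2   # nearest diagonal point
    a = list(d1) + [proj(q) for q in d2]          # augment with projections
    b = list(d2) + [proj(p) for p in d1]
    linf = lambda p, q: max(abs(p[0] - q[0]), abs(p[1] - q[1]))

    def cost(i, j):
        if i >= len(d1) and j >= len(d2):
            return 0.0   # matching two diagonal projections is free
        return linf(a[i], b[j])

    return min(max(cost(i, j) for i, j in enumerate(perm))
               for perm in permutations(range(len(b))))
```

Because the distance compares diagrams point-set to point-set, it captures differences in topological features of the true versus generated distributions that ordinary sample-space distances miss.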