BayesNAS: A Bayesian Approach for Neural Architecture Search
One-Shot Neural Architecture Search (NAS) is a promising method to
significantly reduce search time without any separate training. It can be
treated as a Network Compression problem on the architecture parameters from an
over-parameterized network. However, there are two issues associated with most
one-shot NAS methods. First, dependencies between a node and its predecessors
and successors are often disregarded, which results in improper treatment of
zero operations. Second, pruning architecture parameters based on their
magnitude is questionable. In this paper, we employ the classic Bayesian
learning approach to alleviate these two issues by modeling architecture
parameters using hierarchical automatic relevance determination (HARD) priors.
Unlike other NAS methods, we train the over-parameterized network for only one
epoch and then update the architecture. Impressively, this enabled us to find the
architecture on CIFAR-10 within only 0.2 GPU days using a single GPU.
Competitive performance can also be achieved by transferring to ImageNet. As a
byproduct, our approach can be applied directly to compress convolutional
neural networks by enforcing structural sparsity, which achieves extremely
sparse networks without accuracy deterioration.
Comment: International Conference on Machine Learning 2019
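As a rough illustration of the idea behind ARD-based pruning (not the paper's actual HARD priors or one-shot training loop), the following sketch applies classic sparse-Bayesian-learning precision updates to a toy linear surrogate of the architecture parameters; the setup and all numbers are hypothetical:

```python
import numpy as np

# Toy sketch: per-parameter ARD precisions decide which "operations" survive,
# rather than raw weight magnitudes. This is standard sparse Bayesian
# learning (MacKay-style updates), not the paper's exact HARD formulation.

rng = np.random.default_rng(0)
n_samples, n_arch_params = 200, 12
X = rng.normal(size=(n_samples, n_arch_params))
true_w = np.zeros(n_arch_params)
true_w[:3] = [1.5, -2.0, 0.8]              # only 3 "operations" matter
y = X @ true_w + 0.1 * rng.normal(size=n_samples)

alpha = np.ones(n_arch_params)             # per-parameter prior precisions
beta = 100.0                               # noise precision (assumed known)

for _ in range(50):
    # Gaussian posterior over weights given current ARD precisions
    S = np.linalg.inv(beta * X.T @ X + np.diag(alpha))
    mu = beta * S @ X.T @ y
    # MacKay update: precisions diverge for irrelevant parameters
    gamma = 1.0 - alpha * np.diag(S)
    alpha = gamma / (mu ** 2 + 1e-12)

keep = alpha < 1e3                          # prune what the prior switched off
print("kept architecture parameters:", np.flatnonzero(keep))
```

The point of the sketch is that the prior, not a magnitude threshold, performs the pruning: irrelevant parameters acquire unbounded precision and are removed regardless of their raw values.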
Uncertainty in phylogenetic tree estimates
Estimating phylogenetic trees is an important problem in evolutionary
biology, environmental policy and medicine. Although trees are estimated, their
uncertainties are discarded by mathematicians working in tree space. Here we
explicitly model the multivariate uncertainty of tree estimates. We consider
both the cases where uncertainty information arises extrinsically (through
covariate information) and intrinsically (through the tree estimates
themselves). The importance of accounting for tree uncertainty in tree space is
demonstrated in two case studies. In the first instance, differences between
gene trees are small relative to their uncertainties, while in the second, the
differences are relatively large. Our main goal is visualization of tree
uncertainty, and we demonstrate advantages of our method with respect to
reproducibility, speed and preservation of topological differences compared to
visualization based on multidimensional scaling. The proposal highlights that
phylogenetic trees are estimated in an extremely high-dimensional space,
resulting in uncertainty information that cannot be discarded. Most
importantly, it is a method that allows biologists to diagnose whether
differences between gene trees are biologically meaningful, or due to
uncertainty in estimation.
Comment: Final version accepted to the Journal of Computational and Graphical Statistics
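For contrast with the multidimensional-scaling baseline mentioned above, here is a minimal sketch of classical MDS on a pairwise tree-distance matrix, with per-gene uncertainty summarised by the scatter of bootstrap replicates; the vectorised "trees" and Euclidean distances below are hypothetical stand-ins for real tree-space (e.g. Robinson-Foulds or geodesic) distances:

```python
import numpy as np

# Toy sketch: embed gene-tree estimates and their bootstrap replicates with
# classical MDS, then summarise each gene's uncertainty in the embedding.

rng = np.random.default_rng(1)
n_genes, n_boot = 4, 25
# Hypothetical: each row is a vectorised tree estimate plus bootstrap noise
centers = rng.normal(scale=3.0, size=(n_genes, 10))
trees = np.vstack([c + rng.normal(scale=1.0, size=(n_boot, 10)) for c in centers])

# Pairwise distances between all (gene, replicate) trees
D = np.linalg.norm(trees[:, None, :] - trees[None, :, :], axis=-1)

# Classical MDS: double-centre squared distances, take the top-2 eigenvectors
n = D.shape[0]
J = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * J @ (D ** 2) @ J
vals, vecs = np.linalg.eigh(B)
coords = vecs[:, -2:] * np.sqrt(np.maximum(vals[-2:], 0.0))

# Per-gene uncertainty: spread of its replicates in the embedding
for g in range(n_genes):
    pts = coords[g * n_boot:(g + 1) * n_boot]
    cov = np.cov(pts.T)
    print(f"gene {g}: mean {pts.mean(axis=0).round(2)}, "
          f"embedded spread {np.sqrt(np.trace(cov)):.2f}")
```

Comparing the embedded spread of each gene's replicates against the distances between gene means is a crude version of the diagnostic the abstract describes: are between-gene differences large relative to within-gene uncertainty?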
Review of Face Detection Systems Based Artificial Neural Networks Algorithms
Face detection is one of the most relevant applications of image processing
and biometric systems. Artificial neural networks (ANNs) have been used in the
fields of image processing and pattern recognition. However, there is a lack of
literature surveys that give an overview of the studies and research related to
the use of ANNs in face detection. Therefore, this paper presents a general
review of face detection studies and systems that are based on different ANN
approaches and algorithms. The strengths and limitations of these studies and
systems are also discussed.
Comment: 16 pages, 12 figures, 1 table, IJMA Journal
A radio-polarisation and rotation measure study of the Gum Nebula and its environment
The Gum Nebula is a 36-degree-wide, shell-like emission nebula at a distance of
only 450 pc. It has been hypothesised to be an old supernova remnant, a fossil
HII region, a wind-blown bubble, or a combination of multiple objects. Here we
investigate the magneto-ionic properties of the nebula using data from recent
surveys: radio-continuum data from the NRAO VLA Sky Survey (NVSS) and the
S-band Parkes All Sky Survey (S-PASS), and H-alpha data from the Southern
H-Alpha Sky Survey Atlas. We model
the upper part of the nebula as a spherical shell of ionised gas expanding into
the ambient medium. We perform a maximum-likelihood Markov chain Monte-Carlo
fit to the NVSS rotation measure data, using the H-alpha data to constrain the
average electron density in the shell. Assuming a latitudinal background
gradient in RM, we fit for the electron density, angular radius, shell
thickness, ambient magnetic field strength and warm gas filling factor of the
shell. We constrain the local, small-scale (~260 pc) pitch-angle of the
ordered Galactic magnetic field, finding a significant deviation from the
median field orientation on kiloparsec scales (~-7.2 degrees). The moderate
compression factor $X = 6.0^{+5.1}_{-2.5}$ at
the edge of the H-alpha shell implies that the 'old supernova remnant' origin
is unlikely. Our results support a model of the nebula as a HII region around a
wind-blown bubble. Analysis of depolarisation in 2.3 GHz S-PASS data is
consistent with this hypothesis and our best-fitting values agree well with
previous studies of interstellar bubbles.
Comment: 33 pages, 16 figures. Accepted by The Astrophysical Journal
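A much-simplified sketch of the kind of fit described above: Metropolis-Hastings MCMC over a spherical-shell rotation-measure model, with synthetic data standing in for the NVSS rotation measures. The geometry, noise level and all numbers are hypothetical; only the shell chord-length geometry and the standard RM relation (RM = 0.81 n_e B_par L, in rad m^-2 with n_e in cm^-3, B in uG, L in pc) are real physics:

```python
import numpy as np

rng = np.random.default_rng(2)

def chord_pc(b, r_in, r_out):
    """Path length (pc) through a spherical shell at impact parameter b (pc)."""
    outer = np.sqrt(np.clip(r_out**2 - b**2, 0.0, None))
    inner = np.sqrt(np.clip(r_in**2 - b**2, 0.0, None))
    return 2.0 * (outer - inner)

def rm_model(b, n_e, b_par, r_in=90.0, r_out=110.0):
    return 0.81 * n_e * b_par * chord_pc(b, r_in, r_out)

# Synthetic "observed" RMs toward background sources (hypothetical numbers)
b_obs = rng.uniform(0, 120, size=80)
rm_obs = rm_model(b_obs, 1.0, 4.0) + rng.normal(0, 10, 80)

def log_like(theta):
    n_e, b_par = theta
    if n_e <= 0:
        return -np.inf
    resid = rm_obs - rm_model(b_obs, n_e, b_par)
    return -0.5 * np.sum((resid / 10.0) ** 2)

# Metropolis-Hastings over (n_e, B_parallel); shell geometry held fixed
theta = np.array([0.5, 1.0])
ll = log_like(theta)
chain = []
for _ in range(20000):
    prop = theta + rng.normal(0, [0.05, 0.2])
    ll_prop = log_like(prop)
    if np.log(rng.uniform()) < ll_prop - ll:
        theta, ll = prop, ll_prop
    chain.append(theta.copy())

chain = np.array(chain[5000:])                # discard burn-in
print("posterior mean (n_e, B_par):", chain.mean(axis=0).round(3))
```

The real fit has more free parameters (shell geometry, filling factor, RM background gradient), but the structure is the same: a forward model for RM through the shell, a likelihood against the observed RM grid, and an MCMC exploration of the posterior.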
Compressing networks with super nodes
Community detection is a commonly used technique for identifying groups in a
network based on similarities in connectivity patterns. To facilitate community
detection in large networks, we recast the network to be partitioned into a
smaller network of 'super nodes', each super node comprising one or more nodes
in the original network. To define the seeds of our super nodes, we apply the
'CoreHD' ranking from dismantling and decycling. We test our approach through
the analysis of two common methods for community detection: modularity
maximization with the Louvain algorithm and maximum likelihood optimization for
fitting a stochastic block model. Our results highlight that applying community
detection to the compressed network of super nodes is significantly faster
while successfully producing partitions that are more aligned with the local
network connectivity, more stable across multiple (stochastic) runs within and
between community detection algorithms, and overlap well with the results
obtained using the full network.
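A minimal sketch of the compression step, assuming networkx (2.8 or later for louvain_communities): CoreHD-style seed selection (repeatedly remove the highest-degree node of the 2-core), assignment of every node to its nearest seed, and community detection on the resulting super-node graph. The seed count and the nearest-seed assignment rule are simplifying assumptions, not the paper's exact procedure:

```python
import networkx as nx
from networkx.algorithms.community import louvain_communities

def corehd_seeds(G, n_seeds):
    """Pick seeds by repeatedly removing the top-degree node of the 2-core."""
    H = G.copy()
    seeds = []
    while len(seeds) < n_seeds:
        core = nx.k_core(H, 2)
        if core.number_of_nodes() == 0:
            break
        v = max(core.degree, key=lambda kv: kv[1])[0]
        seeds.append(v)
        H.remove_node(v)
    return seeds

G = nx.karate_club_graph()
seeds = corehd_seeds(G, 6)

# Grow super nodes: assign each node to its nearest seed by BFS distance
dist = {s: nx.single_source_shortest_path_length(G, s) for s in seeds}
assignment = {v: min(seeds, key=lambda s: dist[s].get(v, float("inf")))
              for v in G}

# Contract to the super-node graph and detect communities on it
partition = [{v for v in G if assignment[v] == s} for s in seeds]
partition = [p for p in partition if p]
S = nx.quotient_graph(G, partition, relabel=True)
communities = louvain_communities(S, seed=0)
print(f"{G.number_of_nodes()} nodes -> {S.number_of_nodes()} super nodes, "
      f"{len(communities)} communities")
```

Any partition of the super nodes induces a partition of the original network by expanding each super node back to its members, which is where the reported speed-up comes from.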
Compression and Conditional Emulation of Climate Model Output
Numerical climate model simulations run at high spatial and temporal
resolutions generate massive quantities of data. As our computing capabilities
continue to increase, storing all of the data is not sustainable, and thus it
is important to develop methods for representing the full datasets by smaller
compressed versions. We propose a statistical compression and decompression
algorithm based on storing a set of summary statistics as well as a statistical
model describing the conditional distribution of the full dataset given the
summary statistics. The statistical model can be used to generate realizations
representing the full dataset, along with characterizations of the
uncertainties in the generated data. Thus, the methods are capable of both
compression and conditional emulation of the climate models. Considerable
attention is paid to accurately modeling the original dataset (one year of
daily mean temperature data), particularly with regard to the inherent spatial
nonstationarity in global fields, and to determining the statistics to be
stored, so that the variation in the original data can be closely captured,
while allowing for fast decompression and conditional emulation on modest
computers.
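A toy sketch of the compress-then-emulate idea: store a small set of summary statistics (here a mean field, a few leading empirical orthogonal functions, and variances) and "decompress" by simulating realizations from the implied Gaussian model. The paper's nonstationary spatial model is far richer than this stand-in, and all dimensions and numbers below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(3)
n_days, n_locs, k = 365, 500, 10

# Hypothetical daily mean temperature field (days x locations)
basis_true = rng.normal(size=(n_locs, k))
Y = (280.0 + rng.normal(size=(n_days, k)) @ basis_true.T
     + 0.5 * rng.normal(size=(n_days, n_locs)))

# --- Compression: keep mean, k EOFs, score std devs, residual std dev ---
mean = Y.mean(axis=0)
U, s, Vt = np.linalg.svd(Y - mean, full_matrices=False)
eofs = Vt[:k]                                   # k x n_locs spatial patterns
score_sd = s[:k] / np.sqrt(n_days - 1)          # per-pattern score std devs
resid = (Y - mean) - (Y - mean) @ eofs.T @ eofs
noise_sd = resid.std()

stored_floats = mean.size + eofs.size + k + 1
print(f"compression ratio ~ {Y.size / stored_floats:.1f}x")

# --- Emulation: draw realizations consistent with the stored statistics ---
def emulate(n_draws):
    scores = rng.normal(size=(n_draws, k)) * score_sd
    return mean + scores @ eofs + noise_sd * rng.normal(size=(n_draws, n_locs))

sample = emulate(100)
print("emulated field std vs original:", sample.std().round(2), Y.std().round(2))
```

The draws are not a reconstruction of the original days but realizations from the fitted conditional model, so repeated calls to emulate() also characterize the uncertainty introduced by the compression.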
Considerate Approaches to Achieving Sufficiency for ABC model selection
For nearly any challenging scientific problem, evaluation of the likelihood is
problematic, if not impossible. Approximate Bayesian computation (ABC) allows us
to employ the whole Bayesian formalism to problems where we can use simulations
from a model, but cannot evaluate the likelihood directly. When summary
statistics of real and simulated data are compared, rather than the data
directly, information is lost unless the summary statistics are sufficient.
Here we employ an information-theoretical framework that can be used to
construct (approximately) sufficient statistics by combining different
statistics until the loss of information is minimized. Such sufficient sets of
statistics are constructed for both parameter estimation and model selection
problems. We apply our approach to a range of illustrative and real-world model
selection problems.
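A minimal sketch of the greedy construction described above: keep adding the candidate statistic that most increases an estimated mutual information between parameters and the chosen set, stopping when the gain is negligible. The crude histogram MI estimator, the 1-D projection used to combine statistics, and the toy model are all hypothetical simplifications of the paper's information-theoretic framework:

```python
import numpy as np

rng = np.random.default_rng(4)
n_sim = 5000
theta = rng.uniform(0, 1, n_sim)                 # parameter draws
data = rng.normal(theta[:, None], 0.3, size=(n_sim, 20))  # simulated datasets

# Candidate summary statistics computed from each simulated dataset
stats = {
    "mean":   data.mean(axis=1),
    "median": np.median(data, axis=1),
    "std":    data.std(axis=1),
    "max":    data.max(axis=1),
}

def mutual_info(x, y, bins=20):
    """Crude plug-in MI estimate (nats) from a 2-D histogram."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    mask = pxy > 0
    return np.sum(pxy[mask] * np.log(pxy[mask] / np.outer(px, py)[mask]))

chosen, combined, best_mi = [], np.zeros(n_sim), 0.0
while len(chosen) < len(stats):
    gains = {}
    for name, s in stats.items():
        if name in chosen:
            continue
        # 1-D proxy for the joint statistic: current set plus the candidate
        z = combined + (s - s.mean()) / s.std()
        gains[name] = mutual_info(theta, z) - best_mi
    name = max(gains, key=gains.get)
    if gains[name] < 0.01:                       # negligible gain: stop
        break
    s = stats[name]
    chosen.append(name)
    combined = combined + (s - s.mean()) / s.std()
    best_mi += gains[name]
    print(f"added {name}: cumulative MI ~ {best_mi:.3f} nats")
print("approximately sufficient set:", chosen)
```

In this toy normal-mean model the sample mean is (close to) sufficient on its own, so the loop typically stops after one or two additions, which is exactly the behaviour the stopping rule is meant to capture.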