Practical Bayesian Optimization of Machine Learning Algorithms
Machine learning algorithms frequently require careful tuning of model
hyperparameters, regularization terms, and optimization parameters.
Unfortunately, this tuning is often a "black art" that requires expert
experience, unwritten rules of thumb, or sometimes brute-force search. Much
more appealing is the idea of developing automatic approaches which can
optimize the performance of a given learning algorithm to the task at hand. In
this work, we consider the automatic tuning problem within the framework of
Bayesian optimization, in which a learning algorithm's generalization
performance is modeled as a sample from a Gaussian process (GP). The tractable
posterior distribution induced by the GP leads to efficient use of the
information gathered by previous experiments, enabling optimal choices about
what parameters to try next. Here we show how the effects of the Gaussian
process prior and the associated inference procedure can have a large impact on
the success or failure of Bayesian optimization. We show that thoughtful
choices can lead to results that exceed expert-level performance in tuning
machine learning algorithms. We also describe new algorithms that take into
account the variable cost (duration) of learning experiments and that can
leverage the presence of multiple cores for parallel experimentation. We show
that these proposed algorithms improve on previous automatic procedures and can
reach or surpass human expert-level optimization on a diverse set of
contemporary algorithms including latent Dirichlet allocation, structured SVMs
and convolutional neural networks.
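The Bayesian-optimization loop described above can be sketched in a few lines: fit a Gaussian process to past evaluations, score candidates with an acquisition function such as expected improvement, and evaluate the best candidate next. This is a minimal illustrative sketch, not the paper's implementation; the RBF length-scale, the 1-D toy objective, and the grid of candidates are all hypothetical choices.

```python
import numpy as np
from scipy.stats import norm

def rbf_kernel(A, B, ls=0.2):
    # Squared-exponential (RBF) covariance between two point sets.
    d = A[:, None, :] - B[None, :, :]
    return np.exp(-0.5 * np.sum(d**2, axis=-1) / ls**2)

def gp_posterior(X, y, Xq, noise=1e-6):
    # GP posterior mean and std at query points Xq given observations (X, y).
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    Ks = rbf_kernel(Xq, X)
    mu = Ks @ np.linalg.solve(K, y)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    return mu, np.sqrt(np.clip(var, 1e-12, None))

def expected_improvement(mu, sigma, best):
    # EI for minimization: expected reduction below the incumbent `best`.
    z = (best - mu) / sigma
    return (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

def f(x):
    # Hypothetical 1-D "validation loss" standing in for a real training run.
    return np.sin(3 * x[0]) + x[0]**2

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(3, 1))       # a few initial random evaluations
y = np.array([f(x) for x in X])
grid = np.linspace(-1, 1, 200)[:, None]   # candidate hyperparameter values
for _ in range(10):
    mu, sigma = gp_posterior(X, y, grid)
    x_next = grid[np.argmax(expected_improvement(mu, sigma, y.min()))]
    X = np.vstack([X, x_next])
    y = np.append(y, f(x_next))
print("best x:", X[np.argmin(y)], "best value:", y.min())
```

The tractable GP posterior is what makes each "what to try next" decision cheap relative to the cost of an actual training run.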
Training Restricted Boltzmann Machines on Word Observations
The restricted Boltzmann machine (RBM) is a flexible tool for modeling
complex data; however, there have been significant computational difficulties in
using RBMs to model high-dimensional multinomial observations. In natural
language processing applications, words are naturally modeled by K-ary discrete
distributions, where K is determined by the vocabulary size and can easily be
in the hundreds of thousands. The conventional approach to training RBMs on
word observations is limited because it requires sampling the states of K-way
softmax visible units during block Gibbs updates, an operation that takes time
linear in K. In this work, we address this issue by employing a more general
class of Markov chain Monte Carlo operators on the visible units, yielding
updates with computational complexity independent of K. We demonstrate the
success of our approach by training RBMs on hundreds of millions of word
n-grams using larger vocabularies than previously feasible and using the
learned features to improve performance on chunking and sentiment
classification tasks, achieving state-of-the-art results on the latter.
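The paper's Metropolis-Hastings operators are not reproduced here, but the key ingredient behind K-independent updates can be illustrated: Walker's alias method draws from a fixed K-way categorical in O(1) per sample after an O(K) one-time setup, whereas naive softmax sampling costs O(K) on every draw. The sketch below is a standard textbook construction, not the authors' exact proposal distribution.

```python
import numpy as np

def build_alias(probs):
    # Walker's alias method: O(K) preprocessing of a categorical distribution
    # into per-bucket acceptance probabilities and alias targets.
    K = len(probs)
    prob = np.array(probs, dtype=float) * K
    alias = np.zeros(K, dtype=int)
    small = [i for i, p in enumerate(prob) if p < 1.0]
    large = [i for i, p in enumerate(prob) if p >= 1.0]
    while small and large:
        s, l = small.pop(), large.pop()
        alias[s] = l
        prob[l] -= 1.0 - prob[s]       # donate mass from the large bucket
        (small if prob[l] < 1.0 else large).append(l)
    return prob, alias

def alias_draw(prob, alias, rng):
    # O(1) sample, independent of the vocabulary size K.
    i = rng.integers(len(prob))
    return i if rng.random() < prob[i] else alias[i]

rng = np.random.default_rng(0)
p = np.array([0.5, 0.3, 0.1, 0.1])
prob, alias = build_alias(p)
draws = [alias_draw(prob, alias, rng) for _ in range(100000)]
counts = np.bincount(draws, minlength=4)
print(counts / 100000)  # close to [0.5, 0.3, 0.1, 0.1]
```

With vocabularies in the hundreds of thousands, replacing a per-update O(K) softmax sample with O(1) draws from a precomputed proposal is what makes block Gibbs training on word observations feasible at scale.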
Study on the neuronal circuits implicated in postural tremor and hypokinesia
The effect on motor function of various tegmental lesions at the level of the pontomesencephalon in monkeys was observed. The importance of the monoaminergic mechanisms of the brainstem is discussed. The results also show the importance of the descending tegmental rubral system and the rubro-olivo-cerebellar circuit in controlling peripheral motor activity. The destruction of the sensorimotor cortex proves to be a more effective way of eliminating spontaneous or harmaline-induced tremor than the complete interruption of the pyramidal system at the level of the cerebral peduncle.
Low-cost representation for restricted Boltzmann machines
This paper presents a method for extracting a low-cost representation from restricted Boltzmann machines. The new representation can be considered a compression of the network, requiring much less storage capacity while reasonably preserving the network's feature-learning performance. We show that the compression can be done by converting the weight matrix of real numbers into a matrix of three values {-1, 0, 1} associated with a score vector of real numbers. These values are close enough to Boolean values to let us further translate the representation into logical rules. In the experiments reported in this paper, we evaluate the performance of our compression method on image datasets, obtaining promising results. Experiments on the MNIST handwritten digit classification dataset, for example, have shown that a 95% saving in memory can be achieved with no significant drop in accuracy.
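A ternarization of this kind can be sketched as follows. This is an illustrative reconstruction, not the authors' exact procedure: the per-row magnitude threshold and the choice of a per-row score are assumptions, in the spirit of converting real weights into {-1, 0, 1} plus a real-valued score vector.

```python
import numpy as np

def ternarize(W, threshold=0.7):
    # Compress a real-valued weight matrix into a {-1, 0, 1} matrix plus a
    # per-row real score. `threshold` scales a per-row mean-|w| cutoff
    # (a hypothetical choice); small weights are zeroed out.
    delta = threshold * np.abs(W).mean(axis=1, keepdims=True)
    T = np.where(W > delta, 1, np.where(W < -delta, -1, 0)).astype(np.int8)
    mask = T != 0
    # Score = mean magnitude of the surviving weights in each row.
    score = np.abs(W * mask).sum(axis=1) / np.maximum(mask.sum(axis=1), 1)
    return T, score

def reconstruct(T, score):
    # Approximate the original weights from the compressed form.
    return T * score[:, None]

rng = np.random.default_rng(1)
W = rng.normal(size=(4, 8))      # stand-in for trained RBM weights
T, score = ternarize(W)
W_hat = reconstruct(T, score)
print(T.dtype, np.unique(T))     # int8 entries drawn from {-1, 0, 1}
```

Storing an `int8` (or 2-bit) matrix plus one float per row is where the memory saving comes from, and the three-valued entries map naturally onto the positive/negative/absent literals of a logical rule.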
Analysis of charged particle emission sources and coalescence in E/A = 61 MeV Ar + Al, Sn and Sn collisions
Single-particle kinetic energy spectra and two-particle small-angle
correlations of protons, deuterons and tritons have been measured
simultaneously in 61A MeV Ar + Al, Sn and Sn collisions. Characteristics of the
emission sources have been derived from a "source identification plot",
constructed from the single-particle invariant spectra, and compared to the
complementary results from two-particle correlation functions. Furthermore, the
source identification plot has been used to determine the conditions under
which the coalescence mechanism can be applied for composite particles. In our
data, this is the case only for the Ar + Al reaction, where protons, deuterons
and tritons are found to originate from a common source of emission (from the
overlap region between target and projectile). In this case, the coalescence
model parameter, the radius of the complex-particle emission source in momentum
space, has been analyzed.
Comment: 20 pages, 5 figures, submitted to Nuclear Physics
A Quasi-Classical Model of Intermediate Velocity Particle Production in Asymmetric Heavy Ion Reactions
The particle emission at intermediate velocities in mass-asymmetric reactions
is studied within the framework of classical molecular dynamics. Two reactions
in the Fermi energy domain were modeled, Ni+C and Ni+Au at 34.5 MeV/nucleon.
The availability of microscopic correlations at all times allowed a detailed
study of the fragment formation process. Special attention was paid to the
physical origin of fragments and emission timescales, which allowed us to
disentangle the different processes involved in mid-rapidity particle
production. Consequently, a clear distinction between a prompt pre-equilibrium
emission and a delayed aligned asymmetric breakup of the heavier partner of the
reaction was achieved.
Comment: 8 pages, 7 figures. Final version: figures were redesigned, and a new
section discussing the role of Coulomb in IMF production was included.
Excitation-emission characterization of ICG in biologically relevant solutions
Signal Intensity Analysis and Optimization for in Vivo Imaging of Cherenkov and Excited Luminescence.
During external beam radiotherapy (EBRT), in vivo Cherenkov optical emissions can be used as a dosimetry tool or to excite luminescence, termed Cherenkov-excited luminescence (CEL), with microsecond-level time-gated cameras. The goal of this work was to develop a complete theoretical foundation for the detectable signal strength, in order to provide guidance on the limits of detection and on optimizing near-real-time imaging. The key parameters affecting photon production, propagation and detection were considered, and experimental validation with both tissue phantoms and a murine model is shown. Both the theoretical analysis and experimental data indicate that the detection level is near a single photon per pixel for the detection geometry and frame rates commonly used, with the strongest factor being the signal decrease with the square of the distance from tissue to camera. Experimental data demonstrate how the SNR improves with increasing integration time, but only up to the point where the dominance of camera read noise is overcome by stray photon noise that cannot be suppressed. For the current camera in a fixed geometry, the signal-to-background ratio limits the detection of light signals, and the observed in vivo Cherenkov emission is on the order of 100× stronger than CEL signals. As a result, imaging signals from depths < 15 mm is reasonable for Cherenkov light, and depths < 3 mm is reasonable for CEL imaging. The current investigation modeled Cherenkov and CEL imaging of two oxygen-sensing phosphorescent compounds, but the modularity of the code allows for easy comparison of different agents or alternative cameras, geometries or tissues.
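The qualitative behavior described above, SNR growing with integration time until photon shot noise rather than read noise dominates, and signal falling with the square of the camera distance, follows from a standard per-pixel noise model. The sketch below is a generic illustration, not the paper's full model, and every numerical parameter in it is hypothetical.

```python
import numpy as np

def snr(signal_rate, background_rate, read_noise, t, distance, ref_distance=1.0):
    # Per-pixel SNR for a time-gated camera (illustrative model).
    # Detected counts fall with the square of tissue-to-camera distance.
    scale = (ref_distance / distance) ** 2
    S = signal_rate * t * scale        # signal counts in integration time t
    B = background_rate * t * scale    # stray/background counts
    # Shot noise on signal + background, plus camera read noise, in quadrature.
    return S / np.sqrt(S + B + read_noise**2)

t = np.array([0.01, 0.1, 1.0, 10.0])   # integration times (s), hypothetical
print(snr(signal_rate=50.0, background_rate=500.0, read_noise=10.0,
          t=t, distance=1.0))
```

In this model SNR rises roughly linearly with t while read noise dominates, then only as the square root of t once shot noise takes over, which matches the diminishing returns of longer integration noted in the abstract.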
Variational Deep Semantic Hashing for Text Documents
As the amount of textual data has been rapidly increasing over the past
decade, efficient similarity search methods have become a crucial component of
large-scale information retrieval systems. A popular strategy is to represent
original data samples by compact binary codes through hashing. A spectrum of
machine learning methods have been utilized, but they often lack expressiveness
and flexibility in modeling to learn effective representations. The recent
advances of deep learning in a wide range of applications have demonstrated its
capability to learn robust and powerful feature representations for complex
data. In particular, deep generative models naturally combine the expressiveness
of probabilistic generative models with the high capacity of deep neural
networks, which is very suitable for text modeling. However, little work has
leveraged the recent progress in deep learning for text hashing.
In this paper, we propose a series of novel deep document generative models
for text hashing. The first proposed model is unsupervised while the second one
is supervised by utilizing document labels/tags for hashing. The third model
further considers document-specific factors that affect the generation of
words. The probabilistic generative formulation of the proposed models provides
a principled framework for model extension, uncertainty estimation, simulation,
and interpretability. Based on variational inference and reparameterization,
the proposed models can be interpreted as encoder-decoder deep neural networks
and thus they are capable of learning complex nonlinear distributed
representations of the original documents. We conduct a comprehensive set of
experiments on four public testbeds. The experimental results have demonstrated
the effectiveness of the proposed supervised learning models for text hashing.
Comment: 11 pages, 4 figures
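The encoder-decoder view with reparameterization can be sketched as follows: documents are mapped to Gaussian latent variables, samples are drawn as mu + sigma * eps so gradients can flow through the sampling step, and binary hash codes are obtained by thresholding the latent means. This is a generic illustration of the mechanism, not the paper's architecture; the toy documents, random "trained" weights, and median-threshold binarization are all assumptions, and ELBO training is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins: 6 documents as bag-of-words count vectors over a 10-word vocab.
docs = rng.poisson(1.0, size=(6, 10)).astype(float)

# Hypothetical encoder weights; in practice these come from maximizing the
# variational lower bound (ELBO) with the reparameterization trick.
W_mu = rng.normal(size=(10, 4))
W_logvar = rng.normal(size=(10, 4))

mu = docs @ W_mu                  # latent mean, one 4-dim vector per document
logvar = docs @ W_logvar
# Reparameterization: z = mu + sigma * eps with eps ~ N(0, I), making the
# stochastic layer differentiable with respect to the encoder weights.
z = mu + np.exp(0.5 * logvar) * rng.standard_normal(mu.shape)

# Hashing step: threshold each latent dimension at its median across the
# collection to obtain balanced binary codes.
codes = (mu > np.median(mu, axis=0, keepdims=True)).astype(np.uint8)

# Retrieval then reduces to Hamming distance between codes.
hamming = (codes[:, None, :] != codes[None, :, :]).sum(-1)
print(codes)
```

Compact binary codes make similarity search a matter of cheap bitwise comparisons, which is what makes hashing attractive for large-scale retrieval.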