
    Practical Bayesian Optimization of Machine Learning Algorithms

    Machine learning algorithms frequently require careful tuning of model hyperparameters, regularization terms, and optimization parameters. Unfortunately, this tuning is often a "black art" that requires expert experience, unwritten rules of thumb, or sometimes brute-force search. Much more appealing is the idea of developing automatic approaches which can optimize the performance of a given learning algorithm to the task at hand. In this work, we consider the automatic tuning problem within the framework of Bayesian optimization, in which a learning algorithm's generalization performance is modeled as a sample from a Gaussian process (GP). The tractable posterior distribution induced by the GP leads to efficient use of the information gathered by previous experiments, enabling optimal choices about what parameters to try next. Here we show how the effects of the Gaussian process prior and the associated inference procedure can have a large impact on the success or failure of Bayesian optimization. We show that thoughtful choices can lead to results that exceed expert-level performance in tuning machine learning algorithms. We also describe new algorithms that take into account the variable cost (duration) of learning experiments and that can leverage the presence of multiple cores for parallel experimentation. We show that these proposed algorithms improve on previous automatic procedures and can reach or surpass human expert-level optimization on a diverse set of contemporary algorithms including latent Dirichlet allocation, structured SVMs and convolutional neural networks.
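
    As a concrete illustration of the approach described above, the following is a minimal sketch of GP-based Bayesian optimization with an expected-improvement acquisition. It is not the paper's implementation: the squared-exponential kernel, the noise level, the candidate grid, and the toy objective (standing in for a learner's validation error over a single hyperparameter) are illustrative assumptions.

    # Minimal sketch of Bayesian optimization: fit a GP to past evaluations,
    # then pick the next point by expected improvement. Kernel, noise level, and
    # the toy objective are illustrative assumptions, not the paper's setup.
    import numpy as np
    from scipy.stats import norm

    def rbf_kernel(a, b, length_scale=0.2):
        """Squared-exponential covariance between two sets of 1-D points."""
        d = a[:, None] - b[None, :]
        return np.exp(-0.5 * (d / length_scale) ** 2)

    def gp_posterior(X, y, Xs, noise=1e-6):
        """Posterior mean and standard deviation of a zero-mean GP at points Xs."""
        K = rbf_kernel(X, X) + noise * np.eye(len(X))
        Ks = rbf_kernel(X, Xs)
        Kinv = np.linalg.inv(K)
        mu = Ks.T @ Kinv @ y
        var = np.diag(rbf_kernel(Xs, Xs) - Ks.T @ Kinv @ Ks)
        return mu, np.sqrt(np.maximum(var, 1e-12))

    def expected_improvement(mu, sigma, best):
        """EI for minimization: expected amount by which a point beats the incumbent."""
        z = (best - mu) / sigma
        return (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

    def objective(x):
        # Hypothetical stand-in for a learner's validation error.
        return np.sin(3 * x) + 0.5 * x

    grid = np.linspace(0, 2, 200)                 # candidate hyperparameter values
    X = np.array([0.1, 1.9]); y = objective(X)    # two initial evaluations
    for _ in range(10):
        mu, sigma = gp_posterior(X, y, grid)
        x_next = grid[np.argmax(expected_improvement(mu, sigma, y.min()))]
        X = np.append(X, x_next); y = np.append(y, objective(x_next))
    print("best hyperparameter:", X[np.argmin(y)], "lowest value:", y.min())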

    Training Restricted Boltzmann Machines on Word Observations

    The restricted Boltzmann machine (RBM) is a flexible tool for modeling complex data; however, there have been significant computational difficulties in using RBMs to model high-dimensional multinomial observations. In natural language processing applications, words are naturally modeled by K-ary discrete distributions, where K is determined by the vocabulary size and can easily be in the hundreds of thousands. The conventional approach to training RBMs on word observations is limited because it requires sampling the states of K-way softmax visible units during block Gibbs updates, an operation that takes time linear in K. In this work, we address this issue by employing a more general class of Markov chain Monte Carlo operators on the visible units, yielding updates with computational complexity independent of K. We demonstrate the success of our approach by training RBMs on hundreds of millions of word n-grams using larger vocabularies than previously feasible and using the learned features to improve performance on chunking and sentiment classification tasks, achieving state-of-the-art results on the latter.
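
    To make the O(K) bottleneck concrete, the snippet below spells out the conventional block-Gibbs update for a single K-ary softmax visible unit, which must touch all K vocabulary entries; the paper's contribution is to replace this step with MCMC operators whose cost does not grow with K. The tiny vocabulary, weight shapes, and random initialization here are illustrative assumptions.

    # Conventional block-Gibbs resampling of one K-ary softmax visible unit in an
    # RBM, illustrating the O(K) cost discussed above; the tiny vocabulary and
    # random weights are illustrative assumptions.
    import numpy as np

    rng = np.random.default_rng(0)
    K, H = 10, 4                        # vocabulary size (tiny here) and hidden units
    W = rng.normal(0, 0.1, (K, H))      # weights between the softmax visible and hiddens
    b = np.zeros(K)                     # visible biases
    h = rng.integers(0, 2, H)           # current binary hidden state

    # Compute all K unnormalized log-probabilities, normalize, and sample one
    # word -- time and memory linear in K, which is costly when K ~ 10^5.
    logits = b + W @ h
    p = np.exp(logits - logits.max())
    p /= p.sum()
    word = rng.choice(K, p=p)
    print("resampled word index:", word)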

    Study on the neuronal circuits implicated in postural tremor and hypokinesia

    The effect of various tegmentary lesions at the level of the pontomesencephalon on motor function in monkeys was observed. The importance of the monoaminergic mechanisms of the brainstem is discussed. The results also show the importance of the descending tegmentary rubral system and the rubro-olivo-cerebellar circuit in controlling peripheral motor activity. The destruction of the sensorimotor cortex proves to be a more effective way of eliminating spontaneous or harmaline-induced tremor than the complete interruption of the pyramidal system at the level of the cerebral peduncle.

    Analysis of charged particle emission sources and coalescence in E/A = 61 MeV $^{36}$Ar + $^{27}$Al, $^{112}$Sn and $^{124}$Sn collisions

    Single-particle kinetic energy spectra and two-particle small angle correlations of protons ($p$), deuterons ($d$) and tritons ($t$) have been measured simultaneously in 61A MeV $^{36}$Ar + $^{27}$Al, $^{112}$Sn and $^{124}$Sn collisions. Characteristics of the emission sources have been derived from a ``source identification plot'' ($\beta_{source}$--$E_{CM}$ plot), constructed from the single-particle invariant spectra, and compared to the complementary results from two-particle correlation functions. Furthermore, the source identification plot has been used to determine the conditions under which the coalescence mechanism can be applied for composite particles. In our data, this is the case only for the Ar + Al reaction, where $p$, $d$ and $t$ are found to originate from a common source of emission (the overlap region between target and projectile). In this case, the coalescence model parameter $\tilde{p}_0$ -- the radius of the composite particle emission source in momentum space -- has been analyzed. Comment: 20 pages, 5 figures, submitted to Nuclear Physics.
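
    For context, coalescence analyses of this kind are usually based on the textbook ansatz below, which relates the invariant spectrum of a composite of mass number $A$ to the $A$-th power of the proton spectrum at the same velocity per nucleon; this is the standard form, not necessarily the exact parameterization fitted in the paper.

    \[
    \left. E_A \frac{d^3 N_A}{dp_A^3} \right|_{\vec{p}_A = A\vec{p}} \;=\; B_A \left( E_p \frac{d^3 N_p}{dp_p^3} \right)^{A}, \qquad B_A \propto p_0^{\,3(A-1)},
    \]

    where $A = 2$ for deuterons, $A = 3$ for tritons, and $p_0$ is the coalescence radius in momentum space, corresponding to the $\tilde{p}_0$ discussed above.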

    A Quasi-Classical Model of Intermediate Velocity Particle Production in Asymmetric Heavy Ion Reactions

    The particle emission at intermediate velocities in mass asymmetric reactions is studied within the framework of classical molecular dynamics. Two reactions in the Fermi energy domain were modeled, $^{58}$Ni+C and $^{58}$Ni+Au at 34.5 MeV/nucleon. The availability of microscopic correlations at all times allowed a detailed study of the fragment formation process. Special attention was paid to the physical origin of fragments and emission timescales, which allowed us to disentangle the different processes involved in the mid-rapidity particle production. Consequently, a clear distinction between a prompt pre-equilibrium emission and a delayed aligned asymmetric breakup of the heavier partner of the reaction was achieved. Comment: 8 pages, 7 figures. Final version: figures were redesigned, and a new section discussing the role of Coulomb in IMF production was included.
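
    As a generic illustration of the classical molecular-dynamics framework invoked above (not the interaction model or code used in the paper), the sketch below integrates Newton's equations with the velocity-Verlet scheme for a handful of particles under a placeholder Lennard-Jones potential.

    # Generic velocity-Verlet molecular-dynamics step; the Lennard-Jones pair
    # potential, particle count, and parameters are placeholders, not the
    # nuclear interaction used in the paper.
    import numpy as np

    def pair_forces(pos, eps=1.0, sigma=1.0):
        """Lennard-Jones forces between all pairs (illustrative potential only)."""
        f = np.zeros_like(pos)
        for i in range(len(pos)):
            for j in range(i + 1, len(pos)):
                r = pos[i] - pos[j]
                d2 = np.dot(r, r)
                # Force on particle i from j, derived from V = 4*eps[(s/r)^12 - (s/r)^6]
                fij = 24 * eps * (2 * (sigma**2 / d2)**6 - (sigma**2 / d2)**3) / d2 * r
                f[i] += fij
                f[j] -= fij
        return f

    def velocity_verlet(pos, vel, mass, dt, steps):
        """Integrate Newton's equations for positions/velocities of shape (N, 3)."""
        f = pair_forces(pos)
        for _ in range(steps):
            vel += 0.5 * dt * f / mass
            pos += dt * vel
            f = pair_forces(pos)
            vel += 0.5 * dt * f / mass
        return pos, vel

    rng = np.random.default_rng(1)
    # Eight particles on a loose cubic lattice, with small random velocities.
    pos = np.array([[i, j, k] for i in (0.0, 1.5) for j in (0.0, 1.5) for k in (0.0, 1.5)])
    vel = rng.normal(0, 0.05, (8, 3))
    pos, vel = velocity_verlet(pos, vel, mass=1.0, dt=1e-3, steps=200)
    print("final kinetic energy:", 0.5 * (vel**2).sum())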

    Excitation-emission characterization of ICG in biologically relevant solutions


    Signal Intensity Analysis and Optimization for In Vivo Imaging of Cherenkov and Excited Luminescence

    During external beam radiotherapy (EBRT), in vivo Cherenkov optical emissions can be used as a dosimetry tool or to excite luminescence, termed Cherenkov-excited luminescence (CEL), imaged with microsecond-level time-gated cameras. The goal of this work was to develop a complete theoretical foundation for the detectable signal strength, in order to provide guidance on the limits of detection and on how to optimize near real-time imaging. The key parameters affecting photon production, propagation and detection were considered, and experimental validation with both tissue phantoms and a murine model is shown. Both the theoretical analysis and experimental data indicate that the detection level is near a single photon per pixel for the detection geometry and frame rates commonly used, with the strongest factor being the signal decrease with the square of the distance from tissue to camera. Experimental data demonstrate how the SNR improves with increasing integration time, but only up to the point where the dominance of camera read noise is overcome by stray photon noise that cannot be suppressed. For the current camera in a fixed geometry, the signal-to-background ratio limits the detection of light signals, and the observed in vivo Cherenkov emission is on the order of 100× stronger than CEL signals. As a result, imaging signals from depths < 15 mm is reasonable for Cherenkov light, and from depths < 3 mm for CEL imaging. The current investigation modeled Cherenkov and CEL imaging of two oxygen-sensing phosphorescent compounds, but the modularity of the code allows for easy comparison of different agents or alternative cameras, geometries or tissues.
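
    To illustrate the trade-offs summarized above, the toy signal-to-noise model below combines an inverse-square distance factor with shot noise from collected signal and stray light plus per-frame camera read noise; the rates, read-noise level, and frame time are illustrative assumptions, not the calibrated values from this work.

    # Toy SNR model for time-gated luminescence imaging: signal falls off as
    # 1/distance^2, shot noise grows with collected signal and stray light, and
    # read noise accumulates with the number of frames. All numeric values are
    # illustrative assumptions.
    import numpy as np

    def snr(t_int, distance_m, signal_rate=50.0, stray_rate=5.0,
            read_noise=3.0, frame_time=1e-2):
        """Signal-to-noise ratio after integrating t_int seconds at a given distance."""
        geom = 1.0 / distance_m ** 2          # inverse-square collection factor
        s = signal_rate * geom * t_int        # signal electrons
        stray = stray_rate * t_int            # stray-light electrons (not suppressible)
        n_frames = t_int / frame_time
        noise = np.sqrt(s + stray + n_frames * read_noise ** 2)
        return s / noise

    for t in (0.1, 1.0, 10.0):
        print(f"integration {t:5.1f} s at 1 m:  SNR = {snr(t, 1.0):.2f}")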

    Variational Deep Semantic Hashing for Text Documents

    As the amount of textual data has been rapidly increasing over the past decade, efficient similarity search methods have become a crucial component of large-scale information retrieval systems. A popular strategy is to represent original data samples by compact binary codes through hashing. A spectrum of machine learning methods have been utilized, but they often lack the expressiveness and flexibility to learn effective representations. The recent advances of deep learning in a wide range of applications have demonstrated its capability to learn robust and powerful feature representations for complex data. In particular, deep generative models naturally combine the expressiveness of probabilistic generative models with the high capacity of deep neural networks, which is very suitable for text modeling. However, little work has leveraged the recent progress in deep learning for text hashing. In this paper, we propose a series of novel deep document generative models for text hashing. The first proposed model is unsupervised, while the second one is supervised by utilizing document labels/tags for hashing. The third model further considers document-specific factors that affect the generation of words. The probabilistic generative formulation of the proposed models provides a principled framework for model extension, uncertainty estimation, simulation, and interpretability. Based on variational inference and reparameterization, the proposed models can be interpreted as encoder-decoder deep neural networks and are thus capable of learning complex nonlinear distributed representations of the original documents. We conduct a comprehensive set of experiments on four public testbeds. The experimental results demonstrate the effectiveness of the proposed supervised learning models for text hashing. Comment: 11 pages, 4 figures.
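
    As a rough sketch of the encoder-decoder view described above, the snippet below runs one forward pass of a VAE-style document hashing model: a bag-of-words vector is encoded into a Gaussian latent, reparameterized, decoded into word probabilities, and the latent mean is thresholded into a binary code. The dimensions, random initialization, single-sample ELBO, and median thresholding are illustrative assumptions, not the specific models proposed in the paper.

    # Forward pass of a toy variational document-hashing model: encode, sample
    # via reparameterization, decode, and binarize. Shapes, initialization, and
    # the thresholding rule are illustrative assumptions.
    import numpy as np

    rng = np.random.default_rng(0)
    V, D, L = 1000, 64, 16                   # vocabulary, hidden size, code length
    W_enc = rng.normal(0, 0.05, (V, D))
    W_mu, W_logvar = rng.normal(0, 0.05, (D, L)), rng.normal(0, 0.05, (D, L))
    W_dec = rng.normal(0, 0.05, (L, V))

    def encode(x):
        h = np.tanh(x @ W_enc)
        return h @ W_mu, h @ W_logvar        # mean and log-variance of q(z|x)

    def reparameterize(mu, logvar):
        return mu + np.exp(0.5 * logvar) * rng.normal(size=mu.shape)

    def decode(z):
        logits = z @ W_dec
        p = np.exp(logits - logits.max())
        return p / p.sum()                   # word distribution p(w|z)

    x = rng.integers(0, 3, V).astype(float)  # toy bag-of-words counts
    mu, logvar = encode(x)
    z = reparameterize(mu, logvar)
    elbo = (x * np.log(decode(z) + 1e-9)).sum() \
           - 0.5 * (np.exp(logvar) + mu**2 - 1.0 - logvar).sum()
    code = (mu > np.median(mu)).astype(int)  # binary hash code from the latent mean
    print("ELBO (one sample):", round(float(elbo), 2), "hash code:", code)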