
    Prediction of the Atomization Energy of Molecules Using Coulomb Matrix and Atomic Composition in a Bayesian Regularized Neural Networks

    Accurate calculation of the electronic properties of molecules is a fundamental step for intelligent and rational compound and materials design. The intrinsically graph-like, non-vectorial nature of molecular data poses a unique and challenging machine learning problem. In this paper we embrace a learning-from-scratch approach in which the quantum mechanical electronic properties of molecules are predicted directly from the raw molecular geometry, similar to some recent works. Unlike these previous endeavors, however, our study suggests a benefit from combining the molecular geometry, embedded in the Coulomb matrix, with the atomic composition of molecules. Using the combined features in a Bayesian regularized neural network, we improve on well-known results from the literature on the QM7 dataset, reducing the mean absolute error from 3.51 kcal/mol to 3.0 kcal/mol. Comment: Under review ICANN 201
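    As a rough illustration of the descriptor this abstract describes, the sketch below builds the standard Coulomb matrix of Rupp et al. (2012) from nuclear charges and coordinates, then concatenates its sorted eigenvalues with simple atom counts (the atomic composition). The element list and 23-atom padding reflect the QM7 dataset; the function names and feature layout are mine, not the paper's.

```python
import numpy as np

def coulomb_matrix(Z, R):
    """Coulomb matrix of a molecule (Rupp et al. 2012).

    Z : (n,) nuclear charges
    R : (n, 3) Cartesian coordinates
    """
    Z, R = np.asarray(Z, dtype=float), np.asarray(R, dtype=float)
    n = len(Z)
    C = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i == j:
                C[i, j] = 0.5 * Z[i] ** 2.4                      # self-interaction term
            else:
                C[i, j] = Z[i] * Z[j] / np.linalg.norm(R[i] - R[j])
    return C

def combined_features(Z, R, elements=(1, 6, 7, 8, 16), max_atoms=23):
    """Sorted Coulomb-matrix eigenvalues concatenated with atom counts.

    QM7 molecules contain H, C, N, O, S and at most 23 atoms, hence the
    defaults; both are illustrative assumptions.
    """
    Z = np.asarray(Z)
    eig = np.sort(np.linalg.eigvalsh(coulomb_matrix(Z, R)))[::-1]
    eig = np.pad(eig, (0, max_atoms - len(Z)))            # zero-pad to fixed length
    counts = np.array([(Z == z).sum() for z in elements]) # atomic composition
    return np.concatenate([eig, counts])
```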

    Bayesian regression filter and the issue of priors

    We propose a Bayesian framework for regression problems, covering areas that are usually dealt with by function approximation. An online learning algorithm is derived which solves regression problems with a Kalman filter. Its solution always improves with increasing model complexity, without the risk of over-fitting, and in the infinite-dimension limit it approaches the true Bayesian posterior. The issues of prior selection and over-fitting are also discussed, showing that some commonly held beliefs are misleading. The practical implementation is summarised, and simulations on 13 popular publicly available data sets are used to demonstrate the method and highlight important issues concerning the choice of priors.
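    The abstract's central idea, solving a regression problem online with Kalman-filter updates over the weights, can be sketched as follows. This is generic recursive Bayesian linear regression (static weights, Gaussian prior and noise), not the paper's exact algorithm; the class name and default values are illustrative.

```python
import numpy as np

class BayesianRegressionFilter:
    """Online Bayesian linear regression via Kalman-filter updates.

    Static weights w with prior N(0, alpha^-1 I); observations
    y = phi(x)^T w + noise with variance sigma2. A sketch of the kind
    of recursive update the paper describes.
    """

    def __init__(self, dim, alpha=1.0, sigma2=0.1):
        self.mu = np.zeros(dim)        # posterior mean of weights
        self.P = np.eye(dim) / alpha   # posterior covariance
        self.sigma2 = sigma2           # observation noise variance

    def update(self, phi, y):
        """Incorporate one observation; phi is the feature vector."""
        s = phi @ self.P @ phi + self.sigma2  # predictive variance at phi
        k = self.P @ phi / s                  # Kalman gain
        self.mu = self.mu + k * (y - phi @ self.mu)   # correct the mean
        self.P = self.P - np.outer(k, phi @ self.P)   # shrink the covariance

    def predict(self, phi):
        """Predictive mean and variance at a new input."""
        return phi @ self.mu, phi @ self.P @ phi + self.sigma2
```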

    Nash Codes for Noisy Channels

    This paper studies the stability of communication protocols that deal with transmission errors. We consider a coordination game between an informed sender and an uninformed decision maker, the receiver, who communicate over a noisy channel. The sender's strategy, called a code, maps states of nature to signals. The receiver's best response is to decode the received channel output as the state with the highest expected receiver payoff. Given this decoding, an equilibrium or "Nash code" results if the sender encodes every state as prescribed. We prove two theorems that give sufficient conditions for Nash codes. First, a receiver-optimal code defines a Nash code. A second, more surprising observation holds for communication over a binary channel that is used independently a number of times, a basic model of information transmission: under a minimal "monotonicity" requirement for breaking ties when decoding, which holds generically, EVERY code is a Nash code. Comment: More general main Theorem 6.5 with better proof. New examples and introduction.
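    The receiver's best response described here is easy to make concrete. The sketch below computes, for each channel output, the action maximising the receiver's expected payoff under the posterior over states; the array representation is an illustrative assumption, not the paper's notation.

```python
import numpy as np

def best_response_decode(code, channel, prior, payoff):
    """Receiver's best response to a sender code over a noisy channel.

    code    : code[s] = signal sent in state s (a deterministic code)
    channel : channel[x, y] = P(receive y | send x)
    prior   : prior[s] = P(state s)
    payoff  : payoff[s, a] = receiver payoff from action a in state s

    Returns decode[y] = action with the highest expected receiver payoff.
    """
    n_states, n_actions = payoff.shape
    n_outputs = channel.shape[1]
    decode = np.zeros(n_outputs, dtype=int)
    for y in range(n_outputs):
        # unnormalised posterior over states given received output y
        post = np.array([prior[s] * channel[code[s], y] for s in range(n_states)])
        decode[y] = int(np.argmax(post @ payoff))  # expected payoff per action
    return decode
```

    Given this decoding, checking the Nash property amounts to verifying that in every state the prescribed signal maximises the sender's expected payoff against any deviation.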

    Calibration with confidence: A principled method for panel assessment

    Frequently, a set of objects has to be evaluated by a panel of assessors, but not every object is assessed by every assessor. A problem facing such panels is how to take into account different standards amongst panel members and varying levels of confidence in their scores. Here, a mathematically based algorithm is developed to calibrate the scores of such assessors, addressing both of these issues. The algorithm is based on the connectivity of the graph of assessors and objects evaluated, incorporating declared confidences as weights on its edges. If the graph is sufficiently well connected, relative standards can be inferred by comparing how assessors rate objects they assess in common, weighted by the levels of confidence of each assessment. By removing these biases, "true" values are inferred for all the objects, and reliability estimates for the resulting values are obtained. The algorithm is tested in two case studies, one by computer simulation and another based on realistic evaluation data. The process is compared to the simple averaging procedure in widespread use, and to Fisher's additive incomplete block analysis. It is anticipated that the algorithm will prove useful in a wide variety of situations such as evaluation of the quality of research submitted to national assessment exercises; appraisal of grant proposals submitted to funding panels; ranking of job applicants; and judgement of performances on degree courses wherein candidates can choose from lists of options. Comment: 32 pages including supplementary information; 5 figures
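    A minimal version of the calibration idea, assuming the simplest additive model (score ≈ object value + assessor bias), can be fitted by weighted least squares over the assessor-object graph, with declared confidences as the weights. The full method also produces reliability estimates and connectivity diagnostics; this sketch omits those, and all names are mine.

```python
import numpy as np

def calibrate(scores, n_objects, n_assessors):
    """Calibrate panel scores with an additive bias model.

    scores : iterable of (object o, assessor a, score y, confidence w)
    Model  : y ~ v[o] + b[a], fitted by weighted least squares; a
             simplified offsets-only version of the paper's algorithm.
    """
    n = n_objects + n_assessors
    A = np.zeros((n, n))                 # normal-equation matrix
    rhs = np.zeros(n)
    for o, a, y, w in scores:
        i, j = o, n_objects + a          # indices of v[o] and b[a]
        A[i, i] += w; A[i, j] += w
        A[j, i] += w; A[j, j] += w
        rhs[i] += w * y; rhs[j] += w * y
    # A is singular: shifting all values up and all biases down by the same
    # constant leaves every fit unchanged. The minimum-norm least-squares
    # solution picks one representative; we then fix the gauge explicitly.
    sol = np.linalg.lstsq(A, rhs, rcond=None)[0]
    v, b = sol[:n_objects], sol[n_objects:]
    v, b = v + b.mean(), b - b.mean()    # convention: biases average to zero
    return v, b                          # calibrated values, assessor biases
```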

    Modelling the emergence of rodent filial huddling from physiological huddling

    Huddling behaviour in neonatal rodents reduces the metabolic costs of physiological thermoregulation. However, animals continue to huddle into adulthood, at ambient temperatures where they are able to sustain a basal metabolism in isolation from the huddle. This 'filial huddling' in older animals is known to be guided by olfactory rather than thermal cues. The present study aimed to test whether thermally rewarding contacts between young mice, experienced when thermogenesis in brown adipose tissue (BAT) is highest, could give rise to olfactory preferences that persist as filial huddling interactions in adults. To this end, a simple model was constructed to fit existing data on the development of mouse thermal physiology and behaviour. The form of the model that emerged yields a remarkable explanation for filial huddling: associative learning maintains huddling into adulthood via processes that reduce thermodynamic entropy from BAT metabolism and increase information about social ordering amongst littermates.

    Fast methods for training Gaussian processes on large data sets

    Gaussian process regression (GPR) is a non-parametric Bayesian technique for interpolating or fitting data. The main barrier to further uptake of this powerful tool rests in the computational costs associated with the matrices which arise when dealing with large data sets. Here, we derive some simple results which we have found useful for speeding up the learning stage in the GPR algorithm, and especially for performing Bayesian model comparison between different covariance functions. We apply our techniques to both synthetic and real data and quantify the speed-up relative to using nested sampling to numerically evaluate model evidences. Comment: Fixed missing reference
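    The model evidence being compared across covariance functions here is the GP log marginal likelihood. A standard way to evaluate it, via a Cholesky factorisation, is sketched below; this shows the quantity involved, not the paper's specific speed-ups, and the fixed noise variance is an assumption.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def gp_log_evidence(K, y, noise=1e-2):
    """Log marginal likelihood of a zero-mean GP: the model evidence
    used for Bayesian comparison of covariance functions.

    K     : (n, n) covariance matrix built from some kernel
    y     : (n,) observed targets
    noise : observation noise variance (assumed known here)
    """
    n = len(y)
    L = cho_factor(K + noise * np.eye(n), lower=True)   # O(n^3) factorisation
    alpha = cho_solve(L, y)                             # (K + s^2 I)^-1 y
    logdet = 2.0 * np.log(np.diag(L[0])).sum()          # log|K + s^2 I|
    return -0.5 * (y @ alpha + logdet + n * np.log(2 * np.pi))
```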

    Optimal client recommendation for market makers in illiquid financial products

    The process of liquidity provision in financial markets can result in prolonged exposure to illiquid instruments for market makers. In this case, where a proprietary position is not desired, proactively targeting the right client who is likely to be interested can be an effective means to offset this position, rather than relying on commensurate interest arising through natural demand. In this paper, we consider the inference of a client profile for the purpose of corporate bond recommendation, based on typical recorded information available to the market maker. Given a historical record of corporate bond transactions and bond meta-data, we use a topic-modelling analogy to develop a probabilistic technique for compiling a curated list of client recommendations for a particular bond that needs to be traded, ranked by probability of interest. We show that a model based on Latent Dirichlet Allocation offers promising performance in delivering relevant recommendations for sales traders. Comment: 12 pages, 3 figures, 1 table
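    Under the topic-modelling analogy, clients play the role of documents and the bonds they have traded play the role of words, with LDA topics acting as latent interest profiles. A minimal sketch using scikit-learn's LatentDirichletAllocation follows; the count-matrix representation and ranking rule are my reading of the abstract, not the paper's exact model.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

def rank_clients(counts, bond_idx, n_topics=20):
    """Rank clients by inferred interest in a given bond.

    counts   : (n_clients, n_bonds) matrix of historical trade counts,
               treating each client as a "document" over bond "words"
    bond_idx : column index of the bond that needs to be traded
    """
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
    theta = lda.fit_transform(counts)          # client-topic mixtures
    phi = lda.components_ / lda.components_.sum(axis=1, keepdims=True)
    p_interest = theta @ phi[:, bond_idx]      # P(bond | client profile)
    return np.argsort(p_interest)[::-1]        # clients, most likely first
```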

    Fast and flexible selection with a single switch

    Selection methods that require only a single-switch input, such as a button click or blink, are potentially useful for individuals with motor impairments, mobile technology users, and individuals wishing to transmit information securely. We present a single-switch selection method, "Nomon," that is general and efficient. Existing single-switch selection methods require selectable options to be arranged in ways that limit potential applications. By contrast, traditional operating systems, web browsers, and free-form applications (such as drawing) place options at arbitrary points on the screen. Nomon, however, has the flexibility to select any point on a screen. Nomon adapts automatically to an individual's clicking ability; it allows a person who clicks precisely to make a selection quickly, and allows a person who clicks imprecisely more time to make a selection without error. Nomon reaps gains in information rate by allowing the specification of beliefs (priors) about option selection probabilities and by avoiding tree-based selection schemes in favor of direct (posterior) inference. We have developed both a Nomon-based writing application and a drawing application. To evaluate Nomon's performance, we compared the writing application with a popular existing method for single-switch writing (row-column scanning). Novice users wrote 35% faster with the Nomon interface than with the scanning interface. An experienced user (author TB, with >10 hours' practice) wrote at speeds of 9.3 words per minute with Nomon, using 1.2 clicks per character and making no errors in the final text. Comment: 14 pages, 5 figures, 1 table; presented at the NIPS 2009 Mini-symposium
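    The "direct (posterior) inference" mentioned here can be caricatured as Bayesian filtering over options: each click updates a posterior over all selectable options according to how well the click's timing matches each option's expected timing, starting from a prior such as language-model probabilities, until one option is credible enough to select. The von Mises likelihood, threshold, and names below are illustrative assumptions, not Nomon's actual model.

```python
import numpy as np

def update_posterior(prior, click_phase, option_phases, kappa=4.0):
    """One step of Nomon-style posterior inference over options.

    prior         : current probabilities over selectable options
    click_phase   : phase of the rotating hand when the user clicked
    option_phases : target phase ("noon") of each option's clock
    kappa         : click precision; adapting it to the user mimics
                    Nomon's adjustment to clicking ability
    """
    like = np.exp(kappa * np.cos(click_phase - option_phases))  # von Mises
    post = prior * like
    return post / post.sum()

def select(prior, clicks, option_phases, threshold=0.95):
    """Click until one option's posterior clears the (illustrative) threshold."""
    post = np.asarray(prior, dtype=float).copy()
    for c in clicks:
        post = update_posterior(post, c, option_phases)
        if post.max() > threshold:
            return int(np.argmax(post))
    return None  # not yet confident enough to select
```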