9,815 research outputs found

    Particle Gibbs for Bayesian Additive Regression Trees

    Additive regression trees are flexible non-parametric models and popular off-the-shelf tools for real-world non-linear regression. In application domains such as bioinformatics, where there is also demand for probabilistic predictions with measures of uncertainty, the Bayesian additive regression trees (BART) model, introduced by Chipman et al. (2010), is increasingly popular. As data sets have grown in size, however, the standard Metropolis-Hastings algorithms used to perform inference in BART are proving inadequate. In particular, these Markov chains make local changes to the trees and suffer from slow mixing when the data are high-dimensional or the best-fitting trees are more than a few layers deep. We present a novel sampler for BART based on the Particle Gibbs (PG) algorithm (Andrieu et al., 2010) and a top-down particle filtering algorithm for Bayesian decision trees (Lakshminarayanan et al., 2013). Rather than making local changes to individual trees, the PG sampler proposes a complete tree to fit the residual. Experiments show that the PG sampler outperforms existing samplers in many settings.

    The genome sequence of Barbarea vulgaris facilitates the study of ecological biochemistry

    Peer-reviewed. The genus Barbarea has emerged as a model for the evolution and ecology of plant defense compounds, due to its unusual glucosinolate profile and its production of saponins, which is unique among the Brassicaceae. One species, B. vulgaris, includes two ‘types’, G-type and P-type, that differ in trichome density and in their glucosinolate and saponin profiles. A key difference is the stereochemistry of hydroxylation of their common phenethylglucosinolate backbone, leading to epimeric glucobarbarins. Here we report a draft genome sequence of the G-type, and re-sequencing of the P-type for comparison. This enables us to identify candidate genes underlying glucosinolate diversity and trichome density, and to study the genetics of biochemical variation in glucosinolates and saponins. B. vulgaris is resistant to the diamondback moth, and may be exploited for “dead-end” trap cropping, where glucosinolates stimulate oviposition and saponins deter larvae to the extent that they die. The B. vulgaris genome will promote the study of mechanisms in ecological biochemistry to benefit crop resistance breeding.

    Anatomy of a Native XML Base Management System

    Several alternatives for managing large XML document collections exist, ranging from file systems, through relational or other database systems, to specifically tailored XML repositories. In this paper we give a tour of Natix, a database management system designed from scratch for storing and processing XML data. Contrary to the common belief that management of XML data is just another application for traditional databases like relational systems, we illustrate how almost every component in a database system is affected in terms of adequacy and performance. We show how to design and optimize areas such as storage, transaction management (comprising recovery and multi-user synchronisation), and query processing for XML.

    High Performance Computing for DNA Sequence Alignment and Assembly

    Recent advances in DNA sequencing technology have dramatically increased the scale and scope of DNA sequencing. These data are used for a wide variety of important biological analyses, including genome sequencing, comparative genomics, transcriptome analysis, and personalized medicine, but the analyses are complicated by the volume and complexity of the data involved. Given the massive size of these datasets, computational biology must draw on the advances of high performance computing. Two fundamental computations in computational biology are read alignment and genome assembly. Read alignment maps short DNA sequences to a reference genome to discover conserved and polymorphic regions of the genome. Genome assembly computes the sequence of a genome from many short DNA sequences. Both computations benefit from recent advances in high performance computing to efficiently process the huge datasets involved, including using highly parallel graphics processing units (GPUs) as high performance desktop processors, and using the MapReduce framework coupled with cloud computing to parallelize computation across large compute grids. This dissertation demonstrates how these technologies can be used to accelerate these computations by orders of magnitude, and how they have the potential to make otherwise infeasible computations practical.

    Preprocessing Imprecise Points for Delaunay Triangulation: Simplified and Extended

    Suppose we want to compute the Delaunay triangulation of a set P whose points are restricted to a collection R of input regions known in advance. Building on recent work by Löffler and Snoeyink, we show how to leverage our knowledge of R for faster Delaunay computation. Our approach needs no fancy machinery and optimally handles a wide variety of inputs, e.g., overlapping disks of different sizes and fat regions. Keywords: Delaunay triangulation - Data imprecision - Quadtree

    Sounds of Waitakere: Using practitioner research to explore how Year 6 recorder players compose responses to visual representations of a natural environment

    How might primary students utilise the stimulus of a painting in a collaborative composition drawing on a non-conventional sound palette of their own making? This practitioner research features 17 recorder players from a Year 6 class (10–11-year-olds) who attend a West Auckland primary school in New Zealand. These children were invited to experiment with the instrument to collectively produce an expanded ‘repertoire’ or ‘palette’ of sounds. In small groups, they then discussed a painting by an established New Zealand painter set in the Waitakere Ranges and attempted to formulate an interpretation in musical terms. On the basis of their interpretation, drawing on sounds from the collective palette (complemented with other sounds), they worked collaboratively to develop, refine and perform a structured composition named for their chosen painting. This case study is primarily descriptive (providing narrative accounts and rich vignettes of practice) and, secondarily, exploratory (description and analysis leading to the development of hypotheses). It has implications for a range of current educational issues, including curriculum integration and the place of composition and notation in the primary-school music programme.