Sequential Gaussian Processes for Online Learning of Nonstationary Functions
Many machine learning problems can be framed in the context of estimating
functions, and often these are time-dependent functions that are estimated in
real-time as observations arrive. Gaussian processes (GPs) are an attractive
choice for modeling real-valued nonlinear functions due to their flexibility
and uncertainty quantification. However, the typical GP regression model
suffers from several drawbacks: i) conventional GP inference scales as $O(N^3)$
with respect to the number of observations $N$; ii) updating a GP model
sequentially is not trivial; and iii) covariance kernels often enforce
stationarity constraints on the function, while GPs with non-stationary
covariance kernels are often intractable to use in practice. To overcome these
issues, we propose an online sequential Monte Carlo algorithm to fit mixtures
of GPs that capture non-stationary behavior while allowing for fast,
distributed inference. By formulating hyperparameter optimization as a
multi-armed bandit problem, we accelerate mixing for real-time inference. Our
approach empirically improves performance over state-of-the-art methods for
online GP estimation in the context of prediction for simulated non-stationary
data and hospital time series data.
Transformations of High-Level Synthesis Codes for High-Performance Computing
Specialized hardware architectures promise a major step in performance and
energy efficiency over the traditional load/store devices currently employed in
large scale computing systems. The adoption of high-level synthesis (HLS) from
languages such as C/C++ and OpenCL has greatly increased programmer
productivity when designing for such platforms. While this has enabled a wider
audience to target specialized hardware, the optimization principles known from
traditional software design are no longer sufficient to implement
high-performance codes. Fast and efficient codes for reconfigurable platforms
are thus still challenging to design. To alleviate this, we present a set of
optimizing transformations for HLS, targeting scalable and efficient
architectures for high-performance computing (HPC) applications. Our work
provides a toolbox for developers, where we systematically identify classes of
transformations, the characteristics of their effect on the HLS code and the
resulting hardware (e.g., increases data reuse or resource consumption), and
the objectives that each transformation can target (e.g., resolve interface
contention, or increase parallelism). We show how these can be used to
efficiently exploit pipelining, on-chip distributed fast memory, and on-chip
streaming dataflow, allowing for massively parallel architectures. To quantify
the effect of our transformations, we use them to optimize a set of
throughput-oriented FPGA kernels, demonstrating that our enhancements are
sufficient to scale up parallelism within the hardware constraints. With the
transformations covered, we hope to establish a common framework for
performance engineers, compiler developers, and hardware developers, to tap
into the performance potential offered by specialized hardware architectures
using HLS.
Efficient Bayesian-based Multi-View Deconvolution
Light sheet fluorescence microscopy is able to image large specimens with high
resolution by imaging the samples from multiple angles. Multi-view
deconvolution can significantly improve the resolution and contrast of the
images, but its application has been limited due to the large size of the
datasets. Here we present a Bayesian-based derivation of multi-view
deconvolution that drastically improves the convergence time and provide a fast
implementation utilizing graphics hardware.
Comment: 48 pages, 20 figures, 1 table, under review at Nature Methods
Replication or exploration? Sequential design for stochastic simulation experiments
We investigate the merits of replication, and provide methods for optimal
design (including replicates), with the goal of obtaining globally accurate
emulation of noisy computer simulation experiments. We first show that
replication can be beneficial from both design and computational perspectives,
in the context of Gaussian process surrogate modeling. We then develop a
lookahead-based sequential design scheme that can determine whether a new run should
be at an existing input location (i.e., replicate) or at a new one (explore).
When paired with a newly developed heteroskedastic Gaussian process model, our
dynamic design scheme facilitates learning of signal and noise relationships
which can vary throughout the input space. We show that it does so efficiently,
on both computational and statistical grounds. In addition to illustrative
synthetic examples, we demonstrate performance on two challenging real-data
simulation experiments, from inventory management and epidemiology.
Comment: 34 pages, 9 figures