6 research outputs found

Accelerating Markov chain Monte Carlo via parallel predictive prefetching

Author: Elaine Lee Angelino
Publication venue: 'Harvard University Botany Libraries'
Publication date: 21/10/2014

We present a general framework for accelerating a large class of widely used Markov chain Monte Carlo (MCMC) algorithms. This dissertation demonstrates that MCMC inference can be accelerated in a model of parallel computation that uses speculation to predict and complete computational work ahead of when it is known to be useful. By exploiting fast, iterative approximations to the target density, we can speculatively evaluate many potential future steps of the chain in parallel. In Bayesian inference problems, this approach can accelerate sampling from the target distribution, without compromising exactness, by exploiting subsets of data. It takes advantage of whatever parallel resources are available, but produces results exactly equivalent to standard serial execution. In the initial burn-in phase of chain evaluation, it achieves speedup over serial evaluation that is close to linear in the number of available cores.

Engineering and Applied Science
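The idea of evaluating possible future steps ahead of time, while still reproducing the serial chain exactly, can be illustrated with a small sketch. This is not the dissertation's predictive scheme: it does not use approximations to the target density to choose which branches to evaluate, and instead evaluates every branch of a short accept/reject tree (plain prefetching). The names `log_target`, `serial_mh`, and `prefetch_mh`, the standard normal target, and the `depth` parameter are all assumptions made for illustration. Threads in CPython will not speed up this toy density; the point is only the equivalence of results.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def log_target(x):
    # Hypothetical target: unnormalized log density of a standard normal.
    return -0.5 * x * x


def accept(u, log_ratio):
    # Metropolis acceptance test for a symmetric proposal.
    return u < math.exp(min(0.0, log_ratio))


def serial_mh(x0, noise, uniforms):
    # Ordinary random-walk Metropolis driven by pre-drawn random numbers.
    x, lp = x0, log_target(x0)
    chain = []
    for z, u in zip(noise, uniforms):
        y = x + z
        lq = log_target(y)
        if accept(u, lq - lp):
            x, lp = y, lq
        chain.append(x)
    return chain


def prefetch_mh(x0, noise, uniforms, depth=3):
    # Same chain, but the density at every proposal reachable within
    # `depth` steps is evaluated up front (in a thread pool) and then
    # looked up while walking the accept/reject tree.
    x, lp = x0, log_target(x0)
    chain = []
    with ThreadPoolExecutor() as pool:
        for start in range(0, len(noise), depth):
            zs = noise[start:start + depth]
            us = uniforms[start:start + depth]
            frontier = {x}      # states the chain could occupy
            proposals = set()   # points whose density may be needed
            for z in zs:
                new = {s + z for s in frontier}
                proposals |= new
                frontier |= new
            points = list(proposals)
            cache = dict(zip(points, pool.map(log_target, points)))
            for z, u in zip(zs, us):
                y = x + z
                lq = cache[y]   # speculative result, no new evaluation
                if accept(u, lq - lp):
                    x, lp = y, lq
                chain.append(x)
    return chain
```

Because both functions consume the same pre-drawn proposal noise and uniform variates, they can be compared element by element. Most of the speculative evaluations are wasted in this exhaustive version, which is the cost a predictive scheme aims to reduce.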

CiteSeerX

Harvard University - DASH

StarFlow: A Script-Centric Data Analysis Environment

Author: Angelino Elaine Lee
Seltzer Margo I.
Yamins Daniel Louis Kanef
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 05/04/2011

We introduce StarFlow, a script-centric environment for data analysis. StarFlow has four main features: (1) extraction of control and data-flow dependencies through a novel combination of static analysis, dynamic runtime analysis, and user annotations, (2) command-line tools for exploring and propagating changes through the resulting dependency network, (3) support for workflow abstractions enabling robust parallel executions of complex analysis pipelines, and (4) a seamless interface with the Python scripting language. We describe a range of real applications of StarFlow, including automatic parallelization of complex workflows in the cloud.

Engineering and Applied Science

Harvard University - DASH

Flash Caching on the Storage Client

Author: Angelino Elaine Lee
Holland David A.
Seltzer Margo I.
Wald Gideon
Publication venue: USENIX Association
Publication date: 19/11/2013

Flash memory has recently become popular as a caching medium. Most uses to date are on the storage server side. We investigate a different structure: flash as a cache on the client side of a networked storage environment. We use trace-driven simulation to explore the design space. We consider a wide range of configurations and policies to determine the potential that client-side caches might offer and how best to arrange them. Our results show that the flash cache writeback policy does not significantly affect performance. Write-through is sufficient; this greatly simplifies cache consistency handling. We also find that the chief benefit of the flash cache is its size, not its persistence. Cache persistence offers additional performance benefits at system restart at essentially no runtime cost. Finally, for some workloads a large flash cache allows using minuscule amounts of RAM for file caching (e.g., 256 KB), leaving more memory available for application use.

Engineering and Applied Science
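A minimal sketch of a write-through policy on a client-side cache may help show why it simplifies consistency handling: every write reaches the backing store before the cached copy is updated, so the cache never holds dirty data. The class `WriteThroughCache`, the dict standing in for the storage server, and the LRU eviction policy are assumptions made for illustration and are not taken from the paper.

```python
from collections import OrderedDict


class WriteThroughCache:
    """Hypothetical client-side cache with write-through and LRU eviction."""

    def __init__(self, backing, capacity):
        self.backing = backing        # dict standing in for the storage server
        self.capacity = capacity      # maximum number of cached blocks
        self.blocks = OrderedDict()   # cached blocks, least recent first

    def _insert(self, key, value):
        self.blocks[key] = value
        self.blocks.move_to_end(key)
        if len(self.blocks) > self.capacity:
            # Evicted blocks are always clean, so they can simply be dropped.
            self.blocks.popitem(last=False)

    def read(self, key):
        if key in self.blocks:
            self.blocks.move_to_end(key)
            return self.blocks[key]
        value = self.backing[key]     # miss: fetch from the server
        self._insert(key, value)
        return value

    def write(self, key, value):
        self.backing[key] = value     # the server is updated on every write
        self._insert(key, value)
```

Since the server always holds the latest data, losing or discarding the cache contents never loses a write, which is the property that keeps consistency handling simple in this sketch.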

CiteSeerX

Harvard University - DASH

Provenance Integration Requires Reconciliation

Author: Angelino Elaine Lee
Braun Uri Jacob
Holland David A.
Macko Peter
Margo Daniel Wyatt
Seltzer Margo I.
Publication venue
Publication date: 07/10/2011

While there has been a great deal of research on provenance systems, there has been little discussion about challenges that arise when making different provenance systems interoperate. In fact, most of the literature focuses on provenance systems in isolation and does not discuss interoperability – what it means, its requirements, and how to achieve it. We designed the Provenance-Aware Storage System to be a general-purpose substrate on top of which it would be “easy” to add other provenance-aware systems in a way that would provide “seamless integration” for the provenance captured at each level. While the system did exactly what we wanted on toy problems, when we began integrating StarFlow, a Python-based workflow/provenance system, we discovered that integration is far trickier and more subtle than anyone has suggested in the literature. This work describes our experience undertaking the integration of StarFlow and PASS, identifying several important additions to existing provenance models necessary for interoperability among provenance systems.

Engineering and Applied Science

Harvard University - DASH

Computational Caches

Author: Adams Ryan Prescott
Angelino Elaine Lee
Appavoo Jonathan
Cubuk Ekin Dogus
Kaxiras Efthimios
Seltzer Margo I.
Waterland Amos
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 20/09/2017

Caching is a well-known technique for speeding up computation. We cache data from file systems and databases; we cache dynamically generated code blocks; we cache page translations in TLBs. We propose to cache the act of computation, so that we can apply it later and in different contexts. We use a state-space model of computation to support such caching, involving two interrelated parts: speculatively memoized predicted/resultant state pairs that we use to accelerate sequential computation, and trained probabilistic models that we use to generate predicted states from which to speculatively execute. The key techniques that make this approach feasible are designing probabilistic models that automatically focus on regions of program execution state space in which prediction is tractable and identifying state space equivalence classes so that predictions need not be exact.

Engineering and Applied Science

Harvard University - DASH

A Composite of Multiple Signals Distinguishes Causal Variants in Regions of Positive Selection

Author: Angelino Elaine Lee
Byrne Elizabeth Hockfield
Frieden Gabriel
Garber Manuel
Grossman Sharon Rachel
Hostetter Elizabeth
Karlsson Elinor Kathryn
Lander Eric Steven
Morales Shannon
Sabeti Pardis Christine
Schaffner Stephen
Shylakhter Ilya
Zuk Or
Publication venue: 'American Association for the Advancement of Science (AAAS)'
Publication date: 06/09/2011

The human genome contains hundreds of regions whose patterns of genetic variation indicate recent positive natural selection, yet for most the underlying gene and the advantageous mutation remain unknown. We developed a method, composite of multiple signals (CMS), that combines tests for multiple signals of selection and increases resolution by up to 100-fold. By applying CMS to candidate regions from the International Haplotype Map, we localized population-specific selective signals to 55 kilobases (median), identifying known and novel causal variants. CMS can not only identify individual loci but also implicate precise variants selected by evolution.

Organismic and Evolutionary Biology
Other Research Uni

Harvard University - DASH