Multiple Uncertainties in Time-Variant Cosmological Particle Data
Though the media for visualization are limited, the potential dimensions of a dataset are not. In many areas of scientific study, understanding the correlations between those dimensions and their uncertainties is pivotal to mining useful information from a dataset. Obtaining this insight can necessitate visualizing the many relationships among temporal, spatial, and other dimensionalities of data and its uncertainties. We utilize multiple views for interactive dataset exploration and selection of important features, and we apply those techniques to the unique challenges of cosmological particle datasets. We show how interactivity and incorporation of multiple visualization techniques help overcome the problem of limited visualization dimensions and allow many types of uncertainty to be seen in correlation with other variables.
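Below is a minimal sketch of the linked-views idea: a spatial view with per-particle uncertainty encoded as colour, and a second attribute view that highlights whatever region is brushed in the first. The synthetic particle sample, the attribute names, and the use of matplotlib's RectangleSelector are illustrative assumptions, not the authors' implementation.

```python
# Minimal linked-views sketch: brushing a region in the spatial view
# highlights the corresponding particles in the attribute view.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.widgets import RectangleSelector

rng = np.random.default_rng(0)
n = 2000
pos = rng.uniform(0.0, 100.0, size=(n, 2))      # hypothetical x, y positions
velocity = rng.normal(0.0, 1.0, size=n)         # hypothetical line-of-sight velocity
uncertainty = rng.uniform(0.0, 1.0, size=n)     # hypothetical per-particle uncertainty

fig, (ax_space, ax_attr) = plt.subplots(1, 2, figsize=(10, 4))
ax_space.scatter(pos[:, 0], pos[:, 1], c=uncertainty, s=4, cmap="viridis")
ax_space.set_title("spatial view (colour = uncertainty)")
attr_plot = ax_attr.scatter(velocity, uncertainty, s=4, color="lightgray")
ax_attr.set_title("velocity vs. uncertainty (selection highlighted)")

def on_select(eclick, erelease):
    # Propagate a rectangular selection from the spatial view to the attribute view.
    x0, x1 = sorted((eclick.xdata, erelease.xdata))
    y0, y1 = sorted((eclick.ydata, erelease.ydata))
    mask = (pos[:, 0] >= x0) & (pos[:, 0] <= x1) & (pos[:, 1] >= y0) & (pos[:, 1] <= y1)
    attr_plot.set_color(np.where(mask, "crimson", "lightgray"))
    fig.canvas.draw_idle()

selector = RectangleSelector(ax_space, on_select, useblit=True)
plt.show()
```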
Leveraging Coding Techniques for Speeding up Distributed Computing
Large-scale clusters leveraging distributed computing frameworks such as
MapReduce routinely process data on the order of petabytes or more.
The sheer size of the data precludes the processing of the data on a single
computer. The philosophy in these methods is to partition the overall job into
smaller tasks that are executed on different servers; this is called the map
phase. This is followed by a data shuffling phase where appropriate data is
exchanged between the servers. The final, so-called reduce phase completes the
computation.
One potential approach for reducing the overall execution time, explored in
prior work, is to exploit a natural tradeoff between computation and
communication. Specifically, the idea is to run redundant copies of map tasks
that are placed on judiciously chosen servers. The shuffle phase exploits the
location of the nodes and utilizes coded transmission. The main drawback of
this approach is that it requires the original job to be split into a number of
map tasks that grows exponentially in the system parameters. This is
problematic, as we demonstrate that splitting jobs too finely can in fact
adversely affect the overall execution time.
In this work we show that one can simultaneously obtain low communication
loads while ensuring that jobs do not need to be split too finely. Our approach
uncovers a deep relationship between this problem and a class of combinatorial
structures called resolvable designs. Appropriate interpretation of resolvable
designs can allow for the development of coded distributed computing schemes
where the splitting levels are exponentially lower than prior work. We present
experimental results obtained on Amazon EC2 clusters for a widely known
distributed algorithm, namely TeraSort. We obtain over a 4.69x improvement in
speedup over the baseline approach and more than 2.6x over the current
state of the art.
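A toy sketch of the coded-shuffle principle this line of work builds on (not the paper's resolvable-design construction): with 3 servers, 3 input files, and each file mapped redundantly on 2 servers, a single XOR-coded broadcast can serve two servers at once. The placement, the hash-based stand-in for map outputs, and all names are illustrative.

```python
import hashlib

def map_file(file_id: int, reducer_id: int) -> bytes:
    """Hypothetical map output: the intermediate value v[reducer_id][file_id]."""
    return hashlib.sha256(f"v_{reducer_id}_{file_id}".encode()).digest()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# Redundant placement: server k holds every file except file k (replication r = 2).
files_at = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}

# Server k is responsible for reduce output k and is missing only v[k][k],
# the intermediate value computed from the one file it does not hold.
# Server 1 can compute both values missing at the *other* servers:
v_2_2 = map_file(2, 2)   # needed by server 2, computable at servers 1 and 3
v_3_3 = map_file(3, 3)   # needed by server 3, computable at servers 1 and 2
coded_packet = xor(v_2_2, v_3_3)   # one broadcast instead of two unicasts

# Server 2 already knows v[3][3] (it holds file 3), so it can peel it off:
decoded_at_2 = xor(coded_packet, map_file(3, 3))
assert decoded_at_2 == v_2_2

# Server 3 symmetrically recovers v[2][2] using its local copy of file 2:
decoded_at_3 = xor(coded_packet, map_file(2, 2))
assert decoded_at_3 == v_3_3
print("coded broadcast decoded correctly at both servers")
```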
Systemic: A Testbed For Characterizing the Detection of Extrasolar Planets. I. The Systemic Console Package
We present the systemic Console, a new all-in-one, general-purpose software
package for the analysis and combined multiparameter fitting of Doppler radial
velocity (RV) and transit timing observations. We give an overview of the
computational algorithms implemented in the Console, and describe the tools
offered for streamlining the characterization of planetary systems. We
illustrate the capabilities of the package by analyzing an updated radial
velocity data set for the HD128311 planetary system. HD128311 harbors a pair of
planets that appear to be participating in a 2:1 mean motion resonance. We show
that the dynamical configuration cannot be fully determined from the current
data. We find that if a planetary system like HD128311 is found to undergo
transits, then self-consistent Newtonian fits to combined radial velocity data
and a small number of timing measurements of transit midpoints can provide an
immediate and vastly improved characterization of the planet's dynamical state.
Comment: 10 pages, 5 figures, accepted for publication in PASP. Additional
material at http://www.ucolick.org/~smeschia/systemic.ph
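As a rough illustration of the kind of multiparameter radial-velocity fit such a console automates, the sketch below fits a single circular-orbit signal to synthetic data with scipy. The data, starting guesses, and parameter values are invented for the example and are unrelated to the Systemic code or to HD128311.

```python
import numpy as np
from scipy.optimize import curve_fit

def rv_model(t, period, amplitude, phase, offset):
    """Circular-orbit radial-velocity curve: v(t) = K sin(2*pi*t/P + phi) + gamma."""
    return amplitude * np.sin(2.0 * np.pi * t / period + phase) + offset

rng = np.random.default_rng(42)
t_obs = np.sort(rng.uniform(0.0, 900.0, 60))     # observation epochs [days]
true_params = (300.0, 55.0, 1.2, 3.0)            # illustrative P [d], K [m/s], phase, offset
v_obs = rv_model(t_obs, *true_params) + rng.normal(0.0, 4.0, t_obs.size)

# Rough starting guesses; a real fitter would search period space more carefully.
popt, pcov = curve_fit(rv_model, t_obs, v_obs,
                       p0=(290.0, 40.0, 1.0, 0.0),
                       sigma=np.full(t_obs.size, 4.0), absolute_sigma=True)
period, amplitude, phase, offset = popt
print(f"best-fit period = {period:.1f} d, semi-amplitude K = {amplitude:.1f} m/s")
```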
Information Gains from Cosmological Probes
In light of the growing number of cosmological observations, it is important
to develop versatile tools to quantify the constraining power and consistency
of cosmological probes. Originally motivated from information theory, we use
the relative entropy to compute the information gained by Bayesian updates in
units of bits. This measure quantifies both the improvement in precision and
the 'surprise', i.e. the tension arising from shifts in central values. Our
starting point is a WMAP9 prior which we update with observations of the
distance ladder, supernovae (SNe), baryon acoustic oscillations (BAO), and weak
lensing as well as the 2015 Planck release. We consider the parameters of the
flat ΛCDM concordance model and some of its extensions, which include
curvature and the Dark Energy equation-of-state parameter w. We find that,
relative to WMAP9 and within these model spaces, the probes that have provided
the greatest gains are Planck (10 bits), followed by BAO surveys (5.1 bits) and
SNe experiments (3.1 bits). The other cosmological probes, including weak
lensing (1.7 bits) and H_0 measures (1.7 bits), have contributed
information but at a lower level. Furthermore, we do not find any significant
surprise when updating the constraints of WMAP9 with any of the other
experiments, meaning that they are consistent with WMAP9. However, when we
choose Planck15 as the prior, we find that, accounting for the full
multi-dimensionality of the parameter space, the weak lensing measurements of
CFHTLenS produce a large surprise of 4.4 bits, which is statistically
significant at the 8σ level. We discuss how the relative entropy
provides a versatile and robust framework to compare cosmological probes in the
context of current and future surveys.
Comment: 26 pages, 5 figures
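The "bits of information gain" quoted above are relative entropies between posterior and prior. As a minimal sketch, assuming both distributions are Gaussian (the toy means and covariances below are invented and are not the WMAP9 or Planck constraints), the quantity can be computed as follows.

```python
import numpy as np

def kl_gaussian_bits(mu_post, cov_post, mu_prior, cov_prior):
    """D_KL(posterior || prior) for multivariate Gaussians, expressed in bits."""
    d = len(mu_post)
    cov_prior_inv = np.linalg.inv(cov_prior)
    diff = np.asarray(mu_prior) - np.asarray(mu_post)
    nats = 0.5 * (np.trace(cov_prior_inv @ cov_post)
                  + diff @ cov_prior_inv @ diff
                  - d
                  + np.log(np.linalg.det(cov_prior) / np.linalg.det(cov_post)))
    return nats / np.log(2.0)   # convert nats to bits

# Toy 2-parameter example: the update shrinks both errors and shifts one mean,
# so the gain reflects both improved precision and a small "surprise".
mu_prior, cov_prior = [0.30, 0.70], np.diag([0.02**2, 0.03**2])
mu_post,  cov_post  = [0.31, 0.69], np.diag([0.01**2, 0.015**2])
print(f"information gain: {kl_gaussian_bits(mu_post, cov_post, mu_prior, cov_prior):.2f} bits")
```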
Compressing DNA sequence databases with coil
Background: Publicly available DNA sequence databases such as GenBank are large, and are
growing at an exponential rate. The sheer volume of data being dealt with presents serious storage
and data communications problems. Currently, sequence data is usually kept in large "flat files,"
which are then compressed using standard Lempel-Ziv (gzip) compression – an approach which
rarely achieves good compression ratios. While much research has been done on compressing
individual DNA sequences, surprisingly little has focused on the compression of entire databases
of such sequences. In this study we introduce the sequence database compression software coil.
Results: We have designed and implemented a portable software package, coil, for compressing
and decompressing DNA sequence databases based on the idea of edit-tree coding. coil is geared
towards achieving high compression ratios at the expense of execution time and memory usage
during compression – the compression time represents a "one-off investment" whose cost is
quickly amortised if the resulting compressed file is transmitted many times. Decompression
requires little memory and is extremely fast. We demonstrate a 5% improvement in compression
ratio over state-of-the-art general-purpose compression tools for a large GenBank database file
containing Expressed Sequence Tag (EST) data. Finally, coil can efficiently encode incremental
additions to a sequence database.
Conclusion: coil presents a compelling alternative to conventional compression of flat files for the
storage and distribution of DNA sequence databases having a narrow distribution of sequence
lengths, such as EST data. Increasing compression levels for databases having a wide distribution of
sequence lengths is a direction for future work.
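The sketch below illustrates the general idea of storing sequences as edit scripts against similar, previously stored sequences. It uses Python's difflib as a stand-in for coil's edit-tree coder; the tiny database, the cost model, and the parent-selection rule are invented for the example.

```python
import difflib

def edit_script(reference: str, target: str):
    """A compact list of operations that rebuild `target` from `reference`."""
    matcher = difflib.SequenceMatcher(a=reference, b=target, autojunk=False)
    ops = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2, ""))             # copy reference[i1:i2]
        else:
            ops.append(("emit", i1, i2, target[j1:j2]))  # literal bases to insert
    return ops

def apply_script(reference: str, ops) -> str:
    return "".join(reference[i1:i2] if tag == "copy" else text
                   for tag, i1, i2, text in ops)

database = ["ACGTACGTTTGACCA", "ACGTACGATTGACCA", "TTTTGGGGCCCCAAAA"]
stored = []
for seq in database:
    # Choose the previously stored sequence whose edit script is cheapest,
    # a crude stand-in for coil's clustering of similar sequences.
    best = None
    for parent_idx in range(len(stored)):
        ops = edit_script(database[parent_idx], seq)
        cost = sum(len(text) for _, _, _, text in ops) + 3 * len(ops)
        if best is None or cost < best[0]:
            best = (cost, parent_idx, ops)
    if best is not None and best[0] < len(seq):
        stored.append(("delta", best[1], best[2]))
    else:
        stored.append(("raw", None, seq))

# Round trip: every sequence is recoverable from its stored representation.
for original, (kind, parent_idx, payload) in zip(database, stored):
    restored = payload if kind == "raw" else apply_script(database[parent_idx], payload)
    assert restored == original
print([kind for kind, _, _ in stored])   # e.g. ['raw', 'delta', 'raw']
```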
One simulation to fit them all - changing the background parameters of a cosmological N-body simulation
We demonstrate that the output of a cosmological N-body simulation can, to
remarkable accuracy, be scaled to represent the growth of large-scale structure
in a cosmology with parameters similar to but different from those originally
assumed. Our algorithm involves three steps: a reassignment of length, mass and
velocity units, a relabelling of the time axis, and a rescaling of the
amplitudes of individual large-scale fluctuation modes. We test it using two
matched pairs of simulations. Within each pair, one simulation assumes
parameters consistent with analyses of the first-year WMAP data. The other has
lower matter and baryon densities and a 15% lower fluctuation amplitude,
consistent with analyses of the three-year WMAP data. The pairs differ by a
factor of a thousand in mass resolution, enabling performance tests on both
linear and nonlinear scales. Our scaling reproduces the mass power spectra of
the target cosmology to better than 0.5% on large scales (k < 0.1 h/Mpc) both
in real and in redshift space. In particular, the BAO features of the original
cosmology are removed and are correctly replaced by those of the target
cosmology. Errors are still below 3% for k < 1 h/Mpc. Power spectra of the dark
halo distribution are even more precisely reproduced, with errors below 1% on
all scales tested. A halo-by-halo comparison shows that centre-of-mass
positions and velocities are reproduced to better than 90 kpc/h and 5%,
respectively. Halo masses, concentrations and spins are also reproduced at
about the 10% level, although with small biases. Halo assembly histories are
accurately reproduced, leading to central galaxy magnitudes with errors of
about 0.25 magnitudes and a bias of about 0.13 magnitudes for a representative
semi-analytic model.
Comment: 14 pages, 12 figures. Submitted to MNRAS
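A minimal sketch of the third step, rescaling the amplitudes of individual large-scale Fourier modes so the field follows a target power spectrum. The 1-D toy field, the power-law spectra, and the k < 0.1 h/Mpc cutoff are illustrative stand-ins for the paper's 3-D particle data.

```python
import numpy as np

def rescale_large_scale_modes(delta, box_size, p_orig, p_target, k_max):
    """Multiply Fourier modes with 0 < k < k_max by sqrt(P_target(k) / P_orig(k))."""
    n = delta.size
    delta_k = np.fft.rfft(delta)
    k = 2.0 * np.pi * np.fft.rfftfreq(n, d=box_size / n)   # angular wavenumbers
    scale = np.ones_like(k)
    large = (k > 0) & (k < k_max)
    scale[large] = np.sqrt(p_target(k[large]) / p_orig(k[large]))
    return np.fft.irfft(delta_k * scale, n=n)

rng = np.random.default_rng(1)
box_size = 1000.0                        # hypothetical box size [Mpc/h]
delta = rng.normal(0.0, 1.0, 512)        # toy overdensity field on a 1-D grid

# Illustrative power laws standing in for the original and target linear spectra;
# the target has a 15% lower fluctuation amplitude, i.e. 0.85**2 less power.
p_orig = lambda k: k ** -1.5
p_target = lambda k: 0.85 ** 2 * k ** -1.5

delta_rescaled = rescale_large_scale_modes(delta, box_size, p_orig, p_target, k_max=0.1)
print(delta_rescaled.std() / delta.std())   # slightly below 1: only large scales were damped
```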
From Physics Model to Results: An Optimizing Framework for Cross-Architecture Code Generation
Starting from a high-level problem description in terms of partial
differential equations using abstract tensor notation, the Chemora framework
discretizes, optimizes, and generates complete high performance codes for a
wide range of compute architectures. Chemora extends the capabilities of
Cactus, facilitating the usage of large-scale CPU/GPU systems in an efficient
manner for complex applications, without low-level code tuning. Chemora
achieves parallelism through MPI and multi-threading, combining OpenMP and
CUDA. Optimizations include high-level code transformations, efficient loop
traversal strategies, dynamically selected data and instruction cache usage
strategies, and JIT compilation of GPU code tailored to the problem
characteristics. The discretization is based on higher-order finite differences
on multi-block domains. Chemora's capabilities are demonstrated by simulations
of black hole collisions. This problem provides an acid test of the framework,
as the Einstein equations contain hundreds of variables and thousands of terms.
Comment: 18 pages, 4 figures, accepted for publication in Scientific
Programming
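As a small illustration of the kind of kernel such a framework ultimately emits, the sketch below applies a fourth-order central finite-difference stencil by hand in NumPy. It is written directly rather than generated by Chemora, and the grid, spacing, and test function are illustrative.

```python
import numpy as np

def d_dx_4th_order(u, dx):
    """Fourth-order central difference: (-u[i+2] + 8u[i+1] - 8u[i-1] + u[i-2]) / (12 dx)."""
    du = np.zeros_like(u)
    du[2:-2] = (-u[4:] + 8.0 * u[3:-1] - 8.0 * u[1:-3] + u[:-4]) / (12.0 * dx)
    return du   # boundary points would be filled by one-sided stencils or ghost zones

x = np.linspace(0.0, 2.0 * np.pi, 201)
dx = x[1] - x[0]
u = np.sin(x)
error = np.max(np.abs(d_dx_4th_order(u, dx)[2:-2] - np.cos(x)[2:-2]))
print(f"max interior error: {error:.2e}")   # small: the error scales as dx**4
```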