
    The National Virtual Observatory

    As a scientific discipline, Astronomy is rather unique. We only have one laboratory, the Universe, and we cannot, of course, change the initial conditions and study the resulting effects. On top of this, acquiring astronomical data has historically been a very labor-intensive effort. As a result, data has traditionally been preserved for posterity. With recent technological advances, however, the rate at which we acquire new data has grown exponentially, which has generated a Data Tsunami, whose wave train threatens to overwhelm the field. In these conference proceedings, we present and define the concept of virtual observatories, which we feel is the only logical answer to this dilemma. Comment: 5 pages, uses newpasp.sty (included), to appear in "Extragalactic Gas at Low Redshift", ASP Conf. Series, J. S. Mulchaey and J. T. Stocke (eds.)

    Data Driven Discovery in Astrophysics

    We review some aspects of the current state of data-intensive astronomy, its methods, and some outstanding data analysis challenges. Astronomy is at the forefront of "big data" science, with exponentially growing data volumes and data rates, and an ever-increasing complexity, now entering the Petascale regime. Telescopes and observatories from both ground and space, covering a full range of wavelengths, feed the data via processing pipelines into dedicated archives, where they can be accessed for scientific analysis. Most of the large archives are connected through the Virtual Observatory framework, which provides interoperability standards and services, and effectively constitutes a global data grid of astronomy. Making discoveries in this overabundance of data requires applications of novel machine learning tools. We describe some of the recent examples of such applications. Comment: Keynote talk in the proceedings of ESA-ESRIN Conference: Big Data from Space 2014, Frascati, Italy, November 12-14, 2014, 8 pages, 2 figures

    Challenges of Big Data Analysis

    Big Data bring new opportunities to modern society and challenges to data scientists. On the one hand, Big Data hold great promise for discovering subtle population patterns and heterogeneities that are not possible with small-scale data. On the other hand, the massive sample size and high dimensionality of Big Data introduce unique computational and statistical challenges, including scalability and storage bottlenecks, noise accumulation, spurious correlation, incidental endogeneity, and measurement errors. These challenges are distinctive and require a new computational and statistical paradigm. This article gives an overview of the salient features of Big Data and of how these features drive a paradigm change in statistical and computational methods as well as computing architectures. We also provide various new perspectives on Big Data analysis and computation. In particular, we emphasize the viability of the sparsest solution in a high-confidence set and point out that the exogenous assumptions in most statistical methods for Big Data cannot be validated due to incidental endogeneity. They can lead to wrong statistical inferences and consequently wrong scientific conclusions.

    BOSS-LDG: A Novel Computational Framework that Brings Together Blue Waters, Open Science Grid, Shifter and the LIGO Data Grid to Accelerate Gravitational Wave Discovery

    We present a novel computational framework that connects Blue Waters, the NSF-supported, leadership-class supercomputer operated by NCSA, to the Laser Interferometer Gravitational-Wave Observatory (LIGO) Data Grid via Open Science Grid technology. To enable this computational infrastructure, we configured, for the first time, a LIGO Data Grid Tier-1 Center that can submit heterogeneous LIGO workflows using Open Science Grid facilities. To enable a seamless connection between the LIGO Data Grid and Blue Waters via Open Science Grid, we utilize Shifter to containerize LIGO's workflow software. This work represents the first time Open Science Grid, Shifter, and Blue Waters have been unified to tackle a scientific problem and, in particular, it is the first time a framework of this nature has been used in the context of large-scale gravitational wave data analysis. This new framework has been used in the last several weeks of LIGO's second discovery campaign to run the most computationally demanding gravitational wave search workflows on Blue Waters, and to accelerate discovery in the emergent field of gravitational wave astrophysics. We discuss the implications of this novel framework for a wider ecosystem of High Performance Computing users. Comment: 10 pages, 10 figures. Accepted as a Full Research Paper to the 13th IEEE International Conference on eScience