90 research outputs found

    Towards a Workload for Evolutionary Analytics

    Emerging data analysis involves the ingestion and exploration of new data sets, application of complex functions, and frequent query revisions based on observing prior query answers. We call this new type of analysis evolutionary analytics and identify its properties. This type of analysis is not well represented by current benchmark workloads. In this paper, we present a workload and identify several metrics to test system support for evolutionary analytics. Along with our metrics, we present methodologies for running the workload that capture this analytical scenario. Comment: 10 pages

    Mapping Datasets to Object Storage System

    Access libraries such as ROOT and HDF5 allow users to interact with datasets using high-level abstractions, like coordinate systems and associated slicing operations. Unfortunately, the implementations of access libraries are based on outdated assumptions about storage system interfaces and are generally unable to fully benefit from modern fast storage devices. The situation is getting worse with rapidly evolving storage devices such as non-volatile memory and ever larger datasets. This project explores distributed dataset mapping infrastructures that can integrate and scale out existing access libraries using Ceph's extensible object model, avoiding re-implementation or even modification of these access libraries as much as possible. These programmable storage extensions, coupled with our distributed dataset mapping techniques, enable: 1) access library operations to be offloaded to storage system servers, 2) the independent evolution of access libraries and storage systems, and 3) full use of the existing load balancing, elasticity, and failure management of distributed storage systems like Ceph. They also create more opportunities for server-local optimizations. For example, storage servers might include local key/value stores combined with chunk stores that require different optimizations than a local file system. As storage servers evolve to support new storage devices like non-volatile memory, these server-local optimizations can be implemented while minimizing disruptions to applications. We will report progress on the means by which distributed dataset mapping can be abstracted over particular access libraries, including access libraries for ROOT data, and how we address some of the challenges revolving around data partitioning and composability of access operations.

    SciHadoop: Array-based query processing in Hadoop

    Hadoop has become the de facto platform for large-scale data analysis in commercial applications, and increasingly so in scientific applications. However, Hadoop's byte stream data model causes inefficiencies when used to process scientific data that is commonly stored in highly structured, array-based binary file formats. This limits the scalability of Hadoop applications in science. We introduce SciHadoop, a Hadoop plugin allowing scientists to specify logical queries over array-based data models. SciHadoop executes queries as map/reduce programs defined over the logical data model. We describe the implementation of a SciHadoop prototype for NetCDF data sets, and quantify the performance of five separate optimizations that address the following goals for a representative holistic aggregate function query: reduce total data transfers, reduce remote reads, and reduce unnecessary reads. Two optimizations allow holistic functions to be evaluated opportunistically during the map phase; two additional optimizations intelligently partition input data to increase read locality; and one optimization avoids block scans by examining the data dependencies of an executing query to prune input partitions. Experiments involving a holistic function show run-time improvements of up to 8x, with drastic reductions of I/O, both locally and over the network.

    Novel Lineage of Methicillin-Resistant Staphylococcus aureus, Hong Kong

    To determine whether spa type of methicillin-resistant Staphylococcus aureus in pigs belonged to sequence type (ST) 398, we analyzed nasal swabs from pig carcasses at Hong Kong markets in 2008. ST9 belonging to spa type t899 was found for 16/100 samples, which indicates that a distinct lineage has emerged in pigs.

    The Imaging X-ray Polarimetry Explorer (IXPE): Technical Overview

    The Imaging X-ray Polarimetry Explorer (IXPE) will expand the information space for study of cosmic sources by adding linear polarization to the properties (time, energy, and position) observed in x-ray astronomy. Selected in 2017 January as a NASA Astrophysics Small Explorer (SMEX) mission, IXPE will be launched into an equatorial orbit in 2021. The IXPE mission will provide scientifically meaningful measurements of the x-ray polarization of a few dozen sources in the 2-8 keV band, including polarization maps of several x-ray-bright extended sources and phase-resolved polarimetry of many bright pulsating x-ray sources.

    Act now against new NHS competition regulations: an open letter to the BMA and the Academy of Medical Royal Colleges calls on them to make a joint public statement of opposition to the amended section 75 regulations.


    The Compact Linear Collider (CLIC) - 2018 Summary Report


    HE-LHC: The High-Energy Large Hadron Collider: Future Circular Collider Conceptual Design Report Volume 4

    In response to the 2013 Update of the European Strategy for Particle Physics (EPPSU), the Future Circular Collider (FCC) study was launched as a world-wide international collaboration hosted by CERN. The FCC study covered an energy-frontier hadron collider (FCC-hh), a highest-luminosity high-energy lepton collider (FCC-ee), the corresponding 100 km tunnel infrastructure, as well as the physics opportunities of these two colliders, and a high-energy LHC, based on FCC-hh technology. This document constitutes the third volume of the FCC Conceptual Design Report, devoted to the hadron collider FCC-hh. It summarizes the FCC-hh physics discovery opportunities, presents the FCC-hh accelerator design, performance reach, and staged operation plan, discusses the underlying technologies, the civil engineering and technical infrastructure, and also sketches a possible implementation. Combining ingredients from the Large Hadron Collider (LHC), the high-luminosity LHC upgrade and adding novel technologies and approaches, the FCC-hh design aims at significantly extending the energy frontier to 100 TeV. Its unprecedented centre-of-mass collision energy will make the FCC-hh a unique instrument to explore physics beyond the Standard Model, offering great direct sensitivity to new physics and discoveries