8,994 research outputs found

    Harnessing the Power of Many: Extensible Toolkit for Scalable Ensemble Applications

    Full text link
    Many scientific problems require multiple distinct computational tasks to be executed in order to achieve a desired solution. We introduce the Ensemble Toolkit (EnTK) to address the challenges of scale, diversity and reliability they pose. We describe the design and implementation of EnTK, characterize its performance and integrate it with two distinct exemplar use cases: seismic inversion and adaptive analog ensembles. We perform nine experiments, characterizing EnTK overheads, strong and weak scalability, and the performance of two use case implementations, at scale and on production infrastructures. We show how EnTK meets the following general requirements: (i) implementing dedicated abstractions to support the description and execution of ensemble applications; (ii) support for execution on heterogeneous computing infrastructures; (iii) efficient scalability up to O(10^4) tasks; and (iv) fault tolerance. We discuss novel computational capabilities that EnTK enables and the scientific advantages arising thereof. We propose EnTK as an important addition to the suite of tools in support of production scientific computing

    Designing Traceability into Big Data Systems

    Full text link
    Providing an appropriate level of accessibility and traceability to data or process elements (so-called Items) in large volumes of data, often Cloud-resident, is an essential requirement in the Big Data era. Enterprise-wide data systems need to be designed from the outset to support usage of such Items across the spectrum of business use rather than from any specific application view. The design philosophy advocated in this paper is to drive the design process using a so-called description-driven approach which enriches models with meta-data and description and focuses the design process on Item re-use, thereby promoting traceability. Details are given of the description-driven design of big data systems at CERN, in health informatics and in business process management. Evidence is presented that the approach leads to design simplicity and consequent ease of management thanks to loose typing and the adoption of a unified approach to Item management and usage.Comment: 10 pages; 6 figures in Proceedings of the 5th Annual International Conference on ICT: Big Data, Cloud and Security (ICT-BDCS 2015), Singapore July 2015. arXiv admin note: text overlap with arXiv:1402.5764, arXiv:1402.575

    The INCF Digital Atlasing Program: Report on Digital Atlasing Standards in the Rodent Brain

    Get PDF
    The goal of the INCF Digital Atlasing Program is to provide the vision and direction necessary to make the rapidly growing collection of multidimensional data of the rodent brain (images, gene expression, etc.) widely accessible and usable to the international research community. This Digital Brain Atlasing Standards Task Force was formed in May 2008 to investigate the state of rodent brain digital atlasing, and formulate standards, guidelines, and policy recommendations.

Our first objective has been the preparation of a detailed document that includes the vision and specific description of an infrastructure, systems and methods capable of serving the scientific goals of the community, as well as practical issues for achieving
the goals. This report builds on the 1st INCF Workshop on Mouse and Rat Brain Digital Atlasing Systems (Boline et al., 2007, _Nature Preceedings_, doi:10.1038/npre.2007.1046.1) and includes a more detailed analysis of both the current state and desired state of digital atlasing along with specific recommendations for achieving these goals
    • …
    corecore