    Alexander Saxton — The Great Midland

    Get PDF

    Enabling On-Demand Database Computing with MIT SuperCloud Database Management System

    Full text link
    The MIT SuperCloud database management system allows for rapid creation and flexible execution of a variety of the latest scientific databases, including Apache Accumulo and SciDB. It is designed to permit these databases to run on a High Performance Computing Cluster (HPCC) platform as seamlessly as any other HPCC job. It ensures the seamless migration of the databases to the resources assigned by the HPCC scheduler and centralized storage of the database files when not running. It also permits snapshotting of databases to allow researchers to experiment and push the limits of the technology without concerns for data or productivity loss if the database becomes unstable. Comment: 6 pages; accepted to IEEE High Performance Extreme Computing (HPEC) conference 2015. arXiv admin note: text overlap with arXiv:1406.492
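
    The on-demand workflow described above can be illustrated with a minimal sketch: database files rest on centralized storage while idle, are staged to the node assigned by the scheduler before the database starts, and can be snapshotted before risky experiments. All paths, names, and the start command below are hypothetical stand-ins, not the actual SuperCloud tooling.

        # Minimal sketch of the on-demand database lifecycle described above.
        # Paths and commands are hypothetical; tempfile is used only so the
        # sketch runs anywhere without a real cluster.
        import shutil
        import subprocess
        import tempfile
        from pathlib import Path

        ROOT = Path(tempfile.mkdtemp())
        CENTRAL_STORE = ROOT / "central" / "mydb"   # stands in for centralized storage
        NODE_SCRATCH = ROOT / "scratch" / "mydb"    # stands in for the assigned node's local disk
        CENTRAL_STORE.mkdir(parents=True)
        (CENTRAL_STORE / "tablet_0.dat").write_text("demo data")

        def stage_to_node():
            """Migrate database files from central storage to the scheduler-assigned node."""
            shutil.copytree(CENTRAL_STORE, NODE_SCRATCH)

        def snapshot(tag: str):
            """Copy the current files so an unstable experiment cannot lose data."""
            shutil.copytree(NODE_SCRATCH, ROOT / "central" / f"mydb_snapshot_{tag}")

        def archive_back():
            """Return the (possibly modified) files to central storage when the job ends."""
            shutil.copytree(NODE_SCRATCH, CENTRAL_STORE, dirs_exist_ok=True)

        if __name__ == "__main__":
            stage_to_node()
            snapshot("before_experiment")
            subprocess.run(["echo", "start the database server here as a normal HPCC job step"], check=True)
            archive_back()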

    Lustre, Hadoop, Accumulo

    Full text link
    Data processing systems impose multiple views on data as it is processed by the system. These views include spreadsheets, databases, matrices, and graphs. There are a wide variety of technologies that can be used to store and process data through these different steps. The Lustre parallel file system, the Hadoop distributed file system, and the Accumulo database are all designed to address the largest and the most challenging data storage problems. There have been many ad-hoc comparisons of these technologies. This paper describes the foundational principles of each technology, provides simple models for assessing their capabilities, and compares the various technologies on a hypothetical common cluster. These comparisons indicate that Lustre provides 2x more storage capacity, is less likely to lose data during 3 simultaneous drive failures, and provides higher bandwidth on general purpose workloads. Hadoop can provide 4x greater read bandwidth on special purpose workloads. Accumulo provides 10,000x lower latency on random lookups than either Lustre or Hadoop, but Accumulo's bulk bandwidth is 10x less. Significant recent work has been done to enable mix-and-match solutions that allow Lustre, Hadoop, and Accumulo to be combined in different ways. Comment: 6 pages; accepted to IEEE High Performance Extreme Computing conference, Waltham, MA, 201
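
    The simple capability models mentioned above can be reduced to a back-of-the-envelope comparison. The sketch below applies the quoted ratios to an invented baseline; the baseline figures are illustrative only and are not the paper's hypothetical common cluster.

        # Back-of-the-envelope comparison using the ratios quoted in the abstract.
        # The baseline figures are invented for illustration; the paper derives its
        # own from component specifications on a hypothetical common cluster.
        baseline_capacity_pb = 1.0   # hypothetical Hadoop usable capacity (PB)
        baseline_bulk_gbps = 10.0    # hypothetical bulk bandwidth (GB/s)
        baseline_lookup_s = 1.0      # hypothetical file-system random-lookup latency (s)

        systems = {
            # Lustre: ~2x the storage capacity; its general-purpose bandwidth advantage
            # is described only qualitatively, so it is left at the baseline here.
            "Lustre":   (2.0 * baseline_capacity_pb, baseline_bulk_gbps, baseline_lookup_s),
            # Hadoop: ~4x read bandwidth, but only on special-purpose workloads.
            "Hadoop":   (baseline_capacity_pb, 4.0 * baseline_bulk_gbps, baseline_lookup_s),
            # Accumulo: ~10,000x lower random-lookup latency, ~10x less bulk bandwidth.
            "Accumulo": (baseline_capacity_pb, baseline_bulk_gbps / 10.0, baseline_lookup_s / 10_000.0),
        }

        for name, (capacity, bulk_bw, lookup) in systems.items():
            print(f"{name:8s} capacity={capacity:.1f} PB  bulk={bulk_bw:5.1f} GB/s  lookup={lookup:.1e} s")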

    The Economic Impacts of New Technologies and Promotions on the Australian Beef Industry

    Get PDF
    Around $100 million has been spent annually on R&D and promotion in the Australian red meat industries in recent years. Producer groups have been questioning the pay-offs from these investments. These pay-offs are also a public policy issue, since the coercive powers of government are used to underpin the levy system and government also contributes directly to research expenditures. In this thesis, an equilibrium displacement model (EDM) of the Australian beef industry is specified and simulated to study the returns from alternative research and promotion investments. The model is more disaggregated than existing studies of the beef industry. It provides an economic framework for cost-benefit analysis of various investments in the industry, as well as for examining the impacts of other exogenous changes such as government price and tax policies.
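
    The core of the equilibrium displacement approach can be sketched for a single market (the thesis model is far more disaggregated). With demand elasticity eta, supply elasticity eps, a proportional research-induced cost reduction k and a promotion-induced demand shift delta, small relative changes in price and quantity follow from setting the displaced demand and supply equal. The parameter values below are purely illustrative and are not taken from the thesis.

        # Single-market equilibrium displacement model (EDM) sketch.
        # E(x) denotes a small proportional change dx/x.
        # Demand: E(Q) = -eta*(E(P) - delta).  Supply: E(Q) = eps*(E(P) + k).
        def edm(eta: float, eps: float, k: float = 0.0, delta: float = 0.0):
            """Return (E(P), E(Q)) after a cost reduction k (R&D) and demand shift delta (promotion)."""
            ep = (eta * delta - eps * k) / (eps + eta)
            eq = eps * eta * (delta + k) / (eps + eta)
            return ep, eq

        # Example: a 2% research-induced cost reduction with no promotion shift
        ep, eq = edm(eta=1.2, eps=2.0, k=0.02)
        print(f"price change {ep:+.2%}, quantity change {eq:+.2%}")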

    Lessons Learned from a Decade of Providing Interactive, On-Demand High Performance Computing to Scientists and Engineers

    Full text link
    For decades, the use of HPC systems was limited to those in the physical sciences who had mastered their domain in conjunction with a deep understanding of HPC architectures and algorithms. During these same decades, consumer computing device advances produced tablets and smartphones that allow millions of children to interactively develop and share code projects across the globe. As the HPC community faces the challenges associated with guiding researchers from disciplines using high productivity interactive tools to effective use of HPC systems, it seems appropriate to revisit the assumptions surrounding the skills required for access to large computational systems. For over a decade, MIT Lincoln Laboratory has been supporting interactive, on-demand high performance computing by seamlessly integrating familiar high productivity tools to provide users with an increased number of design turns, rapid prototyping capability, and faster time to insight. In this paper, we discuss the lessons learned while supporting interactive, on-demand high performance computing from the perspectives of the users and of the team supporting the users and the system. Building on these lessons, we present an overview of current needs and the technical solutions we are building to lower the barrier to entry for new users from the humanities, social, and biological sciences. Comment: 15 pages, 3 figures, First Workshop on Interactive High Performance Computing (WIHPC) 2018 held in conjunction with ISC High Performance 2018 in Frankfurt, Germany

    Measuring the Impact of Spectre and Meltdown

    Full text link
    The Spectre and Meltdown flaws in modern microprocessors represent a new class of attacks that have been difficult to mitigate. The mitigations that have been proposed have known performance impacts. The reported magnitude of these impacts varies depending on the industry sector and expected workload characteristics. In this paper, we measure the performance impact on several workloads relevant to HPC systems. We show that the impact can be significant on both synthetic and realistic workloads. We also show that the performance penalties are difficult to avoid even in dedicated systems where security is a lesser concern.
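
    The penalty is most visible on system-call-heavy work, because mitigations such as kernel page-table isolation add cost to every user/kernel transition. The sketch below is a minimal, illustrative microbenchmark of that effect, not the HPC workloads measured in the paper: run it on the same machine with mitigations enabled and disabled (for example, booting Linux with mitigations=off) and compare the rates.

        # Minimal syscall-bound microbenchmark; illustrative only.
        import os
        import time

        def syscalls_per_second(n: int = 1_000_000) -> float:
            """Time n small pread() calls, each crossing the user/kernel boundary."""
            fd = os.open("/dev/zero", os.O_RDONLY)
            try:
                start = time.perf_counter()
                for _ in range(n):
                    os.pread(fd, 1, 0)
                elapsed = time.perf_counter() - start
            finally:
                os.close(fd)
            return n / elapsed

        if __name__ == "__main__":
            print(f"{syscalls_per_second():,.0f} syscalls/s")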

    Benchmarking SciDB Data Import on HPC Systems

    Full text link
    SciDB is a scalable, computational database management system that uses an array model for data storage. The array data model of SciDB makes it ideally suited for storing and managing large amounts of imaging data. SciDB is designed to support advanced analytics in-database, thus reducing the need for extracting data for analysis. It is designed to be massively parallel and can run on commodity hardware in a high performance computing (HPC) environment. In this paper, we present the performance of SciDB using simulated image data. The Dynamic Distributed Dimensional Data Model (D4M) software is used to implement the benchmark on a cluster running the MIT SuperCloud software stack. A peak performance of 2.2M database inserts per second was achieved on a single node of this system. We also show that SciDB and the D4M toolbox provide more efficient ways to access random sub-volumes of massive datasets compared to the traditional approaches of reading volumetric data from individual files. This work describes the D4M and SciDB tools we developed and presents the initial performance results. This performance was achieved by using parallel inserts, an in-database merging of arrays, as well as supercomputing techniques such as distributed arrays and single-program-multiple-data programming. Comment: 5 pages, 4 figures, IEEE High Performance Extreme Computing (HPEC) 2016, best paper finalist
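
    The insert rate came from batching and parallelism rather than row-at-a-time writes. The sketch below shows that pattern generically; insert_batch is a stand-in for a real client call, and none of this is the D4M or SciDB API used in the paper.

        # Generic illustration of the parallel, batched insert pattern described above.
        # Each worker owns a contiguous slice of the data, in a single-program-
        # multiple-data style; insert_batch stands in for a real bulk-insert call.
        import multiprocessing as mp
        import numpy as np

        def insert_batch(batch: np.ndarray) -> int:
            """Stand-in for a bulk insert; returns the number of rows 'written'."""
            return batch.shape[0]

        def worker(rows: np.ndarray, batch_size: int = 10_000) -> int:
            written = 0
            for start in range(0, rows.shape[0], batch_size):
                written += insert_batch(rows[start:start + batch_size])
            return written

        if __name__ == "__main__":
            data = np.random.rand(1_000_000, 3)            # simulated image voxels (x, y, value)
            chunks = np.array_split(data, mp.cpu_count())  # one contiguous slice per worker
            with mp.Pool() as pool:
                totals = pool.map(worker, chunks)
            print(f"inserted {sum(totals):,} rows across {len(chunks)} workers")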

    No evidence of differential impact of sunflower and rapeseed oil on biomarkers of coronary artery disease or chronic kidney disease in healthy adults with overweight and obesity: results from a randomised controlled trial

    Get PDF
    Purpose: The perceived benefits and risks associated with seed oil intake remain controversial, with a limited number of studies investigating the impact of intake on a range of compounds used as cardiometabolic markers. This study aimed to explore the proteomic and cardiometabolic effects of commonly consumed seed oils in the UK with different fatty acid profiles. Methods: In a parallel randomised control design, healthy adults (n = 84), aged 25–72 with overweight or obesity, were randomised to one of three groups: control (habitual diet, CON), 20 mL rapeseed oil per day (RO), or 20 mL sunflower oil per day (SO). Blood, spot urine and anthropometric measures were obtained at 0, 6 and 12 weeks. Proteomic biomarker analysis was conducted for coronary artery disease (CAD) and chronic kidney disease (CKD) using capillary electrophoresis coupled to mass spectrometry (CE-MS). Blood lipids, fasting blood glucose, glycative/oxidative stress and inflammatory markers were also analysed. Results: No differences in change between time points were observed between groups for CAD or CKD peptide fingerprint scores, and no change was detected within groups for either score. No detectable differences were observed between groups at week 6 or 12 for the secondary outcomes, except for median 8-isoprostane, which was ~50% higher in the SO group after 12 weeks compared to the RO and CON groups (p = 0.03). Conclusion: The replacement of habitual fat with either RO or SO for 12 weeks does not lead to an improvement or worsening of cardiovascular health markers in people with overweight or obesity. Trial registration: clinicaltrials.gov NCT04867629, retrospectively registered 30/04/2021.