25 research outputs found

    ClimSim: A large multi-scale dataset for hybrid physics-ML climate emulation

    Modern climate projections lack adequate spatial and temporal resolution due to computational constraints. A consequence is inaccurate and imprecise predictions of critical processes such as storms. Hybrid methods that combine physics with machine learning (ML) have introduced a new generation of higher fidelity climate simulators that can sidestep Moore's Law by outsourcing compute-hungry, short, high-resolution simulations to ML emulators. However, this hybrid ML-physics simulation approach requires domain-specific treatment and has been inaccessible to ML experts because of a lack of training data and relevant, easy-to-use workflows. We present ClimSim, the largest-ever dataset designed for hybrid ML-physics research. It comprises multi-scale climate simulations, developed by a consortium of climate scientists and ML researchers. It consists of 5.7 billion pairs of multivariate input and output vectors that isolate the influence of locally-nested, high-resolution, high-fidelity physics on a host climate simulator's macro-scale physical state. The dataset is global in coverage, spans multiple years at high sampling frequency, and is designed such that resulting emulators are compatible with downstream coupling into operational climate simulators. We implement a range of deterministic and stochastic regression baselines to highlight the ML challenges and their scoring. The data (https://huggingface.co/datasets/LEAP/ClimSim_high-res) and code (https://leap-stc.github.io/ClimSim) are released openly to support the development of hybrid ML-physics and high-fidelity climate simulations for the benefit of science and society.

    Finishing the euchromatic sequence of the human genome

    The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.

    Stormwater Management Educational Materials for Central Massachusetts Municipalities

    Stormwater runoff is the leading cause of water pollution in the United States. To help with this issue, the United States Environmental Protection Agency issued an updated Municipal Separate Storm Sewer System permit for Massachusetts in April 2016, which includes more stringent requirements. Municipalities in Massachusetts anticipate struggling to comply with the permit given their limited resources. The goal of this project, in collaboration with the Massachusetts Department of Environmental Protection and the Central Massachusetts Regional Stormwater Coalition, was to develop educational materials to help municipal officials comply with the permit. From our interviews and survey, we created a compliance guideline and provided suggestions for municipalities on preparing for the permit.

    Some technical aspects of foreign trade statistics with special reference to valuation

    "International convention relating to economic statistics. Geneva, December 14, 1928": p. 211-216. Thesis (Ph.D.)--Catholic University of America. Bibliography: p. 217-247. Mode of access: Internet.