268 research outputs found

    Analysis on vector product spaces

    AIG Email from Sherwood to Cassano and Viniar regarding CDO Spreadsheet

    AIG Email from Michael Roemer to Joe Cassano regarding FP Meeting

    Knowledge Transfer from High-Resource to Low-Resource Programming Languages for Code LLMs

    Over the past few years, Large Language Models of Code (Code LLMs) have started to have a significant impact on programming practice. Code LLMs are also emerging as a building block for research in programming languages and software engineering. However, the quality of code produced by a Code LLM varies significantly by programming language. Code LLMs produce impressive results on programming languages that are well represented in their training data (e.g., Java, Python, or JavaScript), but struggle with low-resource languages, like OCaml and Racket. This paper presents an effective approach for boosting the performance of Code LLMs on low-resource languages using semi-synthetic data. Our approach generates high-quality datasets for low-resource languages, which can then be used to fine-tune any pretrained Code LLM. Our approach, called MultiPL-T, translates training data from high-resource languages into training data for low-resource languages. We apply our approach to generate tens of thousands of new, validated training items for Racket, OCaml, and Lua from Python. Moreover, we use an open dataset (The Stack) and model (StarCoderBase), which allow us to decontaminate benchmarks and train models on this data without violating the model license. With MultiPL-T generated data, we present fine-tuned versions of StarCoderBase that achieve state-of-the-art performance for Racket, OCaml, and Lua on benchmark problems. For Lua, our fine-tuned model achieves the same performance on the MultiPL-E benchmarks as StarCoderBase does on Python (a very high-resource language). For Racket and OCaml, we double their performance on MultiPL-E, bringing their performance close to higher-resource languages such as Ruby and C#.

    Multiethnic meta-analysis identifies ancestry-specific and cross-ancestry loci for pulmonary function

    Nearly 100 loci have been identified for pulmonary function, almost exclusively in studies of European ancestry populations. We extend previous research by meta-analyzing genome-wide association studies of 1000 Genomes imputed variants in relation to pulmonary function in a multiethnic population of 90,715 individuals of European (N = 60,552), African (N = 8429), Asian (N = 9959), and Hispanic/Latino (N = 11,775) ethnicities. We identify over 50 additional loci at genome-wide significance in ancestry-specific or multiethnic meta-analyses. Using recent fine-mapping methods incorporating functional annotation, gene expression, and differences in linkage disequilibrium between ethnicities, we further shed light on potential causal variants and genes at known and newly identified loci. Several of the novel genes encode proteins with predicted or established drug targets, including KCNK2 and CDK12. Our study highlights the utility of multiethnic and integrative genomics approaches to extend existing knowledge of the genetics of l

    Confronting Arctic Troposphere, Clouds, and Surface Energy Budget Representations in Regional Climate Models With Observations

    A coordinated regional climate model (RCM) evaluation and intercomparison project based on observations from a July–October 2014 trans‐Arctic Ocean field experiment (ACSE, the Arctic Clouds during Summer Experiment) is presented. Six state‐of‐the‐art RCMs were constrained with common reanalysis lateral boundary forcing and upper troposphere nudging techniques to explore how the RCMs represented the evolution of the surface energy budget (SEB) components and their relation to cloud properties. We find that the modeled differences in the SEB components are mainly a direct consequence of the RCM treatment of cloud and cloud‐radiative interactions. The RCMs could be separated into groups by their overestimation or underestimation of cloud liquid. While radiative and turbulent heat flux errors were relatively large, they often invoke compensating errors. In addition, having the surface sea‐ice concentrations constrained by the reanalysis or satellite observations limited how errors in the modeled radiative fluxes could affect the SEB and ultimately the surface evolution and its coupling with lower tropospheric mixing and cloud properties. Many of these results are consistent with RCM biases reported in studies over a decade ago. One of the six models was a fully coupled ocean‐ice‐atmosphere model. Despite the biases in overestimating cloud liquid, and associated SEB errors due to too optically thick clouds, its simulations were useful in understanding how the fully coupled system is forced by, and responds to, the SEB evolution. Moving forward, we suggest that development of RCM studies needs to consider the fully coupled climate system.

    An arctic hydrologic system in transition: Feedbacks and impacts on terrestrial, marine, and human life

    The pace of change in the arctic system during recent decades has captured the world's attention. Observations and model simulations both indicate that the arctic experiences an amplified response to climate forcing relative to that at lower latitudes. At the core of these changes is the arctic hydrologic system, which includes ice, gaseous vapor in the atmosphere, liquid water in soils and fluvial networks on land, and the freshwater content of the ocean. The changes in stores and fluxes of freshwater have a direct impact on biological systems, not only of the arctic region itself, but also well beyond its bounds. In this investigation, we used a heuristic, graphical approach to distill the system into its fundamental parts, documented the key relationships between those parts as best we know them, and identified the feedback loops within the system. The analysis illustrates relationships that are well understood, but also reveals others that are either unfamiliar, uncertain, or unexplored. The graphical approach was used to provide a visual assessment of the arctic hydrologic system in one possible future state in which the Arctic Ocean is seasonally ice free.

    Land Surface Climate in the Regional Arctic System Model

    The article of record as published may be found at http://dx.doi.org/10.1175/JCLI-D-15-0415.1 The Regional Arctic System Model (RASM) is a fully coupled, regional Earth system model applied over the pan-Arctic domain. This paper discusses the implementation of the Variable Infiltration Capacity land surface model (VIC) in RASM and evaluates the ability of RASM, version 1.0, to capture key features of the land surface climate and hydrologic cycle for the period 1979-2014 in comparison with uncoupled VIC simulations, reanalysis datasets, satellite measurements, and in situ observations. RASM reproduces the dominant features of the land surface climatology in the Arctic, such as the amount and regional distribution of precipitation, the partitioning of precipitation between runoff and evapotranspiration, the effects of snow on the water and energy balance, and the differences in turbulent fluxes between the tundra and taiga biomes. Surface air temperature biases in RASM, compared to reanalysis datasets ERA-Interim and MERRA, are generally less than 2 degrees C; however, in the cold seasons there are local biases that exceed 6 degrees C. Compared to satellite observations, RASM captures the annual cycle of snow-covered area well, although melt progresses about two weeks faster than observations in the late spring at high latitudes. With respect to derived fluxes, such as latent heat or runoff, RASM is shown to have similar performance statistics as ERA-Interim while differing substantially from MERRA, which consistently overestimates the evaporative flux across the Arctic region. U.S. Department of Energy (DOE) [DE-FG02-07ER64460, DE-SC0006856, DE-SC0006178]; DO