Folate Network Genetic Variation, Plasma Homocysteine, and Global Genomic Methylation Content: A Genetic Association Study
Background: Sequence variants in genes functioning in folate-mediated one-carbon metabolism are hypothesized to lead to changes in levels of homocysteine and DNA methylation, which, in turn, are associated with risk of cardiovascular disease. Methods: 330 SNPs in 52 genes were studied in relation to plasma homocysteine and global genomic DNA methylation. SNPs were selected based on functional effects and gene coverage, and assays were completed on the Illumina Goldengate platform. Age-, smoking-, and nutrient-adjusted genotype-phenotype associations were estimated in regression models. Results: Using a nominal P threshold of 0.005 for statistical significance, 20 SNPs were associated with plasma homocysteine, 8 with Alu methylation, and 1 with LINE-1 methylation. Using a more stringent false discovery rate threshold, SNPs in the FTCD, SLC19A1, and SLC19A3 genes remained associated with plasma homocysteine. Gene by vitamin B-6 interactions were identified for both Alu and LINE-1 methylation, and epistatic interactions with the MTHFR rs1801133 SNP were identified for the plasma homocysteine phenotype. Pleiotropy involving the MTHFD1L and SARDH genes for both plasma homocysteine and Alu methylation phenotypes was identified. Conclusions: No single gene was associated with all three phenotypes, and the set of the most statistically significant SNPs predictive of homocysteine or Alu or LINE-1 methylation was unique to each phenotype. Genetic variation in folate-mediated one-carbon metabolism, other than the well-known effects of the MTHFR c.665C>T (known as c.677 C>T, rs1801133, p.Ala222Val), is predictive of cardiovascular disease biomarkers.
Knowledge Transfer from High-Resource to Low-Resource Programming Languages for Code LLMs
Over the past few years, Large Language Models of Code (Code LLMs) have
started to have a significant impact on programming practice. Code LLMs are
also emerging as a building block for research in programming languages and
software engineering. However, the quality of code produced by a Code LLM
varies significantly by programming language. Code LLMs produce impressive
results on programming languages that are well represented in their training
data (e.g., Java, Python, or JavaScript), but struggle with low-resource
languages, like OCaml and Racket.
This paper presents an effective approach for boosting the performance of
Code LLMs on low-resource languages using semi-synthetic data. Our approach
generates high-quality datasets for low-resource languages, which can then be
used to fine-tune any pretrained Code LLM. Our approach, called MultiPL-T,
translates training data from high-resource languages into training data for
low-resource languages. We apply our approach to generate tens of thousands of
new, validated training items for Racket, OCaml, and Lua from Python. Moreover,
we use an open dataset (The Stack) and model (StarCoderBase), which allow us to
decontaminate benchmarks and train models on this data without violating the
model license.
With MultiPL-T generated data, we present fine-tuned versions of
StarCoderBase that achieve state-of-the-art performance for Racket, OCaml, and
Lua on benchmark problems. For Lua, our fine-tuned model achieves the same
performance as StarCoderBase does on Python -- a very high-resource language -- on
the MultiPL-E benchmarks. For Racket and OCaml, we double their performance on
MultiPL-E, bringing their performance close to higher-resource languages such
as Ruby and C#.
Multiethnic meta-analysis identifies ancestry-specific and cross-ancestry loci for pulmonary function
Nearly 100 loci have been identified for pulmonary function, almost exclusively in studies of European ancestry populations. We extend previous research by meta-analyzing genome-wide association studies of 1000 Genomes imputed variants in relation to pulmonary function in a multiethnic population of 90,715 individuals of European (N = 60,552), African (N = 8429), Asian (N = 9959), and Hispanic/Latino (N = 11,775) ethnicities. We identify over 50 additional loci at genome-wide significance in ancestry-specific or multiethnic meta-analyses. Using recent fine-mapping methods incorporating functional annotation, gene expression, and differences in linkage disequilibrium between ethnicities, we further shed light on potential causal variants and genes at known and newly identified loci. Several of the novel genes encode proteins with predicted or established drug targets, including KCNK2 and CDK12. Our study highlights the utility of multiethnic and integrative genomics approaches to extend existing knowledge of the genetics of l
Confronting Arctic Troposphere, Clouds, and Surface Energy Budget Representations in Regional Climate Models With Observations
A coordinated regional climate model (RCM) evaluation and intercomparison project based on observations from a July–October 2014 trans‐Arctic Ocean field experiment (ACSE‐Arctic Clouds during Summer Experiment) is presented. Six state‐of‐the‐art RCMs were constrained with common reanalysis lateral boundary forcing and upper troposphere nudging techniques to explore how the RCMs represented the evolution of the surface energy budget (SEB) components and their relation to cloud properties. We find that the modeled differences in the SEB components are mainly a direct consequence of the RCM treatment of cloud and cloud‐radiative interactions. The RCMs could be separated into groups by their overestimation or underestimation of cloud liquid. While radiative and turbulent heat flux errors were relatively large, they often invoke compensating errors. In addition, having the surface sea‐ice concentrations constrained by the reanalysis or satellite observations limited how errors in the modeled radiative fluxes could affect the SEB and ultimately the surface evolution and its coupling with lower tropospheric mixing and cloud properties. Many of these results are consistent with RCM biases reported in studies over a decade ago. One of the six models was a fully coupled ocean‐ice‐atmosphere model. Despite the biases in overestimating cloud liquid, and associated SEB errors due to too optically thick clouds, its simulations were useful in understanding how the fully coupled system is forced by, and responds to, the SEB evolution. Moving forward, we suggest that development of RCM studies needs to consider the fully coupled climate system.
An arctic hydrologic system in transition: Feedbacks and impacts on terrestrial, marine, and human life
The pace of change in the arctic system during recent decades has captured the world's attention. Observations and model simulations both indicate that the arctic experiences an amplified response to climate forcing relative to that at lower latitudes. At the core of these changes is the arctic hydrologic system, which includes ice, gaseous vapor in the atmosphere, liquid water in soils and fluvial networks on land, and the freshwater content of the ocean. The changes in stores and fluxes of freshwater have a direct impact on biological systems, not only of the arctic region itself, but also well beyond its bounds. In this investigation, we used a heuristic, graphical approach to distill the system into its fundamental parts, documented the key relationships between those parts as best we know them, and identified the feedback loops within the system. The analysis illustrates relationships that are well understood, but also reveals others that are either unfamiliar, uncertain, or unexplored. The graphical approach was used to provide a visual assessment of the arctic hydrologic system in one possible future state in which the Arctic Ocean is seasonally ice free.
Land Surface Climate in the Regional Arctic System Model
The article of record as published may be found at http://dx.doi.org/10.1175/JCLI-D-15-0415.1 The Regional Arctic System Model (RASM) is a fully coupled, regional Earth system model applied over the pan-Arctic domain. This paper discusses the implementation of the Variable Infiltration Capacity land surface model (VIC) in RASM and evaluates the ability of RASM, version 1.0, to capture key features of the land surface climate and hydrologic cycle for the period 1979-2014 in comparison with uncoupled VIC simulations, reanalysis datasets, satellite measurements, and in situ observations. RASM reproduces the dominant features of the land surface climatology in the Arctic, such as the amount and regional distribution of precipitation, the partitioning of precipitation between runoff and evapotranspiration, the effects of snow on the water and energy balance, and the differences in turbulent fluxes between the tundra and taiga biomes. Surface air temperature biases in RASM, compared to reanalysis datasets ERA-Interim and MERRA, are generally less than 2 degrees C; however, in the cold seasons there are local biases that exceed 6 degrees C. Compared to satellite observations, RASM captures the annual cycle of snow-covered area well, although melt progresses about two weeks faster than observations in the late spring at high latitudes. With respect to derived fluxes, such as latent heat or runoff, RASM is shown to have similar performance statistics as ERA-Interim while differing substantially from MERRA, which consistently overestimates the evaporative flux across the Arctic region. U.S. Department of Energy (DOE) [DE-FG02-07ER64460, DE-SC0006856, DE-SC0006178]; DO
