Revealing the structure of language model capabilities
Building a theoretical understanding of the capabilities of large language
models (LLMs) is vital for our ability to predict and explain the behavior of
these systems. Here, we investigate the structure of LLM capabilities by
extracting latent capabilities from patterns of individual differences across a
varied population of LLMs. Using a combination of Bayesian and frequentist
factor analysis, we analyzed data from 29 different LLMs across 27 cognitive
tasks. We found evidence that LLM capabilities are not monolithic. Instead,
they are better explained by three well-delineated factors that represent
reasoning, comprehension and core language modeling. Moreover, we found that
these three factors can explain a high proportion of the variance in model
performance. These results reveal a consistent structure in the capabilities of
different LLMs and demonstrate the multifaceted nature of these capabilities.
We also found that the three abilities show different relationships to model
properties such as model size and instruction tuning. These patterns help
refine our understanding of scaling laws and indicate that changes to a model
that improve one ability might simultaneously impair others. Based on these
findings, we suggest that benchmarks could be streamlined by focusing on tasks
that tap into each broad model ability. Comment: 10 pages, 3 figures + references and appendices, for data and
analysis code see https://github.com/RyanBurnell/revealing-LLM-capabilitie
Development of FuGO: An ontology for functional genomics investigations
The development of the Functional Genomics Investigation Ontology (FuGO) is a collaborative, international effort that will provide a resource for annotating functional genomics investigations, including the study design, protocols and instrumentation used, the data generated and the types of analysis performed on the data. FuGO will contain both terms that are universal to all functional genomics investigations and those that are domain specific. In this way, the ontology will serve as the "semantic glue" to provide a common understanding of data from across these disparate data
sources. In addition, FuGO will reference out to existing mature ontologies to avoid the need to duplicate these resources, and will do so in such a way as to enable their ease of use in annotation. This project is in the early stages of development; the paper will describe efforts to initiate the project, the scope and organization of the project, the work accomplished to date, and the challenges encountered, as well as future plans
Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program
The Trans-Omics for Precision Medicine (TOPMed) programme seeks to elucidate the genetic architecture and biology of heart, lung, blood and sleep disorders, with the ultimate goal of improving diagnosis, treatment and prevention of these diseases. The initial phases of the programme focused on whole-genome sequencing of individuals with rich phenotypic data and diverse backgrounds. Here we describe the TOPMed goals and design as well as the available resources and early insights obtained from the sequence data. The resources include a variant browser, a genotype imputation server, and genomic and phenotypic data that are available through dbGaP (Database of Genotypes and Phenotypes)(1). In the first 53,831 TOPMed samples, we detected more than 400 million single-nucleotide and insertion or deletion variants after alignment with the reference genome. Additional previously undescribed variants were detected through assembly of unmapped reads and customized analysis in highly variable loci. Among the more than 400 million detected variants, 97% have frequencies of less than 1% and 46% are singletons that are present in only one individual (53% among unrelated individuals). These rare variants provide insights into mutational processes and recent human evolutionary history. The extensive catalogue of genetic variation in TOPMed studies provides unique opportunities for exploring the contributions of rare and noncoding sequence variants to phenotypic variation. Furthermore, combining TOPMed haplotypes with modern imputation methods improves the power and reach of genome-wide association studies to include variants down to a frequency of approximately 0.01%
Axial Vector Charmonium and Bottomonium Hybrid Mass Predictions with QCD Sum-Rules
Axial vector charmonium and bottomonium hybrid masses are
determined via QCD Laplace sum-rules. Previous sum-rule studies in this channel
did not incorporate the dimension-six gluon condensate, which has been shown to
be important for and heavy quark hybrids. An updated analysis
of axial vector charmonium and bottomonium hybrids is presented, including the
effects of the dimension-six gluon condensate. The axial vector charmonium and
bottomonium hybrid masses are predicted to be 5.13 GeV and 11.32 GeV,
respectively. We discuss the implications of this result for the
charmonium-like XYZ states and the charmonium hybrid multiplet structure
observed in recent lattice calculations. Comment: 10 pages, 7 figures. Updated to match published version.
Exact Solution for the Exterior Field of a Rotating Neutron Star
A four-parameter class of exact asymptotically flat solutions of the
Einstein-Maxwell equations involving only rational functions is presented. It
is able to describe the exterior field of a slowly or rapidly rotating neutron
star with poloidal magnetic field. Comment: Accepted for publication in Phys. Rev. D as Rapid Communication. 8
pages, 2 eps figures.
Heavy-Light Mesons with Quenched Lattice NRQCD: Results on Decay Constants
We present a quenched lattice calculation of heavy-light meson decay
constants, using non-relativistic (NRQCD) heavy quarks in the mass region of
the quark and heavier, and clover-improved light quarks. The NRQCD
Hamiltonian and the heavy-light current include the corrections at first order
in the expansion in the inverse heavy quark mass. We study the dependence of
the decay constants on the heavy meson mass , for light quarks with the tree
level ( = 1), as well as the tadpole improved clover coefficient. We
compare decay constants from NRQCD with results from clover () heavy
quarks.
Having calculated the current renormalisation constant in one-loop
perturbation theory, we demonstrate how the heavy mass dependence of the
pseudoscalar decay constants changes after renormalisation. For the first time,
we quote a result for from NRQCD including the full one-loop matching
factors at . Comment: 45 pages, latex, 24 postscript figures.
Assessing the Evolutionary Impact of Amino Acid Mutations in the Human Genome
Quantifying the distribution of fitness effects among newly arising mutations in the human genome is key to resolving important debates in medical and evolutionary genetics. Here, we present a method for inferring this distribution using Single Nucleotide Polymorphism (SNP) data from a population with non-stationary demographic history (such as that of modern humans). Application of our method to 47,576 coding SNPs found by direct resequencing of 11,404 protein coding-genes in 35 individuals (20 European Americans and 15 African Americans) allows us to assess the relative contribution of demographic and selective effects to patterning amino acid variation in the human genome. We find evidence of an ancient population expansion in the sample with African ancestry and a relatively recent bottleneck in the sample with European ancestry. After accounting for these demographic effects, we find strong evidence for great variability in the selective effects of new amino acid replacing mutations. In both populations, the patterns of variation are consistent with a leptokurtic distribution of selection coefficients (e.g., gamma or log-normal) peaked near neutrality. Specifically, we predict 27–29% of amino acid changing (nonsynonymous) mutations are neutral or nearly neutral (|s|<0.01%), 30–42% are moderately deleterious (0.01%<|s|<1%), and nearly all the remainder are highly deleterious or lethal (|s|>1%). Our results are consistent with 10–20% of amino acid differences between humans and chimpanzees having been fixed by positive selection with the remainder of differences being neutral or nearly neutral. Our analysis also predicts that many of the alleles identified via whole-genome association mapping may be selectively neutral or (formerly) positively selected, implying that deleterious genetic variation affecting disease phenotype may be missed by this widely used approach for mapping genes underlying complex traits.
Angular Momentum and Vortex Formation in Bose-Einstein-Condensed Cold Dark Matter Haloes
(Abridged) Extensions of the standard model of particle physics predict very
light bosons, ranging from about 10^{-5} eV for the QCD axion to 10^{-33} eV
for ultra-light particles, which could be the cold dark matter (CDM) in the
Universe. If so, their phase-space density must be high enough to form a
Bose-Einstein condensate (BEC). The fluid-like nature of BEC-CDM dynamics
differs from that of standard collisionless CDM (sCDM), so observations of
galactic haloes may distinguish them. sCDM has problems with galaxy
observations on small scales, which BEC-CDM may overcome for a large range of
particle mass m and self-interaction strength g. For quantum-coherence on
galactic scales of radius R and mass M, either the de-Broglie wavelength
lambda_deB <~ R, which requires m >~ m_H \cong 10^{-25}(R/100 kpc)^{-1/2}(M/10^{12}
M_solar)^{-1/2} eV, or else lambda_deB << R but self-interaction balances
gravity, requiring m >> m_H and g >> g_H \cong 2 x 10^{-64} (R/100
kpc)(M/10^{12} M_solar)^{-1} eV cm^3. Here we study the largely-neglected
effects of angular momentum. Spin parameters lambda \cong 0.05 are expected
from tidal-torquing by large-scale structure, just as for sCDM. Since lab BECs
develop quantum vortices if rotated rapidly enough, we ask if this angular
momentum is sufficient to form vortices in BEC haloes, affecting their
structure with potentially observable consequences. The minimum angular
momentum for this, L_{QM} = , requires m >= 9.5 m_H for lambda =
0.05, close to the particle mass required to influence structure on galactic
scales. We study the equilibrium of self-gravitating, rotating BEC haloes which
satisfy the Gross-Pitaevskii-Poisson equations, to calculate if and when
vortices are energetically favoured. Vortices form as long as self-interaction
is strong enough, which includes a large part of the range of m and g of
interest for BEC-CDM haloes. Comment: Several typos and numerical typos (incl. in Fig.6, Table 2 and Table
3) have been corrected and references have been updated after proof-reading
stage; MNRAS in press; 29 pages; 11 figures.
The gut microbiota of people with asthma influences lung inflammation in gnotobiotic mice
The gut microbiota in early childhood is linked to asthma risk, but may continue to affect older patients with asthma. Here, we profile the gut microbiota of 38 children (19 asthma, median age 8) and 57 adults (17 asthma, median age 28) by 16S rRNA sequencing and find individuals with asthma harbored compositional differences from healthy controls in both adults and children. We develop a model to aid the design of mechanistic experiments in gnotobiotic mice and show enterotoxigeni
Resequencing Candidate Genes Implicates Rare Variants in Asthma Susceptibility
Common variation in over 100 genes has been implicated in the risk of developing asthma, but the contribution of rare variants to asthma susceptibility remains largely unexplored. We selected nine genes that showed the strongest signatures of weak purifying selection from among 53 candidate asthma-associated genes, and we sequenced the coding exons and flanking noncoding regions in 450 asthmatic cases and 515 nonasthmatic controls. We observed an overall excess of p values <0.05 (p = 0.02), and rare variants in four genes (AGT, DPP10, IKBKAP, and IL12RB1) contributed to asthma susceptibility among African Americans. Rare variants in IL12RB1 were also associated with asthma susceptibility among European Americans, despite the fact that the majority of rare variants in IL12RB1 were specific to either one of the populations. The combined evidence of association with rare noncoding variants in IL12RB1 remained significant (p = 3.7 × 10^{-4}) after correcting for multiple testing. Overall, the contribution of rare variants to asthma susceptibility was predominantly due to noncoding variants in sequences flanking the exons, although nonsynonymous rare variants in DPP10 and in IL12RB1 were associated with asthma in African Americans and European Americans, respectively. This study provides evidence that rare variants contribute to asthma susceptibility. Additional studies are required for testing whether prioritizing genes for resequencing on the basis of signatures of purifying selection is an efficient means of identifying novel rare variants that contribute to complex disease.