68 research outputs found
Bounding generalization error with input compression: An empirical study with infinite-width networks
Estimating the Generalization Error (GE) of Deep Neural Networks (DNNs) is an
important task that often relies on availability of held-out data. The ability
to better predict GE based on a single training set may yield overarching DNN
design principles to reduce a reliance on trial-and-error, along with other
performance assessment advantages. In search of a quantity relevant to GE, we
investigate the Mutual Information (MI) between the input and final layer
representations, using the infinite-width DNN limit to bound MI. An existing
input compression-based GE bound is used to link MI and GE. To the best of our
knowledge, this represents the first empirical study of this bound. In our
attempt to empirically falsify the theoretical bound, we find that it is often
tight for best-performing models. Furthermore, it detects randomization of
training labels in many cases, reflects test-time perturbation robustness, and
works well given only few training samples. These results are promising given
that input compression is broadly applicable where MI can be estimated with
confidence. Comment: 12 pages main content, 26 pages total
Predictions for the X-ray circumgalactic medium of edge-on discs and spheroids
We investigate how the X-ray circumgalactic medium (CGM) of present-day
galaxies depends on galaxy morphology and azimuthal angle using mock
observations generated from the EAGLE cosmological hydrodynamic simulation. By
creating mock stacks of eROSITA-observed galaxies oriented to be edge-on,
we make several observationally-testable predictions for galaxies in the
stellar mass range M. The soft X-ray CGM of
disc galaxies is between 60 and 100\% brighter along the semi-major axis
compared to the semi-minor axis, between 10-30 kpc. This azimuthal dependence
is a consequence of the hot ( K) CGM being non-spherical: specifically
it is flattened along the minor axis such that denser and more luminous gas
resides in the disc plane and co-rotates with the galaxy. Outflows enrich and
heat the CGM preferentially perpendicular to the disc, but we do not find an
observationally-detectable signature along the semi-minor axis. Spheroidal
galaxies have hotter CGMs than disc galaxies related to spheroids residing at
higher halo masses, which may be measurable through hardness ratios spanning
the keV band. While spheroids appear to have brighter CGMs than discs
for the selected fixed bin, this owes to spheroids having higher
stellar and halo masses within that bin, and obscures the fact that
both simulated populations have similar total CGM luminosities at the exact
same . Discs have brighter emission inside 20 kpc and more steeply
declining profiles with radius than spheroids. We predict that the
eROSITA 4-year all-sky survey should detect many of the signatures we predict
here, although targeted follow-up observations of highly inclined nearby discs
after the survey may be necessary to observe some of our azimuthally-dependent
predictions. Comment: 12 pages, 11 figures, 1 table. Submitted to MNRAS. Comments welcome
Eight common genetic variants associated with serum DHEAS levels suggest a key role in ageing mechanisms
Dehydroepiandrosterone sulphate (DHEAS) is the most abundant circulating steroid secreted by the adrenal glands, yet its function is unknown. Its serum concentration declines significantly with increasing age, which has led to speculation that a relative DHEAS deficiency may contribute to the development of common age-related diseases or diminished longevity. We conducted a meta-analysis of genome-wide association data with 14,846 individuals and identified eight independent common SNPs associated with serum DHEAS concentrations. Genes at or near the identified loci include ZKSCAN5 (rs11761528; p = 3.15 × 10^-36), SULT2A1 (rs2637125; p = 2.61 × 10^-19), ARPC1A (rs740160; p = 1.56 × 10^-16), TRIM4 (rs17277546; p = 4.50 × 10^-11), BMF (rs7181230; p = 5.44 × 10^-11), HHEX (rs2497306; p = 4.64 × 10^-9), BCL2L11 (rs6738028; p = 1.72 × 10^-8), and CYP2C9 (rs2185570; p = 2.29 × 10^-8). These genes are associated with type 2 diabetes, lymphoma, actin filament assembly, drug and xenobiotic metabolism, and zinc finger proteins. Several SNPs were associated with changes in gene expression levels, and the related genes are connected to biological pathways linking DHEAS with ageing. This study provides much-needed insight into the function of DHEAS.
Allelic heterogeneity and more detailed analyses of known loci explain additional phenotypic variation and reveal complex patterns of association
The identification of multiple signals at individual loci could explain additional phenotypic variance ("missing heritability") of common traits, and help identify causal genes. We examined gene expression levels as a model trait because of the large number of strong genetic effects acting in cis. Using expression profiles from 613 individuals, we performed genome-wide single nucleotide polymorphism (SNP) analyses to identify cis-expression quantitative trait loci (eQTLs), and conditional analysis to identify second signals. We examined patterns of association when accounting for multiple SNPs at a locus and when including additional SNPs from the 1000 Genomes Project. We identified 1298 cis-eQTLs at an approximate false discovery rate of 0.01, of which 118 (9%) showed evidence of a second independent signal. For this subset of 118 traits, accounting for two signals resulted in an average 31% increase in phenotypic variance explained (Wilcoxon P < 0.0001). The association of SNPs with cis gene expression could increase, stay similar or decrease in significance when accounting for linkage disequilibrium with second signals at the same locus. Pairs of SNPs increasing in significance tended to have gene expression increasing alleles on opposite haplotypes, whereas pairs of SNPs decreasing in significance tended to have gene expression increasing alleles on the same haplotypes. Adding data from the 1000 Genomes Project showed that apparently independent signals could potentially be explained by a single association signal. Our results show that accounting for multiple variants at a locus will increase the variance explained in a substantial fraction of loci, but that allelic heterogeneity will be difficult to define without resequencing loci and functional work.
High Genetic Diversity among Community-Associated Staphylococcus aureus in Europe: Results from a Multicenter Study
Background: Several studies have addressed the epidemiology of community-associated Staphylococcus aureus (CA-SA) in Europe; nonetheless, a comprehensive perspective remains unclear. In this study, we aimed to describe the population structure of CA-SA and to shed light on the origin of methicillin-resistant S. aureus (MRSA) on this continent. Methods and Findings: A total of 568 colonization and infection isolates, comprising both MRSA and methicillin-susceptible S. aureus (MSSA), were recovered in 16 European countries, from community and community-onset infections. The genetic background of isolates was characterized by molecular typing techniques (spa typing, pulsed-field gel electrophoresis and multilocus sequence typing) and the presence of PVL and ACME was tested by PCR. MRSA were further characterized by SCCmec typing. We found that 59% of all isolates were associated with community-associated clones. Most MRSA were related to USA300 (ST8-IVa and variants) (40%), followed by the European clone (ST80-IVc and derivatives) (28%) and the Taiwan clone (ST59-IVa and related clonal types) (15%). A total of 83% of MRSA carried Panton-Valentine leukocidin (PVL) and 14% carried the arginine catabolic mobile element (ACME). Surprisingly, we found a high genetic diversity among MRSA clonal types (ST-SCCmec), Simpson's index of diversity = 0.852 (0.788–0.916). Specifically, about half of the isolates carried novel associations between genetic background and SCCmec. Analysis by BURP showed that some CA-MSSA and CA-MRS
Association of the Type 2 Diabetes Mellitus Susceptibility Gene, TCF7L2, with Schizophrenia in an Arab-Israeli Family Sample
Many reports in different populations have demonstrated linkage of the 10q24–q26 region to schizophrenia, thus encouraging further analysis of this locus for detection of specific schizophrenia genes. Our group previously reported linkage of the 10q24–q26 region to schizophrenia in a unique, homogeneous sample of Arab-Israeli families with multiple schizophrenia-affected individuals, under a dominant model of inheritance. To further explore this candidate region and identify specific susceptibility variants within it, we performed a re-analysis of the 10q24–q26 genotype data, taken from our previous genome-wide association study (GWAS) (Alkelai et al., 2011). We analyzed 2089 SNPs in an extended sample of 57 Arab-Israeli families (189 genotyped individuals), under the dominant model of inheritance, which best fits this locus according to previously performed MOD score analysis. We found significant association with schizophrenia of the TCF7L2 gene intronic SNP, rs12573128 (p = 7.01 × 10^-6), and of the nearby intergenic SNP, rs1033772 (p = 6.59 × 10^-6), which is positioned between TCF7L2 and HABP2. TCF7L2 is one of the best confirmed susceptibility genes for type 2 diabetes (T2D) among different ethnic groups, has a role in pancreatic beta cell function and may contribute to the comorbidity of schizophrenia and T2D. These preliminary results independently support previous findings regarding a possible role of TCF7L2 in susceptibility to schizophrenia, and strengthen the importance of integrating linkage analysis models of inheritance while performing association analyses in regions of interest. Further validation studies in additional populations are required.
Characterization of Expression Quantitative Trait Loci in Pedigrees from Colombia and Costa Rica Ascertained for Bipolar Disorder
The observation that variants regulating gene expression (expression quantitative trait loci, eQTL) are at a high frequency among SNPs associated with complex traits has made the genome-wide characterization of gene expression an important tool in genetic mapping studies of such traits. As part of a study to identify genetic loci contributing to bipolar disorder and other quantitative traits in members of 26 pedigrees from Costa Rica and Colombia, we measured gene expression in lymphoblastoid cell lines derived from 786 pedigree members. The study design enabled us to comprehensively reconstruct the genetic regulatory network in these families, provide estimates of heritability, identify eQTL, evaluate missing heritability for the eQTL, and quantify the number of different alleles contributing to any given locus. In the eQTL analysis, we utilize a recently proposed hierarchical multiple testing strategy which controls error rates regarding the discovery of functional variants. Our results elucidate the heritability and regulation of gene expression in this unique Latin American study population and identify a set of regulatory SNPs which may be relevant in future investigations of complex disease in this population. Since our subjects belong to extended families, we are able to compare traditional kinship-based estimates with those from more recent methods that depend only on genotype information. National Institutes for Health/[R01 HG006695]/NIH/United States; National Institutes for Health/[R01 MH101782]/NIH/United States; National Institutes for Health/[R01 MH075007]/NIH/United States; Israel Science Foundation/[1112/14]/ISF/Israel; UCR::Vicerrectoría de Investigación::Unidades de Investigación::Ciencias Básicas::Centro de Investigación en Biología Celular y Molecular (CIBCM)
Genome-Wide Association Identifies Nine Common Variants Associated With Fasting Proinsulin Levels and Provides New Insights Into the Pathophysiology of Type 2 Diabetes
OBJECTIVE: Proinsulin is a precursor of mature insulin and C-peptide. Higher circulating proinsulin levels are associated with impaired β-cell function, raised glucose levels, insulin resistance, and type 2 diabetes (T2D). Studies of the insulin processing pathway could provide new insights about T2D pathophysiology. RESEARCH DESIGN AND METHODS: We have conducted a meta-analysis of genome-wide association tests of ~2.5 million genotyped or imputed single nucleotide polymorphisms (SNPs) and fasting proinsulin levels in 10,701 nondiabetic adults of European ancestry, with follow-up of 23 loci in up to 16,378 individuals, using additive genetic models adjusted for age, sex, fasting insulin, and study-specific covariates. RESULTS: Nine SNPs at eight loci were associated with proinsulin levels (P < 5 × 10^-8). Two loci (LARP6 and SGSM2) have not been previously related to metabolic traits, one (MADD) has been associated with fasting glucose, one (PCSK1) has been implicated in obesity, and four (TCF7L2, SLC30A8, VPS13C/C2CD4A/B, and ARAP1, formerly CENTD2) increase T2D risk. The proinsulin-raising allele of ARAP1 was associated with a lower fasting glucose (P = 1.7 × 10^-4), improved β-cell function (P = 1.1 × 10^-5), and lower risk of T2D (odds ratio 0.88; P = 7.8 × 10^-6). Notably, PCSK1 encodes the protein prohormone convertase 1/3, the first enzyme in the insulin processing pathway. A genotype score composed of the nine proinsulin-raising alleles was not associated with coronary disease in two large case-control datasets. CONCLUSIONS: We have identified nine genetic variants associated with fasting proinsulin. Our findings illuminate the biology underlying glucose homeostasis and T2D development in humans and argue against a direct role of proinsulin in coronary artery disease pathogenesis.
New genetic loci link adipose and insulin biology to body fat distribution.
Body fat distribution is a heritable trait and a well-established predictor of adverse metabolic outcomes, independent of overall adiposity. To increase our understanding of the genetic basis of body fat distribution and its molecular links to cardiometabolic traits, here we conduct genome-wide association meta-analyses of traits related to waist and hip circumferences in up to 224,459 individuals. We identify 49 loci (33 new) associated with waist-to-hip ratio adjusted for body mass index (BMI), and an additional 19 loci newly associated with related waist and hip circumference measures (P < 5 × 10^-8). In total, 20 of the 49 waist-to-hip ratio adjusted for BMI loci show significant sexual dimorphism, 19 of which display a stronger effect in women. The identified loci were enriched for genes expressed in adipose tissue and for putative regulatory elements in adipocytes. Pathway analyses implicated adipogenesis, angiogenesis, transcriptional regulation and insulin resistance as processes affecting fat distribution, providing insight into potential pathophysiological mechanisms.
Clinical features and outcomes of elderly hospitalised patients with chronic obstructive pulmonary disease, heart failure or both
Background and objective: Chronic obstructive pulmonary disease (COPD) and heart failure (HF) mutually increase the risk of being present in the same patient, especially if older. Whether or not this coexistence may be associated with a worse prognosis is debated. Therefore, employing data derived from the REPOSI register, we evaluated the clinical features and outcomes in a population of elderly patients admitted to internal medicine wards and having COPD, HF or COPD + HF. Methods: We measured socio-demographic and anthropometric characteristics, severity and prevalence of comorbidities, clinical and laboratory features during hospitalization, mood disorders, functional independence, drug prescriptions and discharge destination. The primary study outcome was the risk of death. Results: We considered 2,343 elderly hospitalized patients (median age 81 years), of whom 1,154 (49%) had COPD, 813 (35%) HF, and 376 (16%) COPD + HF. Patients with COPD + HF had different characteristics than those with COPD or HF, such as a higher prevalence of previous hospitalizations, comorbidities (especially chronic kidney disease), higher respiratory rate at admission and number of prescribed drugs. Patients with COPD + HF (hazard ratio [HR] 1.74, 95% confidence interval [CI] 1.16-2.61) and patients with dementia (HR 1.75, 95% CI 1.06-2.90) had a higher risk of death at one year. The Kaplan-Meier curves showed a higher mortality risk in the group of patients with COPD + HF for all causes (p = 0.010), respiratory causes (p = 0.006), cardiovascular causes (p = 0.046) and respiratory plus cardiovascular causes (p = 0.009). Conclusion: In this real-life cohort of hospitalized elderly patients, the coexistence of COPD and HF significantly worsened prognosis at one year. This finding may help to better define the care needs of this population.