166 research outputs found

    Fundamental Limits and Tradeoffs in Invariant Representation Learning

    Many machine learning applications, e.g., privacy-preserving learning, algorithmic fairness and domain adaptation/generalization, involve learning so-called invariant representations that achieve two competing goals: to maximize information or accuracy with respect to a target while simultaneously maximizing invariance or independence with respect to a set of protected features (e.g., for fairness, privacy, etc.). Despite its abundant applications in the aforementioned domains, theoretical understanding of the limits and tradeoffs of invariant representations is still severely lacking. In this paper, we provide an information theoretic analysis of this general and important problem under both classification and regression settings. In both cases, we analyze the inherent tradeoffs between accuracy and invariance by providing a geometric characterization of the feasible region in the information plane, where we connect the geometric properties of this feasible region to the fundamental limitations of the tradeoff problem. In the regression setting, we further give a complete and exact characterization of the frontier between accuracy and invariance. Although our contributions are mainly theoretical, we also demonstrate the practical applications of our results in certifying the suboptimality of certain representation learning algorithms in both classification and regression tasks. Our results shed new light on this fundamental problem by providing insights on the interplay between accuracy and invariance. These results deepen our understanding of this fundamental problem and may be useful in guiding the design of future representation learning algorithms. Comment: Updated results in the regression setting to fully characterize the frontier. Additional numerical experiment

    The Natural History of Left Ventricular Geometry in the Community Clinical Correlates and Prognostic Significance of Change in LV Geometric Pattern

    Objectives: This study sought to evaluate pattern and clinical correlates of change in left ventricular (LV) geometry over a 4-year period in the community; it also assessed whether the pattern of change in LV geometry over 4 years predicts incident cardiovascular disease (CVD), including myocardial infarction, heart failure, and cardiovascular death, during an additional subsequent follow-up period. Background: It is unclear how LV geometric patterns change over time and whether changes in LV geometry have prognostic significance. Methods: This study evaluated 4,492 observations (2,604 unique Framingham Heart Study participants attending consecutive examinations) to categorize LV geometry at baseline and after 4 years. Four groups were defined on the basis of the sex-specific distributions of left ventricular mass (LVM) and relative wall thickness (RWT) (normal: LVM and RWT <80th percentile; concentric remodeling: LVM <80th percentile but RWT ≥80th percentile; eccentric hypertrophy: LVM ≥80th percentile but RWT <80th percentile; and concentric hypertrophy: LVM and RWT ≥80th percentile). Results: At baseline, 2,874 of 4,492 observations (64%) had normal LVM and RWT. Participants with normal geometry or concentric remodeling progressed infrequently (4% to 8%) to eccentric or concentric hypertrophy. Change from eccentric to concentric hypertrophy was uncommon (8%). Among participants with concentric hypertrophy, 19% developed eccentric hypertrophy within the 4-year period. Among participants with abnormal LV geometry at baseline, a significant proportion (29% to 53%) reverted to normal geometry within 4 years. Higher blood pressure, greater body mass index (BMI), advancing age, and male sex were key correlates of developing an abnormal geometry. Development of an abnormal LV geometric pattern over 4 years was associated with increased CVD risk (140 events) during a subsequent median follow-up of 12 years (adjusted hazard ratio: 1.59; 95% confidence interval: 1.04 to 2.43). Conclusions: The longitudinal observations in the community suggest that dynamic changes in LV geometric pattern over time are common. Higher blood pressure and greater BMI are modifiable factors associated with the development of abnormal LV geometry, and such progression portends an adverse prognosis.
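
The four-group definition in the Methods above can be read as a simple two-threshold rule. The following is a minimal sketch of that rule, not the study's code: the function name and the cutoff arguments `lvm_p80` and `rwt_p80` are hypothetical, and the caller is assumed to supply the sex-specific 80th-percentile values, which the abstract does not report.

```python
def classify_lv_geometry(lvm: float, rwt: float,
                         lvm_p80: float, rwt_p80: float) -> str:
    """Assign one of the four LV geometric patterns.

    lvm_p80 and rwt_p80 are assumed to be the sex-specific 80th-percentile
    cutoffs for LV mass and relative wall thickness (hypothetical inputs;
    the actual cutoff values are not given in the abstract).
    """
    high_lvm = lvm >= lvm_p80  # LVM at or above the 80th percentile
    high_rwt = rwt >= rwt_p80  # RWT at or above the 80th percentile
    if high_lvm and high_rwt:
        return "concentric hypertrophy"
    if high_lvm:
        return "eccentric hypertrophy"
    if high_rwt:
        return "concentric remodeling"
    return "normal"
```

Applying the function once at baseline and once at the 4-year examination, each time with the cutoffs for the participant's sex, would give the pair of categories from which a change in pattern is read off.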

    Genome-wide association of echocardiographic dimensions, brachial artery endothelial function and treadmill exercise responses in the Framingham Heart Study

    Background: Echocardiographic left ventricular (LV) measurements, exercise responses to standardized treadmill test (ETT) and brachial artery (BA) vascular function are heritable traits that are associated with cardiovascular disease risk. We conducted a genome-wide association study (GWAS) in the community-based Framingham Heart Study. Methods: We estimated multivariable-adjusted residuals for quantitative echocardiography, ETT and BA function traits. Echocardiography residuals were averaged across 4 examinations and included LV mass, diastolic and systolic dimensions, wall thickness, fractional shortening, left atrial and aortic root size. ETT measures (single exam) included systolic blood pressure and heart rate responses during exercise stage 2, and at 3 minutes post-exercise. BA measures (single exam) included vessel diameter, flow-mediated dilation (FMD), and baseline and hyperemic flow responses. Generalized estimating equations (GEE), family-based association tests (FBAT) and variance-components linkage were used to relate multivariable-adjusted trait residuals to 70,987 SNPs (Human 100K GeneChip, Affymetrix) restricted to autosomal SNPs with minor allele frequency ≥0.10, genotype call rate ≥0.80, and Hardy-Weinberg equilibrium p ≥ 0.001. Results: We summarize results from 17 traits in up to 1238 related middle-aged to elderly men and women. Results of all association and linkage analyses are web-posted at http://ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?id=phs000007. We confirmed modest-to-strong heritabilities (estimates 0.30–0.52) for several Echo, ETT and BA function traits. Overall, p < 10^-5 in either GEE or FBAT models were observed for 21 SNPs (nine for echocardiography, eleven for ETT and one for BA function). The top SNPs associated were (GEE results): LV diastolic dimension, rs1379659 (SLIT2, p = 1.17 × 10^-7); LV systolic dimension, rs10504543 (KCNB2, p = 5.18 × 10^-6); LV mass, rs10498091 (p = 5.68 × 10^-6); left atrial size, rs1935881 (FAM5C, p = 6.56 × 10^-6); exercise heart rate, rs6847149 (NOLA1, p = 2.74 × 10^-6); exercise systolic blood pressure, rs2553268 (WRN, p = 6.3 × 10^-6); BA baseline flow, rs3814219 (OBFC1, p = 9.48 × 10^-7); and FMD, rs4148686 (CFTR, p = 1.13 × 10^-5). Several SNPs are reasonable biological candidates, with some being related to multiple traits suggesting pleiotropy. The peak LOD score was for LV mass (4.38; chromosome 5); the 1.5 LOD support interval included NRG2. Conclusion: In hypothesis-generating GWAS of echocardiography, ETT and BA vascular function in a moderate-sized community-based sample, we identified several SNPs that are candidates for replication attempts and we provide a web-based GWAS resource for the research community.
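
The SNP inclusion criteria stated in the Methods above (autosomal, minor allele frequency ≥0.10, call rate ≥0.80, Hardy-Weinberg p ≥ 0.001) amount to a per-SNP filter. The sketch below illustrates only that filter under assumed inputs: the `SnpQc` record, its field names, and the example identifiers are hypothetical and are not taken from the study's pipeline or data.

```python
from dataclasses import dataclass


@dataclass
class SnpQc:
    """Hypothetical per-SNP quality-control summary (illustrative fields)."""
    snp_id: str
    chromosome: str   # e.g. "5", "X"
    maf: float        # minor allele frequency
    call_rate: float  # fraction of samples with a genotype call
    hwe_p: float      # Hardy-Weinberg equilibrium test p-value


def passes_qc(snp: SnpQc) -> bool:
    """Apply the thresholds quoted in the abstract to one SNP."""
    autosomal = snp.chromosome.isdigit() and 1 <= int(snp.chromosome) <= 22
    return (autosomal
            and snp.maf >= 0.10
            and snp.call_rate >= 0.80
            and snp.hwe_p >= 0.001)


# Made-up records, used only to exercise the filter.
examples = [
    SnpQc("example_snp_1", "5", 0.25, 0.95, 0.40),   # passes all criteria
    SnpQc("example_snp_2", "X", 0.25, 0.95, 0.40),   # not autosomal
    SnpQc("example_snp_3", "12", 0.05, 0.99, 0.40),  # MAF too low
]
kept = [s.snp_id for s in examples if passes_qc(s)]
```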

    Corrected score methods for estimating Bayesian networks with error-prone nodes

    Motivated by inferring cellular signaling networks using noisy flow cytometry data, we develop procedures to draw inference for Bayesian networks based on error-prone data. Two methods for inferring causal relationships between nodes in a network are proposed based on penalized estimation methods that account for measurement error and encourage sparsity. We discuss consistency of the proposed network estimators and develop an approach for selecting the tuning parameter in the penalized estimation methods. Empirical studies are carried out to compare the proposed methods and a naive method that ignores measurement error with applications to synthetic data and to single cell flow cytometry data

    Secular trends in echocardiographic left ventricular mass in the community: the Framingham Heart Study

    Objective: To investigate secular trends in echocardiographically determined left ventricular mass (LVM). Design, setting and participants: Longitudinal community-based cohort study in Framingham, Massachusetts. LVM was calculated from routine echocardiography in 4320 participants (52% women) of the Framingham offspring cohort at examination cycles 4 (1987–1991), 5 (1991–1995), 6 (1995–1998) and 8 (2005–2008), totalling 13 971 person-observations. Main outcome measures: Sex-specific trends in mean LVM (and its components, LV diastolic diameter (LVDD) and LV wall thickness (LVWT)), and LVM indexed to body surface area (BSA). Results: In men, age-adjusted LVM modestly increased from examination 4 to 8 (192 g to 198 g, p-trend=0.0005), whereas in women it decreased from 147 g at examination 4 to 140 g at examination 8 (p-trend<0.0001). The trend for increasing LVM in men tracked with an increasing LVDD (p-trend=0.0002), whereas the decline in LVM in women was accompanied by a decrease in LVWT (p-trend<0.0001). Indexing LVM to BSA abolished the increasing trend in men (p-trend=0.49), whereas the decreasing trend in women was maintained. Conclusions: In our longitudinal analysis of a large community-based sample spanning two decades, we observed sex-related differences in trends in LVM, with a modest increase of LVM in men (likely attributable to increasing body size), but a decrease in women. Additional studies are warranted to elucidate the basis for these sex-related differences.

    Pooled Deep Sequencing of Plasmodium falciparum Isolates: An Efficient and Scalable Tool to Quantify Prevailing Malaria Drug-Resistance Genotypes

    Molecular surveillance for drug-resistant malaria parasites requires reliable, timely, and scalable methods. These data may be efficiently produced by genotyping parasite populations using second-generation sequencing (SGS). We designed and validated a SGS protocol to quantify mutant allele frequencies in the Plasmodium falciparum genes dhfr and dhps in mixed isolates. We applied this new protocol to field isolates from children and compared it to standard genotyping using Sanger sequencing. The SGS protocol accurately quantified dhfr and dhps allele frequencies in a mixture of parasite strains. Using SGS of DNA that was extracted and then pooled from individual isolates, we estimated mutant allele frequencies that were closely correlated to those estimated by Sanger sequencing (correlations, >0.98). The SGS protocol obviated most molecular steps in conventional methods and is cost saving for parasite populations >50. This SGS genotyping method efficiently and reproducibly estimates parasite allele frequencies within populations of P. falciparum for molecular epidemiologic studies

    Use of Massively Parallel Pyrosequencing to Evaluate the Diversity of and Selection on Plasmodium falciparum csp T-Cell Epitopes in Lilongwe, Malawi

    The development of an effective malaria vaccine has been hampered by the genetic diversity of commonly used target antigens. This diversity has led to concerns about allele-specific immunity limiting the effectiveness of vaccines. Despite extensive genetic diversity of circumsporozoite protein (CS), the most successful malaria vaccine is RTS/S, a monovalent CS vaccine. By use of massively parallel pyrosequencing, we evaluated the diversity of CS haplotypes across the T-cell epitopes in parasites from Lilongwe, Malawi. We identified 57 unique parasite haplotypes from 100 participants. By use of ecological and molecular indexes of diversity, we saw no difference in the diversity of CS haplotypes between adults and children. We saw evidence of weak variant-specific selection within this region of CS, suggesting naturally acquired immunity does induce variant-specific selection on CS. Therefore, the impact of CS vaccines on variant frequencies with widespread implementation of vaccination requires further study

    Polygenic Prediction of Weight and Obesity Trajectories from Birth to Adulthood

    Severe obesity is a rapidly growing global health threat. Although often attributed to unhealthy lifestyle choices or environmental factors, obesity is known to be heritable and highly polygenic; the majority of inherited susceptibility is related to the cumulative effect of many common DNA variants. Here we derive and validate a new polygenic predictor comprised of 2.1 million common variants to quantify this susceptibility and test this predictor in more than 300,000 individuals ranging from middle age to birth. Among middle-aged adults, we observe a 13-kg gradient in weight and a 25-fold gradient in risk of severe obesity across polygenic score deciles. In a longitudinal birth cohort, we note minimal differences in birthweight across score deciles, but a significant gradient emerged in early childhood and reached 12 kg by 18 years of age. This new approach to quantify inherited susceptibility to obesity affords new opportunities for clinical prevention and mechanistic assessment. © 2019 Author(s). National Human Genome Research Institute (1K08HG0101); Wellcome Trust (202802/Z/16/Z); University of Bristol NIHR Biomedical Research Centre (S- BRC-1215-20011); National Human Genome Research Institute (HG008895); National Heart, Lung, and Blood Institute (NHLBI) HHSN268201300025C; National Heart, Lung, and Blood Institute (NHLBI) HHSN268201300026C; National Heart, Lung, and Blood Institute (NHLBI) HHSN268201300027C; National Heart, Lung, and Blood Institute (NHLBI) HHSN268201300028C; National Heart, Lung, and Blood Institute (NHLBI) HHSN268201300029C; National Heart, Lung, and Blood Institute (NHLBI) HHSN268200900041C; National Institute on Aging (AG0005); NHLBI (AG0005); National Human Genome Research Institute (U01-HG004729); National Human Genome Research Institute (U01-HG04424); National Human Genome Research Institute (U01-HG004446); Wellcome (102215/2/13/2

    Burden of Rare Sarcomere Gene Variants in the Framingham and Jackson Heart Study Cohorts

    Rare sarcomere protein variants cause dominant hypertrophic and dilated cardiomyopathies. To evaluate whether allelic variants in eight sarcomere genes are associated with cardiac morphology and function in the community, we sequenced 3,600 individuals from the Framingham Heart Study (FHS) and Jackson Heart Study (JHS) cohorts. Out of the total, 11.2% of individuals had one or more rare nonsynonymous sarcomere variants. The prevalence of likely pathogenic sarcomere variants was 0.6%, twice the previous estimates; however, only four of the 22 individuals had clinical manifestations of hypertrophic cardiomyopathy. Rare sarcomere variants were associated with an increased risk for adverse cardiovascular events (hazard ratio: 2.3) in the FHS cohort, suggesting that cardiovascular risk assessment in the general population can benefit from rare variant analysis