126 research outputs found

    An image dataset of cleared, x-rayed, and fossil leaves vetted to plant family for human and machine learning

    Leaves are the most abundant and visible plant organ, both in the modern world and the fossil record. Identifying foliage to the correct plant family based on leaf architecture is a fundamental botanical skill that is also critical for isolated fossil leaves, which often, especially in the Cenozoic, represent extinct genera and species from extant families. Resources focused on leaf identification are remarkably scarce; however, the situation has improved due to the recent proliferation of digitized herbarium material, live-plant identification applications, and online collections of cleared and fossil leaf images. Nevertheless, the need remains for a specialized image dataset for comparative leaf architecture. We address this gap by assembling an open-access database of 30,252 images of vouchered leaf specimens vetted to family level, primarily of angiosperms, including 26,176 images of cleared and x-rayed leaves representing 354 families and 4,076 of fossil leaves from 48 families. The images maintain original resolution, have user-friendly filenames, and are vetted using APG and modern paleobotanical standards. The cleared and x-rayed leaves include the Jack A. Wolfe and Leo J. Hickey contributions to the National Cleared Leaf Collection and a collection of high-resolution scanned x-ray negatives, housed in the Division of Paleobotany, Department of Paleobiology, Smithsonian National Museum of Natural History, Washington D.C.; and the Daniel I. Axelrod Cleared Leaf Collection, housed at the University of California Museum of Paleontology, Berkeley. The fossil images include a sampling of Late Cretaceous to Eocene paleobotanical sites from the Western Hemisphere held at numerous institutions, especially from Florissant Fossil Beds National Monument (late Eocene, Colorado), as well as several other localities from the Late Cretaceous to Eocene of the Western USA and the early Paleogene of Colombia and southern Argentina. 
The dataset facilitates new research and education opportunities in paleobotany, comparative leaf architecture, systematics, and machine learning.
    Author affiliations: Wilf, Peter (State University of Pennsylvania, United States); Wing, Scott L. (National Museum of Natural History, United States); Meyer, Herbert W. (State University of Pennsylvania, United States); Rose, Jacob A. (State University of Pennsylvania, United States); Saha, Rohit (State University of Pennsylvania, United States); Serre, Thomas (State University of Pennsylvania, United States); Cúneo, Néstor Rubén (Consejo Nacional de Investigaciones Científicas y Técnicas, Argentina; Museo Paleontológico Egidio Feruglio, Argentina); Donovan, Michael P. (State University of Pennsylvania, United States); Erwin, Diane M. (State University of Pennsylvania, United States); Gandolfo, María A. (Cornell University, United States); González Akre, Erika (State University of Pennsylvania, United States); Herrera, Fabiany (National Museum of Natural History, United States); Hu, Shusheng (State University of Pennsylvania, United States); Iglesias, Ari (Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Patagonia Norte. Instituto de Investigaciones en Biodiversidad y Medioambiente. Universidad Nacional del Comahue. Centro Regional Universidad Bariloche. Instituto de Investigaciones en Biodiversidad y Medioambiente, Argentina); Johnson, Kirk R. (Smithsonian Tropical Research Institute, Panama); Karim, Talia S. (University of Colorado, United States); Zou, Xiaoyu (State University of Pennsylvania, United States)

    Using observational data to estimate an upper bound on the reduction in cancer mortality due to periodic screening

    BACKGROUND: Because randomized cancer screening trials are very expensive, observational cancer screening studies can play an important role in the early phases of screening evaluation. Periodic screening evaluation (PSE) is a methodology for estimating the reduction in population cancer mortality from data on subjects who receive regularly scheduled screens. Although PSE does not require assumptions about the natural history of cancer, it requires other assumptions, particularly progressive detection: the assumption that once a cancer is detected by a screening test, it will always be detected by the screening test. METHODS: We formulate a simple version of PSE and show that it leads to an upper bound on screening efficacy if the progressive detection assumption does not hold (and any effect of birth cohort is minimal). To determine whether the upper bound is reasonable, we compared, for three randomized screening trials, PSE estimates based only on screened subjects with PSE estimates based on all subjects. RESULTS: In the three randomized screening trials, PSE estimates based on screened subjects were fairly close to PSE estimates based on all subjects. CONCLUSION: PSE has promise for obtaining an upper bound on the reduction in population cancer mortality rates based on observational screening data. If the upper bound estimate is found to be small and any birth cohort effects are likely minimal, then a definitive randomized trial would not be warranted.

    Copy number variations in 375 patients with oesophageal atresia and/or tracheoesophageal fistula

    Oesophageal atresia (OA) with or without tracheoesophageal fistula (TOF) are rare anatomical congenital malformations whose cause is unknown in over 90% of patients. A genetic background is suggested, and among the reported genetic defects are copy number variations (CNVs). We hypothesized that CNVs contribute to OA/TOF development. Quantifying their prevalence could aid in genetic diagnosis and clinical care strategies. Therefore, we profiled 375 patients in a combined Dutch, American and German cohort via genomic microarray and compared the CNV profiles with their unaffected parents and published control cohorts. We identified 167 rare CNVs containing genes (frequency < 0.0005 in our in-house cohort). Eight rare CNVs, in six patients, were de novo, including one CNV previously associated with oesophageal disease (hg19 chr7:g.(143820444-143839360)-(159119486-159138663)del). De novo CNVs were present in 1.55% of isolated OA/TOF patients and 1.62% of patients with additional congenital anomalies. Furthermore, three susceptibility loci (15q13.3, 16p13.3 and 22q11.2) were identified based on their overlap with known OA/TOF-associated CNV syndromes and with loci in published CNV association case-control studies in developmental delay. Our study suggests that CNVs contribute to OA/TOF development. In addition to the identified likely deleterious de novo CNVs, we detected 167 rare CNVs. Although not directly disease-causing, these CNVs might be of interest, as they can act as modifiers in a multiple hit model, or as the second hit in a recessive condition.

    Platelet-Related Variants Identified by Exomechip Meta-analysis in 157,293 Individuals

    Platelet production, maintenance, and clearance are tightly controlled processes indicative of platelets' important roles in hemostasis and thrombosis. Platelets are common targets for primary and secondary prevention of several conditions. They are monitored clinically by complete blood counts, specifically with measurements of platelet count (PLT) and mean platelet volume (MPV). Identifying genetic effects on PLT and MPV can provide mechanistic insights into platelet biology and their role in disease. Therefore, we formed the Blood Cell Consortium (BCX) to perform a large-scale meta-analysis of Exomechip association results for PLT and MPV in 157,293 and 57,617 individuals, respectively. Using the low-frequency/rare coding variant-enriched Exomechip genotyping array, we sought to identify genetic variants associated with PLT and MPV. In addition to confirming 47 known PLT and 20 known MPV associations, we identified 32 PLT and 18 MPV associations not previously observed in the literature across the allele frequency spectrum, including rare large effect (FCER1A), low-frequency (IQGAP2, MAP1A, LY75), and common (ZMIZ2, SMG6, PEAR1, ARFGAP3/PACSIN2) variants. Several variants associated with PLT/MPV (PEAR1, MRVI1, PTGES3) were also associated with platelet reactivity. In concurrent BCX analyses, there was overlap of platelet-associated variants with red (MAP1A, TMPRSS6, ZMIZ2) and white (PEAR1, ZMIZ2, LY75) blood cell traits, suggesting common regulatory pathways with shared genetic architecture among these hematopoietic lineages. Our large-scale Exomechip analyses identified previously undocumented associations with platelet traits and further indicate that several complex quantitative hematological, lipid, and cardiovascular traits share genetic factors.

    Trans-ethnic Meta-analysis and Functional Annotation Illuminates the Genetic Architecture of Fasting Glucose and Insulin

    Knowledge of the genetic basis of the type 2 diabetes (T2D)-related quantitative traits fasting glucose (FG) and insulin (FI) in African ancestry (AA) individuals has been limited. In non-diabetic subjects of AA (n = 20,209) and European ancestry (EA; n = 57,292), we performed trans-ethnic (AA+EA) fine-mapping of 54 established EA FG or FI loci with detailed functional annotation, assessed their relevance in AA individuals, and sought previously undescribed loci through trans-ethnic (AA+EA) meta-analysis. We narrowed credible sets of variants driving association signals for 22/54 EA-associated loci; 18/22 credible sets overlapped with active islet-specific enhancers or transcription factor (TF) binding sites, and 21/22 contained at least one TF motif. Of the 54 EA-associated loci, 23 were shared between EA and AA. Replication with an additional 10,096 AA individuals identified two previously undescribed FI loci, chrX FAM133A (rs213676) and chr5 PELO (rs6450057). Trans-ethnic analyses with regulatory annotation illuminate the genetic architecture of glycemic traits and suggest gene regulation as a target to advance precision medicine for T2D. Our approach to utilize state-of-the-art functional annotation and implement trans-ethnic association analysis for discovery and fine-mapping offers a framework for further follow-up and characterization of GWAS signals of complex trait loci.

    Discovery and fine-mapping of adiposity loci using high density imputation of genome-wide association studies in individuals of African ancestry: African Ancestry Anthropometry Genetics Consortium

    Genome-wide association studies (GWAS) have identified >300 loci associated with measures of adiposity including body mass index (BMI) and waist-to-hip ratio (adjusted for BMI, WHRadjBMI), but few have been identified through screening of the African ancestry genomes. We performed large scale meta-analyses and replications in up to 52,895 individuals for BMI and up to 23,095 individuals for WHRadjBMI from the African Ancestry Anthropometry Genetics Consortium (AAAGC) using 1000 Genomes phase 1 imputed GWAS to improve coverage of both common and low frequency variants in the low linkage disequilibrium African ancestry genomes. In the sex-combined analyses, we identified one novel locus (TCF7L2/HABP2) for WHRadjBMI and eight previously established loci at P < 5 × 10⁻⁸: seven for BMI, and one for WHRadjBMI in African ancestry individuals. An additional novel locus (SPRYD7/DLEU2) was identified for WHRadjBMI when combined with European GWAS. In the sex-stratified analyses, we identified three novel loci for BMI (INTS10/LPL and MLC1 in men, IRX4/IRX2 in women) and four for WHRadjBMI (SSX2IP, CASC8, PDE3B and ZDHHC1/HSD11B2 in women) in individuals of African ancestry or both African and European ancestry. For four of the novel variants, the minor allele frequency was low (<5%). In the trans-ethnic fine mapping of 47 BMI loci and 27 WHRadjBMI loci that were locus-wide significant (P < 0.05 adjusted for effective number of variants per locus) from the African ancestry sex-combined and sex-stratified analyses, 26 BMI loci and 17 WHRadjBMI loci contained ≤ 20 variants in the credible sets that jointly account for 99% posterior probability of driving the associations. The lead variants in 13 of these loci had a high probability of being causal.
Compared with our previous HapMap imputed GWAS for BMI and WHRadjBMI including up to 71,412 and 27,350 African ancestry individuals, respectively, our results suggest that 1000 Genomes imputation showed modest improvement in identifying GWAS loci including low frequency variants. Trans-ethnic meta-analyses further improved fine mapping of putative causal variants in loci shared between the African and European ancestry populations.

    Type 2 Diabetes Variants Disrupt Function of SLC16A11 through Two Distinct Mechanisms

    Type 2 diabetes (T2D) affects Latinos at twice the rate seen in populations of European descent. We recently identified a risk haplotype spanning SLC16A11 that explains ~20% of the increased T2D prevalence in Mexico. Here, through genetic fine-mapping, we define a set of tightly linked variants likely to contain the causal allele(s). We show that variants on the T2D-associated haplotype have two distinct effects: (1) decreasing SLC16A11 expression in liver and (2) disrupting a key interaction with basigin, thereby reducing cell-surface localization. Both independent mechanisms reduce SLC16A11 function and suggest SLC16A11 is the causal gene at this locus. To gain insight into how SLC16A11 disruption impacts T2D risk, we demonstrate that SLC16A11 is a proton-coupled monocarboxylate transporter and that genetic perturbation of SLC16A11 induces changes in fatty acid and lipid metabolism that are associated with increased T2D risk. Our findings suggest that increasing SLC16A11 function could be therapeutically beneficial for T2D. Keywords: type 2 diabetes (T2D); genetics; disease mechanism; SLC16A11; MCT11; solute carrier (SLC); monocarboxylates; fatty acid metabolism; lipid metabolism; precision medicine

    Pathogenetics of alveolar capillary dysplasia with misalignment of pulmonary veins.

    Alveolar capillary dysplasia with misalignment of pulmonary veins (ACDMPV) is a lethal lung developmental disorder caused by heterozygous point mutations or genomic deletion copy-number variants (CNVs) of FOXF1 or its upstream enhancer involving fetal lung-expressed long noncoding RNA genes LINC01081 and LINC01082. Using custom-designed array comparative genomic hybridization, Sanger sequencing, whole exome sequencing (WES), and bioinformatic analyses, we studied 22 new unrelated families (20 postnatal and two prenatal) with clinically diagnosed ACDMPV. We describe novel deletion CNVs at the FOXF1 locus in 13 unrelated ACDMPV patients. Together with the previously reported cases, all 31 genomic deletions in 16q24.1, pathogenic for ACDMPV, for which parental origin was determined, arose de novo, with 30 of them occurring on the maternally inherited chromosome 16, strongly implicating genomic imprinting of the FOXF1 locus in human lungs. Surprisingly, we have also identified four ACDMPV families with pathogenic variants in the FOXF1 locus that arose on paternal chromosome 16. Interestingly, a combination of severe cardiac defects, including hypoplastic left heart, and single umbilical artery was observed only in children with deletion CNVs involving FOXF1 and its upstream enhancer. Our data demonstrate that genomic imprinting at 16q24.1 plays an important role in variable ACDMPV manifestation, likely through long-range regulation of FOXF1 expression, and may also be responsible for key phenotypic features of maternal uniparental disomy 16. Moreover, in one family, WES revealed a de novo missense variant in ESRP1, potentially implicating FGF signaling in the etiology of ACDMPV.

    Multi-ancestry GWAS of the electrocardiographic PR interval identifies 202 loci underlying cardiac conduction

    The electrocardiographic PR interval reflects atrioventricular conduction, and is associated with conduction abnormalities, pacemaker implantation, atrial fibrillation (AF), and cardiovascular mortality. Here we report a multi-ancestry (N=293,051) genome-wide association meta-analysis for the PR interval, discovering 202 loci of which 141 have not previously been reported. Variants at identified loci increase the percentage of heritability explained, from 33.5% to 62.6%. We observe enrichment for cardiac muscle developmental/contractile and cytoskeletal genes, highlighting key regulation processes for atrioventricular conduction. Additionally, 8 loci not previously reported harbor genes underlying inherited arrhythmic syndromes and/or cardiomyopathies, suggesting a role for these genes in cardiovascular pathology in the general population. We show that polygenic predisposition to PR interval duration is an endophenotype for cardiovascular disease, including distal conduction disease, AF, and atrioventricular pre-excitation. These findings advance our understanding of the polygenic basis of cardiac conduction, and the genetic relationship between PR interval duration and cardiovascular disease. On the electrocardiogram, the PR interval reflects conduction from the atria to ventricles and also serves as a risk indicator of cardiovascular morbidity and mortality. Here, the authors perform genome-wide meta-analyses for PR interval in multiple ancestries and identify 141 previously unreported genetic loci. Peer reviewed