81 research outputs found

    CONTRAfold: RNA secondary structure prediction without physics-based models

    doi:10.1093/bioinformatics/btl24

    Genome-wide analysis points to roles for extracellular matrix remodeling, the visual cycle, and neuronal development in myopia

    Myopia, or nearsightedness, is the most common eye disorder, resulting primarily from excess elongation of the eye. The etiology of myopia, although known to be complex, is poorly understood. Here we report the largest ever genome-wide association study (43,360 participants) on myopia in Europeans. We performed a survival analysis on age of myopia onset and identified 19 significant associations (p < 5e-8), two of which are replications of earlier associations with refractive error. These 19 associations in total explain 2.7% of the variance in myopia age of onset, and point towards a number of different mechanisms behind the development of myopia. One association is in the gene PRSS56, which has previously been linked to abnormally small eyes; one is in a gene that forms part of the extracellular matrix (LAMA2); two are in or near genes involved in the regeneration of 11-cis-retinal (RGR and RDH5); two are near genes known to be involved in the growth and guidance of retinal ganglion cells (ZIC2, SFRP1); and five are in or near genes involved in neuronal signaling or development. These novel findings point towards multiple genetic factors involved in the development of myopia and suggest that complex interactions between extracellular matrix remodeling, neuronal development, and visual signals from the retina may underlie the development of myopia in humans

    CONTRAST: a discriminative, phylogeny-free approach to multiple informant de novo gene prediction

    CONTRAST is a gene predictor that directly incorporates information from multiple alignments and uses discriminative machine learning techniques to give large improvements in prediction over previous methods

    Efficient Replication of Over 180 Genetic Associations with Self-Reported Medical Data

    While the cost and speed of generating genomic data have come down dramatically in recent years, the slow pace of collecting medical data for large cohorts continues to hamper genetic research. Here we evaluate a novel online framework for amassing large amounts of medical information in a recontactable cohort by assessing our ability to replicate genetic associations using these data. Using web-based questionnaires, we gathered self-reported data on 50 medical phenotypes from a generally unselected cohort of over 20,000 genotyped individuals. Of a list of genetic associations curated by NHGRI, we successfully replicated about 75% of the associations that we expected to (based on the number of cases in our cohort and reported odds ratios, and excluding a set of associations with contradictory published evidence). Altogether we replicated over 180 previously reported associations, including many for type 2 diabetes, prostate cancer, cholesterol levels, and multiple sclerosis. We found significant variation across categories of conditions in the percentage of expected associations that we were able to replicate, which may reflect systematic inflation of the effects in some initial reports, or differences across diseases in the likelihood of misdiagnosis or misreport. We also demonstrated that we could improve replication success by taking advantage of our recontactable cohort, offering more in-depth questions to refine self-reported diagnoses. Our data suggests that online collection of self-reported data in a recontactable cohort may be a viable method for both broad and deep phenotyping in large populations

    Multiple alignment of protein sequences with repeats and rearrangements

    Multiple sequence alignments are the usual starting point for analyses of protein structure and evolution. For proteins with repeated, shuffled and missing domains, however, traditional multiple sequence alignment algorithms fail to provide an accurate view of homology between related proteins, because they either assume that the input sequences are globally alignable or require locally alignable regions to appear in the same order in all sequences. In this paper, we present ProDA, a novel system for automated detection and alignment of homologous regions in collections of proteins with arbitrary domain architectures. Given an input set of unaligned sequences, ProDA identifies all homologous regions appearing in one or more sequences, and returns a collection of local multiple alignments for these regions. On a subset of the BAliBASE benchmarking suite containing curated alignments of proteins with complicated domain architectures, ProDA performs well in detecting conserved domain boundaries and clustering domain segments, achieving the highest accuracy to date for this task. We conclude that ProDA is a practical tool for automated alignment of protein sequences with repeats and rearrangements in their domain architecture

    Clearance kinetics and matrix binding partners of the receptor for advanced glycation end products

    Elucidating the sites and mechanisms of sRAGE action in the healthy state is vital to better understand the biological importance of the receptor for advanced glycation end products (RAGE). Previous studies in animal models of disease have demonstrated that exogenous sRAGE has an anti-inflammatory effect, which has been reasoned to arise from sequestration of pro-inflammatory ligands away from membrane-bound RAGE isoforms. We show here that sRAGE exhibits in vitro binding with high affinity and reversibly to extracellular matrix components collagen I, collagen IV, and laminin. Soluble RAGE administered intratracheally, intravenously, or intraperitoneally, does not distribute in a specific fashion to any healthy mouse tissue, suggesting against the existence of accessible sRAGE sinks and receptors in the healthy mouse. Intratracheal administration is the only effective means of delivering exogenous sRAGE to the lung, the organ in which RAGE is most highly expressed; clearance of sRAGE from lung does not differ appreciably from that of albumin. Copyright: © 2014 Milutinovic et al

    Comprehensive Research Synopsis and Systematic Meta-Analyses in Parkinson's Disease Genetics: The PDGene Database

    More than 800 published genetic association studies have implicated dozens of potential risk loci in Parkinson's disease (PD). To facilitate the interpretation of these findings, we have created a dedicated online resource, PDGene, that comprehensively collects and meta-analyzes all published studies in the field. A systematic literature screen of ∼27,000 articles yielded 828 eligible articles from which relevant data were extracted. In addition, individual-level data from three publicly available genome-wide association studies (GWAS) were obtained and subjected to genotype imputation and analysis. Overall, we performed meta-analyses on more than seven million polymorphisms originating either from GWAS datasets and/or from smaller scale PD association studies. Meta-analyses on 147 SNPs were supplemented by unpublished GWAS data from up to 16,452 PD cases and 48,810 controls. Eleven loci showed genome-wide significant (P < 5 × 10⁻⁸) association with disease risk: BST1, CCDC62/HIP1R, DGKQ/GAK, GBA, LRRK2, MAPT, MCCC1/LAMP3, PARK16, SNCA, STK39, and SYT11/RAB25. In addition, we identified novel evidence for genome-wide significant association with a polymorphism in ITGA8 (rs7077361, OR 0.88, P = 1.3 × 10⁻⁸). All meta-analysis results are freely available on a dedicated online database (www.pdgene.org), which is cross-linked with a customized track on the UCSC Genome Browser. Our study provides an exhaustive and up-to-date summary of the status of PD genetics research that can be readily scaled to include the results of future large-scale genetics projects, including next-generation sequencing studies

    A genome-wide association meta-analysis of self-reported allergy identifies shared and allergy-specific susceptibility loci

    Allergic disease is very common and carries substantial public-health burdens. We conducted a meta-analysis of genome-wide associations with self-reported cat, dust-mite and pollen allergies in 53,862 individuals. We used generalized estimating equations to model shared and allergy-specific genetic effects. We identified 16 shared susceptibility loci with association P < 5 × 10⁻⁸, including 8 loci previously associated with asthma, as well as 4p14 near TLR1, TLR6 and TLR10 (rs2101521, P = 5.3 × 10⁻²¹); 6p21.33 near HLA-C and MICA (rs9266772, P = 3.2 × 10⁻¹²); 5p13.1 near PTGER4 (rs7720838, P = 8.2 × 10⁻¹¹); 2q33.1 in PLCL1 (rs10497813, P = 6.1 × 10⁻¹⁰); 3q28 in LPP (rs9860547, P = 1.2 × 10⁻⁹); 20q13.2 in NFATC2 (rs6021270, P = 6.9 × 10⁻⁹); 4q27 in ADAD1 (rs17388568, P = 3.9 × 10⁻⁸); and 14q21.1 near FOXA1 and TTC6 (rs1998359, P = 4.8 × 10⁻⁸). We identified one locus with substantial evidence of differences in effects across allergies at 6p21.32 in the class II human leukocyte antigen (HLA) region (rs17533090, P = 1.7 × 10⁻¹²), which was strongly associated with cat allergy. Our study sheds new light on the shared etiology of immune and autoimmune disease