2,789 research outputs found

    LTRharvest, an efficient and flexible software for de novo detection of LTR retrotransposons

    Background: Transposable elements are abundant in eukaryotic genomes and are believed to have a significant impact on the evolution of gene and chromosome structure. Although several eukaryotic genome projects have been completed, there are only a few high-quality genome-wide annotations of transposable elements. There is therefore considerable demand for computational identification of transposable elements. LTR retrotransposons, an important subclass of transposable elements, are well suited for computational identification because they contain long terminal repeats (LTRs). Results: We have developed a software tool, LTRharvest, for the de novo detection of full-length LTR retrotransposons in large sequence sets. LTRharvest efficiently delivers high-quality annotations based on known LTR transposon features such as length, distance, and sequence motifs. A quality validation of LTRharvest against gold standard annotations for Saccharomyces cerevisiae and Drosophila melanogaster shows a sensitivity of up to 90% and 97% and a specificity of 100% and 72%, respectively. This is comparable to or slightly better than the annotations produced by previous software tools. The main advantages of LTRharvest over previous tools are (a) its ability to efficiently handle large datasets from finished or unfinished genome projects, (b) its flexibility in incorporating known sequence features into the prediction, and (c) its availability as open source software. Conclusion: LTRharvest is an efficient software tool delivering high-quality annotation of LTR retrotransposons. It can, for example, process the largest human chromosome in approximately 8 minutes on a Linux PC with 4 GB of memory. Its flexibility and small space and run-time requirements make LTRharvest a very competitive candidate for future LTR retrotransposon annotation projects. Moreover, the structured design and implementation and the availability as open source provide an excellent base for incorporating novel concepts to further improve the prediction of LTR retrotransposons.

    FPGA-based Acceleration of Detecting Statistical Epistasis in GWAS

    Genotype-by-genotype interactions (epistasis) are believed to be a significant source of unexplained genetic variation causing complex chronic diseases, but they have been ignored in genome-wide association studies (GWAS) because of the computational burden of the analysis. In this work we show how FPGA technology can be used for highly parallel creation of contingency tables in a systolic chain, followed by a statistical test. We present the implementation for the FPGA-based hardware platform RIVYERA S6-LX150, which contains 128 Xilinx Spartan6-LX150 FPGAs. For performance evaluation we compare against the method iLOCi [9], which claims to outperform other available tools in terms of accuracy. However, analysis of a dataset from the Wellcome Trust Case Control Consortium (WTCCC) with about 500,000 SNPs and 5,000 samples still takes about 19 hours with iLOCi on a MacPro workstation with two Intel Xeon quad-core CPUs, while our FPGA-based implementation requires only 4 minutes.
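    The abstract above describes building a contingency table for each SNP pair and then applying a statistical test. As a rough software illustration of that idea only, the sketch below counts samples per genotype combination and case status and scores each pair with a Pearson chi-square statistic. The 0/1/2 genotype coding, the function names, and the choice of chi-square are assumptions made for this example; the abstract does not specify the test used on the FPGA or by iLOCi.

```python
from itertools import combinations


def contingency_table(snp_a, snp_b, is_case):
    """Count samples per (genotype A, genotype B, case status) cell.

    Genotypes are assumed to be coded 0, 1 or 2. The result is a
    3 x 3 x 2 nested list indexed as table[a][b][status], where
    status 0 is control and status 1 is case.
    """
    table = [[[0, 0] for _ in range(3)] for _ in range(3)]
    for a, b, case in zip(snp_a, snp_b, is_case):
        table[a][b][1 if case else 0] += 1
    return table


def chi_square(table):
    """Pearson chi-square statistic: genotype combination vs. case status."""
    cells = [table[a][b] for a in range(3) for b in range(3)]
    n_ctrl = sum(cell[0] for cell in cells)
    n_case = sum(cell[1] for cell in cells)
    total = n_ctrl + n_case
    stat = 0.0
    for ctrl, case in cells:
        row = ctrl + case
        if row == 0:
            continue  # genotype combination not observed in any sample
        for observed, col in ((ctrl, n_ctrl), (case, n_case)):
            expected = row * col / total
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat


def scan_pairs(genotypes, is_case):
    """Score every SNP pair; genotypes is a list of per-SNP genotype lists."""
    return {
        (i, j): chi_square(contingency_table(genotypes[i], genotypes[j], is_case))
        for i, j in combinations(range(len(genotypes)), 2)
    }
```

    Every pair is independent of the others, which is what makes the table construction a natural target for parallel hardware.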

    Parallelizing Epistasis Detection in GWAS on FPGA and GPU-Accelerated Computing Systems

    This is a post-peer-review, pre-copyedit version of an article published in IEEE/ACM Transactions on Computational Biology and Bioinformatics. The final authenticated version is available online at: http://dx.doi.org/10.1109/TCBB.2015.2389958. Abstract: High-throughput genotyping technologies (such as SNP arrays) allow the rapid collection of up to a few million genetic markers of an individual. Detecting epistasis (based on 2-SNP interactions) in genome-wide association studies is an important but time-consuming operation, since statistical computations have to be performed for each pair of measured markers. Computational methods to detect epistasis therefore suffer from prohibitively long runtimes; e.g., processing a moderately sized dataset consisting of about 500,000 SNPs and 5,000 samples requires several days using state-of-the-art tools on a standard 3 GHz CPU. In this paper, we demonstrate how this task can be accelerated using a combination of fine-grained and coarse-grained parallelism on two different computing systems. The first architecture is based on reconfigurable hardware (FPGAs), while the second uses multiple GPUs connected to the same host. We show that both systems can achieve speedups of around four orders of magnitude compared to the sequential implementation. This reduces the runtimes for detecting epistasis to only a few minutes for moderately sized datasets and to a few hours for large-scale datasets. Funding: Wellcome Trust, London; 076113. Wellcome Trust, London; 08547
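    To make the scale of the pairwise problem concrete, the short calculation below counts the SNP pairs for the dataset size quoted in the abstract (about 500,000 SNPs and 5,000 samples). The function names are hypothetical, and the per-sample count assumes one contingency-table update per sample per pair, which is a simplification for illustration and not a figure taken from the paper.

```python
from math import comb


def snp_pair_count(n_snps: int) -> int:
    """Number of unordered SNP pairs that an exhaustive 2-SNP scan must test."""
    return comb(n_snps, 2)


def table_updates(n_snps: int, n_samples: int) -> int:
    """Table updates, assuming one update per sample for every SNP pair."""
    return snp_pair_count(n_snps) * n_samples


pairs = snp_pair_count(500_000)
updates = table_updates(500_000, 5_000)
print(f"{pairs:,} SNP pairs")       # 124,999,750,000 SNP pairs
print(f"{updates:,} table updates")  # 624,998,750,000,000 table updates
```

    Because the pair count grows quadratically with the number of SNPs, doubling the marker count roughly quadruples the work.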

    Replication study of ulcerative colitis risk loci in a Lithuanian-Latvian case control sample

    Background: Differences between populations might be reflected in their different genetic risk maps for complex diseases such as inflammatory bowel disease. We investigated the role of known inflammatory bowel disease-associated single nucleotide polymorphisms (SNPs) in a subset of patients with ulcerative colitis (UC) from the Northeastern European countries Lithuania and Latvia and evaluated possible epistatic interactions between these genetic variants. Methods: We investigated 77 SNPs derived from 5 previously published genome-wide association studies for Crohn's disease and UC. Our study panel comprised 444 Lithuanian and Latvian patients with UC and 1154 healthy controls. Single-marker case-control association and SNP-SNP epistasis analyses were performed. Results: We found 14 SNPs tagging 9 loci, including 21q21.1, NKX2-3, MST1, the HLA region, 1p36.13, IL10, JAK2, ORMDL3, and IL23R, to be associated with UC. Interestingly, the association of UC with previously identified variants in the HLA region was not the strongest association in our study (P = 4.34 × 10^-3, odds ratio [OR] = 1.25), which is in contrast to all previously published studies. No association with any disease subphenotype was found. SNP-SNP interaction analysis showed significant epistasis between SNPs in the PTPN22 (rs2476601) and C13orf31 (rs3764147) genes and increased risk for UC (P = 1.64 × 10^-6, OR = 2.44). The association was confirmed in the Danish study group (P = 0.04, OR = 3.25). Conclusions: We confirmed the association of the 9 loci (21q21.1, 1p36.13, NKX2-3, MST1, the HLA region, IL10, JAK2, ORMDL3, and IL23R) with UC in the Lithuanian-Latvian population. SNP-SNP interaction analyses showed that the combination of SNPs in the PTPN22 (rs2476601) and C13orf31 (rs3764147) genes increases the risk for UC.

    Response to Comment on "ApoE e4e4 Genotype and Mortality With COVID-19 in UK Biobank" by Kuo et al

    This article is freely available via Open Access. C.L.K. and D.M. are supported by an R21 grant (R21AG060018) funded by the National Institute on Aging, National Institutes of Health, USA. D.M. is also supported by the University of Connecticut School of Medicine.

    Local genetic variation of inflammatory bowel disease in Basque population and its effect in risk prediction

    Inflammatory bowel disease (IBD) is characterised by chronic inflammation of the gastrointestinal tract. Although its aetiology remains unknown, environmental and genetic factors are involved in its development. Regarding genetics, more than 200 loci have been associated with IBD, but the transferability of those signals to the Basque population living in Northern Spain, a population with a distinctive genetic background, remains unknown. We have analysed 5,411,568 SNPs in 498 IBD cases and 935 controls from the Basque population. We found 33 suggestive loci (p 0.68. In conclusion, we report on the genetic architecture of IBD in the Basque population and explore the performance of European-descent genetic risk scores in this population. Samples and data used in the present work were provided by the Basque Biobank (http://www.biobancovasco.org). We want to thank Miguel Angel Vesga from the Basque Centre of Transfusion and Human Tissues for providing access to the control samples. This work was funded to MD by Gipuzkoako Foru Aldundia/Diputacion Foral de Gipuzkoa. The project that gave rise to these results rece

    Detailed stratified GWAS analysis for severe COVID-19 in four European populations

    Publisher Copyright: © The Author(s) 2022. Given the highly variable clinical phenotype of Coronavirus disease 2019 (COVID-19), a deeper analysis of the host genetic contribution to severe COVID-19 is important to improve our understanding of underlying disease mechanisms. Here, we describe an extended genome-wide association meta-analysis of a well-characterized cohort of 3255 COVID-19 patients with respiratory failure and 12 488 population controls from Italy, Spain, Norway and Germany/Austria, including stratified analyses based on age, sex and disease severity, as well as targeted analyses of chromosome Y haplotypes, the human leukocyte antigen region and the SARS-CoV-2 peptidome. By inversion imputation, we traced a reported association at 17q21.31 to a ∼0.9-Mb inversion polymorphism that creates two highly differentiated haplotypes and characterized the potential effects of the inversion in detail. Our data, together with the 5th release of summary statistics from the COVID-19 Host Genetics Initiative including non-Caucasian individuals, also identified a new locus at 19q13.33, including NAPSA, a gene which is expressed primarily in alveolar cells responsible for gas exchange in the lung.

    HLA-DPA1*02:01~B1*01:01 is a risk haplotype for primary sclerosing cholangitis mediating activation of NKp44+ NK cells

    Objective: Primary sclerosing cholangitis (PSC) is characterised by bile duct strictures and progressive liver disease, eventually requiring liver transplantation. Although the pathogenesis of PSC remains incompletely understood, strong associations with HLA class II haplotypes have been described. As specific HLA-DP molecules can bind the activating NK-cell receptor NKp44, we investigated the role of HLA-DP/NKp44 interactions in PSC. Design: Liver tissue and intrahepatic and peripheral blood lymphocytes of individuals with PSC and control individuals were characterised using flow cytometry, immunohistochemical and immunofluorescence analyses. HLA-DPA1 and HLA-DPB1 imputation and association analyses were performed in 3408 individuals with PSC and 34 213 controls. NK cell activation on NKp44/HLA-DP interactions was assessed in vitro using plate-bound HLA-DP molecules and HLA-DPB wildtype versus knock-out human cholangiocyte organoids. Results: NKp44+ NK cells were enriched in livers, and intrahepatic bile ducts of individuals with PSC showed higher expression of HLA-DP. HLA-DP haplotype analysis revealed a highly elevated PSC risk for HLA-DPA1*02:01~B1*01:01 (OR 1.99, p = 6.7 × 10^-50). Primary NKp44+ NK cells exhibited significantly higher degranulation in response to plate-bound HLA-DPA1*02:01-DPB1*01:01 than to control HLA-DP molecules, and this response was inhibited by anti-NKp44 blocking. Human cholangiocyte organoids expressing HLA-DPA1*02:01-DPB1*01:01 after IFN-γ exposure demonstrated significantly increased binding to NKp44-Fc constructs compared with unstimulated controls. Importantly, HLA-DPA1*02:01-DPB1*01:01-expressing organoids increased degranulation of NKp44+ NK cells compared with HLA-DPB1-KO organoids. Conclusion: Our studies identify a novel PSC risk haplotype, HLA-DPA1*02:01~DPB1*01:01, and provide clinical and functional data implicating NKp44+ NK cells that recognise HLA-DPA1*02:01-DPB1*01:01 expressed on cholangiocytes in PSC pathogenesis.