34 research outputs found

Technology dictates algorithms: Recent developments in read alignment

Author: Alkan Can
Alser Mohammed
Balliu Brunilda
Deshpande Dhrithi
Icer Baykal Pelin
Knyazev Sergey
Koslicki David
Mangul Serghei
Mutlu Onur
Rotman Jeremy
Shi Huwenbo
Singer Benjamin D.
Skums Pavel
Taraszka Kodi
Xue Victor
Yang Harry T.
Zelikovsky Alex
Publication date: 09/07/2020

Massively parallel sequencing techniques have revolutionized biological and medical sciences by providing unprecedented insight into the genomes of humans, animals, and microbes. Modern sequencing platforms generate enormous amounts of genomic data in the form of nucleotide sequences or reads. Aligning reads onto reference genomes enables the identification of individual-specific genetic variants and is an essential step of the majority of genomic analysis pipelines. Aligned reads are essential for answering important biological questions, such as detecting mutations driving various human diseases and complex traits as well as identifying species present in metagenomic samples. The read alignment problem is extremely challenging due to the large size of analyzed datasets and numerous technological limitations of sequencing platforms, and researchers have developed novel bioinformatics algorithms to tackle these difficulties. Importantly, computational algorithms have evolved and diversified in accordance with technological advances, leading to today's diverse array of bioinformatics tools. Our review provides a survey of algorithmic foundations and methodologies across 107 alignment methods published between 1988 and 2020, for both short and long reads. We provide rigorous experimental evaluation of 11 read aligners to demonstrate the effect of these underlying algorithms on speed and efficiency of read aligners. We separately discuss how longer read lengths produce unique advantages and limitations to read alignment techniques. We also discuss how general alignment algorithms have been tailored to the specific needs of various domains in biology, including whole transcriptome, adaptive immune repertoire, and human microbiome studies.

arXiv.org e-Print Archive

Repository for Publications and Research Data

Directory of Open Access Journals

Atlas of prostate cancer heritability in European and African-American men pinpoints tissue-specific regulation.

Author: Adjei Andrew A
Al Olama Ali Amin
Albanes Demetrius
Aly Markus
Arndt Volker
Batra Jyotsna
Benlloch Sara
Berg Christine
Berndt Sonja I
Biritwum Richard B
Blot William J
Brenner Hermann
Campa Daniele
Cannon-Albright Lisa
Carpten John
Casey Graham
Chokkalingam Anand P
Chu Lisa
Clements Judith A
Conti David V
Cook Michael B
Crawford E David
Cybulski Cezary
Dieffenbach Aida K
Diver W Ryan
Donovan Jenny L
Easton Douglas
Eeles Rosalind A
Fitzgerald Liesel M
Freedman Matthew L
Gapstur Susan M
Gaziano J Michael
Giles Graham G
Giovannucci Edward
Goodman Phyllis J
Gronberg Henrik
Gusev Alexander
Haiman Christopher A
Hamdy Freddie C
Henderson Brian E
Hennis Anslem JM
Herkommer Kathleen
Hoover Robert
Hsing Ann W
Hunter David J
Ingles Sue A
Isaacs William B
Johansson Mattias
John Esther M
Kaggwa Sam
Kaneva Radka
Key Tim J
Khaw Kay-Tee
Kibel Adam S
Kichaev Gleb
Kierzek Andrzej
Kittles Rick A
Klein Eric A
Kluzniak Wojciech
Kote-Jarai Zsofia
Kraft Peter
Le Marchand Loic
Leske M Cristina
Li Fugen
Lin Hui-Yi
Lindström Sara
Long Henry W
Luedeke Manuel
Maia Sofia
Maier Christiane
McDonnell Shannon K
Michael Agnieszka
Mitev Vanio
Muir Kenneth
Murphy Adam B
Navarro Carmen
Neal David E
Nemesure Barbara
Neslund-Dudas Christine
Niwa Shelley
Nordestgaard Børge G
Overvad Kim
Pandha Hardev
Park Jong Y
Pasaniuc Bogdan
Pashayan Nora
Paulo Paula
Pettaway Curtis A
Pharoah Paul
Pomerantz Mark
PRACTICAL consortium
Price Alkes L
Raychaudhuri Soumya
Riboli Elio
Rybicki Benjamin A
Schaid Daniel J
Schleutker Johanna
Schumacher Frederick R
Sellers Thomas A
Shi Huwenbo
Siddiq Afshan
Signorello Lisa B
Slavov Chavdar
Southey Melissa C
Spurdle Amanda
Stanford Janet L
Stevens Victoria L
Stram Daniel O
Strom Sara S
Tammela Teuvo LJ
Tay Evelyn
Teerlink Craig
Teixeira Manuel R
Tettey Yao
Thibodeau Stephen N
Travis Ruth C
Trichopoulos Dimitrios
Truelove Ann
Trynka Gosia
Vineis Paolo
Vogel Walther
Wahlfors Tiina
Wiklund Fredrik
Witte John S
Wokolorczyk Dominika
Wu Suh-Yuh
Yeager Meredith
Yeboah Edward D
Zheng Wei
Publication venue: Nat Commun
Publication date: 01/01/2016

Although genome-wide association studies have identified over 100 risk loci that explain ∼33% of familial risk for prostate cancer (PrCa), their functional effects on risk remain largely unknown. Here we use genotype data from 59,089 men of European and African American ancestries combined with cell-type-specific epigenetic data to build a genomic atlas of single-nucleotide polymorphism (SNP) heritability in PrCa. We find significant differences in heritability between variants in prostate-relevant epigenetic marks defined in normal versus tumour tissue as well as between tissue and cell lines. The majority of SNP heritability lies in regions marked by H3K27 acetylation in prostate adenocarcinoma cell line (LNCaP) or by DNaseI hypersensitive sites in cancer cell lines. We find a high degree of similarity between European and African American ancestries suggesting a similar genetic architecture from common variation underlying PrCa risk. Our findings showcase the power of integrating functional annotation with genetic data to understand the genetic basis of PrCa. This work was supported by NIH fellowship F32 GM106584 (A.G.), NIH grants R01 MH101244 (A.G.), R01 CA188392 (B.P.), U01 CA194393 (B.P.), R01 GM107427 (M.L.F.), R01 CA193910 (M.L.F./M.P.) and Prostate Cancer Foundation Challenge Award (M.L.F./M.P.). This study makes use of data generated by the Wellcome Trust Case Control Consortium and the Wellcome Trust Sanger Institute. A full list of the investigators who contributed to the generation of the Wellcome Trust Case Control Consortium data is available on www.wtccc.org.uk. Funding for the Wellcome Trust Case Control Consortium project was provided by the Wellcome Trust under award 076113. This study makes use of data generated by the UK10K Consortium. A full list of the investigators who contributed to the generation of the data is available online (http://www.UK10K.org).
The PRACTICAL consortium was supported by the following grants: European Commission's Seventh Framework Programme grant agreement n° 223175 (HEALTH-F2-2009-223175), Cancer Research UK Grants C5047/A7357, C1287/A10118, C5047/A3354, C5047/A10692, C16913/A6135 and The National Institute of Health (NIH) Cancer Post-Cancer GWAS initiative Grant: no. 1 U19 CA 148537-01 (the GAME-ON initiative); Cancer Research UK (C1287/A10118, C1287/A10710, C12292/A11174, C1281/A12014, C5047/A8384, C5047/A15007 and C5047/A10692), the National Institutes of Health (CA128978) and Post-Cancer GWAS initiative (1U19 CA148537, 1U19 CA148065 and 1U19 CA148112; the GAME-ON initiative), the Department of Defense (W81XWH-10-1-0341), A Linneus Centre (Contract ID 70867902), Swedish Research Council (grant no K2010-70X-20430-04-3), the Swedish Cancer Foundation (grant no 09-0677), grants RO1CA056678, RO1CA082664 and RO1CA092579 from the US National Cancer Institute, National Institutes of Health; US National Cancer Institute (R01CA72818); support from The National Health and Medical Research Council, Australia (126402, 209057, 251533, 396414, 450104, 504700, 504702, 504715, 623204, 940394 and 614296); NIH grants CA63464, CA54281 and CA098758; US National Cancer Institute (R01CA128813, PI: J.Y. Park); Bulgarian National Science Fund, Ministry of Education and Science (contract DOO-119/2009; DUNK01/2–2009; DFNI-B01/28/2012); Cancer Research UK grants [C8197/A10123] and [C8197/A10865]; grant code G0500966/75466; NIHR Health Technology Assessment Programme (projects 96/20/06 and 96/20/99); Cancer Research UK grant number C522/A8649, Medical Research Council of England grant number G0500966, ID 75466 and The NCRI, UK; The US Dept of Defense award W81XWH-04-1-0280; Australia Project Grant [390130, 1009458] and Enabling Grant [614296 to APCB]; the Prostate Cancer Foundation of Australia (Project Grant [PG7] and Research infrastructure grant [to APCB]); NIH grant R01 CA092447; Vanderbilt-Ingram Cancer Center (P30 CA68485); Cancer Research UK [C490/A10124] and supported by the UK National Institute for Health Research Biomedical Research Centre at the University of Cambridge; Competitive Research Funding of the Tampere University Hospital (9N069 and X51003); Award Number P30CA042014 from the National Cancer Institute. This is the final version of the article. It first appeared from Nature Publishing Group via http://dx.doi.org/0.1038/ncomms1097

Publikationer från Umeå universitet

Henry Ford Health System Scholarly Commons

Archivio della Ricerca - Università di Pisa

PubMed Central

eScholarship - University of California

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Apollo (Cambridge)

Explore Bristol Research

Atlas of prostate cancer heritability in European and African-American men pinpoints tissue-specific regulation

Author: Adjei Andrew A.
Al Olama Ali Amin
Benlloch Sara
Biritwum Richard B.
Blot William J.
Carpten John
Casey Graham
Chokkalingam Anand P.
Chu Lisa
Cook Michael B.
Easton Douglas
Eeles Rosalind A.
Fitzgerald Liesel M.
Giles Graham G.
Goodman Phyllis J.
Gusev Alexander
Hennis Anslem J. M.
Hsing Ann W.
Ingles Sue A.
Isaacs William B.
John Esther M.
Kaggwa Sam
Kichaev Gleb
Kittles Rick A.
Klein Eric A.
Kote-Jarai Zsofia
Leske M. Cristina
Li Fugen
Long Henry W.
Muir Kenneth
Murphy Adam B.
Nemesure Barbara
Neslund-Dudas Christine
Niwa Shelley
Pettaway Curtis A.
Pomerantz Mark
PRACTICAL Consortium
Rybicki Benjamin A.
Shi Huwenbo
Signorello Lisa B.
Southey Melissa C.
Stram Daniel O.
Strom Sara S.
Taari Kimmo
Tay Evelyn
Tettey Yao
Truelove Ann
Witte John S.
Wu Suh-Yuh
Yeboah Edward D.
Zheng Wei
Publication date: 01/01/2016

Although genome-wide association studies have identified over 100 risk loci that explain ∼33% of familial risk for prostate cancer (PrCa), their functional effects on risk remain largely unknown. Here we use genotype data from 59,089 men of European and African American ancestries combined with cell-type-specific epigenetic data to build a genomic atlas of single-nucleotide polymorphism (SNP) heritability in PrCa. We find significant differences in heritability between variants in prostate-relevant epigenetic marks defined in normal versus tumour tissue as well as between tissue and cell lines. The majority of SNP heritability lies in regions marked by H3K27 acetylation in prostate adenocarcinoma cell line (LNCaP) or by DNaseI hypersensitive sites in cancer cell lines. We find a high degree of similarity between European and African American ancestries suggesting a similar genetic architecture from common variation underlying PrCa risk. Our findings showcase the power of integrating functional annotation with genetic data to understand the genetic basis of PrCa. Peer reviewed.

Copenhagen University Research Information System

The University of Manchester - Institutional Repository

Helsingin yliopiston digitaalinen arkisto

Atlas of prostate cancer heritability in European and African-American men pinpoints tissue-specific regulation

Author: Adjei Andrew A.
Adolfson Jan
Al Olama Ali Amin
Albanes Demetrius
Alexander Kimberly
Aly Markus
Arndt Volker
Auvinen Anssi
Barros-Silva Joao
Batra Jyotsna
Benlloch Sara
Berg Christine
Berndt Sonja I.
Biritwum Richard B.
Blot William J.
Brenner Hermann
Broms Michael
Brown Paul
Campa Daniele
Cannon-Albright Lisa
Carpten John
Casey Graham
Cavalli-Bjoerkman Carin
Chokkalingam Anand P.
Christova Svetlana
Chu Lisa
Clements Judith A.
Collins Angus
Conti David V.
Cook Margaret
Cook Michael B.
Cox Angela
Crawford E. David
Cybulski Cezary
Dadaev Tokhir
Davis Michael
Dieffenbach Aida K.
Dikov Tihomir
Diver W. Ryan
Donovan Jenny L.
Easton Douglas
Eckert Allison
Eeles Rosalind A.
Fisher Cyril
Fitzgerald Liesel M.
Freedman Matthew L.
Gapstur Susan M.
Gaziano J. Michael
George Anne
Giles Graham G.
Giovannucci Edward
Goodman Phyllis J.
Govindasami Koveela
Gronberg Henrik
Gusev Alexander
Gustafsson Sven
Guy Michelle
Haiman Christopher A.
Haley James
Hamdy Freddie C.
Hazel Steve
Heathcote Peter
Henderson Brian E.
Hennis Anslem J. M.
Henrique Rui
Herkommer Kathleen
Hoover Robert
Hopper John L.
Hsing Ann W.
Hunter David J.
Ingles Sue A.
Isaacs William B.
Iversen Peter
Johansson Jan-Erik
Johansson Mattias
John Esther M.
Kachakova Darina
Kaggwa Sam
Kaneva Radka
Karlsson Ami
Kedda Mary-Anne
Kerr Kris
Key Tim J.
Khaw Kay-Tee
Kibel Adam S.
Kichaev Gleb
Kierzek Andrzej
Kittles Rick A.
Klarskov Peter
Klein Eric A.
Kluzniak Wojciech
Kote-Jarai Zsofia
Kraft Peter
Kujala Paula
Lane Athene
Le Marchand Loic
Leongamornlert Daniel
Leske M. Cristina
Li Fugen
Lin Hui-Yi
Lindström Sara
Livni Naomi
Long Henry W.
Lophatananon Artitaya
Lubinski Jan
Luedeke Manuel
Maeaettaenen Liisa
Maia Sofia
Maier Christiane
Malone Greg
Marsden Gemma
McDonnell Shannon K.
Michael Agnieszka
Mitev Vanio
Mitkova Atanaska
Morgan Angela
Muir Kenneth
Murphy Adam B.
Murtola Teemu
Navarro Carmen
Neal David E.
Nemesure Barbara
Neslund-Dudas Christine
Nielsen Sune F.
Niwa Shelley
Nordestgaard Børge G.
Omara Tracy
Overvad Kim
Pandha Hardev
Park Hyun
Park Jong Y.
Pasaniuc Bogdan
Pashayan Nora
Paulo Paula
Pedersen John
Pettaway Curtis A.
Pharoah Paul
Pinto Pedro
Pomerantz Mark
Popov Elenko
Pow-Sang Julio
Price Alkes L.
Radlein Selina
Raychaudhuri Soumya
Riboli Elio
Rinckleb Antje
Rincon Maria
Riska Shaun
Roder Andreas
Rybicki Benjamin A.
Santos Joana
Saunders Edward J.
Saunders Pamela
Sawyer Emma J.
Schaid Daniel J.
Schleutker Johanna
Schumacher Frederick R.
Sellers Thomas A.
Shi Huwenbo
Siddiq Afshan
Signorello Lisa B.
Slavov Chavdar
Southey Melissa C.
Spurdle Amanda
Srinivasan Srilakshmi
Stanford Janet L.
Stattin Paer
Stegmaier Christa
Stevens Victoria L.
Stram Daniel O.
Strom Sara S.
Taari Kimmo
Tammela Teuvo L. J.
Tay Evelyn
Teerlink Craig
Teixeira Manuel R.
Tettey Yao
Thibodeau Stephen N.
Tillmans Lori
Travis Ruth C.
Trichopoulos Dimitrios
Truelove Ann
Trynka Gosia
Turner Megan
Tymrakiewicz Malgorzata
Vineis Paolo
Vlahova Aleksandrina
Vogel Walther
Wahlfors Tiina
Wallinder Hans
Wang Liang
Weischer Maren
Wiklund Fredrik
Wilkinson Rosemary
Witte John S.
Wokolorczyk Dominika
Wood Glenn
Wu Huihai
Wu Suh-Yuh
Yeadon Trina
Yeager Meredith
Yeboah Edward D.
Zachariah Babu
Zheng Wei
Publication venue: Springer Science and Business Media LLC
Publication date: 02/05/2016

Harvard University - DASH

Local Genetic Correlation Gives Insights into the Shared Genetic Architecture of Complex Traits.

Author: Shi Huwenbo
Publication date: 15/05/2023

Ezid

Contrasting the Genetic Architecture of 30 Complex Traits from Summary Association Data

Author: Shi Huwenbo
Publication date: 04/05/2023

Ezid

A multivariate Bernoulli model to predict DNaseI hypersensitivity status from haplotype data

Author: Shi Huwenbo
Publication date: 17/04/2023

Ezid

Computational methods to analyze large-scale genetic studies of complex human traits

Author: Shi Huwenbo
Publication venue: eScholarship, University of California
Publication date: 01/01/2018

Large-scale genome-wide association studies (GWAS) have produced a rich resource of genetic data over the past decade, urging the need to develop computational and statistical methods that analyze these data. This dissertation presents four statistical methods that model the correlation structure between genetic variants and its effect on GWAS summary association statistics to help understand the genetic basis of complex human traits and diseases. The first method employs the multivariate Bernoulli distribution to model haplotype data, allowing for higher-order interactions among genetic variants, and shows better accuracy in predicting DNase I hypersensitivity status. The second method partitions heritability into small regions on the genome using GWAS summary statistics data, while accounting for complex correlation structures among genetic variants, and uncovers the genetic architectures of complex human traits and diseases. Extending the second method into pairs of traits, the third method partitions genetic correlation into small genomic regions using GWAS summary statistics data, and provides insights into the shared genetic basis between pairs of traits. Finally, the fourth method dissects population-specific and shared causal genetic variants of complex traits in two continental populations, using GWAS summary statistics data obtained from samples of different ethnicities, and reveals differences in genetic architectures of two continental populations.

Ezid

eScholarship - University of California