282 research outputs found

Using multiple alignments to improve seeded local alignment algorithms

Author: Batzoglou Serafim
Flannick Jason
Publication venue: Oxford University Press
Publication date: 01/01/2005

Multiple alignments among genomes are becoming increasingly prevalent. This trend motivates the development of tools for efficient homology search between a query sequence and a database of multiple alignments. In this paper, we present an algorithm that uses the information implicit in a multiple alignment to dynamically build an index that is weighted most heavily towards the promising regions of the multiple alignment. We have implemented Typhon, a local alignment tool that incorporates our indexing algorithm, which our test results show to be more sensitive than algorithms that index only a sequence. This suggests that when applied on a whole-genome scale, Typhon should provide improved homology searches in time comparable to existing algorithms.

CiteSeerX

Crossref

PubMed Central

Efficiency and Power as a Function of Sequence Coverage, SNP Array Density, and Imputation

Author: David Altshuler
Eric Banks
George B. Grant
Jason Flannick
Joshua M. Korn
Mark A. DePristo
Pierre Fontanillas
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2012

High-coverage whole-genome sequencing provides near complete information about genetic variation. However, other technologies can be more efficient in some settings by (a) reducing redundant coverage within samples and (b) exploiting patterns of genetic variation across samples. To characterize as many samples as possible, many genetic studies therefore employ lower-coverage sequencing or SNP array genotyping coupled to statistical imputation. To compare these approaches individually and in conjunction, we developed a statistical framework to estimate genotypes jointly from sequence reads, array intensities, and imputation. In European samples, we find similar sensitivity (89%) and specificity (99.6%) from imputation with either 1× sequencing or 1 M SNP arrays. Sensitivity is increased, particularly for low-frequency polymorphisms (MAF < 5%), when low-coverage sequence reads are added to dense genome-wide SNP arrays; the converse, however, is not true. At sites where sequence reads and array intensities produce different sample genotypes, joint analysis reduces genotype errors and identifies novel error modes. Our joint framework informs the use of next-generation sequencing in genome-wide association studies and supports development of improved methods for genotype calling.

CiteSeerX

Public Library of Science (PLOS)

DSpace@MIT

Crossref

Harvard University - DASH

Directory of Open Access Journals

PubMed Central

The Francis Crick Institute

Sequential PAttern mining using a bitmap representation

Author: Jason Flannick
Jay Ayres
Johannes Gehrke
Tomi Yiu
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2004

Crossref

Refining the accuracy of validated target identification through coding variant fine-mapping in type 2 diabetes.

Author: Afaq Saima
Afzal Shoaib
Ahlqvist Emma
Almgren Peter
Amin Najaf
An Ping
Bang Lia B
Bertoni Alain G
Bielak Lawrence F
Bombieri Cristina
Bork-Jensen Jette
Brandslund Ivan
Brody Jennifer A
Burtt Noël P
Canouil Mickaël
Chen Yii-Der Ida
Cho Yoon Shin
Christensen Cramer
Chu Audrey Y
Cook James P
de Haan Hugoline G
Demirkan Ayse
Eastwood Sophie V
Eckardt Kai-Uwe
ExomeBP Consortium
Fischer Krista
Flannick Jason
Gambaro Giovanni
Gan Wei
GIANT Consortium
Giedraitis Vilmantas
Graff Marielisa
Grarup Niels
Grove Megan L
Guo Xiuqing
Gustafsson Stefan
Hackinger Sophie
Hai Yang
Han Sohee
Highland Heather M
Hivert Marie-France
Hu Yao
Huo Shaofeng
Isomaa Bo
Jensen Richard A
Justice Anne E
Jäger Susanne
Jørgensen Marit E
Jørgensen Torben
Kim Bong-Jo
Kim Sung Soo
Kim Young Jin
Kitajima Hidetoshi
Koistinen Heikki A
Kovacs Peter
Kravic Jasmina
Kriebel Jennifer
Kronenberg Florian
Käräjämäki Annemari
Lange Leslie A
Lecoeur Cécile
Lee Jung-Jin
Lehne Benjamin
Li Huaixing
Li Jin
Li Man
Li-Gao Ruifang
Ligthart Symen
Lin Keng-Hung
Liu Dajiang J
Lohman Kurt K
Lu Yingchang
Läll Kristi
MAGIC Consortium
Mahajan Anubha
Malerba Giovanni
Marouli Eirini
Marten Jonathan
Meidtner Karina
Müller-Nurasyid Martina
Peloso Gina Marie
Preuss Michael
Prins Bram Peter
Rayner N William
Robertson Neil R
Rybin Denis V
Smith Albert Vernon
Steinthorsdottir Valgerdur
Tajes Juan Fernandez
Taliun Daniel
Trubetskoy Vassily Vladimirovich
Tybjærg-Hansen Anne
Varga Tibor V
Warren Helen R
Wessel Jennifer
Willems Sara M
Wuttke Matthias
Yaghootkar Hanieh
Zhang Weihua
Zhao Wei
Publication venue: eScholarship, University of California
Publication date: 01/04/2018

We aggregated coding variant data for 81,412 type 2 diabetes cases and 370,832 controls of diverse ancestry, identifying 40 coding variant association signals (P < 2.2 × 10⁻⁷); of these, 16 map outside known risk-associated loci. We make two important observations. First, only five of these signals are driven by low-frequency variants: even for these, effect sizes are modest (odds ratio ≤1.29). Second, when we used large-scale genome-wide association data to fine-map the associated variants in their regional context, accounting for the global enrichment of complex trait associations in coding sequence, compelling evidence for coding variant causality was obtained for only 16 signals. At 13 others, the associated coding variants clearly represent 'false leads' with potential to generate erroneous mechanistic inference. Coding variant associations offer a direct route to biological insight for complex diseases and identification of validated therapeutic targets; however, appropriate mechanistic inference requires careful specification of their causal contribution to disease predisposition.

eScholarship - University of California

Erratum: Sequence data and association statistics from 12,940 type 2 diabetes cases and controls.

Author: Abboud Hanna E
Agarwala Vineeta
Balkau Beverley
Barzilai Nir
Beer Nicola L
Below Jennifer E
Blackwell Thomas W
Boeing Heiner
Butterworth Adam S
Carey Jason
Caulkins Lizz
Chen Han
Chen Peng
Chen Yuhui
Chines Peter S
Cingolani Pablo
Danesh John
Day-Williams Aaron G
Dupuis Josee
Ferreira Teresa
Fingerlin Tasha
Flannick Jason
Fuchsberger Christian
Gamazon Eric R
Gaulton Kyle J
Giedraitis Vilmantas
Go Min Jin
Gottesman Omri
Grant George
Grarup Niels
Green Todd
Han Bok-Ghee
Hartl Christopher
Highland Heather M
Horikoshi Momoko
Howson Joanna MM
Hu Cheng
Huang Jinyan
Huh Iksoo
Huyghe Jeroen R
Ikram Mohammad Kamran
Jackson Anne U
Jenkinson Christopher P
Kim Bong-Jo
Kim Yongkang
Kim Young Jin
Koesterer Ryan
Kumar Ashish
Kuulasmaa Teemu
Kuusisto Johanna
Kwak Soo-Heon
Kwan Phoenix
Kwon Min-Seok
Lam Vincent KL
Lee Heung Man
Lee Jaehoon
Lee Juyoung
Lee Selyeong
Lin Keng-Han
Lindgren Cecilia M
Locke Adam E
Lu Yingchang
Ma Clement
Mahajan Anubha
Manning Alisa
Maxwell Taylor J
McCarthy Davis J
Moutsianas Loukas
Müller-Nurasyid Martina
Nagai Yoshihiko
Neale Benjamin M
Ng Maggie CY
Palmer Nicholette D
Parker Stephen CJ
Pasko Dorota
Pearson Richard D
Perry John RB
Prabhakaran Dorairaj
Purcell Shaun
Rayner N William
Rivas Manuel A
Robertson Neil R
Scott James
Scott Robert A
Sim Xueling
Smith Joshua D
Stančáková Alena
Stitzel Michael L
Stringham Heather M
Tajes Juan Fernandez
Teslovich Tanya M
van de Bunt Martijn
Varga Tibor V
Voight Benjamin F
Wang Xu
Welch Ryan P
Yoon Joon
Zhang Weihua
Zhao Wei
Publication venue: eScholarship, University of California
Publication date: 01/01/2018

This corrects the article DOI: 10.1038/sdata.2017.179

eScholarship - University of California