653 research outputs found

Commercial chicken breeds exhibit highly divergent patterns of linkage disequilibrium

Author: A A Gheyas
A Collins
A Kranis
A Krasikova
AA Gheyas
AJ Jeffreys
AR Quinlan
C Andreescu
C-J Rubin
D Karolchik
D W Burt
E G Seaby
E Mossotto
EM Heifetz
F Baudat
H-J Megens
I Fumasoni
J Aerts
J Gibson
LW Hillier
M Schmid
MAM Groenen
MG Elferink
MJ Daly
MS Khatkar
N Maniatis
NE Morton
PF O’Reilly
PL Oliver
R J Pengelly
R Kuo
RJ Pengelly
S Ennis
S Myers
S Purcell
S Qanbari
S Service
S Singhal
S Wright
ST Sherry
T-Y Kuo
W Lau
W Tapper
W Zhang
Publication venue: Springer Science and Business Media LLC
Publication date: 23/05/2016

The analysis of linkage disequilibrium (LD) underpins the development of effective genotyping technologies, trait mapping and understanding of biological mechanisms such as those driving recombination and the impact of selection. We apply the Malécot-Morton model of LD to create additive LD maps that describe the high-resolution LD landscape of commercial chickens. We investigated LD in chickens (Gallus gallus) at the highest resolution to date for broiler, white egg and brown egg layer commercial lines. There is minimal concordance of fine-scale LD patterns between breeds (correlation coefficient <0.21), and even between discrete broiler lines. Regions of LD breakdown, which may align with recombination hot spots, are enriched near CpG islands and transcription start sites (P<2.2 × 10⁻¹⁶), consistent with recent evidence described in finches, but concordance in hot spot locations between commercial breeds is only marginally greater than random. As in other birds, functional elements in the chicken genome are associated with recombination but, unlike evidence from other bird species, the LD landscape is not stable in the populations studied. The development of optimal genotyping panels for genome-led selection programmes will depend on careful analysis of the LD structure of each line of interest. Further study is required to fully elucidate the mechanisms underlying highly divergent LD patterns found in commercial chickens.

Southampton (e-Prints Soton)

Crossref

ZENODO

Dryad Digital Repository (Duke University)

PubMed Central

Edinburgh Research Explorer

Electronic Archiving System

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

University of Queensland eSpace

Identification and Replication of Three Novel Myopia Common Susceptibility Gene Loci on Chromosome 3q26 using Linkage and Linkage Disequilibrium Mapping

Author: A Kong
A Santel
A Sik
C Alexander
C Griffin
Christopher J. Hammond
CJ Hammond
D Thierry-Mieg
DF Gudbjartsson
DV Jeyaraju
E Gottlieb
Francis Carbonaro
Greg Gibson
H Buch
J Altmuller
J Fantes
J Wallman
JB Richards
K Kimura
M Dirani
M Ferre
M Uchikawa
MJ Barber
N Maniatis
Nikolas Maniatis
R Dandona
S Cipolat
S Vitale
S. H. Melissa Liew
SG Crewther
SM Saw
TD Spector
Tim D. Spector
TL Young
Toby Andrew
V Davies
W Lau
W Zhang
Winston Lau
Publication venue: Public Library of Science
Publication date: 01/10/2008

Refractive error is a highly heritable quantitative trait responsible for considerable morbidity. Following an initial genome-wide linkage study using microsatellite markers, we confirmed evidence for linkage to chromosome 3q26 and then conducted fine-scale association mapping using high-resolution linkage disequilibrium unit (LDU) maps. We used a preliminary discovery marker set across the 30-Mb region with an average SNP density of 1 SNP/15 kb (Map 1). Map 1 was divided into 51 LDU windows and additional SNPs were genotyped for six regions (Map 2) that showed preliminary evidence of multi-marker association using composite likelihood. A total of 575 cases and controls selected from the tails of the trait distribution were genotyped for the discovery sample. Malecot model estimates indicate three loci with putative common functional variants centred on MFN1 (180,566 kb; 95% confidence interval 180,505–180,655 kb), approximately 156 kb upstream from alternate-splicing SOX2OT (182,595 kb; 95% CI 182,533–182,688 kb) and PSARL (184,386 kb; 95% CI 184,356–184,411 kb), with the loci showing modest to strong evidence of association for the Map 2 discovery samples (p<10⁻⁷, p<10⁻¹⁰, and p = 0.01, respectively). Using an unselected independent sample of 1,430 individuals, results replicated for the MFN1 (p = 0.006), SOX2OT (p = 0.0002), and PSARL (p = 0.0005) gene regions. MFN1 and PSARL both interact with OPA1 to regulate mitochondrial fusion and the inhibition of mitochondrial-led apoptosis, respectively. That two mitochondrial regulatory processes in the retina are implicated in the aetiology of myopia is surprising and is likely to provide novel insight into the molecular genetic basis of common myopia.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

UCL Discovery

King's Research Portal

Efficient analysis and storage of large-scale genomic data

Author: Klarqvist Marcus
Publication venue: University of Cambridge
Publication date: 01/09/2019

The impending advent of population-scale sequencing cohorts involving tens of millions of individuals with matched phenotypic measurements will produce unprecedented volumes of genetic data. Storing and analysing such gargantuan datasets places computational performance at a pivotal position in medical genomics. In this thesis, I explore the potential for accelerating and parallelizing standard genetics workflows, file formats, and algorithms using hardware-accelerated vectorization, parallel and distributed algorithms, and heterogeneous computing. First, I describe a novel bit-counting operation termed the positional population-count, which can be used together with succinct representations and standard efficient operations to accelerate many genetic calculations. To enable the use of this new operator and the canonical population count on any target machine, I developed a unified low-level library that uses CPU dispatching to select the optimal method at run-time, contingent on the available instruction set architecture and the given input size. As a proof-of-principle application, I apply the positional population-count operator to computing quality control-related summary statistics for terabyte-scale sequencing readsets with >3,800-fold speed improvements. As another application, I describe a framework for efficiently computing the cardinality of set intersection using these operators and apply this framework to compute genome-wide linkage-disequilibrium in datasets with up to 67 million samples, resulting in up to >60-fold improvements in speed for dense genotypic vectors and up to >250,000-fold savings in memory and >100,000-fold improvement in speed for sparse genotypic vectors. I next describe a framework for handling the terabytes of compressed output data and describe graphical routines for visualizing long-range linkage-disequilibrium blocks as seen over many human centromeres.
Finally, I describe efficient algorithms for storing and querying very large genetic datasets and specialized algorithms for the genotype component of such datasets with >10,000-fold savings in memory compared to the current interchange format.
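
To illustrate how linkage disequilibrium can reduce to counting set bits and the cardinality of a set intersection, here is a minimal Python sketch. It is not the thesis's library or implementation: the function names `popcount` and `r_squared` are hypothetical, each site is assumed to be biallelic and encoded as a Python integer whose bits mark which haplotypes carry the alternate allele, and the statistic shown is the standard r² between two sites.

```python
def popcount(x: int) -> int:
    """Count the set bits in a non-negative integer (the population count)."""
    return bin(x).count("1")


def r_squared(a: int, b: int, n: int) -> float:
    """Return r^2 between two biallelic sites.

    a, b: bit-vectors (assumed encoding) where bit i is 1 if haplotype i
          carries the alternate allele at that site.
    n:    total number of haplotypes.
    """
    n_a = popcount(a)        # carriers at site A
    n_b = popcount(b)        # carriers at site B
    n_ab = popcount(a & b)   # cardinality of the intersection of carrier sets

    p_a, p_b, p_ab = n_a / n, n_b / n, n_ab / n
    d = p_ab - p_a * p_b     # LD coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return float("nan")  # r^2 is undefined when a site is monomorphic
    return d * d / denom


# Four haplotypes: identical carrier sets are in complete LD,
# while independent carrier sets show none.
print(r_squared(0b1100, 0b1100, 4))  # 1.0
print(r_squared(0b1100, 0b1010, 4))  # 0.0
```

The only operations on the genotype data are a bitwise AND and three population counts, which is why fast population-count routines matter for this kind of calculation.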

Apollo (Cambridge)

Inference of Population Structure using Dense Haplotype Data

Author: AL Price
AP Dempster
B Wen
BE Engelhardt
D Falush
D Gamerman
D Reich
Daniel Falush
Daniel John Lawson
DF Conrad
DH Alexander
E Durand
EE Bacon
G Guillot
G Hellenthal
G McVean
Garrett Hellenthal
Gregory P. Copenhaver
H Tang
HC Fan
J Corander
J Novembre
J Pella
JK Pickrell
JK Pritchard
JO Kitzman
JP Huelsenbeck
JZ Li
KJ Dawson
L Zhivotovsky
LM Gattepaille
M Jakobsson
M Pellecchia
N Cardin
N Li
N Patterson
N Rosenberg
NA Rosenberg
P Donnelly
P Menozzi
P Scheet
R Hernandez
RR Hudson
S Sankararaman
SA Tishkoff
Simon Myers
SR Browning
T Jombart
T Niu
Publication venue: Public Library of Science
Publication date: 01/01/2012

The advent of genome-wide dense variation data provides an opportunity to investigate ancestry in unprecedented detail, but presents new statistical challenges. We propose a novel inference framework that aims to efficiently capture information on population structure provided by patterns of haplotype similarity. Each individual in a sample is considered in turn as a recipient, whose chromosomes are reconstructed using chunks of DNA donated by the other individuals. Results of this “chromosome painting” can be summarized as a “coancestry matrix,” which directly reveals key information about ancestral relationships among individuals. If markers are viewed as independent, we show that this matrix almost completely captures the information used by both standard Principal Components Analysis (PCA) and model-based approaches such as STRUCTURE in a unified manner. Furthermore, when markers are in linkage disequilibrium, the matrix combines information across successive markers to increase the ability to discern fine-scale population structure using PCA. In parallel, we have developed an efficient model-based approach to identify discrete populations using this matrix, which offers advantages over PCA in terms of interpretability and over existing clustering algorithms in terms of speed, number of separable populations, and sensitivity to subtle population structure. We analyse Human Genome Diversity Panel data for 938 individuals and 641,000 markers, and we identify 226 populations reflecting differences on continental, regional, local, and family scales. We present multiple lines of evidence that, while many methods capture similar information among strongly differentiated groups, more subtle population structure in human populations is consistently present at a much finer level than currently available geographic labels and is only captured by the haplotype-based approach. 
The software used for this article, ChromoPainter and fineSTRUCTURE, is available from http://www.paintmychromosomes.com/

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

UCL Discovery

Oxford University Research Archive

MPG.PuRe

Explore Bristol Research

FigShare

High-resolution genetic maps identify multiple Type 2 diabetes loci at regulatory hotspots in African Americans and Europeans

Author: Andrew T
Lau W
Maniatis N
Publication venue: Elsevier BV
Publication date: 11/04/2017

Interpretation of results from genome-wide association studies for T2D is challenging. Only very few loci have been replicated in African ancestry populations and the identification of the implicated functional genes remains largely undefined. We used genetic maps that capture detailed linkage disequilibrium information in European and African Americans and applied these to large T2D case-control samples in order to estimate locations for putative functional variants in both populations. Replicated T2D locations were tested for evidence of being regulatory hotspots using adipose expression. We validated a sample of our co-location intervals using next generation sequencing and functional annotation, including enhancers, transcription, and chromatin modifications. We identified 111 additional disease-susceptibility locations, 93 of which are cosmopolitan and 18 of which are European specific. We show that many previously known signals are also risk loci in African Americans. The majority of the disease locations appear to confer risk of T2D via the regulation of expression levels for a large number (266) of cis-regulated genes, the majority of which are not the nearest genes to the disease loci. Sequencing three cosmopolitan locations provided candidate functional variants that precisely co-locate with cell-specific chromatin domains and pancreatic islet enhancers. These variants have large effect sizes and are common across populations. Results show that disease-associated loci in different populations, gene expression, and cell-specific regulatory annotation can be effectively integrated by localizing these effects on high-resolution genetic maps. The cis-regulated genes provide insights into the complex molecular pathways involved and can be used as targets for sequencing and functional molecular studies.

UCL Discovery

Spiral - Imperial College Digital Repository

Statistical perspectives on dependencies between genomic markers

Author: Wittenburg Dörte (gnd: 136382207)
Publication venue: Universität Rostock Rostock
Publication date

To study the genetic impact on a quantitative trait, molecular markers are used as predictor variables in a statistical model. This habilitation thesis elucidates the challenges associated with such investigations. First, the usefulness of including different kinds of genetic effects, which can be additive or non-additive, was verified. Second, dependencies between markers caused by their proximity on the genome were studied in populations with family stratification. The resulting covariance matrix deserved special attention due to its multi-functionality in several fields of genomic evaluations.

Rostocker Dokumentenserver

Sequencing and de novo assembly of 150 genomes from Denmark as a population reference

Author: A Helgason
A Kong
A Telenti
AD Børglum
Ali Syed
Anders D. Børglum
Anders E. Halager
Anders Krogh
Bent Petersen
BJ Stucky
Chen Ye
Christian N. S. Pedersen
Christian Theil Have
Christina M. Hultman
David Westergaard
DF Gudbjartsson
Esben Flindt
Francesco Lescai
G Lunter
GA Van der Auwera
GD Poznik
GM Cooper
H Cao
H Eiberg
H Kupfermann
H Li
Hans Eiberg
Hongzhi Cao
J Huddleston
Jacob Malte Jensen
Jakob Grove
Jette Bork-Jensen
Jihua Sun
Johan van Beusekom
Jonas Andreas Sibbesen
Jose M. G. Izarzugaza
JS Seo
JT Simpson
Jun Wang
Junhua Rao
K Katoh
K Tamura
Karsten Kristiansen
Kirstine Belling
KM Steinberg
L Paternoster
Lars Bolund
Lasse Maretty
Laurits Skov
LC Francioli
M Lek
M Nothnagel
M Oven
M Pendleton
MA Eberle
Maria Luisa Matey-Hernandez
Marie Grosjean
MC Frith
Mikkel Heide Schierup
MR Hoehe
Ning Li
Ole Lund
Ole Mors
Oluf Pedersen
P Rice
Palle Villesen
Patrick Sullivan
Peter Løngren
PH Sudmant
PL Auer
R Hubley
R Luo
Rachita Yadav
Ramneek Gupta
Ruiqi Xu
Rune M. Friborg
S Besenbacher
S Deorowicz
S Gnerre
S Liu
S Ripke
SF Altschul
Shengting Li
Shujia Huang
Simon Rasmussen
Siyang Liu
SM Kiełbasa
Stephanie Le Hellard
Søren Besenbacher
Søren Brunak
T Espeseth
T Magoč
Thomas D. Als
Thomas Espeseth
Thomas Mailund
Thomas Sicheritz-Pontén
Thorkild I. A. Sørensen
Torben Hansen
VA Schneider
Weijian Ye
WP Kloosterman
WS Wong
Xiaosen Guo
Xun Xu
Yuqi Chang
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2017

Hundreds of thousands of human genomes are now being sequenced to characterize genetic variation and use this information to augment association mapping studies of complex disorders and other phenotypic traits. Genetic variation is identified mainly by mapping short reads to the reference genome or by performing local assembly. However, these approaches are biased against discovery of structural variants and variation in the more complex parts of the genome. Hence, large-scale de novo assembly is needed. Here we show that it is possible to construct excellent de novo assemblies from high-coverage sequencing with mate-pair libraries extending up to 20 kilobases. We report de novo assemblies of 150 individuals (50 trios) from the GenomeDenmark project. The quality of these assemblies is similar to those obtained using the more expensive long-read technology. We use the assemblies to identify a rich set of structural variants including many novel insertions and demonstrate how this variant catalogue enables further deciphering of known association mapping signals. We leverage the assemblies to provide 100 completely resolved major histocompatibility complex haplotypes and to resolve major parts of the Y chromosome. Our study provides a regional reference genome that we expect will improve the power of future association mapping studies and hence pave the way for precision medicine initiatives, which are now being launched in many countries including Denmark.

Crossref

Copenhagen University Research Information System

Carolina Digital Repository

Online Research Database In Technology

Bayesian Statistical Methods for Genetic Association Studies with Case-Control and Cohort Design

Author: Tachmazidou Ioanna
Publication venue: Epidemiology and Public Health, Imperial College London
Publication date: 01/03/2009

Large-scale genetic association studies are carried out with the hope of discovering single nucleotide polymorphisms involved in the etiology of complex diseases. We propose a coalescent-based model for association mapping which potentially increases the power to detect disease-susceptibility variants in genetic association studies with case-control and cohort design. The approach uses Bayesian partition modelling to cluster haplotypes with similar disease risks by exploiting evolutionary information. We focus on candidate gene regions and we split the chromosomal region of interest into sub-regions or windows of high linkage disequilibrium (LD) therein assuming a perfect phylogeny. The haplotype space is then partitioned into disjoint clusters within which the phenotype-haplotype association is assumed to be the same. The novelty of our approach lies in the fact that the distance used for clustering haplotypes has an evolutionary interpretation, as haplotypes are clustered according to the time to their most recent common mutation. Our approach is fully Bayesian and we develop Markov Chain Monte Carlo algorithms to sample efficiently over the space of possible partitions. We have also developed a Bayesian survival regression model for high-dimensional, small-sample-size settings. We provide a Bayesian variable selection procedure and shrinkage tool by imposing shrinkage priors on the regression coefficients. We have developed a computationally efficient optimization algorithm to explore the posterior surface and find the maximum a posteriori estimates of the regression coefficients. We compare the performance of the proposed methods in simulation studies and using real datasets to both single-marker analyses and recently proposed multi-marker methods and show that our methods perform similarly in localizing the causal allele while yielding lower false positive rates. Moreover, our methods offer computational advantages over other multi-marker approaches.

Spiral - Imperial College Digital Repository