36 research outputs found
Integrating sequence and array data to create an improved 1000 Genomes Project haplotype reference panel
A major use of the 1000 Genomes Project (1000GP) data is genotype imputation in genome-wide association studies (GWAS). Here we develop a method to estimate haplotypes from low-coverage sequencing data that can take advantage of single-nucleotide polymorphism (SNP) microarray genotypes on the same samples. First the SNP array data are phased to build a backbone (or 'scaffold') of haplotypes across each chromosome. We then phase the sequence data 'onto' this haplotype scaffold. This approach can take advantage of relatedness between sequenced and non-sequenced samples to improve accuracy. We use this method to create a new 1000GP haplotype reference set for use by the human genetic community. Using a set of validation genotypes at SNP and bi-allelic indels we show that these haplotypes have lower genotype discordance and improved imputation performance into downstream GWAS samples, especially at low-frequency variants. © 2014 Macmillan Publishers Limited. All rights reserved
Robust estimation of bacterial cell count from optical density
Optical density (OD) is widely used to estimate the density of cells in liquid culture, but cannot be compared between instruments without a standardized calibration protocol and is challenging to relate to actual cell count. We address this with an interlaboratory study comparing three simple, low-cost, and highly accessible OD calibration protocols across 244 laboratories, applied to eight strains of constitutive GFP-expressing E. coli. Based on our results, we recommend calibrating OD to estimated cell count using serial dilution of silica microspheres, which produces highly precise calibration (95.5% of residuals <1.2-fold), is easily assessed for quality control, also assesses instrument effective linear range, and can be combined with fluorescence calibration to obtain units of Molecules of Equivalent Fluorescein (MEFL) per cell, allowing direct comparison and data fusion with flow cytometry measurements: in our study, fluorescence per cell measurements showed only a 1.07-fold mean difference between plate reader and flow cytometry data
Human genomic regions with exceptionally high levels of population differentiation identified from 911 whole-genome sequences
Data availability: The 1000 Genomes phase I integrated callset used in this study is publicly available at [The 1000 Genomes Project. http://www.1000genomes.org/]. For a list of samples used in this study, refer to Table S1 in Additional file 1. Additional files: Additional file 1: This file contains supplementary Tables ST1 to ST4 (https://static-content.springer.com/esm/art%3A10.1186%2Fgb-2014-15-6-r88/MediaObjects/13059_2014_3364_MOESM1_ESM.xlsx). Additional file 2: This file contains supplementary Figures F1 to F16 (https://static-content.springer.com/esm/art%3A10.1186%2Fgb-2014-15-6-r88/MediaObjects/13059_2014_3364_MOESM2_ESM.pdf).
Additional file 3: Full list of participants and institutions in the 1000 Genomes Project (https://static-content.springer.com/esm/art%3A10.1186%2Fgb-2014-15-6-r88/MediaObjects/13059_2014_3364_MOESM3_ESM.pdf).Copyright © 2014 Colonna et al. Background: Population differentiation has proved to be effective for identifying loci under geographically localized positive selection, and has the potential to identify loci subject to balancing selection. We have previously investigated the pattern of genetic differentiation among human populations at 36.8 million genomic variants to identify sites in the genome showing high frequency differences. Here, we extend this dataset to include additional variants, survey sites with low levels of differentiation, and evaluate the extent to which highly differentiated sites are likely to result from selective or other processes. Results: We demonstrate that while sites with low differentiation represent sampling effects rather than balancing selection, sites showing extremely high population differentiation are enriched for positive selection events and that one half may be the result of classic selective sweeps. Among these, we rediscover known examples, where we actually identify the established functional SNP, and discover novel examples including the genes ABCA12, CALD1 and ZNF804, which we speculate may be linked to adaptations in skin, calcium metabolism and defense, respectively. Conclusions: We identify known and many novel candidate regions for geographically restricted positive selection, and suggest several directions for further research.The Wellcome Trust (098051), an Italian National Research Council (CNR) short-term mobility fellowship from the 2013 program to VC, and an EMBO Short Term Fellowship ASTF 324–2010 to VC
A global reference for human genetic variation
The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.We thank the many people who were generous with contributing their samples to the project: the African Caribbean in Barbados; Bengali in Bangladesh; British in England and Scotland; Chinese Dai in Xishuangbanna, China; Colombians in Medellin, Colombia; Esan in Nigeria; Finnish in Finland; Gambian in Western Division – Mandinka; Gujarati Indians in Houston, Texas, USA; Han Chinese in Beijing, China; Iberian populations in Spain; Indian Telugu in the UK; Japanese in Tokyo, Japan; Kinh in Ho Chi Minh City, Vietnam; Luhya in Webuye, Kenya; Mende in Sierra Leone; people with African ancestry in the southwest USA; people with Mexican ancestry in Los Angeles, California, USA; Peruvians in Lima, Peru; Puerto Ricans in Puerto Rico; Punjabi in Lahore, Pakistan; southern Han Chinese; Sri Lankan Tamil in the UK; Toscani in Italia; Utah residents (CEPH) with northern and western European ancestry; and Yoruba in Ibadan, Nigeria. Many thanks to the people who contributed to this project: P. Maul, T. Maul, and C. Foster; Z. Chong, X. Fan, W. Zhou, and T. Chen; N. Sengamalay, S. Ott, L. Sadzewicz, J. Liu, and L. Tallon; L. Merson; O. Folarin, D. Asogun, O. Ikpwonmosa, E. Philomena, G. Akpede, S. Okhobgenin, and O. Omoniwa; the staff of the Institute of Lassa Fever Research and Control (ILFRC), Irrua Specialist Teaching Hospital, Irrua, Edo State, Nigeria; A. Schlattl and T. Zichner; S. Lewis, E. Appelbaum, and L. Fulton; A. Yurovsky and I. Padioleau; N. Kaelin and F. Laplace; E. Drury and H. Arbery; A. Naranjo, M. Victoria Parra, and C. Duque; S. Däkel, B. Lenz, and S. Schrinner; S. Bumpstead; and C. Fletcher-Hoppe. Funding for this work was from the Wellcome Trust Core Award 090532/Z/09/Z and Senior Investigator Award 095552/Z/11/Z (P.D.), and grants WT098051 (R.D.), WT095908 and WT109497 (P.F.), WT086084/Z/08/Z and WT100956/Z/13/Z (G.M.), WT097307 (W.K.), WT0855322/Z/08/Z (R.L.), WT090770/Z/09/Z (D.K.), the Wellcome Trust Major Overseas program in Vietnam grant 089276/Z.09/Z (S.D.), the Medical Research Council UK grant G0801823 (J.L.M.), the UK Biotechnology and Biological Sciences Research Council grants BB/I02593X/1 (G.M.) and BB/I021213/1 (A.R.L.), the British Heart Foundation (C.A.A.), the Monument Trust (J.H.), the European Molecular Biology Laboratory (P.F.), the European Research Council grant 617306 (J.L.M.), the Chinese 863 Program 2012AA02A201, the National Basic Research program of China 973 program no. 2011CB809201, 2011CB809202 and 2011CB809203, Natural Science Foundation of China 31161130357, the Shenzhen Municipal Government of China grant ZYC201105170397A (J.W.), the Canadian Institutes of Health Research Operating grant 136855 and Canada Research Chair (S.G.), Banting Postdoctoral Fellowship from the Canadian Institutes of Health Research (M.K.D.), a Le Fonds de Recherche duQuébec-Santé (FRQS) research fellowship (A.H.), Genome Quebec (P.A.), the Ontario Ministry of Research and Innovation – Ontario Institute for Cancer Research Investigator Award (P.A., J.S.), the Quebec Ministry of Economic Development, Innovation, and Exports grant PSR-SIIRI-195 (P.A.), the German Federal Ministry of Education and Research (BMBF) grants 0315428A and 01GS08201 (R.H.), the Max Planck Society (H.L., G.M., R.S.), BMBF-EPITREAT grant 0316190A (R.H., M.L.), the German Research Foundation (Deutsche Forschungsgemeinschaft) Emmy Noether Grant KO4037/1-1 (J.O.K.), the Beatriu de Pinos Program grants 2006 BP-A 10144 and 2009 BP-B 00274 (M.V.), the Spanish National Institute for Health Research grant PRB2 IPT13/0001-ISCIII-SGEFI/FEDER (A.O.), Ewha Womans University (C.L.), the Japan Society for the Promotion of Science Fellowship number PE13075 (N.P.), the Louis Jeantet Foundation (E.T.D.), the Marie Curie Actions Career Integration grant 303772 (C.A.), the Swiss National Science Foundation 31003A_130342 and NCCR “Frontiers in Genetics” (E.T.D.), the University of Geneva (E.T.D., T.L., G.M.), the US National Institutes of Health National Center for Biotechnology Information (S.S.) and grants U54HG3067 (E.S.L.), U54HG3273 and U01HG5211 (R.A.G.), U54HG3079 (R.K.W., E.R.M.), R01HG2898 (S.E.D.), R01HG2385 (E.E.E.), RC2HG5552 and U01HG6513 (G.T.M., G.R.A.), U01HG5214 (A.C.), U01HG5715 (C.D.B.), U01HG5718 (M.G.), U01HG5728 (Y.X.F.), U41HG7635 (R.K.W., E.E.E., P.H.S.), U41HG7497 (C.L., M.A.B., K.C., L.D., E.E.E., M.G., J.O.K., G.T.M., S.A.M., R.E.M., J.L.S., K.Y.), R01HG4960 and R01HG5701 (B.L.B.), R01HG5214 (G.A.), R01HG6855 (S.M.), R01HG7068 (R.E.M.), R01HG7644 (R.D.H.), DP2OD6514 (P.S.), DP5OD9154 (J.K.), R01CA166661 (S.E.D.), R01CA172652 (K.C.), P01GM99568 (S.R.B.), R01GM59290 (L.B.J., M.A.B.), R01GM104390 (L.B.J., M.Y.Y.), T32GM7790 (C.D.B., A.R.M.), P01GM99568 (S.R.B.), R01HL87699 and R01HL104608 (K.C.B.), T32HL94284 (J.L.R.F.), and contracts HHSN268201100040C (A.M.R.) and HHSN272201000025C (P.S.), Harvard Medical School Eleanor and Miles Shore Fellowship (K.L.), Lundbeck Foundation Grant R170-2014-1039 (K.L.), NIJ Grant 2014-DN-BX-K089 (Y.E.), the Mary Beryl Patch Turnbull Scholar Program (K.C.B.), NSF Graduate Research Fellowship DGE-1147470 (G.D.P.), the Simons Foundation SFARI award SF51 (M.W.), and a Sloan Foundation Fellowship (R.D.H.). E.E.E. is an investigator of the Howard Hughes Medical Institute
The Molecular Identification of Organic Compounds in the Atmosphere: State of the Art and Challenges
Geomorphic impact and rapid subsequent recovery from the 1996 Skeióarársandur jökulhlaup, Iceland, measured with multi-year airborne lidar
The November 1996 jökulhlaup that burst from the Vatnajökull ice cap onto Skeiðarársandur was the highest-magnitude flood ever measured on the largest active glacial outwash plain (sandur). Centimeter-scale elevation transects, measured from repeat-pass airborne laser altimetry missions flown in 1996 (pre-flood), 1997, and 2001, show that sediment deposition exceeded erosion across the central Skeiðarársandur and established an average net elevation gain of +22 cm for the event. Net elevation gains of +29 and +24 cm occurred in braided channels of the Gígjukvísl and Skeiðará rivers, respectively. Nearly half of these gains, however, were removed within 4 years, and the two rivers contrast strongly in style of erosional/depositional impact and subsequent recovery. In the Gígjukvísl, the 1996 jökulhlaup caused massive sediment deposition (up to ~12 m) near the ice margin and intense “mega-forming” of braided channels and bars downstream. Post-jökulhlaup recovery (1997–2001) was characterized by rapid erosion (-0.5 m) of ice-proximal sediments and their transport to downstream reaches, and eradication of the mega-forms. In contrast, the Skeiðará displays minimal post-jökulhlaup sediment erosion in its upstream reaches and little change in braided channel relief. This contrast between river systems is attributed to the presence of a previously studied ~2-km wide ice-marginal trench, caused by glacier retreat and lowering, which contained flows of the Gígjukvísl but not the Skeiðará prior to dispersal onto the outwash plain. Rapid removal of jökulhlaup deposits from this trench suggests that in terms of long-term evolution of the sandur, such features only delay downstream migration of jökulhlaup-derived sediment. These results, therefore, suggest that the net geomorphic impact of jökulhlaups on surface relief of Skeiðarársandur, while profound in the short term, may be eradicated within years to decades
Correcting for systematic underestimation of topographic glacier aerodynamic roughness values from Hintereisferner, Austria
Spatially-distributed values of glacier aerodynamic roughness (z0) are vital for robust estimates of turbulent energy fluxes and ice and snow melt. Microtopographic data allow rapid estimates of z0 over discrete plot-scale areas, but are sensitive to data scale and resolution. Here, we use an extensive multi-scale dataset from Hintereisferner, Austria, to develop a correction factor to derive z0 values from coarse resolution (up to 30 m) topographic data that are more commonly available over larger areas. Resulting z0 estimates are within an order of magnitude of previously validated, plot-scale estimates and aerodynamic values. The method is developed and tested using plot-scale microtopography data generated by structure from motion photogrammetry combined with glacier-scale data acquired by a permanent in-situ terrestrial laser scanner. Finally, we demonstrate the application of the method to a regional-scale digital elevation model acquired by airborne laser scanning. Our workflow opens up the possibility of including spatio-temporal variations of z0 within glacier surface energy balance models without the need for extensive additional field data collection