10 research outputs found

    Comprehensive Survey of SNPs in the Affymetrix Exon Array Using the 1000 Genomes Dataset

    Microarray gene expression data has been used in genome-wide association studies to allow researchers to study gene regulation as well as other complex phenotypes including disease risks and drug response. To reach scientifically sound conclusions from these studies, however, it is necessary to get reliable summarization of gene expression intensities. Among various factors that could affect expression profiling using a microarray platform, single nucleotide polymorphisms (SNPs) in target mRNA may lead to reduced signal intensity measurements and result in spurious results. The recently released 1000 Genomes Project dataset provides an opportunity to evaluate the distribution of both known and novel SNPs in the International HapMap Project lymphoblastoid cell lines (LCLs). We mapped the 1000 Genomes Project genotypic data to the Affymetrix GeneChip Human Exon 1.0ST array (exon array), which had been used in our previous studies and for which gene expression data had been made publicly available. We also evaluated the potential impact of these SNPs on the differentially spliced probesets we had identified previously. Though the 1000 Genomes Project data allowed a comprehensive survey of the SNPs in this particular array, the same approach can certainly be applied to other microarray platforms. Furthermore, we present a detailed catalogue of SNP-containing probesets (exon-level) and transcript clusters (gene-level), which can be considered in evaluating findings using the exon array as well as benefit the design of follow-up experiments and data re-analysis.

    Altered Gene Synchrony Suggests a Combined Hormone-Mediated Dysregulated State in Major Depression

    Coordinated gene transcript levels across tissues (denoted “gene synchrony”) reflect converging influences of genetic, biochemical and environmental factors; hence they are informative of the biological state of an individual. So could brain gene synchrony also integrate the multiple factors engaged in neuropsychiatric disorders and reveal underlying pathologies? Using bootstrapped Pearson correlation for transcript levels for the same genes across distinct brain areas, we report robust gene transcript synchrony between the amygdala and cingulate cortex in the human postmortem brain of normal control subjects (n = 14; Control/Permutated data, p<0.000001). Coordinated expression was confirmed across distinct prefrontal cortex areas in a separate cohort (n = 19 subjects) and affected different gene sets, potentially reflecting regional network- and function-dependent transcriptional programs. Genewise regional transcript coordination was independent of age-related changes and array technical parameters. Robust shifts in amygdala-cingulate gene synchrony were observed in subjects with major depressive disorder (MDD, denoted here “depression”) (n = 14; MDD/Permutated data, p<0.000001), significantly affecting between 100 and 250 individual genes (10–30% false discovery rate). Biological networks and signal transduction pathways corresponding to the identified gene set suggested putative dysregulated functions for several hormone-type factors previously implicated in depression (insulin, interleukin-1, thyroid hormone, estradiol and glucocorticoids; p<0.01 for association with depression-related networks). In summary, we showed that coordinated gene expression across brain areas may represent a novel molecular probe for brain structure/function that is sensitive to disease condition, suggesting the presence of a distinct and integrated hormone-mediated corticolimbic homeostatic, although maladaptive and pathological, state in major depression.

    A global reference for human genetic variation

    The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies. We thank the many people who were generous with contributing their samples to the project: the African Caribbean in Barbados; Bengali in Bangladesh; British in England and Scotland; Chinese Dai in Xishuangbanna, China; Colombians in Medellin, Colombia; Esan in Nigeria; Finnish in Finland; Gambian in Western Division – Mandinka; Gujarati Indians in Houston, Texas, USA; Han Chinese in Beijing, China; Iberian populations in Spain; Indian Telugu in the UK; Japanese in Tokyo, Japan; Kinh in Ho Chi Minh City, Vietnam; Luhya in Webuye, Kenya; Mende in Sierra Leone; people with African ancestry in the southwest USA; people with Mexican ancestry in Los Angeles, California, USA; Peruvians in Lima, Peru; Puerto Ricans in Puerto Rico; Punjabi in Lahore, Pakistan; southern Han Chinese; Sri Lankan Tamil in the UK; Toscani in Italia; Utah residents (CEPH) with northern and western European ancestry; and Yoruba in Ibadan, Nigeria. Many thanks to the people who contributed to this project: P. Maul, T. Maul, and C. Foster; Z. Chong, X. Fan, W. Zhou, and T. Chen; N. 
Sengamalay, S. Ott, L. Sadzewicz, J. Liu, and L. Tallon; L. Merson; O. Folarin, D. Asogun, O. Ikpwonmosa, E. Philomena, G. Akpede, S. Okhobgenin, and O. Omoniwa; the staff of the Institute of Lassa Fever Research and Control (ILFRC), Irrua Specialist Teaching Hospital, Irrua, Edo State, Nigeria; A. Schlattl and T. Zichner; S. Lewis, E. Appelbaum, and L. Fulton; A. Yurovsky and I. Padioleau; N. Kaelin and F. Laplace; E. Drury and H. Arbery; A. Naranjo, M. Victoria Parra, and C. Duque; S. Däkel, B. Lenz, and S. Schrinner; S. Bumpstead; and C. Fletcher-Hoppe. Funding for this work was from the Wellcome Trust Core Award 090532/Z/09/Z and Senior Investigator Award 095552/Z/11/Z (P.D.), and grants WT098051 (R.D.), WT095908 and WT109497 (P.F.), WT086084/Z/08/Z and WT100956/Z/13/Z (G.M.), WT097307 (W.K.), WT0855322/Z/08/Z (R.L.), WT090770/Z/09/Z (D.K.), the Wellcome Trust Major Overseas program in Vietnam grant 089276/Z.09/Z (S.D.), the Medical Research Council UK grant G0801823 (J.L.M.), the UK Biotechnology and Biological Sciences Research Council grants BB/I02593X/1 (G.M.) and BB/I021213/1 (A.R.L.), the British Heart Foundation (C.A.A.), the Monument Trust (J.H.), the European Molecular Biology Laboratory (P.F.), the European Research Council grant 617306 (J.L.M.), the Chinese 863 Program 2012AA02A201, the National Basic Research program of China 973 program no. 
2011CB809201, 2011CB809202 and 2011CB809203, Natural Science Foundation of China 31161130357, the Shenzhen Municipal Government of China grant ZYC201105170397A (J.W.), the Canadian Institutes of Health Research Operating grant 136855 and Canada Research Chair (S.G.), Banting Postdoctoral Fellowship from the Canadian Institutes of Health Research (M.K.D.), a Le Fonds de Recherche duQuébec-Santé (FRQS) research fellowship (A.H.), Genome Quebec (P.A.), the Ontario Ministry of Research and Innovation – Ontario Institute for Cancer Research Investigator Award (P.A., J.S.), the Quebec Ministry of Economic Development, Innovation, and Exports grant PSR-SIIRI-195 (P.A.), the German Federal Ministry of Education and Research (BMBF) grants 0315428A and 01GS08201 (R.H.), the Max Planck Society (H.L., G.M., R.S.), BMBF-EPITREAT grant 0316190A (R.H., M.L.), the German Research Foundation (Deutsche Forschungsgemeinschaft) Emmy Noether Grant KO4037/1-1 (J.O.K.), the Beatriu de Pinos Program grants 2006 BP-A 10144 and 2009 BP-B 00274 (M.V.), the Spanish National Institute for Health Research grant PRB2 IPT13/0001-ISCIII-SGEFI/FEDER (A.O.), Ewha Womans University (C.L.), the Japan Society for the Promotion of Science Fellowship number PE13075 (N.P.), the Louis Jeantet Foundation (E.T.D.), the Marie Curie Actions Career Integration grant 303772 (C.A.), the Swiss National Science Foundation 31003A_130342 and NCCR “Frontiers in Genetics” (E.T.D.), the University of Geneva (E.T.D., T.L., G.M.), the US National Institutes of Health National Center for Biotechnology Information (S.S.) 
and grants U54HG3067 (E.S.L.), U54HG3273 and U01HG5211 (R.A.G.), U54HG3079 (R.K.W., E.R.M.), R01HG2898 (S.E.D.), R01HG2385 (E.E.E.), RC2HG5552 and U01HG6513 (G.T.M., G.R.A.), U01HG5214 (A.C.), U01HG5715 (C.D.B.), U01HG5718 (M.G.), U01HG5728 (Y.X.F.), U41HG7635 (R.K.W., E.E.E., P.H.S.), U41HG7497 (C.L., M.A.B., K.C., L.D., E.E.E., M.G., J.O.K., G.T.M., S.A.M., R.E.M., J.L.S., K.Y.), R01HG4960 and R01HG5701 (B.L.B.), R01HG5214 (G.A.), R01HG6855 (S.M.), R01HG7068 (R.E.M.), R01HG7644 (R.D.H.), DP2OD6514 (P.S.), DP5OD9154 (J.K.), R01CA166661 (S.E.D.), R01CA172652 (K.C.), P01GM99568 (S.R.B.), R01GM59290 (L.B.J., M.A.B.), R01GM104390 (L.B.J., M.Y.Y.), T32GM7790 (C.D.B., A.R.M.), P01GM99568 (S.R.B.), R01HL87699 and R01HL104608 (K.C.B.), T32HL94284 (J.L.R.F.), and contracts HHSN268201100040C (A.M.R.) and HHSN272201000025C (P.S.), Harvard Medical School Eleanor and Miles Shore Fellowship (K.L.), Lundbeck Foundation Grant R170-2014-1039 (K.L.), NIJ Grant 2014-DN-BX-K089 (Y.E.), the Mary Beryl Patch Turnbull Scholar Program (K.C.B.), NSF Graduate Research Fellowship DGE-1147470 (G.D.P.), the Simons Foundation SFARI award SF51 (M.W.), and a Sloan Foundation Fellowship (R.D.H.). E.E.E. is an investigator of the Howard Hughes Medical Institute

    Strain-level diversity drives alternative community types in millimetre-scale granular biofilms

    Microbial communities are often highly diverse in their composition, both at a coarse-grained taxonomic level, such as genus, and at a highly resolved level, such as strains, within species. This variability can be driven either by extrinsic factors such as temperature or by intrinsic ones, for example demographic fluctuations or ecological interactions. The relative contributions of these factors and the taxonomic level at which they influence community composition remain poorly understood, in part because of the difficulty in identifying true community replicates assembled under the same environmental parameters. Here, we address this problem using an activated granular sludge reactor in which millimetre-scale biofilm granules represent true community replicates. Differences in composition are then expected to be driven primarily by biotic factors. Using 142 shotgun metagenomes of single biofilm granules we found that, at the commonly used genus-level resolution, community replicates varied much more in their composition than would be expected from neutral assembly processes. This variation did not translate into any clear partitioning into discrete community types, that is, distinct compositional states, such as enterotypes in the human gut. However, a strong partition into community types did emerge at the strain level for the dominant organism: genotypes of Candidatus Accumulibacter that coexisted in the metacommunity (the reactor) excluded each other within community replicates (granules). Individual granule communities maintained a significant lineage structure, whereby the strain phylogeny of Accumulibacter correlated with the overall composition of the community, indicating a high potential for co-diversification among species and communities. Our results suggest that due to the high functional redundancy and competition between close relatives, alternative community types are most probably observed at the level of recently differentiated genotypes but not at higher orders of genetic resolution.

    Long-read assembly of a Great Dane genome highlights the contribution of GC-rich sequence and mobile elements to canine genomes

    Technological advances have allowed improvements in genome reference sequence assemblies. Here, we combined long- and short-read sequence resources to assemble the genome of a female Great Dane dog. This assembly has improved continuity compared to the existing Boxer-derived (CanFam3.1) reference genome. Annotation of the Great Dane assembly identified 22,182 protein-coding gene models and 7,049 long noncoding RNAs, including 49 protein-coding genes not present in the CanFam3.1 reference. The Great Dane assembly spans the majority of sequence gaps in the CanFam3.1 reference and illustrates that 2,151 gaps overlap the transcription start site of a predicted protein-coding gene. Moreover, a subset of the resolved gaps, which have an 80.95% median GC content, localize to transcription start sites and recombination hotspots more often than expected by chance, suggesting the stable canine recombinational landscape has shaped genome architecture. Alignment of the Great Dane and CanFam3.1 assemblies identified 16,834 deletions and 15,621 insertions, as well as 2,665 deletions and 3,493 insertions located on secondary contigs. These structural variants are dominated by retrotransposon insertion/deletion polymorphisms and include 16,221 dimorphic canine short interspersed elements (SINECs) and 1,121 dimorphic long interspersed element-1 sequences (LINE-1_Cfs). Analysis of sequences flanking the 3' end of LINE-1_Cfs (i.e., LINE-1_Cf 3'-transductions) suggests multiple retrotransposition-competent LINE-1_Cfs segregate among dog populations. Consistent with this conclusion, we demonstrate that a canine LINE-1_Cf element with intact open reading frames can retrotranspose its own RNA and that of a SINEC_Cf consensus sequence in cultured human cells, implicating ongoing retrotransposon activity as a driver of canine genetic variation.

    Sequencing Y Chromosomes Resolves Discrepancy in Time to Common Ancestor of Males Versus Females

    The Y chromosome and the mitochondrial genome (mtDNA) have been used to estimate when the common patrilineal and matrilineal ancestors of humans lived. We sequenced the genomes of 69 males from nine populations, including two in which we find basal branches of the Y chromosome tree. We identify ancient phylogenetic structure within African haplogroups and resolve a long-standing ambiguity deep within the tree. Applying equivalent methodologies to the Y and mtDNA, we estimate the time to the most recent common ancestor (T(MRCA)) of the Y chromosome to be 120–156 thousand years and the mtDNA T(MRCA) to be 99–148 ky. Our findings suggest that, contrary to prior claims, male lineages do not coalesce significantly more recently than female lineages.

    Increased activity of Diaphanous homolog 3 (DIAPH3)/diaphanous causes hearing defects in humans with auditory neuropathy and in Drosophila

    Auditory neuropathy is a rare form of deafness characterized by an absent or abnormal auditory brainstem response with preservation of outer hair cell function. We have identified Diaphanous homolog 3 (DIAPH3) as the gene responsible for autosomal dominant nonsyndromic auditory neuropathy (AUNA1), which we previously mapped to chromosome 13q21-q24. Genotyping of additional family members narrowed the interval to an 11-Mb, 3.28-cM gene-poor region containing only four genes, including DIAPH3. DNA sequencing of DIAPH3 revealed a c.-172G > A, g. 48G > A mutation in a highly conserved region of the 5′ UTR. The c.-172G > A mutation occurs within a GC box sequence element and was not found in 379 controls. Using genome-wide expression arrays and quantitative RT-PCR, we demonstrate a 2- to 3-fold overexpression of DIAPH3 mRNA in lymphoblastoid cell lines from affected individuals. Likewise, a significant increase (≈1.5-fold) in DIAPH3 protein was found by quantitative immunoblotting of lysates from lymphoblastoid cell lines derived from affected individuals in comparison with controls. In addition, the c.-172G > A mutation is sufficient to drive overexpression of a luciferase reporter. Finally, the expression of a constitutively active form of diaphanous protein in the auditory organ of Drosophila melanogaster recapitulates the phenotype of impaired response to sound. To date, only two genes, the otoferlin gene OTOF and the pejvakin gene PJVK, are known to underlie nonsyndromic auditory neuropathy. Genetic testing for DIAPH3 may be useful for individuals with recessive as well as dominant inheritance of nonsyndromic auditory neuropathy.