2,184 research outputs found

    Sequencing strategies and characterization of 721 vervet monkey genomes for future genetic analyses of medically relevant traits

    Background: We report here the first genome-wide high-resolution polymorphism resource for non-human primate (NHP) association and linkage studies, constructed for the Caribbean-origin vervet monkey, or African green monkey (Chlorocebus aethiops sabaeus), one of the most widely used NHPs in biomedical research. We generated this resource by whole genome sequencing (WGS) of monkeys from the Vervet Research Colony (VRC), an NIH-supported research resource for which extensive phenotypic data are available. Results: We identified genome-wide single nucleotide polymorphisms (SNPs) by WGS of 721 members of an extended pedigree from the VRC. From high-depth WGS data we identified more than 4 million polymorphic unequivocal segregating sites; by pruning these SNPs based on heterozygosity, quality control filters, and the degree of linkage disequilibrium (LD) between SNPs, we constructed genome-wide panels suitable for genetic association (about 500,000 SNPs) and linkage analysis (about 150,000 SNPs). To further enhance the utility of these resources for linkage analysis, we used a further pruned subset of the linkage panel to generate multipoint identity by descent matrices. Conclusions: The genetic and phenotypic resources now available for the VRC and other Caribbean-origin vervets enable their use for genetic investigation of traits relevant to human diseases.

    Genetic linkage analysis in the age of whole-genome sequencing

    For many years, linkage analysis was the primary tool used for the genetic mapping of Mendelian and complex traits with familial aggregation. Linkage analysis was largely supplanted by the wide adoption of genome-wide association studies (GWASs). However, with the recent increased use of whole-genome sequencing (WGS), linkage analysis is again emerging as an important and powerful analysis method for the identification of genes involved in disease aetiology, often in conjunction with WGS filtering approaches. Here, we review the principles of linkage analysis and provide practical guidelines for carrying out linkage studies using WGS data

    Linkage of an ABCC transporter to a single QTL that controls Ostrinia nubilalis larval resistance to the Bacillus thuringiensis Cry1Fa toxin

    Field-evolved resistance of insect populations to Bacillus thuringiensis (Bt) crystalline (Cry) toxins expressed by crop plants has resulted in reduced control of insect feeding damage to field crops, and threatens the sustainability of Bt transgenic technologies. A single quantitative trait locus (QTL) that determines resistance in Ostrinia nubilalis larvae capable of surviving on reproductive stage transgenic corn that expresses the Bt Cry1Fa toxin was previously mapped to linkage group 12 (LG12) in a backcross pedigree. Fine mapping with high-throughput single nucleotide polymorphism (SNP) anchor markers, a candidate ABC transporter (abcc2) marker, and de novo mutations predicted from genotyping-by-sequencing (GBS) data redefined a 268.8 cM LG12. The single QTL on LG12 spanned an approximately 46.1 cM region, in which marker 02302.286 and abcc2 were ≤2.81 cM, and the GBS marker 697 was an estimated 1.89 cM, distant from the causal genetic factor. These positional mapping data showed that an O. nubilalis genome region encoding an abcc2 transporter is in proximity to a single QTL involved in the inheritance of Cry1F resistance, and will assist in the future identification of the mutation(s) involved with this phenotype.

    Evaluation of gene-based family-based methods to detect novel genes associated with familial late onset Alzheimer disease

    Gene-based tests to study the combined effect of rare variants towards a particular phenotype have been widely developed for case-control studies, but their evolution and adaptation for family-based studies, especially for complex incomplete families, has been slower. In this study, we have performed a practical examination of all the latest gene-based methods available for family-based study designs using both simulated and real datasets. We have examined the performance of several collapsing, variance-component and transmission disequilibrium tests across eight different software packages and twenty-two models utilizing a cohort of 285 families (N=1,235) with late-onset Alzheimer disease (LOAD). After a thorough examination of each of these tests, we propose a methodological approach to identify, with high confidence, genes associated with the studied phenotype, and we provide recommendations to select the best software and model for family-based gene-based analyses. Additionally, in our dataset, we identified PTK2B, a GWAS candidate gene for sporadic AD, along with six novel genes (CHRD, CLCN2, HDLBP, CPAMD8, NLRP9, MAS1L) as candidate genes for familial LOAD.

    Rediscovering the value of families for psychiatric genetics research

    As it is likely that both common and rare genetic variation are important for complex disease risk, studies that examine the full range of the allelic frequency distribution should be utilized to dissect the genetic influences on mental illness. The rate limiting factor for inferring an association between a variant and a phenotype is inevitably the total number of copies of the minor allele captured in the studied sample. For rare variation, with minor allele frequencies of 0.5% or less, very large samples of unrelated individuals are necessary to unambiguously associate a locus with an illness. Unfortunately, such large samples are often cost prohibitive. However, by using alternative analytic strategies and studying related individuals, particularly those from large multiplex families, it is possible to reduce the required sample size while maintaining statistical power. We contend that using whole genome sequence (WGS) in extended pedigrees provides a cost-effective strategy for psychiatric gene mapping that complements common variant approaches and WGS in unrelated individuals. This was our impetus for forming the “Pedigree-Based Whole Genome Sequencing of Affective and Psychotic Disorders” consortium. In this review, we provide a rationale for the use of WGS with pedigrees in modern psychiatric genetics research. We begin with a focused review of the current literature, followed by a short history of family-based research in psychiatry. Next, we describe several advantages of pedigrees for WGS research, including power estimates, methods for studying the environment, and endophenotypes. We conclude with a brief description of our consortium and its goals
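The point above about minor-allele copies can be made concrete with a small sketch. It assumes unrelated diploid individuals sampled at random, so the expected number of minor-allele copies is 2 × N × MAF. The function name and the sample sizes are illustrative choices and do not come from the review itself.

```python
def expected_minor_allele_copies(n_individuals: int, maf: float) -> float:
    """Expected minor-allele copies in a sample of unrelated diploid individuals.

    Assumption (not from the review): random sampling of unrelated people,
    so each of the 2 * N chromosomes carries the minor allele with
    probability equal to the minor allele frequency (MAF).
    """
    if n_individuals < 0:
        raise ValueError("n_individuals must be non-negative")
    if not 0.0 <= maf <= 0.5:
        raise ValueError("a minor allele frequency must lie in [0, 0.5]")
    return 2 * n_individuals * maf


if __name__ == "__main__":
    # Illustrative sample sizes at the 0.5% MAF threshold mentioned in the text.
    for n in (1_000, 10_000, 100_000):
        copies = expected_minor_allele_copies(n, 0.005)
        print(f"N = {n:>7}: about {copies:g} expected minor-allele copies")
```

Under these assumptions the count grows only linearly with sample size, which is why rare variants call for very large samples of unrelated individuals. In a multiplex pedigree, a rare allele carried by a founder can be transmitted to many relatives, so the same allele may be observed in more copies than this unrelated-sample expectation suggests.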

    Design Considerations for Massively Parallel Sequencing Studies of Complex Human Disease

    Massively Parallel Sequencing (MPS) allows sequencing of entire exomes and genomes to now be done at reasonable cost, and its utility for identifying genes responsible for rare Mendelian disorders has been demonstrated. However, for a complex disease, study designs need to accommodate substantial degrees of locus, allelic, and phenotypic heterogeneity, as well as complex relationships between genotype and phenotype. Such considerations include careful selection of samples for sequencing and a well-developed strategy for identifying the few “true” disease susceptibility genes from among the many irrelevant genes that will be found to harbor rare variants. To examine these issues we have performed simulation-based analyses in order to compare several strategies for MPS sequencing in complex disease. Factors examined include genetic architecture, sample size, number and relationship of individuals selected for sequencing, and a variety of filters based on variant type, multiple observations of genes and concordance of genetic variants within pedigrees. A two-stage design was assumed where genes from the MPS analysis of high-risk families are evaluated in a secondary screening phase of a larger set of probands with more modest family histories. Designs were evaluated using a cost function that assumes the cost of sequencing the whole exome is 400 times that of sequencing a single candidate gene. Results indicate that while requiring variants to be identified in multiple pedigrees and/or in multiple individuals in the same pedigree are effective strategies for reducing false positives, there is a danger of over-filtering so that most true susceptibility genes are missed. In most cases, sequencing more than two individuals per pedigree results in reduced power without any benefit in terms of reduced overall cost. Further, our results suggest that although no single strategy is optimal, simulations can provide important guidelines for study design

    A high-density transcript linkage map with 1,845 expressed genes positioned by microarray-based Single Feature Polymorphisms (SFP) in Eucalyptus

    Background: Technological advances are progressively increasing the application of genomics to a wider array of economically and ecologically important species. High-density maps enriched for transcribed genes facilitate the discovery of connections between genes and phenotypes. We report the construction of a high-density linkage map of expressed genes for the heterozygous genome of Eucalyptus using Single Feature Polymorphism (SFP) markers. Results: SFP discovery and mapping were achieved using pseudo-testcross screening and selective mapping to simultaneously optimize linkage mapping and microarray costs. SFP genotyping was carried out by hybridizing complementary RNA prepared from the xylem of 4.5-year-old trees to an SFP array containing 103,000 25-mer oligonucleotide probes representing 20,726 unigenes derived from a modest-sized collection of expressed sequence tags. An SFP-mapping microarray with 43,777 selected candidate SFP probes representing 15,698 genes was subsequently designed and used to genotype SFPs in a larger subset of the segregating population drawn by selective mapping. A total of 1,845 genes were mapped, with 884 of them ordered with high likelihood support on a framework map anchored to 180 microsatellites with an average density of 1.2 cM. Using more probes per unigene increased two-fold the likelihood of detecting segregating SFPs, eventually resulting in more genes mapped. In silico validation showed that 87% of the SFPs map to the expected location on the 4.5X draft sequence of the Eucalyptus grandis genome. Conclusions: The Eucalyptus 1,845 gene map is the most highly enriched map for transcriptional information for any forest tree species to date. It represents a major improvement on the number of genes previously positioned on Eucalyptus maps and provides an initial glimpse at the gene space for this global tree genome. A general protocol is proposed to build high-density transcript linkage maps in less characterized plant species by SFP genotyping with a concurrent objective of reducing microarray costs. High-density gene-rich maps represent a powerful resource to assist gene discovery endeavors when used in combination with QTL and association mapping and should be especially valuable to assist the assembly of reference genome sequences soon to come for several plant and animal species.

    Near-saturated and complete genetic linkage map of black spruce (Picea mariana)

    Background: Genetic maps provide an important genomic resource for understanding genome organization and evolution, comparative genomics, mapping genes and quantitative trait loci, and associating genomic segments with phenotypic traits. Spruce (Picea) genomics work is quite challenging, mainly because of the extremely large size and highly repetitive nature of its genome, which is unsequenced and poorly understood, and the general lack of advanced-generation pedigrees. Our goal was to construct a high-density genetic linkage map of black spruce (Picea mariana, 2n = 24), which is a predominant, transcontinental species of the North American boreal and temperate forests, with high ecological and economic importance. Results: We have developed a near-saturated and complete genetic linkage map of black spruce using a three-generation outbred pedigree and amplified fragment length polymorphism (AFLP), selectively amplified microsatellite polymorphic loci (SAMPL), expressed sequence tag polymorphism (ESTP), and microsatellite (mostly cDNA based) markers. Maternal, paternal, and consensus genetic linkage maps were constructed. The maternal, paternal, and consensus maps in our study consistently coalesced into 12 linkage groups, corresponding to the haploid chromosome number (1n = 1x = 12) in the genus Picea. The maternal map had 816 and the paternal map 743 markers distributed over 12 linkage groups each. The consensus map consisted of 1,111 markers distributed over 12 linkage groups, and covered almost the entire (> 97%) black spruce genome. The mapped markers included 809 AFLPs, 255 SAMPLs, 42 microsatellites, and 5 ESTPs. Total estimated length of the genetic map was 1,770 cM, with an average of one marker every 1.6 cM. The maternal, paternal and consensus genetic maps aligned almost perfectly. Conclusion: We have constructed the first high-density to near-saturated genetic linkage map of black spruce, with greater than 97% genome coverage. Also, this is the first genetic map based on a three-generation outbred pedigree in the genus Picea. The genome length in P. mariana is likely to be about 1,800 cM. The genetic maps developed in our study can serve as a reference map for various genomics studies and applications in Picea and Pinaceae.