186 research outputs found

    Causal graphical models in systems genetics: A unified framework for joint inference of causal network and genetic architecture for correlated phenotypes

    Causal inference approaches in systems genetics exploit quantitative trait loci (QTL) genotypes to infer causal relationships among phenotypes. The genetic architecture of each phenotype may be complex, and poorly estimated genetic architectures may compromise the inference of causal relationships among phenotypes. Existing methods assume QTLs are known or inferred without regard to the phenotype network structure. In this paper we develop a QTL-driven phenotype network method (QTLnet) to jointly infer a causal phenotype network and associated genetic architecture for sets of correlated phenotypes. Randomization of alleles during meiosis and the unidirectional influence of genotype on phenotype allow the inference of QTLs causal to phenotypes. Causal relationships among phenotypes can be inferred using these QTL nodes, enabling us to distinguish among phenotype networks that would otherwise be distribution equivalent. We jointly model phenotypes and QTLs using homogeneous conditional Gaussian regression models, and we derive a graphical criterion for distribution equivalence. We validate the QTLnet approach in a simulation study. Finally, we illustrate with simulated data and a real example how QTLnet can be used to infer both direct and indirect effects of QTLs and phenotypes that co-map to a genomic region. Comment: Published at http://dx.doi.org/10.1214/09-AOAS288 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org).

    Quantitative measures for the management and comparison of annotated genomes

    Background: The ever-increasing number of sequenced and annotated genomes has made management of their annotations a significant undertaking, especially for large eukaryotic genomes containing many thousands of genes. Typically, changes in gene and transcript numbers are used to summarize changes from release to release, but these measures say nothing about changes to individual annotations, nor do they provide any means to identify annotations in need of manual review. Results: In response, we have developed a suite of quantitative measures to better characterize changes to a genome's annotations between releases, and to prioritize problematic annotations for manual review. We have applied these measures to the annotations of five eukaryotic genomes over multiple releases: H. sapiens, M. musculus, D. melanogaster, A. gambiae, and C. elegans. Conclusion: Our results provide the first detailed, historical overview of how these genomes' annotations have changed over the years, and demonstrate the usefulness of these measures for genome annotation management.

    A standard variation file format for human genome sequences

    Here we describe the Genome Variation Format (GVF) and the 10Gen dataset. GVF, an extension of Generic Feature Format version 3 (GFF3), is a simple tab-delimited format for DNA variant files, which uses Sequence Ontology to describe genome variation data. The 10Gen dataset, ten human genomes in GVF format, is freely available for community analysis from the Sequence Ontology website and from an Amazon elastic block storage (EBS) snapshot for use in Amazon's EC2 cloud computing environment
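Because GVF extends GFF3, each record can be read as nine tab-separated columns whose last column holds `key=value` attributes separated by semicolons. The following is a minimal sketch of that layout, not the Sequence Ontology reference parser. The `Feature` type, the `parse_line` function and the sample record are hypothetical names and values made up for illustration; directive lines (starting with `##`) and URL-escaped characters are not handled.

```python
from typing import Dict, NamedTuple


class Feature(NamedTuple):
    """One record in GFF3-style column order (hypothetical helper type)."""
    seqid: str
    source: str
    type: str
    start: int
    end: int
    score: str
    strand: str
    phase: str
    attributes: Dict[str, str]


def parse_line(line: str) -> Feature:
    """Split one tab-delimited record into its nine GFF3-style columns."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")

    # Column 9 is a semicolon-separated list of key=value pairs.
    attributes: Dict[str, str] = {}
    for pair in fields[8].split(";"):
        if pair:  # skip the empty piece left by a trailing semicolon
            key, _, value = pair.partition("=")
            attributes[key] = value

    return Feature(
        fields[0], fields[1], fields[2],
        int(fields[3]), int(fields[4]),
        fields[5], fields[6], fields[7],
        attributes,
    )


# Illustrative record only; the values are invented for this sketch.
SAMPLE = "chr16\tsamtools\tSNV\t49291141\t49291141\t.\t+\t.\tID=ID_1;Variant_seq=A,G;Reference_seq=G;"

if __name__ == "__main__":
    print(parse_line(SAMPLE))
```

Keeping the attributes as a plain dictionary leaves multi-valued entries such as `A,G` as a single string, so callers can decide how to split them.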

    Transcriptomic profiling reveals extraordinary diversity of venom peptides in unexplored predatory gastropods of the genus Clavus

    Predatory gastropods of the superfamily Conoidea number over 12,000 living species. The evolutionary success of this lineage can be explained by the ability of conoideans to produce complex venoms for hunting, defense and competitive interactions. Whereas venoms of cone snails (family Conidae) have become increasingly well studied, the venoms of most other conoidean lineages remain largely uncharacterized. In this study we present the venom gland transcriptomes of two species of the genus Clavus that belong to the family Drilliidae. Venom gland transcriptomes of two specimens of Clavus canalicularis and two specimens of Cv. davidgilmouri were analyzed, leading to the identification of a total of 1,176 putative venom peptide toxins (drillipeptides). Based on the combined evidence of secretion signal sequence identity, entire precursor similarity search (BLAST), and orthology inference, putative Clavus toxins were assigned to 158 different gene families. The majority of identified transcripts comprise signal, pro-, mature peptide, and post- regions, with a typically short (< 50 amino acids) and cysteine-rich mature peptide region. Thus drillipeptides are structurally similar to conotoxins. However, convincing homology with known groups of Conus toxins was only detected for very few toxin families. Among these are Clavus counterparts of Conus venom insulins (drillinsulins), porins (drilliporins), and highly diversified lectins (drillilectins). The short size of most drillipeptides and their structural similarity to conotoxins were unexpected, given that most related conoidean gastropod families (Terebridae and Turridae) possess longer mature peptide regions. Our findings indicate that, similar to conotoxins, drillipeptides may represent a valuable resource for future pharmacological exploration.

    The Douglas-Fir Genome Sequence Reveals Specialization of the Photosynthetic Apparatus in Pinaceae.

    A reference genome sequence for Pseudotsuga menziesii var. menziesii (Mirb.) Franco (Coastal Douglas-fir) is reported, thus providing a reference sequence for a third genus of the family Pinaceae. The contiguity and quality of the genome assembly far exceed those of other conifer reference genome sequences (contig N50 = 44,136 bp and scaffold N50 = 340,704 bp). Incremental improvements in sequencing and assembly technologies are in part responsible for the higher quality reference genome, but it may also be due to a slightly lower exact repeat content in Douglas-fir vs. pine and spruce. Comparative genome annotation with angiosperm species reveals gene-family expansion and contraction in Douglas-fir and other conifers which may account for some of the major morphological and physiological differences between the two major plant groups. Notable differences in the size of the NDH-complex gene family and genes underlying the functional basis of shade tolerance/intolerance were observed. This reference genome sequence not only provides an important resource for Douglas-fir breeders and geneticists but also sheds additional light on the evolutionary processes that have led to the divergence of modern angiosperms from the more ancient gymnosperms.
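The N50 statistic quoted above is conventionally defined as the length L such that sequences of length L or longer together cover at least half of the total assembly length. A minimal sketch of that calculation follows, assuming the contig or scaffold lengths are already available as a list of integers. The function name `n50` and the toy lengths are illustrative and do not come from the Douglas-fir assembly.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of sequence lengths.

    Walk the lengths from longest to shortest and return the first length
    at which the running total reaches half of the overall total.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")

    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Compare 2 * running with total to stay in integer arithmetic.
        if 2 * running >= total:
            return length

    # Unreachable for positive lengths: the full sum always reaches the half.
    raise ValueError("lengths must be positive")


if __name__ == "__main__":
    # Toy lengths; total is 54, and 10 + 9 + 8 = 27 reaches half of it.
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

The same function applies to contigs and to scaffolds; only the input list differs.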

    Lipidomic QTL in Diversity Outbred mice identifies a novel function for α/β hydrolase domain 2 (Abhd2) as an enzyme that metabolizes phosphatidylcholine and cardiolipin.

    We and others have previously shown that genetic association can be used to make causal connections between gene loci and small molecules measured by mass spectrometry in the bloodstream and in tissues. We identified a locus on mouse chromosome 7 where several phospholipids in liver showed strong genetic association to distinct gene loci. In this study, we integrated gene expression data with genetic association data to identify a single gene at the chromosome 7 locus as the driver of the phospholipid phenotypes. The gene encodes α/β-hydrolase domain 2 (Abhd2), one of 23 members of the ABHD gene family. We validated this observation by measuring lipids in a mouse with a whole-body deletion of Abhd2. The Abhd2KO mice had a significant increase in liver levels of phosphatidylcholine and phosphatidylethanolamine. Unexpectedly, we also found a decrease in two key mitochondrial lipids, cardiolipin and phosphatidylglycerol, in male Abhd2KO mice. These data suggest that Abhd2 plays a role in the synthesis, turnover, or remodeling of liver phospholipids

    Identification of PKD1L1 Gene Variants in Children with the Biliary Atresia Splenic Malformation Syndrome

    Biliary atresia (BA) is the most common cause of end‐stage liver disease in children and the primary indication for pediatric liver transplantation, yet underlying etiologies remain unknown. Approximately 10% of infants affected by BA exhibit various laterality defects (heterotaxy) including splenic abnormalities and complex cardiac malformations, a distinctive subgroup commonly referred to as the biliary atresia splenic malformation (BASM) syndrome. We hypothesized that genetic factors linking laterality features with the etiopathogenesis of BA in BASM patients could be identified through whole exome sequencing (WES) of an affected cohort. DNA specimens from 67 BASM subjects, including 58 patient‐parent trios, from the NIDDK‐supported Childhood Liver Disease Research Network (ChiLDReN) underwent WES. Candidate gene variants derived from a pre‐specified set of 2,016 genes associated with ciliary dysgenesis and/or dysfunction or cholestasis were prioritized according to pathogenicity, population frequency, and mode of inheritance. Five BASM subjects harbored rare and potentially deleterious bi‐allelic variants in polycystin 1‐like 1, PKD1L1, a gene associated with ciliary calcium signaling and embryonic laterality determination in fish, mice and humans. Heterozygous PKD1L1 variants were found in 3 additional subjects. Immunohistochemical analysis of liver from the one BASM subject available revealed decreased PKD1L1 expression in bile duct epithelium when compared to normal livers and livers affected by other non‐cholestatic diseases. Conclusion: WES identified bi‐allelic and heterozygous PKD1L1 variants of interest in 8 BASM subjects from the ChiLDReN dataset. The dual roles for PKD1L1 in laterality determination and ciliary function suggest that PKD1L1 is a new, biologically plausible, cholangiocyte‐expressed candidate gene for the BASM syndrome.

    Genome-Wide Analysis of Human Disease Alleles Reveals That Their Locations Are Correlated in Paralogous Proteins

    The millions of mutations and polymorphisms that occur in human populations are potential predictors of disease, of our reactions to drugs, of predisposition to microbial infections, and of age-related conditions such as impaired brain and cardiovascular functions. However, predicting the phenotypic consequences and eventual clinical significance of a sequence variant is not an easy task. Computational approaches have found perturbation of conserved amino acids to be a useful criterion for identifying variants likely to have phenotypic consequences. To our knowledge, however, no study to date has explored the potential of variants that occur at homologous positions within paralogous human proteins as a means of identifying polymorphisms with likely phenotypic consequences. In order to investigate the potential of this approach, we have assembled a unique collection of known disease-causing variants from OMIM and the Human Genome Mutation Database (HGMD) and used them to identify and characterize pairs of sequence variants that occur at homologous positions within paralogous human proteins. Our analyses demonstrate that the locations of variants are correlated in paralogous proteins. Moreover, if one member of a variant-pair is disease-causing, its partner is likely to be disease-causing as well. Thus, information about variant-pairs can be used to identify potentially disease-causing variants, extend existing procedures for polymorphism prioritization, and provide a suite of candidates for further diagnostic and therapeutic purposes