21 research outputs found

    Population- and individual-specific regulatory variation in Sardinia

    Genetic studies of complex traits have mainly identified associations with noncoding variants. To further determine the contribution of regulatory variation, we combined whole-genome and transcriptome data for 624 individuals from Sardinia to identify common and rare variants that influence gene expression and splicing. We identified 21,183 expression quantitative trait loci (eQTLs) and 6,768 splicing quantitative trait loci (sQTLs), including 619 new QTLs. We identified high-frequency QTLs and found evidence of selection near genes involved in malarial resistance and increased multiple sclerosis risk, reflecting the epidemiological history of Sardinia. Using family relationships, we identified 809 segregating expression outliers (median z score of 2.97), averaging 13.3 genes per individual. Outlier genes were enriched for proximal rare variants, providing a new approach to study large-effect regulatory variants and their relevance to traits. Our results provide insight into the effects of regulatory variants and their relationship to population history and individual genetic risk.
M.P. is supported by the European Union’s Horizon 2020 Research and Innovation Programme under grant agreement 633964 (ImmunoAgeing). Z.Z. is supported by the National Science Foundation (NSF) GRFP (DGE-114747) and by the Stanford Center for Computational, Evolutionary, and Human Genomics (CEHG). Z.Z., J.R.D., and G.T.H. also acknowledge support from the Stanford Genome Training Program (SGTP; NIH/NHGRI T32HG000044). J.R.D. is supported by the Stanford Graduate Fellowship. K.R.K. is supported by Department of Defense, Air Force Office of Scientific Research, National Defense Science and Engineering Graduate (NDSEG) Fellowship 32 CFR 168a. S.J.S. is supported by the NIHR Cambridge Biomedical Research Centre.
The SardiNIA project is supported in part by the intramural program of the National Institute on Aging through contract HHSN271201100005C to the Consiglio Nazionale delle Ricerche of Italy. The RNA sequencing was supported by the PB05 InterOmics MIUR Flagship grant; by the FaReBio2011 “Farmaci e Reti Biotecnologiche di Qualità” grant; and by Sardinian Autonomous Region (L.R. no. 7/2009) grant cRP3-154 to F. Cucca, who is also supported by the Italian Foundation for Multiple Sclerosis (FISM 2015/R/09) and by the Fondazione di Sardegna (ex Fondazione Banco di Sardegna, Prot. U1301.2015/AI.1157.BE Prat. 2015-1651). S.B.M. is supported by the US National Institutes of Health through R01HG008150, R01MH101814, U01HG007436, and U01HG009080. All of the authors would like to thank the CRS4 and the SCGPM for the computational infrastructure supporting this project.

    Scaling with the flow: advantages of a MapReduce-based scalable and high-throughput sequencing workflow

    The continuous increase in sequencing throughput imposes a new generation of tools for data processing. The alternative is to continue suffering scalability problems in processing workflows and IT infrastructure. We evaluate the advantages that the CRS4 Sequencing and Genotyping Platform (CSGP), equipped with 6 Illumina sequencers, gained by replacing its conventional workflow with a new one based on Seal (http://biodoop-seal.sf.net) and Hadoop. The former was a standard pipeline that demultiplexed samples, aligned reads with BWA, removed duplicates with Picard and recalibrated base qualities with GATK. It parallelized computation through concurrent jobs, using a centralized file system to share data. This implementation showed weaknesses as the workload increased: low parallelism; I/O bottleneck at central storage; failure of entire analyses due to node failures or transient cluster problems. The new workflow is a custom, distributed pipeline based on the open-source Seal suite, which provides a set of tools (including a distributed BWA aligner) that run on the Hadoop MapReduce framework, leveraging its functionality for genomic sequencing applications. By switching to a Seal-based workflow we have acquired computational scalability out-of-the-box. Therefore, we can now easily meet the demands imposed by the growing sequencing platform by adding more computing nodes. In addition, the much-increased parallelism has improved overall computational throughput by taking advantage of all available computing power. Notably, we drastically sped up alignment and duplicates removal by 5x without adding computation nodes; adding nodes would result in additional throughput. Moreover, the effort required by our operators to run the analyses has been reduced, since Hadoop transparently handles most hardware and transient network problems and provides a friendly web interface to monitor job progress and logs. 
Finally, we eliminated the need for our expensive shared parallel storage devices. Our tests reveal that Seal is efficient, achieving close to 70% of the theoretical maximum throughput per node (measured with a single-node version of the workflow on a small data set) and scales linearly at least up to 128 nodes. In summary, this case study suggests that the MapReduce programming model, Seal and Hadoop provide considerable benefits in the genomic sequencing domain. Seal now includes our new workflow as a downloadable sample application.
2011-10-11, Montreal, Canada. The 12th International Congress of Human Genetics & The American Society of Human Genetics, 61st Annual Meeting, October 11–15, 2011, Montreal, Canada.

    Angiogenesis in gynecological cancers and the options for anti-angiogenesis therapy

    Angiogenesis is required in cancer, including gynecological cancers, for the growth of primary tumors and secondary metastases. Development of anti-angiogenesis therapy in gynecological cancers and improvement of its efficacy have been a major focus of fundamental and clinical research. However, survival benefits of current anti-angiogenic agents, such as bevacizumab, in patients with gynecological cancer are modest. Therefore, a better understanding of angiogenesis and the tumor microenvironment in gynecological cancers is urgently needed to develop more effective anti-angiogenic therapies, whether or not in combination with other therapeutic approaches. We describe the molecular aspects of (tumor) blood vessel formation and the tumor microenvironment and provide an extensive clinical overview of current anti-angiogenic therapies for gynecological cancers. We discuss the different phenotypes of angiogenic endothelial cells as potential therapeutic targets, strategies aimed at intervention in their metabolism, and approaches targeting their (inflammatory) tumor microenvironment.

    Low-Pass DNA sequencing of 1200 Sardinians reconstructs European Y-chromosome phylogeny

    Genetic variation within the male-specific portion of the Y chromosome (MSY) can clarify the origins of contemporary populations, but previous studies were hampered by partial genetic information. Population sequencing of 1204 Sardinian males identified 11,763 MSY single-nucleotide polymorphisms, 6751 of which have not previously been observed. We constructed a MSY phylogenetic tree containing all main haplogroups found in Europe, along with many Sardinian-specific lineage clusters within each haplogroup. The tree was calibrated with archaeological data from the initial expansion of the Sardinian population ~7700 years ago. The ages of nodes highlight different genetic strata in Sardinia and reveal the presumptive timing of coalescence with other human populations. We calculate a putative age for coalescence of ~180,000 to 200,000 years ago, which is consistent with previous mitochondrial DNA–based estimates

    Autosomal-Dominant Multiple Pterygium Syndrome Is Caused by Mutations in MYH3

    Multiple pterygium syndrome (MPS) is a phenotypically and genetically heterogeneous group of rare Mendelian conditions characterized by multiple pterygia, scoliosis, and congenital contractures of the limbs. MPS typically segregates as an autosomal-recessive disorder, but rare instances of autosomal-dominant transmission have been reported. Whereas several mutations causing recessive MPS have been identified, the genetic basis of dominant MPS remains unknown. We identified four families affected by dominantly transmitted MPS characterized by pterygia, camptodactyly of the hands, vertebral fusions, and scoliosis. Exome sequencing identified predicted protein-altering mutations in embryonic myosin heavy chain (MYH3) in three families. MYH3 mutations underlie distal arthrogryposis types 1, 2A, and 2B, but all mutations reported to date occur in the head and neck domains. In contrast, two of the mutations found to cause MPS in this study occurred in the tail domain. The phenotypic overlap among persons with MPS, coupled with physical findings distinct from other conditions caused by mutations in MYH3, suggests that the developmental mechanism underlying MPS differs from that of other conditions and/or that certain functions of embryonic myosin might be perturbed by disruption of specific residues and/or domains. Moreover, the vertebral fusions in persons with MPS, coupled with evidence of MYH3 expression in bone, suggest that embryonic myosin plays a role in skeletal development

    De Novo Mutations in NALCN Cause a Syndrome Characterized by Congenital Contractures of the Limbs and Face, Hypotonia, and Developmental Delay

    Freeman-Sheldon syndrome, or distal arthrogryposis type 2A (DA2A), is an autosomal-dominant condition caused by mutations in MYH3 and characterized by multiple congenital contractures of the face and limbs and normal cognitive development. We identified a subset of five individuals who had been putatively diagnosed with "DA2A with severe neurological abnormalities" and for whom congenital contractures of the limbs and face, hypotonia, and global developmental delay had resulted in early death in three cases; this is a unique condition that we now refer to as CLIFAHDD syndrome. Exome sequencing identified missense mutations in the sodium leak channel, non-selective (NALCN) in four families affected by CLIFAHDD syndrome. We used molecular-inversion probes to screen for NALCN in a cohort of 202 distal arthrogryposis (DA)-affected individuals as well as concurrent exome sequencing of six other DA-affected individuals, thus revealing NALCN mutations in ten additional families with "atypical" forms of DA. All 14 mutations were missense variants predicted to alter amino acid residues in or near the S5 and S6 pore-forming segments of NALCN, highlighting the functional importance of these segments. In vitro functional studies demonstrated that NALCN alterations nearly abolished the expression of wild-type NALCN, suggesting that alterations that cause CLIFAHDD syndrome have a dominant-negative effect. In contrast, homozygosity for mutations in other regions of NALCN has been reported in three families affected by an autosomal-recessive condition characterized mainly by hypotonia and severe intellectual disability. Accordingly, mutations in NALCN can cause either a recessive or dominant condition characterized by varied though overlapping phenotypic features, perhaps based on the type of mutation and affected protein domain(s)

    Genome sequencing elucidates Sardinian genetic architecture and augments association analyses for lipid and blood inflammatory markers

    We report ~17.6 million genetic variants from whole-genome sequencing of 2,120 Sardinians; 22% are absent from previous sequencing-based compilations and are enriched for predicted functional consequences. Furthermore, ~76,000 variants common in our sample (frequency >5%) are rare elsewhere (<0.5% in the 1000 Genomes Project). We assessed the impact of these variants on circulating lipid levels and five inflammatory biomarkers. We observe 14 signals, including 2 major new loci, for lipid levels and 19 signals, including 2 new loci, for inflammatory markers. The new associations would have been missed in analyses based on 1000 Genomes Project data, underlining the advantages of large-scale sequencing in this founder population.