22 research outputs found

    Disseny i implementació d'un cluster de computació científica basat en rocks [Design and implementation of a scientific computing cluster based on Rocks]

    Project carried out in collaboration with the Institut de Biologia Evolutiva. The Institut de Biologia Evolutiva is a recently created institute whose main objective is research in population genetics and computational comparative genetics. Owing to the rapid growth of genetic data in recent years, new tools are essential for carrying out high-quality research. The project consists of the needs analysis, design, implementation and evaluation of a high-performance computing cluster based on Rocks Cluster and aimed at the applications used in this research.

    Retrospective evaluation of whole exome and genome mutation calls in 746 cancer samples

    Funder: NCI U24CA211006. Abstract: The Cancer Genome Atlas (TCGA) and International Cancer Genome Consortium (ICGC) curated consensus somatic mutation calls using whole exome sequencing (WES) and whole genome sequencing (WGS), respectively. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2,658 cancers across 38 tumour types, we compare WES and WGS side-by-side from 746 TCGA samples, finding that ~80% of mutations overlap in covered exonic regions. We estimate that low variant allele fraction (VAF < 15%) and clonal heterogeneity contribute up to 68% of private WGS mutations and 71% of private WES mutations. We observe that ~30% of private WGS mutations trace to mutations identified by a single variant caller in WES consensus efforts. WGS captures both ~50% more variation in exonic regions and unobserved mutations in loci with variable GC-content. Together, our analysis highlights technological divergences between two reproducible somatic variant detection efforts.

    GWAS results Manhattan plot.

    Manhattan plots for Asian, African American and European population subsets showing top hits from each continent. The blue line indicates p-value of 10⁻⁵ and red line indicates p-value of 10⁻⁸.

    Regional association plots.

    Regional association plots for Asian (A), European (B), American (C) and African (D) population subsets, produced by LocusZoom, showing the top SNP from each population subset (in purple) and surrounding SNPs in the region colored by LD (r²) with the top SNP. The lower panel shows genes annotated within this region. Solid blue lines represent recombination rates.

    Effect of collapsed duplications on diversity estimates: what to expect

    The study of segmental duplications (SDs) and copy-number variants (CNVs) is of great importance in the fields of genomics and evolution. However, SDs and CNVs are usually excluded from genome-wide scans for natural selection. Because of high identity between copies, SDs and CNVs that are not included in reference genomes are prone to be collapsed (that is, mistakenly aligned to the same region) when aligning sequence data from single individuals to the reference. Such collapsed duplications are additionally challenging because concerted evolution between duplications alters their site frequency spectrum and linkage disequilibrium patterns. To investigate the potential effect of collapsed duplications upon natural selection scans, we obtained expectations for four summary statistics from simulations of duplications evolving under a range of interlocus gene conversion and crossover rates. We confirm that summary statistics traditionally used to detect the action of natural selection on DNA sequences cannot be applied to SDs and CNVs, since in some cases values for known duplications mimic selective signatures. As a proof of concept of the pervasiveness of collapsed duplications, we analyzed data from the 1000 Genomes Project. We find that, within regions identified as variable in copy number, diversity between individuals with the duplication is consistently higher than between individuals without the duplication. Furthermore, the frequency of single nucleotide variants (SNVs) deviating from Hardy-Weinberg Equilibrium is higher in individuals with the duplication, which strongly suggests that higher diversity is a consequence of collapsed duplications and incorrect evaluation of SNVs within these CNV regions.
    This work has been supported by Ministerio de Ciencia e Innovación, Spain (BFU2015-68649-P, MINECO/FEDER, UE), the Direcció General de Recerca, Generalitat de Catalunya (2014SGR1311 and 2014SGR866), the Spanish National Institute of Bioinformatics (PT13/0001/0026) of the Instituto de Salud Carlos III, grant MDM-2014-0370 through the “María de Maeztu” Programme for Units of Excellence in R&D to UPF’s Department of Experimental and Health Sciences; a grant to D.A.H. from Conacyt; and by the Fondo Europeo de Desarrollo Regional (FEDER) and the Fondo Social Europeo (FSE).
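
The Hardy-Weinberg check mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a single biallelic SNV, a textbook chi-square goodness-of-fit test with one degree of freedom, and a conventional 0.05 significance cutoff. The function name and the genotype counts are hypothetical.

```python
# Minimal sketch (assumption: textbook chi-square test, not the study's method).
CHI2_CRITICAL_1DF_P05 = 3.841  # chi-square critical value, 1 d.f., alpha = 0.05


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Chi-square statistic for departure from Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype must be observed")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    observed = (n_aa, n_ab, n_bb)
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    # Skip classes with zero expectation (monomorphic sites) to avoid dividing by zero.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# Hypothetical counts: every individual called heterozygous, as can happen
# when reads from two paralogous copies pile up on one reference location.
stat = hwe_chi_square(0, 100, 0)
deviates = stat > CHI2_CRITICAL_1DF_P05
```

With these made-up counts the expected genotype classes are 25, 50 and 25, so the excess of heterozygotes gives a large statistic and the site is flagged as deviating.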

    Analysis of five gene sets in chimpanzees suggests decoupling between the action of selection on protein-coding and on noncoding elements

    We set out to investigate potential differences and similarities between the selective forces acting upon the coding and noncoding regions of five different sets of genes defined according to functional and evolutionary criteria: 1) two reference gene sets presenting accelerated and slow rates of protein evolution (the Complement and Actin pathways); 2) a set of genes with evidence of accelerated evolution in at least one of their introns; and 3) two gene sets related to neurological function (Parkinson's and Alzheimer's diseases). To that effect, we combine human-chimpanzee divergence patterns with polymorphism data obtained from targeted resequencing of 20 central chimpanzees, our closest relatives with the largest long-term effective population size. By using the distribution of fitness effects-alpha (DFE-alpha) extension of the McDonald-Kreitman test, we reproduce inferences of rates of evolution previously based only on divergence data on both coding and intronic sequences and also obtain inferences for other classes of genomic elements (untranslated regions, promoters, and conserved noncoding sequences). Our results suggest that 1) the DFE-alpha method successfully helps distinguish different scenarios of accelerated divergence (adaptation or relaxed selective constraints) and 2) the adaptive history of coding and noncoding sequences within the gene sets analyzed is decoupled.
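
For orientation, the basic McDonald-Kreitman estimate of the proportion of adaptive substitutions can be sketched as below. This is the simple estimator alpha = 1 - (Ds * Pn) / (Dn * Ps), not the DFE-alpha extension used in the study, which additionally models the distribution of fitness effects. The function name and all counts are hypothetical.

```python
# Minimal sketch (assumption: basic MK estimator, NOT the DFE-alpha method).
def mk_alpha(dn: int, ds: int, pn: int, ps: int) -> float:
    """Proportion of nonsynonymous substitutions attributed to adaptation.

    dn, ds: nonsynonymous / synonymous fixed differences between species
    pn, ps: nonsynonymous / synonymous polymorphisms within a species
    """
    if dn == 0 or ps == 0:
        raise ValueError("dn and ps must be non-zero for alpha to be defined")
    return 1.0 - (ds * pn) / (dn * ps)


# Hypothetical counts for one gene set.
alpha = mk_alpha(dn=40, ds=100, pn=20, ps=100)
```

With these made-up counts alpha is 0.5, read as roughly half of the nonsynonymous divergence being attributed to adaptation. A negative value would instead point to an excess of nonsynonymous polymorphism, for example from segregating slightly deleterious variants.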