33 research outputs found

    Genomics and Privacy: Implications of the New Reality of Closed Data for the Field

    Open source and open data have been driving forces in bioinformatics in the past. However, privacy concerns may soon change the landscape, limiting future access to important data sets, including personal genomics data. Here we survey this situation in some detail, describing, in particular, how the large scale of the data from personal genomic sequencing makes it especially hard to share data, exacerbating the privacy problem. We also go over various aspects of genomic privacy: first, there is basic identifiability of subjects having their genome sequenced. However, even for individuals who have consented to be identified, there is the prospect of very detailed future characterization of their genotype, which, unanticipated at the time of their consent, may be more personal and invasive than the release of their medical records. We go over various computational strategies for dealing with the issue of genomic privacy. One can “slice” and reformat datasets to allow them to be partially shared while securing the most private variants. This is particularly applicable to functional genomics information, which can be largely processed without variant information. For handling the most private data there are a number of legal and technological approaches: for example, modifying the informed consent procedure to acknowledge that privacy cannot be guaranteed, and/or employing a secure cloud computing environment. Cloud computing in particular may allow access to the data in a more controlled fashion than the current practice of downloading and computing on large datasets. Furthermore, it may be particularly advantageous for small labs, given that the burden of many privacy issues falls disproportionately on them in comparison to large corporations and genome centers. Finally, we discuss how education of future genetics researchers will be important, with curriculums emphasizing privacy and data security. However, teaching personal genomics with identifiable subjects in the university setting will, in turn, create additional privacy issues and social conundrums.

    PEMer: a computational framework with simulation-based error models for inferring genomic structural variants from massive paired-end sequencing data

    Paired-End Mapper (PEMer) enables mapping of genomic structural variants at considerably enhanced sensitivity, specificity and resolution over previous approaches.

    RNF43 is frequently mutated in colorectal and endometrial cancers

    We report somatic mutations of RNF43 in over 18% of colorectal adenocarcinomas and endometrial carcinomas. RNF43 encodes an E3 ubiquitin ligase that negatively regulates Wnt signaling. Truncating mutations of RNF43 are more prevalent in microsatellite-unstable tumors and show mutual exclusivity with inactivating APC mutations in colorectal adenocarcinomas. These results indicate that RNF43 is one of the most commonly mutated genes in colorectal and endometrial cancers. National Human Genome Research Institute (U.S.) (Grant U54HG003067)

    An integrated map of structural variation in 2,504 human genomes

    © 2015 Macmillan Publishers Limited. All rights reserved. Structural variants are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight structural variant classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations. Analysing this set, we identify numerous gene-intersecting structural variants exhibiting population stratification and describe naturally occurring homozygous gene knockouts that suggest the dispensability of a variety of human genes. We demonstrate that structural variants are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of structural variant complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex structural variants with multiple breakpoints likely to have formed through individual mutational events. Our catalogue will enhance future studies into structural variant demography, functional impact and disease association.

    Landscape of somatic single nucleotide variants and indels in colorectal cancer and impact on survival

    Colorectal cancer (CRC) is a biologically heterogeneous disease. To characterize its mutational profile, we conduct targeted sequencing of 205 genes for 2,105 CRC cases with survival data. Our data yield several new findings and extend the existing knowledge of CRC. We identify PRKCI, SPZ1, MUTYH, MAP2K4, FETUB, and TGFBR2 as additional genes significantly mutated in CRC. We find that among hypermutated tumors, an increased mutation burden is associated with improved CRC-specific survival (HR=0.42, 95% CI: 0.21-0.82). Mutations in TP53 are associated with poorer CRC-specific survival, which is most pronounced in cases carrying TP53 mutations with predicted 0% transcriptional activity (HR=1.53, 95% CI: 1.21-1.94). Furthermore, we observe differences in mutational frequency of several genes and pathways by tumor location, stage, and sex. Overall, this large study provides deep insights into somatic mutations in CRC, and their potential relationships with survival and tumor features. Large-scale sequencing studies are of paramount importance for unravelling the heterogeneity of colorectal cancer. Here, the authors sequenced 205 cancer genes in more than 2,000 tumours, identified additional mutated driver genes, and determined that mutational burden and specific mutations in TP53 are associated with survival odds.