17 research outputs found
Comprehensive evaluation of differential gene expression analysis methods for RNA-seq data
A large number of computational methods have been developed for analyzing differential gene expression in RNA-seq data. We describe a comprehensive evaluation of common methods using the SEQC benchmark dataset and ENCODE data. We consider a number of key features, including normalization, accuracy of differential expression detection and differential expression analysis when one condition has no detectable expression. We find significant differences among the methods, but note that array-based methods adapted to RNA-seq data perform comparably to methods designed for RNA-seq. Our results demonstrate that increasing the number of replicate samples significantly improves detection power over increased sequencing depth.
Erratum to: Comprehensive evaluation of differential gene expression analysis methods for RNA-seq data
An Improved Canine Genome and a Comprehensive Catalogue of Coding Genes and Non-Coding Transcripts
The domestic dog, Canis familiaris, is a well-established model system for mapping trait and disease loci. While the original draft sequence was of good quality, gaps were abundant particularly in promoter regions of the genome, negatively impacting the annotation and study of candidate genes. Here, we present an improved genome build, canFam3.1, which includes 85 MB of novel sequence and now covers 99.8% of the euchromatic portion of the genome. We also present multiple RNA-Sequencing data sets from 10 different canine tissues to catalog ∼175,000 expressed loci. While about 90% of the coding genes previously annotated by EnsEMBL have measurable expression in at least one sample, the number of transcript isoforms detected by our data expands the EnsEMBL annotations by a factor of four. Syntenic comparison with the human genome revealed an additional ∼3,000 loci that are characterized as protein coding in human and were also expressed in the dog, suggesting that those were previously not annotated in the EnsEMBL canine gene set. In addition to ∼20,700 high-confidence protein coding loci, we found ∼4,600 antisense transcripts overlapping exons of protein coding genes, ∼7,200 intergenic multi-exon transcripts without coding potential, likely candidates for long intergenic non-coding RNAs (lincRNAs), and ∼11,000 transcripts that were reported by two different library construction methods but did not fit any of the above categories. Of the lincRNAs, about 6,000 have no annotated orthologs in human or mouse. Functional analysis of two novel transcripts with shRNA in a mouse kidney cell line altered cell morphology and motility. All in all, we provide a much-improved annotation of the canine genome and suggest regulatory functions for several of the novel non-coding transcripts.
The genomic basis of adaptive evolution in threespine sticklebacks
Marine stickleback fish have colonized and adapted to thousands of streams and lakes formed since the last ice age, providing an exceptional opportunity to characterize genomic mechanisms underlying repeated ecological adaptation in nature. Here we develop a high-quality reference genome assembly for threespine sticklebacks. By sequencing the genomes of twenty additional individuals from a global set of marine and freshwater populations, we identify a genome-wide set of loci that are consistently associated with marine–freshwater divergence. Our results indicate that reuse of globally shared standing genetic variation, including chromosomal inversions, has an important role in repeated evolution of distinct marine and freshwater sticklebacks, and in the maintenance of divergent ecotypes during early stages of reproductive isolation. Both coding and regulatory changes occur in the set of loci underlying marine–freshwater evolution, but regulatory changes appear to predominate in this well-known example of repeated adaptive evolution in nature. National Human Genome Research Institute (U.S.) (NHGRI CEGS Grant P50-HG002568
Metastatic Competence Can Emerge with Selection of Preexisting Oncogenic Alleles without a Need of New Mutations
Next-Generation Sequencing Suggests Complex, Heterogeneous Pathogenesis In Peripheral T-Cell Lymphoma Unspecified
Abstract
Peripheral T-cell lymphoma (PTCL) makes up about 12 percent of non-Hodgkin lymphoma, comprising 18 diseases that are poorly understood and carry a generally worse prognosis than B-lymphomas. PTCL not otherwise specified (PTCL-NOS), a diagnosis of exclusion, is most common, making up 25-30 percent. Gene-expression studies suggest a heterogeneous origin of this diagnosis, with overlap to other PTCL types, but the genetic factors underlying its pathogenesis are undefined. Current therapy for PTCL-NOS is empiric and ultimately ineffective for most patients. Identification of specific therapeutic targets is therefore a high priority. We have sought better understanding of pathogenesis through next-generation sequencing of PTCL-NOS tumor DNA. Whole-exome sequencing revealed candidate genes but low availability of fresh-frozen samples limited our ability to draw conclusions by this method alone. We therefore sequenced the coding regions of 237 candidate genes in a collection of formalin-fixed paraffin-embedded samples. We used Nimblegen Sequence Capture for PCR amplification of exons and Illumina HiSeq for raw sequence generation. Results were aligned to hg19 and compared to dbSNP and the 1000 Genomes data to exclude germline variants. Analysis, including comparison to the COSMIC database of cancer-specific mutations, revealed high-confidence mutations affecting more than 60 known cancer-related genes in 25 PTCL-NOS cases. Recurrent mutations pointed to frequent activation of three key signaling pathways: NF-κB (TNFAIP3), WNT/β-catenin (APC, CHD8, CELSR2), and NOTCH (NOTCH1, FBXW7). Recurrent deregulation of epigenetic processes was indicated by mutations in genes affecting histone acetylation (EP300, CREBBP), histone methylation (MLL2, KDM6A), and DNA methylation (TET2, DNMT3A). In addition, components of core tumor suppressor pathways showed evidence of frequent inactivation (TP53, ATM, RB1, CUL9, PRKDC).
In all, 22 of 25 cases had mutations in at least one of these 17 recurrently mutated genes. Multiple additional candidate disease mechanisms also were suggested by lower-confidence mutations but require confirmation studies, which are under way. In sum, analysis of the coding region of PTCL-NOS tumor DNA suggests a complex and heterogeneous pathogenesis, in line with gene-expression profiling. This work provides an opportunity to better sub-classify entities within the diagnosis of PTCL-NOS and identify specific therapeutic targets and their associated biomarkers.
Disclosures:
Horwitz: Seattle Genetics, Inc.: Consultancy, Research Funding; Millennium: Consultancy, Research Funding