17 research outputs found
CanPredict: a computational tool for predicting cancer-associated missense mutations
Various cancer genome projects are underway to identify novel mutations that drive tumorigenesis. While these screens will generate large data sets, the majority of identified missense changes are likely to be innocuous passenger mutations or polymorphisms. As a result, it has become increasingly important to develop computational methods for distinguishing functionally relevant mutations from other variations. We previously developed an algorithm, and now present the web application, CanPredict (http://www.canpredict.org/ or http://www.cgl.ucsf.edu/Research/genentech/canpredict/), to allow users to determine if particular changes are likely to be cancer-associated. The impact of each change is measured using two known methods: Sorting Intolerant From Tolerant (SIFT) and the Pfam-based LogR.E-value metric. A third method, the Gene Ontology Similarity Score (GOSS), provides an indication of how closely the gene in which the variant resides resembles other known cancer-causing genes. Scores from these three algorithms are analyzed by a random forest classifier which then predicts whether a change is likely to be cancer-associated. CanPredict fills an important need in cancer biology and will enable a large audience of biologists to determine which mutations are the most relevant for further study
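The scoring pipeline described above (three per-variant scores combined by an ensemble classifier) can be sketched as follows. This is a minimal illustration only, not the CanPredict implementation: it uses a small bagged ensemble of one-split trees as a simplified stand-in for a random forest. The training rows, the score directions and the names `fit_stump`, `fit_forest` and `predict` are all hypothetical.

```python
import random

# Hypothetical toy rows: (SIFT score, LogR.E-value, GOSS) -> 1 if cancer-associated.
# All numbers are invented for illustration; none come from CanPredict.
TRAIN = [
    ((0.01, 2.5, 0.90), 1), ((0.02, 1.8, 0.80), 1),
    ((0.05, 2.1, 0.70), 1), ((0.03, 3.0, 0.85), 1),
    ((0.60, 0.2, 0.20), 0), ((0.45, 0.1, 0.30), 0),
    ((0.80, 0.4, 0.10), 0), ((0.30, 0.3, 0.25), 0),
]

def fit_stump(rows, feature):
    """Find the single threshold on one feature with the fewest training errors."""
    values = sorted({x[feature] for x, _ in rows})
    # Candidate thresholds are midpoints between neighbouring distinct values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])] or [values[0]]
    best = (len(rows) + 1, candidates[0], 1)
    for t in candidates:
        for label_above in (0, 1):
            errors = sum(
                (label_above if x[feature] > t else 1 - label_above) != y
                for x, y in rows
            )
            if errors < best[0]:
                best = (errors, t, label_above)
    return feature, best[1], best[2]

def fit_forest(rows, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample and one randomly chosen feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]
        feature = rng.randrange(len(rows[0][0]))
        forest.append(fit_stump(sample, feature))
    return forest

def predict(forest, x):
    """Return the fraction of stumps voting 'cancer-associated'."""
    votes = sum(
        label_above if x[feature] > t else 1 - label_above
        for feature, t, label_above in forest
    )
    return votes / len(forest)

forest = fit_forest(TRAIN)
print(predict(forest, (0.02, 2.4, 0.90)))  # variant resembling the positive toy rows
print(predict(forest, (0.70, 0.2, 0.15)))  # variant resembling the negative toy rows
```

A real random forest grows deeper trees and samples features at every split; the stump version keeps the voting idea visible in a few lines.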
The transposable elements of the Drosophila melanogaster euchromatin: a genomics perspective.
BACKGROUND: Transposable elements are found in the genomes of nearly all eukaryotes. The recent completion of the Release 3 euchromatic genomic sequence of Drosophila melanogaster by the Berkeley Drosophila Genome Project has provided precise sequence for the repetitive elements in the Drosophila euchromatin. We have used this genomic sequence to describe the euchromatic transposable elements in the sequenced strain of this species. RESULTS: We identified 85 known and eight novel families of transposable element varying in copy number from one to 146. A total of 1,572 full and partial transposable elements were identified, comprising 3.86% of the sequence. More than two-thirds of the transposable elements are partial. The density of transposable elements increases an average of 4.7 times in the centromere-proximal regions of each of the major chromosome arms. We found that transposable elements are preferentially found outside genes; only 436 of 1,572 transposable elements are contained within the 61.4 Mb of sequence that is annotated as being transcribed. A large proportion of transposable elements is found nested within other elements of the same or different classes. Lastly, an analysis of structural variation from different families reveals distinct patterns of deletion for elements belonging to different classes. CONCLUSIONS: This analysis represents an initial characterization of the transposable elements in the Release 3 euchromatic genomic sequence of D. melanogaster for which comparison to the transposable elements of other organisms can begin to be made. These data have been made available on the Berkeley Drosophila Genome Project website for future analyses. RIGHTS: This article is licensed under the BioMed Central licence at http://www.biomedcentral.com/about/license which is similar to the 'Creative Commons Attribution Licence'.
In brief you may: copy, distribute, and display the work; make derivative works; or make commercial use of the work, under the following conditions: the original author must be given credit; for any reuse or distribution, it must be made clear to others what the license terms of this work are.
The transposable elements of the Drosophila melanogaster
Background: Transposable elements are found in the genomes of nearly all eukaryotes. The
recent completion of the Release 3 euchromatic genomic sequence of Drosophila melanogaster by
the Berkeley Drosophila Genome Project has provided precise sequence for the repetitive
elements in the Drosophila euchromatin. We have used this genomic sequence to describe the
euchromatic transposable elements in the sequenced strain of this species.
Results: We identified 85 known and eight novel families of transposable element varying in copy
number from one to 146. A total of 1,572 full and partial transposable elements were identified,
comprising 3.86% of the sequence. More than two-thirds of the transposable elements are partial.
The density of transposable elements increases an average of 4.7 times in the centromere-proximal
regions of each of the major chromosome arms. We found that transposable elements
are preferentially found outside genes; only 436 of 1,572 transposable elements are contained
within the 61.4 Mb of sequence that is annotated as being transcribed. A large proportion of
transposable elements is found nested within other elements of the same or different classes.
Lastly, an analysis of structural variation from different families reveals distinct patterns of
deletion for elements belonging to different classes.
Conclusions: This analysis represents an initial characterization of the transposable elements in
the Release 3 euchromatic genomic sequence of D. melanogaster for which comparison to the
transposable elements of other organisms can begin to be made. These data have been made
available on the Berkeley Drosophila Genome Project website for future analyses.
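The counts quoted in the Results can be turned into a few derived figures. The short sketch below uses only numbers stated in the abstract; the variable names are illustrative. A comparison against the genome-wide element density is not attempted, because the abstract does not give the total length of the euchromatic sequence.

```python
# Figures quoted in the abstract.
known_families = 85
novel_families = 8
total_elements = 1572
elements_in_transcribed = 436
transcribed_mb = 61.4

total_families = known_families + novel_families
mean_copies_per_family = total_elements / total_families
fraction_in_transcribed = elements_in_transcribed / total_elements
density_in_transcribed = elements_in_transcribed / transcribed_mb

print(f"families: {total_families}")                                  # families: 93
print(f"mean copies per family: {mean_copies_per_family:.1f}")        # 16.9
print(f"inside transcribed sequence: {fraction_in_transcribed:.1%}")  # 27.7%
print(f"elements per Mb transcribed: {density_in_transcribed:.1f}")   # 7.1
```

Roughly 28% of the elements therefore lie within annotated transcribed sequence, and the remaining 72% or so lie outside it.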
Heterochromatic sequences in a Drosophila whole-genome shotgun assembly
BACKGROUND: Most eukaryotic genomes include a substantial repeat-rich fraction termed heterochromatin, which is concentrated in centric and telomeric regions. The repetitive nature of heterochromatic sequence makes it difficult to assemble and analyze. To better understand the heterochromatic component of the Drosophila melanogaster genome, we characterized and annotated portions of a whole-genome shotgun sequence assembly. RESULTS: WGS3, an improved whole-genome shotgun assembly, includes 20.7 Mb of draft-quality sequence not represented in the Release 3 sequence spanning the euchromatin. We annotated this sequence using the methods employed in the re-annotation of the Release 3 euchromatic sequence. This analysis predicted 297 protein-coding genes and six non-protein-coding genes, including known heterochromatic genes, and regions of similarity to known transposable elements. Bacterial artificial chromosome (BAC)-based fluorescence in situ hybridization analysis was used to correlate the genomic sequence with the cytogenetic map in order to refine the genomic definition of the centric heterochromatin; on the basis of our cytological definition, the annotated Release 3 euchromatic sequence extends into the centric heterochromatin on each chromosome arm. CONCLUSIONS: Whole-genome shotgun assembly produced a reliable draft-quality sequence of a significant part of the Drosophila heterochromatin. Annotation of this sequence defined the intron-exon structures of 30 known protein-coding genes and 267 protein-coding gene models. The cytogenetic mapping suggests that an additional 150 predicted genes are located in heterochromatin at the base of the Release 3 euchromatic sequence. Our analysis suggests strategies for improving the sequence and annotation of the heterochromatic portions of the Drosophila and other complex genomes
Annotation of the Drosophila melanogaster euchromatic genome: a systematic review
BACKGROUND: The recent completion of the Drosophila melanogaster genomic sequence to high quality and the availability of a greatly expanded set of Drosophila cDNA sequences, aligning to 78% of the predicted euchromatic genes, afforded FlyBase the opportunity to significantly improve genomic annotations. We made the annotation process more rigorous by inspecting each gene visually, utilizing a comprehensive set of curation rules, requiring traceable evidence for each gene model, and comparing each predicted peptide to SWISS-PROT and TrEMBL sequences. RESULTS: Although the number of predicted protein-coding genes in Drosophila remains essentially unchanged, the revised annotation significantly improves gene models, resulting in structural changes to 85% of the transcripts and 45% of the predicted proteins. We annotated transposable elements and non-protein-coding RNAs as new features, and extended the annotation of untranslated (UTR) sequences and alternative transcripts to include more than 70% and 20% of genes, respectively. Finally, cDNA sequence provided evidence for dicistronic transcripts, neighboring genes with overlapping UTRs on the same DNA sequence strand, alternatively spliced genes that encode distinct, non-overlapping peptides, and numerous nested genes. CONCLUSIONS: Identification of so many unusual gene models not only suggests that some mechanisms for gene regulation are more prevalent than previously believed, but also underscores the complex challenges of eukaryotic gene prediction. At present, experimental data and human curation remain essential to generate high-quality genome annotations
High-resolution analysis of copy number alterations and associated expression changes in ovarian tumors
BACKGROUND: DNA copy number alterations are frequently observed in ovarian cancer, but it remains a challenge to identify the most relevant alterations and the specific causal genes in those regions. METHODS: We obtained high-resolution 500K SNP array data for 52 ovarian tumors and identified the most statistically significant minimal genomic regions with the most prevalent and highest-level copy number alterations (recurrent CNAs). Within a region of recurrent CNA, comparison of expression levels in tumors with a given CNA to tumors lacking that CNA and to whole normal ovary samples was used to select genes with CNA-specific expression patterns. A public expression array data set of laser capture micro-dissected (LCM) non-malignant fallopian tube epithelia and LCM ovarian serous adenocarcinoma was used to evaluate the effect of cell-type mixture biases. RESULTS: Fourteen recurrent deletions were detected on chromosomes 4, 6, 9, 12, 13, 15, 16, 17, 18, 22 and most prevalently on X and 8. Copy number and expression data suggest several apoptosis mediators as candidate drivers of the 8p deletions. Sixteen recurrent gains were identified on chromosomes 1, 2, 3, 5, 8, 10, 12, 15, 17, 19, and 20, with the most prevalent gains localized to 8q and 3q. Within the 8q amplicon, PVT1, but not MYC, was strongly over-expressed relative to tumors lacking this CNA and showed over-expression relative to normal ovary. Likewise, the cell polarity regulators PRKCI and ECT2 were identified as putative drivers of two distinct amplicons on 3q. Co-occurrence analyses suggested potential synergistic or antagonistic relationships between recurrent CNAs. Genes within regions of recurrent CNA showed an enrichment of Cancer Census genes, particularly when filtered for CNA-specific expression. CONCLUSION: These analyses provide detailed views of ovarian cancer genomic changes and highlight the benefits of using multiple reference sample types for the evaluation of CNA-specific expression changes.
Senescence-Associated Secretory Phenotypes Reveal Cell-Nonautonomous Functions of Oncogenic RAS and the p53 Tumor Suppressor
Cellular senescence suppresses cancer by arresting cell proliferation, essentially permanently, in response to oncogenic stimuli, including genotoxic stress. We modified the use of antibody arrays to provide a quantitative assessment of factors secreted by senescent cells. We show that human cells induced to senesce by genotoxic stress secrete myriad factors associated with inflammation and malignancy. This senescence-associated secretory phenotype (SASP) developed slowly over several days and only after DNA damage of sufficient magnitude to induce senescence. Remarkably similar SASPs developed in normal fibroblasts, normal epithelial cells, and epithelial tumor cells after genotoxic stress in culture, and in epithelial tumor cells in vivo after treatment of prostate cancer patients with DNA-damaging chemotherapy. In cultured premalignant epithelial cells, SASPs induced an epithelial-mesenchyme transition and invasiveness, hallmarks of malignancy, by a paracrine mechanism that depended largely on the SASP factors interleukin (IL)-6 and IL-8. Strikingly, two manipulations markedly amplified, and accelerated development of, the SASPs: oncogenic RAS expression, which causes genotoxic stress and senescence in normal cells, and functional loss of the p53 tumor suppressor protein. Both loss of p53 and gain of oncogenic RAS also exacerbated the promalignant paracrine activities of the SASPs. Our findings define a central feature of genotoxic stress-induced senescence. Moreover, they suggest a cell-nonautonomous mechanism by which p53 can restrain, and oncogenic RAS can promote, the development of age-related cancer by altering the tissue microenvironment.
Rickettsia Phylogenomics: Unwinding the Intricacies of Obligate Intracellular Life
BACKGROUND: Completed genome sequences are rapidly increasing for Rickettsia, obligate intracellular alpha-proteobacteria responsible for various human diseases, including epidemic typhus and Rocky Mountain spotted fever. In light of phylogeny, the establishment of orthologous groups (OGs) of open reading frames (ORFs) will distinguish the core rickettsial genes and other group specific genes (class 1 OGs or C1OGs) from those distributed indiscriminately throughout the rickettsial tree (class 2 OG or C2OGs). METHODOLOGY/PRINCIPAL FINDINGS: We present 1823 representative (no gene duplications) and 259 non-representative (at least one gene duplication) rickettsial OGs. While the highly reductive (approximately 1.2 MB) Rickettsia genomes range in predicted ORFs from 872 to 1512, a core of 752 OGs was identified, depicting the essential Rickettsia genes. Unsurprisingly, this core lacks many metabolic genes, reflecting the dependence on host resources for growth and survival. Additionally, we bolster our recent reclassification of Rickettsia by identifying OGs that define the AG (ancestral group), TG (typhus group), TRG (transitional group), and SFG (spotted fever group) rickettsiae. OGs for insect-associated species, tick-associated species and species that harbor plasmids were also predicted. Through superimposition of all OGs over robust phylogeny estimation, we discern between C1OGs and C2OGs, the latter depicting genes either decaying from the conserved C1OGs or acquired laterally. Finally, scrutiny of non-representative OGs revealed high levels of split genes versus gene duplications, with both phenomena confounding gene orthology assignment. Interestingly, non-representative OGs, as well as OGs comprised of several gene families typically involved in microbial pathogenicity and/or the acquisition of virulence factors, fall predominantly within C2OG distributions. 
CONCLUSION/SIGNIFICANCE: Collectively, we determined the relative conservation and distribution of 14354 predicted ORFs from 10 rickettsial genomes across robust phylogeny estimation. The data, available at PATRIC (PathoSystems Resource Integration Center), provide novel information for unwinding the intricacies associated with Rickettsia pathogenesis, expanding the range of potential diagnostic, vaccine and therapeutic targets.
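The distinction between representative and non-representative OGs can be illustrated with a small sketch. The representative rule (no genome contributes more than one ORF) follows the definition given in the abstract. Treating an OG as "core" when it has a member in every genome is an assumption made here for illustration, as are the function name `classify_og` and the toy genome and ORF labels.

```python
from collections import Counter

def classify_og(members, n_genomes):
    """Classify one orthologous group.

    members: list of (genome, orf_id) pairs belonging to the group.
    n_genomes: number of genomes in the comparison.
    """
    per_genome = Counter(genome for genome, _ in members)
    return {
        # Representative: no gene duplications, so at most one ORF per genome.
        "representative": all(count == 1 for count in per_genome.values()),
        # Assumed definition of core: the group is present in every genome.
        "core": len(per_genome) == n_genomes,
    }

# Hypothetical groups across three toy genomes A, B and C.
single_copy_everywhere = [("A", "a1"), ("B", "b1"), ("C", "c1")]
duplicated_in_a = [("A", "a2"), ("A", "a3"), ("B", "b2")]

print(classify_og(single_copy_everywhere, 3))
print(classify_og(duplicated_in_a, 3))
```

With the counts from the abstract, the 1823 representative and 259 non-representative groups sum to 2082 OGs in total.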
Whole genome sequencing across clinical trials identifies rare coding variants in GPR68 associated with chemotherapy-induced peripheral neuropathy
BACKGROUND: Dose-limiting toxicities significantly impact the benefit/risk profile of many drugs. Whole genome sequencing (WGS) in patients receiving drugs with dose-limiting toxicities can identify therapeutic hypotheses to prevent these toxicities. Chemotherapy-induced peripheral neuropathy (CIPN) is a common dose-limiting neurological toxicity of chemotherapies with no effective approach for prevention. METHODS: We conducted a genetic study of time-to-first peripheral neuropathy event using 30× germline WGS data from whole blood samples from 4900 European-ancestry cancer patients in 14 randomized controlled trials. A substantial number of patients in these trials received taxane and platinum-based chemotherapies as part of their treatment regimen, either standard of care or in combination with the PD-L1 inhibitor atezolizumab. The trials spanned several cancers including renal cell carcinoma, triple negative breast cancer, non-small cell lung cancer, small cell lung cancer, bladder cancer, ovarian cancer, and melanoma. RESULTS: We identified a locus consisting of low-frequency variants in intron 13 of GRID2 associated with time-to-onset of first peripheral neuropathy (PN) indexed by rs17020773 (p = 2.03 × 10^-8, all patients; p = 6.36 × 10^-9, taxane treated). Gene-level burden analysis identified rare coding variants associated with increased PN risk in the C-terminus of GPR68 (p = 1.59 × 10^-6, all patients; p = 3.47 × 10^-8, taxane treated), a pH-sensitive G-protein coupled receptor (GPCR). The variants driving this signal were found to alter predicted arrestin binding motifs in the C-terminus of GPR68. Analysis of snRNA-seq from human dorsal root ganglia (DRG) indicated that expression of GPR68 was highest in mechano-thermo-sensitive nociceptors. CONCLUSIONS: Our genetic study provides insight into the impact of low-frequency and rare coding genetic variation on PN risk and suggests that further study of GPR68 in sensory neurons may yield a therapeutic hypothesis for prevention of CIPN.
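The idea behind a gene-level burden analysis can be sketched in a few lines: rare variants in one gene are collapsed into a single carrier flag per patient, and outcomes are then compared between carriers and non-carriers. The sketch below is an illustration under stated assumptions, not the study's method. It compares simple event proportions rather than modelling time to event, the 1% frequency cut-off is an arbitrary choice, and all patient IDs, variant IDs and values are hypothetical.

```python
def burden_carriers(genotypes, variant_freq, max_freq=0.01):
    """Return the set of patients carrying at least one rare variant.

    genotypes: {patient: {variant: alternate allele count}}
    variant_freq: {variant: allele frequency}
    max_freq: assumed rarity cut-off (illustrative, not from the study)
    """
    rare = {v for v, freq in variant_freq.items() if freq <= max_freq}
    return {
        patient
        for patient, calls in genotypes.items()
        if any(calls.get(v, 0) > 0 for v in rare)
    }

def event_rate(patients, had_event):
    """Fraction of the given patients with an event, or None if the group is empty."""
    patients = list(patients)
    if not patients:
        return None
    return sum(had_event[p] for p in patients) / len(patients)

# Hypothetical toy inputs for one gene.
genotypes = {"p1": {"v1": 1}, "p2": {"v2": 1}, "p3": {"v3": 1}, "p4": {}, "p5": {}}
variant_freq = {"v1": 0.001, "v2": 0.004, "v3": 0.20}
had_event = {"p1": True, "p2": True, "p3": False, "p4": False, "p5": True}

carriers = burden_carriers(genotypes, variant_freq)
non_carriers = set(genotypes) - carriers
print(sorted(carriers))
print(event_rate(carriers, had_event), event_rate(non_carriers, had_event))
```

Collapsing variants this way trades per-variant resolution for power, which matters when each individual variant is too rare to test on its own.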