
    Reranking candidate gene models with cross-species comparison for improved gene prediction

    Background: Most gene finders score candidate gene models with state-based methods, typically HMMs, by combining local properties (coding potential, splice donor and acceptor patterns, etc.). Competing models with similar state-based scores may be distinguishable with additional information. In particular, functional and comparative genomics datasets may help to select among competing models of comparable probability by exploiting features likely to be associated with the correct gene models, such as conserved exon/intron structure or protein sequence features. Results: We have investigated the utility of a simple post-processing step for selecting among a set of alternative gene models, using global scoring rules to rerank competing models for more accurate prediction. For each gene locus, we first generate the K best candidate gene models using the gene finder Evigan, and then rerank these models using comparisons with putative orthologous genes from closely related species. Candidate gene models with lower scores in the original gene finder may be selected if they exhibit strong similarity to probable orthologs in coding sequence, splice site location, or signal peptide occurrence. Experiments on Drosophila melanogaster demonstrate that reranking based on cross-species comparison outperforms the best gene models identified by Evigan alone, and also outperforms the comparative gene finders GeneWise and Augustus+. Conclusion: Reranking gene models with cross-species comparison improves gene prediction accuracy. This straightforward method can be readily adapted to incorporate additional lines of evidence, as it requires only a ranked source of candidate gene models.
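
    The reranking step described in this abstract can be illustrated with a minimal sketch. The abstract does not give the scoring rule, so the weighted linear combination below, the feature names, and the weights are all assumptions made for illustration; they are not the authors' method or Evigan's interface.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One of the K best gene models for a locus (all fields are hypothetical)."""
    name: str
    finder_score: float           # score from the original gene finder, higher is better
    cds_similarity: float         # 0..1, coding-sequence similarity to the putative ortholog
    splice_site_agreement: float  # 0..1, fraction of splice sites shared with the ortholog
    signal_peptide_match: bool    # True if signal peptide occurrence agrees with the ortholog


def rerank(candidates, weights=(1.0, 2.0, 2.0, 0.5)):
    """Sort candidates by an assumed weighted sum of finder score and ortholog evidence."""
    w_finder, w_cds, w_splice, w_signal = weights

    def combined(c):
        return (w_finder * c.finder_score
                + w_cds * c.cds_similarity
                + w_splice * c.splice_site_agreement
                + w_signal * float(c.signal_peptide_match))

    return sorted(candidates, key=combined, reverse=True)


# Toy locus: model_1 has the higher finder score, model_2 agrees better with the ortholog.
candidates = [
    Candidate("model_1", 0.9, 0.50, 0.4, False),
    Candidate("model_2", 0.8, 0.95, 1.0, True),
]
best = rerank(candidates)[0]
print(best.name)
```

    In this toy case the second-ranked model overtakes the top-scoring one because its ortholog evidence outweighs the small gap in finder score, which is the behaviour the abstract describes.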

    Complete Genome Sequence of a Putative Densovirus of the Asian Citrus Psyllid, Diaphorina citri.

    Here, we report the complete genome sequence of a putative densovirus of the Asian citrus psyllid, Diaphorina citri. Diaphorina citri densovirus (DcDNV) was originally identified through metagenomics, and here, we obtained the complete nucleotide sequence using PCR-based approaches. Phylogenetic analysis places DcDNV between viruses of the Ambidensovirus and Iteradensovirus genera.

    SPG10 is a rare cause of spastic paraplegia in European families

    Background: SPG10 is an autosomal dominant form of hereditary spastic paraplegia (HSP), which is caused by mutations in the neural kinesin heavy chain KIF5A gene, the neuronal motor of fast anterograde axonal transport. Only four mutations have been identified to date. Objective: To determine the frequency of SPG10 in European families with HSP and to specify the SPG10 phenotype. Patients and methods: 80 index patients from families with autosomal dominant HSP were investigated for SPG10 mutations by direct sequencing of the KIF5A motor domain. Additionally, the whole gene was sequenced in 20 of these families. Results: Three novel KIF5A mutations were detected in German families, including one missense mutation (c.759G>T, p.K253N), one in frame deletion (c.768_770delCAA, p.N256del) and one splice site mutation (c.217G>A). Onset of gait disturbance varied from infancy to 30 years of age. All patients presented clinically with pure HSP, but a subclinical sensory-motor neuropathy was detected by neurophysiology studies. Conclusions: SPG10 accounts for approximately 3% of European autosomal dominant HSP families. All mutations affect the motor domain of kinesin and thus most likely impair axonal transport. Clinically, SPG10 is characterised by spastic paraplegia with mostly subclinical peripheral neuropathy.

    ECgene: an alternative splicing database update

    ECgene () was developed to provide functional annotation for alternatively spliced genes. Its applications encompass genome-based transcript modeling for alternative splicing (AS), domain analysis with Gene Ontology (GO) annotation and expression analysis based on EST and SAGE data. We have expanded ECgene's AS modeling and EST clustering to nine organisms for which sufficient EST data are available in GenBank. For the human genome, we have also introduced several new applications to analyze differential expression. ECprofiler is an ontology-based candidate gene search system that allows users to select an arbitrary combination of gene expression pattern and GO functional categories. DEGEST is a database of differentially expressed genes and isoforms based on the EST information. Importantly, gene expression is analyzed at three distinct levels: gene, isoform and exon. The user interfaces for functional and expression analyses have been substantially improved. ASviewer is a dedicated Java application that visualizes the transcript structure and functional features of alternatively spliced variants. The SAGE part of the expression module provides many additional features including SNP, differential expression and alternative tag positions.

    Precision Medicine and Actionable Alterations in Lung Cancer: A Single Institution Experience

    OBJECTIVES: Oncology has become more reliant on new testing methods and a greater use of electronic medical records, which provide a plethora of information available to physicians and researchers. However, to take advantage of vital clinical and research data for precision medicine, we must initially make an effort to create an infrastructure for the collection, storage, and utilization of this information with uniquely designed disease-specific registries that could support the collection of a large number of patients. MATERIALS AND METHODS: In this study, we perform an in-depth analysis of a series of lung adenocarcinoma patients (n = 415) with genomic and clinical data in a recently created thoracic patient registry. RESULTS: Of the 415 patients with lung adenocarcinoma, 59% (n = 245) were female; the median age was 64 (range, 22-92) years with a median OS of 33.29 months (95% CI, 29.77-39.48). The most common actionable alterations were identified in EGFR (n = 177/415 [42.7%]), ALK (n = 28/377 [7.4%]), and BRAF V600E (n = 7/288 [2.4%]). There was also a discernible difference in survival for 222 patients, who had an actionable alteration, with a median OS of 39.8 months as compared to 193 wild-type patients with a median OS of 26.0 months (P CONCLUSION: The use of patient registries, focused genomic panels and the appropriate use of clinical guidelines in community and academic settings may influence cohort selection for clinical trials and improve survival outcomes

    Improving Gene-finding in Chlamydomonas reinhardtii: GreenGenie2

    Background: The availability of whole-genome sequences allows for the identification of the entire set of protein coding genes as well as their regulatory regions. This can be accomplished using multiple complementary methods that include ESTs, homology searches and ab initio gene predictions. Previously, the Genie gene-finding algorithm was trained on a small set of Chlamydomonas genes and shown to improve the accuracy of gene prediction in this species compared to other available programs. To improve ab initio gene finding in Chlamydomonas, we assemble a new training set consisting of over 2,300 cDNAs by assembling over 167,000 Chlamydomonas EST entries in GenBank using the EST assembly tool PASA. Results: The prediction accuracy of our cDNA-trained gene-finder, GreenGenie2, attains 83% sensitivity and 83% specificity for exons on short-sequence predictions. We predict about 12,000 genes in the version v3 Chlamydomonas genome assembly, most of which (78%) are either identical to or significantly overlap the published catalog of Chlamydomonas genes [1]. 22% of the published catalog is absent from the GreenGenie2 predictions; there is also a fraction (23%) of GreenGenie2 predictions that are absent from the published gene catalog. Randomly chosen gene models were tested by RT-PCR and most support the GreenGenie2 predictions. Conclusion: These data suggest that training with EST assemblies is highly effective and that GreenGenie2 is a valuable, complementary tool for predicting genes in Chlamydomonas reinhardtii.
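
    The exon-level sensitivity and specificity figures quoted in this abstract can be made concrete with a small sketch. The abstract does not define the metrics, so this assumes the convention common in gene-finding evaluations: sensitivity is the fraction of annotated exons predicted exactly, and specificity is the fraction of predicted exons that are exactly correct. The exact-boundary matching rule, the function name, and the coordinates are illustrative assumptions.

```python
def exon_accuracy(predicted, annotated):
    """Return (sensitivity, specificity) at the exon level.

    Exons are (start, end) tuples; a prediction counts as correct only if
    both boundaries match an annotated exon exactly (an assumed matching rule).
    """
    predicted, annotated = set(predicted), set(annotated)
    true_positives = len(predicted & annotated)
    sensitivity = true_positives / len(annotated) if annotated else 0.0
    specificity = true_positives / len(predicted) if predicted else 0.0
    return sensitivity, specificity


# Made-up coordinates: two of three predicted exons match the four annotated exons.
annotated_exons = [(100, 200), (300, 400), (500, 600), (700, 800)]
predicted_exons = [(100, 200), (300, 400), (500, 650)]
sn, sp = exon_accuracy(predicted_exons, annotated_exons)
print(f"sensitivity={sn:.2f} specificity={sp:.2f}")
```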

    Cryptic transcripts from a ubiquitous plasmid origin of replication confound tests for cis-regulatory function.

    A vast amount of research on the regulation of gene expression has relied on plasmid reporter assays. In this study, we show that plasmids widely used for this purpose constitutively produce substantial amounts of RNA from a TATA-containing cryptic promoter within the origin of replication. Readthrough of these RNAs into the intended transcriptional unit potently stimulated reporter activity when the inserted test sequence contained a 3' splice site (ss). We show that two human sequences, originally reported to be internal ribosome entry sites and later to instead be promoters, mimic both types of element in dicistronic reporter assays by causing these cryptic readthrough transcripts to splice in patterns that allow efficient translation of the downstream cistron. Introduction of test sequences containing 3' ss into monocistronic luciferase reporter vectors widely used in the study of transcriptional regulation also created the false appearance of promoter function via the same mechanism. Across a large number of variants of these plasmids, we found a very highly significant correlation between reporter activity and levels of such spliced readthrough transcripts. Computational estimation of the frequency of cryptic 3' ss in genomic sequences suggests that misattribution of cis-regulatory function may be a common occurrence

    Spatial mapping of splicing factor complexes involved in exon and intron definition

    We have analyzed the interaction between serine/arginine-rich (SR) proteins and splicing components that recognize either the 5′ or 3′ splice site. Previously, these interactions have been extensively characterized biochemically and are critical for both intron and exon definition. We use fluorescence resonance energy transfer (FRET) microscopy to identify interactions of individual SR proteins with the U1 small nuclear ribonucleoprotein (snRNP)–associated 70-kD protein (U1 70K) and with the small subunit of the U2 snRNP auxiliary factor (U2AF35) in live-cell nuclei. We find that these interactions occur in the presence of RNA polymerase II inhibitors, demonstrating that they are not exclusively cotranscriptional. Using FRET imaging by means of fluorescence lifetime imaging microscopy (FLIM), we map these interactions to specific sites in the nucleus. The FLIM data also reveal a previously unknown interaction between HCC1, a factor related to U2AF65, and both subunits of U2AF. Spatial mapping using FLIM-FRET reveals differences in splicing factor interactions within complexes located in separate subnuclear domains.