171 research outputs found

    Using ESTs to improve the accuracy of de novo gene prediction

    BACKGROUND: ESTs are a tremendous resource for determining the exon-intron structures of genes, but even extensive EST sequencing tends to leave many exons and genes untouched. Gene prediction systems based exclusively on EST alignments miss these exons and genes, leading to poor sensitivity. De novo gene prediction systems, which ignore ESTs in favor of genomic sequence, can predict such "untouched" exons, but they are less accurate when predicting exons to which ESTs align. TWINSCAN is the most accurate de novo gene finder available for nematodes and N-SCAN is the most accurate for mammals, as measured by exact CDS gene prediction and exact exon prediction. RESULTS: TWINSCAN_EST is a new system that successfully combines EST alignments with TWINSCAN. On the whole C. elegans genome TWINSCAN_EST shows 14% improvement in sensitivity and 13% in specificity in predicting exact gene structures compared to TWINSCAN without EST alignments. Not only are the structures revealed by EST alignments predicted correctly, but these also constrain the predictions without alignments, improving their accuracy. For the human genome, we used the same approach with N-SCAN, creating N-SCAN_EST. On the whole genome, N-SCAN_EST produced a 6% improvement in sensitivity and 1% in specificity of exact gene structure predictions compared to N-SCAN. CONCLUSION: TWINSCAN_EST and N-SCAN_EST are more accurate than TWINSCAN and N-SCAN, while retaining their ability to discover novel genes to which no ESTs align. Thus, we recommend using the EST versions of these programs to annotate any genome for which EST information is available. TWINSCAN_EST and N-SCAN_EST are part of the TWINSCAN open source software package

    A General Definition and Nomenclature for Alternative Splicing Events

    Understanding the molecular mechanisms responsible for the regulation of the transcriptome present in eukaryotic cells is one of the most challenging tasks in the postgenomic era. In this regard, alternative splicing (AS) is a key phenomenon contributing to the production of different mature transcripts from the same primary RNA sequence. As a plethora of different transcript forms is available in databases, a first step to uncover the biology that drives AS is to identify the different types of reflected splicing variation. In this work, we present a general definition of the AS event along with a notation system that involves the relative positions of the splice sites. This nomenclature univocally and dynamically assigns a specific “AS code” to every possible pattern of splicing variation. On the basis of this definition and the corresponding codes, we have developed a computational tool (AStalavista) that automatically characterizes the complete landscape of AS events in a given transcript annotation of a genome, thus providing a platform to investigate the transcriptome diversity across genes, chromosomes, and species. Our analysis reveals that a substantial part—in human more than a quarter—of the observed splicing variations are ignored in common classification pipelines. We have used AStalavista to investigate and to compare the AS landscape of different reference annotation sets in human and in other metazoan species and found that proportions of AS events change substantially depending on the annotation protocol, species-specific attributes, and coding constraints acting on the transcripts. The AStalavista system therefore provides a general framework to conduct specific studies investigating the occurrence, impact, and regulation of AS

    Evidence for Transcript Networks Composed of Chimeric RNAs in Human Cells

    The classic organization of a gene structure has followed the Jacob and Monod bacterial gene model proposed more than 50 years ago. Since then, empirical determinations of the complexity of the transcriptomes found in yeast to human have blurred the definition and physical boundaries of genes. Using multiple analysis approaches we have characterized individual gene boundaries mapping on human chromosomes 21 and 22. Analyses of the locations of the 5′ and 3′ transcriptional termini of 492 protein-coding genes revealed that for 85% of these genes the boundaries extend beyond the current annotated termini, most often connecting with exons of transcripts from other well-annotated genes. The biological and evolutionary importance of these chimeric transcripts is underscored by (1) the non-random interconnections of genes involved, (2) the greater phylogenetic depth of the genes involved in many chimeric interactions, (3) the coordination of the expression of connected genes and (4) the close in vivo and three-dimensional proximity of the genomic regions being transcribed and contributing to parts of the chimeric RNAs. The non-random nature of the connection of the genes involved suggests that chimeric transcripts should not be studied in isolation, but together, as an RNA network

    Neuromuscular Consequences of an Extreme Mountain Ultra-Marathon

    We investigated the physiological consequences of one of the most extreme exercises realized by humans in race conditions: a 166-km mountain ultra-marathon (MUM) with 9500 m of positive and negative elevation change. For this purpose, (i) the fatigue induced by the MUM and (ii) the recovery processes over two weeks were assessed. Evaluation of neuromuscular function (NMF) and blood markers of muscle damage and inflammation were performed before and immediately following (n = 22), and 2, 5, 9 and 16 days after the MUM (n = 11) in experienced ultra-marathon runners. Large maximal voluntary contraction decreases occurred after MUM (−35% [95% CI: −28 to −42%] and −39% [95% CI: −32 to −46%] for KE and PF, respectively), with alteration of maximal voluntary activation, mainly for KE (−19% [95% CI: −7 to −32%]). Significant modifications in markers of muscle damage and inflammation were observed after the MUM as suggested by the large changes in creatine kinase (from 144±94 to 13,633±12,626 UI L−1), myoglobin (from 32±22 to 1,432±1,209 µg L−1), and C-Reactive Protein (from <2.0 to 37.7±26.5 mg L−1). Moderate to large reductions in maximal compound muscle action potential amplitude, high-frequency doublet force, and low frequency fatigue (index of excitation-contraction coupling alteration) were also observed for both muscle groups. Sixteen days after MUM, NMF had returned to initial values, with most of the recovery process occurring within 9 days of the race. These findings suggest that the large alterations in NMF after an ultra-marathon race are multi-factorial, including failure of excitation-contraction coupling, which has never been described after prolonged running. It is also concluded that as early as two weeks after such an extreme running exercise, maximal force capacities have returned to baseline

    The Major Antigenic Membrane Protein of “Candidatus Phytoplasma asteris” Selectively Interacts with ATP Synthase and Actin of Leafhopper Vectors

    Phytoplasmas, uncultivable phloem-limited phytopathogenic wall-less bacteria, represent a major threat to agriculture worldwide. They are transmitted in a persistent, propagative manner by phloem-sucking Hemipteran insects. Phytoplasma membrane proteins are in direct contact with hosts and are presumably involved in determining vector specificity. Such a role has been proposed for phytoplasma transmembrane proteins encoded by circular extrachromosomal elements, at least one of which is a plasmid. Little is known about the interactions between major phytoplasma antigenic membrane protein (Amp) and insect vector proteins. The aims of our work were to identify vector proteins interacting with Amp and to investigate their role in transmission specificity. In controlled transmission experiments, four Hemipteran species were identified as vectors of “Candidatus Phytoplasma asteris”, the chrysanthemum yellows phytoplasmas (CYP) strain, and three others as non-vectors. Interactions between a labelled (recombinant) CYP Amp and insect proteins were analysed by far Western blots and affinity chromatography. Amp interacted specifically with a few proteins from vector species only. Among Amp-binding vector proteins, actin and both the α and β subunits of ATP synthase were identified by mass spectrometry and Western blots. Immunofluorescence confocal microscopy and Western blots of plasma membrane and mitochondrial fractions confirmed the localisation of ATP synthase, generally known as a mitochondrial protein, in plasma membranes of midgut and salivary gland cells in the vector Euscelidius variegatus. The vector-specific interaction between phytoplasma Amp and insect ATP synthase is demonstrated for the first time, and this work also supports the hypothesis that host actin is involved in the internalization and intracellular motility of phytoplasmas within their vectors. Phytoplasma Amp is hypothesized to play a crucial role in insect transmission specificity

    The genome of the seagrass Zostera marina reveals angiosperm adaptation to the sea

    Seagrasses colonized the sea(1) on at least three independent occasions to form the basis of one of the most productive and widespread coastal ecosystems on the planet(2). Here we report the genome of Zostera marina (L.), the first, to our knowledge, marine angiosperm to be fully sequenced. This reveals unique insights into the genomic losses and gains involved in achieving the structural and physiological adaptations required for its marine lifestyle, arguably the most severe habitat shift ever accomplished by flowering plants. Key angiosperm innovations that were lost include the entire repertoire of stomatal genes(3), genes involved in the synthesis of terpenoids and ethylene signalling, and genes for ultraviolet protection and phytochromes for far-red sensing. Seagrasses have also regained functions enabling them to adjust to full salinity. Their cell walls contain all of the polysaccharides typical of land plants, but also contain polyanionic, low-methylated pectins and sulfated galactans, a feature shared with the cell walls of all macroalgae(4) and that is important for ion homoeostasis, nutrient uptake and O2/CO2 exchange through leaf epidermal cells. The Z. marina genome resource will markedly advance a wide range of functional ecological studies from adaptation of marine ecosystems under climate warming(5,6), to unravelling the mechanisms of osmoregulation under high salinities that may further inform our understanding of the evolution of salt tolerance in crop plants(7)

    Risk to plant health of Flavescence dorée for the EU territory

    Following a request from the European Commission, the EFSA Panel on Plant Health (PLH) performed a quantitative analysis of the risk posed by the Flavescence dorée phytoplasma (FDp) in the EU territory. Three scenarios were analysed, one with current measures in place (scenario A0), one designed to improve grapevine propagation material phytosanitary status (scenario A1) and one with reinforced eradication and containment (scenario A2). The potential for entry is limited, FDp being almost non-existent outside the EU. FDp and its major vector, Scaphoideus titanus, have already established over large parts of the EU and have the potential to establish in a large fraction of the currently unaffected EU territory. With the current measures in place (A0), spread of FDp is predicted to continue with a progression of between a few and ca 20 newly infested NUTS 2 regions during the next 10 years, illustrating the limitations of the current control measures against spread. FDp spread is predicted to be roughly similar between scenarios A1 and A2, but more restricted than under scenario A0. However, even with reinforced control scenarios, stabilisation or reduction in the number of infested NUTS 2 regions has only relatively low probability. Under scenario A0, FDp has a 0.5–1% impact on the overall EU grapes and wine production, reflecting the effectiveness of the current control measures against impact. Under both scenarios A1 and A2, FDp impact is predicted to be reduced, by approximately one-third (A1) to two-thirds (A2) as compared to A0, but the associated uncertainties are large. The generalised use of hot water treatment for planting material produced in infected zones has the most important contribution to FDp impact reduction in scenario A1 and has high feasibility. Both increased eradication and containment measures contribute to impact reduction under scenario A2 but the overall feasibility is lower