27 research outputs found

    Local structural alignment of RNA with affine gap model

    BACKGROUND: Predicting new non-coding RNAs (ncRNAs) of a family can be done by aligning a potential candidate with a member of the family whose sequence and secondary structure are known. Existing tools either consider only sequence similarity or cannot handle local alignment with gaps. RESULTS: In this paper, we consider the problem of finding the optimal local structural alignment between a query RNA sequence (with known secondary structure) and a target sequence (with unknown secondary structure) under the affine gap penalty model. We provide an algorithm to solve the problem. CONCLUSIONS: Based on an experiment, we show that there are ncRNA families in which local structural alignment with a gap penalty model identifies real hits more effectively than global alignment or local alignment without a gap penalty model.
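    To illustrate what an affine gap penalty means in a local alignment, here is a minimal sketch in Python. It is a simplified, sequence-only version (a Gotoh-style variant of Smith-Waterman) and does not model secondary structure, so it is not the algorithm from the paper above. The function name `local_affine_score` and all scoring values are assumptions chosen for illustration.

```python
def local_affine_score(a, b, match=2, mismatch=-1, gap_open=-3, gap_extend=-1):
    """Return the best local alignment score of strings a and b.

    Affine gap model (illustrative values): a gap of length k costs
    gap_open + (k - 1) * gap_extend.
    """
    neg_inf = float("-inf")
    n, m = len(a), len(b)
    h_prev = [0] * (m + 1)        # H: best score of an alignment ending at (i-1, j)
    f_prev = [neg_inf] * (m + 1)  # F: best score ending with a gap that skips a[i-1]
    best = 0
    for i in range(1, n + 1):
        h_cur = [0] * (m + 1)
        f_cur = [neg_inf] * (m + 1)
        e = neg_inf               # E: best score ending with a gap that skips b[j-1]
        for j in range(1, m + 1):
            # Either open a new gap from H or extend an existing one.
            e = max(h_cur[j - 1] + gap_open, e + gap_extend)
            f_cur[j] = max(h_prev[j] + gap_open, f_prev[j] + gap_extend)
            s = match if a[i - 1] == b[j - 1] else mismatch
            # Local alignment: scores never drop below zero.
            h_cur[j] = max(0, h_prev[j - 1] + s, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


if __name__ == "__main__":
    # Eight matches (16) minus one gap of length 3 (-3 + 2 * -1 = -5).
    print(local_affine_score("GGGGCCCC", "GGGGAAACCCC"))
```

    Under the affine model a single long gap is cheaper than several short ones of the same total length, which is why the open and extend penalties are tracked in separate tables.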

    Identification and Characterization of Novel Genotoxic Stress-Inducible Nuclear Long Noncoding RNAs in Mammalian Cells

    Whole transcriptome analyses have revealed a large number of novel transcripts including long and short noncoding RNAs (ncRNAs). Currently, there is great interest in characterizing the functions of the different classes of ncRNAs and their relevance to cellular processes. In particular, nuclear long ncRNAs may be involved in controlling various aspects of biological regulation, such as stress responses. By a combination of bioinformatic and experimental approaches, we identified 25 novel nuclear long ncRNAs from 6,088,565 full-length human cDNA sequences. Some nuclear long ncRNAs were conserved among vertebrates, whereas others were found only among primates. Expression profiling of the nuclear long ncRNAs in human tissues revealed that most were expressed ubiquitously. A subset of the identified nuclear long ncRNAs was induced by the genotoxic agents mitomycin C or doxorubicin in HeLa Tet-off cells. No nuclear long ncRNAs were altered in common between mitomycin C- and doxorubicin-treated cells. These results suggest that distinct sets of nuclear long ncRNAs play roles in cellular defense mechanisms against specific genotoxic agents, and that particular long ncRNAs have the potential to be surrogate indicators of a specific cell stress.

    The Challenge of Regulation in a Minimal Photoautotroph: Non-Coding RNAs in Prochlorococcus

    Prochlorococcus, an extremely small cyanobacterium that is very abundant in the world's oceans, has a very streamlined genome. On average, these cells have about 2,000 genes and very few regulatory proteins. The limited capability of regulation is thought to be a result of selection imposed by a relatively stable environment in combination with a very small genome. Furthermore, only ten non-coding RNAs (ncRNAs), which play crucial regulatory roles in all forms of life, have been described in Prochlorococcus. Most strains also lack the RNA chaperone Hfq, raising the question of how important this mode of regulation is for these cells. To explore this question, we examined the transcription of intergenic regions of Prochlorococcus MED4 cells subjected to a number of different stress conditions: changes in light qualities and quantities, phage infection, or phosphorus starvation. Analysis of Affymetrix microarray expression data from intergenic regions revealed 276 novel transcriptional units. Among these were 12 new ncRNAs, 24 antisense RNAs (asRNAs), and 113 short mRNAs. Two additional ncRNAs were identified by homology, and all 14 new ncRNAs were independently verified by Northern hybridization and 5′RACE. Unlike its reduced suite of regulatory proteins, the number of ncRNAs relative to genome size in Prochlorococcus is comparable to that found in other bacteria, suggesting that RNA regulators likely play a major role in regulation in this group. Moreover, the ncRNAs are concentrated in previously identified genomic islands, which carry genes of significance to the ecology of this organism, many of which are not of cyanobacterial origin. Expression profiles of some of these ncRNAs suggest involvement in light stress adaptation and/or the response to phage infection, consistent with their location in the hypervariable genomic islands.

    Deep sequencing reveals as-yet-undiscovered small RNAs in Escherichia coli

    Background: In Escherichia coli, approximately 100 regulatory small RNAs (sRNAs) have been identified experimentally and many more have been predicted by various methods. To provide a comprehensive overview of sRNAs, we analysed the low-molecular-weight RNAs (< 200 nt) of E. coli with deep sequencing, because the regulatory RNAs in bacteria are usually 50-200 nt in length. Results: We discovered 229 novel candidate sRNAs (≥ 50 nt) with computational or experimental evidence of transcription initiation. Among them, the expression of seven intergenic sRNAs and three cis-antisense sRNAs was detected by northern blot analysis. Interestingly, five novel sRNAs are expressed from prophage regions and we note that these sRNAs have several specific characteristics. Furthermore, we conducted an evolutionary conservation analysis of the candidate sRNAs and summarised the data among closely related bacterial strains. Conclusions: This comprehensive screen for E. coli sRNAs using a deep sequencing approach has shown that many as-yet-undiscovered sRNAs are potentially encoded in the E. coli genome. We constructed the Escherichia coli Small RNA Browser (ECSBrowser; http://rna.iab.keio.ac.jp/), which integrates the data for previously identified sRNAs and the novel sRNAs found in this study.

    Differentiating Protein-Coding and Noncoding RNA: Challenges and Ambiguities

    The assumption that RNA can be readily classified into either protein-coding or non-protein–coding categories has pervaded biology for close to 50 years. Until recently, discrimination between these two categories was relatively straightforward: most transcripts were clearly identifiable as protein-coding messenger RNAs (mRNAs), and readily distinguished from the small number of well-characterized non-protein–coding RNAs (ncRNAs), such as transfer, ribosomal, and spliceosomal RNAs. Recent genome-wide studies have revealed the existence of thousands of noncoding transcripts, whose function and significance are unclear. The discovery of this hidden transcriptome and the implicit challenge it presents to our understanding of the expression and regulation of genetic information has made the need to distinguish between mRNAs and ncRNAs both more pressing and more complicated. In this Review, we consider the diverse strategies employed to discriminate between protein-coding and noncoding transcripts and the fundamental difficulties that are inherent in what may superficially appear to be a simple problem. Misannotations can also run in both directions: some ncRNAs may actually encode peptides, and some of those currently thought to do so may not. Moreover, recent studies have shown that some RNAs can function both as mRNAs and intrinsically as functional ncRNAs, which may be a relatively widespread phenomenon. We conclude that it is difficult to annotate an RNA unequivocally as protein-coding or noncoding, with overlapping protein-coding and noncoding transcripts further confounding this distinction. In addition, the finding that some transcripts can function both intrinsically at the RNA level and to encode proteins suggests a false dichotomy between mRNAs and ncRNAs. Therefore, the functionality of any transcript at the RNA level should not be discounted.

    Characterizing the Syphilis-Causing Treponema pallidum ssp. pallidum Proteome Using Complementary Mass Spectrometry

    Background. The spirochete bacterium Treponema pallidum ssp. pallidum is the etiological agent of syphilis, a chronic multistage disease. Little is known about the global T. pallidum proteome; therefore, mass spectrometry studies are needed to bring insights into pathogenicity and protein expression profiles during infection. Methodology/Principal Findings. To better understand the T. pallidum proteome profile during infection, we studied T. pallidum ssp. pallidum DAL-1 strain bacteria isolated from rabbits using complementary mass spectrometry techniques, including multidimensional peptide separation and protein identification via matrix-assisted laser desorption ionization-time of flight (MALDI-TOF/TOF) and electrospray ionization (ESI-LTQ-Orbitrap) tandem mass spectrometry. A total of 6033 peptides were detected, corresponding to 557 unique T. pallidum proteins at a high level of confidence, representing 54% of the predicted proteome. A previous gel-based T. pallidum MS proteome study detected 58 of these proteins. One hundred fourteen of the detected proteins were previously annotated as hypothetical or uncharacterized proteins; this is the first account of 106 of these proteins at the protein level. Detected proteins were characterized according to their predicted biological function and localization; half were allocated into a wide range of functional categories. Proteins annotated as potential membrane proteins and proteins with unclear functional annotations were subjected to an additional bioinformatics pipeline analysis to facilitate further characterization. A total of 116 potential membrane proteins were identified, of which 16 have evidence supporting outer membrane localization. We found 8/12 proteins related to the paralogous tpr gene family: TprB, TprC/D, TprE, TprG, TprH, TprI and TprJ. Protein abundance was semi-quantified using label-free spectral counting methods. A low correlation (r = 0.26) was found between previous microarray signal data and protein abundance. Conclusions. This is the most comprehensive description of the global T. pallidum proteome to date. These data provide valuable insights into in vivo T. pallidum protein expression, paving the way for improved understanding of the pathogenicity of this enigmatic organism. This work was supported by grants from the Flanders Research Foundation (SOFI-B Grant to CRK, http://www.fwo.be/), a Public Health Service Grant from the National Institutes of Health to CEC (grant # AI-051334, https://www.nih.gov/), and a grant from the Grant Agency of the Czech Republic to DS and MS (P302/12/0574, GP14-29596P, https://gacr.cz/).