53 research outputs found

    Untranslated regions of mRNAs

    Gene expression is finely regulated at the post-transcriptional level. Features of the untranslated regions of mRNAs that control their translation, degradation and localization include stem-loop structures, upstream initiation codons and open reading frames, internal ribosome entry sites and various cis-acting elements that are bound by RNA-binding proteins

    Regularized Least Squares Cancer Classifiers from DNA microarray data

    BACKGROUND: The advent of DNA microarray technology marks an epochal change in the classification and discovery of different types of cancer, because the information provided by DNA microarrays allows cancer analysis to be approached from a quantitative rather than a qualitative point of view. Cancer classification requires well-founded mathematical methods that can predict the status of new specimens with high significance levels starting from a limited amount of data. In this paper we assess the performance of Regularized Least Squares (RLS) classifiers, originally proposed in regularization theory, by comparing them with Support Vector Machines (SVM), the state-of-the-art supervised learning technique for cancer classification from DNA microarray data. The performance of both approaches has also been investigated with respect to the number of selected genes and different gene selection strategies. RESULTS: We show that RLS classifiers perform comparably to SVM classifiers, as the Leave-One-Out (LOO) error evaluated on three different data sets shows. The main advantage of RLS machines is that, to solve a classification problem, they use a linear system of order equal to either the number of features or the number of training examples. Moreover, RLS machines yield an exact measure of the LOO error with just one training run. CONCLUSION: RLS classifiers are a valuable alternative to SVM classifiers for the problem of cancer classification from gene expression data, owing to their simplicity and low computational complexity. Moreover, RLS classifiers show generalization ability comparable to that of SVM classifiers even when the classification of new specimens involves very few gene expression levels.
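
    The two properties highlighted in the abstract above (a single linear system, and an exact LOO error from one training run) can be illustrated with a minimal sketch. It is not the authors' code. It assumes a linear kernel, no intercept term, labels coded as +1/-1, and the system (X Xᵀ + λI) c = y, whose order equals the number of training examples; the regularization parameter `lam` is not scaled by the sample size here, which is also an assumption. The function names `invert`, `rls_fit_loo` and `loo_error` are hypothetical. The LOO shortcut used is the standard identity y_i − f₋ᵢ(x_i) = c_i / (G⁻¹)_ii with G = X Xᵀ + λI. Pure Python is used only to keep the sketch dependency-free; in practice a linear algebra library would do the solve.

```python
def invert(A):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    # Augment A with the identity matrix: [A | I]
    M = [list(map(float, row)) + [float(i == j) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]


def rls_fit_loo(X, y, lam):
    """Train a linear RLS classifier once; return weights and exact LOO predictions.

    X: list of n samples, each a list of d feature values (e.g. gene expression levels)
    y: list of n labels in {+1.0, -1.0}
    lam: regularization parameter (> 0)
    """
    n, d = len(X), len(X[0])
    # Regularized Gram matrix G = X X^T + lam * I, of order n (number of examples)
    G = [[sum(a * b for a, b in zip(X[i], X[j])) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    G_inv = invert(G)
    # Dual coefficients c solve the linear system G c = y
    c = [sum(G_inv[i][j] * y[j] for j in range(n)) for i in range(n)]
    # Weight vector of the linear decision function f(x) = w . x
    w = [sum(c[i] * X[i][k] for i in range(n)) for k in range(d)]
    # Exact LOO prediction for sample i, without retraining:
    # f_{-i}(x_i) = y_i - c_i / (G^{-1})_{ii}
    loo_pred = [y[i] - c[i] / G_inv[i][i] for i in range(n)]
    return w, loo_pred


def loo_error(y, loo_pred):
    """Fraction of samples whose LOO prediction has the wrong sign."""
    wrong = sum(1 for yi, pi in zip(y, loo_pred) if yi * pi <= 0)
    return wrong / len(y)
```

    Calling `rls_fit_loo(X, y, lam)` once and passing the returned `loo_pred` to `loo_error` gives the LOO error rate for that value of `lam`. When there are fewer features than samples, the equivalent system (XᵀX + λI) w = Xᵀy, of order equal to the number of features, could be solved instead.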

    A fuzzy method for RNA-Seq differential expression analysis in presence of multireads

    Background: When the reads obtained from high-throughput RNA sequencing are mapped against a reference database, a significant proportion of them, known as multireads, can map to more than one reference sequence. These multireads originate from gene duplications, repetitive regions or overlapping genes. Removing the multireads from the mapping results in RNA-Seq analyses causes an underestimation of the read counts, while estimating the real read count can lead to false positives during the detection of differentially expressed sequences. Results: We present an innovative approach to deal with multireads and evaluate differential expression events, entirely based on fuzzy set theory. Since multireads cause uncertainty in the estimation of read counts during gene expression computation, they can also influence the reliability of differential expression analysis results by producing false positives. Our method manages the uncertainty in gene expression estimation by defining fuzzy read counts, and evaluates the possibility that a gene is differentially expressed with three fuzzy concepts: over-expression, same-expression and under-expression. The output of the method is a list of differentially expressed genes enriched with information about the uncertainty of the results due to the presence of multireads. We have tested the method on RNA-Seq data designed for case-control studies and compared the results with those of other existing tools for read count estimation and differential expression analysis. Conclusions: Managing multireads with fuzzy sets yields a list of differential expression events that takes into account the uncertainty in the results caused by the presence of multireads. Such additional information can be used by biologists when they have to select the most relevant differential expression events to validate with laboratory assays. Our method can be used to compute reliable differential expression events and to highlight possible false positives in the lists of differentially expressed genes computed with other tools.

    UTRdb and UTRsite: a collection of sequences and regulatory motifs of the untranslated regions of eukaryotic mRNAs

    The 5′ and 3′ untranslated regions of eukaryotic mRNAs play crucial roles in the post-transcriptional regulation of gene expression through the modulation of nucleo-cytoplasmic mRNA transport, translation efficiency, subcellular localization and message stability. UTRdb is a curated database of 5′ and 3′ untranslated sequences of eukaryotic mRNAs, derived from several sources of primary data. Experimentally validated functional motifs are annotated (and also collated as the UTRsite database) and cross-links to genomic and protein data are provided. The integration of UTRdb with genomic and protein data has allowed the implementation of a powerful retrieval resource for the selection and extraction of UTR subsets based on their genomic coordinates and/or features of the protein encoded by the relevant mRNA (e.g. GO term, PFAM domain, etc.). All internet resources implemented for retrieval and functional analysis of 5′ and 3′ untranslated regions of eukaryotic mRNAs are accessible at http://www.ba.itb.cnr.it/UTR/

    WoPPER: Web server for Position Related data analysis of gene Expression in Prokaryotes

    The structural and conformational organization of chromosomes is crucial for the regulation of gene expression in both eukaryotes and prokaryotes. To date, gene expression data generated using either microarrays or RNA sequencing are available for many bacterial genomes. However, differential gene expression is usually investigated with methods that consider each gene independently and thus do not take into account the physical localization of genes along a bacterial chromosome. Here, we present WoPPER, a web tool integrating gene expression and genomic annotations to identify differentially expressed chromosomal regions in bacteria. RNA-sequencing or microarray-based gene expression data are provided as input, along with gene annotations. The user can select genomic annotations from an internal database including 2780 bacterial strains, or provide custom genomic annotations. The analysis produces as output lists of positionally related genes showing a coordinated trend of differential expression. Graphical representations, including a circular plot of the analyzed chromosome, allow intuitive browsing of the results. The analysis procedure is based on our previously published R package PREDA. The release of this tool is timely and relevant for the scientific community, as WoPPER will fill an existing gap in prokaryotic gene expression data analysis and visualization tools. WoPPER is open to all users and can be reached at the following URL: https://WoPPER.ba.itb.cnr.it

    BEAT: Bioinformatics Exon Array Tool to store, analyze and visualize Affymetrix GeneChip Human Exon Array data from disease experiments

    Background: It is known from recent studies that more than 90% of human multi-exon genes are subject to Alternative Splicing (AS), a key molecular mechanism in which multiple transcripts may be generated from a single gene. It is widely recognized that a breakdown in AS mechanisms plays an important role in cellular differentiation and pathologies. Polymerase Chain Reactions, microarrays and sequencing technologies have been applied to the study of transcript diversity arising from alternative expression. Last generation Affymetrix GeneChip Human Exon 1.0 ST Arrays offer a more detailed view of the gene expression profile providing information on the AS patterns. The exon array technology, with more than five million data points, can detect approximately one million exons, and it allows performing analyses at both gene and exon level. In this paper we describe BEAT, an integrated user-friendly bioinformatics framework to store, analyze and visualize exon arrays datasets. It combines a data warehouse approach with some rigorous statistical methods for assessing the AS of genes involved in diseases. Meta statistics are proposed as a novel approach to explore the analysis results. BEAT is available at http://beat.ba.itb.cnr.it. Results: BEAT is a web tool which allows uploading and analyzing exon array datasets using standard statistical methods and an easy-to-use graphical web front-end. BEAT has been tested on a dataset with 173 samples and tuned using new datasets of exon array experiments from 28 colorectal cancer and 26 renal cell cancer samples produced at the Medical Genetics Unit of IRCCS Casa Sollievo della Sofferenza. To highlight all possible AS events, alternative names, accession Ids, Gene Ontology terms and biochemical pathways annotations are integrated with exon and gene level expression plots. The user can customize the results choosing custom thresholds for the statistical parameters and exploiting the available clinical data of the samples for a multivariate AS analysis. Conclusions: Despite exon array chips being widely used for transcriptomics studies, there is a lack of analysis tools offering advanced statistical features and requiring no programming knowledge. BEAT provides a user-friendly platform for a comprehensive study of AS events in human diseases, displaying the analysis results with easily interpretable and interactive tables and graphics.

    InteractomeSeq: a web server for the identification and profiling of domains and epitopes from phage display and next generation sequencing data

    High-Throughput Sequencing technologies are transforming many research fields, including the analysis of phage display libraries. The phage display technology coupled with deep sequencing was introduced more than a decade ago and holds the potential to circumvent the traditional laborious picking and testing of individual phage rescued clones. However, from a bioinformatics point of view, the analysis of this kind of data was always performed by adapting tools designed for other purposes, thus not considering the noise background typical of the 'interactome sequencing' approach and the heterogeneity of the data. InteractomeSeq is a web server allowing data analysis of protein domains ('domainome') or epitopes ('epitome') from either Eukaryotic or Prokaryotic genomic phage libraries generated and selected by following an Interactome sequencing approach. InteractomeSeq allows users to upload raw sequencing data and to obtain an accurate characterization of domainome/epitome profiles after setting the parameters required to tune the analysis. The release of this tool is relevant for the scientific and clinical community, because InteractomeSeq will fill an existing gap in the field of large-scale biomarkers profiling, reverse vaccinology, and structural/functional studies, thus contributing essential information for gene annotation or antigen identification. InteractomeSeq is freely available at https://InteractomeSeq.ba.itb.cnr.it/

    Dakwah Dalam Membangun Etika Kerukunan Hidup Umat Beragama

    Indonesia, with its many ethnicities, cultures and religions, requires normative values to regulate social relations. Normative values on religion, social values and mores that have been prevailing in the community are the basis for building a culture of inter-religious relations. Pancasila and the 1945 Constitution are a collective agreement for regulating society and the state. National unity took place in the community where the values of religion have been lived and practiced. Religious diversity has become a historical reality, and religious communities have been living side by side. Today, Da'wah activities that build awareness in the practice of religion among other religious communities are needed. Therefore, the practice of "inclusive religion" should be applied.

    A platform independent RNA-Seq protocol for the detection of transcriptome complexity

    Background: Recent studies have demonstrated an unexpected complexity of transcription in eukaryotes. The majority of the genome is transcribed and only a small fraction of these transcripts is annotated as protein-coding genes and their splice variants. Indeed, most transcripts are the result of antisense, overlapping and non-coding RNA expression. In this context, one of the key aims of high-throughput transcriptome sequencing is the detection of all RNA species present in the cell, and the first crucial step for RNA-seq users is the choice of the strategy for cDNA library construction. The protocols developed so far entail using the entire library for a single sequencing run on a specific platform. Results: We set up a unique protocol, which can be implemented with all major platforms currently available on the market (Roche 454, Illumina, ABI/SOLiD), to generate and amplify a strand-specific cDNA library representative of all RNA species. Our method is reproducible, fast, easy to perform and even allows starting from low-input total RNA. Furthermore, we provide a suitable bioinformatics tool for the analysis of the sequences produced following this protocol. Conclusion: We tested the efficiency of our strategy, showing that our method is platform-independent, thus allowing the simultaneous analysis of the same sample with different NGS technologies and providing an accurate quantitative and qualitative portrait of complex whole transcriptomes.

    p53FamTaG: a database resource of human p53, p63 and p73 direct target genes combining in silico prediction and microarray data

    Background: The p53 gene family consists of the three genes p53, p63 and p73, which have polyhedral non-overlapping functions in pivotal cellular processes such as DNA synthesis and repair, growth arrest, apoptosis, genome stability, angiogenesis, development and differentiation. These genes encode sequence-specific nuclear transcription factors that recognise the same responsive element (RE) in their target genes. Their inactivation or aberrant expression may determine tumour progression or developmental disease. The discovery of several protein isoforms with antagonistic roles, which are produced by the expression of different promoters and alternative splicing, widened the complexity of the scenario of the transcriptional network of the p53 family members. Therefore, the identification of the genes transactivated by p53 family members is crucial to understand the specific role for each gene in cell cycle regulation. We have combined a genome-wide computational search of p53 family REs and microarray analysis to identify new direct target genes. The huge amount of biological data produced has generated a critical need for bioinformatic tools able to manage and integrate such data and facilitate their retrieval and analysis. Description: We have developed the p53FamTaG database (p53 FAMily TArget Genes), a modular relational database, which contains p53 family direct target genes selected in the human genome searching for the presence of the REs and the expression profile of these target genes obtained by microarray experiments. p53FamTaG database also contains annotations of publicly available databases and links to other experimental data. The genome-wide computational search of the REs was performed using PatSearch, a pattern-matching program implemented in the DNAfan tool. These data were integrated with the microarray results we produced from the overexpression of different isoforms of p53, p63 and p73 stably transfected in isogenic cell lines, allowing the comparative study of the transcriptional activity of all the proteins in the same cellular background. p53FamTaG database is available free at http://www2.ba.itb.cnr.it/p53FamTaG/ Conclusion: p53FamTaG represents a unique integrated resource of human direct p53 family target genes that is extensively annotated and provides the users with an efficient query/retrieval system which displays the results of our microarray experiments and allows the export of RE sequences. The database was developed for supporting and integrating high-throughput in silico and experimental analyses and represents an important reference source of knowledge for research groups involved in the field of oncogenesis, apoptosis and cell cycle regulation.