71 research outputs found

    IsomiR_Window : a system for analyzing small‑RNA‑seq data in an integrative and user‑friendly manner

    Research Areas: Biochemistry & Molecular Biology; Biotechnology & Applied Microbiology; Mathematical & Computational Biology. Background: IsomiRs are miRNA variants that differ in length and/or sequence from their canonical forms. The differences include additions or deletions of one or more nucleotides (nts) at the 5′ and/or 3′ end, internal edits, and untemplated 3′ end additions. Most available tools for small RNA-seq data analysis do not allow the identification of isomiRs and often require advanced knowledge of bioinformatics. To overcome this, we have developed IsomiR Window, a platform that supports the systematic identification, quantification and functional exploration of isomiR expression in small RNA-seq datasets, accessible to users with no computational skills. Methods: IsomiR Window enables the discovery of isomiRs and the identification of all annotated non-coding RNAs in RNA-seq datasets from animals and plants. It comprises two main components: the IsomiR Window pipeline for data processing, and the IsomiR Window Browser interface. It integrates more than ten third-party software tools for the analysis of small-RNA-seq data and includes a new algorithm that allows the detection of all possible types of isomiRs. These include 3′ and 5′ end isomiRs, 3′ end tailings, isomiRs with single nucleotide polymorphisms (SNPs) or potential RNA editing events, as well as all possible fuzzy combinations. IsomiR Window includes all databases required for analysis and annotation, and is freely distributed as a Linux virtual machine with all required software. Results: IsomiR Window processes several datasets in an automated manner, without restrictions on input file size. It generates high-quality interactive figures and tables that can be exported into different formats. The performance of isomiR detection and quantification was assessed using simulated small-RNA-seq data.
For correctly mapped reads, it identified different types of isomiRs with high confidence and 100% accuracy. The analysis of a small RNA-seq dataset from Basal Cell Carcinomas (BCCs) using IsomiR Window confirmed that miR-183-5p is up-regulated in Nodular BCCs, but revealed that this effect was predominantly due to a novel 5′ end variant. This variant displays a different seed region motif and 1756 isoform-exclusive mRNA targets that are significantly associated with disease pathways, underscoring the biological relevance of isomiR-focused analysis. IsomiR Window is available at https://isomir.fc.ul.pt/.
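
The isomiR categories named in this abstract (5′ and 3′ end variants, untemplated 3′ tailing, internal substitutions, and their combinations) can be illustrated with a minimal sketch. This is not the IsomiR Window algorithm, which the abstract does not describe; the function name `classify_isomir`, the label strings, the coordinate convention, and the precursor flanking bases are all assumptions made for illustration. It assumes the read has already been aligned to its precursor without indels.

```python
def classify_isomir(read, precursor, read_start, canon_start, canon_end):
    """Label a read relative to the canonical miRNA on its precursor.

    Hypothetical convention: coordinates are 0-based on the precursor,
    canon_end is exclusive, and read_start is where the read's first base aligns.
    """
    read_end = read_start + len(read)

    # Count trailing read bases that disagree with the precursor (or run past it).
    # This sketch treats them as an untemplated 3' tail.
    tail = 0
    while tail < len(read):
        pos = read_end - 1 - tail
        if pos < len(precursor) and read[-1 - tail] == precursor[pos]:
            break
        tail += 1
    body = read[:len(read) - tail]          # templated part of the read
    templated_end = read_start + len(body)

    labels = []
    shift5 = read_start - canon_start       # negative: starts upstream of canonical
    if shift5 < 0:
        labels.append(f"5p_extended_{-shift5}")
    elif shift5 > 0:
        labels.append(f"5p_trimmed_{shift5}")

    shift3 = templated_end - canon_end      # positive: ends downstream of canonical
    if shift3 > 0:
        labels.append(f"3p_extended_{shift3}")
    elif shift3 < 0:
        labels.append(f"3p_trimmed_{-shift3}")

    if tail:
        labels.append(f"3p_tail_{read[len(body):]}")

    # Internal mismatches against the precursor (SNP or potential editing site).
    mismatches = sum(1 for i, base in enumerate(body)
                     if base != precursor[read_start + i])
    if mismatches:
        labels.append(f"substitutions_{mismatches}")

    return labels or ["canonical"]


# Mature miR-183-5p sequence (as DNA) with made-up flanking bases.
PRECURSOR = "GGCTATGGCACTGGTAGAATTCACTGCC"
CANON_START, CANON_END = 3, 25

if __name__ == "__main__":
    five_prime_variant = PRECURSOR[4:25]    # starts one base later than canonical
    print(classify_isomir(five_prime_variant, PRECURSOR, 4, CANON_START, CANON_END))
```

One limitation of this simplification: an untemplated tail base that happens to match the precursor is counted as a templated 3′ extension, so real tools need additional evidence to separate the two cases.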

    Visualization of the small RNA transcriptome using seqclusterViz

    The study of small RNAs provides us with a deeper understanding of the complexity of gene regulation within cells. Of the different types of small RNAs, the most important in mammals are miRNAs, tRNA fragments and piRNAs. Using small RNA-seq analysis, we can study all small RNA types simultaneously, with the potential to detect novel small RNA types. We describe SeqclusterViz, an interactive HTML/JavaScript webpage for visualizing small noncoding RNAs (small RNAs) detected by Seqcluster. The SeqclusterViz tool allows users to visualize known and novel small RNA types in model or non-model organisms, and to select small RNA candidates for further validation. SeqclusterViz is divided into three panels: i) query-ready tables showing detected small RNA clusters and their genomic locations, ii) the expression profile over the precursor for all the samples together with RNA secondary structures, and iii) the most highly expressed sequences. Here, we show the capabilities of the visualization tool and its validation using human brain samples from patients with Parkinson's disease.

    HumiR: Web Services, Tools and Databases for Exploring Human microRNA Data

    Computational tools and databases are being developed for many aspects of research on small non-coding RNAs, especially microRNAs. These include the quantification of miRNAs, piRNAs, tRNAs and tRNA fragments, circRNAs and others. Furthermore, the prediction of new miRNAs, isomiRs and arm switch events, target and target pathway prediction, and miRNA pathway enrichment are common tasks. Additionally, databases and resources containing expression profiles, e.g., from different tissues, organs or cell types, are generated. This information in turn leads to improved miRNA repositories. While most of the respective tools are implemented in a species-independent manner, we focused on tools for human small non-coding RNAs. This includes four aspects: (1) miRNA analysis tools, (2) databases on miRNAs and variations thereof, (3) databases on expression profiles, and (4) miRNA helper tools facilitating frequent tasks such as naming conversion or reporter assay design. Although dependencies between the tools exist and several tools are jointly used in studies, their interoperability is limited. We present HumiR, a joint web presence for our tools. HumiR facilitates entry into the world of miRNA research, supports the selection of the right tool for a research task and represents the very first step towards a fully integrated knowledge base for human small non-coding RNA research. We demonstrate the utility of HumiR by performing a comprehensive analysis of Alzheimer’s miRNAs.

    miFRame: analysis and visualization of miRNA sequencing data in neurological disorders

    Background: While in the past decades nucleic acid analysis has been predominantly carried out using quantitative low- and high-throughput approaches such as qRT-PCR and microarray technology, next-generation sequencing (NGS) with its single-base resolution is now frequently applied in DNA and RNA testing. Especially for small non-coding RNAs such as microRNAs, there is a need for analysis and visualization tools that make the results easier to interpret, including for clinicians. Methods: We developed miFRame, which supports the analysis of human small RNA NGS data. Our tool carries out different data analyses for known as well as predicted novel mature microRNAs from known precursors and presents the results in a readily interpretable manner. Analyses include, among others, expression analysis of precursors and mature miRNAs, detection of novel precursors and detection of potential iso-microRNAs. Moreover, aggregating results from different users makes it possible to evaluate whether remarkable results, such as novel mature miRNAs, are indeed specific to the respective experimental set-up or are frequently detected across a broad range of experiments. Results: We demonstrate the capabilities of miFRame, which is freely available at http://www.ccb.uni-saarland.de/miframe, on two studies: circulating biomarker screening for Multiple Sclerosis (cohort includes clinically isolated syndrome, relapsing-remitting MS, matched controls) and for Alzheimer Disease (cohort includes Alzheimer Disease, Mild Cognitive Impairment, matched controls). Here, our tool allowed for improved biomarker discovery by identifying likely false-positive marker candidates.

    Automated analysis of small RNA datasets with RAPID

    Understanding the role of short-interfering RNA (siRNA) in diverse biological processes is of current interest and often approached through small RNA sequencing. However, analysis of these datasets is difficult due to the complexity of biological RNA processing pathways, which differ between species. Properties such as strand specificity, length distribution, and distribution of soft-clipped bases are among the few parameters known to guide researchers in understanding the role of siRNAs. We present RAPID, a generic eukaryotic siRNA analysis pipeline, which captures information inherent in the datasets and automatically produces numerous visualizations as user-friendly HTML reports, covering multiple categories required for siRNA analysis. RAPID also facilitates an automated comparison of multiple datasets, with one of the normalization techniques dedicated to siRNA knockdown analysis, and integrates differential expression analysis using DESeq2. Availability and Implementation: RAPID is available under the MIT license at https://github.com/SchulzLab/RAPID. We recommend using it as a conda environment available from https://anaconda.org/bioconda/rapi

    Big Data Analytics for Complex Systems

    The evolution of technology in all fields has led modern systems to generate vast amounts of data. Using data to extract information, make predictions, and make decisions is the current trend in artificial intelligence. Advances in big data analytics tools have made accessing and storing data easier and faster than ever, and machine learning algorithms help to identify patterns in data and extract information from them. Current tools and machines in health, computer technologies, and manufacturing can generate massive amounts of raw data about their products or samples. The author of this work proposes a modern integrative system that uses big data analytics, machine learning, supercomputer resources, and measurements from industrial and health machines to build a smart system that can mimic the human intelligence skills of observation, detection, prediction, and decision-making. Applications of the proposed smart systems are included as case studies to highlight the contributions of each system. The first contribution is the ability to use big data and deep learning technologies on production lines to diagnose incidents and take proper action. In the current era of industrial digital transformation, Industry 4.0 has been receiving attention from researchers because it can be used to automate production-line decisions. Reconfigurable manufacturing systems (RMS) have been widely used to reduce the setup cost of restructuring production lines. However, current RMS modules are not linked to the cloud for online decision-making; to make the proper decision, these modules must connect to an online server (supercomputer) that has big data analytics and machine learning capabilities. Here, "online" means that data are centralized in the cloud (supercomputer) and accessible in real time.
In this study, deep neural networks are used to detect the decisive features of a product and to build a prediction model with which the iFactory makes the necessary decision for defective products. The Spark ecosystem is used to manage the access, processing, and storage of the big data streams. This contribution is implemented as a closed cycle; to the best of our knowledge, no prior work in the literature has introduced big data analysis using deep learning for real-time applications in manufacturing systems. The implementation achieves a high accuracy of 97% in classifying normal versus defective items. The second contribution, which is in bioinformatics, is the ability to build supervised machine learning approaches based on the gene expression of patients to predict the proper treatment for breast cancer. In the trial, to personalize treatment, the machine learns the genes that are active in the patient cohort with a five-year survival period. The initial condition here is that each group must undergo only one specific treatment. After learning about each group (or class), the machine can personalize the treatment of a new patient by analyzing the patient's gene expression. The proposed model will help in the diagnosis and treatment of the patient. Future work in this area involves building a protein-protein interaction network with the selected genes for each treatment, first to analyze the motives of the genes and then to target them with the proper drug molecules. In the learning phase, a couple of feature-selection techniques and standard supervised classifiers are used to build the prediction model. Most of the nodes show high performance, with accuracy, sensitivity, specificity, and F-measure close to 100%. The third contribution is the ability to build semi-supervised learning for breast cancer survival treatment, which advances the second contribution.
By understanding the relations between the classes, we can design the machine learning phase based on the similarities between classes. In the proposed research, the researcher used the Euclidean distance matrix between the survival-treatment classes to build the hierarchical learning model. The distance information, which is learned through an unsupervised approach, can help the prediction model to select classes that are far from each other, maximizing the distance between classes and yielding wider class groups. The performance of this approach shows a slight improvement over the second model. Moreover, this model reduced the number of discriminative genes from 47 to 37. The model in the second contribution studies each class individually, while this model focuses on the relationships between the classes and uses this information in the learning phase. Hierarchical clustering is performed to draw the borders between groups of classes before the classification models are built. Several distance measurements are tested to identify the best linkages between classes. Most of the nodes show high performance, with accuracy, sensitivity, specificity, and F-measure ranging from 90% to 100%. All the case study models showed high performance in the prediction phase. These modern models can be replicated for different problems within different domains. The comprehensive models of the newer technologies are reconfigurable and modular; any newer learning phase can be plugged in at both ends of the learning phase. Therefore, the output of the system can be an input for another learning system, and a newer feature can be added to the input to be considered for the learning phase.
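
The grouping of classes by Euclidean distance before classification, as described in the third contribution, can be sketched as follows. The abstract gives no implementation details, so this is only an illustration under stated assumptions: each class is summarized by the centroid of its expression vectors, groups are merged bottom-up with average linkage, and the names `class_hierarchy`, `centroid`, and `euclidean` as well as the toy data are hypothetical.

```python
import math
from itertools import combinations


def centroid(rows):
    """Mean vector of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def euclidean(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def class_hierarchy(samples_by_class):
    """Merge classes bottom-up; return merges as (group_a, group_b, distance)."""
    centroids = {name: centroid(rows) for name, rows in samples_by_class.items()}

    def group_distance(g1, g2):
        # Average linkage over class centroids (an assumed linkage choice).
        pairs = [(a, b) for a in g1 for b in g2]
        return sum(euclidean(centroids[a], centroids[b]) for a, b in pairs) / len(pairs)

    groups = [(name,) for name in sorted(centroids)]
    merges = []
    while len(groups) > 1:
        # Pick the two closest groups and merge them into one.
        g1, g2 = min(combinations(groups, 2), key=lambda p: group_distance(*p))
        merges.append((g1, g2, group_distance(g1, g2)))
        groups = [g for g in groups if g not in (g1, g2)] + [g1 + g2]
    return merges


if __name__ == "__main__":
    # Toy two-gene expression values per treatment class (made-up numbers).
    toy = {
        "A": [[0.0, 0.0], [0.0, 2.0]],
        "B": [[1.0, 1.0], [1.0, 3.0]],
        "C": [[10.0, 10.0], [10.0, 12.0]],
    }
    for left, right, dist in class_hierarchy(toy):
        print(left, right, round(dist, 3))
```

The resulting merge order gives a tree of class groups; a classifier could then be trained at each split, which matches the "nodes" mentioned in the abstract only in spirit.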

    Advances in Evolutionary Algorithms

    With the recent trends towards massive data sets and significant computational power, combined with evolutionary algorithmic advances, evolutionary computation is becoming much more relevant to practice. The aim of the book is to present recent improvements, innovative ideas and concepts in a part of the huge EA field.