19 research outputs found

    Predicting Consensus Structures for RNA Alignments Via Pseudo-Energy Minimization

    Thermodynamic processes with free energy parameters are often used in algorithms that solve the free energy minimization problem to predict secondary structures of single RNA sequences. While results from these algorithms are promising, an observation is that single sequence-based methods have moderate accuracy and more information is needed to improve on RNA secondary structure prediction, such as covariance scores obtained from multiple sequence alignments. We present in this paper a new approach to predicting the consensus secondary structure of a set of aligned RNA sequences via pseudo-energy minimization. Our tool, called RSpredict, takes into account sequence covariation and employs effective heuristics for accuracy improvement. RSpredict accepts, as input data, a multiple sequence alignment in FASTA or ClustalW format and outputs the consensus secondary structure of the input sequences in both the Vienna style Dot Bracket format and the Connectivity Table format. Our method was compared with some widely used tools including KNetFold, Pfold and RNAalifold. A comprehensive test on different datasets including Rfam sequence alignments and a multiple sequence alignment obtained from our study on the Drosophila X chromosome reveals that RSpredict is competitive with the existing tools on the tested datasets. RSpredict is freely available online as a web server and also as a jar file for download at http://datalab.njit.edu/biology/RSpredict
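The two output formats named in the abstract above can be illustrated with a short sketch. This is not RSpredict's code: the function names `parse_dot_bracket` and `to_ct_rows` are hypothetical, pseudoknot brackets are not handled, and the row layout follows the commonly used connectivity-table columns (index, base, previous index, next index, pairing partner, original index) without the header line a real CT file carries.

```python
def parse_dot_bracket(structure):
    """Return a sorted list of 1-based (i, j) base pairs from a dot-bracket string."""
    stack = []   # positions of '(' still waiting for a partner
    pairs = []
    for pos, ch in enumerate(structure, start=1):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.append((stack.pop(), pos))
        elif ch != ".":
            raise ValueError(f"unsupported symbol {ch!r} at position {pos}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


def to_ct_rows(sequence, structure):
    """Build simplified connectivity-table rows for a sequence and its structure.

    Each row is (index, base, index - 1, index + 1, partner, index);
    partner is 0 for unpaired bases and the next-index column is 0 for the last base.
    """
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    partner = {}
    for i, j in parse_dot_bracket(structure):
        partner[i] = j
        partner[j] = i
    n = len(sequence)
    rows = []
    for idx, base in enumerate(sequence, start=1):
        next_idx = idx + 1 if idx < n else 0
        rows.append((idx, base, idx - 1, next_idx, partner.get(idx, 0), idx))
    return rows


if __name__ == "__main__":
    # Toy hairpin, chosen only for illustration.
    for row in to_ct_rows("GCAAGC", "((..))"):
        print(*row, sep="\t")
```

The stack-based scan works because each closing bracket pairs with the most recent unmatched opening bracket, which is exactly the nesting rule of the dot-bracket notation.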

    Biological Data Cleaning: A Case Study

    As databases become more pervasive throughout the biological sciences, various data quality concerns are emerging. Biological databases tend to develop data quality issues regarding data legacy, data uniformity and data duplication. Due to the nature of this data, each of these problems is non-trivial and can cause many problems for the database. For biological data to be corrected and standardised, methods and frameworks must be developed to handle both structural and traditional data. This paper discusses issues concerning biological data quality with respect to data cleaning. It presents BIO-AJAX, a framework developed to address these issues. Finally, it describes BIO-AJAX for TreeBASE and BIO-AJAX for Lineage Path, two implementations of BIO-AJAX on phylogenetic data sets.

    A New Kernel Method for RNA Classification

    Support vector machines (SVMs) are a state-of-the-art machine learning tool widely used in speech recognition, image processing and biological sequence analysis. An essential step in SVMs is to devise a kernel function to compute the similarity between two data points in Euclidean space. In this paper we present a new kernel that takes advantage of both global and local structural information in RNAs and uses the information together to classify RNAs with support vector machines. Experimental results demonstrate the good performance of the new kernel and show that it outperforms existing kernels when applied to classifying non-coding RNA sequences

    Approximate Searching in Phylogenetic Databases

    This paper presents background information about phylogenetic trees and possible methods for comparing the structures of these trees. Phylogeny, or the study of the evolutionary development of organisms, provides a vast amount of data, most of which is represented in the form of trees. Traditional methods for querying these trees can be inefficient. Therefore, more effective methods need to be developed. Here, we present two possible schemes for querying these trees, analyze them and introduce improvements on these methods

    Operational prediction of solar flares using a transformer-based framework

    A Study of Phylogenetic Tools for Genomic Nomenclature Data Cleaning

    In this poster we propose a method for addressing the genomic nomenclature problem by using phylogenetic tools along with the BIO-AJAX data cleaning framework

    ON THE EDITING DISTANCE BETWEEN UNDIRECTED ACYCLIC GRAPHS

    Lineage Path Integration for Phylogenetic Resources

    With the increase of genome and proteome data, phylogenetic information and phylogenetic analysis tools are increasing greatly in current biological repositories. First, many repositories allow users to browse information about species through taxonomic tools. These tools present each species with its lineage path and links to the various types of information the repository provides about that species. Second, some multiple sequence alignment tools offer users basic phylogenetic data by applying basic reconstruction algorithms to the alignment. With this information available in multiple locations, integrated tools are needed to allow the user to compare the data. This paper presents data integration research on lineage paths using the BIO-AJAX framework. It introduces BIO-AJAX for Lineage Paths, a tool that integrates lineage path information from the NCBI Taxonomy Database [1] and the Integrated Taxonomic Information System (ITIS) [6].