306 research outputs found

    DEVELOPMENT AND IMPLEMENTATION OF A BIOINFORMATICS ONLINE DISTANCE EDUCATION LEARNING TOOL FOR AFRICA

    Bioinformatics refers to the creation and advancement of algorithms, computational and statistical techniques, and theories for solving formal and practical problems arising from the management and analysis of biological data. However, some parts of the African continent have not been properly sensitized to the bio-scientific and computing fields. Thus, there is a need for appropriate strategies for introducing the basic components of this emerging scientific field to part of the African populace through the development of an online distance education learning tool. This study involved the design of a bioinformatics online distance education tool and its implementation by a programming approach. Design and implementation were done using Borland Delphi 7 Enterprise edition within its Integrated Development Environment. The advantage of using the Delphi programming language to implement this bioinformatics web tool is that Delphi is an object-oriented language with many additional facilities for extending technical functions that plain HTML cannot handle. The development and use of bioinformatics distance education software as a teaching tool in some African countries holds great promise for accommodating the needs of the populace who live in cities, small towns, and remote areas.

    Bayesian regularization of hidden Markov models with an application to bioinformatics

    This paper discusses a Bayesian approach to regularizing hidden Markov models and demonstrates an application of this scheme to bioinformatics.

    Online Learning in Discrete Hidden Markov Models

    We present and analyse three online algorithms for learning in discrete Hidden Markov Models (HMMs) and compare them with the Baldi-Chauvin algorithm. Using the Kullback-Leibler divergence as a measure of generalisation error, we draw learning curves in simplified situations. The performance of one of the presented algorithms in learning drifting concepts is analysed and compared with that of the Baldi-Chauvin algorithm in the same situations. A brief discussion of learning and symmetry breaking based on our results is also presented. Comment: 8 pages, 6 figures

    An empirical comparison of supervised machine learning techniques in bioinformatics

    Research in bioinformatics is driven by experimental data, and current biological databases are populated by vast amounts of it. Machine learning has been widely and successfully applied in this research area. At present, with various learning algorithms available in the literature, researchers face difficulties in choosing the best method for their data. We performed an empirical study of 7 individual learning systems and 9 different combined methods on 4 different biological data sets, and suggest some issues to consider when answering the following questions: (i) How does one choose which algorithm is best suited to a data set? (ii) Are combined methods better than a single approach? (iii) How does one compare the effectiveness of a particular algorithm to the others?

    How Random is a Coin Toss? Bayesian Inference and the Symbolic Dynamics of Deterministic Chaos

    Symbolic dynamics has proven to be an invaluable tool in analyzing the mechanisms that lead to unpredictability and random behavior in nonlinear dynamical systems. Surprisingly, a discrete partition of continuous state space can produce a coarse-grained description of the behavior that accurately describes the invariant properties of an underlying chaotic attractor. In particular, measures of the rate of information production (the topological and metric entropy rates) can be estimated from the outputs of Markov or generating partitions. Here we develop Bayesian inference for k-th order Markov chains as a method for finding generating partitions and estimating entropy rates from finite samples of discretized data produced by coarse-grained dynamical systems. Comment: 8 pages, 1 figure; http://cse.ucdavis.edu/~cmg/compmech/pubs/hrct.ht
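
    As a rough illustration of estimating an entropy rate from a finite symbol sequence with a k-th order Markov chain, the sketch below smooths transition counts with a symmetric Dirichlet prior and plugs the posterior-mean transition probabilities into the entropy-rate formula. It is a simplification and not the paper's method: the function name `entropy_rate_estimate`, the pseudo-count `alpha`, and the use of empirical context frequencies as weights are all assumptions made here for illustration, and no search for generating partitions is attempted.

```python
from collections import Counter
from math import log2


def entropy_rate_estimate(seq, k, alphabet, alpha=1.0):
    """Plug-in entropy rate (bits/symbol) of a k-th order Markov chain.

    Transition probabilities are posterior means under a symmetric
    Dirichlet(alpha) prior (an assumption of this sketch); contexts are
    weighted by their empirical frequency in `seq`.
    """
    ctx_counts = Counter()    # n(w): occurrences of each length-k context w
    trans_counts = Counter()  # n(w, x): context w followed by symbol x
    for i in range(k, len(seq)):
        ctx = tuple(seq[i - k:i])
        ctx_counts[ctx] += 1
        trans_counts[(ctx, seq[i])] += 1

    total = sum(ctx_counts.values())
    h = 0.0
    for ctx, n_ctx in ctx_counts.items():
        denom = n_ctx + alpha * len(alphabet)
        h_ctx = 0.0
        for x in alphabet:
            # Posterior-mean estimate of p(x | ctx); positive because alpha > 0.
            p = (trans_counts[(ctx, x)] + alpha) / denom
            h_ctx -= p * log2(p)
        h += (n_ctx / total) * h_ctx
    return h


if __name__ == "__main__":
    data = "0011" * 25  # period-4 sequence: fully determined by two past symbols
    for order in (1, 2):
        print(order, entropy_rate_estimate(data, order, "01"))
```

    On the period-4 toy sequence, the first-order estimate stays close to one bit per symbol, while the second-order estimate is much lower because two past symbols determine the next one. The second-order value does not reach zero because the prior pseudo-counts keep every transition probability positive.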

    The posterior-Viterbi: a new decoding algorithm for hidden Markov models

    Background: Hidden Markov models (HMMs) are powerful machine learning tools successfully applied to problems of computational molecular biology. In a predictive task, the HMM is endowed with a decoding algorithm in order to assign the most probable state path, and in turn the class labeling, to an unknown sequence. The Viterbi and the posterior decoding algorithms are the most common. The former is very efficient when one path dominates, while the latter, even though it does not guarantee to preserve the automaton grammar, is more effective when several concurring paths have similar probabilities. A third good alternative is 1-best, which was shown to perform as well as or better than Viterbi. Results: In this paper we introduce the posterior-Viterbi (PV), a new decoding algorithm that combines the posterior and Viterbi algorithms. PV is a two-step process: first the posterior probability of each state is computed, and then the best posterior allowed path through the model is evaluated by a Viterbi algorithm. Conclusions: We show that PV decoding performs better than other algorithms, first on toy models and then on the computational biological problem of predicting the topology of beta-barrel membrane proteins. Comment: 23 pages, 3 figures
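
    The two-step process described in this abstract can be sketched for a small discrete HMM as follows. This is a minimal reading of the abstract, not the authors' implementation: the function names, the unscaled forward-backward pass (suitable only for short sequences), and the rule that a transition is "allowed" when its probability is positive are assumptions made here.

```python
def forward_backward(obs, start, trans, emit):
    """Posterior state probabilities gamma[t][s] for a discrete HMM.

    Unscaled probabilities are used, so this is only suitable for short
    sequences (an assumption of this sketch).
    """
    n, T = len(start), len(obs)
    fwd = [[0.0] * n for _ in range(T)]
    bwd = [[1.0] * n for _ in range(T)]
    for s in range(n):
        fwd[0][s] = start[s] * emit[s][obs[0]]
    for t in range(1, T):
        for s in range(n):
            fwd[t][s] = emit[s][obs[t]] * sum(
                fwd[t - 1][r] * trans[r][s] for r in range(n))
    for t in range(T - 2, -1, -1):
        for s in range(n):
            bwd[t][s] = sum(
                trans[s][r] * emit[r][obs[t + 1]] * bwd[t + 1][r]
                for r in range(n))
    total = sum(fwd[T - 1])  # likelihood of the whole observation sequence
    return [[fwd[t][s] * bwd[t][s] / total for s in range(n)]
            for t in range(T)]


def posterior_viterbi(obs, start, trans, emit):
    """Step 1: state posteriors. Step 2: Viterbi over allowed transitions,
    maximising the product of posteriors along the path."""
    gamma = forward_backward(obs, start, trans, emit)
    n, T = len(start), len(obs)
    score = [[0.0] * n for _ in range(T)]
    back = [[0] * n for _ in range(T)]
    for s in range(n):
        score[0][s] = gamma[0][s] if start[s] > 0 else 0.0
    for t in range(1, T):
        for s in range(n):
            best_r, best = 0, 0.0
            for r in range(n):
                # Only transitions with non-zero probability are allowed.
                if trans[r][s] > 0 and score[t - 1][r] > best:
                    best_r, best = r, score[t - 1][r]
            score[t][s] = best * gamma[t][s]
            back[t][s] = best_r
    path = [max(range(n), key=lambda s: score[T - 1][s])]
    for t in range(T - 1, 0, -1):
        path.append(back[t][path[-1]])
    return path[::-1]


if __name__ == "__main__":
    # Hypothetical left-to-right toy model with three states, two symbols.
    start = [1.0, 0.0, 0.0]
    trans = [[0.6, 0.4, 0.0], [0.0, 0.7, 0.3], [0.0, 0.0, 1.0]]
    emit = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
    print(posterior_viterbi([0, 1, 0, 1, 1], start, trans, emit))
```

    Restricting the second pass to allowed transitions is what keeps the decoded path consistent with the model's grammar, which plain posterior decoding does not guarantee.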

    DNA Steganalysis Using Deep Recurrent Neural Networks

    Recent advances in next-generation sequencing technologies have facilitated the use of deoxyribonucleic acid (DNA) as a novel covert channel in steganography. Various methods exist in other domains to detect hidden messages in conventional covert channels, but they have not been applied to DNA steganography. The most common current detection approaches, namely frequency analysis-based methods, often overlook important signals when directly applied to DNA steganography because they depend on the distribution of the number of sequence characters. To address this limitation, we propose a general sequence learning-based DNA steganalysis framework. The proposed approach learns the intrinsic distribution of coding and non-coding sequences and detects hidden messages by exploiting the variations in distribution that appear after these messages are hidden. Using deep recurrent neural networks (RNNs), our framework identifies the distribution variations by using the classification score to predict whether a sequence is a coding or non-coding sequence. We compare our proposed method to various existing methods and to biological sequence analysis methods implemented on top of our framework. According to our experimental results, our approach delivers robust detection performance compared to other tools.

    Prediction of peptides binding to MHC class I alleles by partial periodic pattern mining

    MHC (Major Histocompatibility Complex) is a key player in the immune response of an organism. It is important to be able to predict which antigenic peptides will bind to a specific MHC allele and which will not, creating possibilities for controlling the immune response and for applications of immunotherapy. However, a problem encountered in computational binding prediction methods for MHC class I is the presence of bulges and loops in the peptides, which change the total length. Most machine learning methods in use today require the sequences to be of the same length to successfully mine the binding motifs. We propose the use of time-based data mining methods in motif mining so that motifs can be mined position-independently. Also, information from both binding and non-binding peptides is used, in contrast to other methods, which rely only on binding peptides. The prediction results are between 70% and 80% for the tested alleles.