
    Knowledge discovery in biological databases : a neural network approach

    Knowledge discovery in databases, also known as data mining, aims to find significant information in a set of data. The knowledge to be mined from a dataset may take the form of patterns, association rules, classification and clustering rules, and so forth. In this dissertation, we present a neural network approach to finding knowledge in biological databases. Specifically, we propose new methods for processing biological sequences in two case studies: the classification of protein sequences and the prediction of E. coli promoters in DNA sequences. Our proposed methods, based on neural network architectures, combine techniques ranging from Bayesian inference, coding theory, feature selection, and dimensionality reduction to dynamic programming and machine learning algorithms. Empirical studies show that the proposed methods outperform previously published methods and perform well on the latest dataset. We have implemented the proposed algorithms in an infrastructure called Genome Mining, developed for biosequence classification and recognition.

    Bioinformatics Approach for Pattern of Myelin-Specific Proteins and Related Human Disorders

    Background: Recent neuroinformatic studies on the structure-function interaction of proteins as causative agents of human disease have implied that dysfunction or defects in different protein classes could be associated with several related diseases. Objectives: The aim of this study was to use bioinformatics approaches to understand the structure, function, and relationships of myelin protein 2 (PMP2), a myelin-basic protein, as a basis of neuronal disorders. Methods: A collection of databases was used, following a new approach, to systematically exploit classification information, including protein structure, protein family, and the classification of human disease. Knowledge discovery was carried out based on collection criteria and in silico studies integrated with in vitro studies. Results: The evaluation of bioinformatics comorbid proteomics studies revealed that PMP2, an intracellular and membrane myelin protein, is specific to a neuritis disease and also contributes to other diseases. Leprosy, another neuronal disease that could be related to neuritis, involves interferon gamma (IFNG), a secreted protein that falls into protein classes different from those associated with neuritis. Conclusions: The growth rate of information in bioinformatics databases could facilitate studies of live organisms prior to observational studies. Two different protein classes could be causative agents of one disease. However, two related diseases from one disease group could involve different protein classes. Future research in proteomics could give new insight into the reshuffling of proteins in different diseases and lead to the discovery of the etiology of such diseases.

    An Overview of the Use of Neural Networks for Data Mining Tasks

    In recent years, the area of data mining has experienced considerable demand for technologies that extract knowledge from large and complex data sources. There is substantial commercial interest, as well as active research, aimed at developing new and improved approaches for extracting information, relationships, and patterns from datasets. Artificial Neural Networks (NN) are popular, biologically inspired intelligent methodologies whose classification, prediction, and pattern recognition capabilities have been used successfully in many areas, including science, engineering, medicine, business, banking, and telecommunication. This paper highlights, from a data mining perspective, the implementation of NN using supervised and unsupervised learning for pattern recognition, classification, prediction, and cluster analysis, and focuses the discussion on their use in bioinformatics and financial data analysis tasks.

    An empirical comparison of supervised machine learning techniques in bioinformatics

    Research in bioinformatics is driven by experimental data, and current biological databases are populated by vast amounts of it. Machine learning has been widely applied to bioinformatics and has had considerable success in this research area. At present, with various learning algorithms available in the literature, researchers face difficulties in choosing the best method to apply to their data. We performed an empirical study of 7 individual learning systems and 9 different combined methods on 4 different biological data sets, and suggest issues to consider when answering the following questions: (i) How does one choose which algorithm is best suited to their data set? (ii) Are combined methods better than a single approach? (iii) How does one compare the effectiveness of a particular algorithm to the others?