
    An Overview of the Use of Neural Networks for Data Mining Tasks

    In recent years the area of data mining has experienced considerable demand for technologies that extract knowledge from large and complex data sources. There is substantial commercial interest, as well as active research, aimed at developing new and improved approaches for extracting information, relationships, and patterns from datasets. Artificial Neural Networks (NN) are popular biologically inspired intelligent methodologies whose classification, prediction and pattern recognition capabilities have been utilised successfully in many areas, including science, engineering, medicine, business, banking and telecommunication. This paper highlights, from a data mining perspective, the implementation of NN using supervised and unsupervised learning for pattern recognition, classification, prediction and cluster analysis, and focuses the discussion on their usage in bioinformatics and financial data analysis tasks.

    Applications of Biological Cell Models in Robotics

    In this paper I present some of the most representative biological models applied to robotics. In particular, this work is a survey of models inspired by, or making use of concepts from, gene regulatory networks (GRNs): these networks describe the complex interactions that affect gene expression and, consequently, cell behaviour.

    Investigating modularity and transparency within bioinspired connectionist architectures using genetic and epigenetic models

    Machine learning algorithms allow computers to deal with incomplete data in tasks such as speech recognition and object detection. Some machine learning algorithms take inspiration from biological systems because of useful properties such as robustness, which allow algorithms to be flexible and domain agnostic. This comes at a cost: it is difficult to understand the reasoning behind their decisions. This is problematic when such models are applied in real-world situations where accountability, legality, and maintenance are of concern. Artificial gene regulatory networks (AGRNs) are a type of connectionist architecture inspired by gene regulatory mechanisms. AGRNs are of interest within this thesis due to their ability to solve tasks in chaotic dynamical systems despite their relatively small size. The overarching aim of this work was to investigate the properties of connectionist architectures in order to improve the transparency of their execution. Initially, the evolutionary process and internal structure of AGRNs were investigated. Following this, the creation of an external control layer, used to improve the transparency of execution of an external connectionist architecture, was attempted. When investigating the evolutionary process of AGRNs, pathways were found that, when followed, produced more performant networks in a shorter time frame. When investigating their modularity, evidence was found that AGRNs are capable of performing well despite internal interference, and it was also discovered that they do not develop strict modularity consistently. A control layer inspired by epigenetics that selectively deactivates nodes in trained artificial neural networks (ANNs) was developed; the analysis of its behaviour provided an insight into the internal workings of the ANN.
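    The idea of selectively deactivating nodes in a trained network can be illustrated with a small sketch. This is not the thesis's control layer: the network shape, the weights, the `forward` function and the `mask` parameter are all hypothetical, chosen only to show how switching one hidden node off exposes its contribution to the output.

```python
import math

def forward(x, w_hidden, w_out, mask=None):
    """One-hidden-layer network with tanh units.

    mask holds one 0/1 entry per hidden node; a 0 deactivates that node.
    All names and shapes here are illustrative assumptions.
    """
    n_hidden = len(w_hidden[0])
    if mask is None:
        mask = [1.0] * n_hidden  # no control layer: every node stays active
    hidden = [
        mask[j] * math.tanh(sum(x[i] * w_hidden[i][j] for i in range(len(x))))
        for j in range(n_hidden)
    ]
    return [
        sum(hidden[j] * w_out[j][k] for j in range(n_hidden))
        for k in range(len(w_out[0]))
    ]

# Hypothetical "trained" weights: 2 inputs, 3 hidden nodes, 1 output.
w_hidden = [[0.5, -1.0, 0.3], [0.8, 0.2, -0.6]]
w_out = [[1.0], [-2.0], [0.5]]
x = [1.0, 0.5]

full = forward(x, w_hidden, w_out)
masked = forward(x, w_hidden, w_out, mask=[1.0, 0.0, 1.0])  # switch off node 1

# The difference is exactly what hidden node 1 contributed for this input.
contribution = full[0] - masked[0]
print(f"full={full[0]:.4f} masked={masked[0]:.4f} node-1 contribution={contribution:.4f}")
```

    Comparing outputs with and without a given node is one simple way to probe what that node does; a learned control layer would choose the mask itself rather than having it fixed by hand.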

    Computational studies of genome evolution and regulation

    This thesis takes on the challenge of extracting information from large volumes of biological data produced with newly established experimental techniques. The different types of information present in a particular dataset have been carefully identified to maximise the information gained from the data. This also precludes attempts to infer types of information that are not present in the data. In the first part of the thesis I examined the evolutionary origins of de novo taxonomically restricted genes (TRGs) in the Drosophila subgenus. De novo TRGs are genes that have originated after the speciation of a particular clade from previously non-coding regions: functional ncRNA, regions within introns or alternative frames of older protein-coding genes, or intergenic sequences. TRGs are clade-specific tool-kits that are likely to contain proteins with as yet undocumented functions and new protein folds that are yet to be discovered. One of the main challenges in studying de novo TRGs is the trade-off between false positives (non-functional open reading frames) and false negatives (true TRGs that have properties distinct from well established genes). Here I identified two de novo TRG families in the Drosophila subgenus that have not been previously reported as de novo originated genes, and to our knowledge they are the best candidates identified so far for experimental studies aimed at elucidating the properties of de novo genes. In the second part of the thesis I examined the information contained in single cell RNA sequencing (scRNA-seq) data and proposed a method for extracting biological knowledge from these data using generative neural networks. The main challenge is the noisiness of scRNA-seq data: the number of transcripts sequenced is not proportional to the number of mRNAs present in the cell. I used an autoencoder to reduce the dimensionality of the data without making untestable assumptions about the data. This embedding into a lower dimensional space, alongside the features learned by the autoencoder, contains information about the cell populations, differentiation trajectories and the regulatory relationships between the genes. Unlike most methods currently used, an autoencoder does not assume that these regulatory relationships are the same in all cells in the data set. The main advantages of our approach are that it makes minimal assumptions about the data, it is robust to noise, and it is possible to assess its performance. In the final part of the thesis I summarise lessons learnt from analysing various types of biological data and make suggestions for the future direction of similar computational studies.

    Predicting Secondary Structures, Contact Numbers, and Residue-wise Contact Orders of Native Protein Structure from Amino Acid Sequence by Critical Random Networks

    Prediction of one-dimensional protein structures such as secondary structures and contact numbers is useful for three-dimensional structure prediction and important for understanding the sequence-structure relationship. Here we present a new machine-learning method, critical random networks (CRNs), for predicting one-dimensional structures, and apply it, with position-specific scoring matrices, to the prediction of secondary structures (SS), contact numbers (CN), and residue-wise contact orders (RWCO). The present method achieves, on average, a $Q_3$ accuracy of 77.8% for SS and correlation coefficients of 0.726 and 0.601 for CN and RWCO, respectively. The accuracy of the SS prediction is comparable to other state-of-the-art methods, and that of the CN prediction is a significant improvement over previous methods. We give a detailed formulation of the critical random networks-based prediction scheme, and examine the context-dependence of prediction accuracies. In order to study the nonlinear and multi-body effects, we compare the CRNs-based method with a purely linear method based on position-specific scoring matrices. Although not superior to the CRNs-based method, the surprisingly good accuracy achieved by the linear method highlights the difficulty in extracting structural features of higher order from amino acid sequence beyond that provided by the position-specific scoring matrices. Comment: 20 pages, 1 figure, 5 tables; minor revision; accepted for publication in BIOPHYSIC
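    The two evaluation measures quoted in this abstract can be sketched as follows. This assumes the usual definitions: three-state accuracy as the fraction of residues whose predicted secondary-structure class (H, E or C) matches the observed one, and the Pearson correlation coefficient between predicted and observed values. The abstract reports averages, and how the paper averages (per residue or per protein) is not stated here, so this per-sequence version is only an illustration. The function names and the toy sequences and contact numbers are made up.

```python
import math

def q3_accuracy(predicted, observed):
    """Fraction of positions whose predicted class (H, E or C) matches the observed one."""
    if len(predicted) != len(observed) or len(observed) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for p, o in zip(predicted, observed) if p == o)
    return matches / len(observed)

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Toy data, invented for illustration only.
observed_ss = "HHHHEEEECC"
predicted_ss = "HHHCEEEECH"
q3 = q3_accuracy(predicted_ss, observed_ss)

observed_cn = [4.0, 6.0, 9.0, 7.0, 3.0]
predicted_cn = [5.0, 6.5, 8.0, 7.5, 4.0]
r = pearson(predicted_cn, observed_cn)

print(f"Q3 = {q3:.3f}, correlation = {r:.3f}")
```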