4 research outputs found

    Bioinformatics tools in predictive ecology: Applications to fisheries

    This article is made available through the Brunel Open Access Publishing Fund - Copyright © 2012 Tucker et al. There has been a huge effort in the advancement of analytical techniques for molecular biological data over the past decade. This has led to many novel algorithms that are specialized to deal with data associated with biological phenomena, such as gene expression and protein interactions. In contrast, ecological data analysis has remained focused to some degree on off-the-shelf statistical techniques, though this is starting to change with the adoption of state-of-the-art methods where few assumptions can be made about the data and a more explorative approach is required, for example, through the use of Bayesian networks. In this paper, some novel bioinformatics tools for microarray data are discussed along with their ‘crossover potential’, with an application to fisheries data. In particular, the focus is on the development of models that identify functionally equivalent species in different fish communities, with the aim of predicting functional collapse.

    Interactive Exploration of Multitask Dependency Networks

    Scientists increasingly depend on machine learning algorithms to discover patterns in complex data. Two examples addressed in this dissertation are identifying how information sharing among regions of the brain develops due to learning, and learning dependency networks of blood proteins associated with cancer. Dependency networks, or graphical models, are learned from the observed data in order to make comparisons between the sub-populations of the dataset. Rarely is there sufficient data to infer robust individual networks for each sub-population. The multiple networks must therefore be considered simultaneously, which explodes the hypothesis space of the learning problem. Exploring this complex solution space requires input from the domain scientist to refine the objective function. This dissertation introduces a framework to incorporate domain knowledge in transfer learning to facilitate the exploration of solutions. The framework is a generalization of existing algorithms for multiple network structure identification. Solutions produced with human input narrow down the variance of solutions to those that answer questions of interest to domain scientists. Patterns, such as differences between networks, are learned with higher confidence using transfer learning than through the standard method of bootstrapping. Transfer learning may be the ideal method for making comparisons among dependency networks, whether looking for similarities or differences. Domain knowledge input and visualization of solutions are combined in an interactive tool that enables domain scientists to explore the space of solutions efficiently.

    Combining heterogeneous sources of data for the reverse-engineering of gene regulatory networks

    Gene Regulatory Networks (GRNs) represent how genes interact in various cellular processes by describing how the expression level, or activity, of genes can affect the expression of other genes. Reverse-engineering GRN models can help biologists understand and gain insight into genetic conditions and diseases. Recently, the increasingly widespread use of DNA microarrays, a high-throughput technology that allows the expression of thousands of genes to be measured simultaneously in biological experiments, has led to many datasets of gene expression measurements becoming publicly available and a subsequent explosion of research in the reverse-engineering of GRN models. However, microarray technology has a number of limitations as a data source for the modelling of GRNs, due to concerns over its reliability and the reproducibility of experimental results. The underlying theme of the research presented in this thesis is the incorporation of multiple sources and different types of data into techniques for reverse-engineering or learning GRNs from data. By drawing on many data sources, the resulting network models should be more robust, accurate and reliable than models that have been learnt using a single data source. This is achieved by focusing on two main strands of research. First, the thesis presents some of the earliest work on incorporating prior knowledge, generated from a large body of scientific papers, into Bayesian network based GRN models. Second, novel methods for the use of multiple microarray datasets to produce Bayesian network based GRN models are introduced. Empirical evaluations are used to show that incorporating literature-based prior knowledge and combining multiple microarray datasets can provide an improvement, when compared to the use of a single microarray dataset, for the reverse-engineering of Bayesian network based GRN models. EThOS - Electronic Theses Online Service.