18 research outputs found

    Multi-layered model of individual HIV infection progression and mechanisms of phenotypical expression

    Get PDF
    Cite as: Perrin, Dimitri (2008) Multi-layered model of individual HIV infection progression and mechanisms of phenotypical expression. PhD thesis, Dublin City University

    Clustering Algorithms: Their Application to Gene Expression Data

    Get PDF
    Gene expression data hide vital information required to understand the biological process that takes place in a particular organism in relation to its environment. Deciphering the hidden patterns in gene expression data proffers a prodigious preference to strengthen the understanding of functional genomics. The complexity of biological networks and the volume of genes present increase the challenges of comprehending and interpretation of the resulting mass of data, which consists of millions of measurements; these data also inhibit vagueness, imprecision, and noise. Therefore, the use of clustering techniques is a first step toward addressing these challenges, which is essential in the data mining process to reveal natural structures and iden-tify interesting patterns in the underlying data. The clustering of gene expression data has been proven to be useful in making known the natural structure inherent in gene expression data, understanding gene functions, cellular processes, and subtypes of cells, mining useful information from noisy data, and understanding gene regulation. The other benefit of clustering gene expression data is the identification of homology, which is very important in vaccine design. This review examines the various clustering algorithms applicable to the gene expression data in order to discover and provide useful knowledge of the appropriate clustering technique that will guarantee stability and high degree of accuracy in its analysis procedure

    Unique networks: a method to identity disease-specific regulatory networks from microarray data

    Get PDF
    This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.The survival of any organismis determined by the mechanisms triggered in response to the inputs received. Underlying mechanisms are described by graphical networks that can be inferred from different types of data such as microarrays. Deriving robust and reliable networks can be complicated due to the microarray structure of the data characterized by a discrepancy between the number of genes and samples of several orders of magnitude, bias and noise. Researchers overcome this problem by integrating independent data together and deriving the common mechanisms through consensus network analysis. Different conditions generate different inputs to the organism which reacts triggering different mechanisms with similarities and differences. A lot of effort has been spent into identifying the commonalities under different conditions. Highlighting similarities may overshadow the differences which often identify the main characteristics of the triggered mechanisms. In this thesis we introduce the concept of study-specific mechanism. We develop a pipeline to semiautomatically identify study-specific networks called unique-networks through a combination of consensus approach, graphical similarities and network analysis. The main pipeline called UNIP (Unique Networks Identification Pipeline) takes a set of independent studies, builds gene regulatory networks for each of them, calculates an adaptation of the sensitivity measure based on the networks graphical similarities, applies clustering to group the studies who generate the most similar networks into study-clusters and derives the consensus networks. Once each study-cluster is associated with a consensus-network, we identify the links that appear only in the consensus network under consideration but not in the others (unique-connections). Considering the genes involved in the unique-connections we build Bayesian networks to derive the unique-networks. Finally, we exploit the inference tool to calculate each gene prediction-accuracy across all studies to further refine the unique-networks. Biological validation through different software and the literature are explored to validate our method. UNIP is first applied to a set of synthetic data perturbed with different levels of noise to study the performance and verify its reliability. Then, wheat under stress conditions and different types of cancer are explored. Finally, we develop a user-friendly interface to combine the set of studies by using AND and NOT logic operators. Based on the findings, UNIP is a robust and reliable method to analyse large sets of transcriptomic data. It easily detects the main complex relationships between transcriptional expression of genes specific for different conditions and also highlights structures and nodes that could be potential targets for further research

    Gene regulatory network modelling with evolutionary algorithms -an integrative approach

    Get PDF
    Building models for gene regulation has been an important aim of Systems Biology over the past years, driven by the large amount of gene expression data that has become available. Models represent regulatory interactions between genes and transcription factors and can provide better understanding of biological processes, and means of simulating both natural and perturbed systems (e.g. those associated with disease). Gene regulatory network (GRN) quantitative modelling is still limited, however, due to data issues such as noise and restricted length of time series, typically used for GRN reverse engineering. These issues create an under-determination problem, with many models possibly fitting the data. However, large amounts of other types of biological data and knowledge are available, such as cross-platform measurements, knockout experiments, annotations, binding site affinities for transcription factors and so on. It has been postulated that integration of these can improve model quality obtained, by facilitating further filtering of possible models. However, integration is not straightforward, as the different types of data can provide contradictory information, and are intrinsically noisy, hence large scale integration has not been fully explored, to date. Here, we present an integrative parallel framework for GRN modelling, which employs evolutionary computation and different types of data to enhance model inference. Integration is performed at different levels. (i) An analysis of cross-platform integration of time series microarray data, discussing the effects on the resulting models and exploring crossplatform normalisation techniques, is presented. This shows that time-course data integration is possible, and results in models more robust to noise and parameter perturbation, as well as reduced noise over-fitting. (ii) Other types of measurements and knowledge, such as knock-out experiments, annotated transcription factors, binding site affinities and promoter sequences are integrated within the evolutionary framework to obtain more plausible GRN models. This is performed by customising initialisation, mutation and evaluation of candidate model solutions. The different data types are investigated and both qualitative and quantitative improvements are obtained. Results suggest that caution is needed in order to obtain improved models from combined data, and the case study presented here provides an example of how this can be achieved. Furthermore, (iii), RNA-seq data is studied in comparison to microarray experiments, to identify overlapping features and possibilities of integration within the framework. The extension of the framework to this data type is straightforward and qualitative improvements are obtained when combining predicted interactions from single-channel and RNA-seq datasets
    corecore