137 research outputs found

    Non-acyclicity of coset lattices and generation of finite groups


    Part I:


    A Bayesian approach to simultaneously characterize the stochastic and deterministic components of a system

    The present work provides a Bayesian approach to learn plausible models capable of characterizing complex time series in which deterministic and stochastic phenomena concur. Two main approaches are developed. The first approach is a simple superposition model grounded on the hypothesis that the interactions between the stochastic and deterministic phenomena are negligible. To enable this model to capture complex dynamics, the stochastic part is assumed to be a fractal signal. Under the assumptions of this model, an analysis method is proposed, enabling the characterization of the fractal stochastic component and the estimation of the deterministic part. The second main approach relies on Stochastic Differential Equations (SDEs) to model systems where the stochastic and deterministic parts interact. First, a non-parametric estimation method for SDEs is developed, using recent advances from Gaussian processes. Finally, the thesis studies how to overcome the main constraint that the use of SDEs imposes: the Markovianity assumption. To that end, a new structured variational autoencoder with latent SDE dynamics is proposed. All the methods are tested on both synthetic and real signals, demonstrating their ability to capture the behavior of complex systems.
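
As a rough illustration of what an SDE with interacting deterministic and stochastic parts looks like in practice, the sketch below simulates a one-dimensional SDE dx = f(x) dt + g(x) dW with the standard Euler-Maruyama scheme. This is not the estimation method of the thesis: the function name `euler_maruyama`, the drift f(x) = -x, and the diffusion g(x) = 0.5 are all assumptions chosen only to make the example concrete.

```python
import numpy as np


def euler_maruyama(drift, diffusion, x0, dt, n_steps, rng):
    """Simulate dx = drift(x) dt + diffusion(x) dW with the Euler-Maruyama scheme.

    drift and diffusion are hypothetical callables supplied by the caller;
    rng is a numpy random Generator.
    """
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        # Brownian increment over one step: Gaussian with variance dt.
        dw = rng.normal(0.0, np.sqrt(dt))
        x[k + 1] = x[k] + drift(x[k]) * dt + diffusion(x[k]) * dw
    return x


rng = np.random.default_rng(0)

# Assumed example: mean-reverting drift with constant noise intensity.
path = euler_maruyama(lambda x: -x, lambda x: 0.5, x0=1.0, dt=0.01, n_steps=1000, rng=rng)

# With the diffusion switched off, only the deterministic part remains.
deterministic = euler_maruyama(lambda x: -x, lambda x: 0.0, x0=1.0, dt=0.01, n_steps=10, rng=rng)
```

Setting the diffusion to zero recovers a plain Euler step for the deterministic dynamics, which is a convenient sanity check when experimenting with other drift and diffusion choices.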

    Statistical inference from large-scale genomic data

    This thesis explores the potential of statistical inference methodologies in their applications in functional genomics. In essence, it summarises algorithmic findings in this field, providing step-by-step analytical methodologies for deciphering biological knowledge from large-scale genomic data, mainly microarray gene expression time series. This thesis covers a range of topics in the investigation of complex multivariate genomic data. One focus involves using clustering as a method of inference and another is cluster validation to extract meaningful biological information from the data. Information gained from the application of these various techniques can then be used conjointly in the elucidation of gene regulatory networks, the ultimate goal of this type of analysis. First, a new tight clustering method for gene expression data is proposed to obtain tighter and potentially more informative gene clusters. Next, to fully utilise biological knowledge in clustering validation, a validity index is defined based on one of the most important ontologies within the Bioinformatics community, Gene Ontology. The method bridges a gap in current literature, in the sense that it takes into account not only the variations of Gene Ontology categories in biological specificities and their significance to the gene clusters, but also the complex structure of the Gene Ontology. Finally, Bayesian probability is applied to inference from heterogeneous genomic data, integrated with previous efforts in this thesis, with the aim of large-scale gene network inference. The proposed system comes with a stochastic process to achieve robustness to noise, yet remains efficient enough for large-scale analysis. Ultimately, the solutions presented in this thesis serve as building blocks of an intelligent system for interpreting large-scale genomic data and understanding the functional organisation of the genome.

    Dynamic Network Reconstruction in Systems Biology: Methods and Algorithms

    Dynamic network reconstruction refers to a class of problems that explore causal interactions between variables operating in dynamical systems. This dissertation focuses on methods and algorithms that reconstruct or infer network topology or dynamics from observations of an unknown system. The essential challenges, compared to system identification, are imposing sparsity on network topology and ensuring network identifiability. This work studies the following cases: multiple experiments with heterogeneity, low sampling frequency, and nonlinearity, which are generic in biology and make reconstruction problems particularly challenging. The heterogeneous data sets are measurements in multiple experiments from underlying dynamical systems that differ in parameters, whereas the network topology is assumed to be consistent. This situation is particularly common in biological applications. This dissertation proposes a way to deal with multiple data sets together to increase computational robustness. Furthermore, it can also be used to enforce network identifiability by multiple experiments with input perturbations. The necessity to study low-sampling-frequency data is due to the mismatch of network topology between discrete-time and continuous-time models. It is generally assumed that the underlying physical systems evolve continuously over time. An important concept, system aliasing, is introduced to indicate whether the continuous system can be uniquely determined from its associated discrete-time model with the specified sampling frequency. A Nyquist-Shannon-like sampling theorem is provided to determine the critical sampling frequency for system aliasing. The reconstruction method integrates the Expectation Maximization (EM) method with a modified Sparse Bayesian Learning (SBL) to deal with reconstruction from output measurements. A tentative study on nonlinear Boolean network reconstruction is provided. The nonlinear Boolean network is considered as a union of local networks of linearized dynamical systems. The reconstruction method extends the algorithm used for heterogeneous data sets to provide an approximate inference while significantly improving computational robustness. The reconstruction algorithms are implemented in MATLAB and wrapped as a package. With considerations on generic signal features in practice, this work contributes practically useful network reconstruction methods for biological applications.
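
To make the idea of a continuous-time system not being uniquely determined by its sampled model more tangible, the sketch below builds two different linear systems dx/dt = A x whose exact discretizations exp(A h) coincide at sampling period h. It is a generic textbook-style illustration, not the dissertation's definition, theorem, or MATLAB package; the helper names `expm_diagonalizable` and `discretize` and all numeric values are assumptions.

```python
import numpy as np


def expm_diagonalizable(A):
    """Matrix exponential via eigendecomposition (valid only for diagonalizable A)."""
    eigvals, eigvecs = np.linalg.eig(A)
    return (eigvecs @ np.diag(np.exp(eigvals)) @ np.linalg.inv(eigvecs)).real


def discretize(A, h):
    """Exact discrete-time state matrix of dx/dt = A x sampled with period h."""
    return expm_diagonalizable(A * h)


h = 0.5            # assumed sampling period
a, w = -0.3, 1.0   # assumed decay rate and oscillation frequency

# Two continuous-time systems whose oscillation frequencies differ by 2*pi/h.
w_shifted = w + 2 * np.pi / h
A1 = np.array([[a, -w], [w, a]])
A2 = np.array([[a, -w_shifted], [w_shifted, a]])

Ad1 = discretize(A1, h)
Ad2 = discretize(A2, h)

# Different continuous-time dynamics, identical sampled model.
aliased = np.allclose(Ad1, Ad2)
```

The eigenvalues of A1 and A2 differ only by 2πi/h in their imaginary parts, so multiplying by h and exponentiating erases the difference. Sampling faster (a smaller h) separates the two discrete-time models again.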

    Statistical inference from large-scale genomic data

    This thesis explores the potential of statistical inference methodologies in their applications in functional genomics. In essence, it summarises algorithmic findings in this field, providing step-by-step analytical methodologies for deciphering biological knowledge from large-scale genomic data, mainly microarray gene expression time series. This thesis covers a range of topics in the investigation of complex multivariate genomic data. One focus involves using clustering as a method of inference and another is cluster validation to extract meaningful biological information from the data. Information gained from the application of these various techniques can then be used conjointly in the elucidation of gene regulatory networks, the ultimate goal of this type of analysis. First, a new tight clustering method for gene expression data is proposed to obtain tighter and potentially more informative gene clusters. Next, to fully utilise biological knowledge in clustering validation, a validity index is defined based on one of the most important ontologies within the Bioinformatics community, Gene Ontology. The method bridges a gap in current literature, in the sense that it takes into account not only the variations of Gene Ontology categories in biological specificities and their significance to the gene clusters, but also the complex structure of the Gene Ontology. Finally, Bayesian probability is applied to inference from heterogeneous genomic data, integrated with previous efforts in this thesis, with the aim of large-scale gene network inference. The proposed system comes with a stochastic process to achieve robustness to noise, yet remains efficient enough for large-scale analysis. Ultimately, the solutions presented in this thesis serve as building blocks of an intelligent system for interpreting large-scale genomic data and understanding the functional organisation of the genome. EThOS - Electronic Theses Online Service, GB, United Kingdom