Improving the efficiency of Bayesian Network Based EDAs and their application in Bioinformatics
Estimation of distribution algorithms (EDAs) are a relatively recent class of stochastic optimizers that has received considerable attention during the last decade. In each generation, EDAs build probabilistic models of promising solutions to an optimization problem to guide the search process. New sets of solutions are obtained by sampling the corresponding probability distributions. Using this approach, EDAs are able to provide the user with a set of models that reveals the dependencies between the variables of the optimization problem while solving it. In order to solve a complex problem, it is necessary to use a probabilistic model that is able to capture these dependencies. Bayesian networks are commonly used to model multiple dependencies between variables. Learning Bayesian networks, especially for large problems with a high degree of dependency among their variables, is computationally expensive, which makes it the bottleneck of EDAs. Therefore, introducing efficient Bayesian learning algorithms into EDAs seems necessary in order to apply them to large problems. In this dissertation, after comparing several Bayesian network learning algorithms, we propose an algorithm, called CMSS-BOA, which uses a recently introduced heuristic called max-min parents and children (MMPC) to constrain the model search space. This algorithm does not impose a fixed, small upper bound on the order of interaction between variables and is able to solve problems with large numbers of variables efficiently. We compare the efficiency of CMSS-BOA with the standard Bayesian network based EDA on several benchmark problems, and finally we use it to build a predictor for glycation sites in mammalian proteins.
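The generation loop the abstract describes (select promising solutions, fit a probabilistic model to them, sample a new population) can be illustrated with the simplest EDA variant, UMDA, which uses independent per-bit marginals rather than the Bayesian network models of CMSS-BOA; the function name and the OneMax objective are illustrative, not from the dissertation:

```python
import random

def umda_onemax(n_bits=20, pop_size=100, select=50, generations=30, seed=0):
    """Minimal univariate EDA (UMDA) maximizing OneMax.

    Each generation: score the population, keep the best `select`
    solutions, estimate per-bit marginal probabilities from them,
    and sample a fresh population from those marginals.
    """
    rng = random.Random(seed)
    probs = [0.5] * n_bits          # start from uniform marginals
    best = None
    for _ in range(generations):
        pop = [[1 if rng.random() < p else 0 for p in probs]
               for _ in range(pop_size)]
        pop.sort(key=sum, reverse=True)   # OneMax fitness = number of 1-bits
        elite = pop[:select]
        # Re-estimate the model from the promising solutions.
        probs = [sum(ind[i] for ind in elite) / select for i in range(n_bits)]
        if best is None or sum(pop[0]) > sum(best):
            best = pop[0]
    return best
```

Replacing the independent marginals with a learned Bayesian network (sampled in topological order) is what turns this sketch into a Bayesian-network-based EDA.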
Fast model-fitting of Bayesian variable selection regression using the iterative complex factorization algorithm
Bayesian variable selection regression (BVSR) is able to jointly analyze genome-wide genetic datasets, but slow computation via Markov chain Monte Carlo (MCMC) has hampered its widespread use. Here we present a novel iterative method to solve a special class of linear systems, which can increase the speed of BVSR model fitting tenfold. The iterative method hinges on the complex factorization of the sum of two matrices, and the solution path resides in the complex domain (instead of the real domain). Compared to the Gauss-Seidel method, the complex factorization converges almost instantaneously, and its error is several orders of magnitude smaller than that of the Gauss-Seidel method. More importantly, its error always stays within the pre-specified precision, while the Gauss-Seidel method's does not. For large problems with thousands of covariates, the complex factorization is 10--100 times faster than either the Gauss-Seidel method or the direct method via the Cholesky decomposition. In BVSR, one needs to repeatedly solve large penalized regression systems whose design matrices change only slightly between adjacent MCMC steps. This slight change in the design matrix enables the adaptation of the iterative complex factorization method. This computational innovation will facilitate the widespread use of BVSR in reanalyzing genome-wide association datasets.
Comment: Accepted version
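The abstract does not spell out the complex factorization itself, but the Gauss-Seidel baseline it is compared against is a standard iterative solver; a minimal sketch of that baseline for a general system Ax = b (the function name and convergence test are our own choices, not from the paper):

```python
def gauss_seidel(A, b, x0=None, tol=1e-10, max_iter=1000):
    """Gauss-Seidel iteration for A x = b (convergence is guaranteed for
    diagonally dominant or symmetric positive definite A, such as the
    penalized regression systems X'X + lambda*I arising in BVSR).

    Each sweep updates x[i] in place using the latest values of the other
    coordinates; iteration stops when the largest coordinate change
    falls below `tol`.
    """
    n = len(b)
    x = list(x0) if x0 is not None else [0.0] * n
    for _ in range(max_iter):
        delta = 0.0
        for i in range(n):
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            new = (b[i] - s) / A[i][i]
            delta = max(delta, abs(new - x[i]))
            x[i] = new
        if delta < tol:
            break
    return x
```

The paper's point is that when A changes only slightly between MCMC steps, its complex-factorization iteration reaches the prescribed precision far faster than sweeps like the one above.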
Contributions to statistical machine learning algorithm
This thesis's research focus is computational statistics, in the direction of the DEAR (differential equation associated regression) model; with that in mind, the journal papers are written as contributions to the statistical machine learning algorithm literature.
Alternative Methods for H1 Simulations in Genome Wide Association Studies
Assessing the statistical power to detect susceptibility variants plays a critical role in GWA studies, from both the prospective and the retrospective points of view. Power is empirically estimated by simulating phenotypes under a disease model H1. For this purpose, the "gold" standard consists of simulating genotypes given the phenotypes (e.g., with Hapgen). We introduce here an alternative approach for simulating phenotypes under H1 that does not require generating new genotypes for each simulation. In order to simulate phenotypes with a fixed total number of cases under a given disease model, we suggest three algorithms: i) a simple rejection algorithm; ii) a numerical Markov chain Monte Carlo (MCMC) approach; and iii) an exact and efficient backward sampling algorithm. We validated the three algorithms both on a toy dataset and by comparing them with Hapgen on a more realistic dataset. As an application, we then conducted a simulation study on a 1000 Genomes Project dataset consisting of 629 individuals (314 cases) and 8,048 SNPs from chromosome X. We arbitrarily defined an additive disease model with two susceptibility SNPs and an epistatic effect. The three algorithms are consistent, but backward sampling is dramatically faster than the other two. Our approach also gives results consistent with Hapgen. Using our application data, we showed that our limited design requires biological a priori knowledge to narrow the investigated region. We also showed that epistatic effects can play a significant role even when simple marker statistics (e.g., the trend test) are used. Finally, we showed that the overall performance of a GWA study strongly depends on the prevalence of the disease: the larger the prevalence, the better the power.
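The simple rejection algorithm (i) can be sketched under one assumption not stated in the abstract: that the disease model yields a per-individual case probability computed from the genotype. Draws that miss the required total number of cases are discarded, which is exactly why the authors' exact backward sampling alternative is so much faster; all names here are hypothetical:

```python
import random

def simulate_phenotypes_rejection(risks, n_cases, seed=0, max_tries=100000):
    """Simple rejection sampler for phenotypes under H1.

    `risks[i]` is P(individual i is a case) under the assumed disease
    model (computed from that individual's genotype and the penetrance
    function). Draws independent case/control labels and rejects the
    draw unless the total number of cases equals `n_cases`.
    """
    rng = random.Random(seed)
    for _ in range(max_tries):
        labels = [1 if rng.random() < r else 0 for r in risks]
        if sum(labels) == n_cases:   # accept only draws matching the design
            return labels
    raise RuntimeError("no draw matched the required case count")
```

Because the acceptance probability shrinks as the sample grows, this sketch is only practical for small designs, motivating the MCMC and backward sampling alternatives.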
Optimising algorithm and hardware for deep neural networks on FPGAs
This thesis proposes novel algorithm and hardware optimisation approaches to accelerate Deep Neural Networks (DNNs), including both Convolutional Neural Networks (CNNs) and Bayesian Neural Networks (BayesNNs).
The first contribution of this thesis is an adaptable and reconfigurable hardware design to accelerate CNNs. By analysing the computational patterns of different CNNs, a unified hardware architecture is proposed for both 2-dimensional and 3-dimensional CNNs. The accelerator is also designed with runtime adaptability, adopting different parallelism strategies for different convolutional layers at runtime.
The second contribution is a novel neural network architecture and hardware design co-optimisation approach, which improves the performance of CNNs at both the algorithm and hardware levels. Our proposed three-phase co-design framework decouples network training from design space exploration, which significantly reduces the time cost of the co-optimisation process.
The third contribution is an algorithmic and hardware co-optimisation framework for accelerating BayesNNs. At the algorithmic level, three categories of structured sparsity are explored to reduce the computational complexity of BayesNNs. At the hardware level, we propose a novel hardware architecture that exploits this structured sparsity. Both algorithmic and hardware optimisations are applied jointly to push the performance limit.
Open Access
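The summary does not detail the three sparsity categories, but the general idea of structured sparsity, zeroing whole hardware-friendly blocks rather than scattered weights, can be illustrated with a generic channel-wise pruning sketch (our own example, not the proposed method):

```python
def prune_channels(weights, keep_fraction=0.5):
    """Structured (channel-wise) pruning sketch: rank the output channels
    of a convolutional layer by L1 norm and zero out entire low-norm
    channels, so hardware can skip whole channels instead of chasing
    scattered zero weights.

    `weights` is a nested list shaped [out_channels][in_channels][k][k].
    Returns (pruned_weights, kept_channel_indices).
    """
    def l1(channel):
        return sum(abs(w) for plane in channel for row in plane for w in row)

    ranked = sorted(range(len(weights)), key=lambda c: l1(weights[c]),
                    reverse=True)
    n_keep = max(1, int(len(weights) * keep_fraction))
    kept = set(ranked[:n_keep])
    pruned = [weights[c] if c in kept
              else [[[0.0] * len(row) for row in plane]
                    for plane in weights[c]]
              for c in range(len(weights))]
    return pruned, sorted(kept)
```

Because whole channels are removed, an accelerator can drop the corresponding compute units entirely, which is what makes structured (as opposed to unstructured) sparsity attractive in hardware.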
Learning and Interpreting Multi-Multi-Instance Learning Networks
We introduce an extension of the multi-instance learning problem in which examples are organized as nested bags of instances (e.g., a document could be represented as a bag of sentences, which in turn are bags of words). This framework can be useful in various scenarios, such as text and image classification, but also in supervised learning over graphs. As a further advantage, multi-multi-instance learning enables a particular way of interpreting predictions and the decision function. Our approach is based on a special neural network layer, called the bag-layer, whose units aggregate bags of inputs of arbitrary size. We prove theoretically that the associated class of functions contains all Boolean functions over sets of sets of instances, and we provide empirical evidence that functions of this kind can actually be learned on semi-synthetic datasets. We finally present experiments on text classification and on citation and social graph data, which show that our model obtains competitive accuracy compared to other approaches, such as convolutional networks on graphs, while at the same time supporting a general approach to interpreting the learnt model as well as explaining individual predictions.
Comment: JML
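A bag-layer as described, units that aggregate bags of inputs of arbitrary size into a fixed-size vector, can be sketched in plain Python; the shared linear map, ReLU, and max aggregation are common choices assumed here rather than taken from the paper:

```python
def bag_layer(bag, weights, agg=max):
    """Minimal bag-layer sketch: transform each instance in a bag with a
    shared linear map followed by ReLU, then aggregate element-wise over
    the bag, yielding a fixed-size vector regardless of bag size.

    `bag` is a list of equal-length feature vectors; `weights` is a
    matrix given as a list of rows, one row per output unit.
    """
    def unit(vec, row):
        return max(0.0, sum(w * x for w, x in zip(row, vec)))  # ReLU(w.x)

    transformed = [[unit(vec, row) for row in weights] for vec in bag]
    # Element-wise aggregation (max by default) across all instances.
    return [agg(vals) for vals in zip(*transformed)]

def mmil_forward(bag_of_bags, w_inner, w_outer):
    """Nested (multi-multi-instance) case: aggregate each inner bag first,
    then aggregate the resulting vectors at the outer level."""
    inner_reprs = [bag_layer(inner, w_inner) for inner in bag_of_bags]
    return bag_layer(inner_reprs, w_outer)
```

With max aggregation, tracing which instance attains the maximum in each unit is one way the per-instance interpretation of predictions mentioned above can be realized.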
Universal Algorithmic Intelligence: A mathematical top-down approach
Sequential decision theory formally solves the problem of rational agents in
uncertain worlds if the true environmental prior probability distribution is
known. Solomonoff's theory of universal induction formally solves the problem
of sequence prediction for unknown prior distribution. We combine both ideas
and get a parameter-free theory of universal Artificial Intelligence. We give
strong arguments that the resulting AIXI model is the most intelligent unbiased
agent possible. We outline how the AIXI model can formally solve a number of
problem classes, including sequence prediction, strategic games, function
minimization, reinforcement and supervised learning. The major drawback of the
AIXI model is that it is uncomputable. To overcome this problem, we construct a
modified algorithm AIXItl that is still effectively more intelligent than any
other time t and length l bounded agent. The computation time of AIXItl is of the order t·2^l. The discussion includes formal definitions of intelligence order relations, the horizon problem, and the relations of the AIXI theory to other AI approaches.
Comment: 70 pages