A Time and Space Efficient Junction Tree Architecture
The junction tree algorithm is a way of computing marginals of boolean
multivariate probability distributions that factorise over sets of random
variables. The junction tree algorithm first constructs a tree called a
junction tree whose vertices are sets of random variables. The algorithm then
performs a generalised version of belief propagation on the junction tree. The
Shafer-Shenoy and Hugin architectures are two ways to perform this belief
propagation that trade off time and space complexities in different ways: Hugin
propagation is at least as fast as Shafer-Shenoy propagation and, when there
are large vertices of high degree, is significantly faster. However,
this speed increase comes at the cost of an increased space complexity. This
paper first introduces a simple novel architecture, ARCH-1, which has the best
of both worlds: the speed of Hugin propagation and the low space requirements
of Shafer-Shenoy propagation. A more complicated novel architecture, ARCH-2, is
then introduced which has, up to a factor only linear in the maximum
cardinality of any vertex, time and space complexities at least as good as
ARCH-1, and when there are large vertices of high degree is significantly
faster than ARCH-1.
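As a toy illustration of the message passing described above, here is a minimal sketch (the potentials, variable names, and two-clique tree are invented for the example, and this is plain Shafer-Shenoy-style passing rather than either novel architecture):

```python
from itertools import product

def marginalize(factor, scope, keep):
    # Sum a factor (dict: assignment tuple -> value) over every
    # variable of `scope` that is not in `keep`.
    idx = [scope.index(v) for v in keep]
    out = {}
    for assign, val in factor.items():
        key = tuple(assign[i] for i in idx)
        out[key] = out.get(key, 0.0) + val
    return out

# Two cliques {A, B} and {B, C} joined by separator {B}:
# the smallest possible junction tree, over binary variables.
phi_ab = {(a, b): [[0.3, 0.7], [0.9, 0.1]][a][b] for a, b in product((0, 1), repeat=2)}
phi_bc = {(b, c): [[0.5, 0.5], [0.2, 0.8]][b][c] for b, c in product((0, 1), repeat=2)}

# Message from clique {A, B} to clique {B, C}: sum out A.
msg_b = marginalize(phi_ab, ('A', 'B'), ('B',))

# The receiving clique multiplies in the message, then sums out B,
# giving the marginal of C up to normalisation.
belief_bc = {(b, c): phi_bc[(b, c)] * msg_b[(b,)] for b, c in product((0, 1), repeat=2)}
p_c = marginalize(belief_bc, ('B', 'C'), ('C',))
Z = sum(p_c.values())
p_c = {k: v / Z for k, v in p_c.items()}
```

The architectures compared in the abstract differ precisely in how such messages and clique beliefs are cached and combined, which is where the time/space trade-off arises.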
Online Matrix Completion with Side Information
We give an online algorithm and prove novel mistake and regret bounds for
online binary matrix completion with side information. The mistake bounds we
prove are of the form $\tilde{O}(D/\gamma^2)$. The term $1/\gamma^2$ is
analogous to the usual margin term in SVM (perceptron) bounds. More
specifically, if we assume that there is some factorization of the underlying
$m \times n$ matrix into $PQ^\top$, where the rows of $P$ are interpreted
as "classifiers" in $\mathbb{R}^d$ and the rows of $Q$ as "instances" in
$\mathbb{R}^d$, then $\gamma$ is the maximum (normalized) margin over all
factorizations $PQ^\top$ consistent with the observed matrix. The
quasi-dimension term $D$ measures the quality of side information. In the
presence of vacuous side information, $D = m + n$. However, if the side
information is predictive of the underlying factorization of the matrix, then
in an ideal case $D \in O(k + \ell)$, where $k$ is the number of distinct row
factors and $\ell$ is the number of distinct column factors. We additionally
provide a generalization of our algorithm to the inductive setting. In this
setting, we provide an example where the side information is not directly
specified in advance. For this example, the quasi-dimension $D$ is now bounded
by $O(k^2 + \ell^2)$.
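To make the margin quantity concrete, here is a small sketch (my own illustrative code, not the paper's) that computes the normalized margin of one candidate factorization over a set of observed $\pm 1$ entries; the quantity in the bound is the maximum of this over all consistent factorizations:

```python
import numpy as np

def normalized_margin(P, Q, observed):
    # Margin of a single factorization P Q^T with respect to observed
    # entries: worst-case signed, normalized inner product.
    # `observed` is a list of (row index, column index, +/-1 label).
    margins = []
    for i, j, y in observed:
        m = y * (P[i] @ Q[j]) / (np.linalg.norm(P[i]) * np.linalg.norm(Q[j]))
        margins.append(m)
    return min(margins)
```

A factorization is consistent with the observations exactly when this value is positive, mirroring the role of the margin in perceptron bounds.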
On similarity prediction and pairwise clustering
We consider the problem of clustering a finite set of items from pairwise similarity information. Unlike what is done in the literature on this subject, we do so in a passive learning setting, and with no specific constraints on the cluster shapes other than their size. We investigate the problem in different settings: i. an online setting, where we provide a tight characterization of the prediction complexity in the mistake bound model, and ii. a standard stochastic batch setting, where we give tight upper and lower bounds on the achievable generalization error. Prediction performance is measured both in terms of the ability to recover the similarity function encoding the hidden clustering and in terms of how well we classify each item within the set. The proposed algorithms are time efficient.
Efficient algorithms for online learning over graphs
In this thesis we consider the problem of online learning with labelled graphs, in particular designing algorithms that can perform this task quickly and with low memory requirements. We consider the tasks of Classification (in which we are asked to predict the labels of vertices) and Similarity Prediction (in which we are asked to predict whether two given vertices have the same label). The first half of the thesis considers non-probabilistic online learning, where there is no probability distribution on the labelling and we bound the number of mistakes of an algorithm by a function of the labelling's complexity (i.e. its "naturalness"), often the cut-size. The second half of the thesis considers probabilistic machine learning, in which we have a known probability distribution on the labelling. Before considering probabilistic online learning we first analyse the junction tree algorithm, on which we base our online algorithms, and design a new version of it, superior to the otherwise current state of the art. Explicitly, the novel contributions of this thesis are as follows:
• A new algorithm for online prediction of the labelling of a graph, which has better performance than previous algorithms on certain graph and labelling families.
• Two algorithms for online similarity prediction on a graph (a novel problem solved in this thesis). One performs very well, whilst the other performs less well but runs exponentially faster.
• A new state-of-the-art junction tree algorithm (better than before in terms of time and space complexity), as well as an application of it to the problem of online learning in an Ising model.
• An algorithm that, in linear time, finds the optimal junction tree for online inference in tree-structured Ising models, the resulting online junction tree algorithm being far superior to the previous state of the art.
All claims in this thesis are supported by mathematical proofs.
Synthesis of (E)-2-Chlorhomovinyl Ribose to Examine the Mechanism of 6-Chlorhomovinyl Adenosine
Nucleosides function as important agents in fundamental human metabolism, and modification of these nucleosides as functional prodrugs provides a broad range of activity. Various modified nucleosides exhibit inhibitory activity towards enzymes and may possess anti-viral and anti-cancer properties. These properties are often obtained by modifying the nucleobase moiety or the sugar moiety of synthesized novel nucleosides. In line with my project aim, the sugar moiety of 6-chlorohomovinyl adenosine will be synthesized as the novel nucleoside (E)-2-chlorovinylribose in order to examine it as the active component of 6-chlorohomovinyl adenosine. 6-Chlorohomovinyl adenosine 1 contains an active component that works specifically against Trypanosoma brucei, a parasite associated with African trypanosomiasis, a widespread disease within East and South Africa. This compound has also demonstrated positive results in animal testing and is in the second phase of drug testing. As the sugar analog of 1, (E)-2-chlorovinylribose 4 will be examined to understand the mechanistic properties and active sites of 1. The proposed synthesis of the target compound begins with oxidative cleavage at C5 and C6 of 2, which is commercially available and is the primary starting material needed to make the desired product, in addition to the necessary reagents. Following benzyl substitution, 2 is converted into 3 by means of periodic acid and ethyl acetate, and then formed into the 5'-OH ribose by means of sodium borohydride reduction. Furthermore, Moffatt oxidation followed by Wittig homologation with the tosylated ylide (PPh3CH=Ts) and Bu3SnH with AIBN is applied to the compound to form a new bond at the 6' carbon for conversion into the desired chloro group at C6'. Finally, deprotection of the isopropyl group and the benzyl group affords the target 4.
This synthesis of compound 4 would provide evidence as to how 1 functions within the cellular environment.
MaxHedge: Maximising a Maximum Online
We introduce a new online learning framework where, at each trial, the
learner is required to select a subset of actions from a given known action
set. Each action is associated with an energy value, a reward and a cost. The
sum of the energies of the actions selected cannot exceed a given energy
budget. The goal is to maximise the cumulative profit, where the profit
obtained on a single trial is defined as the difference between the maximum
reward among the selected actions and the sum of their costs. Action energy
values and the budget are known and fixed. All rewards and costs associated
with each action change over time and are revealed at each trial only after the
learner's selection of actions. Our framework encompasses several online
learning problems where the environment changes over time; and the solution
trades off between minimising the costs and maximising the maximum reward of
the selected subset of actions, while being constrained to an action energy
budget. The algorithm that we propose is efficient and general in that it may
be specialised to multiple natural online combinatorial problems. (Published in AISTATS 2019.)
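The profit objective described above is easy to state in code. A minimal sketch follows (illustrative only: a brute-force offline oracle over subsets, not the efficient online algorithm the paper proposes):

```python
from itertools import combinations

def profit(subset, reward, cost):
    # Profit of a selected subset: the maximum reward among the
    # selected actions minus the sum of their costs.
    if not subset:
        return 0.0
    return max(reward[i] for i in subset) - sum(cost[i] for i in subset)

def best_feasible_subset(energy, budget, reward, cost):
    # Brute-force oracle over all subsets whose total energy fits
    # the budget (exponential time; only for tiny toy instances).
    n = len(energy)
    best, best_p = (), 0.0
    for r in range(1, n + 1):
        for s in combinations(range(n), r):
            if sum(energy[i] for i in s) <= budget:
                p = profit(s, reward, cost)
                if p > best_p:
                    best, best_p = s, p
    return best, best_p
```

Because only the single largest reward in the subset counts while every cost is paid, the oracle frequently prefers small subsets; the online problem is to approach this benchmark when rewards and costs are revealed only after each selection.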
Mistake Bounds for Binary Matrix Completion
We study the problem of completing a binary matrix in an online learning setting. On each trial we predict a matrix entry and then receive the true entry. We propose a Matrix Exponentiated Gradient algorithm [1] to solve this problem. We provide a mistake bound for the algorithm, which scales with the margin complexity [2, 3] of the underlying matrix. The bound suggests an interpretation where each row of the matrix is a prediction task over a finite set of objects, the columns. Using this we show that the algorithm makes a number of mistakes which is comparable, up to a logarithmic factor, to the number of mistakes made by the Kernel Perceptron with an optimal kernel in hindsight. We discuss applications of the algorithm to predicting as well as the best biclustering and to the problem of predicting the labeling of a graph without knowing the graph in advance.
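For intuition on the update family named here, a sketch of a generic Matrix Exponentiated Gradient step (this is the standard trace-normalised matrix EG update, not necessarily the exact variant analysed in the paper):

```python
import numpy as np

def matrix_eg_update(W, G, eta):
    # Matrix Exponentiated Gradient step: move in log-space against
    # the (symmetric) gradient G, then renormalise to trace one.
    logW = _sym_func(W, np.log)
    W_new = _sym_func(logW - eta * G, np.exp)
    return W_new / np.trace(W_new)

def _sym_func(A, f):
    # Apply a scalar function f to the eigenvalues of symmetric A,
    # i.e. compute V diag(f(lambda)) V^T from A = V diag(lambda) V^T.
    vals, vecs = np.linalg.eigh(A)
    return (vecs * f(vals)) @ vecs.T
```

Working in the eigenbasis keeps the iterate symmetric positive definite with unit trace, the matrix analogue of the probability simplex maintained by the scalar exponentiated-gradient update.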
Online Similarity Prediction of Networked Data from Known and Unknown Graphs
We consider online similarity prediction problems over networked data. We begin by relating this task to the more standard class prediction problem, showing that, given an arbitrary algorithm for class prediction, we can construct an algorithm for similarity prediction with "nearly" the same mistake bound, and vice versa. After noticing that this general construction is computationally infeasible, we target our study to feasible similarity prediction algorithms on networked data. We initially assume that the network structure is known to the learner. Here we observe that Matrix Winnow \cite{w07} has a near-optimal mistake guarantee, at the price of cubic prediction time per round. This motivates our effort for an efficient implementation of a Perceptron algorithm with a weaker mistake guarantee but with only poly-logarithmic prediction time. Our focus then turns to the challenging case of networks whose structure is initially unknown to the learner. In this novel setting, where the network structure is only incrementally revealed, we obtain a mistake-bounded algorithm with a quadratic prediction time per round.
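The relation between class prediction and similarity prediction can be illustrated in its naive direction (a deliberately simple sketch; the general construction discussed above, and its converse, are more involved):

```python
def similarity_from_class_predictor(predict_class):
    # Reduction sketch: declare two vertices "similar" exactly when a
    # class-prediction subroutine assigns them the same label.
    # Each similarity mistake implies at least one class mistake.
    def predict_similar(u, v):
        return predict_class(u) == predict_class(v)
    return predict_similar
```

For example, wrapping a (hypothetical) vertex classifier `lambda v: v % 2` yields a predictor that labels two vertices similar iff they share parity; the interesting question, as the abstract notes, is doing this with both good mistake bounds and feasible per-round computation.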