89,644 research outputs found
Universal Codes from Switching Strategies
We discuss algorithms for combining sequential prediction strategies, a task
which can be viewed as a natural generalisation of the concept of universal
coding. We describe a graphical language based on Hidden Markov Models for
defining prediction strategies, and we provide both existing and new models as
examples. The models include efficient, parameterless models for switching
between the input strategies over time, including a model for the case where
switches tend to occur in clusters, and finally a new model for the scenario
where the prediction strategies have a known relationship, and where jumps are
typically between strongly related ones. This last model is relevant for coding
time series data where parameter drift is expected. As theoretical ontributions
we introduce an interpolation construction that is useful in the development
and analysis of new algorithms, and we establish a new sophisticated lemma for
analysing the individual sequence regret of parameterised models
Gaze Embeddings for Zero-Shot Image Classification
Zero-shot image classification using auxiliary information, such as
attributes describing discriminative object properties, requires time-consuming
annotation by domain experts. We instead propose a method that relies on human
gaze as auxiliary information, exploiting that even non-expert users have a
natural ability to judge class membership. We present a data collection
paradigm that involves a discrimination task to increase the information
content obtained from gaze data. Our method extracts discriminative descriptors
from the data and learns a compatibility function between image and gaze using
three novel gaze embeddings: Gaze Histograms (GH), Gaze Features with Grid
(GFG) and Gaze Features with Sequence (GFS). We introduce two new
gaze-annotated datasets for fine-grained image classification and show that
human gaze data is indeed class discriminative, provides a competitive
alternative to expert-annotated attributes, and outperforms other baselines for
zero-shot image classification
The on-line shortest path problem under partial monitoring
The on-line shortest path problem is considered under various models of
partial monitoring. Given a weighted directed acyclic graph whose edge weights
can change in an arbitrary (adversarial) way, a decision maker has to choose in
each round of a game a path between two distinguished vertices such that the
loss of the chosen path (defined as the sum of the weights of its composing
edges) be as small as possible. In a setting generalizing the multi-armed
bandit problem, after choosing a path, the decision maker learns only the
weights of those edges that belong to the chosen path. For this problem, an
algorithm is given whose average cumulative loss in n rounds exceeds that of
the best path, matched off-line to the entire sequence of the edge weights, by
a quantity that is proportional to 1/\sqrt{n} and depends only polynomially on
the number of edges of the graph. The algorithm can be implemented with linear
complexity in the number of rounds n and in the number of edges. An extension
to the so-called label efficient setting is also given, in which the decision
maker is informed about the weights of the edges corresponding to the chosen
path at a total of m << n time instances. Another extension is shown where the
decision maker competes against a time-varying path, a generalization of the
problem of tracking the best expert. A version of the multi-armed bandit
setting for shortest path is also discussed where the decision maker learns
only the total weight of the chosen path but not the weights of the individual
edges on the path. Applications to routing in packet switched networks along
with simulation results are also presented.Comment: 35 page
Tracking Dengue Epidemics using Twitter Content Classification and Topic Modelling
Detecting and preventing outbreaks of mosquito-borne diseases such as Dengue
and Zika in Brasil and other tropical regions has long been a priority for
governments in affected areas. Streaming social media content, such as Twitter,
is increasingly being used for health vigilance applications such as flu
detection. However, previous work has not addressed the complexity of drastic
seasonal changes on Twitter content across multiple epidemic outbreaks. In
order to address this gap, this paper contrasts two complementary approaches to
detecting Twitter content that is relevant for Dengue outbreak detection,
namely supervised classification and unsupervised clustering using topic
modelling. Each approach has benefits and shortcomings. Our classifier achieves
a prediction accuracy of about 80\% based on a small training set of about
1,000 instances, but the need for manual annotation makes it hard to track
seasonal changes in the nature of the epidemics, such as the emergence of new
types of virus in certain geographical locations. In contrast, LDA-based topic
modelling scales well, generating cohesive and well-separated clusters from
larger samples. While clusters can be easily re-generated following changes in
epidemics, however, this approach makes it hard to clearly segregate relevant
tweets into well-defined clusters.Comment: Procs. SoWeMine - co-located with ICWE 2016. 2016, Lugano,
Switzerlan
Shifting Regret, Mirror Descent, and Matrices
We consider the problem of online prediction in changing environments. In this framework the performance of a predictor is evaluated as the loss relative to an arbitrarily changing predictor, whose individual components come from a base class of predictors. Typical results in the literature consider different base classes (experts, linear predictors on the simplex, etc.) separately. Introducing an arbitrary mapping inside the mirror decent algorithm, we provide a framework that unifies and extends existing results. As an example, we prove new shifting regret bounds for matrix prediction problems
- …