    Learning the Switching Rate by Discretising Bernoulli Sources Online

    Universal Codes from Switching Strategies

    We discuss algorithms for combining sequential prediction strategies, a task which can be viewed as a natural generalisation of the concept of universal coding. We describe a graphical language based on Hidden Markov Models for defining prediction strategies, and we provide both existing and new models as examples. The models include efficient, parameterless models for switching between the input strategies over time, including a model for the case where switches tend to occur in clusters, and a new model for the scenario where the prediction strategies have a known relationship and jumps are typically between strongly related ones. This last model is relevant for coding time series data where parameter drift is expected. As theoretical contributions, we introduce an interpolation construction that is useful in the development and analysis of new algorithms, and we establish a new, sophisticated lemma for analysing the individual-sequence regret of parameterised models.
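    As a concrete point of reference for switching between prediction strategies, here is a minimal sketch of the classic Fixed-Share rule (Herbster and Warmuth), which mixes expert predictions while redistributing a fraction alpha of the weight each round so the best expert is allowed to change. This is standard background with an explicit switching rate, not the parameterless HMM constructions described above.

```python
import numpy as np

def fixed_share(expert_probs, alpha):
    """Combine sequential prediction strategies with Fixed-Share.

    expert_probs: array of shape (T, K); expert_probs[t, k] is the
        probability expert k assigned to the outcome observed at time t.
    alpha: switching rate in [0, 1]; the weight mass redistributed
        uniformly each round, allowing switches between experts.
    Returns the probabilities the mixture assigned to the outcomes.
    """
    T, K = expert_probs.shape
    w = np.full(K, 1.0 / K)                # uniform prior over experts
    mixture = np.empty(T)
    for t in range(T):
        mixture[t] = w @ expert_probs[t]   # predict with current mixture
        w *= expert_probs[t]               # Bayes update on observed outcome
        w /= w.sum()
        w = (1 - alpha) * w + alpha / K    # share weight to permit switches
    return mixture
```

    The cumulative log loss of the mixture then tracks the best sequence of experts with few switches, up to an additive regret term.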

    Online Isotonic Regression

    We consider the online version of the isotonic regression problem. Given a set of linearly ordered points (e.g., on the real line), the learner must predict labels sequentially at adversarially chosen positions and is evaluated by her total squared loss compared against the best isotonic (non-decreasing) function in hindsight. We survey several standard online learning algorithms and show that none of them achieves the optimal regret exponent; in fact, most of them (including Online Gradient Descent, Follow the Leader and Exponential Weights) incur linear regret. We then prove that the Exponential Weights algorithm played over a covering net of isotonic functions has regret bounded by $O(T^{1/3} \log^{2/3}(T))$, and we present a matching $\Omega(T^{1/3})$ lower bound on regret. We provide a computationally efficient version of this algorithm. We also analyze the noise-free case, in which the revealed labels are isotonic, and show that the bound can be improved to $O(\log T)$, or even to $O(1)$ when the labels are revealed in isotonic order. Finally, we extend the analysis beyond the squared loss and give bounds for the entropic loss and the absolute loss.
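    For reference, the hindsight comparator above, the best isotonic fit under squared loss, can be computed with the classic pool-adjacent-violators algorithm (PAVA). The sketch below is textbook background, not the paper's online algorithm.

```python
def isotonic_fit(y):
    """Best non-decreasing fit to y under squared loss, via PAVA.

    Maintains a stack of blocks, each a [mean, size] pair; whenever two
    adjacent blocks violate monotonicity, they are pooled and replaced
    by their weighted mean.
    """
    blocks = []                           # stack of [mean, size]
    for v in y:
        blocks.append([float(v), 1])
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, n2 = blocks.pop()
            m1, n1 = blocks.pop()
            blocks.append([(m1 * n1 + m2 * n2) / (n1 + n2), n1 + n2])
    fit = []
    for mean, size in blocks:
        fit.extend([mean] * size)         # expand blocks to full length
    return fit

# Example: isotonic_fit([1, 3, 2, 4]) returns [1.0, 2.5, 2.5, 4.0]
```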

    EUROPEAN CONFERENCE ON QUEUEING THEORY 2016

    This booklet contains the proceedings of the second European Conference on Queueing Theory (ECQT), held from the 18th to the 20th of July 2016 at the engineering school ENSEEIHT, Toulouse, France. ECQT is a biennial event where scientists and technicians in queueing theory and related areas get together to promote research, encourage interaction and exchange ideas. The spirit of the conference is to be a queueing event organized from within Europe, but open to participants from all over the world. The technical program of the 2016 edition consisted of 112 presentations organized in 29 sessions covering all trends in queueing theory, including the development of the theory, methodology advances, computational aspects and applications. Another exciting feature of ECQT 2016 was the institution of the Takács Award for an outstanding PhD thesis on "Queueing Theory and its Applications".

    Detecting changes in high frequency data streams, with applications

    In recent years, problems relating to the analysis of data streams have become widespread. A data stream is a collection of time-ordered observations $x_1, x_2, \dots$ generated from the random variables $X_1, X_2, \dots$. It is assumed that the observations are univariate and independent, and that they arrive in discrete time. Unlike traditional sequential analysis problems considered by statisticians, the size of a data stream is not assumed to be fixed, and new observations may be received over time. The rate at which these observations are received can be very high, perhaps several thousand every second. Computational efficiency is therefore very important, and methods used for analysis must be able to cope with potentially huge data sets. This paper is concerned with the task of detecting whether a data stream contains a change point, and extends traditional methods for sequential change detection to the streaming context. We focus on two different settings of the change point problem. The first is nonparametric change detection where, in contrast to most of the existing literature, we assume that nothing is known about either the pre- or post-change stream distribution. The task is then to detect a change from an unknown base distribution $F_0$ to an unknown distribution $F_1$. Further, we impose the constraint that change detection methods must have a bounded rate of false positives, which is important when it comes to assessing the significance of discovered change points. It is this constraint which makes the nonparametric problem difficult. We present several novel methods for this problem and compare their performance via extensive experimental analysis. The second strand of our research is Bernoulli change detection, with application to streaming classification. In this setting, we assume a parametric form for the stream distribution, but one where both the pre- and post-change parameters are unknown. The task is again to detect changes while controlling the rate of false positives. After developing two different methods for the pure Bernoulli change detection task, we show how our approach can be deployed in streaming classification applications. Here, the goal is to classify objects into one of several categories. In the streaming case, the optimal classification rule can change over time, and classification techniques which are not able to adapt to these changes will suffer performance degradation. We show that by focusing only on the frequency of errors produced by the classifier, we can treat this as a Bernoulli change detection problem, and we again perform extensive experimental analysis to show the value of our methods.
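    As a simple baseline for the Bernoulli setting, the sketch below implements the textbook CUSUM detector, which assumes both the pre- and post-change success probabilities are known. The unknown-parameter and nonparametric settings studied above are precisely what this classical recipe cannot handle; the sketch only illustrates the streaming detection loop.

```python
from math import log

def bernoulli_cusum(stream, p0, p1, threshold):
    """Textbook CUSUM for a rate change in a Bernoulli stream.

    stream: iterable of 0/1 observations, processed one at a time.
    p0, p1: assumed pre-change and post-change success probabilities.
    threshold: detection threshold; larger values lower the
        false-positive rate at the cost of slower detection.
    Returns the index at which a change is declared, or None.
    """
    llr = {1: log(p1 / p0), 0: log((1 - p1) / (1 - p0))}  # LLR increments
    s = 0.0
    for t, x in enumerate(stream):
        s = max(0.0, s + llr[x])      # one-sided statistic, reset at zero
        if s > threshold:
            return t
    return None
```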

    Novel Datasets, User Interfaces and Learner Models to Improve Learner Engagement Prediction on Educational Videos

    With the emergence of Open Educational Resources (OERs), educational content creation has rapidly scaled up, making a large collection of new materials available. Among these we find educational videos, the most popular modality for transferring knowledge in the technology-enhanced learning paradigm. The rapid creation of learning resources opens up opportunities for sustainable education, as the potential to personalise and recommend materials that align with individual users' interests, goals, knowledge level, language and stylistic preferences increases. However, the quality and topical coverage of these materials can vary significantly, posing significant challenges in managing this large collection, including the risk of poor user experience and engagement with these materials. The scarcity of support resources such as public datasets is another challenge that slows down the development of tools in this research area. This thesis develops a set of novel tools that improve the recommendation of educational videos. Two novel datasets and an e-learning platform with a novel user interface are developed to support the offline and online testing of recommendation models for educational videos. Furthermore, a set of learner models is developed that accounts for learner interests and knowledge and for the novelty and popularity of content. These models are then integrated into a single learner model that accounts for all of these factors simultaneously. User studies conducted on the novel user interface show that the new interface encourages users to explore the topical content more rigorously before making relevance judgements about educational videos. Offline experiments on the newly constructed datasets show that the proposed learner models significantly outperform their relevant baselines.

    Interval-censored Hawkes processes

    Interval-censored data records only the aggregated counts of events during specific time intervals (such as the number of patients admitted to a hospital, or the volume of vehicles passing traffic loop detectors) and not the exact occurrence times of the events. It is currently not understood how to fit Hawkes point processes to this kind of data: the typical loss function (the point-process log-likelihood) cannot be computed without exact event times, and the Hawkes process lacks the independent-increments property needed to use the Poisson likelihood instead. This work builds a novel point process, a set of tools, and approximations for fitting Hawkes processes in interval-censored scenarios. First, we define the Mean Behavior Poisson process (MBPP), a novel Poisson process with a direct parameter correspondence to the popular self-exciting Hawkes process. We fit the MBPP in the interval-censored setting using an interval-censored Poisson log-likelihood (IC-LL), and we use the parameter equivalence to uncover the parameters of the associated Hawkes process. Second, we introduce two novel exogenous functions to distinguish exogenous from endogenous events: the multi-impulse exogenous function, for when the exogenous events are observed as event times, and the latent homogeneous Poisson process exogenous function, for when the exogenous events are presented as interval-censored volumes. Third, we provide several approximation methods to estimate the intensity and compensator functions of the MBPP when no analytical solution exists. Fourth and finally, we connect the interval-censored loss of the MBPP to a broader class of Bregman-divergence-based functions, and we use this connection to show that the popularity estimation algorithm Hawkes Intensity Process (HIP) is a particular case of the MBPP. We verify our models through empirical testing on synthetic and real-world data.
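    To make the IC-LL concrete: for a Poisson process, the count in each interval is Poisson-distributed with mean equal to the compensator increment over that interval, and counts in disjoint intervals are independent. A minimal sketch of that likelihood follows; the interface is illustrative, not the authors' code.

```python
from math import lgamma, log

def interval_censored_poisson_ll(counts, compensator_increments):
    """Poisson log-likelihood for interval-censored event counts.

    counts[i]: number of events observed in interval i.
    compensator_increments[i]: integral of the model intensity over
        interval i, i.e. the expected count under the model.
    Counts in disjoint intervals of a Poisson process are independent
    Poisson variables, so the log-likelihood is a sum of Poisson log-pmfs.
    """
    ll = 0.0
    for n, lam in zip(counts, compensator_increments):
        ll += n * log(lam) - lam - lgamma(n + 1)
    return ll
```

    Fitting then amounts to maximising this quantity over the model parameters, which enter through the compensator increments; the MBPP construction is what justifies applying a Poisson likelihood to data generated by a Hawkes-like process.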

    Computational roles of cortico-cerebellar loops in temporal credit assignment

    Animal survival depends on behavioural adaptation to the environment, which is thought to be enabled by plasticity in neural circuits. However, the laws which govern neural plasticity are unclear. From a functional perspective, it is desirable to correctly identify, or assign "credit" to, the neurons or synapses responsible for a task decision and the subsequent performance. In biological circuits, the intricate, non-linear interactions within neural networks make it highly challenging to assign credit appropriately; in the temporal domain, this is known as the temporal credit assignment (TCA) problem. This thesis considers the role of the cerebellum, a powerful subcortical structure with strong error-guided plasticity rules, as a solution to TCA in the brain. In particular, I use artificial neural networks to model and understand the mechanisms by which the cerebellum can support learning in the neocortex via the cortico-cerebellar loop. I introduce two distinct but compatible computational models of cortico-cerebellar interaction. The first model asserts that the cerebellum provides the neocortex with predictive feedback, modelled in the form of error gradients, with respect to its current activity. This predictive feedback enables better credit assignment in the neocortex and effectively removes the lock between feedforward and feedback processing in cortical networks. The model captures observed long-term deficits associated with cerebellar dysfunction, namely cerebellar dysmetria, in both the motor and non-motor domains. Predictions are also made with respect to the alignment of cortico-cerebellar activity during learning and the optimal task conditions for cerebellar contribution. The second model also addresses the role of the cerebellum in learning, but considers its ability to instantaneously drive the cortex towards desired task dynamics. Unlike the first model, it does not assume that any local cortical plasticity takes place: task-directed learning can effectively be outsourced to the cerebellum. This model captures recent optogenetic studies in mice which show that the cerebellum is a necessary component for the maintenance of desired cortical dynamics and the ensuing behaviour. I also show that this driving input can eventually be used as a teaching signal for the cortical circuit, thereby conceptually unifying the two models. Overall, this thesis explores the computational role of the cerebellum and cortico-cerebellar loops in task acquisition and maintenance in the brain.
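    The first model is closely related to the machine-learning idea of synthetic gradients (decoupled neural interfaces), in which a side module learns to predict a layer's error gradient so the layer can update without waiting for backpropagated feedback. Below is a deliberately tiny linear sketch of that general idea; the task, shapes and learning rates are illustrative assumptions, not the thesis's model.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out = 8, 4
W = np.zeros((d_out, d_in))                 # toy "cortical" layer: y = W x
W_target = rng.normal(size=(d_out, d_in))   # defines the regression task
C = np.zeros((d_out, d_in))                 # "cerebellar" gradient predictor

lr_w, lr_c = 0.05, 0.05
for step in range(5000):
    x = rng.normal(size=d_in)
    y = W @ x
    g_pred = C @ x                          # predicted gradient dL/dy
    W -= lr_w * np.outer(g_pred, x)         # cortex updates immediately
    g_true = y - W_target @ x               # true gradient of 0.5*||y - y*||^2
    C -= lr_c * np.outer(g_pred - g_true, x)  # cerebellum tracks true gradient
```

    In this linear toy, C converges towards W - W_target, so the cortical update driven by the cerebellar prediction approximates a true gradient step without waiting for the error signal.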