Gaussian processes for state space models and change point detection
This thesis details several applications of Gaussian processes (GPs) for enhanced time series modeling.
We first cover different approaches for using Gaussian processes in time series problems.
These are extended to the state space approach to time series in two different problems.
We also combine Gaussian processes and Bayesian online change point detection (BOCPD) to increase the generality of the Gaussian process time series methods.
These methodologies are evaluated for predictive performance on six real-world data sets: three environmental, one financial, one biological, and one from industrial well drilling.
Gaussian processes are capable of generalizing standard linear time series models.
We cover two approaches: the Gaussian process time series model (GPTS) and the autoregressive Gaussian process (ARGP).
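As a minimal sketch of the autoregressive idea behind the ARGP (the function and parameter names below are illustrative, not the thesis's implementation): the series is turned into lagged input vectors, and a GP regression from the previous p values predicts the next one.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between rows of A and rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def argp_predict(y, p, x_star, noise=1e-2):
    """Autoregressive GP sketch: regress y[t] on the previous p values.

    y      : observed series, shape (T,)
    p      : autoregressive order
    x_star : length-p vector of lagged values to predict from
    """
    # Lagged design matrix: X[i] = (y[i], ..., y[i+p-1]) -> target y[i+p]
    X = np.stack([y[i:i + p] for i in range(len(y) - p)])
    t = y[p:]
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    k_star = rbf_kernel(x_star[None, :], X)[0]
    alpha = np.linalg.solve(K, t)   # the O(n^3) step that sparse methods attack
    return k_star @ alpha           # GP posterior mean at x_star

# Usage: one-step-ahead prediction of a noisy sine wave
rng = np.random.default_rng(0)
y = np.sin(0.3 * np.arange(60)) + 0.05 * rng.standard_normal(60)
y_next = argp_predict(y, p=5, x_star=y[-5:])
```

The GPTS model differs in using time itself (rather than lagged outputs) as the GP input; the `np.linalg.solve` call is the cubic-cost bottleneck that motivates the complexity reductions discussed next.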
We cover a variety of methods that greatly reduce the computational and memory complexity of Gaussian process approaches, whose exact inference is cubic in the number of observations.
Two different improvements to state space based approaches are covered.
First, Gaussian process inference and learning (GPIL) generalizes linear dynamical systems (LDS), on which the Kalman filter is based, to general nonlinear systems for nonparametric system identification.
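For reference, the linear-Gaussian baseline that GPIL generalizes is the standard Kalman filter predict/update recursion (a textbook sketch, not the GPIL algorithm itself):

```python
import numpy as np

def kalman_step(m, P, y, A, C, Q, R):
    """One predict/update step of the Kalman filter for the LDS
    x_t = A x_{t-1} + w,  y_t = C x_t + v,  with w ~ N(0, Q), v ~ N(0, R).
    GPIL replaces the fixed linear maps A and C with GP-modelled functions."""
    # Predict: propagate the state mean and covariance through the dynamics
    m_pred = A @ m
    P_pred = A @ P @ A.T + Q
    # Update: correct with the new observation y
    S = C @ P_pred @ C.T + R                 # innovation covariance
    K = P_pred @ C.T @ np.linalg.inv(S)      # Kalman gain
    m_new = m_pred + K @ (y - C @ m_pred)
    P_new = (np.eye(len(m)) - K @ C) @ P_pred
    return m_new, P_new

# Usage: a 1-D random walk, repeatedly observing the value 1.0
m, P = np.array([0.0]), np.array([[1.0]])
A = C = np.array([[1.0]])
Q, R = np.array([[0.01]]), np.array([[0.1]])
for _ in range(10):
    m, P = kalman_step(m, P, np.array([1.0]), A, C, Q, R)
```

With repeated consistent observations the state estimate converges toward the observed value and the posterior covariance shrinks to its steady state.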
Second, we address pathologies in the unscented Kalman filter (UKF).
We use Gaussian process optimization (GPO) to learn UKF settings that minimize the potential for sigma point collapse.
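To make the pathology concrete, here is a standard unscented-transform sigma point construction (a sketch using the common alpha/beta/kappa parameterization; the settings searched by GPO in the thesis may differ). With a very small alpha, all sigma points bunch up at the mean, which is the kind of degeneracy loosely referred to as sigma point collapse.

```python
import numpy as np

def sigma_points(m, P, alpha=1e-3, beta=2.0, kappa=0.0):
    """Generate the 2n+1 unscented-transform sigma points and weights.
    alpha, beta, kappa are the tuning parameters a GPO search would
    range over; a tiny alpha concentrates the points at the mean."""
    n = len(m)
    lam = alpha**2 * (n + kappa) - n
    L = np.linalg.cholesky((n + lam) * P)     # scaled matrix square root
    pts = np.vstack([m, m + L.T, m - L.T])    # shape (2n+1, n)
    wm = np.full(2 * n + 1, 1.0 / (2 * (n + lam)))
    wm[0] = lam / (n + lam)
    wc = wm.copy()
    wc[0] += 1.0 - alpha**2 + beta
    return pts, wm, wc

# With a tiny alpha the off-centre points sit almost on the mean:
pts, wm, wc = sigma_points(np.zeros(2), np.eye(2), alpha=1e-3)
spread = np.abs(pts - pts[0]).max()
```

The mean weights still sum to one, but the point spread is only `sqrt((n + lam))` times the Cholesky factor, so for `alpha = 1e-3` the spread is on the order of `1e-3` even with unit covariance.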
We show how to embed the aforementioned Gaussian process time series approaches into a change point framework.
Old data from a previous regime, which would otherwise hinder predictive performance, is automatically and elegantly phased out.
The computational improvements for Gaussian process time series approaches are of even greater use in the change point framework.
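The run-length recursion at the heart of BOCPD can be sketched as follows (a minimal version with a Gaussian observation model of known noise variance and an assumed constant hazard rate; the thesis's GP-based predictive models are more general):

```python
import numpy as np

def bocpd_runlength(xs, hazard=0.05, mu0=0.0, var0=1.0, var_x=1.0):
    """Bayesian online change point detection sketch (Adams & MacKay style):
    maintain a posterior over the run length r_t, the time since the last
    change point, under a Gaussian model with known noise variance var_x.
    Returns the most probable run length after each observation."""
    R = np.array([1.0])                      # P(r_0 = 0) = 1
    mus, vars_ = np.array([mu0]), np.array([var0])
    map_runs = []
    for x in xs:
        # Predictive probability of x under each current run-length hypothesis
        pred_var = vars_ + var_x
        pred = np.exp(-0.5 * (x - mus) ** 2 / pred_var) / np.sqrt(2 * np.pi * pred_var)
        growth = R * pred * (1 - hazard)     # the run continues
        cp = (R * pred * hazard).sum()       # a change point resets the run to 0
        R = np.concatenate([[cp], growth])
        R /= R.sum()
        # Conjugate Gaussian update of each run's posterior over the mean
        post_var = 1.0 / (1.0 / vars_ + 1.0 / var_x)
        post_mu = post_var * (mus / vars_ + x / var_x)
        mus = np.concatenate([[mu0], post_mu])
        vars_ = np.concatenate([[var0], post_var])
        map_runs.append(int(R.argmax()))
    return map_runs

# Usage: the mean shifts at t = 30; the MAP run length should reset near there
rng = np.random.default_rng(1)
xs = np.concatenate([rng.normal(0, 1, 30), rng.normal(5, 1, 30)])
runs = bocpd_runlength(xs)
```

Each observation requires a predictive density per run-length hypothesis, which is why cheaper GP predictions pay off so strongly inside this framework.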
We also present a supervised framework for learning a change point model when change point labels are available in training.

I would like to thank Rockwell Collins, formerly DataPath, Inc., which funded my studentship.
Généralisations de la théorie PAC-bayésienne pour l'apprentissage inductif, l'apprentissage transductif et l'adaptation de domaine
Dean's honour list (Tableau d'honneur), Faculté des études supérieures et postdoctorales, 2015-2016.

In machine learning, the PAC-Bayesian approach provides statistical guarantees on the risk of a weighted majority vote of many classifiers (named voters). The "classical" PAC-Bayesian theory, initiated by McAllester (1999), studies the inductive learning framework under the assumption that the learning examples are independently generated and identically distributed (i.i.d.) according to an unknown but fixed probability distribution. The contributions of the thesis are divided into two major parts. First, we present an analysis of majority votes based on the study of the margin as a random variable. From this follows an original conceptualization of PAC-Bayesian theory. Our very general approach allows us to recover several existing results for the inductive PAC-Bayesian framework and to relate them to one another. Among other things, we highlight the importance of the notion of expected disagreement between the voters. Building upon this improved understanding of PAC-Bayesian theory, gained by studying the inductive framework, we then extend it to two other learning frameworks. On the one hand, we study the transductive framework, in which the learning algorithm knows the descriptions of the examples to be classified. In this context, we state risk bounds on the majority vote that improve those in the current literature. On the other hand, we study the domain adaptation framework, in which the distribution generating the labelled training examples differs from the distribution generating the examples to be classified. Our theoretical analysis is the first PAC-Bayesian approach to this learning framework, and allows us to devise a new machine learning algorithm dedicated to domain adaptation.
Our empirical experiments show that our algorithm is competitive with other state-of-the-art algorithms.
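The margin-as-a-random-variable view and the role of expected disagreement can be illustrated with the empirical C-bound from the PAC-Bayesian literature (a sketch computed on a sample rather than a true distribution; the function name is illustrative). For voters with outputs in {-1, +1}, the majority vote errs exactly when the margin is non-positive, the first margin moment equals one minus twice the Gibbs risk, and the second moment equals one minus twice the expected disagreement.

```python
import numpy as np

def c_bound_empirical(votes, y, rho):
    """Empirical C-bound sketch on the risk of a rho-weighted majority vote.

    votes : (n_voters, n_examples) matrix of voter outputs in {-1, +1}
    y     : (n_examples,) true labels in {-1, +1}
    rho   : (n_voters,) weights summing to 1
    """
    margin = y * (rho @ votes)        # M(x, y) = y * sum_h rho(h) h(x)
    mu1 = margin.mean()               # = 1 - 2 * (Gibbs risk)
    mu2 = (margin ** 2).mean()        # = 1 - 2 * (expected disagreement)
    mv_risk = (margin <= 0).mean()    # majority vote errs iff margin <= 0
    c_bound = 1.0 - mu1 ** 2 / mu2    # valid when mu1 > 0 (Cantelli inequality)
    return mv_risk, c_bound

# Usage: three voters on four examples, uniform weights
votes = np.array([[1, 1, -1, 1], [1, -1, 1, 1], [-1, 1, 1, 1]], float)
y = np.ones(4)
rho = np.full(3, 1 / 3)
mv_risk, c_bound = c_bound_empirical(votes, y, rho)
```

Note how the bound tightens when the voters disagree more (smaller second moment) for the same Gibbs risk, which is the effect the expected-disagreement analysis captures.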
Synergies between machine learning and reasoning - An introduction by the Kay R. Amel group
This paper proposes a tentative and original survey of meeting points between Knowledge Representation and Reasoning (KRR) and Machine Learning (ML), two areas which have developed quite separately over the last four decades. First, some common concerns are identified and discussed, such as the types of representation used, the roles of knowledge and data, the lack or the excess of information, and the need for explanations and causal understanding. Then, the survey is organised in seven sections covering most of the territory where KRR and ML meet. We start with a section dealing with prototypical approaches from the literature on learning and reasoning: Inductive Logic Programming, Statistical Relational Learning, and Neurosymbolic AI, where ideas from rule-based reasoning are combined with ML. Then we focus on the use of various forms of background knowledge in learning, ranging from additional regularisation terms in loss functions, to the problem of aligning symbolic and vector space representations, and the use of knowledge graphs for learning. The next section describes how KRR notions may benefit learning tasks. For instance, constraints can be used, as in declarative data mining, to influence the learned patterns; semantic features can be exploited in low-shot learning to compensate for the lack of data; and analogies can be taken advantage of for learning purposes. Conversely, another section investigates how ML methods may serve KRR goals. For instance, one may learn special kinds of rules such as default rules, fuzzy rules or threshold rules, or special types of information such as constraints or preferences. The section also covers formal concept analysis and methods based on rough sets. Yet another section reviews various interactions between Automated Reasoning and ML, such as the use of ML methods in SAT solving to make reasoning faster.
Then a section deals with works related to model accountability, including explainability and interpretability, fairness, and robustness. Finally, a section covers works on handling imperfect or incomplete data, including the problem of learning from uncertain or coarse data, the use of belief functions for regression, a revision-based view of the EM algorithm, the use of possibility theory in statistics, and the learning of imprecise models. This paper thus aims at a better mutual understanding of research in KRR and ML, and of how they can cooperate. The paper concludes with an extensive bibliography.