35 research outputs found

    Généralisations de la théorie PAC-bayésienne pour l'apprentissage inductif, l'apprentissage transductif et l'adaptation de domaine [Generalizations of PAC-Bayesian theory for inductive learning, transductive learning, and domain adaptation]

    Get PDF
    Honour roll (Tableau d'honneur) of the Faculté des études supérieures et postdoctorales, 2015-2016.
    In machine learning, the PAC-Bayesian approach provides statistical guarantees on the risk of a weighted majority vote of many classifiers (named voters). The "classical" PAC-Bayesian theory, initiated by McAllester (1999), studies the inductive learning framework under the assumption that the learning examples are independently generated and identically distributed (i.i.d.) according to an unknown but fixed probability distribution. The thesis contributions are divided into two major parts. First, we present an analysis of majority votes based on the study of the margin as a random variable, from which follows a new conceptualization of PAC-Bayesian theory. Our very general approach allows us to recover several existing results for the inductive PAC-Bayesian framework and to relate them to one another. Among other things, we highlight the importance of the expected disagreement between the voters. Building upon the improved understanding of PAC-Bayesian theory gained by studying the inductive framework, we then extend it to two other learning frameworks. On the one hand, we study the transductive framework, where the learning algorithm knows the descriptions of the examples to be classified. In this context, we state risk bounds on majority votes that improve on those in the current literature. On the other hand, we study the domain adaptation framework, where the distribution generating the labelled learning examples differs from the distribution generating the examples to be classified. Our theoretical analysis is the first PAC-Bayesian approach to this learning framework, and it allows us to devise a new machine learning algorithm for domain adaptation. Our empirical experiments show that our algorithm is competitive with other state-of-the-art algorithms.
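    The weighted majority vote and the expected disagreement that the abstract highlights can be sketched numerically. A minimal illustration, assuming ±1-valued voters, synthetic votes, and a hypothetical weight vector `rho` (none of these values come from the thesis):

    ```python
    import numpy as np

    # Hypothetical toy setup: n voters give ±1 predictions on m examples,
    # and rho is the (posterior) weight distribution over voters.
    rng = np.random.default_rng(0)
    n_voters, m = 5, 8
    votes = rng.choice([-1, 1], size=(n_voters, m))  # votes[i, j]: voter i on example j
    rho = np.array([0.4, 0.3, 0.1, 0.1, 0.1])        # voter weights, summing to 1

    # rho-weighted majority vote: sign of the weighted sum of votes
    # (a tie, i.e. a weighted sum of exactly 0, yields 0 here).
    majority = np.sign(rho @ votes)

    # Expected disagreement: probability that two voters drawn i.i.d. from rho
    # disagree on an example drawn uniformly from the sample.
    pairwise_disagree = (votes[:, None, :] != votes[None, :, :]).mean(axis=2)
    expected_disagreement = rho @ pairwise_disagree @ rho
    ```

    For ±1 voters, the expected disagreement equals the average over examples of (1 - μ(x)²)/2, where μ(x) is the rho-weighted mean vote (the margin) on x, which is one way the margin-as-random-variable view and the disagreement are linked.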

    Engineering Physics and Mathematics Division progress report for period ending December 31, 1994

    Full text link

    Synergies between machine learning and reasoning - An introduction by the Kay R. Amel group

    Get PDF
    This paper proposes a tentative and original survey of meeting points between Knowledge Representation and Reasoning (KRR) and Machine Learning (ML), two areas which have developed quite separately over the last four decades. First, some common concerns are identified and discussed, such as the types of representation used, the roles of knowledge and data, the lack or excess of information, and the need for explanations and causal understanding. The survey is then organised into seven sections covering most of the territory where KRR and ML meet. We start with a section on prototypical approaches from the literature on learning and reasoning: Inductive Logic Programming, Statistical Relational Learning, and Neurosymbolic AI, where ideas from rule-based reasoning are combined with ML. We then focus on the use of various forms of background knowledge in learning, ranging from additional regularisation terms in loss functions, to the problem of aligning symbolic and vector-space representations, to the use of knowledge graphs for learning. The next section describes how KRR notions may benefit learning tasks: for instance, constraints can be used, as in declarative data mining, to influence the learned patterns; semantic features can be exploited in low-shot learning to compensate for the lack of data; and analogies can be put to use for learning purposes. Conversely, another section investigates how ML methods may serve KRR goals: for instance, one may learn special kinds of rules, such as default rules, fuzzy rules, or threshold rules, or special types of information, such as constraints or preferences. This section also covers formal concept analysis and rough-set-based methods. Yet another section reviews various interactions between Automated Reasoning and ML, such as the use of ML methods in SAT solving to make reasoning faster. A further section deals with work related to model accountability, including explainability and interpretability, fairness, and robustness. Finally, a section covers work on handling imperfect or incomplete data, including the problem of learning from uncertain or coarse data, the use of belief functions for regression, a revision-based view of the EM algorithm, the use of possibility theory in statistics, and the learning of imprecise models. This paper thus aims at a better mutual understanding of research in KRR and ML, and of how they can cooperate. The paper is completed by an abundant bibliography.
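    The survey's example of injecting background knowledge as an extra regularisation term in a loss function can be sketched concretely. A minimal illustration, assuming a logistic model and a hypothetical symbolic rule "feature 0 active implies positive class" (all names, data, and the specific penalty here are illustrative assumptions, not taken from the paper):

    ```python
    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def loss(w, X, y, lam=1.0):
        """Logistic loss plus a knowledge-based regularisation term."""
        p = sigmoid(X @ w)
        # Standard logistic (cross-entropy) data term.
        data = -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
        # Knowledge term: penalise a low positive-class probability whenever
        # the rule's antecedent (feature 0 == 1) holds, regardless of labels.
        antecedent = X[:, 0] == 1
        knowledge = np.mean((1 - p[antecedent]) ** 2) if antecedent.any() else 0.0
        return data + lam * knowledge
    ```

    Setting `lam=0` recovers the purely data-driven loss; increasing `lam` trades data fit against consistency with the symbolic rule, which is the general shape of the knowledge-injection approaches the survey covers.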