Online Learning of k-CNF Boolean Functions

Author: Hutter Marcus
Veness Joel
Publication date: 26/03/2014

This paper revisits the problem of learning a k-CNF Boolean function from examples in the context of online learning under the logarithmic loss. In doing so, we give a Bayesian interpretation to one of Valiant's celebrated PAC learning algorithms, which we then build upon to derive two efficient, online, probabilistic, supervised learning algorithms for predicting the output of an unknown k-CNF Boolean function. We analyze the loss of our methods, and show that the cumulative log-loss can be upper bounded, ignoring logarithmic factors, by a polynomial function of the size of each example.

Comment: 20 LaTeX pages. 2 Algorithms. Some Theorem
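As a rough illustration of the kind of PAC algorithm the abstract refers to, the sketch below implements the classic elimination approach to k-CNF learning: start from the conjunction of every clause with at most k literals, then drop each clause that some positive example falsifies. This is not the paper's Bayesian or online method; the function names (`all_clauses`, `learn_kcnf`, `predict`) and the literal encoding are assumptions made for this example.

```python
from itertools import combinations, product


def all_clauses(n, k):
    """Every clause with 1..k literals over n variables.

    A literal is a pair (index, wanted_value); a clause is a tuple of literals.
    This encoding is an assumption of the sketch, not taken from the paper.
    """
    clauses = set()
    for size in range(1, k + 1):
        for idxs in combinations(range(n), size):
            for signs in product((False, True), repeat=size):
                clauses.add(tuple(zip(idxs, signs)))
    return clauses


def satisfies(x, clause):
    """A clause is satisfied when at least one of its literals matches x."""
    return any(x[i] == wanted for i, wanted in clause)


def learn_kcnf(positives, n, k):
    """Keep only the clauses that every positive example satisfies."""
    hypothesis = all_clauses(n, k)
    for x in positives:
        hypothesis = {c for c in hypothesis if satisfies(x, c)}
    return hypothesis


def predict(hypothesis, x):
    """The hypothesis is the conjunction of all surviving clauses."""
    return all(satisfies(x, c) for c in hypothesis)


# Hypothetical target: (x0 OR x1) AND (NOT x2), a 2-CNF over 3 variables.
def target(x):
    return (x[0] or x[1]) and not x[2]


inputs = list(product((False, True), repeat=3))
positives = [x for x in inputs if target(x)]
hypothesis = learn_kcnf(positives, n=3, k=2)
```

Because the number of candidate clauses grows as O((2n)^k), the approach is polynomial in n for fixed k, which is the sense in which the elimination step is efficient.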

arXiv.org e-Print Archive

CiteSeerX

Top-Rank Enhanced Listwise Optimization for Statistical Machine Translation

Author: Chen Huadong
Chen Jiajun
Chiang David
Dai Xinyu
Huang Shujian
Publication date: 01/01/2017

Pairwise ranking methods are the basis of many widely used discriminative training approaches for structure prediction problems in natural language processing (NLP). Decomposing the problem of ranking hypotheses into pairwise comparisons enables simple and efficient solutions. However, neglecting the global ordering of the hypothesis list may hinder learning. We propose a listwise learning framework for structure prediction problems such as machine translation. Our framework directly models the entire translation list's ordering to learn parameters which may better fit the given listwise samples. Furthermore, we propose top-rank enhanced loss functions, which are more sensitive to ranking errors at higher positions. Experiments on a large-scale Chinese-English translation task show that both our listwise learning framework and top-rank enhanced listwise losses lead to significant improvements in translation quality.

Comment: Accepted to CONLL 201
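To make the contrast with pairwise ranking concrete, here is a minimal sketch of one common listwise objective, the Plackett-Luce negative log-likelihood of a gold ordering. Whether this matches the losses used in the paper is an assumption; the optional `position_weights` argument is a hypothetical stand-in for a loss that penalizes errors near the top of the list more heavily, and is not the paper's formulation.

```python
import math


def plackett_luce_nll(scores_in_gold_order, position_weights=None):
    """Negative log-likelihood of the gold ordering under a Plackett-Luce model.

    scores_in_gold_order: model scores of the hypotheses, listed best-first
        according to the gold ranking.
    position_weights: optional per-position multipliers (hypothetical); larger
        weights at early positions emphasize mistakes at the top of the list.
    """
    n = len(scores_in_gold_order)
    if position_weights is None:
        position_weights = [1.0] * n
    loss = 0.0
    for j in range(n):
        tail = scores_in_gold_order[j:]
        # Numerically stable log-sum-exp over the hypotheses not yet placed.
        m = max(tail)
        log_z = m + math.log(sum(math.exp(s - m) for s in tail))
        loss += position_weights[j] * (log_z - scores_in_gold_order[j])
    return loss


# Scores that agree with the gold order give a lower loss than reversed scores.
good = plackett_luce_nll([2.0, 1.0, 0.0])
bad = plackett_luce_nll([0.0, 1.0, 2.0])
# Hypothetical top-heavy weighting: the first position counts double.
bad_top_weighted = plackett_luce_nll([0.0, 1.0, 2.0], [2.0, 1.0, 1.0])
```

Unlike a pairwise loss, each term here normalizes over all hypotheses that remain at that position, so the objective depends on the ordering of the whole list.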

arXiv.org e-Print Archive

Crossref

Algorithmic statistics, prediction and machine learning

Author: Milovanov Alexey
Publication date: 17/09/2015

Algorithmic statistics considers the following problem: given a binary string $x$ (e.g., some experimental data), find a "good" explanation of this data. It uses algorithmic information theory to define formally what a good explanation is. In this paper we extend this framework in two directions. First, the explanations are not only interesting in themselves but also used for prediction: we want to know what kind of data we may reasonably expect in similar situations (repeating the same experiment). We show that some kind of hierarchy can be constructed both in terms of algorithmic statistics and using the notion of a priori probability, and these two approaches turn out to be equivalent. Second, a more realistic approach, which goes back to machine learning theory, assumes that we have not a single data string $x$ but some set of "positive examples" $x_1,\ldots,x_l$ that all belong to some unknown set $A$, a property that we want to learn. We want this set $A$ to contain all positive examples and to be as small and simple as possible. We show how algorithmic statistics can be extended to cover this situation.

Comment: 22 page

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Learning and Interpreting Multi-Multi-Instance Learning Networks

Author: Frasconi Paolo
Jaeger Manfred
Tibo Alessandro
Publication date: 01/10/2020

We introduce an extension of the multi-instance learning problem where examples are organized as nested bags of instances (e.g., a document could be represented as a bag of sentences, which in turn are bags of words). This framework can be useful in various scenarios, such as text and image classification, but also supervised learning over graphs. As a further advantage, multi-multi instance learning enables a particular way of interpreting predictions and the decision function. Our approach is based on a special neural network layer, called bag-layer, whose units aggregate bags of inputs of arbitrary size. We prove theoretically that the associated class of functions contains all Boolean functions over sets of sets of instances and we provide empirical evidence that functions of this kind can actually be learned on semi-synthetic datasets. We finally present experiments on text classification, on citation graphs, and on social graph data, which show that our model obtains competitive accuracy compared to other approaches such as convolutional networks on graphs, while at the same time supporting a general approach to interpreting the learnt model and explaining individual predictions.

Comment: JML
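The nested-bag idea can be sketched in a few lines. The example below assumes a bag-layer that applies a shared affine map with ReLU to every element of a bag and then takes an elementwise maximum over the bag; the choice of max aggregation, the parameter layout, and the names `bag_layer` and `nested_bag_forward` are assumptions of this sketch rather than details confirmed by the abstract.

```python
def relu_affine(x, weights, bias):
    """Shared per-element transform: ReLU(W x + b), with W given as a list of rows."""
    return [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, bias)
    ]


def bag_layer(bag, weights, bias):
    """Transform each element of a non-empty bag, then aggregate by elementwise max.

    The max makes the output independent of element order and of bag size.
    """
    mapped = [relu_affine(x, weights, bias) for x in bag]
    return [max(column) for column in zip(*mapped)]


def nested_bag_forward(top_bag, inner_params, outer_params):
    """Two stacked bag-layers for a bag of bags (e.g. a document of sentences of words)."""
    inner_out = [bag_layer(sub_bag, *inner_params) for sub_bag in top_bag]
    return bag_layer(inner_out, *outer_params)


# Hypothetical parameters: identity weights and zero bias, chosen for readability.
identity = ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])

# A "document" with two "sentences" of different lengths, each word a 2-d vector.
document = [
    [[1.0, 2.0], [3.0, 0.0]],
    [[0.0, 5.0]],
]
output = nested_bag_forward(document, identity, identity)
shuffled = nested_bag_forward(document[::-1], identity, identity)
```

Stacking one such layer per nesting level is what lets a single network consume bags of bags whose sizes differ from example to example.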

arXiv.org e-Print Archive

VBN