Reduction Scheme for Empirical Risk Minimization and Its Applications to Multiple-Instance Learning
In this paper, we propose a simple reduction scheme for empirical risk
minimization (ERM) that preserves empirical Rademacher complexity. The
reduction allows us to transfer known generalization bounds and algorithms for
ERM to the target learning problems in a straightforward way. In particular, we
apply our reduction scheme to the multiple-instance learning (MIL) problem, for
which generalization bounds and ERM algorithms have been extensively studied.
We show that various learning problems can be reduced to MIL. Examples include
top-1 ranking learning, multi-class learning, and labeled and complementarily
labeled learning. It turns out that, despite the simplicity of their derivation, some of the resulting generalization bounds are competitive with, or incomparable to, existing bounds. Moreover, in some settings of labeled and complementarily labeled learning, the derived algorithm is the first polynomial-time algorithm for the problem.
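For context on the reduction target (a standard MIL convention, not the paper's reduction scheme; the max-pooling linear classifier and hinge loss below are illustrative choices): in MIL, each example is a bag of instances, a bag is scored by pooling instance scores, and ERM minimizes a bag-level loss.

import numpy as np

def bag_score(w, bag):
    """Score a bag by max-pooling linear instance scores (a common MIL rule)."""
    return max(x @ w for x in bag)

def empirical_risk(w, bags, labels):
    """Bag-level hinge loss: ERM for MIL minimizes this over w."""
    return np.mean([max(0.0, 1.0 - y * bag_score(w, b))
                    for b, y in zip(bags, labels)])

# Made-up data: two bags of 2-D instances with labels +1 / -1.
bags = [np.array([[1.0, 0.0], [0.5, 0.5]]), np.array([[-1.0, 0.2]])]
labels = [+1, -1]
print(empirical_risk(np.array([1.0, 0.0]), bags, labels))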
Boosting as Frank-Wolfe
Some boosting algorithms, such as LPBoost, ERLPBoost, and C-ERLPBoost, aim to solve the soft margin optimization problem with $\ell_1$-norm regularization. LPBoost rapidly converges to an $\epsilon$-approximate solution in practice, but it is known to take $\Omega(m)$ iterations in the worst case, where $m$ is the sample size. On the other hand, ERLPBoost and C-ERLPBoost are guaranteed to converge to an $\epsilon$-approximate solution in $O(\frac{1}{\epsilon^2} \ln \frac{m}{\nu})$ iterations, where $\nu$ is the capping parameter of the soft margin. However, their computation per iteration is very high compared to LPBoost.
To address this issue, we propose a generic boosting scheme that combines the Frank-Wolfe algorithm with an arbitrary secondary algorithm, switching between the two iteratively. We show that the scheme retains the same convergence guarantee as ERLPBoost and C-ERLPBoost, and that one can incorporate any secondary algorithm to improve performance in practice. The scheme comes from a unified view of boosting algorithms for soft margin optimization: more specifically, we show that LPBoost, ERLPBoost, and C-ERLPBoost are all instances of the Frank-Wolfe algorithm. In experiments on real datasets, one instance of our scheme exploits the better updates of the secondary algorithm and performs comparably with LPBoost.
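A minimal sketch of the switching idea (an illustration under assumptions, not the authors' implementation: the objective f, its gradient grad, and the secondary_step oracle are placeholders): at each round, compute both the Frank-Wolfe update over the probability simplex and the secondary algorithm's update, and keep whichever attains the smaller objective value.

import numpy as np

def boost_by_switching(f, grad, secondary_step, d0, T=100):
    """Keep the better of a Frank-Wolfe step and a secondary step per round."""
    d = d0
    for t in range(2, T + 2):
        g = grad(d)
        v = np.zeros_like(d)
        v[np.argmin(g)] = 1.0                 # linear minimizer: a simplex vertex
        gamma = 2.0 / t                       # standard Frank-Wolfe step size
        d_fw = (1.0 - gamma) * d + gamma * v  # Frank-Wolfe candidate
        d_sec = secondary_step(d)             # any feasible alternative update
        d = d_fw if f(d_fw) <= f(d_sec) else d_sec  # switch to the better one
    return d

Because both candidates are feasible and the better of the two is kept, the Frank-Wolfe convergence guarantee is preserved regardless of what the secondary algorithm does.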
Online Combinatorial Linear Optimization via a Frank-Wolfe-based Metarounding Algorithm
Metarounding is an approach to convert an approximation algorithm for linear optimization over some combinatorial class into an online linear optimization algorithm for the same class. We propose a new metarounding algorithm under a natural assumption that a relaxation-based approximation algorithm exists for the combinatorial class. Our algorithm is much more efficient in both theoretical and practical aspects.
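A minimal sketch of one natural metarounding step (a simplified illustration under assumptions, not the paper's algorithm: approx_oracle stands for an approximation algorithm returning a combinatorial vector roughly minimizing a linear objective, and the scaling alpha is hypothetical): run Frank-Wolfe with the oracle as its linear-minimization step to express a point near alpha * x as a convex combination of combinatorial objects, then output one of them at random.

import numpy as np

def metaround(x, approx_oracle, alpha=1.0, T=50, seed=0):
    """Write a point near alpha*x as a convex combination of combinatorial
    vectors via Frank-Wolfe, then sample one vector from the weights."""
    rng = np.random.default_rng(seed)
    target = alpha * x
    z = approx_oracle(np.zeros_like(x)).astype(float)  # any feasible object
    vertices, weights = [z.copy()], [1.0]
    for t in range(2, T + 2):
        g = z - target                       # gradient of 0.5 * ||z - target||^2
        c = approx_oracle(g).astype(float)   # (approximate) linear minimizer
        gamma = 2.0 / t                      # standard Frank-Wolfe step size
        z = (1.0 - gamma) * z + gamma * c
        weights = [(1.0 - gamma) * w for w in weights]
        vertices.append(c)
        weights.append(gamma)
    probs = np.array(weights) / np.sum(weights)
    return vertices[rng.choice(len(vertices), p=probs)]

Intuitively, sampling from the convex combination gives expected linear loss close to that of the scaled fractional point, which is the property an online-to-combinatorial reduction needs.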
Boosting-based Construction of BDDs for Linear Threshold Functions and Its Application to Verification of Neural Networks
Understanding the characteristics of neural networks is important but
difficult due to their complex structures and behaviors. Some previous work
proposes to transform neural networks into equivalent Boolean expressions and
apply verification techniques for characteristics of interest. This approach is
promising since rich results of verification techniques for circuits and other
Boolean expressions can be readily applied. The bottleneck is the time
complexity of the transformation. More precisely, (i) each neuron of the
network, i.e., a linear threshold function, is converted to a Binary Decision
Diagram (BDD), and (ii) they are further combined into some final form, such as
Boolean circuits. For a linear threshold function with variables, an
existing method takes time to construct an ordered BDD of
size consistent with some variable ordering. However, it
is non-trivial to choose a variable ordering producing a small BDD among
candidates.
We propose a method to convert a linear threshold function into a specific form of BDD based on the boosting approach from the machine learning literature. Our method takes time polynomial in $2^n$ and $1/\gamma$ and outputs a BDD of size polynomial in $n$ and $1/\gamma$, where $\gamma$ is the margin of some consistent linear threshold function. Our method does not need to search for good variable orderings and produces a smaller representation when the margin of the linear threshold function is large. It is based on a new boosting algorithm of ours, which is of independent interest. We also propose a method to combine the per-neuron BDDs into the final Boolean expression representing the neural network.
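To make step (i) concrete, here is a minimal sketch of a generic dynamic-programming construction (an illustration, not the paper's boosting-based method; the integer weights w, threshold theta, and node encoding are assumptions): partial assignments that agree on the prefix sum are merged, which is what keeps the diagram small.

from functools import lru_cache

def threshold_to_bdd(w, theta):
    """Build a BDD for f(x) = [w . x >= theta] by memoizing on
    (level, partial_sum). Returns the root id and a node table."""
    FALSE, TRUE = 0, 1     # terminal node ids
    unique = {}            # (level, lo, hi) -> node id, for node sharing
    nodes = {}             # node id -> (level, lo_child, hi_child)

    @lru_cache(maxsize=None)
    def build(i, s):
        rest_neg = sum(min(x, 0) for x in w[i:])  # least achievable remainder
        rest_pos = sum(max(x, 0) for x in w[i:])  # greatest achievable remainder
        if s + rest_neg >= theta:
            return TRUE                # outcome already decided: true
        if s + rest_pos < theta:
            return FALSE               # outcome already decided: false
        lo = build(i + 1, s)           # branch x_i = 0
        hi = build(i + 1, s + w[i])    # branch x_i = 1
        if lo == hi:
            return lo                  # redundant test: skip the node
        key = (i, lo, hi)
        if key not in unique:
            unique[key] = len(nodes) + 2   # ids 0 and 1 are the terminals
            nodes[unique[key]] = key
        return unique[key]

    return build(0, 0), nodes

For example, threshold_to_bdd([2, -1, 3], 2) builds a diagram for $2x_1 - x_2 + 3x_3 \ge 2$; nodes maps each internal node id to (variable index, 0-child, 1-child).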
Proper Learning Algorithm for Functions of k Terms under Smooth Distributions
In this paper, we introduce a probabilistic distribution, called a smooth distribution, which is a generalization of variants of the uniform distribution such as the $q$-bounded distribution and the product distribution. Then, we give an algorithm that, under the smooth distribution, properly learns the class of functions of $k$ terms, given as $F_k \circ T_k^n = \{\, g(f_1(v), \ldots, f_k(v)) \mid g \in F_k,\ f_1, \ldots, f_k \in T_n \,\}$, in polynomial time for constant $k$, where $F_k$ is the class of all Boolean functions of $k$ variables and $T_n$ is the class of terms over $n$ variables. Although the class $F_k \circ T_k^n$ was shown by Blum and Singh to be learnable using DNF as the hypothesis class, it has remained open whether it is properly learnable under a distribution-free setting.
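To make the class concrete (an illustrative toy example; the particular terms and combining function are made up): a member of $F_k \circ T_k^n$ evaluates $k$ terms (conjunctions of literals) on the input and feeds the resulting $k$ bits into an arbitrary Boolean function $g$.

def make_term(pos, neg):
    """Term (conjunction) with positive literals `pos` and negated literals `neg`."""
    return lambda v: all(v[i] for i in pos) and all(not v[i] for i in neg)

# Hypothetical example with k = 2 terms over n = 4 variables.
f1 = make_term(pos=[0, 2], neg=[])   # x0 AND x2
f2 = make_term(pos=[1], neg=[3])     # x1 AND NOT x3
g = lambda b1, b2: b1 != b2          # g = XOR; any Boolean function of k bits works

h = lambda v: g(f1(v), f2(v))        # h is a member of F_k o T_k^n
print(h([1, 0, 1, 0]))               # True: f1 fires, f2 does not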
Decision Diagrams for Solving a Job Scheduling Problem Under Precedence Constraints
We consider a job scheduling problem under precedence constraints, a classical problem for a single processor and multiple jobs. The goal is, given the processing times of $n$ jobs and precedence constraints over the jobs, to find a permutation of the $n$ jobs that minimizes the total flow time, i.e., the sum of the wait times and processing times of all jobs, while satisfying the precedence constraints. The problem is an integer program and is NP-hard in general. We propose a decision diagram, called $\pi$-MDD, for solving the scheduling problem exactly. Our diagram is suitable for solving linear optimization over permutations with precedence constraints. We show the effectiveness of our approach in experiments on large-scale artificial scheduling problems.
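To spell out the objective (a brute-force toy, not the $\pi$-MDD approach; the instance below is made up): the flow time of a job is its completion time, so the total flow time of a permutation is the sum of prefix sums of processing times, and a permutation is feasible if every job appears after all of its predecessors.

from itertools import permutations

def total_flow_time(order, p):
    """Sum of completion times (wait + processing) under permutation `order`."""
    t, total = 0, 0
    for j in order:
        t += p[j]        # completion time of job j
        total += t
    return total

def feasible(order, prec):
    """`prec` is a list of pairs (a, b) meaning job a must precede job b."""
    pos = {j: i for i, j in enumerate(order)}
    return all(pos[a] < pos[b] for a, b in prec)

# Made-up instance: 4 jobs, job 0 before job 2, job 1 before job 3.
p = [3, 1, 2, 4]
prec = [(0, 2), (1, 3)]
best = min((o for o in permutations(range(4)) if feasible(o, prec)),
           key=lambda o: total_flow_time(o, p))
print(best, total_flow_time(best, p))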
Editors’ Introduction to [Algorithmic Learning Theory: 18th International Conference, ALT 2007, Sendai, Japan, October 1-4, 2007. Proceedings]
Learning theory is an active research area that incorporates ideas,
problems, and techniques from a wide range of disciplines including
statistics, artificial intelligence, information theory, pattern
recognition, and theoretical computer science. The research reported
at the 18th International Conference on Algorithmic Learning Theory
(ALT 2007) ranges over areas such as unsupervised learning,
inductive inference, complexity and learning, boosting and
reinforcement learning, query learning models, grammatical
inference, online learning and defensive forecasting, and kernel
methods. In this introduction, we give an overview of the five invited talks and the regular contributions of ALT 2007.
Pure exploration in multi-armed bandits with low rank structure using oblivious sampler
In this paper, we consider pure exploration problems in which the reward sequence has a low-rank structure. First, we propose a separated setting for the pure exploration problem, in which the exploration strategy cannot receive feedback on its explorations. Due to this separation, the exploration strategy is required to sample the arms obliviously. By exploiting the kernel information of the reward vectors, we provide efficient algorithms for both the time-varying and fixed cases, together with regret bounds. We then show a lower bound for pure exploration in multi-armed bandits with a low-rank reward sequence. A gap remains between our upper bound and the lower bound.
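A minimal sketch of the separated setting (a toy illustration under assumptions, not the paper's method; the uniform schedule and empirical-best recommendation are placeholder choices): the sampling schedule is fixed obliviously before any feedback, and the logged rewards are used only afterwards to recommend an arm.

import numpy as np

rng = np.random.default_rng(0)
K, T = 5, 1000
mu = rng.uniform(size=K)                 # made-up arm means

# Oblivious sampler: the schedule is fixed before any feedback arrives.
schedule = np.tile(np.arange(K), T // K)

# The environment reveals rewards only after the whole schedule is played.
rewards = rng.binomial(1, mu[schedule])

# Offline recommendation from the logged data: the empirical best arm.
means = np.array([rewards[schedule == a].mean() for a in range(K)])
print("recommended arm:", means.argmax(), "true best:", mu.argmax())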