66 research outputs found

    Reduction Scheme for Empirical Risk Minimization and Its Applications to Multiple-Instance Learning

    In this paper, we propose a simple reduction scheme for empirical risk minimization (ERM) that preserves the empirical Rademacher complexity. The reduction allows us to transfer known generalization bounds and algorithms for ERM to target learning problems in a straightforward way. In particular, we apply our reduction scheme to the multiple-instance learning (MIL) problem, for which generalization bounds and ERM algorithms have been extensively studied. We show that various learning problems can be reduced to MIL; examples include top-1 ranking learning, multi-class learning, and labeled and complementarily labeled learning. It turns out that some of the derived generalization bounds are, despite the simplicity of their derivation, incomparable with or competitive against the existing bounds. Moreover, in some settings of labeled and complementarily labeled learning, the derived algorithm is the first polynomial-time algorithm for the setting.
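
    For reference, a standard textbook definition of the empirical Rademacher complexity that the reduction is said to preserve (this formula is not quoted from the paper itself): for a sample $S = (z_1, \ldots, z_m)$ and a function class $\mathcal{F}$,

    $\hat{\mathfrak{R}}_S(\mathcal{F}) = \mathbb{E}_{\sigma}\left[ \sup_{f \in \mathcal{F}} \frac{1}{m} \sum_{i=1}^{m} \sigma_i f(z_i) \right],$

    where $\sigma_1, \ldots, \sigma_m$ are independent uniform $\{-1,+1\}$ random variables. A reduction that preserves this quantity lets generalization bounds stated in terms of it carry over unchanged.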

    Boosting as Frank-Wolfe

    Some boosting algorithms, such as LPBoost, ERLPBoost, and C-ERLPBoost, aim to solve the soft margin optimization problem with $\ell_1$-norm regularization. LPBoost rapidly converges to an $\epsilon$-approximate solution in practice, but it is known to take $\Omega(m)$ iterations in the worst case, where $m$ is the sample size. On the other hand, ERLPBoost and C-ERLPBoost are guaranteed to converge to an $\epsilon$-approximate solution in $O(\frac{1}{\epsilon^2} \ln \frac{m}{\nu})$ iterations, but their computational cost per iteration is very high compared to LPBoost. To address this issue, we propose a generic boosting scheme that combines the Frank-Wolfe algorithm with any secondary algorithm and switches between them iteratively. We show that the scheme retains the same convergence guarantee as ERLPBoost and C-ERLPBoost, and one can incorporate any secondary algorithm to improve performance in practice. This scheme arises from a unified view of boosting algorithms for soft margin optimization: more specifically, we show that LPBoost, ERLPBoost, and C-ERLPBoost are all instances of the Frank-Wolfe algorithm. In experiments on real datasets, one instance of our scheme exploits the better updates of the secondary algorithm and performs comparably with LPBoost.
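
    As a rough illustration of the switching pattern described above, here is a minimal sketch under assumed interfaces; it is not the authors' exact algorithm, and the oracle, secondary update, and step size are illustrative choices.

    import numpy as np

    def frank_wolfe_with_secondary(objective, grad, lmo, x0, secondary=None, T=200):
        # Generic Frank-Wolfe over a convex set, optionally interleaved with a
        # secondary update whose iterate is accepted only when it is at least
        # as good as the Frank-Wolfe iterate.
        x = x0
        for t in range(T):
            s = lmo(grad(x))                     # vertex minimizing the linearization
            gamma = 2.0 / (t + 2.0)              # standard Frank-Wolfe step size
            x_fw = (1.0 - gamma) * x + gamma * s
            if secondary is not None:
                x_alt = secondary(x)
                x = x_alt if objective(x_alt) <= objective(x_fw) else x_fw
            else:
                x = x_fw
        return x

    # Toy usage: minimize ||x - c||^2 over the probability simplex.
    c = np.array([0.7, 0.2, 0.1])
    f = lambda x: float(np.sum((x - c) ** 2))
    g = lambda x: 2.0 * (x - c)
    lmo = lambda grad_x: np.eye(len(c))[int(np.argmin(grad_x))]   # simplex vertex
    x = frank_wolfe_with_secondary(f, g, lmo, x0=np.ones(3) / 3)
    print(np.round(x, 3))   # approaches c, which lies in the simplex

    Accepting the secondary iterate only when it does not increase the objective is one simple way to keep a Frank-Wolfe style analysis intact while still benefiting from faster updates in practice.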

    Online Combinatorial Linear Optimization via a Frank-Wolfe-based Metarounding Algorithm

    Metarounding is an approach to converting an approximation algorithm for linear optimization over a combinatorial class into an online linear optimization algorithm for the same class. We propose a new metarounding algorithm under the natural assumption that a relaxation-based approximation algorithm exists for the combinatorial class. Our algorithm is much more efficient in both theoretical and practical respects.
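
    As a toy contrast (not the proposed algorithm): when the combinatorial class happens to be the entire hypercube $\{0,1\}^n$, independent randomized rounding of a fractional point already yields an unbiased combinatorial object; metarounding addresses the harder case where the class is constrained and only a relaxation-based approximation oracle is available. A minimal sketch of that trivial case, with illustrative values:

    import numpy as np

    # Trivial case for intuition only: the class is all of {0,1}^n, so independent
    # rounding of a fractional x in [0,1]^n gives a binary vector c with E[c] = x.
    # Metarounding is needed precisely when c must also lie in a restricted
    # combinatorial class for which only an approximation oracle exists.
    rng = np.random.default_rng(0)
    x = np.array([0.2, 0.9, 0.5, 0.0])           # fractional point from the relaxation
    c = (rng.random(x.shape) < x).astype(int)    # rounded combinatorial object
    print(c)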

    Proper Learning Algorithm for Functions of k Terms under Smooth Distributions

    In this paper, we introduce a probabilistic distribution, called a smooth distribution, which is a generalization of variants of the uniform distribution such as the q-bounded distribution and the product distribution. Then, we give an algorithm that, under a smooth distribution, properly learns the class of functions of $k$ terms, given as $F_k \circ T_k^n = \{ g(f_1(v), \ldots, f_k(v)) \mid g \in F_k,\ f_1, \ldots, f_k \in T^n \}$, in polynomial time for constant $k$, where $F_k$ is the class of all Boolean functions of $k$ variables and $T^n$ is the class of terms over $n$ variables. Although the class $F_k \circ T_k^n$ was shown by Blum and Singh to be learnable using DNF formulas as the hypothesis class, it has remained open whether it is properly learnable in a distribution-free setting.
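
    A small concrete member of the class just defined, for $k = 2$ and $n = 4$ (illustrative values only; the learning algorithm itself is not reproduced here): two terms combined by an arbitrary Boolean function $g$ of two inputs, here exclusive-or.

    def term(v, positives, negatives):
        # A term (conjunction of literals) over the Boolean vector v.
        return all(v[i] for i in positives) and all(not v[i] for i in negatives)

    def f(v):
        f1 = term(v, positives=[0, 2], negatives=[])    # x1 AND x3
        f2 = term(v, positives=[1], negatives=[3])      # x2 AND NOT x4
        g = lambda a, b: a != b                         # any Boolean g of k = 2 inputs (here XOR)
        return g(f1, f2)

    print(f([True, True, True, False]))   # both terms are true, so XOR gives False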

    Boosting-based Construction of BDDs for Linear Threshold Functions and Its Application to Verification of Neural Networks

    Understanding the characteristics of neural networks is important but difficult due to their complex structures and behaviors. Some previous work proposes transforming neural networks into equivalent Boolean expressions and applying verification techniques to the characteristics of interest. This approach is promising since the rich results of verification techniques for circuits and other Boolean expressions can be readily applied. The bottleneck is the time complexity of the transformation. More precisely, (i) each neuron of the network, i.e., a linear threshold function, is converted to a Binary Decision Diagram (BDD), and (ii) the BDDs are then combined into some final form, such as a Boolean circuit. For a linear threshold function with $n$ variables, an existing method takes $O(n 2^{n/2})$ time to construct an ordered BDD of size $O(2^{n/2})$ consistent with some variable ordering. However, it is non-trivial to choose a variable ordering producing a small BDD among the $n!$ candidates. We propose a method to convert a linear threshold function to a specific form of BDD based on the boosting approach in the machine learning literature. Our method takes $O(2^n \mathrm{poly}(1/\rho))$ time and outputs a BDD of size $O(\frac{n^2}{\rho^4} \ln \frac{1}{\rho})$, where $\rho$ is the margin of some consistent linear threshold function. Our method does not need to search for good variable orderings and produces a smaller expression when the margin of the linear threshold function is large. More precisely, our method is based on a new boosting algorithm, which is of independent interest. We also propose a method to combine the BDDs into a final Boolean expression representing the neural network.
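
    For context, here is a minimal sketch of a standard dynamic-programming construction of an ordered BDD for a linear threshold function with integer weights; it is pseudo-polynomial in the total weight, it is not the boosting-based method proposed above, and all names are illustrative.

    # Build an ordered BDD for f(x) = [ w . x >= theta ] with integer weights,
    # by memoizing subfunctions on (level, residual threshold).
    def build_bdd(w, theta):
        n = len(w)
        # lo[i], hi[i]: min / max achievable value of sum_{j >= i} w[j] * x[j]
        lo = [0] * (n + 1)
        hi = [0] * (n + 1)
        for i in range(n - 1, -1, -1):
            lo[i] = lo[i + 1] + min(w[i], 0)
            hi[i] = hi[i + 1] + max(w[i], 0)

        nodes = {}   # (level, residual) -> (level, low_child, high_child)

        def node(i, t):
            if t <= lo[i]:
                return True                # remaining sum always reaches t
            if t > hi[i]:
                return False               # remaining sum can never reach t
            key = (i, t)
            if key not in nodes:
                nodes[key] = (i, node(i + 1, t), node(i + 1, t - w[i]))
            return key

        return node(0, theta), nodes

    def evaluate(root, nodes, x):
        u = root
        while u not in (True, False):
            i, low, high = nodes[u]
            u = high if x[i] else low
        return u

    root, nodes = build_bdd([3, -2, 5, 1], theta=4)
    print(evaluate(root, nodes, [1, 0, 1, 0]))   # 3 + 5 = 8 >= 4, so True
    print(len(nodes))                            # number of internal BDD nodes created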

    Decision Diagrams for Solving a Job Scheduling Problem Under Precedence Constraints

    We consider a job scheduling problem under precedence constraints, a classical problem with a single processor and multiple jobs to be done. The goal is, given the processing times of $n$ fixed jobs and precedence constraints over the jobs, to find a permutation of the $n$ jobs that minimizes the total flow time, i.e., the sum of the waiting times and processing times of all jobs, while satisfying the precedence constraints. The problem is an integer program and is NP-hard in general. We propose a decision diagram, $\pi$-MDD, for solving the scheduling problem exactly. Our diagram is suitable for solving linear optimization over permutations with precedence constraints. We show the effectiveness of our approach in experiments on large-scale artificial scheduling problems.
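
    For intuition about the objective, here is a brute-force baseline that enumerates permutations; it is exponential in $n$ and clearly not the proposed $\pi$-MDD approach, and the names are illustrative.

    from itertools import permutations

    # Minimize total flow time, i.e. the sum of completion times (waiting time
    # plus processing time of each job), subject to constraints "a must precede b".
    def best_schedule(proc, prec):
        n = len(proc)
        best, best_pi = float("inf"), None
        for pi in permutations(range(n)):
            pos = {job: idx for idx, job in enumerate(pi)}
            if any(pos[a] > pos[b] for a, b in prec):
                continue                     # violates a precedence constraint
            t, flow = 0, 0
            for job in pi:
                t += proc[job]               # completion time of this job
                flow += t
            if flow < best:
                best, best_pi = flow, pi
        return best, best_pi

    print(best_schedule(proc=[3, 1, 2], prec=[(0, 2)]))   # job 0 before job 2 -> (11, (1, 0, 2))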

    Editors’ Introduction to [Algorithmic Learning Theory: 18th International Conference, ALT 2007, Sendai, Japan, October 1-4, 2007. Proceedings]

    Learning theory is an active research area that incorporates ideas, problems, and techniques from a wide range of disciplines, including statistics, artificial intelligence, information theory, pattern recognition, and theoretical computer science. The research reported at the 18th International Conference on Algorithmic Learning Theory (ALT 2007) ranges over areas such as unsupervised learning, inductive inference, complexity and learning, boosting and reinforcement learning, query learning models, grammatical inference, online learning and defensive forecasting, and kernel methods. In this introduction, we give an overview of the five invited talks and the regular contributions of ALT 2007.

    Pure exploration in multi-armed bandits with low rank structure using oblivious sampler

    In this paper, we consider the low-rank structure of the reward sequence in pure exploration problems. First, we propose a separated setting of the pure exploration problem, in which the exploration strategy cannot receive feedback from its explorations; this separation requires the exploration strategy to sample the arms obliviously. By exploiting the kernel information of the reward vectors, we provide efficient algorithms for both the time-varying and the fixed case with regret bound $O(d\sqrt{(\ln N)/n})$. We then show a lower bound for pure exploration in multi-armed bandits with a low-rank reward sequence; there is an $O(\sqrt{\ln N})$ gap between our upper bound and this lower bound.
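
    A toy sketch of the general idea of combining an oblivious (feedback-independent) sampler with low-rank reward structure; this is illustrative only and does not reproduce the paper's algorithm or bounds.

    import numpy as np

    # N arms whose mean rewards lie in a known d-dimensional feature space,
    # explored by a uniform sampler chosen in advance (oblivious to feedback),
    # followed by a least-squares estimate and a recommendation.
    rng = np.random.default_rng(0)
    N, d, n = 200, 5, 400
    Phi = rng.normal(size=(N, d))          # known arm features (low-rank structure)
    theta = rng.normal(size=d)             # unknown coefficients
    mu = Phi @ theta                       # true mean rewards

    arms = rng.integers(0, N, size=n)      # oblivious sampling: fixed before any feedback
    rewards = mu[arms] + rng.normal(scale=0.5, size=n)

    theta_hat, *_ = np.linalg.lstsq(Phi[arms], rewards, rcond=None)
    best_arm = int(np.argmax(Phi @ theta_hat))
    print(best_arm, int(np.argmax(mu)))    # recommended arm vs. true best arm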