
    Reduction Scheme for Empirical Risk Minimization and Its Applications to Multiple-Instance Learning

    In this paper, we propose a simple reduction scheme for empirical risk minimization (ERM) that preserves empirical Rademacher complexity. The reduction allows us to transfer known generalization bounds and algorithms for ERM to the target learning problems in a straightforward way. In particular, we apply our reduction scheme to the multiple-instance learning (MIL) problem, for which generalization bounds and ERM algorithms have been extensively studied. We show that various learning problems can be reduced to MIL; examples include top-1 ranking learning, multi-class learning, and labeled and complementarily labeled learning. It turns out that some of the derived generalization bounds are, despite the simplicity of their derivation, incomparable to or competitive with the existing bounds. Moreover, in some settings of labeled and complementarily labeled learning, the derived algorithm is the first polynomial-time algorithm
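    For readers unfamiliar with the target problem, the following display recalls the standard MIL formulation that the reduction targets: a bag is scored by its best instance under a base hypothesis class, and ERM minimizes the empirical risk over the induced bag-level class. This is the common textbook form, not a restatement of the paper's reduction scheme; the symbols $H$, $\ell$, and the sample $S = ((B_1, y_1), \ldots, (B_m, y_m))$ are our own notation.

```latex
% Textbook multiple-instance learning (MIL) setup, shown as a reference point:
% a bag B is scored by its best instance under a base class H, and ERM
% minimizes the empirical risk over the induced bag-level class.
\[
  H_{\mathrm{MIL}} = \Bigl\{\, B \mapsto \max_{x \in B} h(x) \;\Bigm|\; h \in H \,\Bigr\},
  \qquad
  \min_{f \in H_{\mathrm{MIL}}} \widehat{R}_S(f)
  = \min_{f \in H_{\mathrm{MIL}}} \frac{1}{m} \sum_{i=1}^{m} \ell\bigl(f(B_i),\, y_i\bigr).
\]
```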

    Boosting as Frank-Wolfe

    Some boosting algorithms, such as LPBoost, ERLPBoost, and C-ERLPBoost, aim to solve the soft margin optimization problem with $\ell_1$-norm regularization. LPBoost rapidly converges to an $\epsilon$-approximate solution in practice, but it is known to take $\Omega(m)$ iterations in the worst case, where $m$ is the sample size. On the other hand, ERLPBoost and C-ERLPBoost are guaranteed to converge to an $\epsilon$-approximate solution in $O(\frac{1}{\epsilon^2} \ln \frac{m}{\nu})$ iterations, but their computational cost per iteration is much higher than that of LPBoost. To address this issue, we propose a generic boosting scheme that combines the Frank-Wolfe algorithm with any secondary algorithm and switches between the two iteratively. We show that the scheme retains the same convergence guarantee as ERLPBoost and C-ERLPBoost, and one can incorporate any secondary algorithm to improve practical performance. This scheme arises from a unified view of boosting algorithms for soft margin optimization: more specifically, we show that LPBoost, ERLPBoost, and C-ERLPBoost are all instances of the Frank-Wolfe algorithm. In experiments on real datasets, one of the instances of our scheme exploits the better updates of the secondary algorithm and performs comparably with LPBoost
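    The abstract describes a scheme that interleaves Frank-Wolfe updates with those of a secondary algorithm. The sketch below illustrates only that high-level idea on a generic smooth objective over the probability simplex: at each round it forms the usual Frank-Wolfe candidate and, if a secondary update is supplied, keeps whichever candidate has the smaller objective value. The interface (`grad`, `objective`, `secondary`) is our own, and the sketch is not the paper's boosting algorithm.

```python
import numpy as np

def frank_wolfe_with_secondary(grad, objective, x0, n_iters=200, secondary=None):
    """Generic Frank-Wolfe over the probability simplex with an optional
    secondary update.  Each round takes the usual FW step (move toward the
    best simplex vertex) and, if `secondary` is given, keeps whichever of
    the two candidates has the smaller objective value.  Schematic only."""
    x = np.asarray(x0, dtype=float)
    for t in range(n_iters):
        g = grad(x)
        # FW linear minimization oracle over the simplex: a single vertex.
        v = np.zeros_like(x)
        v[np.argmin(g)] = 1.0
        gamma = 2.0 / (t + 2.0)                  # standard FW step size
        x_fw = (1.0 - gamma) * x + gamma * v     # Frank-Wolfe candidate
        if secondary is not None:
            x_sec = secondary(x)                 # secondary candidate
            x = x_sec if objective(x_sec) <= objective(x_fw) else x_fw
        else:
            x = x_fw
    return x
```

    Here `secondary` could be, for example, a projected-gradient step; omitting it recovers plain Frank-Wolfe with the standard $2/(t+2)$ step size.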

    Online Combinatorial Linear Optimization via a Frank-Wolfe-based Metarounding Algorithm

    Metarounding is an approach to converting an approximation algorithm for linear optimization over some combinatorial class into an online linear optimization algorithm for the same class. We propose a new metarounding algorithm under the natural assumption that a relaxation-based approximation algorithm exists for the combinatorial class. Our algorithm is much more efficient in both theoretical and practical respects
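    As a rough illustration of where metarounding sits, the sketch below shows a generic online linear optimization loop in which the learner updates a fractional point over a relaxation of the combinatorial class, and a metarounding oracle converts that point into a combinatorial object to play. Both `project` and `metaround` are assumed interfaces for illustration; this is not the algorithm proposed in the paper.

```python
import numpy as np

def online_combinatorial_loop(losses, project, metaround, d, eta=0.1):
    """Schematic use of metarounding inside online linear optimization.

    project   -- Euclidean projection onto the relaxed (fractional) polytope
    metaround -- hypothetical oracle: given a fractional point x, returns a
                 combinatorial object c (a 0/1 vector) whose expected loss is
                 within the approximation factor of x's loss
    Both oracles are assumed interfaces, not an implementation of the paper."""
    x = project(np.zeros(d))             # fractional decision
    total_loss = 0.0
    for loss_vec in losses:              # one loss vector revealed per round
        c = metaround(x)                 # play a combinatorial object
        total_loss += float(loss_vec @ c)
        x = project(x - eta * loss_vec)  # online gradient step on the relaxation
    return total_loss
```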

    Boosting-based Construction of BDDs for Linear Threshold Functions and Its Application to Verification of Neural Networks

    Understanding the characteristics of neural networks is important but difficult due to their complex structures and behaviors. Some previous work proposes to transform neural networks into equivalent Boolean expressions and apply verification techniques for characteristics of interest. This approach is promising since the rich results of verification techniques for circuits and other Boolean expressions can be readily applied. The bottleneck is the time complexity of the transformation. More precisely, (i) each neuron of the network, i.e., a linear threshold function, is converted into a Binary Decision Diagram (BDD), and (ii) the BDDs are further combined into some final form, such as a Boolean circuit. For a linear threshold function with $n$ variables, an existing method takes $O(n2^{\frac{n}{2}})$ time to construct an ordered BDD of size $O(2^{\frac{n}{2}})$ consistent with some variable ordering. However, it is non-trivial to choose a variable ordering producing a small BDD among the $n!$ candidates. We propose a method to convert a linear threshold function into a specific form of BDD based on the boosting approach from the machine learning literature. Our method takes $O(2^n \mathrm{poly}(1/\rho))$ time and outputs a BDD of size $O(\frac{n^2}{\rho^4}\ln{\frac{1}{\rho}})$, where $\rho$ is the margin of some consistent linear threshold function. Our method does not need to search for good variable orderings and produces a smaller expression when the margin of the linear threshold function is large. More precisely, our method is based on our new boosting algorithm, which is of independent interest. We also propose a method to combine the resulting BDDs into a final Boolean expression representing the neural network
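    For intuition about converting a linear threshold function into an ordered BDD, the sketch below uses the classical dynamic-programming construction for integer weights: Shannon expansion in a fixed variable order, memoized on the remaining threshold. It is shown only as a baseline for the transformation the abstract discusses; it is not the boosting-based construction proposed in the paper.

```python
from functools import lru_cache

def ltf_to_obdd(weights, theta):
    """Build an ordered BDD for the linear threshold function
        f(x) = 1  iff  sum_i weights[i] * x[i] >= theta,   x in {0,1}^n,
    by Shannon expansion in the fixed order x_0, x_1, ..., memoized on the
    remaining threshold (classical construction for integer weights).
    Nodes are tuples (var_index, low_child, high_child); leaves are 0 / 1."""
    n = len(weights)

    @lru_cache(maxsize=None)
    def build(i, remaining):
        if i == n:
            return 1 if remaining <= 0 else 0
        low = build(i + 1, remaining)                # branch x_i = 0
        high = build(i + 1, remaining - weights[i])  # branch x_i = 1
        if low == high:                              # redundant test: skip node
            return low
        return (i, low, high)

    return build(0, theta)

# Example: majority of three variables, f(x) = 1 iff x_0 + x_1 + x_2 >= 2.
bdd = ltf_to_obdd([1, 1, 1], 2)
```

    In the majority example, the subdiagram for remaining threshold 1 at the last variable is shared between two branches; this sharing of equal remaining thresholds is what keeps such BDDs small when the weights are small.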

    Proper Learning Algorithm for Functions of k Terms under Smooth Distributions

    In this paper, we introduce a probability distribution, called a smooth distribution, which is a generalization of variants of the uniform distribution such as the q-bounded distribution and the product distribution. Then, we give an algorithm that, under smooth distributions, properly learns the class of functions of $k$ terms, given as $F_k \circ T_k^n = \{\, g(f_1(v), \ldots, f_k(v)) \mid g \in F_k,\ f_1, \ldots, f_k \in T^n \,\}$, in polynomial time for constant $k$, where $F_k$ is the class of all Boolean functions of $k$ variables and $T^n$ is the class of terms over $n$ variables. Although the class $F_k \circ T_k^n$ was shown by Blum and Singh to be learnable using DNF as the hypothesis class, it has remained open whether it is properly learnable under a distribution-free setting
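    As a concrete (hypothetical) member of the class for $k = 2$ and $n = 4$, the snippet below combines two terms, $x_0 \wedge x_1$ and $x_2 \wedge \lnot x_3$, with $g = \mathrm{XOR}$ and prints the truth table; the particular choice of $g$ and the terms is ours, purely for illustration.

```python
from itertools import product

def term(v, positive, negative):
    """Conjunction: all `positive` indices are 1 and all `negative` are 0."""
    return all(v[i] == 1 for i in positive) and all(v[i] == 0 for i in negative)

def f(v):
    f1 = term(v, positive=[0, 1], negative=[])   # x0 AND x1
    f2 = term(v, positive=[2], negative=[3])     # x2 AND NOT x3
    return f1 ^ f2                               # g(f1, f2) with g = XOR

for v in product([0, 1], repeat=4):
    print(v, int(f(v)))
```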

    Decision Diagrams for Solving a Job Scheduling Problem Under Precedence Constraints

    We consider a job scheduling problem under precedence constraints, a classical problem of scheduling multiple jobs on a single processor. The goal is, given the processing times of $n$ jobs and precedence constraints over the jobs, to find a permutation of the $n$ jobs that minimizes the total flow time, i.e., the sum of the wait times and processing times of all jobs, while satisfying the precedence constraints. The problem can be formulated as an integer program and is NP-hard in general. We propose a decision diagram, $\pi$-MDD, for solving the scheduling problem exactly. Our diagram is suitable for linear optimization over permutations with precedence constraints. We show the effectiveness of our approach in experiments on large-scale artificial scheduling problems
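    The objective in the abstract is easy to state operationally: the total flow time of a permutation is the sum of the jobs' completion times. The snippet below is a minimal reference computation of that objective with a feasibility check against the precedence pairs; it has nothing to do with the proposed $\pi$-MDD itself, and the argument names are ours.

```python
def total_flow_time(perm, proc_time, precedences):
    """Total flow time of a permutation `perm` of jobs 0..n-1, i.e. the sum of
    completion times (wait time + own processing time) of all jobs, with a
    feasibility check against precedence pairs (a, b) meaning 'a before b'."""
    pos = {job: i for i, job in enumerate(perm)}
    if any(pos[a] >= pos[b] for a, b in precedences):
        raise ValueError("permutation violates a precedence constraint")
    elapsed, total = 0, 0
    for job in perm:
        elapsed += proc_time[job]   # completion time of `job`
        total += elapsed            # flow-time contribution
    return total

# Example: 3 jobs, job 0 must precede job 2; total flow time = 3 + 4 + 6 = 13.
print(total_flow_time([0, 1, 2], proc_time=[3, 1, 2], precedences=[(0, 2)]))
```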

    Editors’ Introduction to [Algorithmic Learning Theory: 18th International Conference, ALT 2007, Sendai, Japan, October 1-4, 2007. Proceedings]

    Learning theory is an active research area that incorporates ideas, problems, and techniques from a wide range of disciplines including statistics, artificial intelligence, information theory, pattern recognition, and theoretical computer science. The research reported at the 18th International Conference on Algorithmic Learning Theory (ALT 2007) ranges over areas such as unsupervised learning, inductive inference, complexity and learning, boosting and reinforcement learning, query learning models, grammatical inference, online learning and defensive forecasting, and kernel methods. In this introduction we give an overview of the five invited talks and the regular contributions of ALT 2007

    Pure exploration in multi-armed bandits with low rank structure using oblivious sampler

    In this paper, we consider the low-rank structure of the reward sequence in pure exploration problems. First, we propose a separated setting of the pure exploration problem, in which the exploration strategy cannot receive feedback from its explorations. Due to this separation, the exploration strategy is required to sample the arms obliviously. By exploiting the kernel information of the reward vectors, we provide efficient algorithms for both the time-varying and the fixed case with regret bound $O(d\sqrt{(\ln N)/n})$. Then, we show a lower bound for pure exploration in multi-armed bandits with a low-rank reward sequence. There is an $O(\sqrt{\ln N})$ gap between our upper bound and the lower bound
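    To make the "oblivious sampler" idea concrete, the sketch below assumes the arm means lie in the span of known $d$-dimensional features, fixes the arms to pull uniformly at random before any feedback is observed, estimates the parameter by regularized least squares, and recommends the arm with the highest estimated mean. The feature matrix, the `pull` oracle, and the estimator are our own illustrative assumptions; this is not the algorithm analyzed in the paper.

```python
import numpy as np

def oblivious_pure_exploration(features, pull, n_samples, rng=None):
    """Schematic pure exploration with an oblivious sampler under low-rank
    structure: arm means are assumed to lie in the span of the d-dimensional
    rows of `features` (shape N x d).  Arms to pull are chosen uniformly at
    random *before* any feedback is seen (oblivious sampling); the unknown
    parameter is estimated by ridge-regularized least squares and the arm
    with the highest estimated mean is recommended.  Illustration only."""
    rng = np.random.default_rng() if rng is None else rng
    N, d = features.shape
    arms = rng.integers(0, N, size=n_samples)       # sampling fixed in advance
    X = features[arms]                              # n_samples x d design
    y = np.array([pull(a) for a in arms])           # observed rewards
    theta = np.linalg.solve(X.T @ X + 1e-6 * np.eye(d), X.T @ y)
    return int(np.argmax(features @ theta))         # recommended arm
```

    Here `pull(a)` is a hypothetical oracle returning a noisy reward for arm `a`; in the separated setting described above, its outputs are used only at the end, never to adapt the sampling.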