    Distribution-Independent Evolvability of Linear Threshold Functions

    Valiant's (2007) model of evolvability casts the evolutionary process of acquiring useful functionality as a restricted form of learning from random examples. Linear threshold functions and their various subclasses, such as conjunctions and decision lists, play a fundamental role in learning theory, and hence their evolvability has been the primary focus of research on Valiant's framework. One of the main open problems regarding the model is whether conjunctions are evolvable distribution-independently (Feldman and Valiant, 2008). We show that the answer is negative. Our proof is based on a new combinatorial parameter of a concept class that lower-bounds the complexity of learning from correlations. We contrast the lower bound with a proof that linear threshold functions having a non-negligible margin on the data points are evolvable distribution-independently via a simple mutation algorithm. Our algorithm relies on a non-linear loss function being used to select hypotheses, instead of the 0-1 loss in Valiant's (2007) original definition. The proof of evolvability requires that the loss function satisfy several mild conditions that are, for example, satisfied by the quadratic loss function studied in several other works (Michael, 2007; Feldman, 2009; Valiant, 2010). An important property of our evolution algorithm is monotonicity: the algorithm guarantees evolvability without any decreases in performance. Previously, monotone evolvability was only shown for conjunctions with quadratic loss (Feldman, 2009) or when the distribution on the domain is severely restricted (Michael, 2007; Feldman, 2009; Kanade et al., 2010).
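    As a rough illustration of the selection mechanism described above (not Valiant's formal model, which additionally constrains the representation class, the mutation distribution, and the tolerance of performance estimates), the following Python sketch performs one generation of mutation and selection for a linear hypothesis under quadratic loss. Retaining the current hypothesis when no mutation is beneficial is what makes the process monotone. The names evolve_step, sigma, n_mutations, and tol are illustrative, not from the paper.

```python
import numpy as np

def quadratic_loss(w, X, y):
    """Empirical quadratic loss of the hypothesis h_w(x) = <w, x>."""
    return np.mean((X @ w - y) ** 2)

def evolve_step(w, X, y, sigma=0.1, n_mutations=20, tol=1e-3, rng=None):
    """One generation of mutation and selection: propose random Gaussian
    mutations of w, estimate their quadratic loss on a sample, and move
    to a mutation only if it is beneficial (loss drops by more than tol).
    Keeping w otherwise means estimated performance never decreases
    across generations, i.e. the process is monotone."""
    rng = np.random.default_rng() if rng is None else rng
    base = quadratic_loss(w, X, y)
    candidates = [w + sigma * rng.standard_normal(w.shape)
                  for _ in range(n_mutations)]
    losses = [quadratic_loss(c, X, y) for c in candidates]
    best = int(np.argmin(losses))
    return candidates[best] if losses[best] < base - tol else w
```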

    A Complete Characterization of Statistical Query Learning with Applications to Evolvability

    The statistical query (SQ) learning model of Kearns (1993) is a natural restriction of the PAC learning model in which a learning algorithm is allowed to obtain estimates of statistical properties of the examples but cannot see the examples themselves. We describe a new and simple characterization of the query complexity of learning in the SQ model. Unlike previously known bounds on SQ learning, our characterization preserves both the accuracy and the efficiency of learning. The preservation of accuracy implies that ours is the first characterization of SQ learning in the agnostic learning framework. The preservation of efficiency is achieved using a new boosting technique and allows us to derive a new approach to the design of evolutionary algorithms in Valiant's (2006) model of evolvability. We use this approach to demonstrate the existence of a large class of monotone evolutionary learning algorithms based on square-loss performance estimation. These results differ significantly from the few known evolutionary algorithms and give evidence that evolvability in Valiant's model is a more versatile phenomenon than there had been previous reason to suspect.
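    For readers unfamiliar with the SQ model, the following minimal sketch simulates the oracle STAT(tau) from a finite sample: the learner supplies a bounded query function of (x, y) and receives its expectation only up to an additive tolerance tau. Uniform noise stands in for the oracle's adversarial slack (a true oracle may return any value within tolerance); sq_oracle is a hypothetical helper, not from the paper.

```python
import numpy as np

def sq_oracle(query, X, y, tau, rng=None):
    """Simulate STAT(tau): return E[query(x, y)] over the sample,
    perturbed within the additive tolerance tau. Queries are assumed
    to take values in [-1, 1]."""
    rng = np.random.default_rng() if rng is None else rng
    estimate = float(np.mean([query(x, label) for x, label in zip(X, y)]))
    return estimate + rng.uniform(-tau, tau)

# Example query: the correlation of the first coordinate with the label,
# the kind of statistic an SQ learner is restricted to.
# corr = sq_oracle(lambda x, label: x[0] * label, X, y, tau=0.01)
```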

    Classification Under Misspecification: Halfspaces, Generalized Linear Models, and Connections to Evolvability

    In this paper we revisit some classic problems on classification under misspecification. In particular, we study the problem of learning halfspaces under Massart noise with rate $\eta$. In a recent work, Diakonikolas, Gouleakis, and Tzamos resolved a long-standing problem by giving the first efficient algorithm for learning to accuracy $\eta + \epsilon$ for any $\epsilon > 0$. However, their algorithm outputs a complicated hypothesis, which partitions space into $\text{poly}(d, 1/\epsilon)$ regions. Here we give a much simpler algorithm and in the process resolve a number of outstanding open questions: (1) We give the first proper learner for Massart halfspaces that achieves $\eta + \epsilon$. We also give improved bounds on the sample complexity achievable by polynomial-time algorithms. (2) Based on (1), we develop a blackbox knowledge-distillation procedure to convert an arbitrarily complex classifier into an equally good proper classifier. (3) By leveraging a simple but overlooked connection to evolvability, we show that any SQ algorithm requires super-polynomially many queries to achieve $\mathsf{OPT} + \epsilon$. Moreover, we study generalized linear models where $\mathbb{E}[Y \mid \mathbf{X}] = \sigma(\langle \mathbf{w}^*, \mathbf{X} \rangle)$ for any odd, monotone, and Lipschitz function $\sigma$. This family includes the previously mentioned halfspace models as a special case, but is much richer and includes other fundamental models like logistic regression. We introduce a challenging new corruption model that generalizes Massart noise, and give a general algorithm for learning in this setting. Our algorithms are based on a small set of core recipes for learning to classify in the presence of misspecification. Finally, we study our algorithm for learning halfspaces under Massart noise empirically and find that it exhibits some appealing fairness properties.
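    The Massart noise model referenced above can be stated concretely: an adversary flips the label of each point x independently with some probability eta(x) bounded by the rate eta. A minimal generator sketch, assuming labels in {-1, +1} and no points exactly on the separating hyperplane; flip_prob stands in for the adversary and is a hypothetical name:

```python
import numpy as np

def massart_labels(X, w_star, eta, flip_prob=None, rng=None):
    """Labels for the halfspace sign(<w*, x>) under Massart noise with
    rate eta: each label is flipped independently with probability
    eta(x) <= eta, which may be chosen adversarially per point. By
    default every point uses the maximal rate eta."""
    rng = np.random.default_rng() if rng is None else rng
    clean = np.sign(X @ w_star)
    if flip_prob is None:
        probs = np.full(len(X), eta)
    else:
        probs = np.array([flip_prob(x) for x in X])
    flips = rng.random(len(X)) < probs
    return np.where(flips, -clean, clean)
```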

    Robust Learning under Strong Noise via SQs

    This work provides several new insights on the robustness of Kearns' statistical query framework against challenging label-noise models. First, we build on a recent result (arXiv:2006.04787) that showed noise tolerance of distribution-independently evolvable concept classes under Massart noise. Specifically, we extend their characterization to more general noise models, including the Tsybakov model, which considerably generalizes the Massart condition by allowing the flipping probability to be arbitrarily close to $\frac{1}{2}$ for a subset of the domain. As a corollary, we employ an evolutionary algorithm of Kanade et al. (2010) to obtain the first polynomial-time algorithm with arbitrarily small excess error for learning linear threshold functions over any spherically symmetric distribution in the presence of spherically symmetric Tsybakov noise. Moreover, we posit access to a stronger oracle, in which for every labeled example we additionally obtain its flipping probability. In this model, we show that every SQ-learnable class admits an efficient learning algorithm with $\mathsf{OPT} + \epsilon$ misclassification error for a broad class of noise models. This setting substantially generalizes the widely studied problem of classification under RCN with known noise rate, and corresponds to a non-convex optimization problem even when the noise function -- i.e., the flipping probabilities of all points -- is known in advance.
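    The power of the stronger oracle can be seen from a standard debiasing identity (illustrative only; it is not the paper's algorithm, which works through SQs): if a +-1 label is flipped with known probability eta(x) < 1/2, then E[y | x] = (1 - 2 eta(x)) y_clean(x), so rescaling the observed label yields an unbiased estimate of the clean one. A minimal sketch, with hypothetical names:

```python
import numpy as np

def debias_labels(y_noisy, flip_probs):
    """Given +-1 labels flipped independently with known per-example
    probability eta(x) < 1/2, return unbiased estimates of the clean
    labels: E[y | x] = (1 - 2*eta(x)) * y_clean(x), so dividing by
    (1 - 2*eta(x)) debiases. The variance of the estimate blows up as
    eta(x) approaches 1/2, which is why strong noise still requires
    care."""
    flip_probs = np.asarray(flip_probs, dtype=float)
    return y_noisy / (1.0 - 2.0 * flip_probs)
```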

    Evolvability-guided Optimization of Linear Deformation Setups for Evolutionary Design Optimization

    Richter A. Evolvability-guided Optimization of Linear Deformation Setups for Evolutionary Design Optimization. Bielefeld: Universität Bielefeld; 2019. Andreas Richter gratefully acknowledges the financial support from Honda Research Institute Europe (HRI-EU). This thesis targets efficient solutions for optimal representation setups in evolutionary design optimization problems. The representation maps the abstract parameters of an optimizer to a meaningful variation of the design model, e.g., the shape of a car. Thereby, it determines the convergence speed to and the quality of the final result. Thus, engineers are eager to employ well-tuned representations to achieve high-quality design solutions. But setting up an optimal representation is a cumbersome process, because the setup procedure requires detailed knowledge about the objective functions, e.g., a fluid-dynamics simulation, and about the parameters of the employed representation itself. We therefore target efficient routines that set up representations automatically, relieving engineers of this tedious, partly manual work. Inspired by the concept of evolvability, we present novel quality criteria for the evaluation of linear deformations, which are commonly applied as representations. We define and analyze the criteria variability, regularity, and improvement potential, which measure the expected quality and convergence speed of an evolutionary design optimization process based on the linear deformation setup. Moreover, we target the efficient optimization of deformation setups with respect to these three criteria. In dynamic design optimization scenarios, a suitable compromise between exploration and exploitation is crucial for efficient solutions. We discuss the construction of optimal compromises for these dynamic scenarios with our criteria, because they characterize exploration and exploitation. As a result, with our methods an engineer can initialize and adjust the deformation setup for improved convergence speed of the design process and enhanced quality of the design solutions.
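    A minimal sketch of how the three criteria can be formalized for a linear deformation matrix U that maps optimizer parameters to design displacements. The specific formulas below (normalized rank, inverse condition number, and the projected fraction of an assumed improvement direction g) are illustrative stand-ins, not the thesis's exact definitions:

```python
import numpy as np

def variability(U):
    """Normalized rank of the deformation matrix U: the fraction of
    independent variation directions the setup can express."""
    return np.linalg.matrix_rank(U) / min(U.shape)

def regularity(U):
    """Inverse condition number of U: 1 means all parameters influence
    the design equally well; values near 0 indicate an ill-conditioned,
    hard-to-optimize setup."""
    s = np.linalg.svd(U, compute_uv=False)
    return float(s[-1] / s[0]) if s[0] > 0 else 0.0

def improvement_potential(U, g):
    """Fraction of an assumed improvement direction g (e.g. an estimated
    objective gradient on the design, g != 0) that lies in the column
    space of U, i.e. is reachable by the deformation setup."""
    coeffs, *_ = np.linalg.lstsq(U, g, rcond=None)
    proj = U @ coeffs  # orthogonal projection of g onto col(U)
    return float(np.dot(proj, g) / np.dot(g, g))
```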

    Distribution-Independent Regression for Generalized Linear Models with Oblivious Corruptions

    We demonstrate the first algorithms for the problem of regression for generalized linear models (GLMs) in the presence of additive oblivious noise. We assume we have sample access to examples $(x, y)$ where $y$ is a noisy measurement of $g(w^* \cdot x)$. In particular, the noisy labels are of the form $y = g(w^* \cdot x) + \xi + \epsilon$, where $\xi$ is the oblivious noise drawn independently of $x$ and satisfies $\Pr[\xi = 0] \geq o(1)$, and $\epsilon \sim \mathcal{N}(0, \sigma^2)$. Our goal is to accurately recover a parameter vector $w$ such that the function $g(w \cdot x)$ has arbitrarily small error when compared to the true values $g(w^* \cdot x)$, rather than the noisy measurements $y$. We present an algorithm that tackles this problem in its most general distribution-independent setting, where the solution may not even be identifiable. Our algorithm returns an accurate estimate of the solution if it is identifiable, and otherwise returns a small list of candidates, one of which is close to the true solution. Furthermore, we provide a necessary and sufficient condition for identifiability, which holds in broad settings. Specifically, the problem is identifiable when the quantile at which $\xi + \epsilon = 0$ is known, or when the family of hypotheses does not contain candidates that are nearly equal to a translated $g(w^* \cdot x) + A$ for some real number $A$ while also having large error when compared to $g(w^* \cdot x)$. This is the first algorithmic result for GLM regression with oblivious noise which can handle more than half the samples being arbitrarily corrupted. Prior work focused largely on the setting of linear regression, and gave algorithms under restrictive assumptions.
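    The data model above is easy to instantiate. A minimal sketch under assumed choices -- Gaussian x and a user-supplied oblivious-noise sampler; the paper's guarantees are distribution-independent, so the Gaussian (and every name here) is only for illustration:

```python
import numpy as np

def sample_glm_oblivious(n, d, w_star, g, xi_sampler, sigma, rng=None):
    """Draw examples from y = g(<w*, x>) + xi + eps, where xi is
    oblivious noise drawn independently of x (supplied by xi_sampler,
    which should return 0 with non-negligible probability) and
    eps ~ N(0, sigma^2)."""
    rng = np.random.default_rng() if rng is None else rng
    X = rng.standard_normal((n, d))
    xi = xi_sampler(n, rng)
    eps = rng.normal(0.0, sigma, size=n)
    return X, g(X @ w_star) + xi + eps

# Example oblivious noise: zero with probability 0.1, otherwise a large
# arbitrary corruption -- so far more than half the samples may be
# corrupted. An odd, monotone link such as np.tanh can serve as g.
def heavy_xi(n, rng):
    return np.where(rng.random(n) < 0.1, 0.0,
                    rng.standard_cauchy(n) * 100)
```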