
    Low-discrepancy Sampling in the Expanded Dimensional Space: An Acceleration Technique for Particle Swarm Optimization

    Compared with random sampling, low-discrepancy sampling is more effective in covering the search space. However, the existing research cannot definitely state whether the impact of a low-discrepancy sample on particle swarm optimization (PSO) is positive or negative. Using Niederreiter's theorem, this study completes an error analysis of PSO, which reveals that the error bound of PSO at each iteration depends on the dispersion of the sample set in an expanded dimensional space. Based on this error analysis, an acceleration technique for PSO-type algorithms is proposed with low-discrepancy sampling in the expanded dimensional space. The acceleration technique generates a low-discrepancy sample set with a smaller dispersion than random sampling in the expanded dimensional space; it thereby reduces the error at each iteration and improves the convergence speed. The acceleration technique is combined with the standard PSO and the comprehensive learning particle swarm optimization, and the performance of each improved algorithm is compared with that of the original. The experimental results show that the two improved algorithms have significantly faster convergence under the same accuracy requirement.
    Comment: 29 pages, 0 figures
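    For illustration, the sketch below shows the basic idea under stated assumptions: at each iteration, the random coefficients of the velocity update for all particles are drawn jointly from one point of a scrambled Sobol' sequence over the expanded (2 × particles × dimensions) space instead of a pseudo-random generator. The function name pso_sobol and the coefficient values are illustrative and not taken from the paper.

```python
import numpy as np
from scipy.stats import qmc

def pso_sobol(f, dim, n_particles=30, iters=200, bounds=(-5.0, 5.0),
              w=0.729, c1=1.494, c2=1.494):
    """Standard PSO whose per-iteration random coefficients are drawn from a
    Sobol' sequence over the expanded space (2 * n_particles * dim numbers per
    iteration) instead of a pseudo-random generator."""
    lo, hi = bounds
    rng = np.random.default_rng(0)
    x = rng.uniform(lo, hi, (n_particles, dim))
    v = np.zeros((n_particles, dim))
    pbest, pbest_f = x.copy(), np.array([f(p) for p in x])
    g = pbest[np.argmin(pbest_f)].copy()
    sobol = qmc.Sobol(d=2 * n_particles * dim, scramble=True, seed=0)
    for _ in range(iters):
        u = sobol.random(1).reshape(2, n_particles, dim)   # low-discrepancy draw
        r1, r2 = u[0], u[1]
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)
        fx = np.array([f(p) for p in x])
        improved = fx < pbest_f
        pbest[improved], pbest_f[improved] = x[improved], fx[improved]
        g = pbest[np.argmin(pbest_f)].copy()
    return g, pbest_f.min()

# Example: minimise the sphere function in 10 dimensions.
best_x, best_f = pso_sobol(lambda p: float(np.sum(p**2)), dim=10)
```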

    Deep learning that scales: leveraging compute and data

    Deep learning has revolutionized the field of artificial intelligence in the past decade. Although the development of these techniques spans several years, the recent advent of deep learning is explained by an increased availability of data and compute that have unlocked the potential of deep neural networks. They have become ubiquitous in domains such as natural language processing, computer vision, speech processing, and control, where enough training data is available. Recent years have seen continuous progress driven by ever-growing neural networks that benefited from large amounts of data and computing power. This thesis is motivated by the observation that scale is one of the key factors driving progress in deep learning research, and aims at devising deep learning methods that scale gracefully with the available data and compute. We narrow down this scope into two main research directions. The first of them is concerned with designing hardware-aware methods which can make the most of the computing resources in current high performance computing facilities. We then study bottlenecks preventing existing methods from scaling up as more data becomes available, providing solutions that contribute towards enabling training of more complex models. This dissertation studies the aforementioned research questions for two different learning paradigms, each with its own algorithmic and computational characteristics. The first part of this thesis studies the paradigm where the model needs to learn from a collection of examples, extracting as much information as possible from the given data. The second part is concerned with training agents that learn by interacting with a simulated environment, which introduces unique challenges such as efficient exploration and simulation.

    Robustness analysis of VEGA launcher model based on effective sampling strategy

    An efficient robustness analysis for the VEGA launch vehicle is essential to minimize the potential for system failure during the ascent phase. The Monte Carlo sampling method is usually considered a reliable strategy in industry if the sample size is large enough. However, due to the large number of uncertainties and the long response time of a single simulation, exploring the entire uncertainty space sufficiently through Monte Carlo sampling is impractical for the VEGA launch vehicle. To make the robustness analysis more efficient when the number of simulations is limited, quasi-Monte Carlo methods (Sobol, Faure, and Halton sequences) and a heuristic algorithm (Differential Evolution) are proposed. Nevertheless, the affordable number of samples for simulation is still much smaller than the minimal number of samples needed for sufficient exploration. To further improve the efficiency of the robustness analysis, redundant uncertainties are screened out by sensitivity analysis, and only the dominant uncertainties are retained. As all samples for simulation are discrete, many regions of the uncertainty space are not explored with respect to the objective function by sampling or optimization methods. To study this latent information, a meta-model trained by Gaussian Process regression is introduced. Based on the meta-model, the expected maximum objective value and the expected sensitivity of each uncertainty can be analyzed with much higher efficiency and without much loss of accuracy.
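    A hedged sketch of the last two ingredients follows: quasi-Monte Carlo sampling over the retained dominant uncertainties and a Gaussian Process meta-model, using SciPy and scikit-learn. The objective function is a stand-in, since the VEGA simulation itself is not available; the number of uncertainties, bounds, and kernel choice are assumptions.

```python
import numpy as np
from scipy.stats import qmc
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

# Placeholder for the launcher response; a real study would call the
# expensive launch-vehicle simulation here (assumed scalar objective).
def objective(u):
    return np.sum(u**2, axis=1) + 0.1 * np.sin(5 * u[:, 0])

n_uncertainties = 4      # dominant uncertainties kept after sensitivity analysis
sampler = qmc.Sobol(d=n_uncertainties, scramble=True, seed=1)
u_train = qmc.scale(sampler.random_base2(m=7), -1.0, 1.0)   # 128 quasi-MC samples
y_train = objective(u_train)

# Gaussian Process meta-model trained on the expensive simulations.
gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
gp.fit(u_train, y_train)

# Cheap exploration of the unexplored uncertainty space via the meta-model:
# estimate a pessimistic (upper-confidence) objective from a dense prediction.
u_dense = qmc.scale(
    qmc.Sobol(d=n_uncertainties, scramble=True, seed=2).random_base2(m=12),
    -1.0, 1.0)
mean, std = gp.predict(u_dense, return_std=True)
print("predicted worst case:", (mean + 2 * std).max())
```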

    On the Combined Impact of Population Size and Sub-problem Selection in MOEA/D

    This paper intends to understand and to improve the working principle of decomposition-based multi-objective evolutionary algorithms. We review the design of the well-established Moea/d framework to support the smooth integration of different strategies for sub-problem selection, while emphasizing the role of the population size and of the number of offspring created at each generation. By conducting a comprehensive empirical analysis on a wide range of multi- and many-objective combinatorial NK landscapes, we provide new insights into the combined effect of those parameters on the anytime performance of the underlying search process. In particular, we show that even a simple strategy that selects sub-problems at random outperforms existing sophisticated strategies. We also study the sensitivity of such strategies with respect to the ruggedness and the objective space dimension of the target problem.
    Comment: European Conference on Evolutionary Computation in Combinatorial Optimization, Apr 2020, Seville, Spain
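    To make the random sub-problem selection idea concrete, here is a minimal decomposition-based sketch written for a toy continuous bi-objective problem rather than the NK landscapes of the paper; the Tchebycheff aggregation, the variation operator, and all parameter values are illustrative assumptions, not the paper's Moea/d configuration.

```python
import numpy as np

def moead_random_selection(f, dim, n_sub=50, neigh=10, iters=5000, seed=0):
    """Minimal bi-objective decomposition-based search with Tchebycheff
    aggregation, where the sub-problem to improve is picked uniformly at
    random at each step."""
    rng = np.random.default_rng(seed)
    w = np.stack([np.linspace(0, 1, n_sub), np.linspace(1, 0, n_sub)], axis=1)
    dist = np.linalg.norm(w[:, None] - w[None], axis=2)
    B = np.argsort(dist, axis=1)[:, :neigh]            # neighbourhoods of weights
    X = rng.random((n_sub, dim))
    F = np.array([f(x) for x in X])
    z = F.min(axis=0)                                   # ideal point

    def g(fv, wi):                                      # Tchebycheff scalarisation
        return np.max(wi * np.abs(fv - z))

    for _ in range(iters):
        i = rng.integers(n_sub)                         # random sub-problem selection
        a, b = rng.choice(B[i], 2, replace=False)
        child = np.clip(X[a] + 0.5 * (X[a] - X[b])
                        + 0.05 * rng.standard_normal(dim), 0, 1)
        fc = f(child)
        z = np.minimum(z, fc)
        for j in B[i]:                                  # replace worse neighbours
            if g(fc, w[j]) < g(F[j], w[j]):
                X[j], F[j] = child, fc
    return X, F

# Example: a toy bi-objective problem on [0, 1]^dim.
f = lambda x: np.array([x[0], 1 - np.sqrt(x[0]) + np.sum(x[1:] ** 2)])
pop, objs = moead_random_selection(f, dim=5)
```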

    A General Framework for Fast Stagewise Algorithms

    Forward stagewise regression follows a very simple strategy for constructing a sequence of sparse regression estimates: it starts with all coefficients equal to zero, and iteratively updates the coefficient (by a small amount ε) of the variable that achieves the maximal absolute inner product with the current residual. This procedure has an interesting connection to the lasso: under some conditions, it is known that the sequence of forward stagewise estimates exactly coincides with the lasso path as the step size ε goes to zero. Furthermore, essentially the same equivalence holds outside of least squares regression, with the minimization of a differentiable convex loss function subject to an ℓ1 norm constraint (the stagewise algorithm now updates the coefficient corresponding to the maximal absolute component of the gradient). Even when they do not match their ℓ1-constrained analogues, stagewise estimates provide a useful approximation, and are computationally appealing. Their success in sparse modeling motivates the question: can a simple, effective strategy like forward stagewise be applied more broadly in other regularization settings, beyond the ℓ1 norm and sparsity? The current paper is an attempt to do just this. We present a general framework for stagewise estimation, which yields fast algorithms for problems such as group-structured learning, matrix completion, image denoising, and more.
    Comment: 56 pages, 15 figures
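    A short sketch of the basic forward stagewise loop described above; the synthetic data, step size, and step count are illustrative.

```python
import numpy as np

def forward_stagewise(X, y, eps=0.01, n_steps=5000):
    """Forward stagewise regression: repeatedly nudge (by +/- eps) the
    coefficient of the variable most correlated with the current residual."""
    beta = np.zeros(X.shape[1])
    path = [beta.copy()]
    residual = y - X @ beta
    for _ in range(n_steps):
        corr = X.T @ residual                 # inner products with the residual
        j = np.argmax(np.abs(corr))           # most correlated variable
        step = eps * np.sign(corr[j])         # small update in that direction
        beta[j] += step
        residual -= step * X[:, j]
        path.append(beta.copy())
    return np.array(path)

# Example on standardised synthetic data; as eps -> 0 the path of estimates
# approaches the lasso path under the conditions discussed above.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 10))
X = (X - X.mean(0)) / X.std(0)
y = 3.0 * X[:, 0] - 1.5 * X[:, 1] + rng.standard_normal(100)
path = forward_stagewise(X, y)
```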

    Performance assessment of Surrogate model integrated with sensitivity analysis in multi-objective optimization

    This thesis develops a new multi-objective heuristic algorithm. The optimum search task is performed by a standard genetic algorithm, assisted by a Response Surface Methodology surrogate model and by two sensitivity analysis methods: the variance-based (Sobol') analysis and the Elementary Effects method. Once the entire method is built, it is compared with several other algorithms on a set of multi-objective problems.
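    As a concrete illustration of one of the named ingredients, here is a minimal sketch of Elementary Effects (Morris) screening on a placeholder objective, assuming inputs scaled to [0, 1]; it is not the thesis implementation, and the trajectory count and step size are assumptions.

```python
import numpy as np

def elementary_effects(f, dim, n_trajectories=20, delta=0.1, seed=0):
    """Morris elementary-effects screening: from each random base point,
    perturb one input at a time by delta and record the induced change."""
    rng = np.random.default_rng(seed)
    effects = np.zeros((n_trajectories, dim))
    for t in range(n_trajectories):
        base = rng.random(dim)
        f_base = f(base)
        for i in range(dim):
            step = base.copy()
            step[i] = np.clip(step[i] + delta, 0.0, 1.0)
            effects[t, i] = (f(step) - f_base) / (step[i] - base[i])
    mu_star = np.abs(effects).mean(axis=0)   # mean absolute effect: overall influence
    sigma = effects.std(axis=0)              # spread: interactions / non-linearity
    return mu_star, sigma

# Example: screen a cheap stand-in objective before running the genetic algorithm.
f = lambda x: x[0] ** 2 + 0.5 * x[1] + 0.01 * x[2]
mu_star, sigma = elementary_effects(f, dim=3)
```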

    Robust and Fair Machine Learning under Distribution Shift

    Machine learning algorithms have been widely used in real-world applications. The development of these techniques has brought huge benefits for many AI-related tasks, such as natural language processing, image classification, video analysis, and so forth. In traditional machine learning algorithms, we usually assume that the training data and test data are independently and identically distributed (iid), so that the model learned from the training data can be applied to the test data with good prediction performance. However, this assumption is quite restrictive, because a distribution shift can exist from the training data to the test data in many scenarios. In addition, the goal of a traditional machine learning model is to maximize prediction performance, e.g., accuracy, based on the historical training data, which may lead to unfair predictions for particular individuals or groups. In the literature, researchers focus either on building robust machine learning models under data distribution shift or on achieving fairness, without considering solving both problems simultaneously. The goal of this dissertation is to solve these challenging issues in fair machine learning under distribution shift. We start by building an agnostic fair framework in federated learning, where the data distribution is more diversified and a distribution shift exists from the training data to the test data. Then we build a robust framework to address sample selection bias for fair classification. Next we solve the sample selection bias issue for fair regression. Finally, we propose an adversarial framework to build a personalized model in the distributed setting, where a distribution shift exists between different users. In this dissertation, we conduct the following research for fair machine learning under distribution shift:
    • We develop a fairness-aware agnostic federated learning framework (AgnosticFair) to deal with the challenge of unknown testing distribution;
    • We propose a framework for robust and fair learning under sample selection bias;
    • We develop a framework for fair regression under sample selection bias, where dependent variable values of a set of training samples are missing as a result of another hidden process;
    • We propose a learning framework that allows an individual user to build a personalized model in a distributed setting, where the distribution shift exists among different users.
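    As a generic illustration (not the dissertation's framework), the sketch below corrects sample selection bias with inverse-probability weighting and then checks a demographic-parity gap on synthetic data; the data-generating process, selection mechanism, and all variable names are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic population: features x, sensitive attribute s, label y.
n = 5000
s = rng.integers(0, 2, n)
x = rng.standard_normal((n, 2)) + 0.5 * s[:, None]
y = (x[:, 0] + 0.5 * s + 0.5 * rng.standard_normal(n) > 0).astype(int)

# Biased selection into the training set (selection depends on x).
p_select = 1 / (1 + np.exp(-1.5 * x[:, 1]))
selected = rng.random(n) < p_select

# Inverse-probability weights estimated from a selection model.
sel_model = LogisticRegression().fit(x, selected.astype(int))
weights = 1.0 / sel_model.predict_proba(x[selected])[:, 1]

# Weighted classifier: the weights correct the selection bias in training.
clf = LogisticRegression().fit(x[selected], y[selected], sample_weight=weights)

# Monitor the demographic-parity gap on the full population.
pred = clf.predict(x)
gap = abs(pred[s == 1].mean() - pred[s == 0].mean())
print("demographic parity gap:", round(gap, 3))
```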