648 research outputs found

    Runtime Analysis for the NSGA-II: Proving, Quantifying, and Explaining the Inefficiency For Many Objectives

    The NSGA-II is one of the most prominent algorithms to solve multi-objective optimization problems. Despite numerous successful applications, several studies have shown that the NSGA-II is less effective for larger numbers of objectives. In this work, we use mathematical runtime analyses to rigorously demonstrate and quantify this phenomenon. We show that even on the simple $m$-objective generalization of the discrete OneMinMax benchmark, where every solution is Pareto optimal, the NSGA-II, even with large population sizes, cannot compute the full Pareto front (objective vectors of all Pareto optima) in sub-exponential time when the number of objectives is at least three. The reason for this unexpected behavior lies in the fact that in the computation of the crowding distance, the different objectives are regarded independently. This is not a problem for two objectives, where any sorting of a pair-wise incomparable set of solutions according to one objective is also such a sorting according to the other objective (in the inverse order).
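
    The per-objective nature of the crowding distance described in this abstract can be illustrated with a minimal sketch. It follows the textbook NSGA-II definition (boundary points get infinite distance, interior points accumulate normalized neighbor gaps); the function name `crowding_distance` and the toy front are assumptions for illustration, not code from the paper.

```python
from math import inf

def crowding_distance(points):
    """Textbook crowding distance: every objective is sorted and scored independently."""
    n = len(points)
    if n == 0:
        return []
    m = len(points[0])
    dist = [0.0] * n
    for k in range(m):
        # Sort indices by the k-th objective only; other objectives are ignored here.
        order = sorted(range(n), key=lambda i: points[i][k])
        lo, hi = points[order[0]][k], points[order[-1]][k]
        dist[order[0]] = dist[order[-1]] = inf  # boundary points are always kept
        if hi == lo:
            continue
        for pos in range(1, n - 1):
            gap = points[order[pos + 1]][k] - points[order[pos - 1]][k]
            dist[order[pos]] += gap / (hi - lo)
    return dist

# Bi-objective OneMinMax front for bit strings of length 4: (#zeros, #ones).
front = [(0, 4), (1, 3), (2, 2), (3, 1), (4, 0)]
by_first = sorted(range(len(front)), key=lambda i: front[i][0])
by_second = sorted(range(len(front)), key=lambda i: front[i][1])
print(by_first == by_second[::-1])  # True: the two sortings are mutually inverse
print(crowding_distance(front))     # [inf, 1.0, 1.0, 1.0, inf]
```

    With two objectives the two sortings coincide up to reversal, so the neighbors used for each objective are the same solutions. For three or more objectives no such relation needs to hold, which is the effect the abstract identifies.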

    Sharp Bounds for Genetic Drift in EDAs

    Estimation of Distribution Algorithms (EDAs) are one branch of Evolutionary Algorithms (EAs) in the broad sense that they evolve a probabilistic model instead of a population. Many existing algorithms fall into this category. Analogous to genetic drift in EAs, EDAs also encounter the phenomenon that updates of the probabilistic model not justified by the fitness move the sampling frequencies to the boundary values. This can result in a considerable performance loss. This paper proves the first sharp estimates of the boundary hitting time of the sampling frequency of a neutral bit for several univariate EDAs. For the UMDA that selects $\mu$ best individuals from $\lambda$ offspring each generation, we prove that the expected first iteration when the frequency of the neutral bit leaves the middle range $[\tfrac14, \tfrac34]$ and the expected first time it is absorbed in 0 or 1 are both $\Theta(\mu)$. The corresponding hitting times are $\Theta(K^2)$ for the cGA with hypothetical population size $K$. This paper further proves that for PBIL with parameters $\mu$, $\lambda$, and $\rho$, in an expected number of $\Theta(\mu/\rho^2)$ iterations the sampling frequency of a neutral bit leaves the interval $[\Theta(\rho/\mu), 1-\Theta(\rho/\mu)]$ and then always the same value is sampled for this bit, that is, the frequency approaches the corresponding boundary value with maximum speed. For the lower bounds implicit in these statements, we also show exponential tail bounds. If a bit is not necessarily neutral, but is neutral or has a preference for ones, then the lower bounds on the times to reach a low frequency value still hold. An analogous statement holds for bits that are neutral or prefer the value zero.
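
    The drift of a neutral bit in the cGA can be simulated with a short sketch. It assumes the standard cGA update (two samples per iteration, frequency moved by $1/K$ toward the winner) and, because the bit is neutral, treats either sample as the winner with probability 1/2. The function name `neutral_bit_exit_time` and the integer bookkeeping are illustrative assumptions, not taken from the paper.

```python
import random

def neutral_bit_exit_time(K, rng, max_iters=10**7):
    """Iterations until a neutral bit's cGA frequency leaves [1/4, 3/4].

    The frequency is stored as an integer count c with frequency c / K,
    which avoids floating-point error in the boundary test.
    """
    if K <= 0 or K % 2 != 0:
        raise ValueError("K must be a positive even integer so that c = K/2 is exact")
    c = K // 2  # initial frequency 1/2
    for t in range(1, max_iters + 1):
        p = c / K
        a = rng.random() < p  # bit value in the first sample
        b = rng.random() < p  # bit value in the second sample
        if a != b:
            # Neutral bit: fitness ignores it, so either sample wins with probability 1/2.
            winner_bit = a if rng.random() < 0.5 else b
            c += 1 if winner_bit else -1
        if 4 * c < K or 4 * c > 3 * K:
            return t
    raise RuntimeError("frequency stayed in the middle range for max_iters iterations")

if __name__ == "__main__":
    rng = random.Random(0)
    for K in (20, 40, 80):
        runs = [neutral_bit_exit_time(K, rng) for _ in range(200)]
        print(K, sum(runs) / len(runs))
```

    If the $\Theta(K^2)$ bound quoted above applies to this simplified model, the printed averages should grow roughly fourfold each time $K$ doubles; the sketch makes no claim about the constants.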

    From Understanding Genetic Drift to a Smart-Restart Mechanism for Estimation-of-Distribution Algorithms

    Estimation-of-distribution algorithms (EDAs) are optimization algorithms that learn a distribution on the search space from which good solutions can be sampled easily. A key parameter of most EDAs is the sample size (population size). If the population size is too small, the update of the probabilistic model builds on few samples, leading to the undesired effect of genetic drift. Too large population sizes avoid genetic drift, but slow down the process. Building on a recent quantitative analysis of how the population size leads to genetic drift, we design a smart-restart mechanism for EDAs. By stopping runs when the risk for genetic drift is high, it automatically runs the EDA in good parameter regimes. Via a mathematical runtime analysis, we prove a general performance guarantee for this smart-restart scheme. This in particular shows that in many situations where the optimal (problem-specific) parameter values are known, the restart scheme automatically finds these, leading to the asymptotically optimal performance. We also conduct an extensive experimental analysis. On four classic benchmark problems, we clearly observe the critical influence of the population size on the performance, and we find that the smart-restart scheme leads to a performance close to the one obtainable with optimal parameter values. Our results also show that previous theory-based suggestions for the optimal population size can be far from the optimal ones, leading to a performance clearly inferior to the one obtained via the smart-restart scheme. We also conduct experiments with PBIL (cross-entropy algorithm) on two combinatorial optimization problems from the literature, the max-cut problem and the bipartition problem. Again, we observe that the smart-restart mechanism finds much better values for the population size than those suggested in the literature, leading to a much better performance. Comment: Accepted for publication in "Journal of Machine Learning Research". Extended version of our GECCO 2020 paper. This article supersedes arXiv:2004.0714

    Building K-Anonymous User Cohorts with Consecutive Consistent Weighted Sampling (CCWS)

    To retrieve personalized campaigns and creatives while protecting user privacy, digital advertising is shifting from member-based identity to cohort-based identity. Under such an identity regime, an accurate and efficient cohort building algorithm is desired to group users with similar characteristics. In this paper, we propose a scalable $K$-anonymous cohort building algorithm called consecutive consistent weighted sampling (CCWS). The proposed method combines the spirit of the ($p$-powered) consistent weighted sampling and hierarchical clustering, so that the $K$-anonymity is ensured by enforcing a lower bound on the size of cohorts. Evaluations on a LinkedIn dataset consisting of more than 70M users and ads campaigns demonstrate that CCWS achieves substantial improvements over several hashing-based methods including sign random projections (SignRP), minwise hashing (MinHash), as well as the vanilla CWS.
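
    For orientation, the following is a minimal sketch of plain consistent weighted sampling, the baseline named in this abstract. It is not the CCWS algorithm and includes no cohort-size constraint. The abstract does not say which CWS variant is used, so the sketch assumes Ioffe's improved CWS (ICWS) scheme; the function names, the dictionary input format, and the string-based seeding are illustrative assumptions.

```python
import math
import random

def icws_sketch(weights, num_hashes, seed=0):
    """Return num_hashes samples (feature, t) for a nonempty dict feature -> positive weight."""
    if not any(s > 0 for s in weights.values()):
        raise ValueError("need at least one positive weight")
    sketch = []
    for j in range(num_hashes):
        best = None
        for k, s in weights.items():
            if s <= 0:
                continue
            # Consistency: the random draws depend only on (seed, j, k), never on the vector.
            rng = random.Random(f"{seed}:{j}:{k}")
            r = rng.gammavariate(2.0, 1.0)
            c = rng.gammavariate(2.0, 1.0)
            beta = rng.random()
            t = math.floor(math.log(s) / r + beta)
            a = c / math.exp(r * (t - beta + 1))
            if best is None or a < best[0]:
                best = (a, k, t)
        sketch.append((best[1], best[2]))
    return sketch

def estimate_similarity(u, v, num_hashes=256, seed=0):
    """Fraction of matching samples; for ICWS this estimates the weighted Jaccard similarity."""
    su = icws_sketch(u, num_hashes, seed)
    sv = icws_sketch(v, num_hashes, seed)
    return sum(x == y for x, y in zip(su, sv)) / num_hashes

if __name__ == "__main__":
    u = {"sports": 2.0, "news": 1.0}
    v = {"sports": 1.0, "news": 1.0, "music": 0.5}
    # Exact weighted Jaccard here is (1 + 1) / (2 + 1 + 0.5); the estimate is only approximate.
    print(estimate_similarity(u, v))
```

    Users whose samples collide could then be grouped together. How CCWS applies samples consecutively and enforces the lower bound on cohort size is described in the paper itself and is not reproduced here.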

    Lifelong Sequential Modeling with Personalized Memorization for User Response Prediction

    User response prediction, which models the user preference w.r.t. the presented items, plays a key role in online services. After two decades of rapid development, the accumulated user behavior sequences on mature Internet service platforms, counted from each user's first registration, have become extremely long. Each user not only has intrinsic tastes, but also keeps changing her personal interests over her lifetime. Hence, it is challenging to handle such lifelong sequential modeling for each individual user. Existing methodologies for sequential modeling are only capable of dealing with relatively recent user behaviors, which leaves considerable room for modeling long-term, and especially lifelong, sequential patterns to facilitate user modeling. Moreover, one user's behavior may be accounted for by various previous behaviors within her whole online activity history, i.e., long-term dependency with multi-scale sequential patterns. In order to tackle these challenges, in this paper, we propose a Hierarchical Periodic Memory Network for lifelong sequential modeling with personalized memorization of sequential patterns for each user. The model also adopts a hierarchical and periodical updating mechanism to capture multi-scale sequential patterns of user interests while supporting the evolving user behavior logs. The experimental results over three large-scale real-world datasets have demonstrated the advantages of our proposed model, with significant improvement in user response prediction performance against the state of the art. Comment: SIGIR 2019. Reproducible codes and datasets: https://github.com/alimamarankgroup/HPM

    A novel deflection shape function for rectangular capacitive micromachined ultrasonic transducer diaphragms

    The final publication is available at Elsevier via http://dx.doi.org/10.2174/1874347101206010001. © 2015. This version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/ A highly accurate analytical deflection shape function that describes the deflection profiles of capacitive micromachined ultrasonic transducers (CMUTs) with rectangular membranes under electrostatic pressure has been formulated. The rectangular diaphragms have a thickness range of 0.6–1.5 μm and a side length range of 100–1000 μm. The new deflection shape function generates deflection profiles that are in excellent agreement with finite element analysis (FEA) results for a wide range of geometry dimensions and loading conditions. The deflection shape function is used to analyze membrane deformations and to calculate the capacitances between the deformed membranes and the fixed back plates. In 50 groups of random tests, compared with FEA results, the calculated capacitance values have a maximum deviation of 1.486% for rectangular membranes. The new analytical deflection function can provide designers with a simple way of gaining insight into the effects of designed parameters for CMUTs and other MEMS-based capacitive type sensors. National Basic Research Program of China under Grant 2014CB845302 and by National Natural Science Foundation (NNSF) of China under Grants 61374036, 61273121, and Natural Science Foundation of Guangdong Province under Grant 2014A030313237, and by Natural Science and Engineering Research Council of Canada.