648 research outputs found
Runtime Analysis for the NSGA-II: Proving, Quantifying, and Explaining the Inefficiency For Many Objectives
The NSGA-II is one of the most prominent algorithms to solve multi-objective
optimization problems. Despite numerous successful applications, several
studies have shown that the NSGA-II is less effective for larger numbers of
objectives. In this work, we use mathematical runtime analyses to rigorously
demonstrate and quantify this phenomenon. We show that even on the simple
m-objective generalization of the discrete OneMinMax benchmark, where every
solution is Pareto optimal, the NSGA-II, even with large population sizes, cannot
compute the full Pareto front (objective vectors of all Pareto optima) in
sub-exponential time when the number of objectives is at least three. The
reason for this unexpected behavior lies in the fact that in the computation of
the crowding distance, the different objectives are regarded independently.
This is not a problem for two objectives, where any sorting of a pair-wise
incomparable set of solutions according to one objective is also such a sorting
according to the other objective (in the inverse order)
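
The crowding-distance argument in this abstract can be illustrated with a minimal sketch. The function below follows the common textbook formulation of crowding distance (boundary points get infinite distance, and each objective contributes a normalized gap between sorted neighbours); it is not taken from the paper, and the name `crowding_distance` and the sample points are assumptions made for illustration. Each objective is handled in its own independent pass, which is the property the abstract identifies as the source of the problem.

```python
import math


def crowding_distance(points):
    """Crowding distance for a list of objective vectors (tuples of equal length).

    Textbook-style sketch: every objective is sorted and scored on its own,
    and the per-objective contributions are simply added up.
    """
    n = len(points)
    if n == 0:
        return []
    m = len(points[0])
    dist = [0.0] * n
    for k in range(m):
        # Sort indices by the k-th objective only; other objectives are ignored here.
        order = sorted(range(n), key=lambda i: points[i][k])
        lo, hi = points[order[0]][k], points[order[-1]][k]
        # Boundary solutions of each objective receive infinite distance.
        dist[order[0]] = math.inf
        dist[order[-1]] = math.inf
        if hi == lo:
            continue
        for pos in range(1, n - 1):
            i = order[pos]
            gap = points[order[pos + 1]][k] - points[order[pos - 1]][k]
            dist[i] += gap / (hi - lo)
    return dist


# Two-objective, pairwise incomparable example in the style of OneMinMax with n = 4:
# sorting by the first objective is the reverse of sorting by the second.
front = [(0, 4), (1, 3), (2, 2), (3, 1), (4, 0)]
by_first = sorted(front, key=lambda p: p[0])
by_second = sorted(front, key=lambda p: p[1])
distances = crowding_distance(front)
```

For two objectives the two sortings coincide up to reversal, so each solution has the same neighbours in both passes. With three or more objectives the per-objective neighbours can differ, which is the situation the abstract refers to.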
Sharp Bounds for Genetic Drift in EDAs
Estimation of Distribution Algorithms (EDAs) are one branch of Evolutionary
Algorithms (EAs) in the broad sense that they evolve a probabilistic model
instead of a population. Many existing algorithms fall into this category.
Analogous to genetic drift in EAs, EDAs also encounter the phenomenon that
updates of the probabilistic model not justified by the fitness move the
sampling frequencies to the boundary values. This can result in a considerable
performance loss.
This paper proves the first sharp estimates of the boundary hitting time of
the sampling frequency of a neutral bit for several univariate EDAs. For the
UMDA that selects $\mu$ best individuals from $\lambda$ offspring each
generation, we prove that the expected first iteration when the frequency of
the neutral bit leaves the middle range $[\tfrac 14, \tfrac 34]$ and the
expected first time it is absorbed in 0 or 1 are both $\Theta(\mu)$. The
corresponding hitting times are $\Theta(K^2)$ for the cGA with hypothetical
population size $K$. This paper further proves that for PBIL with parameters
$\mu$, $\lambda$, and $\rho$, in an expected number of $\Theta(\mu/\rho^2)$
iterations the sampling frequency of a neutral bit leaves the interval
$[\Theta(\rho/\mu), 1-\Theta(\rho/\mu)]$ and then always the same value is
sampled for this bit, that is, the frequency approaches the corresponding
boundary value with maximum speed.
For the lower bounds implicit in these statements, we also show exponential
tail bounds. If a bit is not necessarily neutral, but is either neutral or has a preference for ones,
then the lower bounds on the times to reach a low frequency value still hold.
An analogous statement holds for bits that are neutral or prefer the value
zero
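
The drift of a neutral bit described in this abstract can be explored with a small simulation. This is a sketch under stated assumptions, not the paper's analysis: it assumes that selection is independent of a neutral bit, so that the selected individuals' values for this bit are independent samples from the current frequency, and it assumes that no frequency borders are used, so that 0 and 1 are absorbing. The function name `umda_neutral_bit_hitting_time` and the parameter `max_iters` are hypothetical.

```python
import random


def umda_neutral_bit_hitting_time(mu, rng, max_iters=10**6):
    """Iterations until the frequency of a neutral bit is absorbed in 0 or 1.

    Assumption: the mu selected individuals carry independent samples of the
    neutral bit, so the new frequency is the fraction of ones among them.
    Returns max_iters if no absorption happened within the iteration cap.
    """
    p = 0.5  # initial sampling frequency
    for t in range(1, max_iters + 1):
        ones = sum(rng.random() < p for _ in range(mu))
        p = ones / mu
        if ones == 0 or ones == mu:
            return t
    return max_iters


def mean_hitting_time(mu, runs, seed=0):
    """Average absorption time over several independent runs."""
    rng = random.Random(seed)
    return sum(umda_neutral_bit_hitting_time(mu, rng) for _ in range(runs)) / runs


# Example use: compare empirical absorption times for a few selection sizes.
averages = {mu: mean_hitting_time(mu, runs=200) for mu in (4, 16, 64)}
```

Plotting `averages` against the selection size gives an empirical impression of how the absorption time grows with it; a rigorous statement requires the paper's analysis.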
From Understanding Genetic Drift to a Smart-Restart Mechanism for Estimation-of-Distribution Algorithms
Estimation-of-distribution algorithms (EDAs) are optimization algorithms that
learn a distribution on the search space from which good solutions can be
sampled easily. A key parameter of most EDAs is the sample size (population
size). If the population size is too small, the update of the probabilistic
model builds on few samples, leading to the undesired effect of genetic drift.
Too large population sizes avoid genetic drift, but slow down the process.
Building on a recent quantitative analysis of how the population size leads
to genetic drift, we design a smart-restart mechanism for EDAs. By stopping
runs when the risk for genetic drift is high, it automatically runs the EDA in
good parameter regimes.
Via a mathematical runtime analysis, we prove a general performance guarantee
for this smart-restart scheme. This in particular shows that in many situations
where the optimal (problem-specific) parameter values are known, the restart
scheme automatically finds these, leading to the asymptotically optimal
performance.
We also conduct an extensive experimental analysis. On four classic benchmark
problems, we clearly observe the critical influence of the population size on
the performance, and we find that the smart-restart scheme leads to a
performance close to the one obtainable with optimal parameter values. Our
results also show that previous theory-based suggestions for the optimal
population size can be far from the optimal ones, leading to a performance
clearly inferior to the one obtained via the smart-restart scheme. We also
conduct experiments with PBIL (cross-entropy algorithm) on two combinatorial
optimization problems from the literature, the max-cut problem and the
bipartition problem. Again, we observe that the smart-restart mechanism finds
much better values for the population size than those suggested in the
literature, leading to a much better performance.
Comment: Accepted for publication in "Journal of Machine Learning Research".
Extended version of our GECCO 2020 paper. This article supersedes
arXiv:2004.0714
Building K-Anonymous User Cohorts with Consecutive Consistent Weighted Sampling (CCWS)
To retrieve personalized campaigns and creatives while protecting user
privacy, digital advertising is shifting from member-based identity to
cohort-based identity. Under such an identity regime, an accurate and efficient
cohort building algorithm is desired to group users with similar
characteristics. In this paper, we propose a scalable K-anonymous cohort
building algorithm called consecutive consistent weighted sampling
(CCWS). The proposed method combines the spirit of the (p-powered) consistent
weighted sampling and hierarchical clustering, so that K-anonymity is
ensured by enforcing a lower bound on the size of cohorts. Evaluations on a
LinkedIn dataset consisting of M users and ads campaigns demonstrate that
CCWS achieves substantial improvements over several hashing-based methods
including sign random projections (SignRP), minwise hashing (MinHash), as well
as the vanilla CWS
Lifelong Sequential Modeling with Personalized Memorization for User Response Prediction
User response prediction, which models the user preference w.r.t. the
presented items, plays a key role in online services. After two decades of rapid
development, the accumulated user behavior sequences on mature Internet
service platforms have become extremely long, dating back to each user's first
registration. Each user not only has intrinsic tastes, but also keeps changing
her personal interests over her lifetime. Hence, it is challenging to handle such
lifelong sequential modeling for each individual user. Existing methodologies
for sequential modeling are only capable of dealing with relatively recent user
behaviors, which leaves considerable room for modeling long-term, and especially lifelong,
sequential patterns to facilitate user modeling. Moreover, one user's behavior
may be accounted for by various previous behaviors within her whole online
activity history, i.e., long-term dependency with multi-scale sequential
patterns. In order to tackle these challenges, in this paper, we propose a
Hierarchical Periodic Memory Network for lifelong sequential modeling with
personalized memorization of sequential patterns for each user. The model also
adopts a hierarchical and periodical updating mechanism to capture multi-scale
sequential patterns of user interests while supporting the evolving user
behavior logs. The experimental results over three large-scale real-world
datasets have demonstrated the advantages of our proposed model with
significant improvement in user response prediction performance against the
state of the art.
Comment: SIGIR 2019. Reproducible codes and datasets:
https://github.com/alimamarankgroup/HPM
A novel deflection shape function for rectangular capacitive micromachined ultrasonic transducer diaphragms
The final publication is available at Elsevier via http://dx.doi.org/10.2174/1874347101206010001. © 2015. This version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/
A highly accurate analytical deflection shape function that describes the deflection profiles of capacitive micromachined ultrasonic transducers (CMUTs) with rectangular membranes under electrostatic pressure has been formulated. The rectangular diaphragms have a thickness range of 0.6–1.5 μm and a side length range of 100–1000 μm. The new deflection shape function generates deflection profiles that are in excellent agreement with finite element analysis (FEA) results for a wide range of geometry dimensions and loading conditions. The deflection shape function is used to analyze membrane deformations and to calculate the capacitances between the deformed membranes and the fixed back plates. In 50 groups of random tests, compared with FEA results, the calculated capacitance values have a maximum deviation of 1.486% for rectangular membranes. The new analytical deflection function can provide designers with a simple way of gaining insight into the effects of designed parameters for CMUTs and other MEMS-based capacitive type sensors.
National Basic Research Program of China under Grant 2014CB845302 and by National Natural Science Foundation (NNSF) of China under Grants 61374036, 61273121, and Natural Science Foundation of Guangdong Province under Grant 2014A030313237, and by Natural Science and Engineering Research Council of Canada