Insights into the feature selection problem using local optima networks
This paper investigates the binary feature selection problem through fitness landscape analysis, which allows for a better understanding of the behaviour of feature selection algorithms. Local optima networks are employed as a tool to visualise and characterise the fitness landscapes of the feature selection problem in the context of classification. An analysis of the global structure of the fitness landscape is provided, based on seven real-world datasets with up to 17 features. The formation of neutral global-optima plateaus is shown to indicate the presence of irrelevant features in the datasets. Removing irrelevant features reduced both the neutrality and the ratio of local optima to the size of the search space, which in turn improved the performance of genetic algorithm search in finding the global optimum.
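To make the landscape construction concrete, here is a minimal sketch (not the paper's code) of enumerating a local optima network for a tiny binary feature-selection landscape. The additive fitness function, its weights, the one-bit-flip neighbourhood, and the simplified basin-transition edge definition are illustrative assumptions; zero-weight bits stand in for irrelevant features and produce the neutral plateaus discussed above.

```python
# Exhaustively build a local optima network (LON) for a toy binary landscape.
from itertools import product


def fitness(bits):
    # Hypothetical surrogate for "classification accuracy of this feature subset";
    # zero weights mimic irrelevant features and create neutrality.
    weights = [0.9, 0.0, 0.4, 0.0, 0.7]
    return sum(w * b for w, b in zip(weights, bits))


def neighbours(bits):
    # One-bit-flip (Hamming distance 1) neighbourhood.
    for i in range(len(bits)):
        flipped = list(bits)
        flipped[i] = 1 - flipped[i]
        yield tuple(flipped)


def hill_climb(bits):
    # Best-improvement local search; returns the local optimum reached.
    while True:
        best = max(neighbours(bits), key=fitness)
        if fitness(best) <= fitness(bits):
            return bits
        bits = best


n_features = 5
basin_of = {s: hill_climb(s) for s in product((0, 1), repeat=n_features)}
local_optima = set(basin_of.values())

# LON edges: (u, v) exists if a one-bit flip applied to a solution in u's basin
# lands in v's basin (a simplified basin-transition definition).
edges = {}
for s, u in basin_of.items():
    for t in neighbours(s):
        v = basin_of[t]
        if u != v:
            edges[(u, v)] = edges.get((u, v), 0) + 1

print(f"{len(local_optima)} local optima, {len(edges)} LON edges")
```

With the two zero-weight features, several solutions tie for the best fitness, so the count of local optima is inflated by the neutral plateau; removing those bits shrinks both the plateau and the ratio of local optima to search-space size.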
Adaptive Normalized Risk-Averting Training For Deep Neural Networks
This paper proposes a set of new error criteria and learning approaches, Adaptive Normalized Risk-Averting Training (ANRAT), to attack the non-convex optimization problem in training deep neural networks (DNNs). Theoretically, we demonstrate its effectiveness on global and local convexity lower-bounded by the standard norm error. By analyzing the gradient with respect to the convexity index, we explain why learning it adaptively using gradient descent works. In practice, we show how this method improves the training of deep neural networks on visual recognition tasks on the MNIST and CIFAR-10 datasets. Without using pretraining or other tricks, we obtain results comparable or superior to those reported in recent literature on the same tasks using standard ConvNets with MSE or cross-entropy loss. Performance on deep/shallow multilayer perceptrons and Denoising Auto-encoders is also explored. ANRAT can be combined with other quasi-Newton training methods, innovative network variants, regularization techniques and other tricks specific to DNNs. Other than unsupervised pretraining, it provides a new perspective on the non-convex optimization problem in DNNs.
Comment: AAAI 2016; 0.39%~0.4% error rate on MNIST with a single 32-32-256-10 ConvNet; code available at https://github.com/cauchyturing/ANRA
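As a rough illustration of a risk-averting-style criterion with a learnable convexity index, the PyTorch sketch below uses a normalised log-sum-exp of per-example squared errors whose scale parameter is optimised jointly with the network weights. This is an assumption-laden reading of the general idea, not the exact ANRAT criterion or update rule from the paper or the linked repository.

```python
# Sketch: exponential risk-averting-style loss with a learnable convexity index.
import math
import torch
import torch.nn as nn


class RiskAvertingLoss(nn.Module):
    def __init__(self, init_lambda=1.0):
        super().__init__()
        # log-parameterised so the index stays positive while trained by SGD
        self.log_lam = nn.Parameter(torch.tensor(float(init_lambda)).log())

    def forward(self, pred, target):
        lam = self.log_lam.exp()
        sq_err = (pred - target).pow(2).sum(dim=1)  # per-example squared error
        n = sq_err.numel()
        # Normalised log-sum-exp: tends to the mean squared error as lambda -> 0
        # and to the worst-case (max) error as lambda grows, hence "risk-averting".
        return (torch.logsumexp(lam * sq_err, dim=0) - math.log(n)) / lam


# Usage: optimise the network weights and the convexity index jointly.
net = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 2))
criterion = RiskAvertingLoss()
opt = torch.optim.SGD(list(net.parameters()) + list(criterion.parameters()), lr=1e-2)

x, y = torch.randn(32, 4), torch.randn(32, 2)
loss = criterion(net(x), y)
opt.zero_grad()
loss.backward()
opt.step()
```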
Why Do Adversarial Attacks Transfer? Explaining Transferability of Evasion and Poisoning Attacks
Transferability captures the ability of an attack against a machine-learning
model to be effective against a different, potentially unknown, model.
Empirical evidence for transferability has been shown in previous work, but the
underlying reasons why an attack transfers or not are not yet well understood.
In this paper, we present a comprehensive analysis aimed at investigating the
transferability of both test-time evasion and training-time poisoning attacks.
We provide a unifying optimization framework for evasion and poisoning attacks,
and a formal definition of transferability of such attacks. We highlight two
main factors contributing to attack transferability: the intrinsic adversarial
vulnerability of the target model, and the complexity of the surrogate model
used to optimize the attack. Based on these insights, we define three metrics
that impact an attack's transferability. Interestingly, our results derived
from theoretical analysis hold for both evasion and poisoning attacks, and are
confirmed experimentally using a wide range of linear and non-linear
classifiers and datasets.
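A small illustrative experiment of the transferability setting (not the paper's framework): craft evasion perturbations against a linear surrogate classifier and measure how often they also fool a separately trained, nonlinear target model. The models, the synthetic data, and the perturbation budget eps are all assumptions.

```python
# Evaluate how well evasion perturbations crafted on a surrogate transfer to a target.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

X, y = make_classification(n_samples=600, n_features=20, random_state=0)
Xtr, ytr, Xte, yte = X[:400], y[:400], X[400:], y[400:]

surrogate = LogisticRegression(max_iter=1000).fit(Xtr, ytr)
target = SVC(kernel="rbf", gamma="scale").fit(Xtr, ytr)  # unknown to the attacker

# For a linear surrogate, the loss gradient w.r.t. the input is proportional to
# the weight vector, so an FGSM-style step moves each point against its true class.
eps = 0.5
w = surrogate.coef_[0]
signs = np.where(yte == 1, -1.0, 1.0)[:, None]  # push class-1 points toward class 0, and vice versa
X_adv = Xte + eps * signs * np.sign(w)[None, :]

print("target accuracy, clean       :", target.score(Xte, yte))
print("target accuracy, transferred :", target.score(X_adv, yte))
```

The gap between the two accuracies is a crude empirical measure of transferability from the surrogate to the target.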
A multiobjective optimization approach to statistical mechanics
Optimization problems have long been the subject of statistical-physics
approximations. An especially relevant and general scenario is provided by
optimization methods that consider tradeoffs between cost and efficiency, where
optimal solutions involve a compromise between the two. The theory of Pareto (or
multi-objective) optimization provides a general framework to explore these
problems and find the space of possible solutions compatible with the
underlying tradeoffs, known as the {\em Pareto front}. Conflicts between
constraints can lead to complex landscapes of Pareto-optimal solutions with
interesting implications in economics, engineering, and evolutionary biology.
Despite their disparate nature, here we show how the structure of the Pareto
front uncovers profound universal features that can be understood in the
context of thermodynamics. In particular, our study reveals that different
fronts are connected to different classes of phase transitions, which we can
define robustly, along with critical points and thermodynamic potentials. These
equivalences are illustrated with classic thermodynamic examples.
Comment: 14 pages, 8 figures
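For concreteness, here is a minimal sketch of extracting a Pareto front from a cloud of candidate solutions trading off two objectives (minimise cost, maximise efficiency). The random candidates and the toy tradeoff are placeholders: the thermodynamic analogy described above concerns the shape of such fronts, not any particular dataset.

```python
# Extract the Pareto front of a two-objective tradeoff (min cost, max efficiency).
import numpy as np

rng = np.random.default_rng(0)
cost = rng.uniform(0.0, 1.0, 500)
efficiency = np.sqrt(cost) * rng.uniform(0.5, 1.0, 500)  # toy cost/efficiency tradeoff


def pareto_front(cost, efficiency):
    # Keep a point unless some other point is strictly better in both objectives.
    keep = []
    for i in range(len(cost)):
        dominated = np.any((cost < cost[i]) & (efficiency > efficiency[i]))
        if not dominated:
            keep.append(i)
    return np.array(keep)


front = pareto_front(cost, efficiency)
print(f"{len(front)} Pareto-optimal points out of {len(cost)} candidates")
```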
Generative Adversarial Networks (GANs): Challenges, Solutions, and Future Directions
Generative Adversarial Networks (GANs) are a novel class of deep generative models that has recently gained significant attention. GANs implicitly learn complex, high-dimensional distributions over images, audio, and other data. However, training GANs faces major challenges, namely mode collapse, non-convergence, and instability, caused by inappropriate network architecture design, choice of objective function, and selection of optimization algorithm. Recently, to address these challenges, several solutions for better design and optimization of GANs have been investigated, based on re-engineered network architectures, new objective functions, and alternative optimization algorithms. To the best of our knowledge, no existing survey has focused specifically on these broad and systematic developments. In this study, we perform a comprehensive survey of the advancements in GAN design and optimization proposed to handle these challenges. We first identify key research issues within each design and optimization technique and then propose a new taxonomy that structures the solutions by key research issue. In accordance with the taxonomy, we provide a detailed discussion of the different GAN variants proposed within each solution and their relationships. Finally, based on the insights gained, we present promising research directions in this rapidly growing field.
Comment: 42 pages, Figure 13, Table
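To ground the terminology, below is a deliberately minimal vanilla GAN on one-dimensional Gaussian data, written only to make the adversarial objective concrete. The architectures, data, and hyperparameters are toy assumptions, and even this tiny setup can exhibit the instabilities (such as mode collapse) that the surveyed solutions aim to address.

```python
# Minimal vanilla GAN: discriminator vs. generator on 1-D Gaussian data.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))  # noise -> sample
D = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))  # sample -> logit
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(2000):
    real = torch.randn(64, 1) * 0.5 + 3.0          # target distribution: N(3, 0.5^2)
    fake = G(torch.randn(64, 8))

    # Discriminator step: push real logits up, fake logits down.
    opt_d.zero_grad()
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    d_loss.backward()
    opt_d.step()

    # Generator step (non-saturating loss): make the discriminator call fakes real.
    opt_g.zero_grad()
    g_loss = bce(D(fake), torch.ones(64, 1))
    g_loss.backward()
    opt_g.step()

samples = G(torch.randn(1000, 8))
print("generated mean/std:", samples.mean().item(), samples.std().item())
```

The alternating updates make the training a two-player game rather than a single loss minimisation, which is why design choices in the architecture, objective, and optimizer matter so much for convergence.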
Towards the Inference of Structural Similarity of Combinatorial Landscapes
One of the most common problem-solving heuristics is reasoning by analogy. For a given problem, a solver can be viewed as taking a strategic walk on the problem's fitness landscape. Thus, if a solver works for one problem instance, we expect it to also be effective on other instances whose fitness landscapes share essential structural similarities. However, due to the black-box nature of combinatorial optimization, it is far from trivial to infer such similarity in real-world scenarios. To bridge this gap, using local optima networks as a proxy for fitness landscapes, this paper proposes to leverage graph data mining techniques to conduct qualitative and quantitative analyses of the latent topological structure embedded in those landscapes. By conducting large-scale empirical experiments on three classic combinatorial optimization problems, we gain concrete evidence supporting the existence of structural similarity between landscapes of the same class in neighboring dimensions. We also examine the relationship between landscapes of different problem classes.
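As a toy illustration of the kind of comparison involved (not the paper's graph-mining pipeline), the sketch below summarises two local optima networks with a few coarse topological descriptors. The random graphs are placeholders for LONs that would, in practice, be sampled or enumerated from real problem instances.

```python
# Compare two landscapes via simple statistics of their local optima networks.
import networkx as nx
import numpy as np


def lon_summary(g):
    # A few coarse topological descriptors of a local optima network.
    degrees = np.array([d for _, d in g.degree()])
    return {
        "n_optima": g.number_of_nodes(),
        "mean_degree": degrees.mean(),
        "clustering": nx.average_clustering(g),
        "assortativity": nx.degree_assortativity_coefficient(g),
    }


lon_a = nx.erdos_renyi_graph(200, 0.05, seed=1)  # stand-in for instance A's LON
lon_b = nx.erdos_renyi_graph(200, 0.06, seed=2)  # stand-in for instance B's LON

print("instance A:", lon_summary(lon_a))
print("instance B:", lon_summary(lon_b))
# Similar descriptor vectors are (weak) evidence of structurally similar landscapes.
```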