A Survey on Continuous Time Computations
We provide an overview of theories of continuous time computation. These
theories allow us to understand both the hardness of questions related to
continuous time dynamical systems and the computational power of continuous
time analog models. We survey the existing models, summarize their results, and
point to relevant references in the literature.
Connections Between Adaptive Control and Optimization in Machine Learning
This paper demonstrates many immediate connections between adaptive control
and optimization methods commonly employed in machine learning. Starting from
common output error formulations, similarities in update law modifications are
examined. Concepts in stability, performance, and learning, common to both
fields are then discussed. Building on the similarities in update laws and
common concepts, new intersections and opportunities for improved algorithm
analysis are provided. In particular, a specific problem related to higher
order learning is solved through insights obtained from these intersections.
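
As a minimal illustration of the kind of correspondence described here, the sketch below (our own construction, not code from the paper) fits a linear-in-parameters output-error model and shows that the classic gradient adaptive law, discretized with a unit step, is the same update as stochastic gradient descent on the squared output error. The system, gain, and dimensions are illustrative assumptions.

```python
import numpy as np

# Illustrative sketch: for a linear-in-parameters model y = theta^T x,
# the gradient adaptive law theta_dot = -gamma * e * x from adaptive
# control, discretized in time, coincides with one SGD step on e^2 / 2.

rng = np.random.default_rng(0)
theta_true = np.array([2.0, -1.0])   # unknown plant parameters (assumed)
theta = np.zeros(2)                  # adjustable parameters
gamma = 0.05                         # adaptation gain / learning rate

for _ in range(500):
    x = rng.normal(size=2)           # regressor
    e = (theta - theta_true) @ x     # output error
    theta -= gamma * e * x           # one adaptive-law / SGD step

print(theta)  # converges toward theta_true
```

The same algebra is the starting point for the update-law comparisons the paper examines.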
The importance of better models in stochastic optimization
Standard stochastic optimization methods are brittle, sensitive to stepsize
choices and other algorithmic parameters, and they exhibit instability outside
of well-behaved families of objectives. To address these challenges, we
investigate models for stochastic minimization and learning problems that
exhibit better robustness to problem families and algorithmic parameters. With
appropriately accurate models---which we call the aProx family---stochastic
methods can be made stable, provably convergent and asymptotically optimal;
even modeling that the objective is nonnegative is sufficient for this
stability. We extend these results beyond convexity to weakly convex
objectives, which include compositions of convex losses with smooth functions
common in modern machine learning applications. We highlight the importance of
robustness and accurate modeling with a careful experimental evaluation of
convergence time and algorithm sensitivity.
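
To make the notion of a better model concrete, here is a hedged sketch of one member of the aProx family: a truncated step that exploits only the fact that the loss is nonnegative, which caps the step length at a Polyak-type value. The regression problem, sampling, and step sizes below are illustrative assumptions, not the paper's experiments.

```python
import numpy as np

# Sketch of a truncated-model stochastic step: minimize
#   max(f(x_k; s) + <g, y - x_k>, 0) + ||y - x_k||^2 / (2 * alpha_k),
# whose solution caps the step length at loss / ||g||^2.

rng = np.random.default_rng(1)
A = rng.normal(size=(200, 5))
b = A @ rng.normal(size=5)            # realizable absolute-loss regression

x = np.zeros(5)
for k in range(1000):
    i = rng.integers(200)
    residual = A[i] @ x - b[i]
    loss = abs(residual)              # f(x; s) >= 0 by construction
    if loss > 0:
        g = A[i] * np.sign(residual)  # stochastic subgradient
        alpha = 1.0 / np.sqrt(k + 1)
        x -= min(alpha, loss / (g @ g)) * g   # truncated (capped) step

print(np.mean(np.abs(A @ x - b)))     # average residual, near zero
```

Because the model never predicts a negative loss, a too-large stepsize cannot overshoot past the point where the sampled loss would reach zero, which is one source of the stability the abstract refers to.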
Feature-Guided Black-Box Safety Testing of Deep Neural Networks
Despite the improved accuracy of deep neural networks, the discovery of
adversarial examples has raised serious safety concerns. Most existing
approaches for crafting adversarial examples necessitate some knowledge
(architecture, parameters, etc.) of the network at hand. In this paper, we
focus on image classifiers and propose a feature-guided black-box approach to
test the safety of deep neural networks that requires no such knowledge. Our
algorithm employs object detection techniques such as SIFT (Scale Invariant
Feature Transform) to extract features from an image. These features are
converted into a mutable saliency distribution, where high probability is
assigned to pixels that affect the composition of the image with respect to the
human visual system. We formulate the crafting of adversarial examples as a
two-player turn-based stochastic game, where the first player's objective is to
minimise the distance to an adversarial example by manipulating the features,
and the second player can be cooperative, adversarial, or random. We show that,
theoretically, the two-player game can converge to the optimal strategy, and
that the optimal strategy represents a globally minimal adversarial image. For
Lipschitz networks, we also identify conditions that provide safety guarantees
that no adversarial examples exist. Using Monte Carlo tree search we gradually
explore the game state space to search for adversarial examples. Our
experiments show that, despite the black-box setting, manipulations guided by a
perception-based saliency distribution are competitive with state-of-the-art
methods that rely on white-box saliency matrices or sophisticated optimization
procedures. Finally, we show how our method can be used to evaluate robustness
of neural networks in safety-critical applications such as traffic sign
recognition in self-driving cars.
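
The first stage of this pipeline, turning detected features into a saliency distribution, can be sketched as follows. This is our hedged reconstruction: weighting pixels by SIFT keypoint response is an assumption, the paper's exact construction may differ, and the stand-in image is synthetic.

```python
import numpy as np
import cv2  # requires opencv-python >= 4.4 for SIFT_create

# Sketch: build a probability distribution over pixels from SIFT
# keypoints, so feature manipulations can be sampled from it.

def saliency_distribution(gray_image):
    sift = cv2.SIFT_create()
    keypoints = sift.detect(gray_image, None)
    probs = np.zeros(gray_image.shape, dtype=np.float64)
    for kp in keypoints:
        col, row = int(kp.pt[0]), int(kp.pt[1])  # truncate to valid indices
        probs[row, col] += kp.response           # stronger features get more mass
    total = probs.sum()
    if total == 0:                               # no keypoints: fall back to uniform
        return np.full(gray_image.shape, 1.0 / probs.size)
    return probs / total

rng = np.random.default_rng(0)
img = (rng.random((128, 128)) * 255).astype(np.uint8)  # stand-in image
dist = saliency_distribution(img)
flat = rng.choice(dist.size, p=dist.ravel())     # pixel to perturb next
print(np.unravel_index(flat, dist.shape))
```

The game-playing and Monte Carlo tree search stages then decide which of the sampled feature manipulations to keep.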
Design optimization applied in structural dynamics
This paper introduces design optimization strategies, especially for structures subject to dynamic constraints. Design optimization involves first modeling and then optimizing the problem. Using the Finite Element (FE) model of a structure directly in an optimization process requires long computation times; therefore, backpropagation Neural Networks (NNs) are introduced as a so-called surrogate model for the FE model. The optimization techniques covered in this study are the Genetic Algorithm (GA) and Sequential Quadratic Programming (SQP). To demonstrate the introduced techniques, a multi-segment cantilever beam problem under constraints on its first and second natural frequencies is selected and solved using four different approaches.
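
A minimal sketch of the surrogate workflow follows: a small NN is trained on samples of the expensive analysis, and scipy's SLSQP (an SQP implementation) then optimizes against the cheap surrogate. The "FE model" here is a stand-in analytic function and the frequency-constraint value is assumed, so this shows the structure of the approach rather than the paper's actual beam problem.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from scipy.optimize import minimize

def fe_first_frequency(t):             # stand-in for a slow FE solve
    return 10.0 * np.sqrt(t.sum())     # frequency rises with thickness

# Train the NN surrogate on sampled FE evaluations.
rng = np.random.default_rng(0)
X = rng.uniform(0.01, 0.1, size=(500, 3))          # segment thicknesses [m]
y = np.array([fe_first_frequency(t) for t in X])   # first frequency [Hz]
surrogate = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=3000,
                         random_state=0).fit(X, y)

# SQP on the surrogate: minimize mass subject to f1 >= 4 Hz (assumed).
mass = lambda t: t.sum()
freq_con = {"type": "ineq",
            "fun": lambda t: surrogate.predict(t.reshape(1, -1))[0] - 4.0}
res = minimize(mass, x0=np.full(3, 0.05), method="SLSQP",
               bounds=[(0.01, 0.1)] * 3, constraints=[freq_con])
print(res.x, fe_first_frequency(res.x))
```

Combining {GA, SQP} with {direct FE, NN surrogate} plausibly yields the four approaches the abstract mentions; swapping the minimize call for a GA run over the same surrogate gives the population-based variant.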
Second-Order Optimization for Non-Convex Machine Learning: An Empirical Study
While first-order optimization methods such as stochastic gradient descent
(SGD) are popular in machine learning (ML), they come with well-known
deficiencies, including relatively-slow convergence, sensitivity to the
settings of hyper-parameters such as learning rate, stagnation at high training
errors, and difficulty in escaping flat regions and saddle points. These issues
are particularly acute in highly non-convex settings such as those arising in
neural networks. Motivated by this, there has been recent interest in
second-order methods that aim to alleviate these shortcomings by capturing
curvature information. In this paper, we report detailed empirical evaluations
of a class of Newton-type methods, namely sub-sampled variants of trust region
(TR) and adaptive regularization with cubics (ARC) algorithms, for non-convex
ML problems. In doing so, we demonstrate that these methods not only can be
computationally competitive with hand-tuned SGD with momentum, obtaining
comparable or better generalization performance, but also they are highly
robust to hyper-parameter settings. Further, in contrast to SGD with momentum,
we show that the manner in which these Newton-type methods employ curvature
information allows them to seamlessly escape flat regions and saddle points.Comment: 21 pages, 11 figures. Restructure the paper and add experiment