
    Slow decay of concentration variance due to no-slip walls in chaotic mixing

    Chaotic mixing in a closed vessel is studied experimentally and numerically in different 2-D flow configurations. For a purely hyperbolic phase space, it is well known that concentration fluctuations converge to an eigenmode of the advection-diffusion operator and decay exponentially with time. We illustrate how the unstable manifold of hyperbolic periodic points dominates the resulting persistent pattern. We show for different physical viscous flows that, in the case of a fully chaotic Poincaré section, parabolic periodic points at the walls lead to slower (algebraic) decay. A persistent pattern, the backbone of which is the unstable manifold of parabolic points, can be observed. However, slow stretching at the wall prevents the rapid propagation of stretched filaments throughout the whole domain, and hence delays the formation of an eigenmode until it is no longer experimentally observable. Inspired by the baker's map, we introduce a 1-D model with a parabolic point that gives a good account of the slow decay observed in experiments. We derive a universal decay law for such systems, parametrized by the rate at which a particle approaches the no-slip wall. Comment: 17 pages, 12 figures
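    The contrast between exponential and algebraic decay can be seen from the local dynamics near the two kinds of fixed points; the generic normal forms below are a textbook illustration, not the specific 1-D model or constants used in the paper.

```latex
% Hyperbolic fixed point: linear local map, exponential approach
x_{n+1} = \lambda\, x_n, \qquad 0 < \lambda < 1
  \;\Longrightarrow\; x_n = \lambda^{n} x_0 \quad \text{(exponential contraction)},
% Parabolic fixed point (e.g. at a no-slip wall): locally quadratic map, algebraic approach
x_{n+1} = x_n - a\, x_n^{2} + O\!\left(x_n^{3}\right), \qquad a > 0
  \;\Longrightarrow\; x_n \sim \frac{1}{a\, n} \quad (n \to \infty).
```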

    Asynchronous Optimization Methods for Efficient Training of Deep Neural Networks with Guarantees

    Asynchronous distributed algorithms are a popular way to reduce synchronization costs in large-scale optimization, and in particular for neural network training. However, for nonsmooth and nonconvex objectives, few convergence guarantees exist beyond cases where closed-form proximal operator solutions are available. As most popular contemporary deep neural networks lead to nonsmooth and nonconvex objectives, there is now a pressing need for such convergence guarantees. In this paper, we analyze for the first time the convergence of stochastic asynchronous optimization for this general class of objectives. In particular, we focus on stochastic subgradient methods allowing for block variable partitioning, where the shared-memory-based model is asynchronously updated by concurrent processes. To this end, we first introduce a probabilistic model that captures key features of real asynchronous scheduling between concurrent processes; under this model, we establish convergence with probability one to an invariant set for stochastic subgradient methods with momentum. From a practical perspective, one issue with the family of methods we consider is that it is not efficiently supported by machine learning frameworks, which mostly focus on distributed data-parallel strategies. To address this, we propose a new implementation strategy for shared-memory-based training of deep neural networks, whereby concurrent parameter servers are used to train a partitioned but shared model in single- and multi-GPU settings. Based on this implementation, we achieve on average a 1.2x speed-up over state-of-the-art training methods on popular image classification tasks without compromising accuracy.
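    A minimal sketch of the shared-memory, block-partitioned update pattern described above, assuming a toy l1-regularized least-squares objective so that subgradients are genuinely nonsmooth; all names and hyper-parameters are illustrative, and this is not the paper's implementation or its parameter-server variant.

```python
# A rough sketch (not the paper's method or implementation) of lock-free,
# block-partitioned asynchronous stochastic subgradient descent with momentum
# on a toy l1-regularized least-squares problem.  Each worker thread owns one
# block of the shared parameter vector and updates only that block, using
# subgradients evaluated at a possibly stale snapshot of the shared iterate.
import numpy as np
import threading

rng = np.random.default_rng(0)
n, d, n_blocks = 2000, 40, 4
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)
lam = 0.01                      # l1 weight makes the objective nonsmooth

x = np.zeros(d)                 # shared model, updated without locks
blocks = np.array_split(np.arange(d), n_blocks)

def subgrad(i, xc):
    """A stochastic subgradient of 0.5*(a_i^T x - b_i)^2 + lam*||x||_1."""
    return (A[i] @ xc - b[i]) * A[i] + lam * np.sign(xc)

def worker(block, steps=5000, lr=1e-3, beta=0.9):
    local = np.random.default_rng(int(block[0]))   # per-thread sampler
    m = np.zeros(block.size)                       # momentum buffer for this block
    for _ in range(steps):
        i = local.integers(n)
        xc = x.copy()            # read a possibly inconsistent snapshot
        g = subgrad(i, xc)[block]
        m = beta * m + g
        x[block] -= lr * m       # asynchronous in-place update of the owned block

threads = [threading.Thread(target=worker, args=(blk,)) for blk in blocks]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("objective:", 0.5 * np.mean((A @ x - b) ** 2) + lam * np.abs(x).sum())
```

    Because each thread owns one block, writes never collide, while reads of the full iterate may be stale; that staleness is the kind of asynchrony a probabilistic scheduling model has to account for.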

    Hardness of parameter estimation in graphical models

    We consider the problem of learning the canonical parameters specifying an undirected graphical model (Markov random field) from the mean parameters. For graphical models representing a minimal exponential family, the canonical parameters are uniquely determined by the mean parameters, so the problem is feasible in principle. The goal of this paper is to investigate the computational feasibility of this statistical task. Our main result shows that parameter estimation is in general intractable: no algorithm can learn the canonical parameters of a generic pairwise binary graphical model from the mean parameters in time bounded by a polynomial in the number of variables (unless RP = NP). Indeed, such a result has long been believed to be true (see the monograph by Wainwright and Jordan (2008)), but no proof was known. Our proof gives a polynomial-time reduction from approximating the partition function of the hard-core model, known to be hard, to learning approximate parameters. Our reduction entails showing that the marginal polytope boundary has an inherent repulsive property, which validates an optimization procedure over the polytope that does not use any knowledge of its structure (as required by the ellipsoid method and others). Comment: 15 pages. To appear in NIPS 201
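    For concreteness, the objects involved can be written out for a pairwise binary model as follows; the {0,1} state convention and the notation are a standard choice, not necessarily the paper's.

```latex
% Pairwise binary Markov random field on a graph G = (V, E), in exponential-family form:
p_{\theta}(x) \;=\; \exp\!\Big(\sum_{i \in V} \theta_i x_i \;+\; \sum_{(i,j) \in E} \theta_{ij} x_i x_j \;-\; A(\theta)\Big),
\qquad x \in \{0,1\}^{|V|}.
% Mean parameters are the gradient of the log-partition function A:
\mu_i = \mathbb{E}_{\theta}[x_i], \qquad \mu_{ij} = \mathbb{E}_{\theta}[x_i x_j], \qquad \mu = \nabla A(\theta),
% and parameter estimation asks for the inverse map \mu \mapsto \theta.
% Hard-core model partition function, whose approximation is the hard problem being reduced from:
Z_G(\lambda) \;=\; \sum_{\substack{I \subseteq V \\ I\ \text{independent in}\ G}} \lambda^{|I|}.
```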

    Solving the TTC 2011 Reengineering Case with GReTL

    This paper discusses the GReTL reference solution of the TTC 2011 Reengineering case. Given a Java syntax graph, a simple state machine model has to be extracted. The submitted solution covers both the core task and the two extension tasks. Comment: In Proceedings TTC 2011, arXiv:1111.440

    Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Optimization

    Due to their simplicity and excellent performance, parallel asynchronous variants of stochastic gradient descent have become popular methods to solve a wide range of large-scale optimization problems on multi-core architectures. Yet, despite their practical success, support for nonsmooth objectives is still lacking, making them unsuitable for many problems of interest in machine learning, such as the Lasso, group Lasso, or empirical risk minimization with convex constraints. In this work, we propose and analyze ProxASAGA, a fully asynchronous sparse method inspired by SAGA, a variance-reduced incremental gradient algorithm. The proposed method is easy to implement and significantly outperforms the state of the art on several nonsmooth, large-scale problems. We prove that our method achieves a theoretical linear speedup with respect to the sequential version under assumptions on the sparsity of gradients and block-separability of the proximal term. Empirical benchmarks on a multi-core architecture illustrate practical speedups of up to 12x on a 20-core machine. Comment: Appears in Advances in Neural Information Processing Systems 30 (NIPS 2017), 28 pages
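    The variance-reduced proximal step that ProxASAGA runs asynchronously can be sketched sequentially as below for a lasso problem; the data, step-size rule, and the omission of the sparse, lock-free bookkeeping are simplifications for illustration.

```python
# A sequential sketch of the SAGA-style variance-reduced proximal update that
# ProxASAGA applies asynchronously, on a lasso problem:
#   f(x) = (1/n) * sum_i 0.5*(a_i^T x - b_i)^2 + lam*||x||_1.
# The sparse, lock-free machinery of the actual method is omitted.
import numpy as np

rng = np.random.default_rng(0)
n, d, lam = 500, 50, 0.1
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)
gamma = 1.0 / (3 * (np.linalg.norm(A, axis=1) ** 2).max())   # SAGA-style step size

def grad_i(i, x):                        # gradient of the i-th smooth term
    return (A[i] @ x - b[i]) * A[i]

def soft_threshold(z, t):                # prox of t*||.||_1
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

x = np.zeros(d)
alpha = np.array([grad_i(i, x) for i in range(n)])   # gradient memory table
alpha_bar = alpha.mean(axis=0)

for _ in range(20000):
    i = rng.integers(n)
    g = grad_i(i, x)
    v = g - alpha[i] + alpha_bar         # unbiased, variance-reduced estimate
    x = soft_threshold(x - gamma * v, gamma * lam)   # proximal step handles ||.||_1
    alpha_bar += (g - alpha[i]) / n      # keep the stored mean consistent
    alpha[i] = g

print("objective:", 0.5 * np.mean((A @ x - b) ** 2) + lam * np.abs(x).sum())
```

    Storing one past gradient per sample is what removes the variance of plain stochastic gradients and permits a constant step size; the proximal step is where the nonsmooth term enters.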

    Downward transference of mice and universality of local core models

    If M is a proper class inner model of ZFC and omega_2^M = omega_2, then every sound mouse projecting to omega and not past 0-pistol belongs to M. In fact, under the assumption that 0-pistol does not belong to M, K^M \| omega_2 is universal for all countable mice in V. Similarly, if M is a proper class inner model of ZFC, delta > omega_1 is regular, (delta^+)^M = delta^+, and in V there is no proper class inner model with a Woodin cardinal, then K^M \| delta is universal for all mice in V of cardinality less than delta. Comment: Revised version, incorporating the referee's suggestions

    Efficient First Order Methods for Linear Composite Regularizers

    A wide class of regularization problems in machine learning and statistics employs a regularization term that is obtained by composing a simple convex function \omega with a linear transformation. This setting includes Group Lasso methods, the Fused Lasso and other total variation methods, multi-task learning methods, and many more. In this paper, we present a general approach for computing the proximity operator of this class of regularizers, under the assumption that the proximity operator of the function \omega is known in advance. Our approach builds on a recent line of research on optimal first order optimization methods and uses fixed point iterations for numerically computing the proximity operator. It is more general than current approaches and, as we show with numerical simulations, computationally more efficient than available first order methods that do not achieve the optimal rate. In particular, our method outperforms state-of-the-art O(1/T) methods for the overlapping Group Lasso and matches optimal O(1/T^2) methods for the Fused Lasso and tree-structured Group Lasso. Comment: 19 pages, 8 figures
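    One standard way to compute the proximity operator of x -> \omega(Bx) using only the proximity operator of \omega is a fixed-point (forward-backward) iteration on a dual variable; the sketch below implements that generic scheme for a 1-D total-variation / Fused Lasso penalty and is not claimed to be the paper's exact iteration or to match its rate.

```python
# A generic fixed-point scheme for the proximity operator of x -> omega(B x)
# when only prox_omega is available: run forward-backward steps on a dual
# variable u and recover the primal solution as x = z - B^T u.  Illustrated on
# a 1-D total-variation / Fused Lasso penalty (omega = lam*||.||_1, B = first
# differences); a standard construction, not necessarily the paper's iteration.
import numpy as np

def soft_threshold(v, t):                # prox of t*||.||_1
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def prox_linear_composite(z, B, lam, n_iter=500):
    """Approximate argmin_x 0.5*||x - z||^2 + lam*||B x||_1."""
    sigma = 1.0 / np.linalg.norm(B, 2) ** 2      # step size, <= 1/||B||^2
    u = np.zeros(B.shape[0])                     # dual variable
    for _ in range(n_iter):
        w = u + sigma * (B @ (z - B.T @ u))      # gradient step on the dual objective
        # prox of sigma*omega^* via the Moreau identity, using only prox_omega:
        u = w - sigma * soft_threshold(w / sigma, lam / sigma)
    return z - B.T @ u                           # primal solution

# Usage: total-variation denoising of a noisy piecewise-constant signal.
rng = np.random.default_rng(0)
z = np.repeat([0.0, 2.0, -1.0], 30) + 0.3 * rng.standard_normal(90)
d = z.size
B = np.eye(d - 1, d, k=1) - np.eye(d - 1, d)     # first-difference operator
x = prox_linear_composite(z, B, lam=1.0)
print("total variation before/after:", np.abs(np.diff(z)).sum(), np.abs(np.diff(x)).sum())
```

    The update on u is a proximal gradient step on the dual of the prox subproblem, and the Moreau identity converts the required prox of sigma*\omega^* into a call to the prox of \omega, which is exactly the "proximity operator of \omega known in advance" assumption above.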