
    Neighborhood Variants of the KKM Lemma, Lebesgue Covering Theorem, and Sperner's Lemma on the Cube

    Full text link
    We establish a "neighborhood" variant of the cubical KKM lemma and the Lebesgue covering theorem and deduce a discretized version which is a "neighborhood" variant of Sperner's lemma on the cube. The main result is the following: for any coloring of the unit $d$-cube $[0,1]^d$ in which points on opposite faces must be given different colors, and for any $\varepsilon>0$, there is an $\ell_\infty$ $\varepsilon$-ball which contains points of at least $(1+\frac{\varepsilon}{1+\varepsilon})^d$ different colors (so in particular, at least $(1+\frac{2}{3}\varepsilon)^d$ different colors for all sensible $\varepsilon\in(0,\frac{1}{2}]$). Comment: 18 pages plus appendices (30 pages total), 3 figures
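    To make the quoted bound concrete, here is a small numeric illustration (ours, not from the paper; the helper name is hypothetical) of how $(1+\frac{\varepsilon}{1+\varepsilon})^d$ grows with the dimension $d$ for fixed $\varepsilon$:

```python
# Hypothetical illustration of the color-count lower bound quoted above:
# an l_inf eps-ball must contain at least (1 + eps/(1+eps))^d colors,
# which grows exponentially in the dimension d for fixed eps.
def color_lower_bound(eps: float, d: int) -> float:
    return (1 + eps / (1 + eps)) ** d

for d in (10, 100, 1000):
    print(d, color_lower_bound(0.5, d))  # eps = 1/2, the endpoint of (0, 1/2]
```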

    Implicit Loss of Surjectivity and Facial Reduction: Theory and Applications

    Get PDF
    Facial reduction, pioneered by Borwein and Wolkowicz, is a preprocessing method commonly used to obtain strict feasibility in the reformulated, reduced constraint system. The importance of strict feasibility is often addressed in the context of convergence results for interior point methods. Beyond the theoretical properties that facial reduction conveys, we show that facial reduction, not limited to interior point methods, leads to strong numerical performance in different classes of algorithms. In this thesis we study various consequences and the broad applicability of facial reduction. The thesis is organized in two parts. In the first part, we examine the instabilities that accompany the absence of strict feasibility through the lens of facially reduced systems. In particular, we exploit the implicit redundancies, revealed by each nontrivial facial reduction step, that result in an implicit loss of surjectivity. This leads to a two-step facial reduction and two novel related notions of singularity. For semidefinite programming, we use these singularities to strengthen a known bound on the solution rank, the Barvinok-Pataki bound. For linear programming, we reveal degeneracies caused by the implicit redundancies and propose a preprocessing tool that uses the simplex method. In the second part of the thesis, we turn to semidefinite programs that do not have strictly feasible points. We focus on the doubly nonnegative relaxation of the binary quadratic program and on a semidefinite program with a nonlinear objective function. We work closely with two classes of algorithms, the splitting method and the Gauss-Newton interior point method, and elaborate on the advantages of building models via facial reduction. Moreover, we develop algorithms for real-world problems including the quadratic assignment problem, the protein side-chain positioning problem, and key rate computation for quantum key distribution. Facial reduction continues to play an important role in providing robust reformulated models, both theoretically and practically, resulting in strong numerical performance.
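    As a concrete illustration of a single facial reduction step for a semidefinite system (a minimal sketch with toy data; the matrices, the exposing vector, and the helper name are ours, not the thesis code):

```python
import numpy as np

# One facial reduction step for the SDP feasibility system {X >= 0 : <A_i, X> = b_i}.
# If W = sum_i y_i * A_i is PSD with b^T y = 0, every feasible X satisfies <W, X> = 0,
# so range(X) lies in null(W); substituting X = V Z V^T with V a basis of null(W)
# yields a smaller, reduced system (toy data below, for illustration only).

def facial_reduction_step(A_list, b, y, tol=1e-9):
    W = sum(yi * Ai for yi, Ai in zip(y, A_list))
    assert abs(b @ y) < tol, "exposing vector must satisfy b^T y = 0"
    eigval, eigvec = np.linalg.eigh(W)
    V = eigvec[:, eigval < tol]                  # basis of null(W)
    A_reduced = [V.T @ Ai @ V for Ai in A_list]  # <A_i, V Z V^T> = <V^T A_i V, Z>
    return A_reduced, V

# Toy example: <diag(0,0,1), X> = 0 with X >= 0 forces the last row/column to vanish.
A1, A2 = np.diag([0.0, 0.0, 1.0]), np.eye(3)
A_red, V = facial_reduction_step([A1, A2], b=np.array([0.0, 1.0]), y=np.array([1.0, 0.0]))
print(V.shape, A_red[1].shape)   # (3, 2) (2, 2): the reduced variable Z is only 2x2
```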

    Modular lifelong machine learning

    Get PDF
    Deep learning has drastically improved the state of the art in many important fields, including computer vision and natural language processing (LeCun et al., 2015). However, it is expensive to train a deep neural network on a machine learning problem, and the overall training cost increases further when one wants to solve additional problems. Lifelong machine learning (LML) develops algorithms that aim to learn efficiently to solve a sequence of problems which become available one at a time. New problems are solved with fewer resources by transferring previously learned knowledge. At the same time, an LML algorithm needs to retain good performance on all encountered problems, thus avoiding catastrophic forgetting. Current approaches do not possess all the desired properties of an LML algorithm. First, they primarily focus on preventing catastrophic forgetting (Diaz-Rodriguez et al., 2018; Delange et al., 2021) and, as a result, neglect some knowledge transfer properties. Furthermore, they assume that all problems in a sequence share the same input space. Finally, scaling these methods to a long sequence of problems remains a challenge. Modular approaches to deep learning decompose a deep neural network into sub-networks, referred to as modules. Each module can then be trained to perform an atomic transformation, specialised in processing a distinct subset of inputs. This modular approach to storing knowledge makes it easy to reuse only the subset of modules that are useful for the task at hand. This thesis introduces a line of research which demonstrates the merits of a modular approach to lifelong machine learning and its ability to address the aforementioned shortcomings of other methods. Compared to previous work, we show that a modular approach can be used to achieve more LML properties than previously demonstrated. Furthermore, we develop tools which allow modular LML algorithms to scale so as to retain said properties on longer sequences of problems. First, we introduce HOUDINI, a neurosymbolic framework for modular LML. HOUDINI represents modular deep neural networks as functional programs and accumulates a library of pre-trained modules over a sequence of problems. Given a new problem, we use program synthesis to select a suitable neural architecture, as well as a high-performing combination of pre-trained and new modules. We show that our approach has most of the properties desired of an LML algorithm. Notably, it can perform forward transfer, avoid negative transfer and prevent catastrophic forgetting, even across problems with disparate input domains and problems which require different neural architectures. Second, we produce a modular LML algorithm which retains the properties of HOUDINI but can also scale to longer sequences of problems. To this end, we fix the choice of a neural architecture and introduce a probabilistic search framework, PICLE, for searching through different module combinations. To apply PICLE, we introduce two probabilistic models over neural modules which allow us to efficiently identify promising module combinations. Third, we phrase the search over module combinations in modular LML as black-box optimisation, which allows one to make use of methods from the setting of hyperparameter optimisation (HPO). We then develop a new HPO method which marries a multi-fidelity approach with model-based optimisation. We demonstrate that this leads to improved anytime performance in the HPO setting and discuss how this can in turn be used to augment modular LML methods. Overall, this thesis identifies a number of important LML properties, which have not all been attained in past methods, and presents an LML algorithm which can achieve all of them, apart from backward transfer.
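    A minimal sketch of the modular-reuse idea described above (our toy illustration in PyTorch, not the HOUDINI or PICLE implementation; module names and sizes are hypothetical):

```python
import torch
import torch.nn as nn

# Keep a library of pre-trained modules, freeze them to avoid forgetting,
# and compose them with a small new trainable module for a new task.
library = {
    "conv_feats": nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                                nn.AdaptiveAvgPool2d(4), nn.Flatten()),
    "mlp_head":   nn.Sequential(nn.Linear(128, 32), nn.ReLU()),
}
for module in library.values():          # pretend these were trained on earlier problems
    module.requires_grad_(False)         # reuse without catastrophic forgetting

new_adapter = nn.Linear(32, 10)          # the only module trained on the new task
model = nn.Sequential(library["conv_feats"], library["mlp_head"], new_adapter)

optimizer = torch.optim.Adam(new_adapter.parameters(), lr=1e-3)
x, y = torch.randn(16, 1, 28, 28), torch.randint(0, 10, (16,))
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()
optimizer.step()
print(loss.item())
```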

    Unveiling the anatomy of mode-coupling theory

    Full text link
    The mode-coupling theory of the glass transition (MCT) has been at the forefront of fundamental glass research for decades, yet the theory's underlying approximations remain obscure. Here we quantify and critically assess the effect of each MCT approximation separately. Using Brownian dynamics simulations, we compute the memory kernel predicted by MCT after each approximation in its derivation and compare it with the exact one. We find that some often-criticized approximations are in fact very accurate, while the opposite is true for others, providing new guiding cues for further theory development.
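    For orientation, the schematic form of the MCT equation of motion being referred to (the standard textbook form, summarised by us rather than taken from this paper; the Brownian-dynamics variant replaces the inertial term with a single time derivative):

```latex
% Schematic MCT equation for the normalized density correlator \phi_q(t);
% all approximations enter through the memory kernel M_q(t).
\ddot{\phi}_q(t) + \Omega_q^2\,\phi_q(t)
  + \int_0^t M_q(t-\tau)\,\dot{\phi}_q(\tau)\,\mathrm{d}\tau = 0,
\qquad
M_q(t) \approx \mathcal{F}_q\!\left[\{\phi_k(t)\}\right]
\quad \text{(a bilinear functional of the correlators)}.
```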

    Old and New Minimalism: a Hopf algebra comparison

    Full text link
    In this paper we compare some old formulations of Minimalism, in particular Stabler's computational minimalism, and Chomsky's new formulation of Merge and Minimalism, from the point of view of their mathematical description in terms of Hopf algebras. We show that the newer formulation has a clear advantage purely in terms of the underlying mathematical structure. More precisely, in the case of Stabler's computational minimalism, External Merge can be described in terms of a partially defined operated algebra with binary operation, while Internal Merge determines a system of right-ideal coideals of the Loday-Ronco Hopf algebra and corresponding right-module coalgebra quotients. This mathematical structure shows that Internal and External Merge have significantly different roles in the old formulations of Minimalism, and they are more difficult to reconcile as facets of a single algebraic operation, as is desirable linguistically. On the other hand, we show that the newer formulation of Minimalism naturally carries a Hopf algebra structure where Internal and External Merge directly arise from the same operation. We also compare, at the level of algebraic properties, the externalization model of the new Minimalism with proposals for assignments of planar embeddings based on heads of trees. Comment: 27 pages, LaTeX, 3 figures
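    As background for the comparison (standard definitions from the Minimalism literature, summarised by us rather than quoted from the paper), Merge forms an unordered set from two syntactic objects, and the External/Internal distinction depends on where the second argument comes from:

```latex
% Merge as binary set formation on syntactic objects:
\mathrm{Merge}(\alpha,\beta) = \{\alpha,\beta\}.
% External Merge: \alpha and \beta are separate, independently built objects.
% Internal Merge: \beta is a term already contained inside \alpha, so the
% operation re-merges material from within its own argument (movement).
```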

    Reinforcement learning in large state action spaces

    Get PDF
    Reinforcement learning (RL) is a promising framework for training intelligent agents which learn to optimize long-term utility by directly interacting with the environment. Creating RL methods which scale to large state-action spaces is a critical problem towards ensuring real-world deployment of RL systems. However, several challenges limit the applicability of RL to large-scale settings. These include difficulties with exploration, low sample efficiency, computational intractability, task constraints like decentralization, and a lack of guarantees about important properties like performance, generalization and robustness in potentially unseen scenarios. This thesis is motivated towards bridging the aforementioned gap. We propose several principled algorithms and frameworks for studying and addressing the above challenges in RL. The proposed methods cover a wide range of RL settings: single- and multi-agent systems (MAS) with all the variations in the latter, prediction and control, model-based and model-free methods, and value-based and policy-based methods. In this work we present the first results on several different problems, e.g. tensorization of the Bellman equation, which allows exponential sample efficiency gains (Chapter 4), provable suboptimality arising from structural constraints in MAS (Chapter 3), combinatorial generalization results in cooperative MAS (Chapter 5), generalization results on observation shifts (Chapter 7), and learning deterministic policies in a probabilistic RL framework (Chapter 6). Our algorithms exhibit provably enhanced performance and sample efficiency along with better scalability. Additionally, we shed light on generalization aspects of the agents under different frameworks. These properties have been driven by the use of several advanced tools (e.g. statistical machine learning, state abstraction, variational inference, tensor theory). In summary, the contributions in this thesis significantly advance progress towards making RL agents ready for large-scale, real-world applications.
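    For reference, the standard (un-tensorized) Bellman optimality equation that such methods build on is the textbook form below, included by us for context rather than taken from the thesis:

```latex
Q^{*}(s,a) \;=\; r(s,a) \;+\; \gamma \sum_{s'} P(s' \mid s,a)\,\max_{a'} Q^{*}(s',a')
```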

    Reconfiguration of the Union of Arborescences

    Full text link
    An arborescence in a digraph is an acyclic arc subset in which every vertex except a root has exactly one incoming arc. In this paper, we reveal the reconfigurability of the union of $k$ arborescences for fixed $k$ in the following sense: for any pair of arc subsets that can be partitioned into $k$ arborescences, one can be transformed into the other by exchanging arcs one by one so that every intermediate arc subset can also be partitioned into $k$ arborescences. This generalizes the result by Ito et al. (2023), who showed the case with $k=1$. Since the union of $k$ arborescences can be represented as a common matroid basis of two matroids, our result gives a new non-trivial example of matroid pairs for which two common bases are always reconfigurable to each other.
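    A minimal sketch of the arborescence condition stated above (our hypothetical helper, not code from the paper): every non-root vertex has exactly one incoming arc, and under that in-degree condition acyclicity is equivalent to every vertex being reachable from the root.

```python
from collections import defaultdict

# Check whether `arcs` forms an arborescence of `vertices` rooted at `root`:
# every vertex except the root has exactly one incoming arc, and the arc set
# is acyclic (equivalently, every vertex is reachable from the root).
def is_arborescence(vertices, arcs, root):
    indeg = {v: 0 for v in vertices}
    children = defaultdict(list)
    for u, v in arcs:
        indeg[v] += 1
        children[u].append(v)
    if indeg[root] != 0 or any(indeg[v] != 1 for v in vertices if v != root):
        return False
    seen, stack = {root}, [root]
    while stack:
        for v in children[stack.pop()]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(vertices)

print(is_arborescence({1, 2, 3}, [(1, 2), (1, 3)], root=1))  # True
print(is_arborescence({1, 2, 3}, [(1, 2), (3, 2)], root=1))  # False: vertex 2 has in-degree 2
```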

    Many Physical Design Problems are Sparse QCQPs

    Full text link
    Physical design refers to mathematical optimization of a desired objective (e.g. strong light-matter interactions, or complete quantum state transfer) subject to the governing dynamical equations, such as Maxwell's or Schrödinger's differential equations. Computing an optimal design is challenging: generically, these problems are highly nonconvex and finding global optima is NP-hard. Here we show that for linear-differential-equation dynamics (as in linear electromagnetism, elasticity, quantum mechanics, etc.), the physical-design optimization problem can be transformed to a sparse-matrix, quadratically constrained quadratic program (QCQP). Sparse QCQPs can be tackled with convex optimization techniques (such as semidefinite programming) that have thrived for identifying global bounds and high-performance designs in other areas of science and engineering, but seemed inapplicable to the design problems of wave physics. We apply our formulation to prototypical photonic design problems, showing the possibility of computing fundamental limits for large-area metasurfaces, as well as identifying designs approaching global optimality. Looking forward, our approach highlights the promise of developing bespoke algorithms tailored to specific physical design problems. Comment: 9 pages, 4 figures, plus references and Supplementary Material
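    A minimal sketch of the kind of semidefinite relaxation mentioned above, applied to a generic toy QCQP (random data and hypothetical sizes chosen by us; this is not the paper's physics-derived formulation):

```python
import numpy as np
import cvxpy as cp

# Shor semidefinite relaxation of a toy QCQP
#   minimize  x^T P0 x - 2 c^T x   s.t.  x^T P1 x + q1^T x + r1 <= 0,
# obtained by lifting M ~ [x; 1][x; 1]^T, keeping M PSD, and dropping the
# rank-1 requirement. The SDP value lower-bounds the nonconvex QCQP optimum.
n = 4
rng = np.random.default_rng(0)
P0, c = np.eye(n), np.ones(n)
A = rng.standard_normal((n, n))
P1, q1, r1 = A + A.T, rng.standard_normal(n), -1.0

M = cp.Variable((n + 1, n + 1), PSD=True)     # M = [[X, x], [x^T, 1]]
X, x = M[:n, :n], M[:n, n]
constraints = [M[n, n] == 1,
               cp.trace(P1 @ X) + q1 @ x + r1 <= 0]
prob = cp.Problem(cp.Minimize(cp.trace(P0 @ X) - 2 * c @ x), constraints)
prob.solve(solver=cp.SCS)
print(prob.value)   # a global lower bound; tight when the optimal M is rank 1
```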

    Hilbert-Burch virtual resolutions for points in $\mathbb{P}^1\times\mathbb{P}^1$

    Full text link
    Building on work of Harada, Nowroozi, and Van Tuyl, which provided particular length-two virtual resolutions for finite sets of points in $\mathbb{P}^1\times\mathbb{P}^1$, we prove that the vast majority of virtual resolutions of a pair for minimal elements of the multigraded regularity in this setting are of Hilbert-Burch type. We give explicit descriptions of these short virtual resolutions that depend only on the number of points. Moreover, despite initial evidence, we show that these virtual resolutions are not always short, and we give sufficient conditions for when they are length three. Comment: 21 pages, comments welcome

    Endogenous measures for contextualising large-scale social phenomena: a corpus-based method for mediated public discourse

    Get PDF
    This work presents an interdisciplinary methodology for developing endogenous measures of group membership through analysis of pervasive linguistic patterns in public discourse. Focusing on political discourse, this work critiques the conventional approach to the study of political participation, which is premised on decontextualised, exogenous measures to characterise groups. Considering the theoretical and empirical weaknesses of decontextualised approaches to large-scale social phenomena, this work suggests that contextualisation using endogenous measures might provide a complementary perspective to mitigate such weaknesses. This work develops a sociomaterial perspective on political participation in mediated discourse as affiliatory action performed through language. While the affiliatory function of language is often performed consciously (such as in statements of identity), this work is concerned with unconscious features (such as patterns in lexis and grammar). This work argues that pervasive patterns in such features that emerge through socialisation are resistant to change and manipulation, and thus might serve as endogenous measures of sociopolitical contexts, and thus of groups. In terms of method, the work takes a corpus-based approach to the analysis of data from the Twitter messaging service, whereby patterns in users' speech are examined statistically in order to trace potential community membership. The method is applied in the US state of Michigan during the second half of 2018; 6 November was the date of the midterm (i.e. non-Presidential) elections in the United States. The corpus is assembled from the original posts of 5,889 users, who are nominally geolocalised to 417 municipalities. These users are clustered according to pervasive language features. Comparing the linguistic clusters according to the municipalities they represent reveals regular sociodemographic differentials across clusters. This is understood as an indication of social structure, suggesting that endogenous measures derived from pervasive patterns in language may indeed offer a complementary, contextualised perspective on large-scale social phenomena.
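    A minimal sketch of the kind of corpus-based clustering described above (a hypothetical pipeline written by us, not the thesis code; the feature choice and user texts are toy stand-ins):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

# Represent each user by frequencies of lexical features in their aggregated
# posts, then group users by similarity of those linguistic profiles.
user_texts = {
    "user_a": "gonna vote tuesday y'all turnout matters",
    "user_b": "we are going to vote on tuesday because turnout matters",
    "user_c": "y'all gonna see record turnout this november",
}

vectorizer = TfidfVectorizer(analyzer="word", ngram_range=(1, 2), min_df=1)
X = vectorizer.fit_transform(user_texts.values())

clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
for user, label in zip(user_texts, clusters):
    print(user, label)   # cluster labels standing in for candidate communities
```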