Neighborhood Variants of the KKM Lemma, Lebesgue Covering Theorem, and Sperner's Lemma on the Cube
We establish a "neighborhood" variant of the cubical KKM lemma and the
Lebesgue covering theorem and deduce a discretized version which is a
"neighborhood" variant of Sperner's lemma on the cube. The main result is the
following: for any coloring of the unit -cube in which points on
opposite faces must be given different colors, and for any ,
there is an -ball which contains points of at least
different colors (so in particular, at least different colors for all sensible
).
Comment: 18 pages plus appendices (30 pages total), 3 figures
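For reference, the classical (non-neighborhood) Lebesgue covering theorem that this abstract generalizes can be stated as follows; this is standard background added for context, not text from the paper:

% Classical Lebesgue covering theorem (standard statement):
[0,1]^d = \bigcup_i C_i,\quad \text{each } C_i \text{ closed, no } C_i \text{ meeting two opposite facets}
\;\Longrightarrow\; \exists\, x \in [0,1]^d \text{ contained in at least } d+1 \text{ of the sets } C_i .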
Implicit Loss of Surjectivity and Facial Reduction: Theory and Applications
Facial reduction, pioneered by Borwein and Wolkowicz, is a preprocessing method that is commonly used to obtain strict feasibility in the reformulated, reduced constraint system.
The importance of strict feasibility is often addressed in the context of the convergence results for interior point methods.
Beyond the theoretical properties that facial reduction conveys, we show that facial reduction, not limited to interior point methods, leads to strong numerical performance across different classes of algorithms.
In this thesis we study various consequences and the broad applicability of facial reduction.
The thesis is organized in two parts.
In the first part, we show the instabilities that accompany the absence
of strict feasibility, viewed through the lens of facially reduced systems.
In particular, we exploit the implicit redundancies revealed by each nontrivial facial reduction step, which result in an implicit loss of surjectivity.
This leads to a two-step facial reduction and two novel related notions of singularity.
For the area of semidefinite programming, we use these singularities to strengthen a known bound on the solution rank, the Barvinok-Pataki bound.
For the area of linear programming, we reveal degeneracies caused by the implicit redundancies.
Furthermore, we propose a preprocessing tool that uses the simplex method.
In the second part of this thesis, we continue with the semidefinite programs that do not have strictly feasible points.
We focus on the doubly-nonnegative relaxation of the binary quadratic program and a semidefinite program with a nonlinear objective function.
We closely work with two classes of algorithms, the splitting method and the Gauss-Newton interior point method.
We elaborate on the advantages in building models from facial reduction. Moreover, we develop algorithms for real-world problems including the quadratic assignment problem, the protein side-chain positioning problem, and the key rate computation for quantum key distribution.
Facial reduction continues to play an important role in providing robust reformulated models, in both theoretical and practical respects, resulting in strong numerical performance.
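To make the preprocessing step concrete, here is a minimal sketch (not code from the thesis; the auxiliary LP and all names are illustrative assumptions) of one facial reduction step for a linear program over {x >= 0 : Ax = b}: it searches for a certificate y with A^T y >= 0 and b^T y = 0, after which every variable x_i with (A^T y)_i > 0 can be fixed to zero, shrinking the problem to a face of the nonnegative orthant.

import numpy as np
from scipy.optimize import linprog

def facial_reduction_step(A, b, tol=1e-9):
    # One facial reduction step for {x >= 0 : A x = b} (illustrative sketch).
    # Look for a certificate y with A^T y >= 0 and b^T y = 0; every feasible x
    # then satisfies (A^T y)^T x = y^T b = 0, so x_i = 0 wherever (A^T y)_i > 0.
    m, n = A.shape
    # Auxiliary LP in variables (y, z): maximize sum(z) subject to A^T y >= z,
    # b^T y = 0, 0 <= z <= 1. A positive optimum certifies a nontrivial face.
    c = np.concatenate([np.zeros(m), -np.ones(n)])
    A_ub = np.hstack([-A.T, np.eye(n)])          # z - A^T y <= 0
    b_ub = np.zeros(n)
    A_eq = np.concatenate([b, np.zeros(n)]).reshape(1, -1)
    b_eq = [0.0]
    bounds = [(None, None)] * m + [(0, 1)] * n
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    s = A.T @ res.x[:m]
    return np.where(s > tol)[0]                  # variables that must be zero

# Toy example: x1 + x2 = 0 and x2 + x3 = 1 with x >= 0 force x1 = x2 = 0,
# so the system has no strictly feasible point.
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
b = np.array([0.0, 1.0])
print(facial_reduction_step(A, b))               # expected: [0 1]

If the original system is strictly feasible, the auxiliary LP's optimum is zero and no variable is fixed, so the step is a no-op.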
Modular lifelong machine learning
Deep learning has drastically improved the state of the art in many important fields, including computer vision and natural language processing (LeCun et al., 2015). However, it is expensive to train a deep neural network on a machine learning problem. The overall training cost further increases when one wants to solve additional problems. Lifelong machine learning (LML) develops algorithms that aim to efficiently learn to solve a sequence of problems, which become available one at a time. New problems are solved with fewer resources by transferring previously learned knowledge. At the same time, an LML algorithm needs to retain good performance on all encountered problems, thus avoiding catastrophic forgetting. Current approaches do not possess all the desired properties of an LML algorithm. First, they primarily focus on preventing catastrophic forgetting (Diaz-Rodriguez et al., 2018; Delange et al., 2021). As a result, they neglect some knowledge transfer properties. Furthermore, they assume that all problems in a sequence share the same input space. Finally, scaling these methods to a large sequence of problems remains a challenge.
Modular approaches to deep learning decompose a deep neural network into sub-networks, referred to as modules. Each module can then be trained to perform an atomic transformation, specialised in processing a distinct subset of inputs. This modular approach to storing knowledge makes it easy to only reuse the subset of modules which are useful for the task at hand.
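As an illustrative sketch of this idea (assumed module names and shapes; not code from the thesis), a library of pre-trained modules can be stored and selectively recombined for a new problem:

import torch
import torch.nn as nn

# Hypothetical library of pre-trained modules, each an atomic transformation.
library = {
    "image_encoder": nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 64), nn.ReLU()),
    "digit_head": nn.Linear(64, 10),
}

# A new problem reuses the frozen encoder and trains only a fresh output module.
encoder = library["image_encoder"]
for p in encoder.parameters():
    p.requires_grad = False            # reuse old knowledge without overwriting it

new_head = nn.Linear(64, 5)            # new module for a new 5-class problem
model = nn.Sequential(encoder, new_head)

x = torch.randn(8, 1, 28, 28)          # dummy batch of 28x28 images
print(model(x).shape)                  # torch.Size([8, 5])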
This thesis introduces a line of research which demonstrates the merits of a modular approach to lifelong machine learning, and its ability to address the aforementioned shortcomings of other methods. Compared to previous work, we show that a modular approach can be used to achieve more LML properties than previously demonstrated. Furthermore, we develop tools which allow modular LML algorithms to scale in order to retain said properties on longer sequences of problems.
First, we introduce HOUDINI, a neurosymbolic framework for modular LML. HOUDINI represents modular deep neural networks as functional programs and accumulates a library of pre-trained modules over a sequence of problems. Given a new problem, we use program synthesis to select a suitable neural architecture, as well as a high-performing combination of pre-trained and new modules. We show that our approach has most of the properties desired from an LML algorithm. Notably, it can perform forward transfer, avoid negative transfer and prevent catastrophic forgetting, even across problems with disparate input domains and problems which require different neural architectures.
Second, we produce a modular LML algorithm which retains the properties of HOUDINI but can also scale to longer sequences of problems. To this end, we fix the choice of a neural architecture and introduce a probabilistic search framework, PICLE, for searching through different module combinations. To apply PICLE, we introduce two probabilistic models over neural modules which allow us to efficiently identify promising module combinations.
Third, we phrase the search over module combinations in modular LML as black-box optimisation, which allows one to make use of methods from the setting of hyperparameter optimisation (HPO). We then develop a new HPO method which marries a multi-fidelity approach with model-based optimisation. We demonstrate that this leads to improvement in anytime performance in the HPO setting and discuss how this can in turn be used to augment modular LML methods.
Overall, this thesis identifies a number of important LML properties, which have not all been attained in past methods, and presents an LML algorithm which can achieve all of them, apart from backward transfer
Unveiling the anatomy of mode-coupling theory
The mode-coupling theory of the glass transition (MCT) has been at the
forefront of fundamental glass research for decades, yet the theory's
underlying approximations remain obscure. Here we quantify and critically
assess the effect of each MCT approximation separately. Using Brownian dynamics
simulations, we compute the memory kernel predicted by MCT after each
approximation in its derivation, and compare it with the exact one. We find
that some often-criticized approximations are in fact very accurate, while the
opposite is true for others, providing new guiding cues for further theory
development
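For readers unfamiliar with the structure being dissected, the schematic form of the MCT equations for Brownian dynamics is sketched below; conventions vary, and prefactors and the precise vertex are omitted, so this is textbook background rather than material from the paper:

% Mori-Zwanzig equation of motion for the intermediate scattering function
% F(k,t) of a Brownian system, with (irreducible) memory kernel M(k,t):
\partial_t F(k,t) + \frac{D_0 k^2}{S(k)}\, F(k,t)
  + \int_0^t d\tau\, M(k, t-\tau)\, \partial_\tau F(k,\tau) = 0
% MCT closes the equation by approximating M as a bilinear functional of F,
% with a vertex V determined by static structure alone (prefactors omitted):
M(k,t) \;\approx\; \int d\mathbf{q}\; V(\mathbf{k},\mathbf{q})\, F(q,t)\, F(|\mathbf{k}-\mathbf{q}|, t)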
Old and New Minimalism: a Hopf algebra comparison
In this paper we compare some old formulations of Minimalism, in particular
Stabler's computational minimalism, and Chomsky's new formulation of Merge and
Minimalism, from the point of view of their mathematical description in terms
of Hopf algebras. We show that the newer formulation has a clear advantage
purely in terms of the underlying mathematical structure. More precisely, in
the case of Stabler's computational minimalism, External Merge can be described
in terms of a partially defined operated algebra with binary operation, while
Internal Merge determines a system of right-ideal coideals of the Loday-Ronco
Hopf algebra and corresponding right-module coalgebra quotients. This
mathematical structure shows that Internal and External Merge have
significantly different roles in the old formulations of Minimalism, and they
are more difficult to reconcile as facets of a single algebraic operation, as
would be desirable linguistically. On the other hand, we show that the newer formulation
of Minimalism naturally carries a Hopf algebra structure where Internal and
External Merge directly arise from the same operation. We also compare, at the
level of algebraic properties, the externalization model of the new Minimalism
with proposals for assignments of planar embeddings based on heads of trees.
Comment: 27 pages, LaTeX, 3 figures
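As standard background for the comparison (not taken from the paper), recall the data packaged by a Hopf algebra:

% A Hopf algebra H over a field consists of a product m, a unit u, a coproduct
% \Delta, a counit \varepsilon, and an antipode S, such that (H, m, u) is an
% algebra, (H, \Delta, \varepsilon) is a coalgebra, \Delta and \varepsilon are
% algebra homomorphisms, and the antipode satisfies
m \circ (S \otimes \mathrm{id}) \circ \Delta \;=\; u \circ \varepsilon \;=\; m \circ (\mathrm{id} \otimes S) \circ \Delta .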
Reinforcement learning in large state action spaces
Reinforcement learning (RL) is a promising framework for training intelligent agents which learn to optimize long-term utility by directly interacting with the environment. Creating RL methods which scale to large state-action spaces is critical to the real-world deployment of RL systems. However, several challenges limit the applicability of RL to large-scale settings. These include difficulties with exploration, low sample efficiency, computational intractability, task constraints like decentralization, and a lack of guarantees about important properties like performance, generalization and robustness in potentially unseen scenarios.
This thesis is motivated by bridging the aforementioned gap. We propose several principled algorithms and frameworks for studying and addressing the above challenges in RL. The proposed methods cover a wide range of RL settings (single and multi-agent systems (MAS) with all the variations in the latter, prediction and control, model-based and model-free methods, value-based and policy-based methods). In this work we propose the first results on several different problems: e.g. tensorization of the Bellman equation which allows exponential sample efficiency gains (Chapter 4), provable suboptimality arising from structural constraints in MAS (Chapter 3), combinatorial generalization results in cooperative MAS (Chapter 5), generalization results on observation shifts (Chapter 7), and learning deterministic policies in a probabilistic RL framework (Chapter 6). Our algorithms exhibit provably enhanced performance and sample efficiency along with better scalability. Additionally, we shed light on generalization aspects of the agents under different frameworks. These properties have been driven by the use of several advanced tools (e.g. statistical machine learning, state abstraction, variational inference, tensor theory).
In summary, the contributions in this thesis significantly advance progress towards making RL agents ready for large-scale, real-world applications
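For reference, the standard Bellman optimality equation underlying the tensorization contribution mentioned above is the following (the thesis's tensorized form is not reproduced here):

Q^*(s,a) \;=\; R(s,a) + \gamma \sum_{s'} P(s' \mid s, a)\, \max_{a'} Q^*(s', a') .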
Reconfiguration of the Union of Arborescences
An arborescence in a digraph is an acyclic arc subset in which every vertex
except a root has exactly one incoming arc. In this paper, we reveal the
reconfigurability of the union of k arborescences for fixed k in the
following sense: for any pair of arc subsets that can be partitioned into k
arborescences, one can be transformed into the other by exchanging arcs one by
one so that every intermediate arc subset can also be partitioned into k
arborescences. This generalizes the result by Ito et al. (2023), who showed the
case with k = 1. Since the union of k arborescences can be represented as a
common matroid basis of two matroids, our result gives a new non-trivial
example of matroid pairs for which two common bases are always reconfigurable
to each other
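As a small illustration of the object being reconfigured (not code from the paper; the graph is made up), the defining property of an arborescence can be checked directly with networkx:

import networkx as nx

# An arborescence: every vertex except the root "r" has exactly one incoming
# arc, and the arc set is acyclic.
G = nx.DiGraph([("r", "a"), ("r", "b"), ("a", "c")])
print(nx.is_arborescence(G))   # True

# A second incoming arc at "c" creates a vertex of in-degree 2 (and an
# underlying cycle), so the property is lost.
G.add_edge("b", "c")
print(nx.is_arborescence(G))   # False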
Many Physical Design Problems are Sparse QCQPs
Physical design refers to mathematical optimization of a desired objective
(e.g. strong light--matter interactions, or complete quantum state transfer)
subject to the governing dynamical equations, such as Maxwell's or
Schrodinger's differential equations. Computing an optimal design is
challenging: generically, these problems are highly nonconvex and finding
global optima is NP-hard. Here we show that for linear-differential-equation
dynamics (as in linear electromagnetism, elasticity, quantum mechanics, etc.),
the physical-design optimization problem can be transformed to a sparse-matrix,
quadratically constrained quadratic program (QCQP). Sparse QCQPs can be tackled
with convex optimization techniques (such as semidefinite programming) that
have thrived for identifying global bounds and high-performance designs in
other areas of science and engineering, but seemed inapplicable to the design
problems of wave physics. We apply our formulation to prototypical photonic
design problems, showing the possibility to compute fundamental limits for
large-area metasurfaces, as well as the identification of designs approaching
global optimality. Looking forward, our approach highlights the promise of
developing bespoke algorithms tailored to specific physical design problems.
Comment: 9 pages, 4 figures, plus references and Supplementary Material
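As a generic illustration of the solution route the abstract points to (a sketch of the standard Shor semidefinite relaxation of a toy QCQP with made-up data, not the paper's formulation), the relaxation replaces the rank-one lifted matrix x x^T by a positive semidefinite variable:

import numpy as np
import cvxpy as cp

# Toy nonconvex QCQP (illustrative data): minimize x^T Q0 x  s.t.  ||x||^2 <= 1,  a^T x = 1.
Q0 = np.array([[1.0,  2.0, 0.0],
               [2.0, -1.0, 1.0],
               [0.0,  1.0, 0.5]])      # symmetric but indefinite
a = np.array([2.0, -1.0, 0.5])
n = 3

# Shor relaxation: lift to Y = [[X, x], [x^T, 1]] >= 0, with X standing in for x x^T.
Y = cp.Variable((n + 1, n + 1), PSD=True)
X, x = Y[:n, :n], Y[:n, n]
constraints = [Y[n, n] == 1,
               cp.trace(X) <= 1.0,     # relaxes ||x||^2 <= 1
               a @ x == 1.0]
prob = cp.Problem(cp.Minimize(cp.trace(Q0 @ X)), constraints)
prob.solve(solver=cp.SCS)
print(prob.value)                      # a lower bound on the QCQP optimum

If the optimal Y happens to be rank one, the relaxation is tight and a globally optimal x can be read off from its last column.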
Hilbert-Burch virtual resolutions for points in P^1 x P^1
Building off of work of Harada, Nowroozi, and Van Tuyl which provided
particular length two virtual resolutions for finite sets of points in
P^1 x P^1, we prove that the vast majority of virtual
resolutions of a pair for minimal elements of the multigraded regularity in
this setting are of Hilbert-Burch type. We give explicit descriptions of these
short virtual resolutions that depend only on the number of points. Moreover,
despite initial evidence, we show that these virtual resolutions are not always
short, and we give sufficient conditions for when they are length three.
Comment: 21 pages, comments welcome
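For context (standard commutative-algebra background, not taken from the paper), a resolution of "Hilbert-Burch type" is a length-two free resolution whose middle matrix cuts out the ideal by its maximal minors:

% Hilbert-Burch shape: for a suitable (perfect, codimension-two) ideal I of R,
0 \longrightarrow R^{\,n-1} \xrightarrow{\ M\ } R^{\,n} \longrightarrow R \longrightarrow R/I \longrightarrow 0
% with I generated by the (n-1) x (n-1) minors of the matrix M.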
Endogenous measures for contextualising large-scale social phenomena: a corpus-based method for mediated public discourse
This work presents an interdisciplinary methodology for developing endogenous measures of group membership through analysis of pervasive linguistic patterns in public discourse. Focusing on political discourse, this work critiques the conventional approach to the study of political participation, which is premised on decontextualised, exogenous measures to characterise groups. Considering the theoretical and empirical weaknesses of decontextualised approaches to large-scale social phenomena, this work suggests that contextualisation using endogenous measures might provide a complementary perspective to mitigate such weaknesses.
This work develops a sociomaterial perspective on political participation in mediated discourse as affiliatory action performed through language. While the affiliatory function of language is often performed consciously (such as statements of identity), this work is concerned with unconscious features (such as patterns in lexis and grammar). This work argues that pervasive patterns in such features that emerge through socialisation are resistant to change and manipulation, and thus might serve as endogenous measures of sociopolitical contexts, and thus of groups.
In terms of method, the work takes a corpus-based approach to the analysis of data from the Twitter messaging service whereby patterns in users' speech are examined statistically in order to trace potential community membership. The method is applied in the US state of Michigan during the second half of 2018; 6 November was the date of midterm (i.e. non-Presidential) elections in the United States. The corpus is assembled from the original posts of 5,889 users, who are nominally geolocalised to 417 municipalities. These users are clustered according to pervasive language features. Comparing the linguistic clusters according to the municipalities they represent finds that there are regular sociodemographic differentials across clusters. This is understood as an indication of social structure, suggesting that endogenous measures derived from pervasive patterns in language may indeed offer a complementary, contextualised perspective on large-scale social phenomena
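A minimal sketch of the kind of pipeline the abstract describes (illustrative only; the thesis's actual feature set, clustering method, and data are replaced here by toy stand-ins): aggregate each user's posts, vectorise pervasive lexical patterns, and cluster the users.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

# Toy per-user documents: all of a user's posts concatenated (made-up text).
user_docs = {
    "user_a": "we was going to the game saturday night with the crew",
    "user_b": "the council meeting is scheduled for tuesday evening downtown",
    "user_c": "yall seen the score last night that game was wild",
}

# Word n-grams as a crude stand-in for pervasive lexico-grammatical patterns.
vectoriser = TfidfVectorizer(analyzer="word", ngram_range=(1, 2), min_df=1)
X = vectoriser.fit_transform(list(user_docs.values()))

# Cluster users by linguistic similarity; the cluster labels can then be
# compared against the sociodemographic profile of each user's municipality.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(dict(zip(user_docs, labels)))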