4,408 research outputs found
Exploiting Structural Properties in the Analysis of High-dimensional Dynamical Systems
The physical and cyber domains with which we interact are filled with high-dimensional dynamical systems. In machine learning, for instance, the evolution of overparametrized neural networks can be seen as a dynamical system. In networked systems, numerous agents or nodes dynamically interact with each other. A deep understanding of these systems can enable us to predict their behavior, identify potential pitfalls, and devise effective solutions for optimal outcomes. In this dissertation, we will discuss two classes of high-dimensional dynamical systems with specific structural properties that aid in understanding their dynamic behavior.
In the first scenario, we consider the training dynamics of multi-layer neural networks. The high dimensionality comes from overparametrization: a typical network has a large depth and hidden layer width. We are interested in the following question regarding convergence: Do network weights converge to an equilibrium point corresponding to a global minimum of our training loss, and how fast is the convergence rate? The key to those questions is the symmetry of the weights, a critical property induced by the multi-layer architecture. Such symmetry leads to a set of time-invariant quantities, called weight imbalance, that restrict the training trajectory to a low-dimensional manifold defined by the weight initialization. A tailored convergence analysis is developed over this low-dimensional manifold, showing improved rate bounds for several multi-layer network models studied in the literature, leading to novel characterizations of the effect of weight imbalance on the convergence rate.
In the second scenario, we consider large-scale networked systems with multiple weakly-connected groups. Such a multi-cluster structure leads to a time-scale separation between the fast intra-group interaction due to high intra-group connectivity, and the slow inter-group oscillation, due to the weak inter-group connection. We develop a novel frequency-domain network coherence analysis that captures both the coherent behavior within each group, and the dynamical interaction between groups, leading to a structure-preserving model-reduction methodology for large-scale dynamic networks with multiple clusters under general node dynamics assumptions
A foundation for synthesising programming language semantics
Programming or scripting languages used in real-world systems are seldom designed
with a formal semantics in mind from the outset. Therefore, the first step for developing well-founded analysis tools for these systems is to reverse-engineer a formal
semantics. This can take months or years of effort.
Could we automate this process, at least partially? Though desirable, automatically reverse-engineering semantics rules from an implementation is very challenging,
as found by Krishnamurthi, Lerner and Elberty. They propose automatically learning
desugaring translation rules, mapping the language whose semantics we seek to a simplified, core version, whose semantics are much easier to write. The present thesis
contains an analysis of their challenge, as well as the first steps towards a solution.
Scaling methods with the size of the language is very difficult due to state space
explosion, so this thesis proposes an incremental approach to learning the translation
rules. I present a formalisation that both clarifies the informal description of the challenge by Krishnamurthi et al, and re-formulates the problem, shifting the focus to the
conditions for incremental learning. The central definition of the new formalisation is
the desugaring extension problem, i.e. extending a set of established translation rules
by synthesising new ones.
In a synthesis algorithm, the choice of search space is important and non-trivial,
as it needs to strike a good balance between expressiveness and efficiency. The rest
of the thesis focuses on defining search spaces for translation rules via typing rules.
Two prerequisites are required for comparing search spaces. The first is a series of
benchmarks, a set of source and target languages equipped with intended translation
rules between them. The second is an enumerative synthesis algorithm for efficiently
enumerating typed programs. I show how algebraic enumeration techniques can be applied to enumerating well-typed translation rules, and discuss the properties expected
from a type system for ensuring that typed programs be efficiently enumerable.
The thesis presents and empirically evaluates two search spaces. A baseline search
space yields the first practical solution to the challenge. The second search space is
based on a natural heuristic for translation rules, limiting the usage of variables so that
they are used exactly once. I present a linear type system designed to efficiently enumerate translation rules, where this heuristic is enforced. Through informal analysis
and empirical comparison to the baseline, I then show that using linear types can speed
up the synthesis of translation rules by an order of magnitude
UMSL Bulletin 2023-2024
The 2023-2024 Bulletin and Course Catalog for the University of Missouri St. Louis.https://irl.umsl.edu/bulletin/1088/thumbnail.jp
Classical and quantum algorithms for scaling problems
This thesis is concerned with scaling problems, which have a plethora of connections to different areas of mathematics, physics and computer science. Although many structural aspects of these problems are understood by now, we only know how to solve them efficiently in special cases.We give new algorithms for non-commutative scaling problems with complexity guarantees that match the prior state of the art. To this end, we extend the well-known (self-concordance based) interior-point method (IPM) framework to Riemannian manifolds, motivated by its success in the commutative setting. Moreover, the IPM framework does not obviously suffer from the same obstructions to efficiency as previous methods. It also yields the first high-precision algorithms for other natural geometric problems in non-positive curvature.For the (commutative) problems of matrix scaling and balancing, we show that quantum algorithms can outperform the (already very efficient) state-of-the-art classical algorithms. Their time complexity can be sublinear in the input size; in certain parameter regimes they are also optimal, whereas in others we show no quantum speedup over the classical methods is possible. Along the way, we provide improvements over the long-standing state of the art for searching for all marked elements in a list, and computing the sum of a list of numbers.We identify a new application in the context of tensor networks for quantum many-body physics. We define a computable canonical form for uniform projected entangled pair states (as the solution to a scaling problem), circumventing previously known undecidability results. We also show, by characterizing the invariant polynomials, that the canonical form is determined by evaluating the tensor network contractions on networks of bounded size
Recommended from our members
Interpretable Machine Learning Architectures for Efficient Signal Detection with Applications to Gravitational Wave Astronomy
Deep learning has seen rapid evolution in the past decade, accomplishing tasks that were previously unimaginable. At the same time, researchers strive to better understand and interpret the underlying mechanisms of the deep models, which are often justifiably regarded as "black boxes". Overcoming this deficiency will not only serve to suggest better learning architectures and training methods, but also extend deep learning to scenarios where interpretability is key to the application. One such scenario is signal detection and estimation, with gravitational wave detection as a specific example, where classic methods are often preferred for their interpretability. Nonetheless, while classic statistical detection methods such as matched filtering excel in their simplicity and intuitiveness, they can be suboptimal in terms of both accuracy and computational efficiency. Therefore, it is appealing to have methods that achieve ``the best of both worlds'', namely enjoying simultaneously excellent performance and interpretability.
In this thesis, we aim to bridge this gap between modern deep learning and classic statistical detection, by revisiting the signal detection problem from a new perspective. First, to address the perceived distinction in interpretability between classic matched filtering and deep learning, we state the intrinsic connections between the two families of methods, and identify how trainable networks can address the structural limitations of matched filtering. Based on these ideas, we propose two trainable architectures that are constructed based on matched filtering, but with learnable templates and adaptivity to unknown noise distributions, and therefore higher detection accuracy. We next turn our attention toward improving the computational efficiency of detection, where we aim to design architectures that leverage structures within the problem for efficiency gains. By leveraging the statistical structure of class imbalance, we integrate hierarchical detection into trainable networks, and use a novel loss function which explicitly encodes both detection accuracy and efficiency. Furthermore, by leveraging the geometric structure of the signal set, we consider using signal space optimization as an alternative computational primitive for detection, which is intuitively more efficient than covering with a template bank. We theoretical prove the efficiency gain by analyzing Riemannian gradient descent on the signal manifold, which reveals an exponential improvement in efficiency over matched filtering. We also propose a practical trainable architecture for template optimization, which makes use of signal embedding and kernel interpolation.
We demonstrate the performance of all proposed architectures on the task of gravitational wave detection in astrophysics, where matched filtering is the current method of choice. The architectures are also widely applicable to general signal or pattern detection tasks, which we exemplify with the handwritten digit recognition task using the template optimization architecture. Together, we hope the this work useful to scientists and engineers seeking machine learning architectures with high performance and interpretability, and contribute to our understanding of deep learning as a whole
LIPIcs, Volume 251, ITCS 2023, Complete Volume
LIPIcs, Volume 251, ITCS 2023, Complete Volum
UMSL Bulletin 2022-2023
The 2022-2023 Bulletin and Course Catalog for the University of Missouri St. Louis.https://irl.umsl.edu/bulletin/1087/thumbnail.jp
A Practitioner's Guide to Bayesian Inference in Pharmacometrics using Pumas
This paper provides a comprehensive tutorial for Bayesian practitioners in
pharmacometrics using Pumas workflows. We start by giving a brief motivation of
Bayesian inference for pharmacometrics highlighting limitations in existing
software that Pumas addresses. We then follow by a description of all the steps
of a standard Bayesian workflow for pharmacometrics using code snippets and
examples. This includes: model definition, prior selection, sampling from the
posterior, prior and posterior simulations and predictions, counter-factual
simulations and predictions, convergence diagnostics, visual predictive checks,
and finally model comparison with cross-validation. Finally, the background and
intuition behind many advanced concepts in Bayesian statistics are explained in
simple language. This includes many important ideas and precautions that users
need to keep in mind when performing Bayesian analysis. Many of the algorithms,
codes, and ideas presented in this paper are highly applicable to clinical
research and statistical learning at large but we chose to focus our
discussions on pharmacometrics in this paper to have a narrower scope in mind
and given the nature of Pumas as a software primarily for pharmacometricians
- …