UMSL Bulletin 2023-2024
The 2023-2024 Bulletin and Course Catalog for the University of Missouri–St. Louis.
Projection-free methods for solving smooth convex bilevel optimisation problems
When an "inner-level" convex optimisation problem has multiple minima, the convex bilevel optimisation problem selects among them a solution that also minimises an auxiliary "outer-level" convex objective of interest. Bilevel optimisation requires a different approach from single-level optimisation because the set of minimisers of the inner-level objective is not given explicitly. In this thesis, we propose new projection-free methods for convex bilevel optimisation that require only a linear optimisation oracle over the base domain. We provide convergence guarantees for both the inner- and outer-level objectives under the proposed methods. In particular, we highlight how our guarantees are affected by the presence or absence of an optimal dual solution. Lastly, we conduct numerical experiments that demonstrate the performance of the proposed methods.
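The abstract does not spell out the methods, but its core primitive, a linear optimisation oracle (LMO) over the base domain, is the same one that drives the classical Frank-Wolfe (conditional gradient) method. As a minimal sketch of how an LMO replaces projection (illustrative only; the simplex domain, quadratic objective, and step rule below are assumptions, not the thesis's actual setting):

```python
import numpy as np

def lmo_simplex(g):
    """Linear optimisation oracle over the probability simplex:
    argmin_{s in simplex} <g, s> is the vertex at the smallest entry of g."""
    s = np.zeros_like(g)
    s[np.argmin(g)] = 1.0
    return s

def frank_wolfe(grad, x0, steps=500):
    """Conditional-gradient loop: every iterate is a convex combination of
    vertices, so feasibility is maintained without any projection."""
    x = x0.copy()
    for t in range(steps):
        s = lmo_simplex(grad(x))
        gamma = 2.0 / (t + 2)  # classic diminishing step size
        x = (1 - gamma) * x + gamma * s
    return x

# Minimise ||x - c||^2 over the simplex (c itself lies in the simplex).
c = np.array([0.1, 0.2, 0.7])
x_star = frank_wolfe(lambda x: 2 * (x - c), np.ones(3) / 3)
```

The appeal of the projection-free viewpoint is that for many domains (e.g. nuclear-norm or polytope constraints) a single call to the LMO is far cheaper than an exact projection.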
Less is More -- Towards parsimonious multi-task models using structured sparsity
Model sparsification in deep learning promotes simpler, more interpretable models with fewer parameters. This not only reduces the model's memory footprint and computational needs but also shortens inference time. This work focuses on creating sparse models, optimized across multiple tasks, with fewer parameters. Such parsimonious models can also match or outperform their dense counterparts. In this work, we introduce channel-wise l1/l2 group sparsity in the shared convolutional layers' parameters (or weights) of the multi-task learning model. This approach facilitates the removal of extraneous groups, i.e., channels (due to l1 regularization), and also penalizes the weights, further enhancing learning across all tasks (due to l2 regularization). We analyzed the effect of group sparsity in both single-task and multi-task settings on two widely used Multi-Task Learning (MTL) datasets: NYU-v2 and CelebAMask-HQ. On both datasets, each of which comprises three different computer vision tasks, multi-task models with approximately 70% sparsity outperform their dense equivalents. We also investigate how the degree of sparsification influences the model's performance, the overall sparsity percentage, the patterns of sparsity, and the inference time.
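Channel-wise l1/l2 group sparsity is usually realized as a group-lasso penalty whose proximal operator can zero out whole channels at once. A minimal numpy sketch of that mechanism (the tensor shapes, threshold, and use of a prox step are illustrative assumptions, not the paper's training setup):

```python
import numpy as np

def group_sparsity_penalty(W, lam):
    """Channel-wise l1/l2 (group-lasso) penalty on a conv weight tensor W of
    shape (out_channels, in_channels, kH, kW): the l1 norm, across channels,
    of the per-channel l2 norms."""
    channel_norms = np.linalg.norm(W.reshape(W.shape[0], -1), axis=1)
    return lam * channel_norms.sum()

def group_prox(W, tau):
    """Proximal step (group soft-thresholding): shrinks every channel toward
    zero and removes entirely any channel whose l2 norm falls below tau."""
    flat = W.reshape(W.shape[0], -1)
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return (flat * scale).reshape(W.shape)

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 3, 3, 3))
W[0] *= 0.01                      # a weak channel: pruned wholesale by the prox
W_sparse = group_prox(W, tau=0.5)
```

The l1 part (across channel norms) is what removes channels; the l2 part (within each channel) is what uniformly shrinks the surviving weights.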
NEMISA Digital Skills Conference (Colloquium) 2023
The colloquium and its events centred on the role that data plays today as a desirable commodity that must become an important part of massifying digital skilling efforts. Governments amass ever more critical data that, if leveraged, could change the way public services are delivered, and even change the social and economic fortunes of a country. Smart governments and organisations therefore increasingly require data skills to gain insight and foresight, to secure themselves, and to improve decision making and efficiency. However, data skills are scarce, and even more challenging is the inconsistency of the associated training programs, most of which are curated for the Science, Technology, Engineering, and Mathematics (STEM) disciplines. Nonetheless, the interdisciplinary yet discipline-agnostic nature of data means that there is an opportunity to expand data skills into the non-STEM disciplines as well.
On solving a rank regularized minimization problem via equivalent factorized column-sparse regularized models
The rank regularized minimization problem is an ideal model for low-rank matrix completion/recovery. The matrix factorization approach transforms the high-dimensional rank regularized problem into a low-dimensional factorized column-sparse regularized problem. The latter greatly facilitates fast computation in applicable algorithms, but must contend with the simultaneous non-convexity of the loss and regularization functions. In this paper, we consider the factorized column-sparse regularized model. First, we optimize this model with bound constraints and establish a certain equivalence between the optimized factorization problem and the rank regularized problem. Further, we strengthen the optimality condition for stationary points of the factorization problem and define the notion of a strong stationary point. Moreover, we establish the equivalence between the factorization problem and a nonconvex relaxation of it in the sense of global minimizers and strong stationary points. To solve the factorization problem, we design two types of algorithms and give an adaptive method to reduce their computational cost. The first algorithm takes the relaxation point of view; after finitely many iterations, its iterates share certain properties with global minimizers of the factorization problem, and we analyze the convergence of the iterates to a strong stationary point. The second algorithm directly solves the factorization problem. We improve the PALM algorithm introduced by Bolte et al. (Math Program Ser A 146:459-494, 2014) for the factorization problem and give improved convergence results. Finally, we conduct numerical experiments to show the promising performance of the proposed model and algorithms for low-rank matrix completion.
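The improved PALM variant itself is not reproduced here, but the general shape of a PALM iteration for the factorized column-sparse model is easy to sketch: a gradient step on the smooth masked loss for each factor, followed by a column-wise group soft-threshold, with step sizes set from the partial Lipschitz constants. The shapes, penalty weight, and initialization below are illustrative assumptions:

```python
import numpy as np

def col_prox(X, tau):
    """Column-wise soft-threshold: shrinks each column of a factor and zeroes
    any column whose l2 norm falls below tau, lowering the rank of U @ V.T."""
    norms = np.linalg.norm(X, axis=0, keepdims=True)
    return X * np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)

def palm_complete(M, mask, r=5, lam=0.05, steps=500):
    """PALM-style alternating proximal-gradient loop for masked low-rank
    completion with a column-sparse penalty on both factors."""
    m, n = M.shape
    rng = np.random.default_rng(0)
    U, V = rng.standard_normal((m, r)), rng.standard_normal((n, r))
    for _ in range(steps):
        LU = max(np.linalg.norm(V.T @ V, 2), 1e-8)  # Lipschitz bound, U-block
        R = mask * (U @ V.T - M)                    # masked residual
        U = col_prox(U - (R @ V) / LU, lam / LU)
        LV = max(np.linalg.norm(U.T @ U, 2), 1e-8)  # Lipschitz bound, V-block
        R = mask * (U @ V.T - M)                    # recompute with updated U
        V = col_prox(V - (R.T @ U) / LV, lam / LV)
    return U, V

# Recover a rank-2 matrix from ~70% of its entries, overparameterized at r = 5.
rng = np.random.default_rng(1)
M = rng.standard_normal((20, 2)) @ rng.standard_normal((2, 15))
mask = (rng.random(M.shape) < 0.7).astype(float)
U, V = palm_complete(M, mask)
```

Recomputing the residual before the V-step, and rescaling both the gradient step and the prox threshold by the per-block Lipschitz constant, are what give PALM its monotone-decrease guarantee.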
CORE: Common Random Reconstruction for Distributed Optimization with Provable Low Communication Complexity
With distributed machine learning a prominent technique for large-scale machine learning tasks, communication complexity has become a major bottleneck for speeding up training and scaling up the number of machines. In this paper, we propose a new technique named Common randOm REconstruction (CORE), which can be used to compress the information transmitted between machines in order to reduce communication complexity without other strict conditions. Specifically, CORE projects vector-valued information to a low-dimensional representation through common random vectors and reconstructs the information with the same random noise after communication. We apply CORE to two distributed tasks, convex optimization on linear models and generic non-convex optimization, and design new distributed algorithms that achieve provably lower communication complexity. For example, we show that for linear models a CORE-based algorithm can encode the gradient vector in O(√d) bits (against Ω(d)), with no worse convergence rate, improving on existing results.
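The mechanism the abstract describes, projecting onto common random vectors and reconstructing with the same randomness, can be sketched with a shared seed standing in for the common random noise (an illustrative reading of the idea, not the paper's actual algorithm or bit-level encoding):

```python
import numpy as np

def core_compress(g, k, seed):
    """Project a d-dimensional gradient onto k random directions. Sender and
    receiver derive the same Gaussian matrix from a shared seed, so only the
    k coefficients need to travel over the network."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((k, g.size)) / np.sqrt(k)
    return A @ g

def core_reconstruct(coeffs, d, seed):
    """Rebuild an unbiased estimate of the gradient by regenerating the same
    random matrix locally from the shared seed (E[A.T @ A] = I)."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((coeffs.size, d)) / np.sqrt(coeffs.size)
    return A.T @ coeffs

g = np.linspace(0.0, 1.0, 20)
coeffs = core_compress(g, k=5, seed=42)     # 5 numbers cross the wire, not 20
g_hat = core_reconstruct(coeffs, d=g.size, seed=42)
```

Each round's estimate is unbiased but noisy; drawing a fresh shared seed every round lets the noise average out across iterations.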
ReLOAD: reinforcement learning with optimistic ascent-descent for last-iterate convergence in constrained MDPs
In recent years, reinforcement learning (RL) has been applied to real-world problems with increasing success. Such applications often require placing constraints on the agent's behavior. Existing algorithms for constrained RL (CRL) rely on gradient descent-ascent, but this approach comes with a caveat: while these algorithms are guaranteed to converge on average, they do not guarantee last-iterate convergence, i.e., the current policy of the agent may never converge to the optimal solution. In practice, it is often observed that the policy alternates between satisfying the constraints and maximizing the reward, rarely accomplishing both objectives simultaneously. Here, we address this problem by introducing Reinforcement Learning with Optimistic Ascent-Descent (ReLOAD), a principled CRL method with guaranteed last-iterate convergence. We demonstrate its empirical effectiveness on a wide variety of CRL problems, including discrete MDPs and continuous control. In the process, we establish a benchmark of challenging CRL problems.
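The average-versus-last-iterate distinction is easiest to see on a toy min-max problem. The sketch below applies optimistic gradient descent-ascent, the generic primal-dual mechanism behind "optimistic ascent-descent", to the bilinear game min_x max_y xy (an illustrative example; ReLOAD itself operates on Lagrangians of constrained MDPs, not this toy):

```python
import numpy as np

def ogda(x0, y0, eta=0.1, steps=2000):
    """Optimistic gradient descent-ascent on the bilinear game min_x max_y x*y.
    Plain descent-ascent spirals away from the saddle point here; the
    optimistic correction (2*g_t - g_{t-1}) pulls the last iterate itself
    to the unique saddle point (0, 0)."""
    x, y = x0, y0
    gx_prev, gy_prev = y0, x0              # grad_x f = y, grad_y f = x
    for _ in range(steps):
        gx, gy = y, x                      # simultaneous gradients
        x = x - eta * (2 * gx - gx_prev)   # descent on x, extrapolated grad
        y = y + eta * (2 * gy - gy_prev)   # ascent on y, extrapolated grad
        gx_prev, gy_prev = gx, gy
    return x, y

x_last, y_last = ogda(1.0, 1.0)
```

With plain gradient descent-ascent, the same iteration multiplies the distance to the saddle by sqrt(1 + eta^2) every step and diverges; the single extra extrapolation term is what restores last-iterate convergence.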
Leveraging elasticity theory to calculate cell forces: From analytical insights to machine learning
Living cells can detect and respond to mechanical features of their surroundings. In traction force microscopy, the traction of cells on an elastic substrate is made visible by observing substrate deformation, as measured by the movement of embedded marker beads. Describing the substrate by means of elasticity theory, we can calculate the adhesive forces, improving our understanding of cellular function and behavior. In this dissertation, I combine analytical solutions with numerical methods and machine learning techniques to improve traction prediction in a range of experimental applications. I describe how to include the normal traction component in regularization-based Fourier approaches, which I apply to experimental data. I compare the dominant strategies for traction reconstruction, the direct method and inverse, regularization-based approaches, and find that the latter are more precise while the former is more resilient to noise. I find that a point-force-based reconstruction can be used to study the evolution of the force balance in response to microneedle pulling, showing a transition from a dipolar to a monopolar force arrangement. Finally, I show that a conditional invertible neural network not only reconstructs adhesive areas in a more localized fashion, but also reveals spatial correlations and variations in the reliability of traction reconstructions.
Implicit Loss of Surjectivity and Facial Reduction: Theory and Applications
Facial reduction, pioneered by Borwein and Wolkowicz, is a preprocessing method commonly used to obtain strict feasibility in the reformulated, reduced constraint system. The importance of strict feasibility is often addressed in the context of convergence results for interior point methods. Beyond the theoretical properties that facial reduction conveys, we show that it leads to strong numerical performance across different classes of algorithms, not only interior point methods. In this thesis, we study various consequences and the broad applicability of facial reduction. The thesis is organized in two parts.
In the first part, we exhibit the instabilities that accompany the absence of strict feasibility through the lens of facially reduced systems. In particular, we exploit the implicit redundancies, revealed by each nontrivial facial reduction step, that result in an implicit loss of surjectivity. This leads to a two-step facial reduction and two novel related notions of singularity. For semidefinite programming, we use these singularities to strengthen a known bound on the solution rank, the Barvinok-Pataki bound. For linear programming, we reveal degeneracies caused by the implicit redundancies, and we propose a preprocessing tool that uses the simplex method.
In the second part of the thesis, we turn to semidefinite programs that do not have strictly feasible points. We focus on the doubly nonnegative relaxation of the binary quadratic program and on a semidefinite program with a nonlinear objective function. We work closely with two classes of algorithms, the splitting method and the Gauss-Newton interior point method, and we elaborate on the advantages of building models via facial reduction. Moreover, we develop algorithms for real-world problems including the quadratic assignment problem, the protein side-chain positioning problem, and key rate computation for quantum key distribution. Facial reduction continues to play an important role in providing robust reformulated models, both theoretically and practically, resulting in strong numerical performance.