Meta-learning algorithms and applications
Meta-learning in the broader context concerns how an agent learns about its own learning, allowing it to improve its learning process. Learning how to learn is not only beneficial for humans; it has also shown vast benefits for improving how machines learn. In the context of machine learning, meta-learning enables models to improve their learning process by selecting suitable meta-parameters that influence the learning. For deep learning specifically, the meta-parameters typically describe details of the training of the model but can also include a description of the model itself, namely its architecture. Meta-learning is usually done with specific goals in mind, for example improving the ability to generalize or to learn new concepts from only a few examples.
Meta-learning can be powerful, but it comes with a key downside: it is often computationally costly. If these costs were alleviated, meta-learning could be more accessible to developers of new artificial intelligence models, allowing them to achieve greater goals or save resources. As a result, one key focus of our research is on significantly improving the efficiency of meta-learning. We develop two approaches, EvoGrad and PASHA, both of which significantly improve meta-learning efficiency in two common scenarios. EvoGrad allows us to efficiently optimize a large number of differentiable meta-parameters, while PASHA enables us to efficiently optimize any type of meta-parameters, but fewer in number.
Meta-learning is a tool that can be applied to solve various problems. Most commonly it is applied to learning new concepts from only a small number of examples (few-shot learning), but other applications exist too. To showcase the practical impact that meta-learning can make in the context of neural networks, we use meta-learning as a novel solution for two selected problems: more accurate uncertainty quantification (calibration) and general-purpose few-shot learning. Both are practically important problems, and using meta-learning approaches we can obtain better solutions than those obtained using existing approaches. Calibration is important for safety-critical applications of neural networks, while general-purpose few-shot learning tests a model's ability to generalize few-shot learning abilities across diverse tasks such as recognition, segmentation, and keypoint estimation.
More efficient algorithms as well as novel applications enable the field of meta-learning to make a more significant impact on the broader area of deep learning and potentially solve problems that were previously too challenging. Ultimately, both allow us to better utilize the opportunities that artificial intelligence presents.
Optimization for Energy Management in the Community Microgrids
This thesis focuses on improving the energy management strategies for Community Microgrids (CMGs), which are expected to play a crucial role in the future smart grid. CMGs bring many benefits, including increased use of renewable energy, improved reliability, resiliency, and energy efficiency. An Energy Management System (EMS) is a key tool that helps in monitoring, controlling, and optimizing the operations of the CMG in a cost-effective manner. The EMS can include various functionalities like day-ahead generation scheduling, real-time scheduling, uncertainty management, and demand response programs.
Generation scheduling in a microgrid is a challenging optimization problem, especially due to the intermittent nature of renewable energy. The power balance constraint, which is the balance between energy demand and generation, is difficult to satisfy due to prediction errors in energy demand and generation. Real-time scheduling, which is based on a shorter prediction horizon, reduces these errors, but the impact of uncertainties cannot be completely eliminated. With regard to demand response programs, it is challenging to design an effective model that motivates customers to participate voluntarily while benefiting the system operator.
Mathematical optimization techniques have been widely used to solve power system problems, but their application is limited by the need for specific mathematical properties. Metaheuristic techniques, particularly Evolutionary Algorithms (EAs), have gained popularity for their ability to solve complex and non-linear problems. However, the traditional form of EAs may require significant computational effort for complex energy management problems in the CMG.
This thesis aims to enhance the existing methods of EMS in CMGs. Improved techniques are developed for day-ahead generation scheduling, multi-stage real-time scheduling, and demand response implementation. For generation scheduling, the performance of conventional EAs is improved through an efficient heuristic. A new multi-stage scheduling framework is proposed to minimize the impact of uncertainties in real-time operations. With regard to demand response, a memetic algorithm is proposed to solve an incentive-based scheme from the perspective of an aggregator, and a price-based demand response driven by dynamic price optimization is proposed to enhance the electric vehicle hosting capacity. The proposed methods are validated through extensive numerical experiments and comparison with state-of-the-art approaches. The results confirm the effectiveness of the proposed methods in improving energy management in CMGs.
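As an illustration of the power balance constraint described above, a deterministic day-ahead schedule can be written as a small linear program. This is a toy sketch with assumed numbers and a generic LP solver, not the thesis's EA-based method:

```python
# Toy day-ahead scheduling sketch (illustrative numbers, not the thesis's
# algorithm): dispatch two generators over 3 hours to meet forecast demand
# at minimum cost, subject to power balance and capacity constraints.
import numpy as np
from scipy.optimize import linprog

hours = 3
cost = np.array([20.0, 50.0])           # $/MWh: a cheap and an expensive unit
cap = np.array([60.0, 100.0])           # MW capacity of each unit
demand = np.array([50.0, 90.0, 120.0])  # MW forecast load per hour

# Decision vector: generation g[t, i], flattened to length hours * 2.
c = np.tile(cost, hours)

# Power balance per hour: g[t, 0] + g[t, 1] == demand[t].
A_eq = np.zeros((hours, hours * 2))
for t in range(hours):
    A_eq[t, 2 * t:2 * t + 2] = 1.0
b_eq = demand

bounds = [(0.0, cap[i % 2]) for i in range(hours * 2)]
res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
schedule = res.x.reshape(hours, 2)
print(schedule)  # the cheap unit is dispatched first, up to its 60 MW cap
```

In each hour the solver fills demand with the cheap unit first and only then uses the expensive one, which is exactly the merit-order behavior the power balance and capacity constraints enforce.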
Adjustable robust optimization with nonlinear recourses
Over the last century, mathematical optimization has become a prominent tool for decision making. Its systematic application in practical fields such as economics, logistics, or defense led to the development of algorithmic methods with ever increasing efficiency. Indeed, for a variety of real-world problems, finding an optimal decision among a set of (implicitly or explicitly) predefined alternatives has become conceivable in reasonable time. In the last decades, however, the research community has devoted more and more attention to the role of uncertainty in the optimization process. In particular, one may question the notion of optimality, and even feasibility, when studying decision problems with unknown or imprecise input parameters. This concern is even more critical in a world becoming more and more complex, by which we mean interconnected, where each individual variation inside a system inevitably causes other variations in the system itself.
In this dissertation, we study a class of optimization problems which suffer from imprecise input data and feature a two-stage decision process, i.e., where decisions are made in a sequential order, called stages, and where unknown parameters are revealed throughout the stages. Applications of such problems abound in practical fields: e.g., facility location problems with uncertain demands, transportation problems with uncertain costs, or scheduling under uncertain processing times. The uncertainty is dealt with from a robust optimization (RO) viewpoint (also known as the "worst-case perspective"), and we present original contributions to the RO literature on both the theoretical and the practical side.
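The standard form of such a two-stage robust problem can be written as follows (generic textbook notation, not specific to this dissertation): here-and-now decisions $x$ are fixed before the uncertainty $\xi$ is revealed, and wait-and-see (recourse) decisions $y$ react to it.

```latex
\min_{x \in X} \; \max_{\xi \in \Xi} \; \min_{y \in Y(x,\xi)} \; c^\top x + q(\xi)^\top y
```

The innermost minimization models the recourse stage; allowing the recourse feasible set $Y(x,\xi)$ or the cost $q(\xi)$ to depend nonlinearly on the decisions and the uncertainty is what makes the recourse nonlinear.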
A study of distributionally robust mixed-integer programming with Wasserstein metric: on the value of incomplete data
This study addresses a class of linear mixed-integer programming (MILP) problems that involve uncertainty in the objective function parameters. The parameters are assumed to form a random vector, whose probability distribution can only be observed through a finite training data set. Unlike most of the related studies in the literature, we also consider uncertainty in the underlying data set. The data uncertainty is described by a set of linear constraints for each random sample, and the uncertainty in the distribution (for a fixed realization of data) is defined using a type-1 Wasserstein ball centered at the empirical distribution of the data. The overall problem is formulated as a three-level distributionally robust optimization (DRO) problem. First, we prove that the three-level problem admits a single-level MILP reformulation if the class of loss functions is restricted to biaffine functions. Second, it turns out that for several particular forms of data uncertainty, the outlined problem can be solved reasonably fast by leveraging the nominal MILP problem. Finally, we conduct a computational study in which the out-of-sample performance of our model and the computational complexity of the proposed MILP reformulation are explored numerically for several application domains.
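The type-1 Wasserstein ambiguity set mentioned above has a standard form (the notation here is ours): for a fixed data realization with empirical distribution $\hat{P}_N$, the distributional layer hedges against every distribution within Wasserstein radius $\varepsilon$ of it.

```latex
\sup_{Q \,:\, W_1(Q, \hat{P}_N) \le \varepsilon} \; \mathbb{E}_{\xi \sim Q}\bigl[\ell(x, \xi)\bigr],
\qquad
\hat{P}_N = \frac{1}{N}\sum_{i=1}^{N} \delta_{\hat{\xi}_i}
```

Here $\ell(x,\xi)$ is the loss of decision $x$ under realization $\xi$, and $W_1$ is the type-1 (earth mover's) Wasserstein distance; the study's third level additionally perturbs the samples $\hat{\xi}_i$ within linear constraints.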
Multi-objective resource optimization in space-aerial-ground-sea integrated networks
Space-air-ground-sea integrated (SAGSI) networks are envisioned to connect satellite, aerial, ground, and sea networks to provide connectivity everywhere and all the time in sixth-generation (6G) networks. However, the success of SAGSI networks is constrained by several challenges, including resource optimization when users have diverse requirements and applications. We present a comprehensive review of SAGSI networks from a resource optimization perspective. We discuss use case scenarios and possible applications of SAGSI networks. The resource optimization discussion considers the challenges associated with SAGSI networks. In our review, we categorize resource optimization techniques based on throughput and capacity maximization, delay minimization, energy consumption, task offloading, task scheduling, resource allocation or utilization, network operation cost, outage probability, average age of information, joint optimization (data rate difference, storage or caching, CPU cycle frequency), overall network performance and performance degradation, software-defined networking, and intelligent surveillance and relay communication. We then formulate a mathematical framework for maximizing energy efficiency, resource utilization, and user association. We optimize user association while satisfying the constraints of transmit power, data rate, and user association with priority. A binary decision variable is used to associate users with system resources. Since the decision variable is binary and the constraints are linear, the formulated problem is a binary linear programming problem. Based on our formulated framework, we simulate and analyze the performance of three different algorithms (the branch-and-bound algorithm, the interior point method, and the barrier simplex algorithm) and compare the results. Simulation results show that the branch-and-bound algorithm achieves the best results, so we adopt it as our benchmark. However, the complexity of branch and bound grows exponentially with the number of users and stations in the SAGSI network. The interior point method and the barrier simplex algorithm achieve results comparable to the benchmark at low complexity. Finally, we discuss future research directions and challenges of resource optimization in SAGSI networks.
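A binary linear program of the kind described above can be sketched in a toy form. The rates and capacities below are illustrative assumptions; the paper's full model also includes transmit-power, data-rate, and priority constraints:

```python
# Toy user-association binary LP (illustrative numbers, not the paper's
# exact model): assign each user to exactly one station to maximize the
# total data rate, subject to a per-station capacity limit.
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

rates = np.array([[10.0, 4.0],   # achievable rate of user u at station s
                  [3.0, 8.0],
                  [6.0, 7.0]])
n_users, n_stations = rates.shape

# Binary variable x[u, s] = 1 if user u is served by station s.
# milp minimizes, so negate the rates to maximize the total rate.
c = -rates.ravel()

# Each user is associated with exactly one station.
A_user = np.kron(np.eye(n_users), np.ones(n_stations))
one_station = LinearConstraint(A_user, lb=1, ub=1)

# Each station serves at most 2 users (assumed capacity).
A_station = np.tile(np.eye(n_stations), n_users)
capacity = LinearConstraint(A_station, ub=2)

res = milp(c, constraints=[one_station, capacity],
           integrality=np.ones(c.size), bounds=Bounds(0, 1))
x = np.round(res.x).reshape(n_users, n_stations)
print(x, -res.fun)  # optimal association and the total achieved rate
```

The optimum assigns user 0 to station 0 and users 1 and 2 to station 1, for a total rate of 25; relaxing the binary requirement is what the interior-point and barrier-simplex comparisons in the review exploit.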
A Safe Approximation Based on Mixed-Integer Optimization for Non-Convex Distributional Robustness Governed by Univariate Indicator Functions
In this work, we present algorithmically tractable safe approximations of distributionally robust optimization (DRO) problems. The considered ambiguity sets can exploit information on moments as well as confidence sets. Typically, reformulation approaches using duality theory need to make strong assumptions on the structure of the underlying constraints, such as convexity in the decisions or concavity in the uncertainty. In contrast, here we present a duality-based reformulation approach for DRO problems in which the objective of the adversary is allowed to depend on univariate indicator functions. This renders the problem nonlinear and nonconvex. In order to reformulate the semi-infinite constraints nevertheless, an exact reformulation is presented that is approximated by a discretized counterpart. The approximation is realized as a mixed-integer linear problem that yields sufficient conditions for distributional robustness of the original problem. Furthermore, it is proven that with increasingly fine discretizations, the discretized reformulation converges to the original distributionally robust problem. The approach is made concrete for a challenging, fundamental task in particle separation that appears in material design. Computational results for realistic settings show that the safe approximation yields high-quality robust solutions and can be computed within a short time.
Decision-making with Gaussian processes: sampling strategies and Monte Carlo methods
We study Gaussian processes and their application to decision-making in the real world. We begin by reviewing the foundations of Bayesian decision theory and show how these ideas give rise to methods such as Bayesian optimization. We investigate practical techniques for carrying out these strategies, with an emphasis on estimating and maximizing acquisition functions. Finally, we introduce pathwise approaches to conditioning Gaussian processes and demonstrate key benefits for representing random variables in this manner.
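The pathwise-conditioning idea can be sketched minimally via Matheron's rule: a posterior sample is obtained by correcting a joint prior sample with the residual at the data. This is a simplified, noise-free 1-D construction with assumed inputs and kernel hyperparameters, not the thesis's full method:

```python
# Pathwise GP conditioning on a 1-D grid via Matheron's rule
# (simplified noise-free sketch; data and lengthscale are assumed).
import numpy as np

rng = np.random.default_rng(0)

def rbf(a, b, ls=0.5):
    """Squared-exponential kernel between 1-D point sets a and b."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

grid = np.linspace(0.0, 1.0, 50)  # where we want the posterior sample
X = np.array([0.2, 0.8])          # observed inputs
y = np.array([1.0, -1.0])         # observed values

# Draw ONE joint prior sample over the grid and the observation points.
pts = np.concatenate([grid, X])
K = rbf(pts, pts) + 1e-6 * np.eye(pts.size)  # jitter for stability
prior = np.linalg.cholesky(K) @ rng.standard_normal(pts.size)
f_grid, f_X = prior[:grid.size], prior[grid.size:]

# Matheron's rule: correct the prior path by the residual at the data.
K_XX = rbf(X, X) + 1e-6 * np.eye(X.size)
update = np.linalg.solve(K_XX, y - f_X)
post = f_grid + rbf(grid, X) @ update   # posterior sample on the grid
post_X = f_X + rbf(X, X) @ update       # posterior sample at the data
```

Because the correction is applied to an entire sampled path, the result is a coherent function draw from the posterior (it interpolates the data in this noise-free case), which is what makes pathwise sampling convenient inside Monte Carlo acquisition-function estimates.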
On Tilted Losses in Machine Learning: Theory and Applications
Exponential tilting is a technique commonly used in fields such as statistics, probability, information theory, and optimization to create parametric distribution shifts. Despite its prevalence in related fields, tilting has not seen widespread use in machine learning. In this work, we aim to bridge this gap by exploring the use of tilting in risk minimization. We study a simple extension to ERM -- tilted empirical risk minimization (TERM) -- which uses exponential tilting to flexibly tune the impact of individual losses. The resulting framework has several useful properties: we show that TERM can increase or decrease the influence of outliers to enable fairness or robustness, respectively; has variance-reduction properties that can benefit generalization; and can be viewed as a smooth approximation to the tail probability of losses. Our work makes rigorous connections between TERM and related objectives, such as Value-at-Risk, Conditional Value-at-Risk, and distributionally robust optimization (DRO). We develop batch and stochastic first-order optimization methods for solving TERM, provide convergence guarantees for the solvers, and show that the framework can be solved efficiently relative to common alternatives. Finally, we demonstrate that TERM can be used for a multitude of applications in machine learning, such as enforcing fairness between subgroups, mitigating the effect of outliers, and handling class imbalance. Despite the straightforward modification TERM makes to traditional ERM objectives, we find that the framework can consistently outperform ERM and deliver competitive performance with state-of-the-art, problem-specific approaches.
Policy Gradient Algorithms for Robust MDPs with Non-Rectangular Uncertainty Sets
We propose a policy gradient algorithm for robust infinite-horizon Markov Decision Processes (MDPs) with non-rectangular uncertainty sets, thereby addressing an open challenge in the robust MDP literature. Indeed, uncertainty sets that display statistical optimality properties and make optimal use of limited data often fail to be rectangular. Unfortunately, the corresponding robust MDPs cannot be solved with dynamic programming techniques and are in fact provably intractable. This prompts us to develop a projected Langevin dynamics algorithm tailored to the robust policy evaluation problem, which offers global optimality guarantees. We also propose a deterministic policy gradient method that solves the robust policy evaluation problem approximately, and we prove that the approximation error scales with a new measure of non-rectangularity of the uncertainty set. Numerical experiments showcase that our projected Langevin dynamics algorithm can escape local optima, while algorithms tailored to rectangular uncertainty fail to do so.
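The projected Langevin dynamics template alternates a noisy gradient step with a projection onto the feasible set, and it is the injected noise that lets the iterate hop out of local basins. This is a generic 1-D sketch of the template (our illustration with assumed step size and temperature, not the paper's robust-policy-evaluation instance):

```python
# Generic projected Langevin dynamics on a non-convex 1-D objective
# (illustrative sketch; step size and temperature are assumptions).
import numpy as np

rng = np.random.default_rng(1)

def f(x):       # tilted double well: local basin near +1, deeper near -1
    return (x ** 2 - 1.0) ** 2 + 0.2 * x

def grad_f(x):
    return 4.0 * x * (x ** 2 - 1.0) + 0.2

eta, beta = 0.01, 2.0  # step size and inverse temperature
x = 1.0                # start in the shallower (local) basin
traj = []
for _ in range(20000):
    noise = np.sqrt(2.0 * eta / beta) * rng.standard_normal()
    x = x - eta * grad_f(x) + noise      # noisy gradient (Langevin) step
    x = np.clip(x, -2.0, 2.0)            # projection onto the box [-2, 2]
    traj.append(x)
traj = np.asarray(traj)
# With enough steps the chain typically crosses the barrier at x = 0
# and spends most of its time in the deeper basin near x = -1.
```

A plain projected gradient descent started at x = 1 would stay in the local basin forever; the Langevin noise term is what provides the escape behavior highlighted in the experiments.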