568 research outputs found

    Meta-learning algorithms and applications

    Meta-learning in the broader context concerns how an agent learns about its own learning, allowing it to improve its learning process. Learning how to learn is not only beneficial for humans; it has also shown substantial benefits for improving how machines learn. In the context of machine learning, meta-learning enables models to improve their learning process by selecting suitable meta-parameters that influence the learning. For deep learning specifically, the meta-parameters typically describe details of the training of the model, but they can also include a description of the model itself: the architecture. Meta-learning is usually done with specific goals in mind, for example improving the ability to generalize or to learn new concepts from only a few examples. Meta-learning can be powerful, but it comes with a key downside: it is often computationally costly. If these costs were alleviated, meta-learning could be more accessible to developers of new artificial intelligence models, allowing them to achieve greater goals or save resources. As a result, one key focus of our research is on significantly improving the efficiency of meta-learning. We develop two approaches, EvoGrad and PASHA, both of which significantly improve meta-learning efficiency in two common scenarios. EvoGrad allows us to efficiently optimize the value of a large number of differentiable meta-parameters, while PASHA enables us to efficiently optimize any type of meta-parameters, but fewer in number. Meta-learning is a tool that can be applied to solve various problems. Most commonly it is applied to learning new concepts from only a small number of examples (few-shot learning), but other applications exist too. To showcase the practical impact that meta-learning can make in the context of neural networks, we use meta-learning as a novel solution for two selected problems: more accurate uncertainty quantification (calibration) and general-purpose few-shot learning. Both are practically important problems, and using meta-learning approaches we can obtain better solutions than those obtained using existing approaches. Calibration is important for safety-critical applications of neural networks, while general-purpose few-shot learning tests a model's ability to generalize few-shot learning across diverse tasks such as recognition, segmentation and keypoint estimation. More efficient algorithms as well as novel applications enable the field of meta-learning to make a more significant impact on the broader area of deep learning and potentially to solve problems that were too challenging before. Ultimately, both allow us to better utilize the opportunities that artificial intelligence presents.
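    The core loop the abstract describes (an outer process tuning meta-parameters that govern an inner learning process) can be sketched in a few lines. This is an illustrative toy, not EvoGrad or PASHA: the task, the candidate values, and the function names are invented for the example, and the meta-parameter here is simply the inner learning rate.

```python
def inner_train(lr, steps=20):
    """Inner loop: gradient descent on f(w) = (w - 3)^2 with a given learning rate."""
    w = 0.0
    for _ in range(steps):
        grad = 2 * (w - 3)
        w -= lr * grad
    return (w - 3) ** 2  # final loss: how well this meta-parameter trained the model

def meta_search(candidate_lrs):
    """Outer loop: pick the meta-parameter whose inner training works best."""
    losses = {lr: inner_train(lr) for lr in candidate_lrs}
    return min(losses, key=losses.get)

best_lr = meta_search([0.001, 0.01, 0.1, 0.5])
```

Grid search over one meta-parameter is the simplest possible instantiation; efficiency work such as the thesis's matters precisely because realistic settings involve many meta-parameters or expensive inner loops.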

    Optimization for Energy Management in the Community Microgrids

    This thesis focuses on improving the energy management strategies for Community Microgrids (CMGs), which are expected to play a crucial role in the future smart grid. CMGs bring many benefits, including increased use of renewable energy, improved reliability, resiliency, and energy efficiency. An Energy Management System (EMS) is a key tool that helps in monitoring, controlling, and optimizing the operations of the CMG in a cost-effective manner. The EMS can include various functionalities such as day-ahead generation scheduling, real-time scheduling, uncertainty management, and demand response programs. Generation scheduling in a microgrid is a challenging optimization problem, especially due to the intermittent nature of renewable energy. The power balance constraint, which requires energy generation to match demand, is difficult to satisfy because of prediction errors in both demand and generation. Real-time scheduling, which is based on a shorter prediction horizon, reduces these errors, but the impact of uncertainties cannot be completely eliminated. With regard to demand response programs, it is challenging to design an effective model that motivates customers to participate voluntarily while benefiting the system operator. Mathematical optimization techniques have been widely used to solve power system problems, but their application is limited by the need for specific mathematical properties. Metaheuristic techniques, particularly Evolutionary Algorithms (EAs), have gained popularity for their ability to solve complex and non-linear problems. However, traditional EAs may require significant computational effort for complex energy management problems in the CMG. This thesis aims to enhance the existing EMS methods in CMGs. Improved techniques are developed for day-ahead generation scheduling, multi-stage real-time scheduling, and demand response implementation. For generation scheduling, the performance of conventional EAs is improved through an efficient heuristic. A new multi-stage scheduling framework is proposed to minimize the impact of uncertainties in real-time operations. For demand response, a memetic algorithm is proposed to solve an incentive-based scheme from the perspective of an aggregator, and a price-based demand response scheme driven by dynamic price optimization is proposed to enhance the electric vehicle hosting capacity. The proposed methods are validated through extensive numerical experiments and comparison with state-of-the-art approaches. The results confirm the effectiveness of the proposed methods in improving energy management in CMGs.
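    To make the power balance constraint concrete, here is a deliberately tiny dispatch sketch (my own illustration, not a method from the thesis): two hypothetical generators with made-up costs and capacities, cheapest-first scheduling per hour, and an infeasibility check when demand exceeds total capacity.

```python
# Toy day-ahead dispatch: meet hourly demand at least cost (merit order),
# illustrating the power balance constraint sum(generation) == demand.
generators = [
    {"name": "solar", "cost": 0.0, "capacity": 40.0},    # hypothetical kW, $/kWh
    {"name": "diesel", "cost": 0.3, "capacity": 100.0},
]

def dispatch(demand):
    """Cheapest-first dispatch for one hour; returns (schedule, total cost)."""
    schedule, remaining, cost = {}, demand, 0.0
    for g in sorted(generators, key=lambda g: g["cost"]):
        p = min(g["capacity"], remaining)
        schedule[g["name"]] = p
        cost += p * g["cost"]
        remaining -= p
    if remaining > 1e-9:
        raise ValueError("power balance infeasible: demand exceeds capacity")
    return schedule, cost

hourly_demand = [30.0, 60.0, 90.0]   # a three-hour horizon for illustration
plan = [dispatch(d) for d in hourly_demand]
```

Real day-ahead scheduling adds ramping limits, storage, and uncertain forecasts, which is what makes EA-based methods attractive; this sketch only shows the balance constraint being enforced hour by hour.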

    Adjustable robust optimization with nonlinear recourses

    Over the last century, mathematical optimization has become a prominent tool for decision making. Its systematic application in practical fields such as economics, logistics or defense has led to the development of algorithmic methods of ever increasing efficiency. Indeed, for a variety of real-world problems, finding an optimal decision among a set of (implicitly or explicitly) predefined alternatives has become conceivable in reasonable time. In the last decades, however, the research community has paid more and more attention to the role of uncertainty in the optimization process. In particular, one may question the notion of optimality, and even feasibility, when studying decision problems with unknown or imprecise input parameters. This concern is even more critical in a world becoming more and more complex (by which we mean interconnected), where each individual variation inside a system inevitably causes other variations in the system itself. In this dissertation, we study a class of optimization problems that suffer from imprecise input data and feature a two-stage decision process, i.e., where decisions are made in a sequential order (called stages) and where unknown parameters are revealed throughout the stages. Applications of such problems abound in practice: for example, facility location with uncertain demands, transportation with uncertain costs, or scheduling under uncertain processing times. The uncertainty is dealt with from a robust optimization (RO) viewpoint (also known as the "worst-case perspective"), and we present original contributions to the RO literature on both the theoretical and the practical side.
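    The two-stage structure can be illustrated with a toy problem solved by enumeration (my own example, unrelated to the dissertation's models): a here-and-now capacity decision x, an uncertain demand u revealed afterwards, and a recourse purchase covering any shortfall at a higher price. The robust choice minimizes the worst case over scenarios.

```python
# Toy adjustable robust problem: build capacity x now at BUILD_COST per unit;
# demand u is revealed later; recourse covers the shortfall at RECOURSE_COST.
BUILD_COST, RECOURSE_COST = 1.0, 3.0
SCENARIOS = [1, 2, 3]   # possible demand realizations

def worst_case_cost(x):
    # Second stage: after u is revealed, buy the shortfall max(u - x, 0).
    return max(BUILD_COST * x + RECOURSE_COST * max(u - x, 0) for u in SCENARIOS)

# First stage: choose x minimizing the worst-case total cost.
best_x = min(range(4), key=worst_case_cost)
```

Because recourse is three times as expensive as building, the worst-case-optimal decision here is to build for the largest scenario; with nonlinear recourse or continuous uncertainty sets, such enumeration is no longer possible, which is where the dissertation's methods come in.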

    A study of distributionally robust mixed-integer programming with Wasserstein metric: on the value of incomplete data

    This study addresses a class of mixed-integer linear programming (MILP) problems that involve uncertainty in the objective function parameters. The parameters are assumed to form a random vector whose probability distribution can only be observed through a finite training data set. Unlike most related studies in the literature, we also consider uncertainty in the underlying data set. The data uncertainty is described by a set of linear constraints for each random sample, and the uncertainty in the distribution (for a fixed realization of the data) is defined using a type-1 Wasserstein ball centered at the empirical distribution of the data. The overall problem is formulated as a three-level distributionally robust optimization (DRO) problem. First, we prove that the three-level problem admits a single-level MILP reformulation if the class of loss functions is restricted to biaffine functions. Second, we show that for several particular forms of data uncertainty, the outlined problem can be solved reasonably fast by leveraging the nominal MILP problem. Finally, we conduct a computational study in which the out-of-sample performance of our model and the computational complexity of the proposed MILP reformulation are explored numerically for several application domains.
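    For reference, the inner two levels of such a model follow the standard type-1 Wasserstein DRO template (the notation here is mine and may differ from the study's):

```latex
\min_{x \in X} \; \sup_{Q \,:\, W_1(Q, \widehat{P}_N) \le \varepsilon} \; \mathbb{E}_{\xi \sim Q}\big[\ell(x, \xi)\big],
\qquad
W_1(Q, P) \;=\; \inf_{\pi \in \Pi(Q, P)} \mathbb{E}_{(\xi, \xi') \sim \pi}\big[\|\xi - \xi'\|\big],
```

where \widehat{P}_N is the empirical distribution of the N training samples and \Pi(Q, P) is the set of couplings of Q and P; the study's third level additionally perturbs the samples themselves within their linear constraint sets.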

    Multi-objective resource optimization in space-aerial-ground-sea integrated networks

    Space-air-ground-sea integrated (SAGSI) networks are envisioned to connect satellite, aerial, ground, and sea networks to provide connectivity everywhere and all the time in sixth-generation (6G) networks. However, the success of SAGSI networks is constrained by several challenges, including resource optimization when the users have diverse requirements and applications. We present a comprehensive review of SAGSI networks from a resource optimization perspective. We discuss use case scenarios and possible applications of SAGSI networks, and the resource optimization discussion considers the challenges associated with them. In our review, we categorize resource optimization techniques by objective: throughput and capacity maximization; delay minimization; energy consumption; task offloading; task scheduling; resource allocation or utilization; network operation cost; outage probability; average age of information; joint optimization (data rate difference, storage or caching, CPU cycle frequency); overall network performance and performance degradation; software-defined networking; and intelligent surveillance and relay communication. We then formulate a mathematical framework for maximizing energy efficiency, resource utilization, and user association. We optimize user association while satisfying the constraints of transmit power, data rate, and user association with priority. A binary decision variable associates users with system resources; since the decision variable is binary and the constraints are linear, the formulated problem is a binary linear programming problem. Based on this framework, we simulate and analyze the performance of three algorithms (branch and bound, the interior point method, and the barrier simplex algorithm) and compare the results. Simulation results show that branch and bound performs best, so we adopt it as our benchmark; its complexity, however, increases exponentially with the number of users and stations in the SAGSI network, while the interior point method and the barrier simplex algorithm achieve results comparable to the benchmark at low complexity. Finally, we discuss future research directions and challenges of resource optimization in SAGSI networks.
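    As a minimal illustration of a binary user-association problem of the kind described above (with made-up rates and capacities; the paper's framework also includes power and priority constraints), exhaustive search over assignments stands in for the branch and bound benchmark:

```python
import itertools

rates = [[5.0, 2.0],    # rates[u][s]: achievable data rate of user u at station s
         [1.0, 4.0],    # (hypothetical numbers, not from the paper)
         [3.0, 3.0]]
CAPACITY = 2            # each station serves at most 2 users

def best_association(rates, capacity):
    """Exhaustive search over binary assignments (a stand-in for branch and bound)."""
    n_users, n_stations = len(rates), len(rates[0])
    best, best_rate = None, float("-inf")
    for assign in itertools.product(range(n_stations), repeat=n_users):
        # Constraint: per-station capacity.
        if any(assign.count(s) > capacity for s in range(n_stations)):
            continue
        # Each user is associated with exactly one station by construction.
        rate = sum(rates[u][s] for u, s in enumerate(assign))
        if rate > best_rate:
            best, best_rate = assign, rate
    return best, best_rate

assoc, total_rate = best_association(rates, CAPACITY)
```

Exhaustive search grows as (stations)^(users), which mirrors the exponential complexity observed for branch and bound and motivates the comparison with cheaper relaxation-based methods.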

    A Safe Approximation Based on Mixed-Integer Optimization for Non-Convex Distributional Robustness Governed by Univariate Indicator Functions

    In this work, we present algorithmically tractable safe approximations of distributionally robust optimization (DRO) problems. The considered ambiguity sets can exploit information on moments as well as confidence sets. Typically, reformulation approaches using duality theory need to make strong assumptions on the structure of the underlying constraints, such as convexity in the decisions or concavity in the uncertainty. In contrast, here we present a duality-based reformulation approach for DRO problems in which the objective of the adversary is allowed to depend on univariate indicator functions. This renders the problem nonlinear and nonconvex. In order to reformulate the semi-infinite constraints nevertheless, an exact reformulation is presented that is approximated by a discretized counterpart. The approximation is realized as a mixed-integer linear problem that yields sufficient conditions for distributional robustness of the original problem. Furthermore, it is proven that with increasingly fine discretizations, the discretized reformulation converges to the original distributionally robust problem. The approach is made concrete for a challenging, fundamental task in particle separation that appears in material design. Computational results for realistic settings show that the safe approximation yields high-quality robust solutions and can be computed within a short time.

    Decision-making with Gaussian processes: sampling strategies and Monte Carlo methods

    We study Gaussian processes and their application to decision-making in the real world. We begin by reviewing the foundations of Bayesian decision theory and show how these ideas give rise to methods such as Bayesian optimization. We investigate practical techniques for carrying out these strategies, with an emphasis on estimating and maximizing acquisition functions. Finally, we introduce pathwise approaches to conditioning Gaussian processes and demonstrate key benefits for representing random variables in this manner.
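    A compact sketch of the pipeline the abstract outlines: a GP posterior from a handful of observations, plus an acquisition function (expected improvement) that Bayesian optimization maximizes to pick the next query point. This is textbook noise-free GP regression, not the pathwise conditioning approach introduced in the thesis; the toy objective, kernel lengthscale, and points are invented.

```python
import numpy as np
from math import erf, sqrt, pi

def rbf(a, b, ls=1.0):
    """Squared-exponential kernel between 1-D input arrays."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

def gp_posterior(X, y, Xs, jitter=1e-9):
    """Noise-free GP regression: posterior mean and variance at test points Xs."""
    K = rbf(X, X) + jitter * np.eye(len(X))
    Ks = rbf(X, Xs)
    mu = Ks.T @ np.linalg.solve(K, y)
    var = np.diag(rbf(Xs, Xs) - Ks.T @ np.linalg.solve(K, Ks))
    return mu, np.maximum(var, 0.0)

def expected_improvement(mu, var, best_y):
    """EI for maximization: E[max(f - best_y, 0)] under the Gaussian posterior."""
    sd = np.sqrt(var) + 1e-12
    z = (mu - best_y) / sd
    Phi = np.array([0.5 * (1 + erf(zi / sqrt(2))) for zi in z])  # normal CDF
    phi = np.exp(-0.5 * z ** 2) / sqrt(2 * pi)                   # normal PDF
    return sd * (z * Phi + phi)

X = np.array([0.0, 1.0, 2.0])
y = np.sin(X)                       # toy objective observed at three points
Xs = np.linspace(0.0, 2.0, 5)
mu, var = gp_posterior(X, y, Xs)
ei = expected_improvement(mu, var, y.max())
next_x = Xs[int(np.argmax(ei))]     # the point Bayesian optimization would query next
```

The thesis's pathwise view instead draws whole function samples from the posterior, which makes acquisition maximization and Monte Carlo estimation cheaper than repeatedly recomputing mean and variance as above.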

    On Tilted Losses in Machine Learning: Theory and Applications

    Exponential tilting is a technique commonly used in fields such as statistics, probability, information theory, and optimization to create parametric distribution shifts. Despite its prevalence in related fields, tilting has not seen widespread use in machine learning. In this work, we aim to bridge this gap by exploring the use of tilting in risk minimization. We study a simple extension to ERM -- tilted empirical risk minimization (TERM) -- which uses exponential tilting to flexibly tune the impact of individual losses. The resulting framework has several useful properties: we show that TERM can increase or decrease the influence of outliers to enable fairness or robustness, respectively; it has variance-reduction properties that can benefit generalization; and it can be viewed as a smooth approximation to the tail probability of losses. Our work makes rigorous connections between TERM and related objectives, such as Value-at-Risk, Conditional Value-at-Risk, and distributionally robust optimization (DRO). We develop batch and stochastic first-order optimization methods for solving TERM, provide convergence guarantees for the solvers, and show that the framework can be efficiently solved relative to common alternatives. Finally, we demonstrate that TERM can be used for a multitude of applications in machine learning, such as enforcing fairness between subgroups, mitigating the effect of outliers, and handling class imbalance. Despite the straightforward modification TERM makes to traditional ERM objectives, we find that the framework can consistently outperform ERM and deliver competitive performance with state-of-the-art, problem-specific approaches.
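    The tilted objective itself is nearly a one-liner: the TERM risk replaces the empirical mean of the losses with a log-sum-exp scaled by a tilt parameter t, so t > 0 emphasizes large losses and t < 0 suppresses them, with t → 0 recovering plain ERM. A minimal sketch (the losses are toy numbers; the max-shift is the usual log-sum-exp stabilization):

```python
import math

def tilted_risk(losses, t):
    """Tilted empirical risk: (1/t) * log(mean(exp(t * loss)))."""
    if t == 0.0:
        return sum(losses) / len(losses)           # ERM limit as t -> 0
    m = max(t * l for l in losses)                 # shift for numerical stability
    s = sum(math.exp(t * l - m) for l in losses)
    return (m + math.log(s / len(losses))) / t

losses = [0.1, 0.2, 5.0]                           # one outlier loss
erm = tilted_risk(losses, 0.0)
robust = tilted_risk(losses, -2.0)                 # downweights the outlier
fair = tilted_risk(losses, +2.0)                   # emphasizes the worst-off loss
```

As t → -∞ the objective approaches the minimum loss and as t → +∞ the maximum, which is the sense in which TERM smoothly interpolates between outlier-robust and worst-case (fair) behavior.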

    Policy Gradient Algorithms for Robust MDPs with Non-Rectangular Uncertainty Sets

    We propose a policy gradient algorithm for robust infinite-horizon Markov Decision Processes (MDPs) with non-rectangular uncertainty sets, thereby addressing an open challenge in the robust MDP literature. Indeed, uncertainty sets that display statistical optimality properties and make optimal use of limited data often fail to be rectangular. Unfortunately, the corresponding robust MDPs cannot be solved with dynamic programming techniques and are in fact provably intractable. This prompts us to develop a projected Langevin dynamics algorithm tailored to the robust policy evaluation problem, which offers global optimality guarantees. We also propose a deterministic policy gradient method that solves the robust policy evaluation problem approximately, and we prove that the approximation error scales with a new measure of non-rectangularity of the uncertainty set. Numerical experiments showcase that our projected Langevin dynamics algorithm can escape local optima, while algorithms tailored to rectangular uncertainty fail to do so.
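    In its generic form, projected Langevin dynamics adds Gaussian noise to projected gradient descent, which is what lets iterates hop out of local basins. The following is a 1-D illustration on a double-well function, not the paper's robust policy evaluation instance: the function, step size, and inverse temperature beta are all invented for the sketch.

```python
import math, random

def f(x):        # double well: local minimum near x = -1, global minimum near x = +1
    return (x ** 2 - 1) ** 2 - 0.3 * x

def fprime(x):
    return 4 * x * (x ** 2 - 1) - 0.3

def projected_langevin(x0, eta=0.01, beta=2.0, steps=20000, seed=0):
    """x_{k+1} = Proj_[-2,2](x_k - eta * f'(x_k) + sqrt(2*eta/beta) * xi_k)."""
    rng = random.Random(seed)
    x, best = x0, x0
    for _ in range(steps):
        x = x - eta * fprime(x) + math.sqrt(2 * eta / beta) * rng.gauss(0.0, 1.0)
        x = max(-2.0, min(2.0, x))   # projection onto the feasible set [-2, 2]
        if f(x) < f(best):
            best = x                 # track the best iterate seen
    return best

best = projected_langevin(x0=-1.0)   # start in the worse local basin
```

Plain projected gradient descent from the same start point stays at the local minimum near x = -1; the injected noise lets the Langevin iterates cross the barrier, mirroring the escape behavior the experiments highlight.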