
    Meta-learning algorithms and applications

    Meta-learning in the broader context concerns how an agent learns about its own learning, allowing it to improve its learning process. Learning how to learn is not only beneficial for humans; it has also shown vast benefits for improving how machines learn. In the context of machine learning, meta-learning enables models to improve their learning process by selecting suitable meta-parameters that influence the learning. For deep learning specifically, the meta-parameters typically describe details of the training of the model but can also include a description of the model itself - the architecture. Meta-learning is usually done with specific goals in mind, for example improving the ability to generalize or to learn new concepts from only a few examples. Meta-learning can be powerful, but it comes with a key downside: it is often computationally costly. If these costs were alleviated, meta-learning could be more accessible to developers of new artificial intelligence models, allowing them to achieve greater goals or save resources. As a result, one key focus of our research is on significantly improving the efficiency of meta-learning. We develop two approaches: EvoGrad and PASHA, both of which significantly improve meta-learning efficiency in two common scenarios. EvoGrad allows us to efficiently optimize the value of a large number of differentiable meta-parameters, while PASHA enables us to efficiently optimize any type of meta-parameter, but fewer in number. Meta-learning is a tool that can be applied to solve various problems. Most commonly it is applied to learning new concepts from only a small number of examples (few-shot learning), but other applications exist too. To showcase the practical impact that meta-learning can make in the context of neural networks, we use meta-learning as a novel solution for two selected problems: more accurate uncertainty quantification (calibration) and general-purpose few-shot learning. Both are practically important problems, and using meta-learning approaches we can obtain better solutions than those obtained using existing approaches. Calibration is important for safety-critical applications of neural networks, while general-purpose few-shot learning tests a model's ability to generalize few-shot learning across diverse tasks such as recognition, segmentation and keypoint estimation. More efficient algorithms as well as novel applications enable the field of meta-learning to make a more significant impact on the broader area of deep learning and potentially solve problems that were too challenging before. Ultimately, both allow us to better utilize the opportunities that artificial intelligence presents.
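
    The core idea above, treating training details such as the learning rate as differentiable meta-parameters, can be sketched with an evolutionary hypergradient estimate in the spirit of EvoGrad. Everything below is a hypothetical one-parameter toy, not the thesis's implementation: the meta-gradient of the validation loss with respect to the learning rate is estimated from a few random perturbations rather than by backpropagating through training.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem: one weight w, loss (w - 3)^2 on both "train" and "val".
train_grad = lambda w: 2.0 * (w - 3.0)
val_loss = lambda w: (w - 3.0) ** 2

def inner_step(w, lr):
    # One ordinary gradient step of the base learner.
    return w - lr * train_grad(w)

def meta_grad(w, lr, sigma=0.01, k=16):
    # Evolutionary estimate of d val_loss / d lr: perturb the
    # meta-parameter, run the cheap inner step, fit the slope.
    eps = rng.normal(0.0, sigma, size=k)
    losses = np.array([val_loss(inner_step(w, lr + e)) for e in eps])
    return float(((losses - losses.mean()) * eps).sum() / (k * sigma**2))

w, lr = 0.0, 0.01
for _ in range(100):
    # Meta-update the learning rate, then take an ordinary training step.
    lr = float(np.clip(lr - 0.002 * meta_grad(w, lr), 1e-3, 1.0))
    w = inner_step(w, lr)
```

    On this toy problem the meta-update steadily grows the learning rate while the base weight converges, without ever differentiating through the training procedure itself.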

    LIPIcs, Volume 251, ITCS 2023, Complete Volume


    A Comprehensive Survey of Artificial Intelligence Techniques for Talent Analytics

    In today's competitive and fast-evolving business environment, it is a critical time for organizations to rethink how to make talent-related decisions in a quantitative manner. Indeed, the recent development of Big Data and Artificial Intelligence (AI) techniques has revolutionized human resource management. The availability of large-scale talent and management-related data provides unparalleled opportunities for business leaders to understand organizational behaviors and gain tangible knowledge from a data science perspective, which in turn delivers intelligence for real-time decision-making and effective talent management at work for their organizations. In the last decade, talent analytics has emerged as a promising field in applied data science for human resource management, garnering significant attention from AI communities and inspiring numerous research efforts. To this end, we present an up-to-date and comprehensive survey of AI technologies used for talent analytics in the field of human resource management. Specifically, we first provide the background knowledge of talent analytics and categorize various pertinent data. Subsequently, we offer a comprehensive taxonomy of relevant research efforts, categorized based on three distinct application-driven scenarios: talent management, organization management, and labor market analysis. In conclusion, we summarize the open challenges and potential prospects for future research directions in the domain of AI-driven talent analytics. Comment: 30 pages, 15 figures

    Distributed Energy Resource Management: All-Time Resource-Demand Feasibility, Delay-Tolerance, Nonlinearity, and Beyond

    In this work, we propose distributed and networked energy management scenarios to optimize the production and reservation of energy among a set of distributed energy nodes. In other words, the idea is to optimally allocate the generated and reserved powers based on nodes' local cost gradient information while meeting the energy demand. One main concern is all-time (or anytime) resource-demand feasibility, implying that at every iteration of the scheduling algorithm, the balance between the produced power and the demand plus reserved power must hold. The other concern is to design algorithms that tolerate communication time-delays and changes in the network. Further, one can incorporate possible model nonlinearity in the algorithm to address both inherent (e.g., saturation and quantization) and purposefully-added (e.g., signum-based) nonlinearities in the model. The proposed optimal allocation algorithm addresses all the above concerns, while it benefits from features of distributed (or networked) solutions such as no-single-node-of-failure and distributed information processing. We show both the all-time feasibility of the proposed scheme and its convergence under a certain bound on the step-rate using Lyapunov-type proofs. Comment: IEEE LCSS 202
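
    A minimal sketch of the all-time feasibility idea, under assumed quadratic costs and a ring network (both hypothetical, not the paper's model): because each node updates along gradient differences with its neighbours, every iterate keeps generation exactly balanced with demand.

```python
import numpy as np

# Hypothetical setup: n generators with quadratic costs c_i(x) = a_i*x^2
# must jointly meet a fixed demand D. Starting from a feasible
# allocation, each node moves along gradient *differences* with its ring
# neighbours; every update is sum-preserving, so sum(x) == D holds at
# every iteration (all-time resource-demand feasibility).
n, D = 4, 100.0
a = np.array([1.0, 2.0, 3.0, 4.0])
x = np.full(n, D / n)                     # feasible initial allocation

A = np.zeros((n, n))                      # symmetric ring adjacency
for i in range(n):
    A[i, (i + 1) % n] = A[i, (i - 1) % n] = 1.0

for _ in range(500):
    grad = 2.0 * a * x                    # local marginal costs
    # Laplacian step: node i moves by sum_j A_ij * (grad_j - grad_i).
    x = x + 0.02 * (A @ grad - A.sum(axis=1) * grad)
    assert abs(x.sum() - D) < 1e-9        # balance holds at every step
```

    At convergence the marginal costs equalize across nodes, which is the optimality condition for this allocation problem.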

    Efficient Model Checking: The Power of Randomness


    Modular lifelong machine learning

    Deep learning has drastically improved the state-of-the-art in many important fields, including computer vision and natural language processing (LeCun et al., 2015). However, it is expensive to train a deep neural network on a machine learning problem. The overall training cost further increases when one wants to solve additional problems. Lifelong machine learning (LML) develops algorithms that aim to efficiently learn to solve a sequence of problems, which become available one at a time. New problems are solved with fewer resources by transferring previously learned knowledge. At the same time, an LML algorithm needs to retain good performance on all encountered problems, thus avoiding catastrophic forgetting. Current approaches do not possess all the desired properties of an LML algorithm. First, they primarily focus on preventing catastrophic forgetting (Diaz-Rodriguez et al., 2018; Delange et al., 2021). As a result, they neglect some knowledge transfer properties. Furthermore, they assume that all problems in a sequence share the same input space. Finally, scaling these methods to a large sequence of problems remains a challenge. Modular approaches to deep learning decompose a deep neural network into sub-networks, referred to as modules. Each module can then be trained to perform an atomic transformation, specialised in processing a distinct subset of inputs. This modular approach to storing knowledge makes it easy to reuse only the subset of modules that are useful for the task at hand. This thesis introduces a line of research which demonstrates the merits of a modular approach to lifelong machine learning, and its ability to address the aforementioned shortcomings of other methods. Compared to previous work, we show that a modular approach can be used to achieve more LML properties than previously demonstrated. Furthermore, we develop tools which allow modular LML algorithms to scale in order to retain said properties on longer sequences of problems.
    First, we introduce HOUDINI, a neurosymbolic framework for modular LML. HOUDINI represents modular deep neural networks as functional programs and accumulates a library of pre-trained modules over a sequence of problems. Given a new problem, we use program synthesis to select a suitable neural architecture, as well as a high-performing combination of pre-trained and new modules. We show that our approach has most of the properties desired from an LML algorithm. Notably, it can perform forward transfer, avoid negative transfer and prevent catastrophic forgetting, even across problems with disparate input domains and problems which require different neural architectures. Second, we produce a modular LML algorithm which retains the properties of HOUDINI but can also scale to longer sequences of problems. To this end, we fix the choice of a neural architecture and introduce a probabilistic search framework, PICLE, for searching through different module combinations. To apply PICLE, we introduce two probabilistic models over neural modules which allow us to efficiently identify promising module combinations. Third, we phrase the search over module combinations in modular LML as black-box optimisation, which allows one to make use of methods from the setting of hyperparameter optimisation (HPO). We then develop a new HPO method which marries a multi-fidelity approach with model-based optimisation. We demonstrate that this leads to improved anytime performance in the HPO setting and discuss how this can in turn be used to augment modular LML methods. Overall, this thesis identifies a number of important LML properties, which have not all been attained in past methods, and presents an LML algorithm which can achieve all of them, apart from backward transfer.
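
    The module-reuse idea can be illustrated with a deliberately tiny sketch (hypothetical functions standing in for pre-trained neural modules, and exhaustive search standing in for HOUDINI's program synthesis or PICLE's probabilistic search):

```python
import itertools

# A library of previously learned "modules" (here plain functions).
library = {
    "double":  lambda x: 2 * x,
    "square":  lambda x: x * x,
    "negate":  lambda x: -x,
    "add_one": lambda x: x + 1,
}

# New problem: inputs/targets sampled from the unknown function 2x + 1.
data = [(x, 2 * x + 1) for x in range(-5, 6)]

def score(pipeline):
    # Lower is better: squared error of the module composition g(f(x)).
    f, g = pipeline
    return sum((g(f(x)) - y) ** 2 for x, y in data)

# Search over ordered pairs of library modules: the new problem is
# solved by reusing old modules, without training anything new.
best = min(itertools.product(library, repeat=2),
           key=lambda p: score((library[p[0]], library[p[1]])))
```

    Exhaustive search over pairs is only viable at toy scale; the thesis's contribution is precisely making this search tractable for large libraries and deep compositions.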

    Runway Safety Improvements Through a Data Driven Approach for Risk Flight Prediction and Simulation

    Runway overrun is one of the most frequently occurring flight accident types threatening the safety of aviation. Sensors have been improved with recent technological advancements and allow data collection during flights. The recorded data helps to better identify the characteristics of runway overruns. The improved technological capabilities and the growing air traffic have led to increased momentum for reducing flight risk using artificial intelligence. Discussions on incorporating artificial intelligence to enhance flight safety are timely and critical. Using artificial intelligence, we may be able to develop the tools we need to better identify runway overrun risk and increase awareness of runway overruns. This work seeks to increase attitude, skill, and knowledge (ASK) of runway overrun risks by predicting the flight states near touchdown and simulating flights exposed to runway overrun precursors. To achieve this, the methodology develops a prediction model and a simulation model. During the flight training process, the prediction model is used in flight to identify potential risks and the simulation model is used post-flight to review the flight behavior. The prediction model identifies potential risks by predicting flight parameters that best characterize the landing performance during the final approach phase. The predicted flight parameters are used to alert the pilots to any runway overrun precursors that may pose a threat. The predictions and alerts are made when thresholds of various flight parameters are exceeded. The flight simulation model simulates the final approach trajectory with an emphasis on capturing the effect wind has on the aircraft. The focus is on wind since it is a relatively significant factor during the final approach, when the aircraft is typically stabilized.
    The flight simulation is used to quickly assess the differences between flight patterns that have triggered overrun precursors and normal flights with no abnormalities. The differences are crucial in learning how to mitigate adverse flight conditions. Both models are built with neural networks. The main challenges of developing a neural network model are that the model design space is unique to each problem, so it cannot accommodate multiple problems, and that it can be significantly large depending on the depth of the model. Therefore, a hyperparameter optimization algorithm is investigated and used to design the data and model structures to best characterize the aircraft behavior during the final approach. A series of experiments is performed to observe how the model accuracy changes with different data pre-processing methods for the prediction model and different neural network models for the simulation model. The data pre-processing methods include indexing the data by different frequencies, using different window sizes, and data clustering. The neural network models include simple Recurrent Neural Networks, Gated Recurrent Units, Long Short-Term Memory, and Neural Network Autoregressive with Exogenous Input. Another series of experiments is performed to evaluate the robustness of these models to adverse wind and flare, because different wind conditions and flares represent controls that the models need to map to the predicted flight states. The most robust models are then used to identify significant features for the prediction model and the feasible control space for the simulation model. The outcomes of the most robust models are also mapped to the required landing distance metric so that the results of the prediction and simulation are easily interpreted.
    Then, the methodology is demonstrated with a sample flight exposed to an overrun precursor (high approach speed) to show how the models can potentially increase attitude, skill, and knowledge of runway overrun risk. The main contribution of this work is evaluating the accuracy and robustness of prediction and simulation models trained using Flight Operational Quality Assurance (FOQA) data. Unlike many studies that focused on optimizing the model structures to create the two models, this work optimized both data and model structures to ensure that the data well capture the dynamics of the aircraft they represent. To achieve this, this work introduced a hybrid genetic algorithm that combines the benefits of conventional and quantum-inspired genetic algorithms to quickly converge to an optimal configuration while exploring the design space. With the optimized model, this work identified the data features from the final approach with a higher contribution to predicting airspeed, vertical speed, and pitch angle near touchdown. The top contributing features are altitude, angle of attack, core rpm, and airspeed. For both the prediction and the simulation models, this study examines the impact of various data preprocessing methods on the accuracy of the two models. The results may help future studies identify the right data preprocessing methods for their work. Another contribution of this work is evaluating how flight control and wind affect both the prediction and the simulation models. This is achieved by mapping the model accuracy at various levels of control surface deflection, wind speed, and wind direction change. The results showed fairly consistent prediction and simulation accuracy at different levels of control surface deflection and wind conditions, demonstrating that neural network-based models are effective for building robust prediction and simulation models of aircraft during the final approach.
    The results also showed that data frequency has a significant impact on the prediction and simulation accuracy, so it is important to have sufficient data to train the models in the conditions in which they will be used. The final contribution of this work is demonstrating how the prediction and the simulation models can be used to increase awareness of runway overrun.
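
    The in-flight alerting step described above, thresholds on predicted flight parameters raising precursor alerts, can be sketched as follows; the parameter names and limits are illustrative assumptions, not the study's calibrated values.

```python
# Hypothetical thresholds on predicted flight parameters near touchdown.
THRESHOLDS = {
    "airspeed_kt":        {"max": 150.0},    # approach-speed limit
    "vertical_speed_fpm": {"min": -1000.0},  # sink-rate limit
    "pitch_deg":          {"min": 0.0, "max": 7.5},
}

def precursor_alerts(predicted):
    # Compare each predicted parameter against its limits; any
    # exceedance raises a runway-overrun precursor alert.
    alerts = []
    for name, limits in THRESHOLDS.items():
        v = predicted[name]
        if "max" in limits and v > limits["max"]:
            alerts.append(f"{name} above {limits['max']}")
        if "min" in limits and v < limits["min"]:
            alerts.append(f"{name} below {limits['min']}")
    return alerts

# A high-approach-speed flight trips only the airspeed precursor.
alerts = precursor_alerts(
    {"airspeed_kt": 162.0, "vertical_speed_fpm": -700.0, "pitch_deg": 3.0})
```

    In the described methodology the predicted values would come from the neural network prediction model during the final approach; here they are hard-coded for illustration.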

    Contexts can be Cheap: Solving Stochastic Contextual Bandits with Linear Bandit Algorithms

    In this paper, we address the stochastic contextual linear bandit problem, where a decision maker is provided a context (a random set of actions drawn from a distribution). The expected reward of each action is specified by the inner product of the action and an unknown parameter. The goal is to design an algorithm that learns to play as close as possible to the unknown optimal policy after a number of action plays. This problem is considered more challenging than the linear bandit problem, which can be viewed as a contextual bandit problem with a \emph{fixed} context. Surprisingly, in this paper, we show that the stochastic contextual problem can be solved as if it were a linear bandit problem. In particular, we establish a novel reduction framework that converts every stochastic contextual linear bandit instance to a linear bandit instance, when the context distribution is known. When the context distribution is unknown, we establish an algorithm that reduces the stochastic contextual instance to a sequence of linear bandit instances with small misspecifications and achieves nearly the same worst-case regret bound as the algorithm that solves the misspecified linear bandit instances. As a consequence, our results imply an $O(d\sqrt{T\log T})$ high-probability regret bound for contextual linear bandits, making progress in resolving an open problem in (Li et al., 2019), (Li et al., 2021). Our reduction framework opens up a new way to approach stochastic contextual linear bandit problems, and enables improved regret bounds in a number of instances, including the batch setting, contextual bandits with misspecifications, contextual bandits with sparse unknown parameters, and contextual bandits with adversarial corruption.
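
    For intuition on the linear-bandit building block that the reduction targets, here is a generic LinUCB-style sketch (an illustrative baseline, not the paper's reduction): the learner maintains a ridge-regression estimate of the unknown parameter and plays the optimistic action from each stochastically drawn context.

```python
import numpy as np

rng = np.random.default_rng(1)

# Expected reward of action a is <a, theta_star> for unknown theta_star.
d, T, lam, beta = 3, 2000, 1.0, 0.5
theta_star = np.array([0.6, -0.2, 0.4])
V = lam * np.eye(d)        # regularised Gram matrix of played actions
b = np.zeros(d)            # running sum of action * observed reward
regret = 0.0

for _ in range(T):
    actions = rng.normal(size=(5, d))       # fresh stochastic context
    theta_hat = np.linalg.solve(V, b)       # ridge estimate of theta_star
    Vinv = np.linalg.inv(V)
    # Optimism: estimated reward plus a confidence-width bonus.
    ucb = actions @ theta_hat + beta * np.sqrt(
        np.einsum("ij,jk,ik->i", actions, Vinv, actions))
    a = actions[int(np.argmax(ucb))]
    r = a @ theta_star + 0.1 * rng.normal()  # noisy reward
    regret += np.max(actions @ theta_star) - a @ theta_star
    V += np.outer(a, a)
    b += r * a
```

    Playing uniformly at random in this setup incurs roughly constant per-round regret, whereas the optimistic learner's average regret shrinks as the confidence widths contract; the paper's point is that the full stochastic contextual problem can be driven by exactly this kind of linear-bandit machinery.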