
    Conformal Prediction: a Unified Review of Theory and New Challenges

    In this work we provide a review of the basic ideas and novel developments of Conformal Prediction -- a distribution-free, non-parametric forecasting method based on minimal assumptions -- that yields, in a very straightforward way, prediction sets that are statistically valid even in the finite-sample case. The in-depth discussion covers the theoretical underpinnings of Conformal Prediction and then surveys the more advanced developments and adaptations of the original idea.
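    As a concrete illustration of the finite-sample-valid prediction sets described above, the sketch below implements plain split conformal prediction for regression; it is a generic example rather than anything taken from the review, and the linear point predictor and 90% coverage level are arbitrary illustrative choices.

    import numpy as np
    from sklearn.linear_model import LinearRegression

    def split_conformal_interval(X_train, y_train, X_cal, y_cal, X_new, alpha=0.1):
        """Split conformal prediction: distribution-free intervals that are valid
        in finite samples, assuming only exchangeability of the data."""
        model = LinearRegression().fit(X_train, y_train)   # any point predictor works here
        scores = np.abs(y_cal - model.predict(X_cal))      # nonconformity score = |residual|
        n = len(scores)
        # Finite-sample corrected quantile of the calibration scores.
        q_level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
        q = np.quantile(scores, q_level, method="higher")
        pred = model.predict(X_new)
        return pred - q, pred + q                          # lower and upper bound per test point

    Any regressor can be substituted for the linear model; the coverage guarantee comes from the calibration step, not from the model itself.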

    Few-shot Conformal Prediction with Auxiliary Tasks

    We develop a novel approach to conformal prediction when the target task has limited data available for training. Conformal prediction identifies a small set of promising output candidates in place of a single prediction, with guarantees that the set contains the correct answer with high probability. When training data is limited, however, the predicted set can easily become unusably large. In this work, we obtain substantially tighter prediction sets while maintaining desirable marginal guarantees by casting conformal prediction as a meta-learning paradigm over exchangeable collections of auxiliary tasks. Our conformalization algorithm is simple, fast, and agnostic to the choice of underlying model, learning algorithm, or dataset. We demonstrate the effectiveness of this approach across a number of few-shot classification and regression tasks in natural language processing, computer vision, and computational chemistry for drug discovery.
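    To make the set-valued output concrete, a minimal (non-meta-learned) conformal prediction set for classification can be built roughly as sketched below; the softmax-based nonconformity score and the 90% target coverage are illustrative assumptions, not the paper's few-shot procedure.

    import numpy as np

    def conformal_prediction_set(cal_probs, cal_labels, test_probs, alpha=0.1):
        """Return label sets that contain the true class with probability >= 1 - alpha
        (marginally, under exchangeability). *_probs are softmax outputs."""
        # Nonconformity score: one minus the probability assigned to the true class.
        scores = 1.0 - cal_probs[np.arange(len(cal_labels)), cal_labels]
        n = len(scores)
        q_level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
        q = np.quantile(scores, q_level, method="higher")
        # Keep every class whose score does not exceed the calibrated threshold.
        return [np.where(1.0 - p <= q)[0] for p in test_probs]

    With very little calibration data the threshold q becomes loose and the sets grow large, which is exactly the failure mode the paper's auxiliary-task meta-learning is designed to mitigate.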

    Constructing Prediction Intervals with Neural Networks: An Empirical Evaluation of Bootstrapping and Conformal Inference Methods

    Artificial neural networks (ANNs) are popular tools for accomplishing many machine learning tasks, including predicting continuous outcomes. However, the general lack of confidence measures provided with ANN predictions limits their applicability, especially in military settings where accuracy is paramount. Supplementing point predictions with prediction intervals (PIs) is common for other learning algorithms, but the complex structure and training of ANNs render constructing PIs difficult. This work provides the network design choices and inferential methods for creating better-performing PIs with ANNs to enable their adaptation for military use. A two-step experiment is executed across 11 datasets, including an image-based dataset. Two non-parametric methods for constructing PIs, bootstrapping and conformal inference, are considered. The results of the first experimental step reveal that the choices inherent to building an ANN affect PI performance, and guidance is provided for optimizing PI performance with respect to each network feature and PI method. In the second step, 20 algorithms for constructing PIs, each based on the principles of bootstrapping or conformal inference, are implemented to determine which provides the best performance while maintaining a reasonable computational burden. In general, this trade-off is optimized by the cross-conformal method, which maintained interval coverage and efficiency with a decreased computational burden.
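    As a rough illustration of the cross-conformal idea identified here as the best trade-off, the sketch below pools out-of-fold residuals from a K-fold split to calibrate interval width; this simplified pooled-score variant, the small MLP, and the 90% coverage level are assumptions for illustration, not the thesis's implementation.

    import numpy as np
    from sklearn.model_selection import KFold
    from sklearn.neural_network import MLPRegressor

    def cross_conformal_interval(X, y, X_new, alpha=0.1, n_splits=5):
        """Pool out-of-fold absolute residuals so every point serves both fitting
        and calibration, then widen the final prediction by their quantile."""
        residuals = []
        for fit_idx, cal_idx in KFold(n_splits, shuffle=True, random_state=0).split(X):
            model = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000,
                                 random_state=0).fit(X[fit_idx], y[fit_idx])
            residuals.append(np.abs(y[cal_idx] - model.predict(X[cal_idx])))
        scores = np.concatenate(residuals)
        n = len(scores)
        q_level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
        q = np.quantile(scores, q_level, method="higher")
        final = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000,
                             random_state=0).fit(X, y)   # refit on all data for the point prediction
        pred = final.predict(X_new)
        return pred - q, pred + q

    Reusing every observation for calibration is what makes this family cheaper than large bootstrap ensembles while keeping interval coverage close to the nominal level.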

    Semi-Supervised Learning Vector Quantization method enhanced with regularization for anomaly detection in air conditioning time-series data

    Researchers in semi-supervised learning have been developing the family of Learning Vector Quantization (LVQ) models, which originated from the well-known Self-Organizing Map algorithm. Models of this type can be characterized as prototype-based, self-explanatory and flexible. This thesis contributes to the development of one such model, the Semi-Supervised Relational Prototype Classifier for dissimilarity data. The model implementation builds on related research and the author's own findings, and is applied to anomaly detection in real-time air-conditioning data. We propose a regularization algorithm for gradient descent to achieve better convergence, together with a new strategy for initializing prototypes. We also develop a framework involving a human expert as a source of labeled data; the framework detects anomalies in environmental parameters in both real-time and long-run observations and updates the model according to its findings. The data set used in the experiments is collected in real time from sensors installed inside the Aalto Mechanical Engineering building at Otakaari 4, Espoo; the installation was carried out as part of a project between VTT and the Korean National Research Institute. The data consist of three main parameters: air temperature, humidity and CO2 concentration. Around 150 sensors are deployed, and one month of recorded observations contains approximately 1.5 million data points. The results demonstrate the efficiency of the developed regularized LVQ method for classification in this setting: the regularized version generally outperforms its parent and various baseline methods on air-conditioning, synthetic and UCI data. Together with the proposed classification framework, the system has shown its robustness and efficiency and is ready for deployment to a production environment.
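    For readers unfamiliar with prototype-based models, the classic fully supervised LVQ1 update rule is sketched below; the Euclidean distance, learning rate and epoch count are illustrative assumptions, and the thesis's relational, semi-supervised, regularised variant operates on dissimilarity data rather than raw feature vectors.

    import numpy as np

    def lvq1_fit(X, y, prototypes, proto_labels, lr=0.05, epochs=30):
        """LVQ1: pull the nearest prototype toward a sample with the same label,
        push it away when the labels differ."""
        W = prototypes.astype(float)
        for _ in range(epochs):
            for x, label in zip(X, y):
                j = np.argmin(np.linalg.norm(W - x, axis=1))   # index of the closest prototype
                sign = 1.0 if proto_labels[j] == label else -1.0
                W[j] += sign * lr * (x - W[j])                  # attract or repel the winner
        return W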

    Deep Learning for Abstraction, Control and Monitoring of Complex Cyber-Physical Systems

    Cyber-Physical Systems (CPS) consist of digital devices that interact with physical components. Their popularity and complexity are growing exponentially, giving birth to new, previously unexplored, safety-critical application domains. As CPS permeate our daily lives, it becomes imperative to reason about their reliability. Formal methods provide rigorous techniques for the verification, control and synthesis of safe and reliable CPS. However, these methods do not scale with the complexity of the system, so their applicability to real-world problems is limited. A promising strategy is to leverage deep learning techniques to tackle the scalability issue of formal methods, transforming unfeasible problems into approximately solvable ones; the approximate models are trained over observations which are solutions of the formal problem. In this thesis, we focus on the following computationally challenging tasks: the modeling and simulation of a complex stochastic model, the design of a safe and robust control policy for a system acting in a highly uncertain environment, and the runtime verification problem under full or partial observability. Our approaches, based on deep learning, are applicable to real-world complex and safety-critical systems acting under strict real-time constraints and in the presence of a significant amount of uncertainty.

    Enhancing Reaction-based de novo Design using Machine Learning

    De novo design is a branch of chemoinformatics concerned with the rational design of molecular structures with desired properties, which, when applied to drug design, specifically aims at achieving suitable pharmacological and safety profiles. Scoring, construction, and search methods are the main components exploited by de novo design programs to explore the chemical space and encourage the cost-effective design of new chemical entities. In particular, construction methods provide strategies for compound generation that address issues such as drug-likeness and synthetic accessibility. Reaction-based de novo design consists of combining building blocks according to transformation rules extracted from collections of known reactions, with the aim of restricting the enumerated chemical space to a manageable number of synthetically accessible structures. The reaction vector is an example of a representation that encodes the topological changes occurring in reactions, and it has been integrated within a structure generation algorithm to increase the chances of generating molecules that are synthesisable. The general aim of this study was to enhance reaction-based de novo design by developing machine learning approaches that exploit publicly available data on reactions. A series of algorithms for reaction standardisation, fingerprinting, and reaction vector database validation was introduced and applied to generate the new data on which the entirety of this work relies. First, these collections were applied to the validation of a new ligand-based design tool. The tool was then used in a case study to design compounds which were eventually synthesised using procedures very similar to those suggested by the structure generator. A reaction classification model and a novel hierarchical labelling system were then developed to introduce the possibility of applying transformations by class. The model was augmented with an algorithm for confidence estimation and was used to classify two datasets from industry and the literature. Results from the classification suggest that the model can be used effectively to gain insights into the nature of reaction collections. Classified reactions were further processed to build a reaction class recommendation model capable of suggesting appropriate reaction classes to apply to molecules according to their fingerprints. The model was validated, then integrated within the reaction vector-based design framework, which was assessed on its performance against the baseline algorithm. Results from the de novo design experiments indicate that the use of the recommendation model leads to higher synthetic accessibility and more efficient management of computational resources.
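    As a toy illustration of combining building blocks via reaction rules, the RDKit snippet below applies a single hand-written amide-coupling SMARTS to two reagents; it is a hedged sketch of the general reaction-based idea, not the reaction-vector machinery or datasets developed in the thesis.

    from rdkit import Chem
    from rdkit.Chem import AllChem

    # One hand-coded transformation rule: carboxylic acid + primary amine -> amide.
    rxn = AllChem.ReactionFromSmarts(
        "[C:1](=[O:2])[OH].[N;H2:3]>>[C:1](=[O:2])[N:3]"
    )
    acid = Chem.MolFromSmiles("CC(=O)O")       # acetic acid as a building block
    amine = Chem.MolFromSmiles("NCc1ccccc1")   # benzylamine as a building block
    for products in rxn.RunReactants((acid, amine)):
        Chem.SanitizeMol(products[0])          # products come back unsanitised
        print(Chem.MolToSmiles(products[0]))   # enumerated product, e.g. CC(=O)NCc1ccccc1

    Reaction vectors generalise this step by encoding transformations extracted from reaction databases instead of relying on hand-written SMARTS, which is what makes class-aware recommendation of transformations useful.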

    Emotion-aware cross-modal domain adaptation in video sequences
