22 research outputs found

    PointMap: A real-time memory-based learning system with on-line and post-training pruning

    Also published in the International Journal of Hybrid Intelligent Systems, Volume 1, January 2004.
    A memory-based learning system called PointMap is a simple and computationally efficient extension of Condensed Nearest Neighbor that allows the user to limit the number of exemplars stored during incremental learning. PointMap evaluates the information value of coding nodes during training, and uses this index to prune uninformative nodes either on-line or after training. These pruning methods allow the user to control both a priori code size and sensitivity to detail in the training data, as well as to determine the code size necessary for accurate performance on a given data set. Coding and pruning computations are local in space, with only the nearest coded neighbor available for comparison with the input; and in time, with only the current input available during coding. Pruning helps solve common problems of traditional memory-based learning systems: large memory requirements, their accompanying slow on-line computations, and sensitivity to noise. PointMap copes with the curse of dimensionality by considering multiple nearest neighbors during testing without increasing the complexity of the training process or the stored code. The performance of PointMap is compared to that of a group of sixteen nearest-neighbor systems on benchmark problems.
    This research was supported by grants from the Air Force Office of Scientific Research (AFOSR F49620-98-1-0108, F49620-01-1-0397, and F49620-01-1-0423) and the Office of Naval Research (ONR N00014-01-1-0624).
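
    The abstract above describes PointMap as an extension of Condensed Nearest Neighbor with a user-set limit on stored exemplars. The sketch below illustrates only that baseline idea: a single-pass condensed nearest-neighbor learner that stores an input when its nearest stored exemplar predicts the wrong class, and stops storing once a cap is reached. It is not PointMap itself; the information-value index and the pruning rules are not reproduced, and the names `train_incremental`, `predict`, and `max_exemplars` are assumptions made for illustration.

```python
import math


def nearest(exemplars, x):
    """Return the index of the stored exemplar closest to x, or None if empty."""
    best, best_d = None, math.inf
    for i, (point, _label) in enumerate(exemplars):
        d = math.dist(point, x)  # Euclidean distance (Python 3.8+)
        if d < best_d:
            best, best_d = i, d
    return best


def train_incremental(stream, max_exemplars=None):
    """One pass over (x, y) pairs, seeing only the current input each step.

    An input is stored only when its nearest stored exemplar has a
    different label (or nothing is stored yet). `max_exemplars` is a
    hypothetical cap standing in for a user-set code-size limit.
    """
    exemplars = []
    for x, y in stream:
        i = nearest(exemplars, x)
        if i is not None and exemplars[i][1] == y:
            continue  # already classified correctly; nothing to store
        if max_exemplars is None or len(exemplars) < max_exemplars:
            exemplars.append((tuple(x), y))
    return exemplars


def predict(exemplars, x):
    """Label of the single nearest stored exemplar (None if nothing stored)."""
    i = nearest(exemplars, x)
    return None if i is None else exemplars[i][1]


stream = [((0, 0), "a"), ((0.1, 0), "a"), ((5, 5), "b"),
          ((5.1, 5), "b"), ((0, 0.2), "a")]
code = train_incremental(stream)
capped = train_incremental(stream, max_exemplars=1)
```

    In this toy stream, only the first point of each class is stored, since every later point is already classified correctly by its nearest stored neighbor. With the cap set to 1, the second class is never stored, which shows how a hard limit on code size trades accuracy for memory.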

    Boosting Classifiers for Drifting Concepts

    This paper proposes a boosting-like method to train a classifier ensemble from data streams. It naturally adapts to concept drift and allows the drift to be quantified in terms of its base learners. The algorithm is empirically shown to outperform learning algorithms that ignore concept drift. It performs no worse than advanced adaptive time window and example selection strategies that store all the data and are thus not suited for mining massive streams.

    Toward Automating and Systematizing the Use of Domain Knowledge in Feature Selection

    University of Minnesota Ph.D. dissertation. August 2015. Major: Computer Science. Advisor: Maria Gini. 1 computer file (PDF); xi, 185 pages.
    Constructing prediction models for real-world domains often involves practical complexities that must be addressed to achieve good prediction results. Often, there are too many sources of data (features). Limiting the set of features in the prediction model is essential for good performance, but prediction accuracy may be degraded by the inadvertent removal of relevant features. The problem is even more acute in situations where the number of training instances is limited, as limited sample size and domain complexity are often attributes of real-world problems. This thesis explores the practical challenges of building regression models in large multivariate time-series domains with known relationships between variables. Further, we explore the conventional wisdom related to preparing datasets for model calibration in machine learning, and discuss best practices for learning time-varying concepts from data. The core contribution of this work is a novel wrapper-based feature selection framework called Developer-Guided Feature Selection (DGFS). It systematically incorporates domain knowledge for domains characterized by a large number of observable features. The observable features may be related to each other by logical, temporal, or spatial relationships, some of which are known to the model developer a priori. The approach relies on limited domain-specific knowledge but can replace or improve upon both more elaborate domain-specific models and fully automated feature selection for many applications. As a wrapper-based approach, DGFS can augment existing multivariate techniques used in high-dimensional domains to produce improved modeling results, particularly in situations where the volume of training data is limited. We demonstrate the viability of our method in several complex domains (natural and synthetic) that have significant temporal aspects and many observable features.

    Performative Time-Series Forecasting

    Time-series forecasting is a critical challenge in various domains and has witnessed substantial progress in recent years. Many real-life scenarios, such as public health, economics, and social applications, involve feedback loops where predictions can influence the predicted outcome, subsequently altering the target variable's distribution. This phenomenon, known as performativity, introduces the potential for 'self-negating' or 'self-fulfilling' predictions. Despite extensive studies in classification problems across domains, performativity remains largely unexplored in the context of time-series forecasting from a machine-learning perspective. In this paper, we formalize performative time-series forecasting (PeTS), addressing the challenge of accurate predictions when performativity-induced distribution shifts are possible. We propose a novel approach, Feature Performative-Shifting (FPS), which leverages the concept of delayed response to anticipate distribution shifts and subsequently predicts targets accordingly. We provide theoretical insights suggesting that FPS can potentially lead to reduced generalization error. We conduct comprehensive experiments using multiple time-series models on COVID-19 and traffic forecasting tasks. The results demonstrate that FPS consistently outperforms conventional time-series forecasting methods, highlighting its efficacy in handling performativity-induced challenges.
    Comment: 12 pages (7 main text, 2 reference, 3 appendix), 3 figures, 4 tables

    Lifelong Control of Off-grid Microgrid with Model Based Reinforcement Learning

    The lifelong control problem of an off-grid microgrid is composed of two tasks: estimating the condition of the microgrid devices, and operational planning that accounts for uncertainty by forecasting future consumption and renewable production. The main challenge for effective control arises from the various changes that take place over time. In this paper, we present an open-source reinforcement framework for the modeling of an off-grid microgrid for rural electrification. The lifelong control problem of an isolated microgrid is formulated as a Markov Decision Process (MDP). We categorize the set of changes that can occur into progressive and abrupt changes. We propose a novel model-based reinforcement learning algorithm that is able to address both types of changes. In particular, the proposed algorithm demonstrates generalisation properties, transfer capabilities and better robustness in the case of fast-changing system dynamics. The proposed algorithm is compared against a rule-based policy and a model predictive controller with look-ahead. The results show that the trained agent is able to outperform both benchmarks in the lifelong setting where the system dynamics are changing over time.