51 research outputs found

    Concept drift detection based on anomaly analysis

    © Springer International Publishing Switzerland 2014. In online machine learning, the ability to adapt quickly to new concepts is highly desirable. In this paper, we propose a novel concept drift detection method, called Anomaly Analysis Drift Detection (AADD), to improve the performance of machine learning algorithms in non-stationary environments. The proposed AADD method is based on an anomaly analysis of the learner’s accuracy in relation to the similarity between the learner’s training domain and the test data. The method first identifies whether there are conflicts between the current concept and newly arriving data. If there are none, the learner incrementally learns the non-conflicting data, which does not decrease its accuracy on previously learned data, thereby extending the concept. Otherwise, a new learner is created from the new data. Experiments illustrate that AADD can detect new concepts quickly and learn extensional drift incrementally.
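The extend-or-replace decision described in this abstract can be illustrated with a toy sketch. This is not the AADD algorithm itself: the one-dimensional features, the nearest-neighbour notion of "conflict", and the names `conflict_rate`, `update`, `radius` and `max_conflict` are all assumptions made for illustration.

```python
def nearest(memory, x):
    """Return the stored (feature, label) pair whose feature is closest to x."""
    return min(memory, key=lambda item: abs(item[0] - x))


def conflict_rate(memory, batch, radius):
    """Fraction of new examples that fall inside the training domain
    (within `radius` of a stored example) but carry a different label."""
    conflicts = 0
    for x, y in batch:
        nx, ny = nearest(memory, x)
        if abs(nx - x) <= radius and ny != y:
            conflicts += 1
    return conflicts / len(batch)


def update(learners, batch, radius=1.0, max_conflict=0.3):
    """Extend the current learner when the batch does not conflict with it;
    otherwise start a new learner from the batch (assumed thresholds)."""
    current = learners[-1]
    if conflict_rate(current, batch, radius) > max_conflict:
        learners.append(list(batch))   # conflicting data: new concept
    else:
        current.extend(batch)          # non-conflicting data: concept extension
    return learners


# Each "learner" is just a memory of labelled 1-D examples in this sketch.
learners = [[(0.0, "a"), (1.0, "a")]]
update(learners, [(5.0, "b"), (6.0, "b")])   # far from known data: extension
update(learners, [(0.5, "b"), (5.5, "a")])   # contradicts known data: new learner
```

Data far from the training domain is treated here as an extension rather than a conflict, because low accuracy on unfamiliar regions does not contradict what the learner already knows.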

    Heart failure hospitalization prediction in remote patient management systems

    Healthcare systems are shifting from patient care in hospitals to monitored care at home. This shift is expected to improve the quality of care without sharply increasing costs. Remote patient management (RPM) systems offer great potential for monitoring patients with chronic diseases, such as heart failure or diabetes. Patient modeling in RPM systems opens opportunities in two broad directions: personalizing information services, and alerting medical personnel to changes in a patient's condition. In this study we focus on heart failure hospitalization (HFH) prediction, which is a particular problem of patient modeling for alerting. We formulate a short-term HFH prediction problem and show how to address it with a data mining approach. We emphasize challenges related to the heterogeneity, different types and periodicity of the data available in RPM systems. We present an experimental study on HFH prediction, the results of which lay a foundation for further studies and for the implementation of alerting and personalization services in RPM systems.

    MobilityMirror: Bias-Adjusted Transportation Datasets

    We describe customized synthetic datasets for publishing mobility data. Private companies are providing new transportation modalities, and their data is of high value for integrative transportation research, policy enforcement, and public accountability. However, these companies are disincentivized from sharing data not only to protect the privacy of individuals (drivers and/or passengers), but also to protect their own competitive advantage. Moreover, demographic biases arising from how the services are delivered may be amplified if released data is used in other contexts. We describe a model and algorithm for releasing origin-destination histograms that removes selected biases in the data using causality-based methods. We compute the origin-destination histogram of the original dataset, then adjust the counts to remove undesirable causal relationships that can lead to discrimination or violate contractual obligations with data owners. We evaluate the utility of the algorithm on real data from a dockless bike share program in Seattle and taxi data in New York, and show that these adjusted transportation datasets can retain utility while removing bias in the underlying data. Comment: Presented at BIDU 2018 workshop and published in Springer Communications in Computer and Information Science vol 92
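For readers unfamiliar with the term, an origin-destination histogram is a count of trips per (origin, destination) pair. The sketch below shows only that first step; the causality-based adjustment of the counts is not reproduced here. The record layout and zone names are hypothetical.

```python
from collections import Counter


def od_histogram(trips):
    """Count trips for each (origin, destination) pair.

    `trips` is an iterable of dicts with 'origin' and 'destination' keys
    (an assumed record layout, chosen for illustration).
    """
    return Counter((t["origin"], t["destination"]) for t in trips)


# Hypothetical trips between made-up zones.
trips = [
    {"origin": "zone_a", "destination": "zone_b"},
    {"origin": "zone_a", "destination": "zone_b"},
    {"origin": "zone_b", "destination": "zone_c"},
]
histogram = od_histogram(trips)
```

A bias-adjusted release would publish modified counts derived from this histogram rather than the raw trip records.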

    Automated Adaptation Strategies for Stream Learning

    Automation of machine learning model development is increasingly becoming an established research area. While automated model selection and automated data pre-processing have been studied in depth, there is a gap concerning automated model adaptation strategies when multiple strategies are available. Manually developing an adaptation strategy can be time-consuming and costly. In this paper we address this issue by proposing the use of flexible adaptive mechanism deployment for automated development of adaptation strategies. Experimental results after using the proposed strategies with five adaptive algorithms on 36 datasets confirm their viability. These strategies achieve better or comparable performance to the custom adaptation strategies and the repeated deployment of any single adaptive mechanism.

    Identifying hidden contexts in classification

    In this study we investigate how to identify hidden contexts from the data in classification tasks. Contexts are artifacts in the data that do not predict the class label directly. For instance, in a speech recognition task speakers might have different accents, which do not directly discriminate between the spoken words. Identifying hidden contexts is treated as a data preprocessing task, which can help to build more accurate classifiers tailored to particular contexts and give insight into the data structure. We present three techniques to identify hidden contexts, which hide class label information from the input data and partition it using clustering techniques. We form a collection of performance measures to ensure that the resulting contexts are valid. We evaluate the performance of the proposed techniques on thirty real datasets. We present a case study illustrating how the identified contexts can be used to build specialized, more accurate classifiers.

    Learning with actionable attributes: Attention -- boundary cases!

    Traditional supervised learning assumes that instances are described by observable attributes. The goal is to learn to predict the labels for unseen instances. In many real-world applications the values of some attributes are not only observable, but can be proactively chosen by a decision maker. Furthermore, in some such applications the decision maker is interested not only in generating accurate predictions, but also in maximizing the probability of the desired outcome. For example, a direct marketing manager can choose the color of the envelope (an actionable attribute) in which an offer is sent to a client, hoping that the right choice will result in a positive response with a higher probability. We study how to learn to choose the value of an actionable attribute in order to maximize the probability of a desired outcome in supervised learning settings. We emphasize that not all instances are equally sensitive to a change in actions. An accurate choice of action is essential for those instances that are on the borderline (e.g. do not have a strong opinion). We formulate three supervised learning approaches to select the value of an actionable attribute at an instance level. We focus the learning process on the borderline cases. The potential of the underlying ideas is demonstrated with synthetic examples and a case study with a real dataset.

    Context-aware personal route recognition

    Personal route recognition is an important element of intelligent transportation systems. The results may be used for providing personal information about location-specific events, services, emergency or disaster situations, for location-specific advertising and more. Existing real-time route recognition systems often compare the current driving trajectory against the trajectories observed in the past and select the most similar route as the most likely. The problem is that such systems are inaccurate at the beginning of a trip, as typically several different routes start at the same departure point (e.g. home). In such situations the beginnings of trajectories overlap and the trajectory alone is insufficient to recognize the route. This drawback limits the utilization of route prediction systems, since accurate predictions are needed as early as possible, not at the end of the trip. To solve this problem we incorporate external contextual information (e.g. time of day) into route recognition from trajectories. We develop a technique to determine from the historical data how the probability of a route depends on contextual features and adjust (post-correct) the route recognition output accordingly. We evaluate the proposed context-aware route recognition approach using data on the driving behavior of twenty persons residing in Aalborg, Denmark, monitored over two months. The results confirm that utilizing contextual information in the proposed way improves the accuracy of route recognition, especially in cases when the historical routes overlap heavily.
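One plausible reading of the post-correction step is a reweighting of trajectory-based route scores by how often each route was taken in the current context. The sketch below assumes that reading; the smoothing scheme, the function names, and the sample routes and contexts are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def context_priors(history, smoothing=1.0):
    """Estimate P(route | context) from (context, route) pairs,
    using additive smoothing so unseen routes keep a small probability."""
    routes = sorted({route for _, route in history})
    counts = defaultdict(Counter)
    for context, route in history:
        counts[context][route] += 1
    priors = {}
    for context, counter in counts.items():
        total = sum(counter.values()) + smoothing * len(routes)
        priors[context] = {r: (counter[r] + smoothing) / total for r in routes}
    return priors


def post_correct(scores, prior):
    """Multiply trajectory-based scores by the contextual prior and renormalize."""
    weighted = {r: s * prior.get(r, 0.0) for r, s in scores.items()}
    total = sum(weighted.values())
    if total == 0:
        return dict(scores)  # no usable context information: keep scores as-is
    return {r: w / total for r, w in weighted.items()}


# Early in a trip the trajectory alone cannot separate the two routes.
history = [("morning", "work")] * 8 + [("morning", "gym")] * 2 + [("evening", "gym")] * 5
priors = context_priors(history)
corrected = post_correct({"work": 0.5, "gym": 0.5}, priors["morning"])
best_route = max(corrected, key=corrected.get)
```

With tied trajectory scores, the contextual prior alone decides the ranking, which matches the early-trip situation the abstract highlights.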

    Using sensitive personal data may be necessary for avoiding discrimination in data-driven decision models

    Effective Protection of Fundamental Rights in a pluralist worl

    Introduction to the special issue on handling concept drift in adaptive information systems
