86,640 research outputs found

    Efficient Online Decision Tree Learning with Active Feature Acquisition

    Full text link
    Constructing decision trees online is a classical machine learning problem. Existing works often assume that features are readily available for each incoming data point. However, in many real world applications, both feature values and the labels are unknown a priori and can only be obtained at a cost. For example, in medical diagnosis, doctors have to choose which tests to perform (i.e., making costly feature queries) on a patient in order to make a diagnosis decision (i.e., predicting labels). We provide a fresh perspective to tackle this practical challenge. Our framework consists of an active planning oracle embedded in an online learning scheme for which we investigate several information acquisition functions. Specifically, we employ a surrogate information acquisition function based on adaptive submodularity to actively query feature values with a minimal cost, while using a posterior sampling scheme to maintain a low regret for online prediction. We demonstrate the efficiency and effectiveness of our framework via extensive experiments on various real-world datasets. Our framework also naturally adapts to the challenging setting of online learning with concept drift and is shown to be competitive with baseline models while being more flexible

    Improving adaptive bagging methods for evolving data streams

    Get PDF
    We propose two new improvements for bagging methods on evolving data streams. Recently, two new variants of Bagging were proposed: ADWIN Bagging and Adaptive-Size Hoeffding Tree (ASHT) Bagging. ASHT Bagging uses trees of different sizes, and ADWIN Bagging uses ADWIN as a change detector to decide when to discard underperforming ensemble members. We improve ADWIN Bagging using Hoeffding Adaptive Trees, trees that can adaptively learn from data streams that change over time. To speed up the time for adapting to change of Adaptive-Size Hoeffding Tree (ASHT) Bagging, we add an error change detector for each classifier. We test our improvements by performing an evaluation study on synthetic and real-world datasets comprising up to ten million examples

    Novel proposal for prediction of CO2 course and occupancy recognition in Intelligent Buildings within IoT

    Get PDF
    Many direct and indirect methods, processes, and sensors available on the market today are used to monitor the occupancy of selected Intelligent Building (IB) premises and the living activities of IB residents. By recognizing the occupancy of individual spaces in IB, IB can be optimally automated in conjunction with energy savings. This article proposes a novel method of indirect occupancy monitoring using CO2, temperature, and relative humidity measured by means of standard operating measurements using the KNX (Konnex (standard EN 50090, ISO/IEC 14543)) technology to monitor laboratory room occupancy in an intelligent building within the Internet of Things (IoT). The article further describes the design and creation of a Software (SW) tool for ensuring connectivity of the KNX technology and the IoT IBM Watson platform in real-time for storing and visualization of the values measured using a Message Queuing Telemetry Transport (MQTT) protocol and data storage into a CouchDB type database. As part of the proposed occupancy determination method, the prediction of the course of CO2 concentration from the measured temperature and relative humidity values were performed using mathematical methods of Linear Regression, Neural Networks, and Random Tree (using IBM SPSS Modeler) with an accuracy higher than 90%. To increase the accuracy of the prediction, the application of suppression of additive noise from the CO2 signal predicted by CO2 using the Least mean squares (LMS) algorithm in adaptive filtering (AF) method was used within the newly designed method. In selected experiments, the prediction accuracy with LMS adaptive filtration was better than 95%.Web of Science1223art. no. 454

    Using Graph Properties to Speed-up GPU-based Graph Traversal: A Model-driven Approach

    Get PDF
    While it is well-known and acknowledged that the performance of graph algorithms is heavily dependent on the input data, there has been surprisingly little research to quantify and predict the impact the graph structure has on performance. Parallel graph algorithms, running on many-core systems such as GPUs, are no exception: most research has focused on how to efficiently implement and tune different graph operations on a specific GPU. However, the performance impact of the input graph has only been taken into account indirectly as a result of the graphs used to benchmark the system. In this work, we present a case study investigating how to use the properties of the input graph to improve the performance of the breadth-first search (BFS) graph traversal. To do so, we first study the performance variation of 15 different BFS implementations across 248 graphs. Using this performance data, we show that significant speed-up can be achieved by combining the best implementation for each level of the traversal. To make use of this data-dependent optimization, we must correctly predict the relative performance of algorithms per graph level, and enable dynamic switching to the optimal algorithm for each level at runtime. We use the collected performance data to train a binary decision tree, to enable high-accuracy predictions and fast switching. We demonstrate empirically that our decision tree is both fast enough to allow dynamic switching between implementations, without noticeable overhead, and accurate enough in its prediction to enable significant BFS speedup. We conclude that our model-driven approach (1) enables BFS to outperform state of the art GPU algorithms, and (2) can be adapted for other BFS variants, other algorithms, or more specific datasets
    corecore