915 research outputs found

    Ensemble Data Mining Methods

    Ensemble Data Mining Methods, also known as Committee Methods or Model Combiners, are machine learning methods that leverage the power of multiple models to achieve better prediction accuracy than any of the individual models could on their own. The basic goal when designing an ensemble is the same as when establishing a committee of people: each member of the committee should be as competent as possible, but the members should be complementary to one another. If the members are not complementary, i.e., if they always agree, then the committee is unnecessary: any one member is sufficient. If the members are complementary, then when one or a few members make an error, the probability is high that the remaining members can correct this error. Research in ensemble methods has largely revolved around designing ensembles consisting of competent yet complementary models.
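
    The committee argument can be made concrete with a small calculation. The sketch below is an illustration only, under assumptions not stated in the abstract: a binary task, an odd number of members, every member equally accurate, and fully independent errors (the idealized form of "complementary"). The function name is hypothetical.

```python
from math import comb


def majority_vote_accuracy(n_members: int, p_correct: float) -> float:
    """Probability that a simple majority of independent voters is correct.

    Assumes a binary task, an odd committee size, and that each member is
    correct with the same probability p_correct, independently of the others.
    """
    if n_members < 1 or n_members % 2 == 0:
        raise ValueError("n_members must be a positive odd number")
    needed = n_members // 2 + 1  # smallest number of correct votes that wins
    return sum(
        comb(n_members, k) * p_correct**k * (1 - p_correct) ** (n_members - k)
        for k in range(needed, n_members + 1)
    )


# A single member, then committees of increasing size.
for size in (1, 3, 5):
    print(size, round(majority_vote_accuracy(size, 0.7), 5))
```

    With members that are each right 70% of the time, a committee of three independent members is right about 78.4% of the time. If the members always agreed, the committee would stay at 70% regardless of its size, which is the "unnecessary committee" case described above.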

    Classification

    A supervised learning task involves constructing a mapping from input data (normally described by several features) to the appropriate outputs. Within supervised learning, one type of task is a classification learning task, in which each output is one or more classes to which the input belongs. In supervised learning, a set of training examples (examples with known output values) is used by a learning algorithm to generate a model. This model is intended to approximate the mapping between the inputs and outputs, and it can be used to generate predicted outputs for inputs that have not been seen before. For example, we may have data consisting of observations of sunspots. In a classification learning task, our goal may be to learn to classify sunspots into one of several types. Each example may correspond to one candidate sunspot with various measurements or just an image. A learning algorithm would use the supplied examples to generate a model that approximates the mapping between each supplied set of measurements and the type of sunspot. This model can then be used to classify previously unseen sunspots based on the candidate's measurements. This chapter discusses methods to perform machine learning, with examples involving astronomy.
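
    The train-then-predict workflow described above can be sketched with a k-nearest-neighbour classifier, chosen here only because it is short; the abstract does not name a specific algorithm. The feature values and the class labels "type_a" and "type_b" are made-up placeholders, not real sunspot data.

```python
import math
from collections import Counter


def knn_predict(training_examples, query, k=3):
    """Classify `query` by majority vote among its k nearest training examples.

    training_examples: list of (feature_tuple, label) pairs with known labels.
    """
    nearest = sorted(training_examples, key=lambda ex: math.dist(ex[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training set: two measurements per candidate, plus a known type.
training_examples = [
    ((1.0, 1.0), "type_a"),
    ((1.2, 0.9), "type_a"),
    ((0.8, 1.1), "type_a"),
    ((4.0, 4.2), "type_b"),
    ((4.1, 3.9), "type_b"),
    ((3.8, 4.0), "type_b"),
]

# Classify a previously unseen candidate from its measurements alone.
print(knn_predict(training_examples, (1.1, 1.0)))
```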

    Online Bagging and Boosting

    Bagging and boosting are two of the best-known ensemble learning methods due to their theoretical performance guarantees and strong experimental results. However, these algorithms have been used mainly in batch mode, i.e., they require the entire training set to be available at once and, in some cases, require random access to the data. In this paper, we present online versions of bagging and boosting that require only one pass through the training data. We build on previously presented work by presenting some theoretical results. We also compare the online and batch algorithms experimentally in terms of accuracy and running time.
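
    The abstract does not spell out the algorithm, so the following is a minimal sketch of the commonly described formulation of online bagging, in which each arriving example is shown to each base model k times with k drawn from a Poisson(1) distribution, approximating bootstrap resampling in a single pass. The base learner (a running-mean nearest-centroid classifier) and all class and function names are assumptions made for illustration.

```python
import math
import random
from collections import Counter


def poisson1(rng):
    """Draw from Poisson(lambda=1) using Knuth's multiplication method."""
    threshold = math.exp(-1.0)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


class NearestCentroid:
    """Hypothetical incremental base learner: one running mean per class."""

    def __init__(self):
        self.sums = {}
        self.counts = Counter()

    def update(self, x, y):
        previous = self.sums.get(y, [0.0] * len(x))
        self.sums[y] = [s + v for s, v in zip(previous, x)]
        self.counts[y] += 1

    def predict(self, x):
        if not self.counts:
            return None  # model has not seen any example yet

        def squared_distance(label):
            n = self.counts[label]
            return sum((v - s / n) ** 2 for v, s in zip(x, self.sums[label]))

        return min(self.counts, key=squared_distance)


class OnlineBagging:
    def __init__(self, n_models=10, seed=0):
        self.rng = random.Random(seed)
        self.models = [NearestCentroid() for _ in range(n_models)]

    def update(self, x, y):
        # One pass: each model sees the example Poisson(1) times (possibly zero).
        for model in self.models:
            for _ in range(poisson1(self.rng)):
                model.update(x, y)

    def predict(self, x):
        votes = Counter(model.predict(x) for model in self.models)
        votes.pop(None, None)  # ignore models that are still untrained
        return votes.most_common(1)[0][0] if votes else None


# Stream 200 synthetic examples from two well-separated clusters.
stream_rng = random.Random(42)
ensemble = OnlineBagging(n_models=10, seed=1)
for _ in range(200):
    label = stream_rng.choice(["a", "b"])
    center = 0.0 if label == "a" else 5.0
    point = [stream_rng.gauss(center, 1.0), stream_rng.gauss(center, 1.0)]
    ensemble.update(point, label)

print(ensemble.predict([0.0, 0.0]), ensemble.predict([5.0, 5.0]))
```

    Each example is processed once and then discarded, so no random access to the training set is needed.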

    Electrical Properties of Organic and Organometallic Compounds


    Continuous Uniform Finite Time Stabilization of Planar Controllable Systems

    Continuous homogeneous controllers are utilized in a full state feedback setting for the uniform finite time stabilization of a perturbed double integrator in the presence of uniformly decaying piecewise continuous disturbances. Semiglobal strong $\mathcal{C}^1$ Lyapunov functions are identified to establish uniform asymptotic stability of the closed-loop planar system. Uniform finite time stability is then proved by extending the homogeneity principle of discontinuous systems to the continuous case with uniformly decaying piecewise continuous nonhomogeneous disturbances. A finite upper bound on the settling time is also computed. The results extend the existing literature on homogeneity and finite time stability by both presenting uniform finite time stabilization and dealing with a broader class of nonhomogeneous disturbances for planar controllable systems while also proposing a new class of homogeneous continuous controllers.

    Comparison of ERBS orbit determination accuracy using batch least-squares and sequential methods

    The Flight Dynamics Div. (FDD) at NASA-Goddard commissioned a study to develop the Real Time Orbit Determination/Enhanced (RTOD/E) system as a prototype system for sequential orbit determination of spacecraft on a DOS-based personal computer (PC). An overview of RTOD/E capabilities is presented, along with the results of a study comparing the orbit determination accuracy for a Tracking and Data Relay Satellite System (TDRSS) user spacecraft obtained using RTOD/E on a PC with the accuracy of an established batch least-squares system, the Goddard Trajectory Determination System (GTDS), operating on a mainframe computer. RTOD/E was used to perform sequential orbit determination for the Earth Radiation Budget Satellite (ERBS), and GTDS was used to perform the batch least-squares orbit determination. The estimated ERBS ephemerides were obtained for the Aug. 16 to 22, 1989, timeframe, during which intensive TDRSS tracking data for ERBS were available. Independent assessments were made to examine the consistency of the results obtained by the batch and sequential methods. Comparisons were made between the forward-filtered RTOD/E orbit solutions and definitive GTDS orbit solutions for ERBS; the solution differences were less than 40 meters after the filter had reached steady state.
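
    The difference between batch and sequential estimation can be shown on a toy problem that is far simpler than orbit determination. The sketch below is not RTOD/E or GTDS; it fits a straight line (offset and rate) to made-up measurements, once by solving the normal equations over all data at once and once by a recursive least-squares update that processes one measurement at a time. Unit measurement noise variance and the large prior variance are assumptions, and all names are hypothetical.

```python
def batch_fit(times, observations):
    """Solve the 2x2 normal equations for z = a + b*t using all data at once."""
    n = len(times)
    s1 = sum(times)
    s2 = sum(t * t for t in times)
    z0 = sum(observations)
    z1 = sum(t * z for t, z in zip(times, observations))
    det = n * s2 - s1 * s1
    return ((s2 * z0 - s1 * z1) / det, (n * z1 - s1 * z0) / det)


def sequential_fit(times, observations, prior_variance=1e6):
    """Recursive least squares: update the estimate after every measurement."""
    x = [0.0, 0.0]  # current estimate of (a, b)
    p = [[prior_variance, 0.0], [0.0, prior_variance]]  # estimate covariance
    for t, z in zip(times, observations):
        h = (1.0, t)  # measurement model: z = h . x + noise
        ph = [p[0][0] * h[0] + p[0][1] * h[1], p[1][0] * h[0] + p[1][1] * h[1]]
        innovation_variance = h[0] * ph[0] + h[1] * ph[1] + 1.0  # noise var = 1
        gain = [ph[0] / innovation_variance, ph[1] / innovation_variance]
        residual = z - (h[0] * x[0] + h[1] * x[1])
        x = [x[0] + gain[0] * residual, x[1] + gain[1] * residual]
        p = [[p[i][j] - gain[i] * ph[j] for j in range(2)] for i in range(2)]
    return tuple(x)


# Made-up measurements of z = 2 + 0.5*t with small fixed perturbations.
times = [float(t) for t in range(10)]
noise = [0.10, -0.20, 0.05, 0.15, -0.10, 0.00, 0.20, -0.15, 0.05, -0.05]
observations = [2.0 + 0.5 * t + e for t, e in zip(times, noise)]

print(batch_fit(times, observations))
print(sequential_fit(times, observations))
```

    With a sufficiently diffuse prior, the sequential estimate after the last measurement closely matches the batch solution, while also providing an estimate at every intermediate step.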

    Ask-The-Expert: Minimizing Human Review for Big Data Analytics Through Active Learning

    In this CIF project, we worked toward semi-automating knowledge discovery from anomaly detection algorithms through the use of active learning. Active learning is an area of research within machine learning that uses an "expert in the loop" to learn from large data sets that have very few annotations or labels available, and where providing such labels is expensive. In our case, the task can be defined as the identification of safety events from flight operational data. Since traditional anomaly detection algorithms cannot differentiate between operationally relevant and irrelevant statistical anomalies, Subject Matter Experts (SMEs) face the lengthy and expensive burden of investigating every example identified by the detection algorithm, classifying and labeling each as relevant or irrelevant. Active learning identifies the unlabeled example for which a label would most improve the classifier, asks the domain expert for a label, and repeats this process until there are no more resources (time, budget) available for labeling or a minimum required performance is reached. A positive label indicates an operationally significant safety event whereas a negative label indicates otherwise. Based on these few labels we propose to build an active learning system that utilizes the SME's time in the most effective manner by iteratively asking for labels for as few informative instances as possible. Our work was proposed to be a stepping stone toward implementation and deployment of the system with a user interface to be pursued by the Aviation Operations and Safety Program (AOSP) given its interest in safety monitoring and discovery of safety incidents.
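
    The query loop described above can be sketched on a one-dimensional toy problem. This is an illustration under stated assumptions, not the project's system: the classifier is a single threshold on an anomaly score, "most improve the classifier" is approximated by uncertainty sampling (query the unlabeled example closest to the current decision boundary), the two extreme examples are assumed to have opposite labels, and the expert is simulated by a hidden cut-off. All names are hypothetical.

```python
def fit_threshold(labeled):
    """Place the boundary midway between the closest negative and positive.

    Assumes at least one label of each kind, with positives above negatives.
    """
    negatives = [x for x, is_positive in labeled.items() if not is_positive]
    positives = [x for x, is_positive in labeled.items() if is_positive]
    return (max(negatives) + min(positives)) / 2.0


def active_learn(pool, ask_expert, budget):
    """Query the expert for at most `budget` labels, most uncertain first."""
    pool = sorted(pool)
    # Seed with the two extremes (assumed to be one negative, one positive).
    labeled = {pool[0]: ask_expert(pool[0]), pool[-1]: ask_expert(pool[-1])}
    unlabeled = set(pool[1:-1])
    queries = 2
    while unlabeled and queries < budget:
        threshold = fit_threshold(labeled)
        # Uncertainty sampling: pick the example nearest the current boundary.
        query = min(unlabeled, key=lambda x: abs(x - threshold))
        labeled[query] = ask_expert(query)
        unlabeled.remove(query)
        queries += 1
    return fit_threshold(labeled), queries


# Simulated expert: scores above a hidden cut-off are "relevant" (positive).
def simulated_expert(score):
    return score > 0.375


pool = [i / 100 for i in range(101)]
threshold, queries_used = active_learn(pool, simulated_expert, budget=12)
print(threshold, queries_used)
```

    Here 12 labels out of 101 candidates are enough to bracket the hidden cut-off between two adjacent scores, whereas labeling in arbitrary order would typically need far more expert time.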

    Multiple Kernel Learning for Heterogeneous Anomaly Detection: Algorithm and Aviation Safety Case Study

    The world-wide aviation system is one of the most complex dynamical systems ever developed and is generating data at an extremely rapid rate. Most modern commercial aircraft record several hundred flight parameters including information from the guidance, navigation, and control systems, the avionics and propulsion systems, and the pilot inputs into the aircraft. These parameters may be continuous measurements or binary or categorical measurements recorded at one-second intervals for the duration of the flight. Currently, most approaches to aviation safety are reactive, meaning that they are designed to react to an aviation safety incident or accident. In this paper, we discuss a novel approach based on the theory of multiple kernel learning to detect potential safety anomalies in very large databases of discrete and continuous data from world-wide operations of commercial fleets. We pose a general anomaly detection problem which includes both discrete and continuous data streams, where we assume that the discrete streams have a causal influence on the continuous streams. We also assume that atypical sequences of events in the discrete streams can lead to off-nominal system performance. We discuss the application domain and novel algorithms, and present results on real-world data sets. Our algorithm uncovers operationally significant events in high-dimensional data streams in the aviation industry which are not detectable using state-of-the-art methods.

    nu-Anomica: A Fast Support Vector Based Novelty Detection Technique

    In this paper we propose nu-Anomica, a novel anomaly detection technique that can be trained on huge data sets with much reduced running time compared to the benchmark one-class Support Vector Machines algorithm. In nu-Anomica, the idea is to train the machine such that it can provide a close approximation to the exact decision plane using fewer training points and without losing much of the generalization performance of the classical approach. We have tested the proposed algorithm on a variety of continuous data sets under different conditions. We show that under all test conditions the developed procedure closely preserves the accuracy of standard one-class Support Vector Machines while reducing both the training time and the test time by a factor of 5 to 20.